add article about dependency groups #138

Closed
wants to merge 6 commits into from
Changes from 3 commits
164 changes: 164 additions & 0 deletions articles/102_dependency_groups.md
@@ -0,0 +1,164 @@
---
layout: default
title: Dependency groups
permalink: articles/dependency_groups.html
abstract:
This article describes the concept of dependency groups, their use cases, how they can be described in the package manifest, and how they can be mapped into binary packages.
author: '[Dirk Thomas](https://github.com/dirk-thomas)'
published: true
categories: Overview
---

{:toc}

# {{ page.title }}

<div class="abstract" markdown="1">
{{ page.abstract }}
</div>

Original Author: {{ page.author }}

## Preface

In the ROS ecosystem a ROS package provides information about its dependencies in the manifest.
The latest format is specified in [REP 140](http://www.ros.org/reps/rep-0140.html).
It supports different kinds of dependencies (`build`, `exec`, `test`, `doc`, etc.).
But each dependency is mandatory and needs to be satisfied on its own.
Contributor

Reword as:

It supports different kinds of dependencies (`build`, `exec`, `test`, `doc`, etc.), but each dependency is mandatory and must be satisfied on its own.


This article describes the concept of "dependency groups" which could be used for different purposes.
A dependency group is a set of dependencies identified by the name of the group.
For such a group a different semantic can then be applied (instead of just satisfying each dependency on its own).
Contributor

satisfy -> satisfying


## Use Cases

The following subsections describe two use cases for dependency groups.
Contributor

Reword as:

The following subsections will describe how dependencies could be used in two different use cases.


### (1) Need at least one "Provider"

A package might want to declare that at least one dependency from a group needs to be present (called `satisfy-at-least-one`).
Contributor

how's the behavior if two packages declare the same dependency group?

Member Author

Each one is evaluated on its own. Not sure I get what you are aiming for with the question?

Contributor

nvmd. I just read the examples which clarified my question.

In contrast to "normal" dependencies, not all dependencies of that group need to be present to satisfy the dependency group.

An example for this is the `rmw_implementation` package.
Currently it depends on all RMW implementation packages (e.g. `rmw_fastrtps_cpp`, `rmw_connext_cpp`, etc.) in order to ensure that it may only be built after all RMW implementation packages have been built.
The goal would be that at least one RMW implementation needs to be present in order to build or use the `rmw_implementation` package.
It would be acceptable if more than one or even all RMW implementations are available but that is not required.
Contributor

does this imply the responsibility of the user that the semantics of the build does not change when a second item of the group is found afterwards? I am imagining the scenario where we find only one RMW, start building with the "shortcut" version, then figuring out a second RMW impl is available, but rmw_implementation is already setup wo/ dlopen support. Does this make sense?

Member Author

From source all rmw impl. in the workspace will be built before the rmw_implementation package. If the workspace only contains one and later you build another workspace on top with another rmw implementation that wouldn't change rmw_implementation.

When building Debian packages at least one needs to be available. But more could be available.

Contributor

After getting a bit more context offline, I got a few more insights which resolves my question here. The key point was that still all packages are traversed by ament before the first package is built.


In case multiple packages declare to be part of that dependency group, the `rmw_implementation` package might want to provide an ordered list of preferred names which should be considered in that order.
Contributor

Maybe change this to:

If multiple packages are declared as part of that dependency group the `rmw_implementation` package might want to provide an ordered list of preferred names which should be considered in that order.

Member Author

I kind of prefer the "active" wording "the package declaring" rather than "are declared".
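
The `satisfy-at-least-one` semantic with an ordered list of preferred names could be sketched as follows. This is only an illustration under stated assumptions: the function name, its parameters, and the alphabetical fallback order are hypothetical and not part of the proposal.

```python
def satisfy_at_least_one(members, available, preferred=()):
    """Return the available group members, preferred names first.

    Hypothetical helper: raises LookupError when no member is available.
    """
    candidates = [name for name in members if name in available]
    if not candidates:
        raise LookupError("no member of the dependency group is available")
    # Preferred names come first in the given order; the remaining
    # candidates follow alphabetically so the result is deterministic.
    ranked = [name for name in preferred if name in candidates]
    ranked += sorted(name for name in candidates if name not in ranked)
    return ranked


members = {"rmw_fastrtps_cpp", "rmw_connext_cpp"}
available = {"rmw_connext_cpp", "rmw_fastrtps_cpp"}
print(satisfy_at_least_one(members, available, preferred=["rmw_fastrtps_cpp"]))
# ['rmw_fastrtps_cpp', 'rmw_connext_cpp']
```

The group is satisfied as soon as the returned list is non-empty; additional members are acceptable but not required.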


### (2) All packages of a specific "type"

Another use case is that a dependency group doesn't need to be explicitly defined by all dependency names (called `satisfy-all`).
Instead a dependency group is defined by an identifier and the expectation is that all packages which declare that they are part of that group should be available.
Contributor

I am struggling following this. If not by an identifier (and I assume this could be a string here) how do you declare a dependency group in use case 1)? What is the difference between for example a group rmw_implementations and a group message_packages ?

Member Author

@dirk-thomas dirk-thomas Oct 12, 2017

Both groups are just defined by the name used in a group_depend tag. Any package can then declare to be a member of it. There is no difference between use case 1 and 2 in terms of how the groups are defined - only the criteria how they are satisfied is different.

Maybe I don't understand your question and you can rephrase it?


An example for this is the `ros1_bridge` package.
Currently it depends on a manually selected set of message packages.
The goal would be that it can declare that it depends on *all* packages which provide messages.

Since the `ros1_bridge` manifest only contains the name of the group dependency, it doesn't control which packages declare that they are part of that group.
It might want to exclude specific packages from being considered even though they are in the group.
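
A minimal sketch of how a `satisfy-all` group could be resolved with exclusions is shown below. The `memberships` mapping and the function name are hypothetical, and the group memberships listed are made up for illustration.

```python
def resolve_satisfy_all(group, memberships, excluded=()):
    """Return every package declaring membership of `group`, minus exclusions."""
    return sorted(
        pkg
        for pkg, groups in memberships.items()
        if group in groups and pkg not in excluded
    )


# Hypothetical membership declarations collected from package manifests.
memberships = {
    "std_msgs": {"i_am_a_msg_pkg"},
    "geometry_msgs": {"i_am_a_msg_pkg"},
    "rmw_fastrtps_cpp": {"i_am_a_rmw_implementation"},
}
print(resolve_satisfy_all("i_am_a_msg_pkg", memberships, excluded={"geometry_msgs"}))
# ['std_msgs']
```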

## Processes

The following two processes need to be updated to consider dependency groups.
Member

The buildfarm will also require changes won't it? It uses the rosdistro cache to determine build job ordering separately from the build or release tooling.

Setting aside potential inconsistencies between package group members at bloom-time and group members in the current rosdistro cache:

  • For satisfy-all we would need to trigger a build whenever any upstream group member is rebuilt.

  • For satisfy-at-least-one, in cases like rmw implementations, we would want all group members to be available at build time (bloom would likely list them all as Build Depends so that support for them is compiled in, regardless of which is chosen at install-time), which means we can use the same behavior as satisfy-all.

But I don't see how the buildfarm would be able to choose which group members to include or not if there was a reason that all members could not be part of the same build (for example if two dependencies were mutually exclusive).

Member Author

I added mentioning ros_buildfarm in 0534486.

> But I don't see how the buildfarm would be able to choose which group members to include or not if there was a reason that all members could not be part of the same build (for example if two dependencies were mutually exclusive).

If some of the group members conflict with each other satisfy-at-least-one could fall back to remove a package from the set (instead of using all). Which one to be removed can be decided based on a preferred list and a deterministic order (e.g. alphabetical).

Contributor

@nuclearsandwich Out of curiosity, what situations are you thinking of with two mutually exclusive dependencies? It's not that I can't imagine it ever happening, but more like I would mostly consider it a bug in the packaging/metadata if it did happen. I'm just wondering in what scenarios you see it happening.

Contributor

typo: need to be updated


### Build tool

The build tool needs to consider the declared dependency groups when computing the topological order.
Since it operates on a set of packages in a workspace it has all manifests for all of these packages available.
Contributor

typo: set of packages?

Independent of the semantic of the dependency group (satisfy-all vs. satisfy-at-least-one) the build tool can simply expand a dependency group to all packages of the set.
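
The expansion described above can be sketched with Python's standard `graphlib` module. The workspace data structure and its keys (`depends`, `group_depends`, `member_of`) are assumptions made for this example and do not reflect the API of any existing build tool.

```python
from graphlib import TopologicalSorter


def build_order(packages):
    """Compute a build order for a workspace.

    `packages` maps a package name to a dict with three optional keys:
    "depends" (package names), "group_depends" (group names) and
    "member_of" (group names). This structure is hypothetical.
    """
    # Collect the members of every group declared in the workspace.
    members = {}
    for name, info in packages.items():
        for group in info.get("member_of", ()):
            members.setdefault(group, set()).add(name)

    graph = {}
    for name, info in packages.items():
        # Only dependencies inside the workspace affect the order.
        deps = {d for d in info.get("depends", ()) if d in packages}
        # Expand each group to all of its members, whatever the semantic.
        for group in info.get("group_depends", ()):
            deps |= members.get(group, set())
        deps.discard(name)
        graph[name] = deps
    return list(TopologicalSorter(graph).static_order())


workspace = {
    "rmw_implementation": {"group_depends": ["i_am_a_rmw_implementation"]},
    "rmw_fastrtps_cpp": {"member_of": ["i_am_a_rmw_implementation"]},
    "rmw_connext_cpp": {"member_of": ["i_am_a_rmw_implementation"]},
}
order = build_order(workspace)
print(order)  # rmw_implementation comes after both group members
```

Because every group member in the workspace becomes an ordinary edge in the graph, the ordering step itself needs no knowledge of the `satisfy` semantic.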

### Release tool

Since the release tool `bloom` operates on a single package it can't resolve a dependency group without further information.
It will need all package manifests available from `rosdistro` to resolve a dependency group to a set of packages.

For the `satisfy-all` semantic it can simply generate dependencies on all packages from the group.
For the `satisfy-at-least-one` semantic the release tool could either use a platform specific mechanism like `provides` in Debian control files or simply expand the group to all packages (same as `satisfy-all`).
Member

I see a concern with using the satisfy semantic to determine whether or not the release tool uses a platform packaging mechanic like provides: From my read of the document thus far, there is no distinction in the group membership declaration between a group which is meant to be used with satisfy-at-least-one (Use case (1)) and a group intended for use with satisfy-all (Use case (2))

The Provides mechanism is used on the group member packages, not on the depending packages which is where the distinction between satisfy-all and satisfy-at-least-one is made.

Groups intended for both use case (1) and (2) are not distinguished and would by extension share the same release template in bloom (or any other per-package release tool) so virtual packages generated by the use of Provides in Debian would be created for both groups of rmw implementations and groups of msg containing packages.

I think this is only a serious problem if creating virtual packages for use case (2) groups is undesirable rather than superfluous.

The Debian policy manual has more info on how virtual packages and provides works. I notice that virtual packages cannot accept version constraints nor can a package provide a specific version of a virtual package. That may or may not be important down the line. It might be worth looking at other package specifications group and provider mechanics, particularly RPM to make sure that overlapping these two concerns will work on other platforms too.

Member Author

> From my read of the document thus far, there is no distinction in the group membership declaration between a group which is meant to be used with satisfy-at-least-one (Use case (1)) and a group intended for use with satisfy-all (Use case (2))

The group dependency will have an attribute which identifies which satisfy semantic is desired: see example https://github.com/ros2/design/pull/138/files#diff-c8d974466de0d80b2299df802e7b1d7bR132

Member

The dependent package specifies the desired semantic but that is unknown to the group members themselves: https://github.com/ros2/design/blame/0534486af1398a3410159130183ed4c918dcdbbd/articles/102_dependency_groups.md#L141

Also, one package may have <group_depend satisfy="all">i_am_a_rmw_implementation</group_depend> while another may depend on the same group with satisfy-at-least-one: <group_depend satisfy="at-least-one">i_am_a_rmw_implementation</group_depend>

Member Author

Assuming bloom decides to generate a Provides entry for the member_of_group tag using the name of the dependency name. In that case the package declaring the group_depend tag will still use the satisfy attribute to either expand all group members (in case of all) or depend on the group name (in case of at-least-one) which will then rely on the Provides to be present.

I don't see a problem with this since the package which needs to make the semantic decision does have the satisfy attribute. The members of the group simply do always the same thing (generate Provides with the group name, if anybody uses it with at-least-one or not is not relevant) since they are not aware of the way they are being used.

Maybe I am missing your point?

Member

> The members of the group simply do always the same thing

Ah, because the mention of the provides behavior was only in the paragraph about any/at-least-one it makes it seem like the release tool needed to know the difference. Perhaps the section about group member behavior should be pulled above of the section about dependency resolution for each use case.

This brings up a secondary concern about whether this overlapping group mechanic will be suitable for packaging systems other than Debian. From my reading of the debian documentation there's no ill effects from a package providing a virtual package that isn't used. I mentioned previously that I'd take a look at Homebrew's mechanisms for this and will do so first thing tomorrow.

This design document doesn't aim to specify what the "best" behavior for the release tool is since the decision is likely platform dependent.

Since the release tool expands the dependency groups when it is invoked, packages which join the group after that release won't be used until the package defining the group dependency is re-released.
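
One possible mapping onto Debian control fields is sketched below. It assumes that every group member always emits a `Provides` entry named after the group and that the depending package either expands the group (`all`) or depends on that virtual package (`at-least-one`). The helper names and the `ros-example-` prefix are placeholders, and nothing here describes what `bloom` actually generates.

```python
def debian_name(name, prefix="ros-example-"):
    """Map a ROS name to a Debian-style name (the prefix is a placeholder)."""
    return prefix + name.replace("_", "-")


def member_provides(groups):
    """Provides field of a group member: one virtual package per group."""
    return "Provides: " + ", ".join(debian_name(g) for g in sorted(groups))


def group_depends(group, satisfy, members):
    """Depends field of the package declaring the group dependency."""
    if satisfy == "all":
        # Expand to every member known at release time.
        names = [debian_name(m) for m in sorted(members)]
    elif satisfy == "at-least-one":
        # Depend on the virtual package that every member provides.
        names = [debian_name(group)]
    else:
        raise ValueError(f"unknown satisfy value: {satisfy!r}")
    return "Depends: " + ", ".join(names)


print(member_provides({"i_am_a_rmw_implementation"}))
print(group_depends("i_am_a_msg_pkg", "all", {"std_msgs"}))
```

Since members behave the same way regardless of how the group is used, only the depending package needs to know the `satisfy` value.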

### rosinstall_generator

The `rosinstall_generator` already utilizes all manifests from the `rosdistro` to decide which packages / repositories should be fetched.
This design document doesn't aim to specify what the "best" behavior for this tool is.
It could either match the behavior of the build tool or of the release tool or even something different.

## Information in the Manifest

To provide the necessary metadata for the build and release tools the following information must be provided by the manifest:

* (A) A specific package declares a group dependency and uses a name to identify the group.
The dependency can be of any type specified in [REP 140](http://www.ros.org/reps/rep-0140.html): e.g. `build`, `exec`, `test`, `doc`, etc.

* (B) The semantic of the group dependency needs to be defined (in case (1) `satisfy-at-least-one`, in case (2) `satisfy-all`).

* (C) A list of preferred / excluded packages can be enumerated for each dependency group.

* (D) Any package can declare that it is part of a group by name.

### Possible Format Extensions

The group dependency (A) could be specified in two ways:

* (A.1) Using the existing `*_depend` tags and adding an attribute to the tag to make it a group dependency.
* (A.2) Defining new `*_group_depend` tags which specifically identify group dependencies.

Considering the existing API in `catkin_pkg`, which exposes the different dependencies as members of the `Package` class, (A.1) would require existing code to distinguish the type of the dependency (group vs. no-group).
On the other hand, (A.2) will simply add additional members, which doesn't require existing code to change until it wants to support handling group dependencies.

The semantic of the group dependency (B) can be modeled as an attribute of the group dependency tag.

Each preferred / excluded package (C) name can be listed in child-tags of the dependency group to keep the information atomic.
Contributor

Exclusions and preferences are mentioned here and above, but there are no examples in the Example section, and I didn't see a mention of it in REP 0149. It would be nice to see examples of that.

Member Author

For the scope of this document I would rather keep it as-is with a text description. The XML could look like this and will be specified in the REP once we have a consensus about the direction:

<group_depend>
  group_name
  <prefer>try_this_first</prefer>
  <prefer>try_this_second</prefer>
  <exclude>do_not_use_that</exclude>
</group_depend>


The group membership (D) could be specified in two ways:

* (D.1) Adding a new tag to the `export` section: e.g. `member_of_group`.
* (D.2) Defining a new tag in the `package` section: e.g. `member_of_group`.

#### Example for use case (1)

A package declaring a group dependency:

```
<package>
<name>rmw_implementation</name>
<group_depend satisfy="at-least-one">i_am_a_rmw_implementation</group_depend>
</package>
```

Member Author

Since at-least-one is pretty verbose I was considering any (based on the Python function). Does that sound reasonable to native speakers?

Contributor

Personally, I like any better.

Contributor

'any' and 'all' would make sense. The other thing not discussed here would be 'one', where it would require one and only one from the group, but we can also scope that out. That's sometimes done where there's multiple implementations but they're not side by side installable.

Member

Yeah, any is probably fine. I originally advised for the more specific key phrase, as I felt any could mean zero as well, e.g. in regex * is any and + is at least one.

A package declaring membership of a group:

```
<package>
<name>rmw_fastrtps_cpp</name>
<member_of_group>i_am_a_rmw_implementation</member_of_group>
</package>
```

#### Example for use case (2)

A package declaring a group dependency:

```
<package>
<name>ros1_bridge</name>
<group_depend satisfy="all">i_am_a_msg_pkg</group_depend>
</package>
```

A package declaring membership of a group:

```
<package>
<name>std_msgs</name>
<export>
<member_of_group>i_am_a_msg_pkg</member_of_group>
</export>
</package>
```
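
As a rough illustration, the proposed tags could be read with Python's standard `xml.etree.ElementTree` module. The tag and attribute names are taken from the examples above and are not final; `read_group_info` is a hypothetical helper and not part of `catkin_pkg`.

```python
import xml.etree.ElementTree as ET


def read_group_info(manifest_xml):
    """Extract group dependencies and memberships from a manifest string."""
    root = ET.fromstring(manifest_xml)
    depends = [
        (tag.text.strip(), tag.get("satisfy"))
        for tag in root.findall("group_depend")
    ]
    # Accept membership declared in the package section or under export.
    members = [
        tag.text.strip()
        for tag in root.findall("member_of_group")
        + root.findall("export/member_of_group")
    ]
    return root.findtext("name"), depends, members


manifest = """
<package>
<name>std_msgs</name>
<export>
<member_of_group>i_am_a_msg_pkg</member_of_group>
</export>
</package>
"""
print(read_group_info(manifest))
# ('std_msgs', [], ['i_am_a_msg_pkg'])
```

The sketch accepts both membership placements (D.1 and D.2) so that it works for either example.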

## Extensions for the Package Manifest

The modified schema for the manifest will be specified in [REP 149](https://github.com/ros-infrastructure/rep/pull/138).