diff --git a/.gitignore b/.gitignore index 4c3bc3c4d..82ae0a379 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,5 @@ rep-0000.rst *.html *.pyc -*~ \ No newline at end of file +*~ +.python-version diff --git a/rep-2011.rst b/rep-2011.rst new file mode 100644 index 000000000..05d1ceb29 --- /dev/null +++ b/rep-2011.rst @@ -0,0 +1,948 @@ +REP: 2011 +Title: Evolving Message Types, and Other ROS Interface Types, Over Time +Author: William Woodall , Brandon Ong , You Liang Tan , Kyle Marcey +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 30-Nov-2021 +Post-History: + + +Abstract +======== + +This REP proposes patterns and approaches for evolving message, service, and action ROS interface types over time. + +The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling to detect when types have changes and new tooling to convert between versions. +However, it should be possible to use features provided by different, perhaps future, serialization technologies in addition to the approaches proposed in this REP. + +Specifically, this REP proposes that we provide tooling to convert from old versions of types to newer versions of types on demand, using user-defined transfer functions. +This approach is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. + +This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the full type descriptions on the wire, beyond their name and version, and use that description to introspect the values of it at runtime using reflection. +The type description design is described in detail in REP-2016\ [#rep2016]_. 
+ +This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies. +This is true so long as we have the ability to introspect the messages at runtime without prior knowledge of the type description, which is a feature we also need for generic introspection tools like ``rosbag2`` and ``ros2 topic echo``. + +Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support runtime introspection of messages. +It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification\ [1]_, and an interface similar to DDS-XTypes will likely be adopted in the ROS 2 middleware API, without explicitly relying on it. + +Alternatives are also discussed in more detail. + +Motivation +========== + +Evolving types over time is a necessary part of developing a new system using ROS 2, as well as a necessary part of evolving ROS 2 itself over time. +As needs change, types often need to change as well to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. + +To understand this need, it's useful to consider some examples, like this message definition: + +.. code:: + + # Temperature.msg + uint64 timestamp + int32 temperature + +There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int32 to float64. + +There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields, they may add a new optional field like this (pseudo code): + +.. 
code:: + + # Temperature.msg + uint64 timestamp + int32 temperature + optional float64 temperature_float + +Updated code could use it like this (pseudo code): + +.. code:: + + void on_temperature(const Temperature & temp) { + // ... + double actual_temp = 0.0; + if (temp.temperature_float_is_set()) { + actual_temp = temp.temperature_float(); + } else { + actual_temp = static_cast<double>(temp.temperature()); + } + } + +This is not uncommon to see in projects, and it has the advantage that old data, whether it is from a program running on an older machine or is from old recorded data, can be interpreted by newer code without additional machinery or features beyond the optional field type. +Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, which is very forgiving for developers in a way. +However, as projects grow and the number of incidents like the one described in the example above increases, this kind of approach can add a lot of complexity to the sending and receiving code. + +You can imagine that other serialization features, like inheritance or implicit type conversion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. + +Each of these approaches also requires a more sophisticated message API for the users than what is currently provided by ROS 2. +At the moment the user can access all fields of the message directly, i.e. "member based access" vs "method based access", and that would need to change in some way to use some of these features. + +A different approach would be to simply update the type as follows: + +.. code:: + + # Temperature.msg + uint64 timestamp + float64 temperature + +And also update any code publishing or subscribing to this type at the same time. 
+This is typically what happens right now in ROS 2, and also what happened historically in ROS 1. +This approach is simple and requires no additional serialization features, but obviously doesn't do anything on its own to help developers evolve their system while maintaining support for already deployed code and utilizing existing recorded data. + +However, this REP proposes that with additional tooling these cases can be handled without the code of the application knowing about it. +Consider again the above example, but this time the update to the message was done without backwards compatibility in the message definition. +This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic, because the new subscription expects the new version of the message. +For the purpose of this example, let's call the original message type ``Temperature`` and the one using ``float64`` we'll call ``Temperature'``. +So, if you have rosbag2 publishing ``Temperature`` messages and a program consuming ``Temperature'`` messages they will not communicate, unless you have an intermediate program doing the translation. + +.. code:: + + ┌─────────┐ publishes ┌──────────────┐ publishes ┌─────┐ + │ rosbag2 ├───────────────►│transfer func.├──────────────►│ App │ + └─────────┘ Temperature └──────────────┘ Temperature' └─────┘ + +The "transfer function" can be user-defined, or for simple changes, like changing the field type to a compatible type, it can be done automatically. +We already do something like this for the "ROS 1 to ROS 2 bridge" in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. + +.. TODO:: + + cite the ros1_bridge rules and the rosbag migration rules + +The transfer functions require the ability to have a single application which can interact with both the old and the new versions of a message at the same time. 
+Making this possible requires several new technical features for ROS 2, and some new infrastructure and tooling. +However, keeping the conversion logic contained in these transfer functions has the advantage of keeping both the publishing and subscribing code simple. +That is to say, it keeps both the publishing and subscribing code agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields, e.g. having both a ``temperature`` and ``temperature_float`` in the same message. + +As stated before, problems created by changing these ROS interfaces can usually be solved in more than one way, either using some feature like optional fields or by just breaking compatibility directly. +However, the strategy used usually depends on the features that the serialization technology being used offers. +ROS 2 has special considerations on this topic because it can support different serialization technologies, and though CDR is the default and most common right now, others could be used in the future. +Therefore, it is neither desirable to depend on features of a specific technology, nor is it desirable to suggest patterns that rely on features that only some serialization technologies provide. +In either case, that would tie ROS 2 to specific serialization technologies, and that should be avoided if possible. + +That being said, this proposal will require some specific features from the middleware and serialization technology, but the goal is to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suits them. 
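To make the transfer function idea from the ``Temperature`` example more concrete, the following is a minimal sketch in Python. It is an illustration only and not part of the proposed API: the class names ``TemperatureV1`` and ``TemperatureV2`` and the function ``temperature_v1_to_v2`` are hypothetical stand-ins for ``Temperature``, ``Temperature'``, and a user-defined transfer function.

```python
from dataclasses import dataclass


@dataclass
class TemperatureV1:
    """Stand-in for the original Temperature (int32 temperature)."""
    timestamp: int
    temperature: int


@dataclass
class TemperatureV2:
    """Stand-in for Temperature' (float64 temperature)."""
    timestamp: int
    temperature: float


def temperature_v1_to_v2(old: TemperatureV1) -> TemperatureV2:
    """Hypothetical transfer function: widen the integer field to a float."""
    return TemperatureV2(timestamp=old.timestamp,
                         temperature=float(old.temperature))


def on_temperature(msg: TemperatureV2) -> float:
    """Application callback written only against the new type version."""
    return msg.temperature


# Old recorded data is converted before the application ever sees it.
recorded = TemperatureV1(timestamp=1, temperature=21)
converted = temperature_v1_to_v2(recorded)
```

The callback never mentions the old version; all knowledge of the change lives in the conversion function, which is the property the text above argues for.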
+ +With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as rationales and considered alternatives in the Rationale section and its sub-sections. + +Terminology +=========== + +TODO + +- Type Version + + +Specification +============= + +The proposal is to provide tooling to help users: + +- identify when messages have changed +- configure their system to convert between versions of messages on demand +- write the code needed to convert between types when the conversion is not trivial + +In order to do this, this proposal describes how ROS 2 can be changed to support these tools by: + +- enforcing type compatibility by version + + - by detecting type mismatches via type hashes communicated from other endpoints, and + - warning users when type enforcement is preventing two endpoints from communicating + +- making it possible to publish and subscribe to topics using just the type description + + - even when the type was not available at compile time + - and introspecting the values from a serialized message using just the type description + +This Specification section covers the conceptual overview in more detail, then describes each of the technical changes needed in ROS 2, and then finishes by describing the new tooling that will help users in the aforementioned ways. + +Conceptual Overview +------------------- + +ROS 2 will provide a "type hash" for an interface. +This type hash is used by ROS 2 to determine if the same type name with different type descriptions are being used on the same topic, so that a warning may be logged that endpoints that do not match may not communicate. + +.. 
note:: + + An exception to this rule is that if the underlying middleware has more sophisticated rules for matching types, for example the type has been extended with an optional field, then they may still match + In that case, ROS 2 will defer to the middleware and not produce warnings when the type hashes do not match. + Instead, ROS 2 will rely on the middleware to notify it when two endpoints do not match based on their types not being compatible, so that a warning can be produced. + +When a mismatch is detected, the user can use user-defined or automatically generated generic "transfer functions" to convert between versions of the type until it is in the type version they wish to send or receive. +They can use a tool that will look at a catalogue of available transfer functions to find a single transfer function, or a set of transfer functions, to get from the current type version to the desired type version. + +.. code:: + + ┌───────────────────────┐ + ┌───────────────┐ │ Implicit Conversion │ ┌───────────────┐ + │Message@current├────►│ by ├───►│Message@desired│ + └───────────────┘ │ Generic Transfer Func.│ └───────────────┘ + └───────────────────────┘ + + or + + ┌───────────────────────┐ + ┌───────────────┐ │ Implicit Conversion │ ┌────────────────────┐ + │Message@current├────►│ by ├───►│Message@intermediate│ + └───────────────┘ │ Generic Transfer Func.│ └────────────────────┘ + └───────────────────────┘ + + or + + ┌───────────────────────┐ + ┌───────────────┐ │ User-defined transfer │ ┌───────────────┐ + │Message@current├────►│ function from current ├───...───►│Message@desired│ + └───────────────┘ │to desired/intermediate│ ▲ └───────────────┘ + └───────────────────────┘ │ + │ + possibly other transfer functions + +The tool will start with the current type version and see if it can be automatically converted to the desired type version, or if it is accepted as an input to any user-defined transfer functions or if it can be automatically converted into one of 
the input type versions for the transfer functions. +It will continue to do this until it reaches the desired type version or it fails to find a path from the current to the desired type version. + +.. code:: + + ┌──────────────────────┐ /topic ┌─────────────────────────┐ + │Publisher<Message@ABC>├────────X────────►│Subscription<Message@XYZ>│ + └──────────────────────┘ └─────────────────────────┘ + │ │ │ + remap publisher │ │ │ and add transfer function + ▼ ▼ ▼ + ┌──────────────────────┐ ┌─────────────────────────┐ + │Publisher<Message@ABC>│ │Subscription<Message@XYZ>│ + └─┬────────────────────┘ └─────────────────────────┘ + │ ▲ + │ ┌─────────────────────────────────┐ │ + ✓ /topic/ABC │ Transfer Functions for ABC->XYZ │ /topic ✓ + │ │ │ │ + │ ┌──────────┴──────────────┐ ┌──────────────┴───────┐ │ + └─►│Subscription<Message@ABC>│ │Publisher<Message@XYZ>├──┘ + └──────────┬──────────────┘ └──────────────┬───────┘ + │ │ + └─────────────────────────────────┘ + +Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version than what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. +Tools will assist the user in making these remappings and running the necessary component nodes with the appropriate configurations, either from their launch file or from the command line. + +.. TODO:: + + discuss the implications for large messages and the possibility of having the transfer functions be colocated with either the publisher or subscription more directly than with component nodes and remapping. 
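The search for a chain of transfer functions described above can be sketched as a breadth-first search over a catalogue of available conversions. This is only an illustration under stated assumptions: the catalogue layout (a mapping from a type version to the versions reachable in one step) and the name ``find_conversion_path`` are hypothetical, and the sketch does not distinguish automatic conversions from user-defined ones.

```python
from collections import deque


def find_conversion_path(catalogue, current, desired):
    """Return the list of (from, to) steps linking current to desired.

    Returns an empty list when no conversion is needed and None when no
    chain of transfer functions connects the two type versions.
    """
    if current == desired:
        return []
    came_from = {current: None}  # visited versions and how we reached them
    queue = deque([current])
    while queue:
        version = queue.popleft()
        for nxt in catalogue.get(version, ()):
            if nxt in came_from:
                continue  # already reached by a path that is no longer
            came_from[nxt] = version
            if nxt == desired:
                # Walk back from the desired version to rebuild the chain.
                path = []
                node = nxt
                while came_from[node] is not None:
                    path.append((came_from[node], node))
                    node = came_from[node]
                path.reverse()
                return path
            queue.append(nxt)
    return None


# Hypothetical catalogue: each entry is one available transfer function.
catalogue = {
    "Message@current": ["Message@intermediate"],
    "Message@intermediate": ["Message@desired"],
}
```

Breadth-first search is used here because it returns a chain with the fewest conversion steps; a real tool might prefer other criteria, such as favoring lossless conversions.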
+ +Once the mismatched messages are flowing through the transfer functions, communication should be possible and neither the publishing side nor the subscribing side have any specific knowledge of the conversions taking place or that any conversions are necessary. + +.. TODO:: + + Extend conceptual overview to describe how this will work with Services and Actions. + Services, since they are not sensitive to the many-to-many (many publisher) issue, unlike Topics, and because they do not have as many QoS settings that apply to them, they can probably have transfer functions that are plugins, rather than separate component nodes that repeat the service call, like the ros1_bridge. + Actions will be a combination of topics and services, but will have other considerations in the tooling. + +In order to support this vision, these missing features will need to be added into ROS 2 (which were also mentioned in the introduction): + +- enforcing type compatibility by version +- making it possible to publish and subscribe to topics using just the type description + +These features are described in the following sections. + +Type Version Enforcement +------------------------ + +Enforcing Type Version +^^^^^^^^^^^^^^^^^^^^^^ + +The Type Hash may be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. +Again, whether or not it is used will depend on the underlying middleware and how it determines if types are compatible between endpoints. +Simpler middlewares may not do anything other than check that the type names match, in which case the hash will likely be used as an extra comparison to determine compatibility. +However, in more sophisticated middlewares type compatibility can be determined using more complex rules and by looking at the type descriptions themselves, and in those cases the type hash may not be used to determine matching. 
+ +When creating a publisher or subscription, the caller normally provides: + +- a topic name, +- QoS settings, and +- a topic type name + +Where the topic type name is represented as a string and is automatically deduced based on the type given to the function that creates the entity. +The type may be passed as a template parameter in C++ or as an argument to the function in Python. + +For example, creating a publisher for the C++ type ``std_msgs::msg::String`` using ``rclcpp`` may result in a topic type name ``"std_msgs/msg/String"``. + +All of the above items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. + +Nothing needs to change from the user's perspective, as the type version can be extracted automatically based on the topic type given, either at the ``rcl`` layer or in the ``rmw`` implementation itself. +That is to say, how users create things like publishers and subscription should not need to change, no matter which programming language is being used. + +However, the type hash would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription. +The decision of whether or not to use that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. + +The method for implementing the detection and enforcement of type version mismatches is left up to the middleware. +Some middlewares will have tools to handle this without the type hash and others will implement something like what would be possible in the ``rcl`` and above layers using the type hash. +By keeping this a detail of the ``rmw`` implementation, we allow the ``rmw`` implementations to make optimizations where they can. 
+ +Recommended Strategy for Enforcing that Type Versions Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type hash would only be used for debugging and for storing in recordings. + +However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type hash with double underscores and then use that as the type name given to the underlying middleware. +For example, a type name using this approach may look like this: + +.. code:: + + sensor_msgs/msg/Image__RIHS01_XXXXXXXXXXXXXXXXXXXX + +This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or matchmaking process. + +However, one downside with this approach is that it makes interoperation between ROS 2 and the "native" middleware more difficult, as appending the type hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. + +Alternative Strategy for Enforcing that Type Versions Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes the recommended strategy would interfere with features of the middleware that allow for more complex type compatibility rules, or would otherwise interfere with the functioning of the underlying middleware. +In these cases, it is appropriate to skip the recommended strategy and instead delegate the matching process and notification entirely to the middleware. 
+ +However, in order to increase compatibility between rmw implementations, it is recommended to follow the recommended strategy whenever possible, even if the type name is not used in determining if two endpoints will match, i.e. in the case that the underlying middleware does something more sophisticated to determine type compatibility but ignores the type name itself. + +This recommendation is particularly useful in the case where a DDS based ROS 2 endpoint is talking to a DDS-XTypes based ROS 2 endpoint. +The DDS-XTypes based endpoint then has a chance to "gracefully degrade" to interoperate with the basic DDS based ROS 2 endpoint. +This would not be the case if the DDS-XTypes based ROS 2 endpoint did not include the type hash in the type name, as is suggested with the recommended strategy. + +.. TODO:: + + We need to confirm whether or not having a different type name prevents DDS-XTypes from working properly. + We've done some experiments, but we need to summarize the results and confirm the recommendation in the REP specifically geared towards DDS. + +Notifying the ROS Client Library +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +One potential downside to delegating type matching to the rmw implementation is that detecting the mismatch is more complicated. +If ROS 2 is to provide users a warning that two endpoints will not communicate due to their types not matching, there must be a way for the middleware to notify the ROS layer when a topic is not matched due to the type incompatibility. +As some of the following sections describe, it might be that the rules by which the middleware decides on type compatibility are unknown to ROS 2, and so the middleware has to indicate when matches are and are not made. + +If the middleware just uses the type name to determine compatibility, then the rmw implementation can compare type hashes, and if they do not match between endpoints then the rmw implementation can notify ROS 2, and a warning can be produced. 
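For the simple case just described, where only type names are matched, the combination of the recommended name mangling and a hash comparison might be sketched as follows. This is a hedged illustration in Python rather than ``rmw`` code: the ``Endpoint`` structure, the function names, the warning text, and the shortened hash values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical view of what discovery knows about an endpoint."""
    gid: str
    topic: str
    type_name: str
    type_hash: str


def mangled_type_name(endpoint):
    """Join the type name and type hash with double underscores."""
    return f"{endpoint.type_name}__{endpoint.type_hash}"


def check_type_compatibility(publisher, subscription):
    """Return None when the types agree, else a warning message."""
    if publisher.topic != subscription.topic:
        return None  # endpoints on different topics are not compared
    if mangled_type_name(publisher) == mangled_type_name(subscription):
        return None
    return (
        f"type mismatch on '{publisher.topic}': "
        f"{publisher.gid} offers {mangled_type_name(publisher)}, "
        f"{subscription.gid} requests {mangled_type_name(subscription)}"
    )


# Placeholder hashes; real type hashes are much longer.
pub = Endpoint("gid-1", "/camera", "sensor_msgs/msg/Image", "RIHS01_aaaa")
sub_same = Endpoint("gid-2", "/camera", "sensor_msgs/msg/Image", "RIHS01_aaaa")
sub_other = Endpoint("gid-3", "/camera", "sensor_msgs/msg/Image", "RIHS01_bbbb")
```

In an actual implementation the returned message would presumably be delivered through the event mechanism rather than as a string, so that both the console warning and a user callback can be driven from it.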
+ +Either way, to facilitate this notice, the ``rmw_event_type_t`` shall be extended to include a new event called ``RMW_EVENT_OFFERED_TYPE_INCOMPATIBLE``. +Related functions and structures will also be updated so that the event can be associated with specific endpoints. + +Notes for Implementing the Recommended Strategy with DDS +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The DDS standard provides an ``on_inconsistent_topic()`` method on the ``ParticipantListener`` class as a callback function which is called when an ``INCONSISTENT_TOPIC`` event is detected. +This event occurs when the topic type does not match between endpoints, and can be used for this purpose, but at the time of writing (July 2022), this feature is not supported across all of the DDS vendors that can be used by ROS 2. + +Interactions with DDS-XTypes or Similar Implicit Middleware Features +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When using DDS-XTypes, type compatibility is determined through sophisticated and configurable rules, allowing for things like extensible types, optional fields, implicit conversions, and even inheritance. +Which of these features is supported for use with ROS is out of scope for this REP, but if any of them are in use, then it may be possible for two endpoints to match and communicate even if their ROS 2 type hashes do not match. + +In this situation the middleware is responsible for communicating to the rmw layer when an endpoint will not be matched due to type incompatibility. +The ``INCONSISTENT_TOPIC`` event in DDS applies to DDS-XTypes as well, and should be useful in fulfilling this requirement. 
+ +Run-Time Interface Reflection +----------------------------- + +Run-Time Interface Reflection allows access to the data in the fields of a serialized message buffer when given: + +- the serialized message as a ``rmw_serialized_message_t``, basically just an array of bytes as a ``rcutils_uint8_array_t``, +- the message's type description, e.g. received from the ``GetTypeDescription`` service or from a bag file, and +- the serialization format, name and version, which was used to create the serialized message, e.g. ``XCDR2`` for ``Extended CDR encoding version 2`` + +From these inputs, we should be able to access the fields of the message from the serialized message buffer using some programmatic interface, which allows you to: + +- list the field names +- list the field types +- access fields by name or index + +.. code:: + + message_buffer ─┐ ┌───────────────────────────────┐ + │ │ │ + message_description ─┼──►│ Run-Time Interface Reflection ├───► Introspection API + │ │ │ + serialization_format ─┘ └───────────────────────────────┘ + +Given that the scope of inputs and expected outputs is so limited, this feature should ideally be implemented as a separate package, e.g. ``rcl_serialization``, that can be called independently by any downstream packages that might need Run-Time Interface Reflection, e.g. introspection tools, rosbag transport, etc. +This feature can then be combined with the ability to detect type mismatches and obtain type descriptions to facilitate communication between nodes of otherwise incompatible types. + +Additionally, it is important to note that this feature is distinct from ROS 2's existing "dynamic" type support (``rosidl_typesupport_introspection_c`` and ``rosidl_typesupport_introspection_cpp``). +The ``rosidl_typesupport_introspection_c*`` generators generate code at compile time for known types that provides reflection for those types. 
+This new feature, Run-Time Interface Reflection, will support reflection without generated code at compile time, instead dynamically interpreting the type description to provide this reflection. + +.. TODO:: + + Determine if the generated introspection API needs to continue to exist. + It might be that we just remove it entirely in favor of the new system described here, or at least merge them, as their API (after initialization) will probably be the same. + +.. TODO:: + + (wjwwood) I'm realizing now that we probably need to separate this section into two parts, first the reflection API used by the user and the rmw implementation, and then second the rmw implementation specific part, which we could call "Run-Time Type Support"? + It would be called such because it would be something like "TypeDescription in -> ``rosidl_message_type_support_t`` out"... + This second part is needed to make truly generic subscriptions and publishers, which until now has been kind of assumed in this section about reflection. + +.. TODO:: + + This section needs to be updated to be inclusive to at least Services, and then we can also mention Actions, though they are a combination of Topics and Services and so don't need special support probably. + +Plugin Interface +^^^^^^^^^^^^^^^^ + +As Run-Time Interface Reflection is expected to work across any serialization format, the Run-Time Interface Reflection interface needs to be extensible so that the necessary serialization libraries can be loaded to process the serialized message. +Serialization format support in this case will be provided by writing plugins that wrap the serialization libraries that can then provide the Run-Time Interface Reflection feature with the needed facilities. 
+Therefore, when initializing a Run-Time Interface Reflection instance it will: + +- enumerate the supported serialization library plugins in the environment, +- match the given serialization format to appropriate plugins, +- select a plugin if more than one matches the criteria, and then +- dynamically load the plugin for use + +These serialization library plugins must implement an interface which provides methods to: + +- determine if the plugin can be used for a given serialization type and version, +- provide an API for reflection of a specific type, given the type's description, + + - mainly including the parsed ``TypeDescription`` instance, but also + - perhaps the original ``.idl`` / ``.msg`` text, and + - perhaps other extra information, + +- provide a ``rosidl_message_type_support_t`` instance given similar information from the previous point + +In particular, providing information beyond the ``TypeDescription``, like the ``.idl`` / ``.msg`` text and the serialization library that was used, may be necessary because there might be serialization library or type support specific steps or considerations (e.g. name mangling or ROS specific namespacing) that would not necessarily be captured in the ``.idl`` / ``.msg`` file. + +Dealing with Multiple Applicable Plugins for A Serialization Format +""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +In the case where there exists multiple applicable plugins for a particular serialization format (e.g. when a user's machine has a plugin for both RTI Connext's CDR library and Fast-CDR), the plugin matching should follow this priority order: + +- a user specified override, passed to the matching function, or +- a default defined in the plugin matching function, or else +- the first suitable plugin in alphanumeric sorting order + +.. 
TODO:: + + (wjwwood) We should mention here that the (implicitly read-only) reflection described here is just one logical half of what we could do with the type description. + We could also provide the ability to dynamically define types and/or write to types using the reflection. + + For example, imagine a tool that subscribes to any type (that's just using the RTIR) but then modifies one of the fields by name and then re-publishes it. + That tool would need the ability to mutate an instance using reflection then serialize it to a buffer. + You couldn't just iterate with an existing buffer easily, because it would potentially need to grow the buffer. + + Also, consider a "time stamping" tool that subscribes to any message, and then on-the-fly defines a new message that has a timestamp field and a field with the original message in it, fills that out and publishes it. + That would need the ability to both create a new type from nothing (maybe no more than creating a ``TypeDescription`` instance) as well as a read-write interface to the reflection. + We don't have to support these things in this REP, but we should at least mention them and state if it is in or out of the scope of the REP. + +Tools for Evolving Types +------------------------ + +The previous sections of the specification have all been working up to supporting the new tools described in this section. + +The goal of these new tools is to help users reason about type versions of ROS interfaces, define transfer functions between two versions of a type when necessary, and put the transfer functions to use in their system to handle changes in types that have occurred. + +These tools will come in the form of new APIs, integrations with the build system, command-line tools, and integration with existing concepts like launch files. + +This section will describe a vision of what is possible with some new tools, but it should not be considered completely prescriptive. 
+That is to say, after this REP is accepted, tools described in this section may continue to evolve and improve, and even more tools not described here may be added to enable more things. +As always, consider the latest documentation for the various pieces of the reference implementation to get the most accurate details. + +Notifying Users when Types Do Not Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a warning to users when two endpoints (publisher/subscription or Service server/client, etc.) on the same topic (or service/action) are using different type names or versions and therefore will not communicate. +This warning should be a console logging message, but it should also be possible for the user to get a callback in their own application when this occurs. +How this should happen in the API is described in the Type Version Enforcement section, and it uses the "QoS Event" part of the API. + +The warning should include important information, like the GUID of the endpoints involved, the type name(s) and type hash(es), and of course the topic name. +These pieces of information can then be used by the user, with the above mentioned changes to the command line tools, to investigate what is happening and why the types do not match. +The warning may even suggest how to compare the types using the previously described ``ros2 topic compare`` command-line tool. + +Tools for Determining if Two Type Versions are Convertible +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a way for a user to determine if it is possible to convert between two versions of a type without writing a transfer function. +This tool would look at the two type descriptions and determine if a conversion can be done automatically with a generated transfer function. +This will help the user know if they need to write a transfer function or check if one already exists. 
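As a rough illustration of such a check, the sketch below compares two simplified type descriptions and reports whether a conversion could be generated automatically. The rules used here are assumptions made for the example only: type descriptions are reduced to a mapping of field name to type name, fields must keep their names, and the ``WIDENING`` table is a hypothetical set of allowed implicit conversions, not one defined by this REP.

```python
# Hypothetical set of field type changes treated as automatically convertible.
WIDENING = {
    ("int32", "int64"),
    ("int32", "float64"),
    ("float32", "float64"),
}


def is_automatically_convertible(old_fields, new_fields):
    """Return True if every field is unchanged or changed by an allowed widening."""
    if old_fields.keys() != new_fields.keys():
        return False  # added, removed, or renamed fields need a user-defined function
    return all(
        old_fields[name] == new_fields[name]
        or (old_fields[name], new_fields[name]) in WIDENING
        for name in old_fields
    )


# The Temperature example from the Motivation section.
temperature_old = {"timestamp": "uint64", "temperature": "int32"}
temperature_new = {"timestamp": "uint64", "temperature": "float64"}
temperature_renamed = {"timestamp": "uint64", "temperature_float": "float64"}
```

With these assumptions the ``int32`` to ``float64`` change is reported as convertible, while the reverse direction and the renamed field are not, which is the situation where the user would need to write or find a transfer function.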
+ +This tool might look something like this: + +- ``ros2 interface convertible --from-remote-node --to-local-type `` + +This tool would query the type description from a remote node, and compare it to the type description of the second type version found locally. +If the two types can be converted without writing any code, then this tool would indicate that on ``stdout`` and through its return code. + +There could be ``--from-local-type`` and ``--to-remote-node`` options, as well as others. + +.. TODO:: + + (wjwwood) we need to come back to this section when we start working on the reference implementation, as more details will be clearer then + +Tools for Writing User-Defined Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To help users write their own transfer functions, when conversions cannot be done automatically, we need to provide them with build system support and API support in various languages. + +The transfer functions will be defined (at least) as a C++ plugin, in the style of component nodes, rviz display plugins, and other instances. +These transfer function plugins can then be loaded and chained together as needed inside of a component node. + +.. TODO:: + + (wjwwood) we need to figure out if this is how we're gonna handle Services, or if instead they will be plugins rather than component nodes, but component nodes with remapping solves a lot of QoS related issues that arise when you think about instead making the conversions happen as a plugin on the publisher or subscription side with topics. + +At the very least we should: + +- provide a convenient way to build and install transfer functions from CMake projects +- provide an API to concisely define transfer functions in C/C++ +- document how transfer functions are installed, discovered, and run + +Documenting how this works will allow support for other build systems and programming languages to be added in the future.
+It will also help users who do not wish to use our helper functions (and the dependencies that come with them) in their projects, e.g. users who would like to use plain CMake and not depend on things like ``ament_cmake``. + +The process will look something like this: + +- compile the transfer function into a library +- put an entry into the ``ament_index`` which includes the transfer function's details, like which type and versions it converts between, and which library contains it + +Putting it into a library allows us to later load the transfer function, perhaps alongside other transfer functions, into a single component node which subscribes to the input type topic and publishes to the output type topic. + +The interface for a transfer function will look like a modified Subscription callback, receiving a single ROS message (or request or response in the case of Services) as input and returning a single ROS message as output. +The input will use the Run-Time Interface Reflection API and therefore will not be a standard message structure, e.g. ``std_msgs::msg::String`` in C++, but the output may be a concrete structure if that type is available when compiling the transfer function. + +The transfer function API for C++ may look something like this: + +.. code:: + + void + my_transfer_function( + const DynamicMessage & input, + const MessageInfo & input_info, // type name/hash, etc. + DynamicMessage & output, + const MessageInfo & output_info) + { + // ... conversion details + } + + REGISTER_TRANSFER_FUNCTION(my_transfer_function) + +Registering a transfer function in an ``ament_cmake`` project might look like this: + +..
code:: + + create_transfer_function( + src/my_transfer_function.cpp + FUNCTION_NAME my_transfer_function + FROM sensor_msgs/msg/Image RIHS01_XXXXXXXXXXXXXXXXX123 + TO sensor_msgs/msg/Image RIHS01_XXXXXXXXXXXXXXXXXabc + ) + +This CMake function would create the library target, link the necessary libraries, and install any files in the ``ament_index`` needed to make the transfer function discoverable by the tools. +This example is for the simplest case, which we should make easy, but other cases like supporting multiple transfer functions per library, creating the target manually, or even supporting other programming languages would require more complex interfaces too. + +.. TODO:: + + (wjwwood) come back with more details of this interface as the reference implementation progresses + +Tools for Interacting with Available Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a way to list and get information about the available transfer functions in your local workspace. +This tool would query the ``ament_index`` and list the available transfer functions found therein. + +The tool might look like this: + +- ``ros2 interface transfer_functions list`` +- ``ros2 interface transfer_functions info `` + +The tool might also have options to filter based on the type name, package name providing the type, package name providing the transfer function (not necessarily the same as the package which provides the type itself), etc. + +.. TODO:: + + (wjwwood) not clear to me if the transfer functions should have names or not. What if you have more than one transfer function for a type version pair, e.g. more than one conversion for sensor_msgs/msg/Image from ABC to XYZ. I'm not sure why this would happen, but it is technically possible since the packages register them separately.
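
To illustrate the listing and filtering described in this section, and the open question of more than one transfer function being registered for the same pair of type versions, the following is a minimal C++ sketch.
The ``TransferFunctionEntry`` structure, its fields, and the ``filter_by_type`` and ``find_duplicates`` helpers are all hypothetical and are not part of any existing API; a real tool would presumably populate the entries from the ``ament_index``.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record of one installed transfer function, as it might be
// read from an ament_index entry. All field names are assumptions.
struct TransferFunctionEntry
{
  std::string type_name;     // e.g. "sensor_msgs/msg/Image"
  std::string from_hash;     // type hash of the input version
  std::string to_hash;       // type hash of the output version
  std::string package_name;  // package providing the transfer function
  std::string library_path;  // library containing the transfer function
};

// Return only the entries that convert the given type.
std::vector<TransferFunctionEntry>
filter_by_type(
  const std::vector<TransferFunctionEntry> & entries,
  const std::string & type_name)
{
  std::vector<TransferFunctionEntry> result;
  for (const auto & entry : entries) {
    if (entry.type_name == type_name) {
      result.push_back(entry);
    }
  }
  return result;
}

// Key identifying a conversion: (type name, from hash, to hash).
using VersionPair = std::tuple<std::string, std::string, std::string>;

// Group entries by conversion and keep only the groups with more than one
// entry, i.e. conversions registered by more than one package or library.
std::map<VersionPair, std::vector<TransferFunctionEntry>>
find_duplicates(const std::vector<TransferFunctionEntry> & entries)
{
  std::map<VersionPair, std::vector<TransferFunctionEntry>> groups;
  for (const auto & entry : entries) {
    const VersionPair key{entry.type_name, entry.from_hash, entry.to_hash};
    groups[key].push_back(entry);
  }
  for (auto it = groups.begin(); it != groups.end(); ) {
    if (it->second.size() < 2) {
      it = groups.erase(it);
    } else {
      ++it;
    }
  }
  return groups;
}
```

A ``list`` command could print every entry, an ``info`` command could print one, and a non-empty result from ``find_duplicates`` is the situation in which the tools would need names or some other way to tell the candidates apart.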
+ +Tools for Using Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Additionally, there should be a tool which will look at the available transfer functions, including the possible automatic conversions, and start the node that will do the conversions if there exists a set of transfer functions which would convert between two different versions of a type. + +It might look like this: + +.. code:: + + ros2 interface convert_topic_types \ + --from /topic/old \ + --to /topic \ + --component-container + +This tool would find the set of transfer functions between the types on ``/topic/old`` and ``/topic``, assuming they are the same type at different versions and that the necessary transfer functions exist, and load them into a component container as a component node. + +The tool could run a stand-alone node instance if provided a ``--stand-alone`` option instead of the ``--component-container`` option. + +It could also have options like ``--check`` which would check if it could convert the types or not, giving detailed information about why it cannot convert between them if it is not possible. +This could help users understand what transfer functions they need to write to bridge the gap. + +This tool would also potentially need to let the user decide which path to take if there exist multiple chains of transfer functions from one version to the other. +For example, if the types are ``Type@ABC`` and ``Type@XYZ``, and there are transfer functions for ``Type@ABC -> Type@DEF``, ``Type@DEF -> Type@XYZ``, ``Type@ABC -> Type@XYZ``, and possibly other paths, then it might need help from the user to know which path to take. +By default it might choose the shortest path and, if there is a tie, pick one arbitrarily. + +There would be similar versions of this tool for Services and Actions as well.
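
The default path selection mentioned above could be sketched as a breadth-first search over the available conversions, which finds a chain with the fewest transfer functions.
This is only an illustration under stated assumptions: the ``Conversion`` structure and the ``shortest_chain`` function are hypothetical names, versions are identified by plain strings, and ties are broken by the order in which conversions are listed.

```cpp
#include <algorithm>
#include <map>
#include <queue>
#include <string>
#include <vector>

// One available conversion between two versions of the same type.
// Hypothetical structure, for illustration only.
struct Conversion
{
  std::string from_hash;
  std::string to_hash;
};

// Breadth-first search over the conversions. Returns the versions visited,
// from `start` to `goal`, using the fewest conversions, or an empty vector
// if no chain of conversions connects them.
std::vector<std::string>
shortest_chain(
  const std::vector<Conversion> & conversions,
  const std::string & start,
  const std::string & goal)
{
  // Adjacency list: version -> versions reachable with one conversion.
  std::map<std::string, std::vector<std::string>> edges;
  for (const auto & conversion : conversions) {
    edges[conversion.from_hash].push_back(conversion.to_hash);
  }

  // For each visited version, the version it was first reached from.
  std::map<std::string, std::string> parent;
  std::queue<std::string> pending;
  parent[start] = start;
  pending.push(start);

  while (!pending.empty()) {
    const std::string current = pending.front();
    pending.pop();
    if (current == goal) {
      break;
    }
    for (const auto & next : edges[current]) {
      if (parent.count(next) == 0) {
        parent[next] = current;
        pending.push(next);
      }
    }
  }

  if (parent.count(goal) == 0) {
    return {};
  }

  // Walk back from the goal to the start, then reverse.
  std::vector<std::string> chain;
  for (std::string version = goal; version != start; version = parent[version]) {
    chain.push_back(version);
  }
  chain.push_back(start);
  std::reverse(chain.begin(), chain.end());
  return chain;
}
```

With the conversions from the example above, the direct ``Type@ABC -> Type@XYZ`` conversion would be preferred over the two-step chain through ``Type@DEF``; a real tool might still let the user override that choice.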
+ +Integration with Launch +^^^^^^^^^^^^^^^^^^^^^^^ + +With the previously described tool ``ros2 interface convert_topic_types``, you could simply remap topics and execute an instance of it with launch: + +.. code:: + + + + + + + + + +However, we can provide some "syntactic sugar" to make it easier, which might look like this: + +.. code:: + + + + + + + + +That syntactic sugar can do a lot of things, like: + +- remap the topic to something new, like ``/topic/XXX`` +- run the node it is enclosed in, then wait for the topic to come up to check the version +- run any transfer functions in a node to make the conversion from ``/topic/XXX`` to ``/topic`` + +This could also be extended to support component nodes, rather than stand-alone nodes, etc. + +Integration with ros2bag +^^^^^^^^^^^^^^^^^^^^^^^^ + +Similar to the integration with launch, you could run ``ros2 bag play ...`` and the ``ros2 interface convert_topic_types ...`` separately, but we could also provide an option to rosbag itself, which might look like this: + +.. code:: + + ros2 bag play /path/to/bag --convert-to-local-type /topic + +That option would handle the conversion on the fly using the same mechanisms as the command line tool. + +Rationale +========= + +This section captures further rationales for why the specification is the way it is, and also offers summaries of alternatives considered but not selected for various parts of the specification. + +Type Version Enforcement +------------------------ + +Alternatives +^^^^^^^^^^^^ + +Evolving Message Definition with Extensible Type +"""""""""""""""""""""""""""""""""""""""""""""""" + +When defining the ``.idl`` msg file, a user can choose to apply annotations to the message definition (DDS XTypes spec v1.3: 7.3.1.2 Annotation Language). +Evolving a message type can be achieved by leveraging optional fields and inheritance. +For example, the ``Temperature.idl`` below uses ``@optional`` and ``@extensibility`` in the message definition. + +..
code:: + + @extensibility(APPENDABLE) + struct Temperature + { + unsigned long long timestamp; + long temperature; + @optional double temperature_float; + }; + +Furthermore, an initial test evolving messages with FastDDS, Cyclone, and Connext middleware implementations shows that ``@appendable`` and ``@optional`` are implemented in Cyclone and Connext, but not FastDDS (as of Jul 2022). + +.. TODO:: + + (wjwwood) we need to follow up on this + +Handle Detection of Version Mismatch "Above" rmw Layer +"""""""""""""""""""""""""""""""""""""""""""""""""""""" + +We can choose to utilize ``USER_DATA`` QoS to distribute the type hash during the discovery phase. +The type hash for each participant will then be accessible across all available nodes. By getting the hash through ``user_data`` via the ``rmw`` layer, type version mismatches can be detected in a similar way. + +Prevent Communication of Mismatched Versions "Above" rmw Layer +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +TODO + +Run-Time Interface Reflection +----------------------------- + +TODO + +Plugin Matching and Loading API Example +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following is an example of what this plugin matching and loading interface could look like, defining new ``rcl`` interfaces; with a plugin wrapping FastCDR v1.0.24 for serialization of ``sensor_msgs/msg/LaserScan`` messages: + +..
code:: + + // Suppose LaserScanDescription reports that it uses FastCDR v1.0.24 for its serialization + rcl_runtime_introspection_description_t LaserScanDescription = node->get_type_description("/scan"); + + rcl_type_introspection_t * introspection_handle; + introspection_handle->init(); // Locate local plugins here + + // Plugin name: "fastcdr_v1_0_24" + const char * plugin_name = introspection_handle->match_plugin(LaserScanDescription.get_serialization_format()); + rcl_serialization_plugin_t * plugin = introspection_handle->load_plugin(plugin_name); + + // If we wanted to force the use of MicroCDR instead + introspection_handle->match_plugin(LaserScanDescription.get_serialization_format(), "microcdr"); + +Run-Time Interface Reflection API Example +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following is an example of what the introspection API could look like. +This example will show a read-only interface. + +It should comprise several components: + +- a handler for the message buffer, to handle pre-processing (e.g. decompression) +- a handler for the message description, to keep track of message field names of arbitrary nesting level +- handler functions for message buffer introspection + +Also, this example uses the LaserScan message definition: https://github.com/ros2/common_interfaces/blob/foxy/sensor_msgs/msg/LaserScan.msg + +.. TODO:: + + (methylDragon) Add a reference somehow? + +First, the message buffer handler: + +.. code:: + + struct rcl_buffer_handle_t { + const void * buffer; // The buffer should not be modified + + const char * serialization_type; + rcl_serialization_plugin_t * serialization_plugin; + rcl_runtime_introspection_description_t * description; // Convenient to have + + // And some examples of whatever else might be needed to support deserialization or introspection...
+ void * serialization_impl; + }; + +The message buffer handler should allocate new memory if necessary, or store a pointer to the message buffer otherwise in its ``buffer`` member. + +Then, functions should be written that allow for convenient traversal of the type description tree. +These functions should allow a user to get the field names and field types of the top level type, as well as from any nested types. + +.. code:: + + struct rcl_field_info_t { // Mirroring Field + const char * field_name; // This should be an absolute address (e.g. "header.seq", instead of "seq") + + uint8_t type; + const char * nested_type_name; // Populated if the type is not primitive + }; + + // Get descriptions + rcl_runtime_introspection_description_t LaserScanDescription = node->get_type_description("/scan"); + rcl_runtime_introspection_description_t HeaderDescription = node->get_referenced_description(LaserScanDescription, "Header"); + + // All top-level fields from description + rcl_field_info_t ** fields = get_field_infos(&LaserScanDescription); + + // A single field from description + rcl_field_info_t * header_field = get_field_info(&LaserScanDescription, "header"); + + // A single field from a referenced description + rcl_field_info_t * stamp_field = get_field_info(&HeaderDescription, "stamp"); + + // A nested field from top-level description + rcl_field_info_t * nested_stamp_field = get_field_info(&LaserScanDescription, "header.stamp"); + +Finally, there should be functions to obtain the data stored in the message fields. +This could be by value or by reference, depending on what the serialization library supports, for different types. + +There minimally needs to be a family of functions to obtain data stored in a single primitive message field, no matter how deeply nested it is. +These need to be created for each primitive type. + +The rest of the type introspection machinery can then be built on top of that family of functions, in layers higher than the C API. + +..
code:: + + rcl_buffer_handle_t * scan_buffer = node->get_processed_buffer(some_raw_buffer); + + // Top-level primitive field + get_primitive_field_float32(scan_buffer, "scan_time"); + + // Nested primitive field + get_primitive_field_uint32_seq(scan_buffer, "header.seq"); + + // Nested primitive field sequence element (overloaded) + get_field_seq_length(scan_buffer, "header.seq"); // Support function + get_primitive_field_uint32(scan_buffer, "header.seq", 0); + +If we attempt to do the same by reference, the plugin might decide to allocate new memory for the pointer, or return a pointer to existing memory. + +.. code:: + + // Nested primitive field + get_primitive_field_uint32_seq_ptr(scan_buffer, "header.seq"); + + // Be sure to clean up any dangling pointers + finalize_field(some_field_data_ptr); + +The following should be error cases: + +- accessing field data as incorrect type +- accessing or introspecting incorrect/nonexistent field names + +.. TODO: (methylDragon) Are there more cases? It feels like there are... 
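
The two error cases listed above can be sketched with a toy, read-only accessor.
This is not the proposed ``rcl`` API: ``ToyMessage``, ``FieldError``, and ``get_primitive_field`` are hypothetical names, and a ``std::map`` of ``std::variant`` values stands in for a deserialized message buffer purely for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

// Toy stand-in for a message whose fields are addressed by absolute name,
// e.g. "header.seq". Hypothetical, not the proposed rcl API.
using FieldValue = std::variant<float, std::uint32_t, std::string>;
using ToyMessage = std::map<std::string, FieldValue>;

enum class FieldError
{
  kOk,
  kNoSuchField,  // accessing a nonexistent field name
  kWrongType     // accessing field data as an incorrect type
};

// Read a single field by name as type T, reporting both error cases
// instead of silently reinterpreting the stored data.
template<typename T>
FieldError
get_primitive_field(const ToyMessage & message, const std::string & name, T & out)
{
  const auto it = message.find(name);
  if (it == message.end()) {
    return FieldError::kNoSuchField;
  }
  const T * value = std::get_if<T>(&it->second);
  if (value == nullptr) {
    return FieldError::kWrongType;
  }
  out = *value;
  return FieldError::kOk;
}
```

A C interface would more likely report these conditions through return codes on each ``get_primitive_field_*`` function, but the distinction between the two failures would be the same.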
+ +The following lifetime requirements should also hold: + +- the raw message buffer should outlive the ``rcl_buffer_handle_t``, since it is not guaranteed that the buffer handle will allocate new memory +- the ``rcl_buffer_handle_t`` should outlive any returned field data pointers, since it is not guaranteed that the serialization plugin will allocate new memory +- however, ``rcl_field_info_t`` objects **do not** have any lifecycle dependencies, since they are merely descriptors + +Alternatives +^^^^^^^^^^^^ + +Name of Run-Time Interface Reflection +""""""""""""""""""""""""""""""""""""" + +Other names were considered, like "Runtime Type Introspection", but this name was selected for three main reasons: + +- to avoid confusion with C++'s Run-time type information (RTTI)\ [3]_, +- to show it was more than RTTI, but instead was also reflection, + + - like how the RTTR C++ library\ [4]_ differs from normal RTTI, and + +- to show that it deals not just with any "type" but specifically ROS's Interfaces + + +Backwards Compatibility +======================= + +TODO + + +Feature Progress +================ + +Supporting Feature Development: + +- Type Version Enforcement + + - [ ] Add rmw API for notifying user when a topic endpoint was not matched due to type mismatch + + - [ ] Detect when Message versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Detect when Service versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Detect when Action versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + +- Run-time Interface Reflection: + + - [ ] Prototype "description of type" -> "accessing data fields from buffer" for each rmw vendor: + + - [ ] Fast-DDS + - [ ] CycloneDDS + - [ ] RTI Connext Pro + + - [ ] Prototype "description of type" -> "create datareaders/writers" for each rmw vendor:
+ + - [ ] Fast-DDS + - [ ] CycloneDDS + - [ ] RTI Connext Pro + + - [ ] Implement reflection API in ``rcl_serialization`` package, providing (de)serialization with just a description + - [ ] Implement single "plugin" using Fast-CDR (tentatively) + - [ ] Implement plugin system, allowing for multiple backends for (X)CDR + - [ ] Implement at least one other (X)CDR backend, perhaps based on RTI Connext Pro + + - [ ] Add "description of type" -> ``rosidl_message_type_support_t`` to ``rmw`` API + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Add "description of type" -> ``rosidl_service_type_support_t`` to ``rmw`` API + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Update APIs for rclcpp to support using TypeDescription instances to create entities: + + - [ ] Topics + - [ ] Services + - [ ] Actions + + - [ ] Update APIs for rclpy to support using TypeDescription instances to create entities: + + - [ ] Topics + - [ ] Services + - [ ] Actions + +Tooling Development: + +- [ ] Command-line tool for determining if two versions of a type are convertible + +- [ ] CMake and C++ APIs for writing transfer functions, including usage documentation +- [ ] Command-line tools for interacting with user-defined transfer functions +- [ ] Command-line tool for starting a conversion node between two versions of a type + +- [ ] Integration with launch, making it easier to set up conversions in launch files + +- [ ] Integration with rosbag2 + + - [ ] Use recorded type description metadata to create subscriptions without needing the type to be built locally + - [ ] Use recorded metadata to create publishers from descriptions in bag + + +References +========== + +.. [#rep2016] REP 2016: ROS 2 Interface Type Descriptions + (https://www.ros.org/reps/rep-2016.html) + +.. [1] DDS-XTYPES 1.3 + (https://www.omg.org/spec/DDS-XTypes/1.3/About-DDS-XTypes/) + +.. 
[2] IDL - Interface Definition and Language Mapping + (http://design.ros2.org/articles/idl_interface_definition.html) + +.. [3] Run-time type information + (https://en.wikipedia.org/wiki/Run-time_type_information) + +.. [4] RTTR (Run Time Type Reflection): An open source library, which adds reflection to C++ + (https://www.rttr.org/) + + +Copyright +========= + +This document has been placed in the public domain. + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: