From df394d97c8433b125a722de8b2456ac80b0ee34d Mon Sep 17 00:00:00 2001 From: William Woodall Date: Tue, 15 Feb 2022 15:55:21 -0800 Subject: [PATCH 01/22] WIP Signed-off-by: William Woodall --- rep-2010.rst | 162 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 162 insertions(+) create mode 100644 rep-2010.rst diff --git a/rep-2010.rst b/rep-2010.rst new file mode 100644 index 000000000..362e380bc --- /dev/null +++ b/rep-2010.rst @@ -0,0 +1,162 @@ +REP: 2010 +Title: Evolving Message Types, and Other ROS Interface Types, Over Time +Author: William Woodall +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 30-Nov-2021 +Post-History: + + +Abstract +======== + +This REP proposes patterns and approaches for evolving message, service, and action ROS interface types over time. + +The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling. +However, technologies provided by different, perhaps future, serialization technologies can always be used in addition. + +Specifically, this REP proposes that interpreting older versions of the message at runtime, using its type information, and converting it to the current version of the message using user defined transfer functions, is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. +This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the type information of the messages on the wire, beyond their name and version, and use that to dynamically interpret it at runtime. 
+ +This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies so long as we have the ability to introspect the messages at runtime without prior knowledge of the type information, which a feature we also need for generic introspection tools. + +Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support dynamic interpretation of messages at runtime. +It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and will likely use it to implement the needed interfaces in the ROS 2 middleware API. + +Alternatives are also discussed in more detail. + +Motivation +========== + +Evolving types over time is a necessary part of developing a new system using ROS 2 as well as a necessary part of evolving ROS 2 itself over time. +As needs change types often need to change to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. + +To understand this need, it's useful to look at a few common scenarios, for example consider this message definition (using the "rosidl" ``.msg`` format for convenience, but it could just as easily be the OMG IDL ``.idl`` format or something like protobuf): + +.. code:: + + # Temperature.msg + uint64 timestamp + int64 temperature + +There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. +There are a few ways the developers could approach this, e.g. 
if the serialization technology allowed for optional fields they may add a new optional field like this (pseudo code): + +.. code:: + + # Temperature.msg + uint64 timestamp + int64 temperature + optional float64 temperature_float + +Updated code could use it like this: + +.. code:: + + void on_temperature(const Temperature & temp) { + // ... + double actual_temp = 0.0; + if (temp.temperature_float_is_set()) { + actual_temp = temp.temperature_float(); + } else { + actual_temp = static_cast<double>(temp.temperature()); + } + } + +This is not uncommon to see in projects, and it has the advantage that old data, whether it is from a program running on an older machine or is from old recorded data, can be interpreted by newer code without additional machinery or features beyond the optional field type. +Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, though that wouldn't be good long term most likely, so it is very forgiving to the code that uses the message definition. + +You can imagine other serialization features, like inheritance or type coercion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. + +Each of these approaches also requires a more sophisticated message API for the users than what is currently provided by ROS 2. +At the moment the user can access all fields of the message directly, i.e. "member based access" vs "method based access", and that would need to change in some way to use these features. + +A different approach would be to simply update the type as follows: + +.. code:: + + # Temperature.msg + uint64 timestamp + float64 temperature + +And also update any code publishing or subscribing to this type at the same time. +This is typically what happens right now in ROS 2, and also what happened historically in ROS 1. 
+This approach is simple and requires no additional serialization features, but obviously doesn't do anything on its own to help developers evolve their system while maintaining support for already deployed code and utilizing existing recorded data. + +However, which additional tooling these cases can be handled without the code of the application knowing about. +Consider again the above example, but this time the update to the message was done without backwards compatibility in the message definition. +This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic but expects the new version of the message. +For the purpose of this example, let's call the original message type ``Temperature`` and the one using ``float64`` we'll call ``Temperature'``. +So, if you have rosbag2 publishing ``Temperature`` messages and a program consuming ``Temperature'`` messages they will not communicate, unless you have an intermediate program doing the translation. + +.. code:: + + ┌─────────┐ publishes ┌──────────────┐ publishes ┌─────┐ + │ rosbag2 ├───────────────►│transfer func.├──────────────►│ App │ + └─────────┘ Temperature └──────────────┘ Temperature' └─────┘ + +The "transfer function" can be user-defined, or for simple takes (like changing the field type) it can be done automatically. +We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and we used to do something like this for rosbags in ROS 1 with the "bag migration rules" feature. + +.. 
TODO:: cite the above + +This approach requires a few features, like the ability to have a single application read old and new versions of a message at the same time, and it requires more infrastructure and tooling to make it work, but it has the advantage of keeping both the publishing and subscribing code simple (agnostic to the fact that there are other versions of the message) and it keeps the message type from being cluttered with vestigial fields. + +Either way, a problem can usually be solved by changing a message in some of, if not all, of the of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. +ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. +It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. + +We will require some kind of features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. + +With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as a rationale in the Rationale section and a discussion of alternatives in the Alternatives section and its sub-sections. 
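To make the transfer function idea from the Motivation section more concrete, here is a minimal sketch in Python. It assumes two plain data classes standing in for ``Temperature`` and ``Temperature'``; the names ``TemperatureV1``, ``TemperatureV2``, and ``temperature_v1_to_v2`` are hypothetical and not part of any ROS 2 API. In a real system the generated message classes would be used instead.

```python
from dataclasses import dataclass


@dataclass
class TemperatureV1:
    """Hypothetical stand-in for the original ``Temperature`` (int64 field)."""
    timestamp: int
    temperature: int


@dataclass
class TemperatureV2:
    """Hypothetical stand-in for ``Temperature'`` (float64 field)."""
    timestamp: int
    temperature: float


def temperature_v1_to_v2(old: TemperatureV1) -> TemperatureV2:
    """Illustrative transfer function: copy the timestamp, widen the reading."""
    return TemperatureV2(
        timestamp=old.timestamp,
        temperature=float(old.temperature),
    )
```

A conversion this simple (a field changing to a compatible type) is the kind of case the text suggests could be generated automatically; a hand-written function would only be needed when the mapping between versions is not mechanical.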
+ +Terminology +=========== + +TODO + + +Specification +============= + +TODO + + +Rationale +========= + +TODO + +Backwards Compatibility +======================= + +TODO + +Feature Progress +================ + +TODO + + +References +========== + +.. [1] DDS-XTYPES 1.3 + (https://www.omg.org/spec/DDS-XTypes/1.3/About-DDS-XTypes/) + + +Copyright +========= + +This document has been placed in the public domain. + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: From c355ca6cf77f99ad1ef6ba6d18600f692a31140a Mon Sep 17 00:00:00 2001 From: William Woodall Date: Wed, 9 Mar 2022 09:07:21 -0800 Subject: [PATCH 02/22] still working Signed-off-by: William Woodall --- rep-2010.rst | 90 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 78 insertions(+), 12 deletions(-) diff --git a/rep-2010.rst b/rep-2010.rst index 362e380bc..a19ff999d 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -19,7 +19,7 @@ However, technologies provided by different, perhaps future, serialization techn Specifically, this REP proposes that interpreting older versions of the message at runtime, using its type information, and converting it to the current version of the message using user defined transfer functions, is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the type information of the messages on the wire, beyond their name and version, and use that to dynamically interpret it at runtime. 
-This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies so long as we have the ability to introspect the messages at runtime without prior knowledge of the type information, which a feature we also need for generic introspection tools. +This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies so long as we have the ability to introspect the messages at runtime without prior knowledge of the type information, which is a feature we also need for generic introspection tools. Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support dynamic interpretation of messages at runtime. It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and will likely use it to implement the needed interfaces in the ROS 2 middleware API. @@ -30,9 +30,9 @@ Motivation ========== Evolving types over time is a necessary part of developing a new system using ROS 2 as well as a necessary part of evolving ROS 2 itself over time. -As needs change types often need to change to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. +As needs change, types often need to change to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. 
-To understand this need, it's useful to look at a few common scenarios, for example consider this message definition (using the "rosidl" ``.msg`` format for convenience, but it could just as easily be the OMG IDL ``.idl`` format or something like protobuf): +To understand this need, it's useful to look at a few common scenarios, for example consider this message definition (using the "rosidl" ``.msg`` format for convenience, but it could just as easily be the OMG IDL ``.idl`` format or something else like protobuf): .. code:: @@ -40,7 +40,7 @@ To understand this need, it's useful to look at a few common scenarios, for exam uint64 timestamp int64 temperature -There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. +There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields they may add a new optional field like this (pseudo code): .. code:: @@ -65,12 +65,12 @@ Updated code could use it like this: } This is not uncommon to see in projects, and it has the advantage that old data, whether it is from a program running on an older machine or is from old recorded data, can be interpreted by newer code without additional machinery or features beyond the optional field type. 
-Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, though that wouldn't be good long term most likely, so it is very forgiving to the code that uses the message definition. +Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, which is very forgiving for developers in a way, but likely wouldn't be good long term solution. You can imagine other serialization features, like inheritance or type coercion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. Each of these approaches also requires a more sophisticated message API for the users than what is currently provided by ROS 2. -At the moment the user can access all fields of the message directly, i.e. "member based access" vs "method based access", and that would need to change in some way to use these features. +At the moment the user can access all fields of the message directly, i.e. "member based access" vs "method based access", and that would need to change in some way to use some of these features. A different approach would be to simply update the type as follows: @@ -84,7 +84,7 @@ And also update any code publishing or subscribing to this type at the same time This is typically what happens right now in ROS 2, and also what happened historically in ROS 1. This approach is simple and requires no additional serialization features, but obviously doesn't do anything on its own to help developers evolve their system while maintaining support for already deployed code and utilizing existing recorded data. -However, which additional tooling these cases can be handled without the code of the application knowing about. +However, with additional tooling these cases can be handled without the code of the application knowing about. 
Consider again the above example, but this time the update to the message was done without backwards compatibility in the message definition. This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic but expects the new version of the message. For the purpose of this example, let's call the original message type ``Temperature`` and the one using ``float64`` we'll call ``Temperature'``. @@ -96,18 +96,18 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum │ rosbag2 ├───────────────►│transfer func.├──────────────►│ App │ └─────────┘ Temperature └──────────────┘ Temperature' └─────┘ -The "transfer function" can be user-defined, or for simple takes (like changing the field type) it can be done automatically. -We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and we used to do something like this for rosbags in ROS 1 with the "bag migration rules" feature. +The "transfer function" can be user-defined, or for simple changes (like changing the field type to a compatible type) it can be done automatically. +We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. .. TODO:: cite the above -This approach requires a few features, like the ability to have a single application read old and new versions of a message at the same time, and it requires more infrastructure and tooling to make it work, but it has the advantage of keeping both the publishing and subscribing code simple (agnostic to the fact that there are other versions of the message) and it keeps the message type from being cluttered with vestigial fields. 
+This approach requires a few features, like the ability to have a single application read old and new versions of a message at the same time, and it requires more infrastructure and tooling to make it work, but it has the advantage of keeping both the publishing and subscribing code simple, i.e agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields. Either way, a problem can usually be solved by changing a message in some of, if not all, of the of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. -We will require some kind of features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. +We will require some features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. 
With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as a rationale in the Rationale section and a discussion of alternatives in the Alternatives section and its sub-sections. @@ -120,7 +120,73 @@ TODO Specification ============= -TODO +The proposal is to provide tooling to help users identify when messages have changed, help users configure their system to convert between versions of messages on the fly, and help users write the code needed to convert between types when the conversion is not trivial. + +Conceptual Overview +------------------- + +Users will be able to calculate the "type version hash" for an interface (e.g. a message, service, or action) using the ``ros2 interface hash `` command. +Additionally, if a topic has two types being used on it with the same type name, but different type versions, a warning will be logged and the endpoints that do not match will not communicate. + +.. TODO:: how does this interact with serialization features like optional fields and inheritance? Is there a way to override this behavior when the hashes don't match but communication will work due to optional fields or inheritance? + +When a mismatch is detected, the user can use predefined, or user-defined, "transfer functions" to convert between versions of the type until it is in the type they wish to send or receive. +They can use a tool that will look at a catalogue of available transfer functions to find a single transfer function, or a set of transfer functions, to get from the current type version to the desired type version. +The tool will start with the current type version and see if it can be automatically converted to the desired type version, or if it is accepted as an input to any user-defined transfer functions or if it can be automatically converted into one of the input type versions for the transfer functions. 
+It will continue to do this until it reaches the desired type version or it fails to find a path from the current to the desired type version. + +Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version that what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. +Tools will assist the user in making these remappings and running the necessary component nodes with the appropriate configurations, either from their launch file or from the command line. + +.. TODO:: discuss the implications for large messages and the possibility of having the transfer functions be colocated with either the publisher or subscription more directly than with component nodes and remapping. + +Once the mismatched messages are flowing through the transfer functions, communication should be possible and neither the publishing side nor the subscribing side have any specific knowledge of the conversions taking place or that any conversions are necessary. + +In order to support this vision, three missing features will need to be added into ROS 2: controlling matching based on the type version hash (interface type enforcement), communicating the interface type description between nodes (inter-process type description distribution), and (de)serializing messages and services given only a type description of the interface and a buffer of bytes (runtime type introspection). + +Interface Type Enforcement +-------------------------- + +In order to detect type version mismatches and enforce them, a way to uniquely identify versions is required, and this proposal uses type version hashes. 
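The search for a chain of transfer functions described in the Conceptual Overview can be sketched as a breadth-first search over type versions. This is only an illustration under stated assumptions: the catalogue is modeled as a hypothetical list of ``(source, target)`` version pairs, automatic conversions are treated as additional entries in that list, and the function name ``find_transfer_chain`` is invented for this example.

```python
from collections import deque


def find_transfer_chain(catalogue, current, desired):
    """Return the (source, target) steps from current to desired, or None.

    ``catalogue`` is a hypothetical iterable of (source, target) type version
    pairs, one per available transfer function (user-defined or automatic).
    """
    if current == desired:
        return []

    # Index the catalogue by source version for quick lookup.
    edges = {}
    for source, target in catalogue:
        edges.setdefault(source, []).append(target)

    came_from = {current: None}  # version -> version it was reached from
    queue = deque([current])
    while queue:
        version = queue.popleft()
        for next_version in edges.get(version, []):
            if next_version in came_from:
                continue  # already reached by an equally short or shorter path
            came_from[next_version] = version
            if next_version == desired:
                # Walk backwards to rebuild the chain of conversions.
                chain = []
                node = next_version
                while came_from[node] is not None:
                    chain.append((came_from[node], node))
                    node = came_from[node]
                chain.reverse()
                return chain
            queue.append(next_version)
    return None  # no path from current to desired
```

Because the edges are directed, a catalogue that only contains conversions from older to newer versions will not yield a path in the opposite direction, which matches the observation that transfer functions may have to be written in both directions.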
+ +Type Version Hash +~~~~~~~~~~~~~~~~~ + +The type version hashes are not sequential and do imply any rank among versions of the type, i.e. given two version hashes of a type there is no way to tell which is "newer". +There is only whether or not the type versions are equal and whether or not there exists transfer functions that be chained together in order to convert between them. +Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. + +In order to calculate the type version hashes so that they are stable and are not sensitive to trivial changes like changes in the comments or whitespace in the IDL file, the IDL file given by the user, which for example may be a ``.msg`` or ``.idl`` or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. +The data structure includes: + +- a list of field names and types, but not default values +- the serialization format +- the serialization format version +- an optional user-defined interface version, or 0 if not provided + +The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". +This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. +This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. + +.. TODO: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... 
I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more + +Enforcing Type Compatibility +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +With the type version hash available, it can be used to determine if two endpoints on a topic should communicate. + +.. TODO: move the draft section with details here after one of these options is selected: USER_DATA+ignore, USER_DATA+discovery "plugin", append type hash to type name in DDS, use type hash in DDS partition + +Inter-process Type Description Distribution +------------------------------------------- + +.. TODO: move the draft section over, and make a selection between: telnet/http server per context/participant under rmw layer, service per context/participant under rmw layer, telnet/http server per node under rmw layer, service per node above rmw layer, or some combination with way to configure which should be used. + +Runtime Type Introspection +-------------------------- + +.. TODO: terminology could be better? nothing off the top of my head, just deserves more bike-shedding. + Rationale From 05ac47eb45c82a8d64104aa5de7da4d473a3b681 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Wed, 13 Apr 2022 13:40:39 -0700 Subject: [PATCH 03/22] added more about the enforcing of the type versions Signed-off-by: William Woodall --- rep-2010.rst | 66 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 59 insertions(+), 7 deletions(-) diff --git a/rep-2010.rst b/rep-2010.rst index a19ff999d..c4980c81c 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -168,24 +168,56 @@ The resulting data structure is hashed using a standard SHA-1 method, resulting This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. 
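The construction above can be sketched in Python as follows. The REP text does not say how the data structure is turned into bytes before hashing, nor how the digest is rendered, so this sketch assumes a canonical JSON encoding and a hexadecimal digest; the function name ``type_version_hash`` and the format values used in the usage lines are likewise placeholders.

```python
import hashlib
import json


def type_version_hash(fields, serialization_format, format_version,
                      interface_version=0):
    """Build an ``IDLHASH-1@...`` string from a parsed interface description.

    ``fields`` is a list of (field name, field type) pairs. Default values,
    comments, and whitespace are never part of the input.
    """
    description = {
        "fields": [[name, type_name] for name, type_name in fields],
        "serialization_format": serialization_format,
        "serialization_format_version": format_version,
        "interface_version": interface_version,  # 0 when not provided
    }
    # Assumption: canonical JSON (sorted keys, no extra whitespace) is used
    # as the byte representation that gets hashed.
    encoded = json.dumps(
        description, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    # Assumption: the 160-bit SHA-1 digest is written as 40 hex characters.
    return "IDLHASH-1@" + hashlib.sha1(encoded).hexdigest()


old_hash = type_version_hash(
    [("timestamp", "uint64"), ("temperature", "int64")], "cdr", "1")
new_hash = type_version_hash(
    [("timestamp", "uint64"), ("temperature", "float64")], "cdr", "1")
```

Here ``old_hash`` and ``new_hash`` differ because a field type changed, while the text before the ``@`` records which hashing standard produced each digest.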
This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. -.. TODO: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more +.. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more -Enforcing Type Compatibility -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Enforcing Type Version +~~~~~~~~~~~~~~~~~~~~~~ -With the type version hash available, it can be used to determine if two endpoints on a topic should communicate. +With the type version hash available, it can be used as an additional constraint to determine if two endpoints, i.e. publishers and subscriptions, on a topic should communicate. -.. TODO: move the draft section with details here after one of these options is selected: USER_DATA+ignore, USER_DATA+discovery "plugin", append type hash to type name in DDS, use type hash in DDS partition +When creating a publisher or subscription, the caller normally provides: a topic name, QoS settings, and a topic type. +The last of which is represented as a string and is automatically deduced based on the type given to the create function, e.g. as a template parameter in C++ or the message type as an argument in Python. +For example, creating a publisher for ``std_msgs::msg::String`` in C++, may result in a topic type like ``std_msgs/msg/String``. +All of these items are used by the middleware to determine if two endpoints should communicate or not, and this rep proposes that the type version be added to this list of provided information. 
+From the user's perspective nothing needs to change, as the type version can be extracted based on the topic type given either at the ``rcl`` layer or in the ``rmw`` implementation itself. +However, the type version would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription, and therefore the job of using that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. + +The method for implementing the detection and enforcement of type version mismatches is left up to the middleware, as some middlewares will have tools to make this efficient and others will implement something like what would be possible in the ``rcl`` and above layers. +By keeping this a detail of the ``rmw`` implementation, we allow the ``rmw`` implementations to make optimizations where they can. + +Recommended Strategy for Enforcing that Type Versions Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type version hash can be used to warn the user when the communication may not happen due to a version mismatch, and also if can be put into recordings for future comparison. + +However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type version hash and then use that as the type name given to the underlying middleware. +This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or match making process. 
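A minimal sketch of the concatenation strategy, assuming a middleware that matches endpoints purely on the type name string. The helper names and the ``__`` separator are hypothetical, the hash values are shortened placeholders, and whether the resulting characters are legal in a particular middleware's type names is not checked here.

```python
def middleware_type_name(type_name, type_version_hash, separator="__"):
    """Hypothetical helper: fold the version hash into the middleware type name."""
    return f"{type_name}{separator}{type_version_hash}"


def would_match(publisher_type_name, subscription_type_name):
    """Model of a middleware that only compares type name strings."""
    return publisher_type_name == subscription_type_name


# Placeholder hashes; real ones would carry a full 160-bit digest.
pub_name = middleware_type_name("std_msgs/msg/String", "IDLHASH-1@aaaa")
same_version_sub = middleware_type_name("std_msgs/msg/String", "IDLHASH-1@aaaa")
other_version_sub = middleware_type_name("std_msgs/msg/String", "IDLHASH-1@bbbb")
```

With these names, ``would_match(pub_name, same_version_sub)`` is true and ``would_match(pub_name, other_version_sub)`` is false, so endpoints with different type versions simply never match, without any custom discovery logic.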
+However, the downside is that detecting the mismatch is more difficult and it also makes interoperating with ROS using the native middleware more difficult, as appending the version hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. + +.. TODO:: figure out if mismatched types produces a IncompatibleQoSOffered callback or not, then document the recommended way to detect type version mismatches, also look into ``DDS XTypes spec v1.3: 7.6.3.4.2: INCONSISTENT_TOPIC`` as a possible alternative + +Notes for Implementing the Recommended Strategy with DDS +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +TODO + +Interactions with DDS-XTypes or Similar Implicit Middleware Features +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +TODO + + +.. TODO:: move the draft section with details here after one of these options is selected: USER_DATA+ignore, USER_DATA+discovery "plugin", append type hash to type name in DDS, use type hash in DDS partition Inter-process Type Description Distribution ------------------------------------------- -.. TODO: move the draft section over, and make a selection between: telnet/http server per context/participant under rmw layer, service per context/participant under rmw layer, telnet/http server per node under rmw layer, service per node above rmw layer, or some combination with way to configure which should be used. +.. TODO:: move the draft section over, and make a selection between: telnet/http server per context/participant under rmw layer, service per context/participant under rmw layer, telnet/http server per node under rmw layer, service per node above rmw layer, or some combination with way to configure which should be used. Runtime Type Introspection -------------------------- -.. TODO: terminology could be better? nothing off the top of my head, just deserves more bike-shedding. +.. TODO:: terminology could be better? 
nothing off the top of my head, just deserves more bike-shedding. @@ -194,6 +226,26 @@ Rationale TODO +Alternatives +------------ + +TODO + +Use Type Hash from Middleware, e.g. from DDS-XTypes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO + +Handle Detection of Version Mismatch "Above" rmw Layer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO + +Prevent Communication of Mismatched Versions "Above" rmw Layer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +TODO + + Backwards Compatibility ======================= From 6dcb876dfadd877b25c6fed66afea48a856487b9 Mon Sep 17 00:00:00 2001 From: methylDragon Date: Thu, 12 May 2022 17:00:24 -0700 Subject: [PATCH 04/22] Increase readability, fix typos, and reduce ambiguities Signed-off-by: methylDragon --- rep-2010.rst | 34 ++++++++++++++++++++-------------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/rep-2010.rst b/rep-2010.rst index c4980c81c..f3f58c170 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -16,13 +16,18 @@ This REP proposes patterns and approaches for evolving message, service, and act The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling. However, technologies provided by different, perhaps future, serialization technologies can always be used in addition. -Specifically, this REP proposes that interpreting older versions of the message at runtime, using its type information, and converting it to the current version of the message using user defined transfer functions, is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. +Specifically, this REP proposes that the following is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. 
over the wire, converting bag files, or in specialized tools: + +- Interpreting older versions of the message at runtime; +- Using its type information; +- And converting it to the current version of the message using user defined transfer functions + This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the type information of the messages on the wire, beyond their name and version, and use that to dynamically interpret it at runtime. This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies so long as we have the ability to introspect the messages at runtime without prior knowledge of the type information, which is a feature we also need for generic introspection tools. Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support dynamic interpretation of messages at runtime. -It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and will likely use it to implement the needed interfaces in the ROS 2 middleware API. +It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and an interface similar to DDS-XTypes will likely be adopted in the ROS 2 middleware API, without explicitly relying on it. Alternatives are also discussed in more detail. 
@@ -41,7 +46,7 @@ To understand this need, it's useful to look at a few common scenarios, for exam int64 temperature There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. -There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields they may add a new optional field like this (pseudo code): +There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields, they may add a new optional field like this (pseudo code): .. code:: @@ -152,17 +157,18 @@ In order to detect type version mismatches and enforce them, a way to uniquely i Type Version Hash ~~~~~~~~~~~~~~~~~ -The type version hashes are not sequential and do imply any rank among versions of the type, i.e. given two version hashes of a type there is no way to tell which is "newer". -There is only whether or not the type versions are equal and whether or not there exists transfer functions that be chained together in order to convert between them. +The type version hashes are not sequential and do not imply any rank among versions of the type. That is, given two version hashes of a type, there is no way to tell which is "newer". +The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. 
-In order to calculate the type version hashes so that they are stable and are not sensitive to trivial changes like changes in the comments or whitespace in the IDL file, the IDL file given by the user, which for example may be a ``.msg`` or ``.idl`` or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. +In order to calculate the type version hashes so that they are stable and are not sensitive to trivial changes like changes in the comments or whitespace in the IDL file, the IDL file given by the user, which may be a ``.msg`` file, ``.idl`` file, or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. + The data structure includes: -- a list of field names and types, but not default values -- the serialization format -- the serialization format version -- an optional user-defined interface version, or 0 if not provided +- A list of field names and types, but not default values +- The serialization format +- The serialization format version +- An optional user-defined interface version, or 0 if not provided The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. @@ -173,13 +179,13 @@ This allows the tooling to know if a hash mismatch is due to a change in this st Enforcing Type Version ~~~~~~~~~~~~~~~~~~~~~~ -With the type version hash available, it can be used as an additional constraint to determine if two endpoints, i.e. publishers and subscriptions, on a topic should communicate. 
+If the type version hash is available, it can be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. When creating a publisher or subscription, the caller normally provides: a topic name, QoS settings, and a topic type. -The last of which is represented as a string and is automatically deduced based on the type given to the create function, e.g. as a template parameter in C++ or the message type as an argument in Python. +The topic type is represented as a string and is automatically deduced based on the type given to the create function, e.g. as a template parameter in C++ or the message type as an argument in Python. For example, creating a publisher for ``std_msgs::msg::String`` in C++, may result in a topic type like ``std_msgs/msg/String``. -All of these items are used by the middleware to determine if two endpoints should communicate or not, and this rep proposes that the type version be added to this list of provided information. -From the user's perspective nothing needs to change, as the type version can be extracted based on the topic type given either at the ``rcl`` layer or in the ``rmw`` implementation itself. +All of these items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. +From the user's perspective, nothing needs to change, as the type version can be extracted based on the topic type given either at the ``rcl`` layer or in the ``rmw`` implementation itself. However, the type version would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription, and therefore the job of using that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. 
The method for implementing the detection and enforcement of type version mismatches is left up to the middleware, as some middlewares will have tools to make this efficient and others will implement something like what would be possible in the ``rcl`` and above layers. From 42cf0de562399e205331a913583a633515a17eed Mon Sep 17 00:00:00 2001 From: methylDragon Date: Mon, 16 May 2022 12:45:51 -0700 Subject: [PATCH 05/22] Add clarification for field semantics Signed-off-by: methylDragon --- rep-2010.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/rep-2010.rst b/rep-2010.rst index f3f58c170..ca89b43b0 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -104,12 +104,15 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum The "transfer function" can be user-defined, or for simple changes (like changing the field type to a compatible type) it can be done automatically. We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. +Furthermore, the "transfer function" approach also allows for runtime transformation of messages that only change in field semantics (e.g., a change of ``float distance`` to mean centimeters instead of meters, but with no change in field name or type.) Although in those cases, users will likely have to define the transfer function themselves. + .. TODO:: cite the above This approach requires a few features, like the ability to have a single application read old and new versions of a message at the same time, and it requires more infrastructure and tooling to make it work, but it has the advantage of keeping both the publishing and subscribing code simple, i.e agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields. 
Either way, a problem can usually be solved by changing a message in some of, if not all, of the of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. + It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. We will require some features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. @@ -158,6 +161,7 @@ Type Version Hash ~~~~~~~~~~~~~~~~~ The type version hashes are not sequential and do not imply any rank among versions of the type. That is, given two version hashes of a type, there is no way to tell which is "newer". + The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. @@ -174,6 +178,8 @@ The resulting data structure is hashed using a standard SHA-1 method, resulting This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. 
This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. +Notably, the user-defined interface version being included in the hash allows for messages that only change in field semantics (i.e., without changing field names or types) to be picked up as discrepancies when they would not before, allowing users to be prompted to write "transfer functions" to resolve them. + .. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more Enforcing Type Version From d3449f0a9807e18e7fcacb0c136a123fbc6eb110 Mon Sep 17 00:00:00 2001 From: methylDragon Date: Tue, 31 May 2022 18:39:14 -0700 Subject: [PATCH 06/22] Implement more fixups Signed-off-by: methylDragon Co-authored-by: William Woodall --- rep-2010.rst | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/rep-2010.rst b/rep-2010.rst index ca89b43b0..c931d6346 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -16,11 +16,8 @@ This REP proposes patterns and approaches for evolving message, service, and act The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling. However, technologies provided by different, perhaps future, serialization technologies can always be used in addition. -Specifically, this REP proposes that the following is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. 
over the wire, converting bag files, or in specialized tools: - -- Interpreting older versions of the message at runtime; -- Using its type information; -- And converting it to the current version of the message using user defined transfer functions +Specifically, this REP proposes that interpreting older versions of the message at runtime can be achieved by using its type information and then converting it to the current version of the message using user-defined transfer functions. +This approach is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the type information of the messages on the wire, beyond their name and version, and use that to dynamically interpret it at runtime. @@ -104,7 +101,8 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum The "transfer function" can be user-defined, or for simple changes (like changing the field type to a compatible type) it can be done automatically. We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. -Furthermore, the "transfer function" approach also allows for runtime transformation of messages that only change in field semantics (e.g., a change of ``float distance`` to mean centimeters instead of meters, but with no change in field name or type.) Although in those cases, users will likely have to define the transfer function themselves. 
+Furthermore, the "transfer function" approach also allows for conversions between messages which are only semantically different (e.g., a change in semantics for a field ``float distance`` to mean centimeters instead of meters, but with no change in field name or type). +However, the user would have to write the transfer function for these changes, because it could not be made automatically as they are not captured in a structured way. .. TODO:: cite the above @@ -112,7 +110,6 @@ This approach requires a few features, like the ability to have a single applica Either way, a problem can usually be solved by changing a message in some of, if not all, of the of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. - It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. We will require some features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. 
@@ -169,29 +166,29 @@ In order to calculate the type version hashes so that they are stable and are no The data structure includes: -- A list of field names and types, but not default values -- The serialization format -- The serialization format version -- An optional user-defined interface version, or 0 if not provided +- a list of field names and types, but not default values +- the serialization format +- the serialization format version +- an optional user-defined interface version, or 0 if not provided The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. -Notably, the user-defined interface version being included in the hash allows for messages that only change in field semantics (i.e., without changing field names or types) to be picked up as discrepancies when they would not before, allowing users to be prompted to write "transfer functions" to resolve them. +The optional user-defined interface version included in the hash allows the user to indicate that there was a change in message semantics (i.e. no change in the field names or types), causing the versions to mismatch and allowing the user to write a "transfer functions" to resolve the semantic differences. .. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... 
I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more Enforcing Type Version ~~~~~~~~~~~~~~~~~~~~~~ -If the type version hash is available, it can be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. +The type version hash can be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. When creating a publisher or subscription, the caller normally provides: a topic name, QoS settings, and a topic type. The topic type is represented as a string and is automatically deduced based on the type given to the create function, e.g. as a template parameter in C++ or the message type as an argument in Python. For example, creating a publisher for ``std_msgs::msg::String`` in C++, may result in a topic type like ``std_msgs/msg/String``. All of these items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. -From the user's perspective, nothing needs to change, as the type version can be extracted based on the topic type given either at the ``rcl`` layer or in the ``rmw`` implementation itself. +Nothing needs to change from the user's perspective, as the type version can be extracted automatically based on the topic type given, either at the ``rcl`` layer or in the ``rmw`` implementation itself. However, the type version would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription, and therefore the job of using that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. 
The method for implementing the detection and enforcement of type version mismatches is left up to the middleware, as some middlewares will have tools to make this efficient and others will implement something like what would be possible in the ``rcl`` and above layers. From 98081d920b3077879a6cab6b0eed9e7171df1506 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Thu, 9 Jun 2022 18:01:17 -0700 Subject: [PATCH 07/22] Methyl dragon/type description distribution (#2) * Increase readability, fix typos, and reduce ambiguities Signed-off-by: methylDragon * Add clarification for field semantics Signed-off-by: methylDragon * Add first draft of inter-process type description distribution Signed-off-by: methylDragon * Add additional notes for tick-tocking the metatype Signed-off-by: methylDragon * Add rationale section for inter-process type description distribution Signed-off-by: methylDragon * Add additional note about sending type description metadata Signed-off-by: methylDragon * Add alternatives section for type description distribution Signed-off-by: methylDragon * Fix broken TODO:: directives Signed-off-by: methylDragon * Apply rewording suggestions for type description distribution Signed-off-by: methylDragon Co-authored-by: William Woodall * Apply rewording suggestions from code review Signed-off-by: methylDragon Co-authored-by: William Woodall Co-authored-by: William Woodall * remove trailing whitespace Signed-off-by: William Woodall * fixups while walking through it with Brandon Signed-off-by: William Woodall Co-authored-by: methylDragon --- rep-2010.rst | 177 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 171 insertions(+), 6 deletions(-) diff --git a/rep-2010.rst b/rep-2010.rst index c931d6346..639e93d25 100644 --- a/rep-2010.rst +++ b/rep-2010.rst @@ -101,8 +101,8 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum The "transfer function" can be user-defined, or for simple changes (like changing the field 
type to a compatible type) it can be done automatically. We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. -Furthermore, the "transfer function" approach also allows for conversions between messages which are only semantically different (e.g., a change in semantics for a field ``float distance`` to mean centimeters instead of meters, but with no change in field name or type). -However, the user would have to write the transfer function for these changes, because it could not be made automatically as they are not captured in a structured way. +Furthermore, the "transfer function" approach also allows for runtime transformation of messages that only change in field semantics (e.g., a change of ``float distance`` to mean centimeters instead of meters, but with no change in field name or type). +Although in those cases, users will likely have to define the transfer function themselves. .. TODO:: cite the above @@ -175,7 +175,7 @@ The resulting data structure is hashed using a standard SHA-1 method, resulting This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. -The optional user-defined interface version included in the hash allows the user to indicate that there was a change in message semantics (i.e. no change in the field names or types), causing the versions to mismatch and allowing the user to write a "transfer functions" to resolve the semantic differences. 
+The user-defined interface version makes it possible to change the version of a message that only changed in "field semantics" (i.e. without changing field names or types), and therefore makes it possible to write "transfer functions" to handle semantic-only conversions between versions. .. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more @@ -218,10 +218,105 @@ TODO .. TODO:: move the draft section with details here after one of these options is selected: USER_DATA+ignore, USER_DATA+discovery "plugin", append type hash to type name in DDS, use type hash in DDS partition -Inter-process Type Description Distribution -------------------------------------------- +Type Description Distribution +----------------------------- + +For some use cases the type version hash is insufficient and instead the full type description is required. + +One of those use cases, which is also described in this REP, is "runtime type introspection", which is the ability to introspect the contents of a message at runtime when the description for that message, or that version of that message, was unavailable at compile time. +In this use case the type description is used to interpret the serialized data dynamically. + +Another use case, which is not covered in this REP, is using the type description in tooling to either display the type description to the user or to include it in recordings like rosbags. + +In either case, where the type description comes from doesn't really matter, and so, for example, it could be looked up on the local filesystem or read from a rosbag file. 
+However, in practice, the correct type description may not be found locally, especially in cases where you have different versions of messages in the same system, either because it's on another computer or perhaps because it is from a different distribution of ROS or was built in a different workspace. + +So, it is useful to have a mechanism to convey the type descriptions from the source of the data to other nodes, which we describe here as "type description distribution". +Furthermore, this feature should be agnostic to the underlying middleware and serialization library. + +Sending the Type Description +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. TODO:: should each publisher be contacted for each type/version pair or should we assume that the type/version pair guarantees a unique type description? + +Type descriptions will be provided by a ROS Service (``~/_get_type_description``) on each node. +There will be a single ROS Service per node, regardless of the number of publishers or subscriptions on that node. + +.. TODO:: should this service be required/optional? + +A service request to this ROS Service will consist of the type name and the type version hash, which is distributed during discovery of endpoints and will be accessible through the ROS API. +The service server will respond with the type description and any metadata needed to do runtime type introspection. +This service is not expected to be called frequently; calls are likely to occur only when new topic or service endpoints are created, and even then, only if the endpoint type hashes do not match. + +Type Description Contents and Format +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The response sent by the service server will contain a combination of the original ``idl`` or ``msg`` file's content, as well as any necessary information to serialize and deserialize the raw message buffers sent on the topic. 
+The response will contain a version of the description that contains comments from the original type description, as those might be relevant to interpreting the semantic meaning of the message fields. + +Additionally, the response could include the serialization library used, its version, or any other helpful information from the original producer of the data. + +All of this type description information will be sent as a ROS 2 service response, because the nodes are queried at the ROS layer, as described in the previous section. + +.. TODO:: What happens if the message consumer doesn't have access to the serialization library stated in the meta-type? + +The ROS 2 message that defines the type description must be able to describe any message type, including itself, and since it is describing the message format, it should work independently from any serialization technologies used. +This "meta-type description" message would then be used to communicate the structure of the type as part of the "get type description" service response. +The final form of these interfaces should be found in the reference implementation, but such a Service interface might look like this: + +.. code:: + + string type_name + string version_hash + --- + bool successful # True if the type description information is available and populated in the response + string failure_reason # Empty if 'successful' was true, otherwise contains details on why it failed + + string type_description_raw # The idl or msg file, with comments and whitespace + TypeDescription type_description # The parsed type description which can be used programmatically + + string serialization_library + string serialization_version + # ... other useful metadata + +Again, the final form of these interfaces should be taken from the reference implementation, but the ``TypeDescription`` message type might look like this: + +.. 
code:: + + IndividualTypeDescription type_description + IndividualTypeDescription[] referenced_type_descriptions + +And the ``IndividualTypeDescription`` type: + +.. code:: + + uint8 FIELD_TYPE_INT = 0 + uint8 FIELD_TYPE_DOUBLE = 1 + # ... and so on + + uint8[] field_types + string[] field_names + +These naive examples of the interfaces just give an idea of the structure, but they do not yet consider complications like field annotations and other features that more advanced IDLs have (here meaning "interface description languages" generically, not just the OMG IDL that DDS uses). + +.. TODO:: Should we use strings instead of an integer enum for the ``field_types``? + + +Additional Notes for TypeDescription Message Type +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Given that the type description message interface has to be generic enough to support anything described in the ROS interfaces, there will be a need to add or remove fields over time in the type description message itself. +This should be done in such a way that the fields are tick-tocked and deprecated properly, possibly by having explicitly named versions of this interface, e.g. ``TypeDescriptionV1`` and ``TypeDescriptionV2`` and so on. + +Implementation in the `rcl` Layer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The implementation of the type description distribution feature will be made in the ``rcl`` layer as opposed to the ``rmw`` layer to take advantage of the abstraction away from the middleware and to allow for compatibility with the client libraries. + +A hook will be added to ``rcl_node_init()`` to initialize the type description distribution service with the appropriate ``rcl_service_XXX()`` functions. +This hook should also keep a map of published and subscribed types which will be populated on each initialization of a publisher or subscription in the respective ``rcl_publisher_init()`` and ``rcl_subscription_init()`` function calls. 
+The passed ``rosidl_message_type_support_t`` in the init call can be introspected with getter functions to obtain the relevant information, alongside any new methods added to support type version hashing. -.. TODO:: move the draft section over, and make a selection between: telnet/http server per context/participant under rmw layer, service per context/participant under rmw layer, telnet/http server per node under rmw layer, service per node above rmw layer, or some combination with way to configure which should be used. Runtime Type Introspection -------------------------- @@ -235,6 +330,45 @@ Rationale TODO +Type Description Distribution +----------------------------- + +Using a Single ROS Service per Node +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The node that is publishing the data must already have access to the correct type description, at the correct version, in order to publish it, and therefore it is natural to get the type description from that node. +Similarly, a subscribing node also knows which type it wants to receive, both in name and version, and therefore it is again natural to get that information from the subscribing node. +The type description for a given type, at a given version, could have been retrieved from other places, e.g. a centralized database, but any such alternative would have had to take care to ensure that it had the right version of the message, which is not a concern for the node publishing the data. + +Because the interface for getting a type description is generic, it is not necessary to have this interface on a per entity, i.e. publisher, subscription, etc., basis; instead, the ROS Service is offered on a per node basis to reduce the number of ROS Services. +Therefore, the specification dictates that the type description is distributed by a single ROS Service for each individual node. 
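A single service per node implies one lookup table shared by all of that node's publishers and subscriptions. The following is only a minimal sketch of such a table, keyed by type name and version hash as in the service request shown earlier. Every ``example_`` name is hypothetical and not part of ``rcl``, the fixed capacity is an arbitrary assumption, and the strings are borrowed rather than copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in rcl. */
#define EXAMPLE_MAX_TYPES 16  /* arbitrary capacity chosen for the sketch */

typedef struct {
  const char * type_name;     /* e.g. "sensor_msgs/msg/LaserScan" */
  const char * version_hash;  /* opaque string, as in the service request */
  const char * raw_text;      /* original .msg or .idl text */
} example_type_entry_t;

typedef struct {
  example_type_entry_t entries[EXAMPLE_MAX_TYPES];
  size_t size;
} example_type_registry_t;

/* Lookup used by the service callback; NULL maps to 'successful = false'. */
const example_type_entry_t * example_registry_find(
  const example_type_registry_t * reg,
  const char * type_name,
  const char * version_hash)
{
  assert(reg && type_name && version_hash);
  for (size_t i = 0; i < reg->size; ++i) {
    if (strcmp(reg->entries[i].type_name, type_name) == 0 &&
        strcmp(reg->entries[i].version_hash, version_hash) == 0) {
      return &reg->entries[i];
    }
  }
  return NULL;
}

/* Called when a publisher or subscription is initialized.
 * Returns 0 on success (or if already present), -1 if the table is full. */
int example_registry_add(
  example_type_registry_t * reg,
  const char * type_name,
  const char * version_hash,
  const char * raw_text)
{
  if (example_registry_find(reg, type_name, version_hash) != NULL) {
    return 0;  /* another entity on this node already registered it */
  }
  if (reg->size == EXAMPLE_MAX_TYPES) {
    return -1;
  }
  reg->entries[reg->size++] =
    (example_type_entry_t){type_name, version_hash, raw_text};
  return 0;
}
```

Because entries are de-duplicated on insertion, several publishers of the same type at the same version share one entry, which is the reason a per-node service needs no per-entity state.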
+ +There were also multiple alternatives for how to get this information from each node, but the use of a single ROS Service was selected because the task of requesting the type description from a node is well suited to a request-response style ROS Service. +Some of the alternatives offered other benefits, but using a ROS Service introduced the fewest dependencies, feature-wise, while accomplishing the task. + +.. TODO:: cite the above, https://en.wikipedia.org/wiki/Request%E2%80%93response + +Combining the Raw and Parsed Type Description in the Service Response +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The contents of the "get type description" service response should include information that supports both aforementioned use cases (i.e. tools and runtime type introspection). +These use cases have orthogonal interests, with the former requiring human-readable descriptions, and the latter preferring machine-readable descriptions. + +Furthermore, the type description should be useful even across middlewares and serialization libraries and that makes it especially important to send at least the original inputs to the "type support pipeline" (i.e. the process of taking user-defined types and generating all supporting code). +In this case, because the "type support pipeline" is a lossy process, there is a need to ensure that enough information is sent to completely reproduce the original definition of the type, and therefore it makes sense to just send the original ``idl`` or ``msg`` file. + +At the same time, it is useful to send information with the original description that makes it easier to process data at the receiving end, as it is often not trivial to get to the "parsed" version of the type description from the original text description. 
+ +Finally, while there could be an argument for sending a losslessly compressed version of the message file, the expected low frequency of queries to the type description service means that sending it uncompressed incurs negligible overhead, which heavily reduces the benefit of compression. + +Implementing in ``rcl`` versus ``rmw`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +While it is true that implementing the type description distribution on the ``rmw`` layer would allow for much lower level optimization, doing so would require implementing this feature in each rmw implementation. + +Given that the potential gains from optimization will be small due to how infrequently the service is expected to be called, this added development overhead was determined to not be worth it. +Instead the design prefers to have a unified implementation of this feature in ``rcl`` so it is agnostic to any middleware implementations and client libraries. + Alternatives ------------ @@ -254,6 +388,37 @@ Prevent Communication of Mismatched Versions "Above" rmw Layer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TODO +Type Description Distribution +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Other Providers of Type Description +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Several other candidate strategies for distributing the type descriptions were considered but ultimately discarded for one or more reasons like: causing a strong dependency on a particular middleware or a third-party technology, difficulties with resolving the message type description locally, difficulties with finding the correct entity to query, or causing network throughput issues. 
+ +These are some of the candidates that were considered, and the reasons for their rejection: + +- Store the type description as a ROS parameter + * Causes a mass of parameter event messages to be sent at once on init, worsening the network initialization problem +- Store the type description on a centralized node per machine + * Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces. +- Send type description alongside discovery with middlewares + * Works very well if supported, but is only supported by some DDS implementations (which support XTypes or some other way to attach discovery metadata), and causes a strong dependency on DDS. +- Send type description using a different network protocol + * Introduces additional third-party dependencies separate from ROS and the middleware. + +Alternative Type Description Contents and Format +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sending both the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization allows each to cover the weaknesses of the other. +Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme than the one the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions. +As such, it is not a viable option to only send one or the other. + +Additionally, adding a configuration option to choose what contents to receive from the service server was rejected due to how infrequently the type description query is expected to be called. 
+ +As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response. +It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. + Backwards Compatibility ======================= From d7628e724b6df3096342324e0a4fea98e3245395 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Thu, 9 Jun 2022 18:11:25 -0700 Subject: [PATCH 08/22] rename from REP-2010 to REP-2011 Signed-off-by: William Woodall --- rep-2010.rst => rep-2011.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename rep-2010.rst => rep-2011.rst (99%) diff --git a/rep-2010.rst b/rep-2011.rst similarity index 99% rename from rep-2010.rst rename to rep-2011.rst index 639e93d25..681678abe 100644 --- a/rep-2010.rst +++ b/rep-2011.rst @@ -1,4 +1,4 @@ -REP: 2010 +REP: 2011 Title: Evolving Message Types, and Other ROS Interface Types, Over Time Author: William Woodall Status: Draft From 22f6cb79bf9ea8b163bd112bf0f4c466f3afd3bc Mon Sep 17 00:00:00 2001 From: methylDragon Date: Thu, 7 Jul 2022 17:48:51 -0700 Subject: [PATCH 09/22] Add runtime type introspection section (#3) * Add section on runtime type introspection Signed-off-by: methylDragon * Apply REP rephrases Signed-off-by: methylDragon Co-authored-by: William Woodall * Apply more REP rephrases Signed-off-by: methylDragon Co-authored-by: William Woodall --- rep-2011.rst | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 160 insertions(+), 1 deletion(-) diff --git a/rep-2011.rst b/rep-2011.rst index 681678abe..581dedfd4 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -321,9 +321,168 @@ The passed ``rosidl_message_type_support_t`` in the init call can be introspecte Runtime Type Introspection 
-------------------------- -.. TODO:: terminology could be better? nothing off the top of my head, just deserves more bike-shedding. +Runtime type introspection allows access to the data in the fields of a serialized message buffer when given: +- the serialized message as a ``rmw_serialized_message_t``, a.k.a. a ``rcutils_uint8_array_t`` +- the message's type description, e.g. received from the aforementioned type description distribution or from a bag file +- the serialization format, name and version, which was used to create the serialized message, e.g. ``XCDR2`` for ``Extended CDR encoding version 2`` +From these inputs, we should be able to access the data as fields in the message buffer through some programmatic interface, including: + +- a list of field names +- a list of field types +- access to the field by name or index + +.. code:: + + message_buffer ─┐ ┌────────────────────────────┐ + │ │ │ + message_description ─┼──►│ Runtime Type Introspection ├───► Introspection Interface + │ │ │ + serialization_format ─┘ └────────────────────────────┘ + +Given that the scope of inputs and expected outputs is so limited, this feature should ideally be implemented as a separate package, e.g. ``rcl_serialization``, that can be called independently by any downstream packages that might need runtime type introspection, e.g. introspection tools, rosbag transport, etc. +This feature can then be combined with the ability to detect type mismatches and obtain type descriptions in the previous two sections to facilitate communication between nodes of otherwise incompatible types. + +Additionally, it is important to note that this feature is distinguished from ROS 2's "dynamic" type support (``rosidl_typesupport_introspection_c``) in that "dynamic" type support generates type support code at compile time that then is not expected to change, whereas this feature aims to support any arbitrary message type descriptions at runtime. 
+Crucially, this means that a node will not need access to a message's type description at compile time in order to be able to support it, since it will be able to receive and utilize the type description with this feature, in contrast to "dynamic" type support, which would require it. + +Requirements +^^^^^^^^^^^^ + +In order for runtime type introspection to work, message buffers still need to be able to be transmitted and received, as such, this feature is expected to work if and only if: + +- the consumer of a message has access to a compatible version of the serialization library described in the message type description +- the serialization libraries expected to be used with this feature support at least sequential access to message buffer data members with a runtime programmatic interface +- in the case where messages are being sent over a middleware layer (e.g. one of the DDS implementations), the producers and consumers of the message are using compatible middleware implementations + +Furthermore, this feature will be implemented as a C library to allow for maximum interoperability with the ROS client libraries (e.g. C++, Python, Rust, etc.) + +Plugin Interface +^^^^^^^^^^^^^^^^ + +As runtime type introspection is expected to work across any serialization format, the runtime type introspection interface needs to be extensible so that the necessary serialization libraries can be loaded to process the serialized message. +Serialization format support in this case will be provided by writing plugins that wrap the serialization libraries that can then provide the runtime type introspection feature with the needed facilities. 
+Then, runtime type introspection will: + +- enumerate the supported serialization library plugins on the machine +- match, if possible, the serialization format specified in the message type description to an appropriate plugin (emitting a warning otherwise) +- dynamically load the plugin + +These serialization library plugins must wrap functionality to: + +- determine suitability for use with a message's reported serialization version and type +- parse the ``.idl`` / ``.msg`` component of the message description into a serialization library usable form +- given that parsed description (which provides type reflection information about the message buffer), deserialize the message buffer using the serialization library + +In particular, the step of parsing the ``.idl`` / ``.msg`` file is necessary because there might be serialization library or type support specific steps or considerations (e.g. field re-ordering, endianness) that would not be captured in the ``.idl`` / ``.msg`` file, because they would have been assumed to have been applied during type support. + +Dealing with Multiple Applicable Plugins for A Serialization Format +""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +In the case where there exist multiple applicable plugins for a particular serialization format (e.g. 
when a user's machine has both RTI Connext's CDR library and FastCDR), the plugin matching should follow this priority order: + +- user specified overrides passed to the matching function +- defaults defined in the plugin matching function, if applicable, otherwise +- the first suitable plugin in alphanumeric sorting order + + +Example Plugin Interface +^^^^^^^^^^^^^^^^^^^^^^^^ + +Plugin Matching and Loading +""""""""""""""""""""""""""" + +The following is an example of how this plugin matching and loading interface could look like, defining new ``rcl`` interfaces; with a plugin wrapping FastCDR v1.0.24 for serialization of ``sensor_msgs/msg/LaserScan`` messages: + +.. code:: + + // Suppose LaserScanDescription reports that it uses FastCDR v1.0.24 for its serialization + rcl_message_description_t LaserScanDescription = node->get_type_description("/scan"); + + rcl_type_introspection_t * introspection_handle; + introspection_handle->init(); // Locate local plugins here + + // Plugin name: "fastcdr_v1_0_24" + const char * plugin_name = introspection_handle->match_plugin(LaserScanDescription->get_serialization_format()); + rcl_serialization_plugin_t * plugin = introspection_handle->load_plugin(plugin_name); + + // If we wanted to force the use of MicroCDR instead + introspection_handle->match_plugin(LaserScanDescription->get_serialization_type(), "microcdr"); + +Then, the plugin should be able to use the description to deserialize the message buffer using the plugin: + +.. code:: + + rcl_deserialized_message_t * scan_msg; + introspection_handle->deserialize(&plugin, message_buffer, &LaserScanDescription, scan_msg); + + // Where, internally... + // rcl_type_introspection_t->(*deserialize) + void deserialize(rcl_serialization_plugin_t *plugin, + void *buffer, + rcl_message_description_t *description, + rcl_deserialized_message_t *msg) + { + ... 
+ + // Do something serialization specific + plugin_internal_description_t parsed_description = plugin->impl->parse_description(description); + plugin->impl->deserialize(buffer, parsed_description, msg); + + ... + } + +Similar interfaces could be created to allow access a message's fields by name or index without deserializing the whole message. + +Example Introspection API +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Once the serialization library plugins are able to deserialize the raw message buffer, downstream programs can then introspect the constructed deserialized message object, which should be laid out as such: + +.. code:: + + struct rcl_deserialized_field_t { + void * value; + char * type; + } + + struct rcl_deserialized_message_t { + int message_field_count; + const char** message_field_names; + const char** message_types; + + // Some dynamically allocated key->value associative map type storing void * field values + rcl_associative_array message_fields; + + // Function pointers + rcl_deserialized_field_t * (*get_field_by_index)(int index); + }; + +Now, for a given message description `Foo.msg`: + +.. code:: + + // Foo.msg + bool bool_field + char char_field + float32 float_field + +The corresponding `rcl_deserialized_message_t` can be queried accordingly: + +.. code:: + + rcl_deserialized_message_t * foo_msg; + foo_msg->message_field_names[0]; // "bool_field" + foo_msg->message_types[0]; // "bool" + + // Get the field + if (strcmp("bool", foo_msg->get_field_by_index(0)->type) == 0) + { + *((bool*)foo_msg->get_field_by_index(0)->value); + } + +.. TODO:: Create pseudocode/definitions for sequences and arbitrarily nested message descriptions. 
Rationale ========= From 8dccc0f8d2c7e69cd3e2f00768d5592580c505e9 Mon Sep 17 00:00:00 2001 From: methylDragon Date: Sat, 23 Jul 2022 15:34:05 -0700 Subject: [PATCH 10/22] Add nested type description section (#4) * Add nested type description section Signed-off-by: methylDragon * Use Field type in message descriptions Signed-off-by: methylDragon * Fix unmatched parenthesis Signed-off-by: methylDragon * Implement nested introspection API Signed-off-by: methylDragon * Rework nested introspection API Signed-off-by: methylDragon --- rep-2011.rst | 285 +++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 229 insertions(+), 56 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 581dedfd4..fc7ab8b2b 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -273,7 +273,7 @@ The final form of these interfaces should be found in the reference implementati string failure_reason # Empty if 'successful' was true, otherwise contains details on why it failed string type_description_raw # The idl or msg file, with comments and whitespace - TypeDescription type_description # The parse type description which can be used programmatically + TypeDescription type_description # The parsed type description which can be used programmatically string serialization_library string serialization_version @@ -290,17 +290,107 @@ And the ``IndividualTypeDescription`` type: .. code:: - FIELD_TYPE_INT = 0 - FIELD_TYPE_DOUBLE = 1 + string type_name + Field[] fields + +And the ``Field`` type: + +.. code:: + + NESTED_TYPE = 0 + FIELD_TYPE_INT = 1 + FIELD_TYPE_DOUBLE = 2 # ... 
and so on - uint8_t[] field_types - string[] field_names + uint8_t field_type + string field_name + string nested_type_name # If applicable (when field_type is 0) These naive examples of the interfaces just give an idea of the structure but perhaps do not yet consider some other complications like field annotations and more advanced IDL (generically all "interface description languages" not just the OMG-IDL that DDS uses) have. -.. TODO:: Should we use strings instead of integer enum for the ``field_types``? +Nested TypeDescription Example +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``TypeDescription`` message type shown above also supports the complete description of a type that contains other types (a nested type), up to an arbitrary level of nesting. +Consider the following example: + +.. code:: + + # A.msg + B b + C c + + # B.msg + bool b_bool + + # C.msg + D d + + # D.msg + bool d_bool + +The corresponding ``TypeDescription`` for ``A.msg`` will be as follows, with the referenced type descriptions accessible as ``IndividualTypeDescription`` types in the ``referenced_type_descriptions`` field of ``A``: + +.. code:: + + # A: TypeDescription + type_description: A_IndividualTypeDescription + referenced_type_descriptions: [B_IndividualTypeDescription, + C_IndividualTypeDescription, + D_IndividualTypeDescription] + +Note that the type description for ``A`` itself is found in the ``type_description`` field instead of the ``referenced_type_descriptions`` field. +Additionally, in the case where a type description contains no referenced types (i.e., when it has no fields, or all of its fields are primitive types), the ``referenced_type_descriptions`` array will be empty. + +.. 
code:: + + # A: IndividualTypeDescription + type_name: "A" + fields: [A_b_Field, A_c_Field] + # B: IndividualTypeDescription + type_name: "B" + fields: [B_b_bool_Field] + + # C: IndividualTypeDescription + type_name: "C" + fields: [C_d_Field] + + # D: IndividualTypeDescription + type_name: "D" + fields: [D_d_bool_Field] + +With the corresponding ``Field`` fields: + +.. code:: + + # A_b_Field + field_type: 0 + field_name: "b" + nested_type_name: "B" + + # A_c_Field + field_type: 0 + field_name: "c" + nested_type_name: "C" + + # B_b_bool_Field + field_type: 9 # Suppose 9 corresponds to a boolean field + field_name: "b_bool" + nested_type_name: "" # Empty if primitive type + + # C_d_Field + field_type: 0 + field_name: "d" + nested_type_name: "D" + + # D_d_bool_Field + field_type: 9 + field_name: "d_bool" + nested_type_name: "" + +In order to handle the type of a nested type such as ``A``, the receiver can use the ``referenced_type_descriptions`` array as a lookup table keyed by the value of ``Field.nested_type_name`` or ``IndividualTypeDescription.type_name`` (which will be identical for a given type) to obtain the type information of a referenced type. +This type handling process can also support any recursive level of nesting (e.g. if C is encountered as a nested type while handling A, C can then be looked up using the top level ``referenced_type_descriptions`` array). Additional Notes for TypeDescription Message Type ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -398,7 +488,7 @@ The following is an example of how this plugin matching and loading interface co .. 
code:: // Suppose LaserScanDescription reports that it uses FastCDR v1.0.24 for its serialization - rcl_message_description_t LaserScanDescription = node->get_type_description("/scan"); + rcl_runtime_introspection_description_t LaserScanDescription = node->get_type_description("/scan"); rcl_type_introspection_t * introspection_handle; introspection_handle->init(); // Locate local plugins here @@ -408,81 +498,123 @@ The following is an example of how this plugin matching and loading interface co rcl_serialization_plugin_t * plugin = introspection_handle->load_plugin(plugin_name); // If we wanted to force the use of MicroCDR instead - introspection_handle->match_plugin(LaserScanDescription->get_serialization_type(), "microcdr"); + introspection_handle->match_plugin(LaserScanDescription->get_serialization_format(), "microcdr"); -Then, the plugin should be able to use the description to deserialize the message buffer using the plugin: +Example Introspection API +^^^^^^^^^^^^^^^^^^^^^^^^^ -.. code:: +The following is an example for how the introspection API could look like. +This example will show a read-only interface. - rcl_deserialized_message_t * scan_msg; - introspection_handle->deserialize(&plugin, message_buffer, &LaserScanDescription, scan_msg); +Overview +"""""""" - // Where, internally... - // rcl_type_introspection_t->(*deserialize) - void deserialize(rcl_serialization_plugin_t *plugin, - void *buffer, - rcl_message_description_t *description, - rcl_deserialized_message_t *msg) - { - ... +It should comprise several components +- a handler for the message buffer, to handle pre-processing (e.g. 
decompression) +- a handler for the message description, to keep track of message field names of arbitrary nesting level +- handler functions for message buffer introspection - // Do something serialization specific - plugin_internal_description_t parsed_description = plugin->impl->parse_description(description); - plugin->impl->deserialize(buffer, parsed_description, msg); +Also, this example uses the LaserScan message definition: https://github.com/ros2/common_interfaces/blob/foxy/sensor_msgs/msg/LaserScan.msg - ... - } +.. TODO:: (methylDragon) Add a reference somehow? -Similar interfaces could be created to allow access a message's fields by name or index without deserializing the whole message. +API Example +""""""""""" -Example Introspection API -^^^^^^^^^^^^^^^^^^^^^^^^^ - -Once the serialization library plugins are able to deserialize the raw message buffer, downstream programs can then introspect the constructed deserialized message object, which should be laid out as such: +First, the message buffer handler: .. code:: - struct rcl_deserialized_field_t { - void * value; - char * type; + struct rcl_buffer_handle_t { + const void * buffer; // The buffer should not be modified + + const char * serialization_type; + rcl_serialization_plugin_t * serialization_plugin; + rcl_runtime_introspection_description_t * description; // Convenient to have + + // And some examples of whatever else might be needed to support deserialization or introspection... + void * serialization_impl; } - struct rcl_deserialized_message_t { - int message_field_count; - const char** message_field_names; - const char** message_types; +The message buffer handler should allocate new memory if necessary, or store a pointer to the message buffer otherwise in its ``buffer`` member. + +Then, functions should be written that allow for convenient traversal of the type description tree. 
+These functions should allow a user to get the field names and field types of the top level type, as well as from any nested types. - // Some dynamically allocated key->value associative map type storing void * field values - rcl_associative_array message_fields; +.. code:: - // Function pointers - rcl_deserialized_field_t * (*get_field_by_index)(int index); + struct rcl_field_info_t { // Mirroring Field + const char * field_name; // This should be an absolute address (e.g. "header.seq", instead of "seq") + + uint8_t type; + const char * nested_type_name; // Populated if the type is not primitive }; -Now, for a given message description `Foo.msg`: + // Get descriptions + rcl_runtime_introspection_description_t LaserScanDescription = node->get_type_description("/scan"); + rcl_runtime_introspection_description_t HeaderDescription = node->get_referenced_description(LaserScanDescription, "Header"); + + // All top-level fields from description + rcl_field_info_t ** fields = get_field_infos(&LaserScanDescription); + + // A single field from description + rcl_field_info_t * header_field = get_field_info(&LaserScanDescription, "header"); + + // A single field from a referenced description + rcl_field_info_t * stamp_field = get_field_info(&HeaderDescription, "stamp"); + + // A nested field from top-level description + rcl_field_info_t * stamp_field = get_field_info(&LaserScanDescription, "header.stamp"); + +Finally, there should be functions to obtain the data stored in the message fields. +This could be by value or by reference, depending on what the serialization library supports, for different types. + +There minimally needs to be a family of functions to obtain data stored in a single primitive message field, no matter how deeply nested it is. +These need to be created for each primitive type. + +The rest of the type introspection machinery can then be built on top of that family of functions, in layers higher than the C API. .. 
code:: - // Foo.msg - bool bool_field - char char_field - float32 float_field + rcl_buffer_handle_t * scan_buffer = node->get_processed_buffer(some_raw_buffer); + + // Top-level primitive field + get_primitive_field_float32(scan_buffer, "scan_time"); + + // Nested primitive field + get_primitive_field_uint32_seq(scan_buffer, "header.seq"); -The corresponding `rcl_deserialized_message_t` can be queried accordingly: + // Nested primitive field sequence element (overloaded) + get_field_seq_length(scan_buffer, "header.seq"); // Support function + get_primitive_field_uint32(scan_buffer, "header.seq", 0); + +If we attempt to do the same by reference, the plugin might decide to allocate new memory for the pointer, or return a pointer to existing memory. .. code:: - rcl_deserialized_message_t * foo_msg; - foo_msg->message_field_names[0]; // "bool_field" - foo_msg->message_types[0]; // "bool" + // Nested primitive field + get_primitive_field_uint32_seq_ptr(scan_buffer, "header.seq"); - // Get the field - if (strcmp("bool", foo_msg->get_field_by_index(0)->type) == 0) - { - *((bool*)foo_msg->get_field_by_index(0)->value); - } + // Be sure to clean up any dangling pointers + finalize_field(some_field_data_ptr); + +Error cases +""""""""""" + +The following should be error cases: + +- accessing field data as incorrect type +- accessing or introspecting incorrect/nonexistent field names + +.. TODO: (methylDragon) Are there more cases? It feels like there are... + +Pointer lifecycles +"""""""""""""""""" + +- the raw message buffer should outlive the ``rcl_buffer_handle_t``, since it is not guaranteed that the buffer handle will allocate new memory +- the ``rcl_buffer_handle_t`` should outlive any returned field data pointers, since it is not guaranteed that the serialization plugin will allocate new memory +- however, ``rcl_field_info_t`` objects **do not** have any lifecycle dependencies, since they are merely descriptors -.. 
TODO:: Create pseudocode/definitions for sequences and arbitrarily nested message descriptions. Rationale ========= @@ -578,6 +710,47 @@ Additionally, the option to add a configuration option to choose what contents t As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response. It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. +TypeDescription Structure +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Representing Fields as An Array of Field Types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The use of an array of ``Field`` messages was balanced against using two arrays in the ``IndividualTypeDescription`` type to describe the field types and field names instead, e.g.: + +.. code:: + + # Rejected IndividualTypeDescription Variants + + # String variant + string type_name + string field_types[] + string field_names[] + + # uint8_t Variant + string type_name + uint8_t field_types[] + string field_names[] + +The string variant was rejected because using strings to represent primitive types wastes space, and will lead to increased bandwidth usage during the discovery and type distribution process. +The uint8_t variant was rejected because uint8_t enums are insufficiently expressive to support nested message types. + +The use of the ``Field`` type, with a ``nested_type_name`` field that defaults to an empty string mitigates the space issue while allowing for support of nested message types. +Furthermore, it allows the fields to be described in a single array, which is easier to iterate through and also reduces the chances of any errors from mismatching the array lengths. 
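The lookup that a single ``Field`` array enables can be sketched in C as follows. This is only an illustration under stated assumptions: the ``example_`` structs loosely mirror ``Field``, ``IndividualTypeDescription`` and ``TypeDescription`` but are hypothetical, the dotted path syntax follows the "header.stamp" style used in the introspection example, and the value 9 for a boolean field is the same assumption made in the nested example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirrors of the proposed messages; not a real rcl API. */
enum { EXAMPLE_NESTED_TYPE = 0, EXAMPLE_FIELD_TYPE_BOOL = 9 /* assumed */ };

typedef struct {
  uint8_t field_type;
  const char * field_name;
  const char * nested_type_name;  /* "" when the field is primitive */
} example_field_t;

typedef struct {
  const char * type_name;
  const example_field_t * fields;
  size_t field_count;
} example_individual_type_description_t;

typedef struct {
  example_individual_type_description_t type_description;
  const example_individual_type_description_t * referenced;
  size_t referenced_count;
} example_type_description_t;

/* Use the flat referenced-types array as a lookup table keyed by type name. */
static const example_individual_type_description_t * example_find_referenced(
  const example_type_description_t * td, const char * type_name)
{
  for (size_t i = 0; i < td->referenced_count; ++i) {
    if (strcmp(td->referenced[i].type_name, type_name) == 0) {
      return &td->referenced[i];
    }
  }
  return NULL;
}

/* Resolve a dotted path such as "c.d.d_bool"; returns NULL on any error. */
const example_field_t * example_resolve_field(
  const example_type_description_t * td, const char * path)
{
  assert(td && path);
  const example_individual_type_description_t * current = &td->type_description;
  while (current != NULL) {
    const char * dot = strchr(path, '.');
    size_t len = dot ? (size_t)(dot - path) : strlen(path);
    const example_field_t * match = NULL;
    for (size_t i = 0; i < current->field_count; ++i) {
      const char * name = current->fields[i].field_name;
      if (strlen(name) == len && strncmp(name, path, len) == 0) {
        match = &current->fields[i];
        break;
      }
    }
    if (match == NULL) { return NULL; }  /* nonexistent field name */
    if (dot == NULL) { return match; }   /* last path component */
    if (match->field_type != EXAMPLE_NESTED_TYPE) {
      return NULL;                       /* cannot descend into a primitive */
    }
    current = example_find_referenced(td, match->nested_type_name);
    path = dot + 1;
  }
  return NULL;  /* referenced type missing from the lookup table */
}
```

With the ``A``/``B``/``C``/``D`` example (taking ``D``'s field to be named ``d_bool`` as in ``D.msg``), resolving ``"c.d.d_bool"`` visits ``A``, ``C`` and ``D`` in turn. Because each type stores its fields in one array, name and type always travel together and there are no parallel array lengths to keep in sync.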
+ +Using an Array to Store Referenced Types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Some alternatives to using an array of type descriptions to store referenced types in a nested type were considered, including: + +- Storing the referenced types inside the individual type descriptions and accessing them by traversing the type description tree recursively instead of using a lookup table. + + - Rejected because the IDL spec does not allow for a type description to store itself, and also because it could possibly introduce duplicate, redundant type descriptions in the tree, using up unnecessary space. + +- Sending referenced types in a separate service call or message. + + - Rejected because needing to collate all of the referenced types on the receiver end introduces additional implementation complexity, and also increases network bandwidth with all the separate calls that must be made. + Backwards Compatibility ======================= From b6a708652cd2c54b485fa6c512e9b32f6d5b5113 Mon Sep 17 00:00:00 2001 From: youliang Date: Sun, 24 Jul 2022 06:34:24 +0800 Subject: [PATCH 11/22] Add alternatives and on_inconsistent_topic section (#5) Signed-off-by: youliang --- rep-2011.rst | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index fc7ab8b2b..b310b3c11 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -205,6 +205,8 @@ However, the downside is that detecting the mismatch is more difficult and it al .. TODO:: figure out if mismatched types produces a IncompatibleQoSOffered callback or not, then document the recommended way to detect type version mismatches, also look into ``DDS XTypes spec v1.3: 7.6.3.4.2: INCONSISTENT_TOPIC`` as a possible alternative +The DDS standard provides an ``on_inconsistent_topic()`` function in the ``ParticipantListener`` interface, which lets the user implement a callback function that is invoked when an inconsistent topic event is detected. 
An initial investigation has been done to validate the implementation of ``on_inconsistent_topic()`` across different middleware vendors, specifically FastDDS, Cyclone, and Connext. The result is that a working implementation is only available in RTI Connext (as of Jul 2022). + Notes for Implementing the Recommended Strategy with DDS ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -668,12 +670,28 @@ TODO Use Type Hash from Middleware, e.g. from DDS-XTypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -TODO +The type hash can be obtained through the native middleware API. For example, with FastDDS, the type hash can be obtained with ``TypeIdentifier->equivalence_hash()`` during the ``on_type_discovery()`` callback. The rmw layer can choose to use the provided hash to impose the aforementioned type enforcement. + +Evolving Message Definition with Extensible Type +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When defining the ``.idl`` msg file, the user can choose to apply annotations to the message definition (DDS XTypes spec v1.3: 7.3.1.2 Annotation Language). Evolving a message type can be achieved by leveraging optional fields and inheritance. For example, the ``Temperature.idl`` below uses ``@optional`` and ``@extensibility`` in the message definition. + +.. code:: + @extensibility(APPENDABLE) + struct Temperature + { + unsigned long long timestamp; + long long temperature; + @optional double temperature_float; + }; + +Furthermore, an initial evolving message test with the FastDDS, Cyclone, and Connext middleware implementations shows that ``@appendable`` and ``@optional`` are implemented in Cyclone and Connext, but not FastDDS (as of Jul 2022). Handle Detection of Version Mismatch "Above" rmw Layer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -TODO +We can choose to utilize the ``USER_DATA`` QoS to distribute the message version during the discovery phase. The message version for each participant will then be accessible across all available nodes.
By getting the version hash through ``user_data`` via the ``rmw`` layer, type version mismatches can be detected in a similar way. Prevent Communication of Mismatched Versions "Above" rmw Layer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From a030033f589f7715cde4003b71d317c87ce2743a Mon Sep 17 00:00:00 2001 From: William Woodall Date: Mon, 25 Jul 2022 17:35:52 -0700 Subject: [PATCH 12/22] fix rst errors, revise a few sections Signed-off-by: William Woodall --- .gitignore | 3 +- rep-2011.rst | 173 ++++++++++++++++++++++++++++++++++++++------------- 2 files changed, 132 insertions(+), 44 deletions(-) diff --git a/.gitignore b/.gitignore index 4c3bc3c4d..82ae0a379 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,5 @@ rep-0000.rst *.html *.pyc -*~ \ No newline at end of file +*~ +.python-version diff --git a/rep-2011.rst b/rep-2011.rst index b310b3c11..27ba3b1a3 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -1,6 +1,6 @@ REP: 2011 Title: Evolving Message Types, and Other ROS Interface Types, Over Time -Author: William Woodall +Author: William Woodall , Brandon Ong , You Liang Tan , Kyle Marcey Status: Draft Type: Standards Track Content-Type: text/x-rst @@ -13,17 +13,18 @@ Abstract This REP proposes patterns and approaches for evolving message, service, and action ROS interface types over time. -The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling. -However, technologies provided by different, perhaps future, serialization technologies can always be used in addition. +The proposed patterns use the existing default serialization technology, i.e. CDR, in combination with new tooling to detect when types have changed and new tooling to convert between versions. +However, it should be possible to use features provided by different, perhaps future, serialization technologies in addition to the approaches proposed in this REP. 
-Specifically, this REP proposes that interpreting older versions of the message at runtime can be achieved by using its type information and then converting it to the current version of the message using user-defined transfer functions. +Specifically, this REP proposes that we provide tooling to convert from old versions of types to newer versions of types on demand, using user-defined transfer functions. This approach is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. -This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the type information of the messages on the wire, beyond their name and version, and use that to dynamically interpret it at runtime. +This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the full type descriptions on the wire, beyond their name and version, and use that description to introspect the values of it at runtime. -This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies so long as we have the ability to introspect the messages at runtime without prior knowledge of the type information, which is a feature we also need for generic introspection tools. +This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies. +This is true so long as we have the ability to introspect the messages at runtime without prior knowledge of the type description, which is a feature we also need for generic introspection tools like ``rosbag2`` and ``ros2 topic echo``. 
-Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support dynamic interpretation of messages at runtime. +Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support runtime introspection of messages. It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and an interface similar to DDS-XTypes will likely be adopted in the ROS 2 middleware API, without explicitly relying on it. Alternatives are also discussed in more detail. @@ -31,8 +32,8 @@ Alternatives are also discussed in more detail. Motivation ========== -Evolving types over time is a necessary part of developing a new system using ROS 2 as well as a necessary part of evolving ROS 2 itself over time. -As needs change, types often need to change to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. +Evolving types over time is a necessary part of developing a new system using ROS 2, as well as a necessary part of evolving ROS 2 itself over time. +As needs change, types often need to change as well to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. 
To understand this need, it's useful to look at a few common scenarios, for example consider this message definition (using the "rosidl" ``.msg`` format for convenience, but it could just as easily be the OMG IDL ``.idl`` format or something else like protobuf): @@ -43,6 +44,7 @@ To understand this need, it's useful to look at a few common scenarios, for exam int64 temperature There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. + There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields, they may add a new optional field like this (pseudo code): .. code:: @@ -52,7 +54,7 @@ There are a few ways the developers could approach this, e.g. if the serializati int64 temperature optional float64 temperature_float -Updated code could use it like this: +Updated code could use it like this (pseudo code): .. code:: @@ -67,9 +69,10 @@ Updated code could use it like this: } This is not uncommon to see in projects, and it has the advantage that old data, whether it is from a program running on an older machine or is from old recorded data, can be interpreted by newer code without additional machinery or features beyond the optional field type. -Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, which is very forgiving for developers in a way, but likely wouldn't be good long term solution. +Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, which is very forgiving for developers in a way. 
+However, as projects grow and the number of incidents, like the one described in the example above, increases, this kind of approach can add a lot of complexity to the sending and receiving code. -You can imagine other serialization features, like inheritance or type coercion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. +You can imagine other serialization features, like inheritance or implicit type conversion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. Each of these approaches also requires a more sophisticated message API for the users than what is currently provided by ROS 2. At the moment the user can access all fields of the message directly, i.e. "member based access" vs "method based access", and that would need to change in some way to use some of these features. @@ -86,9 +89,9 @@ And also update any code publishing or subscribing to this type at the same time This is typically what happens right now in ROS 2, and also what happened historically in ROS 1. This approach is simple and requires no additional serialization features, but obviously doesn't do anything on its own to help developers evolve their system while maintaining support for already deployed code and utilizing existing recorded data. -However, with additional tooling these cases can be handled without the code of the application knowing about. +However, this REP proposes that with additional tooling these cases can be handled without the code of the application knowing about them. Consider again the above example, but this time the update to the message was done without backwards compatibility in the message definition. 
-This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic but expects the new version of the message. +This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic, because the new subscription expects the new version of the message. For the purpose of this example, let's call the original message type ``Temperature`` and the one using ``float64`` we'll call ``Temperature'``. So, if you have rosbag2 publishing ``Temperature`` messages and a program consuming ``Temperature'`` messages they will not communicate, unless you have an intermediate program doing the translation. @@ -98,21 +101,21 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum │ rosbag2 ├───────────────►│transfer func.├──────────────►│ App │ └─────────┘ Temperature └──────────────┘ Temperature' └─────┘ -The "transfer function" can be user-defined, or for simple changes (like changing the field type to a compatible type) it can be done automatically. -We already do something like this for the ROS 1 to ROS 2 bridge in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. - -Furthermore, the "transfer function" approach also allows for runtime transformation of messages that only change in field semantics (e.g., a change of ``float distance`` to mean centimeters instead of meters, but with no change in field name or type). -Although in those cases, users will likely have to define the transfer function themselves. +The "transfer function" can be user-defined, or for simple changes, like changing the field type to a compatible type, it can be done automatically. 
+We already do something like this for the "ROS 1 to ROS 2 bridge" in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. -.. TODO:: cite the above +.. TODO:: cite the ros1_bridge rules and the rosbag migration rules -This approach requires a few features, like the ability to have a single application read old and new versions of a message at the same time, and it requires more infrastructure and tooling to make it work, but it has the advantage of keeping both the publishing and subscribing code simple, i.e agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields. +The transfer functions require the ability to have a single application which can interact with both the old and the new versions of a message at the same time. +Making this possible requires several new technical features for ROS 2, and some new infrastructure and tooling. +However, by keeping the conversion logic contained in these transfer functions, this approach has the advantage of keeping both the publishing and subscribing code simple. +That is to say, it keeps both the publishing and subscribing code agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields, e.g. having both a ``temperature`` and ``temperature_float`` in the same message. Either way, a problem can usually be solved by changing a message in some, if not all, of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. 
It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable to suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. -We will require some features from the middleware and serialization technology, however, to handle evolving interfaces, but we should try to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. +That being said, this proposal will require some specific features from the middleware and serialization technology, but the goal is to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suits them. With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as a rationale in the Rationale section and a discussion of alternatives in the Alternatives section and its sub-sections. @@ -125,21 +128,91 @@ TODO Specification ============= -The proposal is to provide tooling to help users identify when messages have changed, help users configure their system to convert between versions of messages on the fly, and help users write the code needed to convert between types when the conversion is not trivial. 
+The proposal is to provide tooling to help users: + +- identify when messages have changed +- configure their system to convert between versions of messages on demand +- write the code needed to convert between types when the conversion is not trivial + +In order to do this, this proposal describes how ROS 2 can be changed to support these tools by: + +- enforcing type compatibility by version + + - by providing type versions as hashes for types automatically + - and providing access to the type versions used by remote endpoints + +- providing access to the complete type description of types being used + + - and providing remote access of the type description used by remote endpoints + +- making it possible to publish and subscribe to topics using just the type description + + - even when the type was not available at compile time + - and introspecting the values from a serialized message using just the type description + +This Specification section covers the conceptual overview in more detail, then describes each of the technical changes needed in ROS 2, and then finishes by describing the new tooling that will help users in the aforementioned ways. Conceptual Overview ------------------- Users will be able to calculate the "type version hash" for an interface (e.g. a message, service, or action) using the ``ros2 interface hash `` command. -Additionally, if a topic has two types being used on it with the same type name, but different type versions, a warning will be logged and the endpoints that do not match will not communicate. - -.. TODO:: how does this interact with serialization features like optional fields and inheritance? Is there a way to override this behavior when the hashes don't match but communication will work due to optional fields or inheritance? +Additionally, if a topic has two types being used on it with the same type name, but different type versions, a warning may be logged and the endpoints that do not match may not communicate. 
+An exception to this rule is when the underlying middleware has more sophisticated rules for matching types (for example, a type that has been extended with an optional field may still match), in which case ROS 2 will defer to the middleware and not produce warnings when the type version hashes do not match. +Instead, ROS 2 will rely on the middleware to notify it when two endpoints do not match based on their types not being compatible, so that a warning can be produced. -When a mismatch is detected, the user can use predefined, or user-defined, "transfer functions" to convert between versions of the type until it is in the type they wish to send or receive. +When a mismatch is detected, the user can use predefined, or user-defined, "transfer functions" to convert between versions of the type until it is in the type version they wish to send or receive. They can use a tool that will look at a catalogue of available transfer functions to find a single transfer function, or a set of transfer functions, to get from the current type version to the desired type version. + +.. 
code:: + + ┌───────────────────────┐ + ┌───────────────┐ │ Implicit Conversion │ ┌───────────────┐ + │Message@current├────►│ by ├───►│Message@desired│ + └───────────────┘ │ Generic Transfer Func.│ └───────────────┘ + └───────────────────────┘ + + or + + ┌───────────────────────┐ + ┌───────────────┐ │ Implicit Conversion │ ┌────────────────────┐ + │Message@current├────►│ by ├───►│Message@intermediate│ + └───────────────┘ │ Generic Transfer Func.│ └────────────────────┘ + └───────────────────────┘ + + or + + ┌───────────────────────┐ + ┌───────────────┐ │ User-defined transfer │ ┌───────────────┐ + │Message@current├────►│ function from current ├───...───►│Message@desired│ + └───────────────┘ │to desired/intermediate│ ▲ └───────────────┘ + └───────────────────────┘ │ + │ + possibly other transfer functions + The tool will start with the current type version and see if it can be automatically converted to the desired type version, or if it is accepted as an input to any user-defined transfer functions or if it can be automatically converted into one of the input type versions for the transfer functions. It will continue to do this until it reaches the desired type version or it fails to find a path from the current to the desired type version. +.. 
code:: + + ┌──────────────────────┐ /topic ┌─────────────────────────┐ + │Publisher├────────X────────►│Subscription│ + └──────────────────────┘ └─────────────────────────┘ + │ │ │ + remap publisher │ │ │ and add transfer function + ▼ ▼ ▼ + ┌──────────────────────┐ ┌─────────────────────────┐ + │Publisher│ │Subscription│ + └─┬────────────────────┘ └─────────────────────────┘ + │ ▲ + │ ┌─────────────────────────────────┐ │ + ✓ /topic/ABC │ Transfer Functions for ABC->XYZ │ /topic ✓ + │ │ │ │ + │ ┌──────────┴──────────────┐ ┌──────────────┴───────┐ │ + └─►│Subscription│ │Publisher├──┘ + └──────────┬──────────────┘ └──────────────┬───────┘ + │ │ + └─────────────────────────────────┘ + Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version that what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. Tools will assist the user in making these remappings and running the necessary component nodes with the appropriate configurations, either from their launch file or from the command line. @@ -147,7 +220,19 @@ Tools will assist the user in making these remappings and running the necessary Once the mismatched messages are flowing through the transfer functions, communication should be possible and neither the publishing side nor the subscribing side have any specific knowledge of the conversions taking place or that any conversions are necessary. 
-In order to support this vision, three missing features will need to be added into ROS 2: controlling matching based on the type version hash (interface type enforcement), communicating the interface type description between nodes (inter-process type description distribution), and (de)serializing messages and services given only a type description of the interface and a buffer of bytes (runtime type introspection). +.. TODO:: + + Extend conceptual overview to describe how this will work with Services and Actions. + Services, since they are not sensitive to the many-to-many (many publisher) issue, unlike Topics, and because they do not have as many QoS settings that apply to them, they can probably have transfer functions that are plugins, rather than separate component nodes that repeat the service call, like the ros1_bridge. + Actions will be a combination of topics and services, but will have other considerations in the tooling. + +In order to support this vision, three missing features will need to be added into ROS 2 (which were also mentioned in the introduction): + +- enforcing type compatibility by version +- providing access to the complete type description of types being used +- making it possible to publish and subscribe to topics using just the type description + +These features are described in the following sections. Interface Type Enforcement -------------------------- @@ -155,9 +240,10 @@ Interface Type Enforcement In order to detect type version mismatches and enforce them, a way to uniquely identify versions is required, and this proposal uses type version hashes. Type Version Hash -~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^ -The type version hashes are not sequential and do not imply any rank among versions of the type. That is, given two version hashes of a type, there is no way to tell which is "newer". +The type version hashes are not sequential and do not imply any rank among versions of the type. 
+That is, given two version hashes of a type, there is no way to tell which is "newer". The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. @@ -180,7 +266,7 @@ The user-defined interface version makes it possible to change the version of a .. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more Enforcing Type Version -~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^ The type version hash can be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. @@ -237,7 +323,7 @@ So, it is useful to have a mechanism to convey the type descriptions from the so Furthermore, this feature should be agnostic to the underlying middleware and serialization library. Sending the Type Description -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. TODO:: should each publisher be contacted for each type/version pair or should we assume that the type/version pair guarantees a unique type description? @@ -251,7 +337,7 @@ The service server will respond with the type description and any necessary meta This service is not expected to be called frequently, and is likely to only occur when new topic or service endpoints are created, and even then, only if the endpoint type hashes do not match. 
Type Description Contents and Format -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The response sent by the service server will contain a combination of the original ``idl`` or ``msg`` file's content, as well as any necessary information to serialize and deserialize the raw message buffers sent on the topic. The response will contain a version of the description that contains comments from the original type description, as those might be relevant to interpreting the semantic meaning of the message fields. @@ -401,7 +487,7 @@ Given that the type description message interface has to be generic enough to su This should be done in such a way that the fields are tick-tocked and deprecated properly, possibly by having explicitly named versions of this interface, e.g. ``TypeDescriptionV1`` and ``TypeDescriptionV2`` and so on. Implementation in the `rcl` Layer -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The implementation of the type description distribution feature will be made in the ``rcl`` layer as opposed to the ``rmw`` layer to take advantage of the abstraction away from the middleware and to allow for compatibility with the client libraries. @@ -627,7 +713,7 @@ Type Description Distribution ----------------------------- Using a Single ROS Service per Node -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The node that is publishing the data must already have access to the correct type description, at the correct version, in order to publish it, and therefore it is natural to get the data from that node. Similarly, a subscribing node also knows what type they are wanting to receive, both in name and version, and therefore it is again natural to get that information from the subscribing node. @@ -642,7 +728,7 @@ Some of the alternatives offered other benefits, but using a ROS Service introdu .. 
TODO:: cite the above, https://en.wikipedia.org/wiki/Request%E2%80%93response Combining the Raw and Parsed Type Description in the Service Response -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The contents of the "get type description" service response should include information that supports both aforementioned use cases (i.e. tools and runtime type introspection). These use cases have orthogonal interests, with the former requiring human-readable descriptions, and the latter preferring machine-readable descriptions. @@ -655,7 +741,7 @@ At the same time, it is useful to send information with the original description Finally, while there could be an argument for sending a losslessly compressed version of the message file, the expected low frequency of queries to the type description service incurs a negligible overhead that heavily reduces the benefit. Implementing in ``rcl`` versus ``rmw`` -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ While it is true that implementing the type description distribution on the ``rmw`` layer would allow for much lower level optimization, removing the layer of abstraction avoids having to implement this feature in each rmw implementation. @@ -668,16 +754,17 @@ Alternatives TODO Use Type Hash from Middleware, e.g. from DDS-XTypes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Type hash can be obtained by the native middleware api. For example, with fastDDS, the type hash can be obtained with ``TypeIdentifier->equivalence_hash()`` during the ``on_type_discovery()`` callback. the rmw layer can choose to use the provided hash to impose the aforementioned type enforcement. 
Evolving Message Definition with Extensible Type -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ When defining the ``.idl`` msg file, user can choose to apply annotations to the message definition (DDS XTypes spec v1.3: 7.3.1.2 Annotation Language). Evolving message type can be achieved by leveraging optional fields and inheritance. For example, the ``Temperature.idl`` below uses ``@optional`` and ``@extensibility`` in the message definition. .. code:: + @extensibility(APPENDABLE) struct Temperature { @@ -689,16 +776,16 @@ When defining the ``.idl`` msg file, user can choose to apply annotations to the Furthermore, an initial evolving message test with FastDDS, Cyclone and Connext middleware implementation shows that ``@appendable`` and ``@optional`` are implemented in Cyclone and Connext, but not FastDDS (as of Jul 2022). Handle Detection of Version Mismatch "Above" rmw Layer -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We can choose to utilize ``USER_DATA`` QoS to distribute the message version during discovery phase. The message version for each participant will then be accessable across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected. 
Prevent Communication of Mismatched Versions "Above" rmw Layer -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TODO Type Description Distribution -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Other Providers of Type Description ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -729,7 +816,7 @@ As for the format of the type description, using the ROS interfaces to describe It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. TypeDescription Structure -~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^ Representing Fields as An Array of Field Types ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ From 04306e4906d2329825ac661a6c8ad3bce774ec51 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Wed, 27 Jul 2022 04:23:40 -0700 Subject: [PATCH 13/22] touch ups, rename to Run-Time Interface Reflection, reorganize Signed-off-by: William Woodall --- rep-2011.rst | 591 +++++++++++++++++++++++++++++++++------------------ 1 file changed, 390 insertions(+), 201 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 27ba3b1a3..dcfaab265 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -19,13 +19,13 @@ However, it should be possible to use features provided by different, perhaps fu Specifically, this REP proposes that we provide tooling to convert from old versions of types to newer versions of types on demand, using user-defined transfer functions. This approach is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. 
-This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the full type descriptions on the wire, beyond their name and version, and use that description to introspect the values of it at runtime. +This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the full type descriptions on the wire, beyond their name and version, and use that description to introspect the values of it at runtime using reflection. This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies. This is true so long as we have the ability to introspect the messages at runtime without prior knowledge of the type description, which is a feature we also need for generic introspection tools like ``rosbag2`` and ``ros2 topic echo``. Also covered in this REP are recommended changes to the middleware API in ROS 2, as well as additional infrastructure, to support runtime introspection of messages. -It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification, and an interface similar to DDS-XTypes will likely be adopted in the ROS 2 middleware API, without explicitly relying on it. +It is very similar to, though slightly narrower and more generic than, the DDS-XTypes specification\ [1]_, and an interface similar to DDS-XTypes will likely be adopted in the ROS 2 middleware API, without explicitly relying on it. Alternatives are also discussed in more detail. @@ -104,7 +104,9 @@ So, if you have rosbag2 publishing ``Temperature`` messages and a program consum The "transfer function" can be user-defined, or for simple changes, like changing the field type to a compatible type, it can be done automatically. 
We already do something like this for the "ROS 1 to ROS 2 bridge" in order to handle changes between message types in ROS 1 and ROS 2, and something like this was also done for rosbags in ROS 1 with the "bag migration rules" feature. -.. TODO:: cite the ros1_bridge rules and the rosbag migration rules +.. TODO:: + + cite the ros1_bridge rules and the rosbag migration rules The transfer functions require the ability to have a single application which can interact with both the old and the new versions of a message at the same time. Making this possible requires several new technical features for ROS 2, and some new infrastructure and tooling @@ -138,8 +140,9 @@ In order to do this, this proposal describes how ROS 2 can be changed to support - enforcing type compatibility by version - - by providing type versions as hashes for types automatically - - and providing access to the type versions used by remote endpoints + - by providing type versions as hashes for types automatically, and + - providing access to the type versions used by remote endpoints, and + - warning users when type enforcement is preventing two endpoints from matching - providing access to the complete type description of types being used @@ -216,7 +219,9 @@ It will continue to do this until it reaches the desired type version or it fail Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version that what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. Tools will assist the user in making these remappings and running the necessary component nodes with the appropriate configurations, either from their launch file or from the command line. -.. 
TODO:: discuss the implications for large messages and the possibility of having the transfer functions be colocated with either the publisher or subscription more directly than with component nodes and remapping. +.. TODO:: + + discuss the implications for large messages and the possibility of having the transfer functions be colocated with either the publisher or subscription more directly than with component nodes and remapping. Once the mismatched messages are flowing through the transfer functions, communication should be possible and neither the publishing side nor the subscribing side have any specific knowledge of the conversions taking place or that any conversions are necessary. @@ -248,105 +253,217 @@ That is, given two version hashes of a type, there is no way to tell which is "n The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. -In order to calculate the type version hashes so that they are stable and are not sensitive to trivial changes like changes in the comments or whitespace in the IDL file, the IDL file given by the user, which may be a ``.msg`` file, ``.idl`` file, or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. +The type version hashes are calculated in a stable way and are not sensitive to trivial changes like changes in the comments or whitespace of the IDL file. +The IDL file given by the user, which may be a ``.msg`` file, ``.idl`` file, or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. 
The data structure includes: - a list of field names and types, but not default values -- the serialization format -- the serialization format version +- a recursive list of field names and types for referenced types - an optional user-defined interface version, or 0 if not provided +.. TODO:: + + Should the message name, including package and namespace, be part of this? + Consider a situation where you have the same data structure but two different type names, should those two instances have the same hash? + They are unique when paired with their type name, so it should be ok, but it is a bit weird, perhaps, since we're not using this type hash to check for wire compatibility between differently named types. + +.. TODO:: + + Related TODO, should we just use the TypeDescription described in a later section? + It's essentially the same thing, but it does include the type name. + I (wjwwood) am leaning in this direction. + The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". -This hash is combined with a "type version hash standard version", the first of which will be ``IDLHASH-1``, with an ``@`` symbol, resulting in a complete type version hash like ``IDLHASH-1@<160-bit SHA-1 of data structure>``. +This hash is combined with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS1``, with an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<160-bit SHA-1 of data structure>``. This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. -The user-defined interface version makes it possible to change the version of a message that only changed in "field semantics" (i.e. 
without changing field names or types), and therefore makes it possible to write "transfer functions" to handle semantic-only conversions between versions. +For now, the field names and their types are the only contributing factors, but in the future that could change, depending on which "annotations" are supported in ``.idl`` files. +The "IDL - Interface Definition and Language Mapping" design document\ [2]_ describes which features of the OMG IDL standard are supported by ROS 2. +If that is extended in the future, then this data structure will need to be updated, and the "type version hash standard version" will need to be incremented as well. + +.. TODO:: + + Re-audit the supported features from OMG IDL according to the referenced design document, including the @key annotation and how it may impact this for the reference implementation. + +The optional user-defined interface version makes it possible to change the version hash of a message that only changed in "field semantics" (i.e. without changing field names or types), and therefore makes it possible to write "transfer functions" to handle semantic-only conversions between versions. +There is currently no standard way to specify the user-defined interface version in either ``.msg`` or ``.idl`` files. + +.. TODO:: + + Remove the idea of the user-defined interface version or define how it can be supplied by the user in one or more of the idl file kinds. -.. TODO:: is the list of field names and types sufficient? how to capture things like .idl annotations, etc... I'm thinking of serialization format specific entries can be added to this data structure, but need to sketch it out a bit more +Note that this data structure does not include the serialization format being used, nor does it include the version of the serialization technology. +This type version hash is for the *description* of the type, and is not meant to be used to determine wire compatibility by itself. 
+The type version hash must be considered in context, with the serialization format and version included to determine wire compatibility. Enforcing Type Version ^^^^^^^^^^^^^^^^^^^^^^ -The type version hash can be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. +The type version hash may be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. +Again, whether or not it is used will depend on the underlying middleware and how it determines if types are compatible between endpoints. +Simpler middlewares will not do anything other than check that the type names match, in which case the version hash will likely be used. +However, in more sophisticated middlewares type compatibility can be determined using more complex rules and by looking at the type descriptions themselves, and in those cases the type version hash may not be used to determine matching. + +When creating a publisher or subscription, the caller normally provides: + +- a topic name, +- QoS settings, and +- a topic type + +Where the topic type is represented as a string and is automatically deduced based on the type given to the function that creates the entity. +The type may be passed as a template parameter in C++ or as an argument to the function in Python. + +For example, creating a publisher for the C++ type ``std_msgs::msg::String`` using ``rclcpp`` may result in a topic type like the string ``std_msgs/msg/String``. + +All of the above items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. -When creating a publisher or subscription, the caller normally provides: a topic name, QoS settings, and a topic type. -The topic type is represented as a string and is automatically deduced based on the type given to the create function, e.g. 
as a template parameter in C++ or the message type as an argument in Python. -For example, creating a publisher for ``std_msgs::msg::String`` in C++, may result in a topic type like ``std_msgs/msg/String``. -All of these items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. Nothing needs to change from the user's perspective, as the type version can be extracted automatically based on the topic type given, either at the ``rcl`` layer or in the ``rmw`` implementation itself. -However, the type version would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription, and therefore the job of using that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. +That is to say, how users create things like publishers and subscription should not need to change, no matter which programming language is being used. -The method for implementing the detection and enforcement of type version mismatches is left up to the middleware, as some middlewares will have tools to make this efficient and others will implement something like what would be possible in the ``rcl`` and above layers. +However, the type version hash would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription. +The decision of whether or not to use that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. + +The method for implementing the detection and enforcement of type version mismatches is left up to the middleware. 
+Some middlewares will have tools to handle this without this new type version hash and others will implement something like what would be possible in the ``rcl`` and above layers using the type version hash. By keeping this a detail of the ``rmw`` implementation, we allow the ``rmw`` implementations to make optimizations where they can. Recommended Strategy for Enforcing that Type Versions Match ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type version hash can be used to warn the user when the communication may not happen due to a version mismatch, and also if can be put into recordings for future comparison. +If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type version hash would only be used for debugging and for storing in recordings. + +However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type version hash with double underscores and then use that as the type name given to the underlying middleware. +For example, a type name using this approach may look like this: + +.. code:: + + sensor_msgs/msg/Image__RIHS1_XXXXXXXXXXXXXXXXXXXX + +This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or matchmaking process. 
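The recommended strategy above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the REP does not say how the parsed data structure is turned into bytes before hashing, so the canonical JSON encoding, the hexadecimal rendering of the digest, and the helper names ``type_version_hash`` and ``middleware_type_name`` are all assumptions made for this example.

```python
import hashlib
import json

# Hashing standard version named in this REP ("ROS IDL Hashing Standard" v1).
RIHS_PREFIX = "RIHS1"


def type_version_hash(description: dict) -> str:
    """Return 'RIHS1_<sha1>' for a parsed type description.

    Assumption: the description is encoded as canonical JSON (sorted keys,
    no extra whitespace) and the SHA-1 digest is rendered as hex.
    """
    canonical = json.dumps(description, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return f"{RIHS_PREFIX}_{digest}"


def middleware_type_name(type_name: str, version_hash: str) -> str:
    """Join the type name and version hash with double underscores."""
    return f"{type_name}__{version_hash}"


# Hypothetical parsed description: field names and types only, no comments.
image_description = {
    "fields": [["height", "uint32"], ["width", "uint32"]],
    "interface_version": 0,
}
image_hash = type_version_hash(image_description)
print(middleware_type_name("sensor_msgs/msg/Image", image_hash))
```

Because comments and whitespace never reach the hashed structure, only a change to the fields (or to the optional interface version) produces a different middleware type name.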
+ +However, one downside with this approach is that it makes interoperation between ROS 2 and the "native" middleware more difficult, as appending the version hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. + +Alternative Strategy for Enforcing that Type Versions Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Sometimes the recommended strategy would interfere with features of the middleware that allow for more complex type compatibility rules, or otherwise interferes with the function of the underlying middleware. +In these cases, it is appropriate to not take the recommended strategy and delegate the matching process and notification entirely to the middleware. -However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type version hash and then use that as the type name given to the underlying middleware. -This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or match making process. -However, the downside is that detecting the mismatch is more difficult and it also makes interoperating with ROS using the native middleware more difficult, as appending the version hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. +However, in order to increase compatibility between rmw implementations, it is recommended to fulfill the recommended approach whenever possible, even if the type name is not used in determining if two endpoints will match, i.e. in the case that the underlying middleware does something more sophisticated to determine type compatibility but ignores the type name itself. -.. 
TODO:: figure out if mismatched types produces a IncompatibleQoSOffered callback or not, then document the recommended way to detect type version mismatches, also look into ``DDS XTypes spec v1.3: 7.6.3.4.2: INCONSISTENT_TOPIC`` as a possible alternative +This recommendation is particularly useful in the case where a DDS based ROS 2 endpoint is talking to another DDS-XTypes based ROS 2 endpoint. +The DDS-XTypes based endpoint then has a chance to "gracefully degrade" to interoperate with the basic DDS based ROS 2 endpoint. +This would not be the case if the DDS-XTypes based ROS 2 endpoint did not include the type version hash in the type name, as is suggested with the recommended strategy. -DDS standard provides an ``on_inconsistent_topic()`` function interface in the ``ParticipantListener`` for user to implement a callback funtion when an inconsistent topic event is detected. An initial investigation has been done to validate the implementation of the ``on_inconsistent_topic()`` for different middleware vendors, specifically FastDDS, Cyclone, and Connext. As a result, the working implementation is only available in RTI Connext (as of Jul 2022). +.. TODO:: + + We need to confirm whether or not having a different type name prevents DDS-XTypes from working properly. + We've done some experiments, but we need to summarize the results and confirm the recommendation in the REP specifically geared towards DDS. + +Notifying the ROS Client Library +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +One potential downside to delegating type matching to the rmw implementation is that detecting the mismatch is more complicated. +If ROS 2 is to provide users a warning that two endpoints will not communicate due to their types not matching, it requires there to be a way for the middleware to notify the ROS layer when a topic is not matched due to the type incompatibility. 
+As some of the following sections describe, it might be that the rules by which the middleware decides on type compatibility are unknown to ROS 2, and so the middleware has to indicate when matches are and are not made. + +If the middleware just uses the type name to determine compatibility, then the rmw implementation can just check the type version hash, and if the hashes do not match between endpoints then the rmw implementation can notify ROS 2, and a warning can be produced. + +Either way, to facilitate this notice, the ``rmw_event_type_t`` shall be extended to include a new event called ``RMW_EVENT_OFFERED_TYPE_INCOMPATIBLE``. +Related functions and structures will also be updated so that the event can be associated with specific endpoints. + +.. TODO:: + + It's not clear how we will do this just now, since existing "QoS events" lack a way to communicate this information, I (wjwwood) think. + +Accessing the Type Version Hash +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For debugging and introspection, the type version hash will be accessible via the ROS graph API, by extending the ``rmw_topic_endpoint_info_t`` struct, and related types and functions, to include the type version hash, ``topic_type_version_hash``, as a string. +It should be alongside the ``topic_type`` string in that struct, but the ``topic_type`` field should not include the concatenated type version hash, if the recommended approach is used. +Instead, the type name and version hash should be separated and placed in their respective fields. + +This information should be transmitted as part of the discovery process. + +This field can optionally be an empty string, in order to support interaction with older versions of ROS where this feature was not yet implemented, but it should be provided if at all possible. 
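The hash comparison described above, for middlewares that only match on the type name, might look like the following sketch. ``EndpointInfo`` is a hypothetical Python stand-in for the extended ``rmw_topic_endpoint_info_t`` struct and ``find_version_mismatches`` is an invented helper; neither is an existing ROS 2 API. An empty hash is treated as "unknown" so that endpoints from older ROS versions do not trigger warnings.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EndpointInfo:
    """Hypothetical stand-in for the extended rmw_topic_endpoint_info_t."""
    node_name: str
    topic_type: str                    # type name only, without the hash
    topic_type_version_hash: str = ""  # empty if the remote side predates this feature


def find_version_mismatches(
    publishers: List[EndpointInfo],
    subscriptions: List[EndpointInfo],
) -> List[Tuple[EndpointInfo, EndpointInfo]]:
    """Return publisher/subscription pairs whose type names agree but whose hashes differ."""
    mismatches = []
    for pub in publishers:
        for sub in subscriptions:
            if pub.topic_type != sub.topic_type:
                continue  # different type names: not a version mismatch
            if not pub.topic_type_version_hash or not sub.topic_type_version_hash:
                continue  # hash unknown on one side: nothing to compare
            if pub.topic_type_version_hash != sub.topic_type_version_hash:
                mismatches.append((pub, sub))
    return mismatches


pubs = [EndpointInfo("camera", "sensor_msgs/msg/Image", "RIHS1_aaaa")]
subs = [
    EndpointInfo("viewer", "sensor_msgs/msg/Image", "RIHS1_bbbb"),
    EndpointInfo("legacy_viewer", "sensor_msgs/msg/Image"),
]
for pub, sub in find_version_mismatches(pubs, subs):
    print(f"warning: {pub.node_name} and {sub.node_name} use different versions of {pub.topic_type}")
```

Each reported pair is a candidate for the proposed ``RMW_EVENT_OFFERED_TYPE_INCOMPATIBLE`` event; how that event is actually delivered is left open by the TODO above.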
Notes for Implementing the Recommended Strategy with DDS ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -TODO +The DDS standard provides an ``on_inconsistent_topic()`` method on the ``ParticipantListener`` class as a callback function which is called when an ``INCONSISTENT_TOPIC`` event is detected. +This event occurs when the topic type does not match between endpoints, and can be used for this purpose, but at the time of writing (July 2022), this feature is not supported across all of the DDS vendors that can be used by ROS 2. + +Sending of the type version hash during discovery should be done using the ``USER_DATA`` QoS setting, even if the type version hash is not used for determining type compatibility, and then provided to the user-space code through the ``rmw_topic_endpoint_info_t`` struct. Interactions with DDS-XTypes or Similar Implicit Middleware Features ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -TODO - +When using DDS-XTypes, type compatibility is determined through sophisticated and configurable rules, allowing for things like extensible types, optional fields, implicit conversions, and even inheritance. +Which of these features is supported for use with ROS is out of scope for this REP, but if any of them are in use, then it may be possible for two endpoints to match and communicate even if their ROS 2 type version hashes do not match. -.. TODO:: move the draft section with details here after one of these options is selected: USER_DATA+ignore, USER_DATA+discovery "plugin", append type hash to type name in DDS, use type hash in DDS partition +In this situation the middleware is responsible for communicating to the rmw layer when an endpoint will not be matched due to type incompatibility. +The ``INCONSISTENT_TOPIC`` event in DDS applies to DDS-XTypes as well, and should be useful in fulfilling this requirement. 
Type Description Distribution ----------------------------- For some use cases the type version hash is insufficient and instead the full type description is required. -One of those use cases, which is also described in this REP, is "runtime type introspection", which is the ability to introspect the contents of a message at runtime when the description for that message, or that version of that message, was unavailable at compile time. +One of those use cases, which is also described in this REP, is "Run-Time Interface Reflection", which is the ability to introspect the contents of a message at runtime when the description for that message, or that version of that message, was unavailable at compile time. In this use case the type description is used to interpret the serialized data dynamically. -Another use case, which is not covered in this REP, is using the type description in tooling to either display the type description to the user or to include it in recordings like rosbags. +Another use case, which is not covered in this REP, is using the type description in tooling to either display the type description to the user or to include it in recordings. In either case, where the type description comes from doesn't really matter, and so, for example, it could be looked up on the local filesystem or read from a rosbag file. -However, in practice, the correct type description may not be found locally, especially in cases where you have different versions of messages in the same system, either because it's on another computer or perhaps because it is from a different distribution of ROS or was built in a different workspace. 
+However, in practice, the correct type description may not be found locally, especially in cases where you have different versions of messages in the same system, e.g.: + +- because it's on another computer, or +- because it is from a different distribution of ROS, or +- because it was built in a different workspace, or +- because the application has not been restarted since recompiling a change to the type being used -So, it is useful to have a mechanism to convey the type descriptions from the source of the data to other nodes, which we describe here as "type description distribution". -Furthermore, this feature should be agnostic to the underlying middleware and serialization library. +In any case, it is useful to have a mechanism to convey the type descriptions from the source of the data to other nodes, which we describe here as "type description distribution". + +Furthermore, this feature should be agnostic to the underlying middleware and serialization library, as two endpoints may not have the same rmw implementation, or the data may have been serialized to a different format in the case of playback of a recording. Sending the Type Description ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -.. TODO:: should each publisher be contacted for each type/version pair or should we assume that the type/version pair guarantees a unique type description? +Type descriptions will be provided by a ROS Service called ``~/_get_type_description``, which will be offered by each node. +There will be a single ROS Service per node, regardless of the number of publishers or subscriptions on that node. + +This service may be optional, for example being enabled or disabled when creating the ROS Node. -Type descriptions will be provided by a ROS Service (``~/_get_type_description``) on each node. -There will be a single ROS service per node, regardless of the number of publishers or subscriptions on that node. +.. TODO:: -.. TODO:: should this service be required/optional? 
+ + How can we detect when a remote node is not offering this service? + It's difficult to differentiate between "the Service has not been created yet, but will" and "the Service will never be created". + Should we use a ROS Parameter to indicate this? + But then what if remote access to Parameters (another optional Service) is disabled? + Perhaps we need a "services offered" list which is part of the Node metadata, which is sent for each node in the rmw implementation, but that's out of scope for this REP. -A service request to this ROS Service will comprise of the type name and the type version hash, which is distributed during discovery of endpoints and will be accessible through the ROS API. -The service server will respond with the type description and any necessary metadata needed to do runtime type introspection. +A service request to this ROS Service will consist of the type name and the type version hash, which is distributed during discovery of endpoints and will be accessible through the ROS Graph API, as described in previous sections. +The ROS Service server will respond with the type description and any necessary metadata needed to do Run-Time Interface Reflection. This service is not expected to be called frequently, and is likely to only occur when new topic or service endpoints are created, and even then, only if the endpoint type hashes do not match. +.. TODO:: + + Should each endpoint be asked about their type/version pair or should we assume that the type/version pair guarantees a unique type description and therefore reuse past queries? + (wjwwood) I am leaning towards assuming the type name and type version hash combo as being a unique identifier. 
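If the type name and type version hash are assumed to identify a type description uniquely, which the TODO above leans towards but does not settle, a client could reuse past queries as in the sketch below. ``TypeDescriptionCache`` and the ``fetch`` callable are hypothetical names used only for illustration; ``fetch`` stands in for a call to the ``~/_get_type_description`` service on a given node.

```python
from typing import Callable, Dict, Tuple

# (type_name, type_version_hash), assumed to identify a description uniquely.
TypeKey = Tuple[str, str]


class TypeDescriptionCache:
    """Reuse type descriptions across endpoints that share name and hash."""

    def __init__(self, fetch: Callable[[str, str, str], str]) -> None:
        # fetch(node_name, type_name, type_version_hash) -> raw description.
        # Hypothetical wrapper around the per-node service call.
        self._fetch = fetch
        self._descriptions: Dict[TypeKey, str] = {}

    def get(self, node_name: str, type_name: str, type_version_hash: str) -> str:
        key = (type_name, type_version_hash)
        if key not in self._descriptions:
            # Only the first endpoint seen with this key is queried.
            self._descriptions[key] = self._fetch(node_name, type_name, type_version_hash)
        return self._descriptions[key]


calls = []


def fake_fetch(node_name: str, type_name: str, type_version_hash: str) -> str:
    """Test double that records which node was asked."""
    calls.append(node_name)
    return "uint64 timestamp\nfloat64 temperature\n"


cache = TypeDescriptionCache(fake_fetch)
cache.get("/sensor_a", "example_msgs/msg/Temperature", "RIHS1_aaaa")
cache.get("/sensor_b", "example_msgs/msg/Temperature", "RIHS1_aaaa")
print(calls)
```

Under this assumption the second node is never contacted, which keeps the service as rarely used as the text above expects; if the pair turns out not to be unique, the cache key would also need the node or endpoint identity.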
+ Type Description Contents and Format ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -The response sent by the service server will contain a combination of the original ``idl`` or ``msg`` file's content, as well as any necessary information to serialize and deserialize the raw message buffers sent on the topic. +The response sent by the ROS Service server will contain a combination of the original ``idl`` or ``msg`` file's content, as well as any necessary information to serialize and deserialize the raw message buffers sent on the topic. The response will contain a version of the description that contains comments from the original type description, as those might be relevant to interpreting the semantic meaning of the message fields. Additionally, the response could include the serialization library used, its version, or any other helpful information from the original producer of the data. -All of this type description information will be sent as a ROS 2 service response, as the nodes in the previous section will be queried on the ROS layer. +.. TODO:: -.. TODO:: What happens if the message consumer doesn't have access to the serialization library stated in the meta-type? + What happens if the message consumer doesn't have access to the serialization library stated in the meta-type? + (wjwwood) It depends on what you're doing with the response. If you are going to subscribe to a remote endpoint, then I think we just refuse to create the subscription, as communication will not work, and there's a basic assumption that if you're going to communicate with an endpoint then you are using the same or compatible rmw implementations, which includes the serialization technology. The purpose of this section and ROS Service is to provide the information, not to ensure communication can happen. 
The ROS 2 message that defines the type description must be able to describe any message type, including itself, and since it is describing the message format, it should work independently from any serialization technologies used. This "meta-type description" message would then be used to communicate the structure of the type as part of the "get type description" service response. @@ -355,19 +472,32 @@ The final form of these interfaces should be found in the reference implementati .. code:: string type_name - string version_hash + string type_version_hash --- - bool successful # True if the type description information is available and populated in the response - string failure_reason # Empty if 'successful' was true, otherwise contains details on why it failed + # True if the type description information is available and populated in the response + bool successful + # Empty if 'successful' was true, otherwise contains details on why it failed + string failure_reason + + # The idl or msg file name + string type_description_raw_file_name + # The idl or msg file, with comments and whitespace + # The file extension and/or the contents can be used to determine the format + string type_description_raw + # The parsed type description which can be used programmatically + TypeDescription type_description + + # Key-value pairs of extra information. + string[] extra_information_keys + string[] extra_information_values - string type_description_raw # The idl or msg file, with comments and whitespace - TypeDescription type_description # The parsed type description which can be used programmatically +.. TODO:: - string serialization_library - string serialization_version - # ... other useful meta data + (wjwwood) I propose we use key-value string pairs for the extra information, but I am hoping for discussion on this point. + This type is sensitive to changes, since changes to it will be hard to roll out, and may need manual versioning to handle. 
+ We could also consider bounding the strings and sequences of strings, so the message could be "bounded", which is nice for safety and embedded systems. -Again, the final form of these interfaces should be referenced from the reference implementation, but the ``TypeDescription`` message type might look like this: +Again, the final form of these interfaces should be referenced from the reference implementation, but the ``TypeDescription`` message type might look something like this: .. code:: @@ -378,26 +508,199 @@ And the ``IndividualTypeDescription`` type: .. code:: - string type_name - Field[] fields + string type_name + Field[] fields And the ``Field`` type: .. code:: - NESTED_TYPE = 0 + FIELD_TYPE_NESTED_TYPE = 0 FIELD_TYPE_INT = 1 FIELD_TYPE_DOUBLE = 2 # ... and so on + FIELD_TYPE_INT_ARRAY = ... + FIELD_TYPE_INT_BOUNDED_SEQUENCE = ... + FIELD_TYPE_INT_SEQUENCE = ... + # ... and so on + string field_name uint8_t field_type - string field_name - string nested_type_name # If applicable (when field_type is 0) + uint64_t field_array_size # Only for Arrays and Bounded Sequences + string nested_type_name # Only for FIELD_TYPE_NESTED_TYPE -These naive examples of the interfaces just give an idea of the structure but perhaps do not yet consider some other complications like field annotations and more advanced IDL (generically all "interface description languages" not just the OMG-IDL that DDS uses) have. +These examples of the interfaces just give an idea of the structure but perhaps do not yet consider some other complications like field annotations or other as yet unconsidered features we want to support. -Nested TypeDescription Example -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +.. TODO:: + + (wjwwood) Add text about how to handle Service types, which are formed as a Request and a Response path, each of which is kind of like a Message. + We could just treat the Request and Response separately, or we could extend this scheme to include explicit support for Services. 
+ Since Actions are composed of Topics and Services, it is less so impacted, but we could similarly consider officially supporting them in this scheme. + +Versioning the ``TypeDescription`` Message Type +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Given that the type description message interface has to be generic enough to support anything described in the ROS interfaces, there will be a need to add or remove fields over time in the type description message itself. +This should be done in such a way that the fields are tick-tocked and deprecated properly, possibly by having explicitly named versions of this interface, e.g. ``TypeDescriptionV1`` and ``TypeDescriptionV2`` and so on. + +Implementation in the ``rcl`` Layer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The implementation of the type description distribution feature will be made in the ``rcl`` layer as opposed to the ``rmw`` layer to take advantage of the abstraction away from the middleware and to allow for compatibility with the client libraries. + +A hook will be added to ``rcl_node_init()`` to initialize the type description distribution service with the appropriate ``rcl_service_XXX()`` functions. +This hook should also keep a map of published and subscribed types which will be populated on each initialization of a publisher or subscription in the respective ``rcl_publisher_init()`` and ``rcl_subscription_init()`` function calls. +The passed ``rosidl_message_type_support_t`` in the init call can be used to obtain the relevant information, alongside any new methods added to support type version hashing. + +There will be an option to opt out of creating this service, and a way to start and stop the service after node creation as well. + +.. TODO:: + + (wjwwood) A thought just occurred to me, which is that we could maybe have "lazy" Service Servers, which watch for any Service Clients to come up on their Service name and when they see it, they could start a Service Server.
+ The benefit of this is that when we're not using this feature (or other features like Parameter get/set Services), then we don't waste resources creating them, but if you know the node name and the corresponding Service name that it should have a server on, you could cause it to be activated by trying to use it. + The problem with this would be that you cannot browse the Service servers because they wouldn't advertise until requested, but for "well-known service names" that might not be such a problem. + I guess the other problem would be there might be a slight delay the first time you go to use the service, but that might be a worthy trade-off too. + +Run-Time Interface Reflection +----------------------------- + +Run-Time Interface Reflection allows access to the data in the fields of a serialized message buffer when given: + +- the serialized message as a ``rmw_serialized_message_t``, basically just an array of bytes as a ``rcutils_uint8_array_t``, +- the message's type description, e.g. received from the aforementioned "type description distribution" or from a bag file, and +- the serialization format, name and version, which was used to create the serialized message, e.g. ``XCDR2`` for ``Extended CDR encoding version 2`` + +From these inputs, we should be able to access the fields of the message from the serialized message buffer using some programmatic interface, which allows you to: + +- list the field names +- list the field types +- access fields by name or index + +.. code:: + + message_buffer ─┐ ┌───────────────────────────────┐ + │ │ │ + message_description ─┼──►│ Run-Time Interface Reflection ├───► Introspection API + │ │ │ + serialization_format ─┘ └───────────────────────────────┘ + +Given that the scope of inputs and expected outputs is so limited, this feature should ideally be implemented as a separate package, e.g.
``rcl_serialization``, that can be called independently by any downstream packages that might need Run-Time Interface Reflection, e.g. introspection tools, rosbag transport, etc. +This feature can then be combined with the ability to detect type mismatches and obtain type descriptions in the previous two sections to facilitate communication between nodes of otherwise incompatible types. + +Additionally, it is important to note that this feature is distinct from ROS 2's existing "dynamic" type support (``rosidl_typesupport_introspection_c`` and ``rosidl_typesupport_introspection_cpp``). +The ``rosidl_typesupport_introspection_c*`` generators generate code at compile time for known types that provides reflection for those types. +This new feature, Run-Time Interface Reflection, will support reflection without generated code at compile time, instead dynamically interpreting the type description to provide this reflection. + +.. TODO:: + + Determine if the generated introspection API needs to continue to exist. + It might be that we just remove it entirely in favor of the new system described here, or at least merge them, as their API (after initialization) will probably be the same. + +.. TODO:: + + (wjwwood) I'm realizing now that we probably need to separate this section into two parts, first the reflection API used by the user and the rmw implementation, and then second the rmw implementation specific part, which we could call "Run-Time Type Support"? + It would be called such because it would be something like "TypeDescription in -> ``rosidl_message_type_support_t`` out"... + This second part is needed to make truly generic subscriptions and publishers, which until now has been kind of assumed in this section about reflection. + +.. TODO:: + + This section needs to be updated to be inclusive to at least Services, and then we can also mention Actions, though they are a combination of Topics and Services and so don't need special support probably. 
+ +Plugin Interface +^^^^^^^^^^^^^^^^ + +As Run-Time Interface Reflection is expected to work across any serialization format, the Run-Time Interface Reflection interface needs to be extensible so that the necessary serialization libraries can be loaded to process the serialized message. +Serialization format support in this case will be provided by writing plugins that wrap the serialization libraries that can then provide the Run-Time Interface Reflection feature with the needed facilities. +Therefore, when initializing a Run-Time Interface Reflection instance it will: + +- enumerate the supported serialization library plugins in the environment, +- match the given serialization format to appropriate plugins, +- select a plugin if more than one matches the criteria, and then +- dynamically load the plugin for use + +These serialization library plugins must implement an interface which provides methods to: + +- determine if the plugin can be used for a given serialization type and version, +- provide an interface for reflection of a specific type, given information provided by the "get type description" ROS Service described in the "Type Description Distribution" section, + + - mainly including the parsed ``TypeDescription`` instance, but also + - perhaps the original ``.idl`` / ``.msg`` text, and + - perhaps other extra information, + +- provide a ``rosidl_message_type_support_t`` instance given similar information from the previous point + +In particular, providing information beyond the ``TypeDescription``, like the ``.idl`` / ``.msg`` text and the serialization library that was used, may be necessary because there might be serialization library or type support specific steps or considerations (e.g. name mangling or ROS specific namespacing) that would not necessarily be captured in the ``.idl`` / ``.msg`` file. 
+ +Dealing with Multiple Applicable Plugins for A Serialization Format +""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +In the case where there exist multiple applicable plugins for a particular serialization format (e.g. when a user's machine has a plugin for both RTI Connext's CDR library and Fast-CDR), the plugin matching should follow this priority order: + +- a user specified override, passed to the matching function, or +- a default defined in the plugin matching function, or else +- the first suitable plugin in alphanumeric sorting order + +.. TODO:: + + (wjwwood) We should mention here that the (implicitly read-only) reflection described here is just one logical half of what we could do with the type description. + We could also provide the ability to dynamically define types and/or write to types using the reflection. + + For example, imagine a tool that subscribes to any type (that's just using the RTIR) but then modifies one of the fields by name and then re-publishes it. + That tool would need the ability to mutate an instance using reflection then serialize it to a buffer. + You couldn't just iterate with an existing buffer easily, because it would potentially need to grow the buffer. + + Also, consider a "time stamping" tool that subscribes to any message, and then on-the-fly defines a new message that has a timestamp field and a field with the original message in it, fills that out and publishes it. + That would need the ability to both create a new type from nothing (maybe no more than creating a ``TypeDescription`` instance) as well as a read-write interface to the reflection. + We don't have to support these things in this REP, but we should at least mention them and state if it is in or out of the scope of the REP.
+ +Rationale +========= + +TODO + +Type Description Distribution +----------------------------- + +Using a Single ROS Service per Node +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The node that is publishing the data must already have access to the correct type description, at the correct version, in order to publish it, and therefore it is natural to get the data from that node. +Similarly, a subscribing node also knows what type it wants to receive, both in name and version, and therefore it is again natural to get that information from the subscribing node. +The type description for a given type, at a given version, could have been retrieved from other places, e.g. a centralized database, but the other alternatives considered would have had to take care to ensure that it had the right version of the message, which is not the case for the node publishing the data. + +Because the interface for getting a type description is generic, it is not necessary to have this interface on a per entity, i.e. publisher, subscription, etc, basis, but instead to offer the ROS Service on a per node basis to reduce the number of ROS Services. +Therefore, the specification dictates that the type description is distributed by a single ROS Service for each individual node. + +There were also multiple alternatives for how to get this information from each node, but the use of a single ROS Service was selected because the task of requesting the type description from a node is well suited to a request-response style ROS Service. +Some of the alternatives offered other benefits, but using a ROS Service introduced the fewest dependencies, feature-wise, while accomplishing the task. + +.. 
TODO:: + + cite the above, https://en.wikipedia.org/wiki/Request%E2%80%93response + +Combining the Raw and Parsed Type Description in the Service Response +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The contents of the "get type description" service response should include information that supports both aforementioned use cases (i.e. tools and Run-Time Interface Reflection). +These use cases have orthogonal interests, with the former requiring human-readable descriptions, and the latter preferring machine-readable descriptions. + +Furthermore, the type description should be useful even across middlewares and serialization libraries and that makes it especially important to send at least the original inputs to the "type support pipeline" (i.e. the process of taking user-defined types and generating all supporting code). +In this case, because the "type support pipeline" is a lossy process, there is a need to ensure that enough information is sent to completely reproduce the original definition of the type, and therefore it makes sense to just send the original ``idl`` or ``msg`` file. + +At the same time, it is useful to send information with the original description that makes it easier to process data at the receiving end, as it is often not trivial to get to the "parsed" version of the type description from the original text description. + +Finally, while there could be an argument for sending a losslessly compressed version of the message file, the expected low frequency of queries to the type description service incurs a negligible overhead that heavily reduces the benefit. + +Implementing in ``rcl`` versus ``rmw`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +While it is true that implementing the type description distribution on the ``rmw`` layer would allow for much lower level optimization, removing the layer of abstraction avoids having to implement this feature in each rmw implementation. 
+ +Given that the potential gains from optimization will be small due to how infrequently the service is expected to be called, this added development overhead was determined to not be worth it. +Instead the design prefers to have a unified implementation of this feature in ``rcl`` so it is agnostic to any middleware implementations and client libraries. + +Nested ``TypeDescription`` Example +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The ``TypeDescription`` message type shown above also supports the complete description of a type that contains other types (a nested type), up to an arbitrary level of nesting. Consider the following example: @@ -480,96 +783,28 @@ With the corresponding ``Field`` fields: In order to handle the type of a nested type such as ``A``, the receiver can use the ``referenced_type_descriptions`` array as a lookup table keyed by the value of ``Field.nested_type_name`` or ``IndividualTypeDescription.type_name`` (which will be identical for a given type) to obtain the type information of a referenced type. This type handling process can also support any recursive level of nesting (e.g. while handling A, C is encountered as a nested type, C can then be looked up using the top level ``referenced_type_descriptions`` array). -Additional Notes for TypeDescription Message Type -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Given that the type description message interface has to be generic enough to support anything described in the ROS interfaces, there will be a need to add or remove fields over time in the type description message itself. -This should be done in such a way that the fields are tick-tocked and deprecated properly, possibly by having explicitly named versions of this interface, e.g. ``TypeDescriptionV1`` and ``TypeDescriptionV2`` and so on. 
- -Implementation in the `rcl` Layer -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The implementation of the type description distribution feature will be made in the ``rcl`` layer as opposed to the ``rmw`` layer to take advantage of the abstraction away from the middleware and to allow for compatibility with the client libraries. - -A hook will be added to ``rcl_node_init()`` to initialize the type description distribution service with the appropriate ``rcl_service_XXX()`` functions. -This hook should also keep a map of published and subscribed types which will be populated on each initialization of a publisher or subscription in the respective ``rcl_publisher_init()`` and ``rcl_subscription_init()`` function calls. -The passed ``rosidl_message_type_support_t`` in the init call can be introspected with getter functions to obtain the relevant information, alongside any new methods added to support type version hashing. - - -Runtime Type Introspection +Run-Time Interface Reflect -------------------------- -Runtime type introspection allows access to the data in the fields of a serialized message buffer when given: - -- the serialized message as a ``rmw_serialized_message_t``, a.k.a. a ``rcutils_uint8_array_t`` -- the message's type description, e.g. received from the aforementioned type description distribution or from a bag file -- the serialization format, name and version, which was used to create the serialized message, e.g. ``XCDR2`` for ``Extended CDR encoding version 2`` - -From these inputs, we should be able to access the data as fields in the message buffer through some programmatic interface, including: - -- a list of field names -- a list of field types -- access to the field by name or index - -.. 
code:: - - message_buffer ─┐ ┌────────────────────────────┐ - │ │ │ - message_description ─┼──►│ Runtime Type Introspection ├───► Introspection Interface - │ │ │ - serialization_format ─┘ └────────────────────────────┘ - -Given that the scope of inputs and expected outputs is so limited, this feature should ideally be implemented as a separate package, e.g. ``rcl_serialization``, that can be called independently by any downstream packages that might need runtime type introspection, e.g. introspection tools, rosbag transport, etc. -This feature can then be combined with the ability to detect type mismatches and obtain type descriptions in the previous two sections to facilitate communication between nodes of otherwise incompatible types. - -Additionally, it is important to note that this feature is distinguished from ROS 2's "dynamic" type support (``rosidl_typesupport_introspection_c``) in that "dynamic" type support generates type support code at compile time that then is not expected to change, whereas this feature aims to support any arbitrary message type descriptions at runtime. -Crucially, this means that a node will not need access to a message's type description at compile time in order to be able to support it, since it will be able to receive and utilize the type description with this feature, in contrast to "dynamic" type support, which would require it. +TODO -Requirements +Alternatives ^^^^^^^^^^^^ -In order for runtime type introspection to work, message buffers still need to be able to be transmitted and received, as such, this feature is expected to work if and only if: - -- the consumer of a message has access to a compatible version of the serialization library described in the message type description -- the serialization libraries expected to be used with this feature support at least sequential access to message buffer data members with a runtime programmatic interface -- in the case where messages are being sent over a middleware layer (e.g. 
one of the DDS implementations), the producers and consumers of the message are using compatible middleware implementations - -Furthermore, this feature will be implemented as a C library to allow for maximum interoperability with the ROS client libraries (e.g. C++, Python, Rust, etc.) +Name of Run-Time Interface Reflection +""""""""""""""""""""""""""""""""""""" -Plugin Interface -^^^^^^^^^^^^^^^^ - -As runtime type introspection is expected to work across any serialization format, the runtime type introspection interface needs to be extensible so that the necessary serialization libraries can be loaded to process the serialized message. -Serialization format support in this case will be provided by writing plugins that wrap the serialization libraries that can then provide the runtime type introspection feature with the needed facilities. -Then, runtime type introspection will: - -- enumerate the supported serialization library plugins on the machine -- match, if possible, the serialization format specified in the message type description to an appropriate plugin (throwing a warning otherwise) -- dynamically load the plugin - -These serialization library plugins must wrap functionality to: - -- determine suitability for use with a message's reported serialization version and type -- parse the ``.idl`` / ``.msg`` component of the message description into a serialization library usable form -- given that parsed description (which provides type reflection information about the message buffer), deserialize the message buffer using the serialization library - -In particular, the step of parsing the ``.idl`` / ``.msg`` file is necessary because there might be serialization library or type support specific steps or considerations (e.g. field re-ordering, endianness) that would not be captured in the ``.idl`` / ``.msg`` file, because they would have been assumed to have been applied during type support. 
- -Dealing with Multiple Applicable Plugins for A Serialization Format -""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" +Other names were considered, like "Runtime Type Introspection", but this name was selected for three main reasons: -In the case where there exists multiple applicable plugins for a particular serialization format (e.g. when a user's machine has both RTI Connext's CDR library and FastCDR), the plugin matching should follow this priority order: - -- user specified overrides passed to the matching function -- defaults defined in the plugin matching function, if applicable, otherwise -- the first suitable plugin in alphanumeric sorting order +- to avoid confusion with C++'s Run-time type information (RTTI)\ [3]_, +- to show it was more than RTTI, but instead was also reflection, + - like how the RTTR C++ library\ [4]_ differs from normal RTTI, and -Example Plugin Interface -^^^^^^^^^^^^^^^^^^^^^^^^ +- to show that it deals not just with any "type" but specifically ROS's Interfaces -Plugin Matching and Loading -""""""""""""""""""""""""""" +Plugin Matching and Loading API Example +""""""""""""""""""""""""""""""""""""""" The following is an example of how this plugin matching and loading interface could look like, defining new ``rcl`` interfaces; with a plugin wrapping FastCDR v1.0.24 for serialization of ``sensor_msgs/msg/LaserScan`` messages: @@ -588,15 +823,12 @@ The following is an example of how this plugin matching and loading interface co // If we wanted to force the use of MicroCDR instead introspection_handle->match_plugin(LaserScanDescription->get_serialization_format(), "microcdr"); -Example Introspection API -^^^^^^^^^^^^^^^^^^^^^^^^^ +Run-Time Interface Reflection API Example +""""""""""""""""""""""""""""""""""""""""" The following is an example for how the introspection API could look like. This example will show a read-only interface. 
-Overview -"""""""" - It should comprise several components - a handler for the message buffer, to handle pre-processing (e.g. decompression) - a handler for the message description, to keep track of message field names of arbitrary nesting level @@ -604,10 +836,9 @@ It should comprise several components Also, this example uses the LaserScan message definition: https://github.com/ros2/common_interfaces/blob/foxy/sensor_msgs/msg/LaserScan.msg -.. TODO:: (methylDragon) Add a reference somehow? +.. TODO:: -API Example -""""""""""" + (methylDragon) Add a reference somehow? First, the message buffer handler: @@ -686,9 +917,6 @@ If we attempt to do the same by reference, the plugin might decide to allocate n // Be sure to clean up any dangling pointers finalize_field(some_field_data_ptr); -Error cases -""""""""""" - The following should be error cases: - accessing field data as incorrect type @@ -696,58 +924,10 @@ The following should be error cases: .. TODO: (methylDragon) Are there more cases? It feels like there are... -Pointer lifecycles -"""""""""""""""""" - - the raw message buffer should outlive the ``rcl_buffer_handle_t``, since it is not guaranteed that the buffer handle will allocate new memory - the ``rcl_buffer_handle_t`` should outlive any returned field data pointers, since it is not guaranteed that the serialization plugin will allocate new memory - however, ``rcl_field_info_t`` objects **do not** have any lifecycle dependencies, since they are merely descriptors - -Rationale -========= - -TODO - -Type Description Distribution ------------------------------ - -Using a Single ROS Service per Node -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The node that is publishing the data must already have access to the correct type description, at the correct version, in order to publish it, and therefore it is natural to get the data from that node. 
-Similarly, a subscribing node also knows what type they are wanting to receive, both in name and version, and therefore it is again natural to get that information from the subscribing node. -The type description for a given type, at a given version, could have been retrieved from other places, e.g. a centralized database, but the other alternatives considered would have had to take care to ensure that it had the right version of the message, which is not the case for the node publishing the data. - -Because the interface for getting a type description is generic, it is not necessary to have this interface on a per entity, i.e. publisher, subscription, etc, basis, but instead to offer the ROS Service on a per node basis to reduce the number of ROS Services. -Therefore, the specification dictates that the type description is distributed by single ROS Service for each individual node. - -There were also multiple alternatives for how to get this information from each node, but the use of a single ROS Service was selected because the task of requesting the type description from a node is well suited to a request-response style ROS Service. -Some of the alternatives offered other benefits, but using a ROS Service introduced the fewest dependencies, feature-wise, while accomplishing the task. - -.. TODO:: cite the above, https://en.wikipedia.org/wiki/Request%E2%80%93response - -Combining the Raw and Parsed Type Description in the Service Response -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The contents of the "get type description" service response should include information that supports both aforementioned use cases (i.e. tools and runtime type introspection). -These use cases have orthogonal interests, with the former requiring human-readable descriptions, and the latter preferring machine-readable descriptions. 
- -Furthermore, the type description should be useful even across middlewares and serialization libraries and that makes it especially important to send at least the original inputs to the "type support pipeline" (i.e. the process of taking user-defined types and generating all supporting code). -In this case, because the "type support pipeline" is a lossy process, there is a need to ensure that enough information is sent to completely reproduce the original definition of the type, and therefore it makes sense to just send the original ``idl`` or ``msg`` file. - -At the same time, it is useful to send information with the original description that makes it easier to process data at the receiving end, as it is often not trivial to get to the "parsed" version of the type description from the original text description. - -Finally, while there could be an argument for sending a losslessly compressed version of the message file, the expected low frequency of queries to the type description service incurs a negligible overhead that heavily reduces the benefit. - -Implementing in ``rcl`` versus ``rmw`` -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -While it is true that implementing the type description distribution on the ``rmw`` layer would allow for much lower level optimization, removing the layer of abstraction avoids having to implement this feature in each rmw implementation. - -Given that the potential gains from optimization will be small due to how infrequently the service is expected to be called, this added development overhead was determined to not be worth it. -Instead the design prefers to have a unified implementation of this feature in ``rcl`` so it is agnostic to any middleware implementations and client libraries. 
- Alternatives ------------ @@ -815,8 +995,8 @@ Additionally, the option to add a configuration option to choose what contents t As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response. It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. -TypeDescription Structure -^^^^^^^^^^^^^^^^^^^^^^^^^ +``TypeDescription`` Structure +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Representing Fields as An Array of Field Types ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -874,6 +1054,15 @@ References .. [1] DDS-XTYPES 1.3 (https://www.omg.org/spec/DDS-XTypes/1.3/About-DDS-XTypes/) +.. [2] IDL - Interface Definition and Language Mapping + (http://design.ros2.org/articles/idl_interface_definition.html) + +.. [3] Run-time type information + (https://en.wikipedia.org/wiki/Run-time_type_information) + +.. 
[4] RTTR (Run Time Type Reflection): An open source library, which adds reflection to C++ + (https://www.rttr.org/) + Copyright ========= From d5e27d53c0c3598a02874afd2ba6bde7ba49674e Mon Sep 17 00:00:00 2001 From: William Woodall Date: Wed, 27 Jul 2022 04:26:39 -0700 Subject: [PATCH 14/22] diagram typo Signed-off-by: William Woodall --- rep-2011.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rep-2011.rst b/rep-2011.rst index dcfaab265..6f4615d19 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -208,7 +208,7 @@ It will continue to do this until it reaches the desired type version or it fail └─┬────────────────────┘ └─────────────────────────┘ │ ▲ │ ┌─────────────────────────────────┐ │ - ✓ /topic/ABC │ Transfer Functions for ABC->XYZ │ /topic ✓ + ✓ /topic/ABC │ Transfer Functions for ABC->XYZ │ /topic ✓ │ │ │ │ │ ┌──────────┴──────────────┐ ┌──────────────┴───────┐ │ └─►│Subscription│ │Publisher├──┘ From fbc88106ab5e67617258ed0fe15c34d1fbd61e6b Mon Sep 17 00:00:00 2001 From: William Woodall Date: Mon, 1 Aug 2022 04:19:38 -0700 Subject: [PATCH 15/22] fixup Signed-off-by: William Woodall --- rep-2011.rst | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 6f4615d19..5948a353b 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -185,7 +185,7 @@ They can use a tool that will look at a catalogue of available transfer function or ┌───────────────────────┐ - ┌───────────────┐ │ User-defiend transfer │ ┌───────────────┐ + ┌───────────────┐ │ User-defined transfer │ ┌───────────────┐ │Message@current├────►│ function from current ├───...───►│Message@desired│ └───────────────┘ │to desired/intermediate│ ▲ └───────────────┘ └───────────────────────┘ │ @@ -653,6 +653,11 @@ In the case where there exists multiple applicable plugins for a particular seri That would need the ability to both create a new type from nothing (maybe no more than creating a ``TypeDescription`` instance) as well as a read-write 
interface to the reflection. We don't have to support these things in this REP, but we should at least mention them and state if it is in or out of the scope of the REP. +Tools for Evolving Types +------------------------ + +TODO + Rationale ========= @@ -958,7 +963,7 @@ Furthermore, an initial evolving message test with FastDDS, Cyclone and Connext Handle Detection of Version Mismatch "Above" rmw Layer ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -We can choose to utilize ``USER_DATA`` QoS to distribute the message version during discovery phase. The message version for each participant will then be accessable across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected. +We can choose to utilize ``USER_DATA`` QoS to distribute the message version during discovery phase. The message version for each participant will then be accessible across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected. Prevent Communication of Mismatched Versions "Above" rmw Layer ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ From 01edda8d35f06b3511622f824006b60c1325c035 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Mon, 1 Aug 2022 18:05:30 -0700 Subject: [PATCH 16/22] change title to a better wording Signed-off-by: William Woodall --- rep-2011.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 5948a353b..5adad7187 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -239,8 +239,8 @@ In order to support this vision, three missing features will need to be added in These features are described in the following sections. 
-Interface Type Enforcement --------------------------- +Type Version Enforcement +------------------------ In order to detect type version mismatches and enforce them, a way to uniquely identify versions is required, and this proposal uses type version hashes. From 9ea2a2043b5eae44b17c0cb383e1e6b524c19463 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Sun, 21 Aug 2022 20:10:47 -0700 Subject: [PATCH 17/22] fixups, reordering, tools section Signed-off-by: William Woodall --- rep-2011.rst | 517 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 373 insertions(+), 144 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 5adad7187..675353486 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -35,15 +35,15 @@ Motivation Evolving types over time is a necessary part of developing a new system using ROS 2, as well as a necessary part of evolving ROS 2 itself over time. As needs change, types often need to change as well to accommodate new use cases by adding or removing fields, changing the type of fields, or splitting/combining messages across different topics. -To understand this need, it's useful to look at a few common scenarios, for example consider this message definition (using the "rosidl" ``.msg`` format for convenience, but it could just as easily be the OMG IDL ``.idl`` format or something else like protobuf): +To understand this need, it's useful to consider some examples, like this message definition: .. code:: # Temperature.msg uint64 timestamp - int64 temperature + int32 temperature -There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int64 to float64. 
+There are many potential issues with this message type, but it isn't uncommon for messages like this to be created in the process of taking an idea from a prototype into a product, and for the sake of this example, let's say that after using this message for a while the developers wanted to change the temperature field's type from int32 to float64. There are a few ways the developers could approach this, e.g. if the serialization technology allowed for optional fields, they may add a new optional field like this (pseudo code): @@ -51,7 +51,7 @@ There are a few ways the developers could approach this, e.g. if the serializati # Temperature.msg uint64 timestamp - int64 temperature + int32 temperature optional float64 temperature_float Updated code could use it like this (pseudo code): @@ -70,7 +70,7 @@ Updated code could use it like this (pseudo code): This is not uncommon to see in projects, and it has the advantage that old data, whether it is from a program running on an older machine or is from old recorded data, can be interpreted by newer code without additional machinery or features beyond the optional field type. Older publishing code can also use the new definition without being updated to use the new ``temperature_float`` field, which is very forgiving for developers in a way. -However, as projects grow and the number of incidents, like the one described in the example above, increase, this kind of approach can generate a lot of complexity to sending and receiving code. +However, as projects grow and the number of incidents like the one described in the example above increase, this kind of approach can generate a lot of complexity to sending and receiving code. You can imagine other serialization features, like inheritance or implicit type conversion, could be used to address this desired change, but in each case it relies on a feature of the serialization technology which may or may not be available in all middleware implementations for ROS 2. 
@@ -89,7 +89,7 @@ And also update any code publishing or subscribing to this type at the same time This is typically what happens right now in ROS 2, and also what happened historically in ROS 1. This approach is simple and requires no additional serialization features, but obviously doesn't do anything on its own to help developers evolve their system while maintaining support for already deployed code and utilizing existing recorded data. -However, this REP proposes that with additional tooling these cases can be handled without the code of the application knowing about. +However, this REP proposes that with additional tooling these cases can be handled without the code of the application knowing about it. Consider again the above example, but this time the update to the message was done without backwards compatibility in the message definition. This means that publishing old recorded data, for example, will not work with new code that subscribes to the topic, because the new subscription expects the new version of the message. For the purpose of this example, let's call the original message type ``Temperature`` and the one using ``float64`` we'll call ``Temperature'``. @@ -113,13 +113,15 @@ Making this possible requires several new technical features for ROS 2, and some However, by keeping the conversion logic contained in these transfer functions, it has the advantage of keeping both the publishing and subscribing code simple. That is to say, it keeps both the publishing and subscribing code agnostic to the fact that there are other versions of the message, and it keeps the message type from being cluttered with vestigial fields, e.g. having both a ``temperature`` and ``temperature_float`` in the same message. -Either way, a problem can usually be solved by changing a message in some of, if not all, of the of the above mentioned ways, and is often influenced by what the underlying technology allows for or encourages. 
-ROS 2 has special considerations on this topic because it can support different serialization technologies, though CDR is the default and most common right now, and those technologies have different capabilities. -It is neither desirable to depend on features of a specific technology, therefore tying ROS 2 to a specific technology, nor is it desirable suggest patterns that rely on features that only some serialization technologies provide, again tying ROS 2 to some specific technologies through their features. +As stated before, problems created by changing these ROS interfaces can usually be solved in more than one way, either by using some feature like optional fields or by just breaking compatibility directly. +However, the strategy used usually depends on the features that the serialization technology being used offers. +ROS 2 has special considerations on this topic because it can support different serialization technologies, and though CDR is the default and most common right now, others could be used in the future. +Therefore, it is neither desirable to depend on features of a specific technology, nor is it desirable to suggest patterns that rely on features that only some serialization technologies provide. +In either case, that would tie ROS 2 to specific serialization technologies, and that should be avoided if possible. That being said, this proposal will require some specific features from the middleware and serialization technology, but the goal is to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. -With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as a rationale in the Rationale section and a discussion of alternatives in the Alternatives section and its sub-sections.
+With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as rationales and considered alternatives in the Rationale section and its sub-sections. Terminology =========== @@ -140,13 +142,13 @@ In order to do this, this proposal describes how ROS 2 can be changed to support - enforcing type compatibility by version - - by providing type versions as hashes for types automatically, and - - providing access to the type versions used by remote endpoints, and - - warning users when type enforcement is preventing two endpoints from matching + - by providing type version hashes, and + - making it possible to see what versions of types are being used by other endpoints, and + - warning users when type enforcement is preventing two endpoints from communicating - providing access to the complete type description of types being used - - and providing remote access of the type description used by remote endpoints + - and making it possible to access the type description from nodes remotely - making it possible to publish and subscribe to topics using just the type description @@ -159,11 +161,15 @@ Conceptual Overview ------------------- Users will be able to calculate the "type version hash" for an interface (e.g. a message, service, or action) using the ``ros2 interface hash `` command. -Additionally, if a topic has two types being used on it with the same type name, but different type versions, a warning may be logged and the endpoints that do not match may not communicate. -An exception to this rule is that if the underlying middleware has more sophisticated rules for matching types, for example the type has been extended with an optional field they may still match, then ROS 2 will defer to it and not produce warnings when the type version hashes do not match. 
-Instead, ROS 2 will rely on the middleware to notify it when two endpoints do not match based on their types not being compatible, so that a warning can be produced. +This hash is also used by ROS 2 to determine if the same type name but different type versions are being used on the same topic, so that a warning may be logged that the endpoints that do not match may not communicate. -When a mismatch is detected, the user can use predefined, or user-defined, "transfer functions" to convert between versions of the type until it is in the type version they wish to send or receive. +.. note:: + + An exception to this rule is that if the underlying middleware has more sophisticated rules for matching types, for example the type has been extended with an optional field, then they may still match + In that case, ROS 2 will defer to the middleware and not produce warnings when the type version hashes do not match. + Instead, ROS 2 will rely on the middleware to notify it when two endpoints do not match based on their types not being compatible, so that a warning can be produced. + +When a mismatch is detected, the user can use user-defined or automatically generated generic "transfer functions" to convert between versions of the type until it is in the type version they wish to send or receive. They can use a tool that will look at a catalogue of available transfer functions to find a single transfer function, or a set of transfer functions, to get from the current type version to the desired type version. .. code:: @@ -253,7 +259,7 @@ That is, given two version hashes of a type, there is no way to tell which is "n The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. 
-The type version hashes are calculated in a stable way and are not sensitive to trivial changes like changes in the comments or whitespace of the IDL file +The type version hashes are calculated in a stable way and are not sensitive to trivial changes like changes in the comments or whitespace of the IDL file. The IDL file given by the user, which may be a ``.msg`` file, ``.idl`` file, or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. The data structure includes: @@ -275,19 +281,20 @@ The data structure includes: I (wjwwood) am leaning in this direction. The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". -This hash is combined with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS1``, with an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<160-bit SHA-1 of data structure>``. +This hash is combined with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS1``. +They are combined using an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<160-bit SHA-1 of data structure>``. This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. For now, the list of field names and their types are the only contributing factors, but in the future that could change, depending on which "annotations" are supported in ``.idl`` files. The "IDL - Interface Definition and Language Mapping" design document\ [2]_ describes which features of the OMG IDL standard are supported by ROS 2. 
-If that is extended in the future, then this data structure will need to be updated, and the "type version hash standard version" will need to be incremented as well. +If that is extended in the future, then this data structure may need to be updated, and if so the "type version hash standard version" will also need to be incremented. .. TODO:: Re-audit the supported features from OMG IDL according to the referenced design document, including the @key annotation and how it may impact this for the reference implementation. The optional user-defined interface version makes it possible to change the version hash of a message that only changed in "field semantics" (i.e. without changing field names or types), and therefore makes it possible to write "transfer functions" to handle semantic-only conversions between versions. -There is currently no standard way specify the user-defined interface version in either ``.msg`` or ``.idl`` files. +There is currently no standard way to specify the user-defined interface version in either ``.msg`` or ``.idl`` files. .. TODO:: @@ -295,7 +302,7 @@ There is currently no standard way specify the user-defined interface version in Note that this data structure does not include the serialization format being used, nor does it include the version of the serialization technology. This type version hash is for the *description* of the type, and is not meant to be used to determine wire compatibility by itself. -The type version hash must be considered in context, with the serialization format and version included to determine wire compatibility. +The type version hash must be considered in context, with the serialization format and version in order to determine wire compatibility. 
Enforcing Type Version ^^^^^^^^^^^^^^^^^^^^^^ @@ -381,7 +388,7 @@ Accessing the Type Version Hash ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ For debugging and introspection, the type version hash will be accessible via the ROS graph API, by extending the ``rmw_topic_endpoint_info_t`` struct, and related types and functions, to include the type version hash, ``topic_type_version_hash``, as a string. -It should be along side the ``topic_type`` string in that struct, but the ``topic_type`` field should not include the concatenated type version hash, if the recommended approach is used. +It should be along side the ``topic_type`` string in that struct, but the ``topic_type`` field should not include the concatenated type version hash, even if the recommended approach is used. Instead the type name and version hash should be separated and placed in the fields separately. This information should be transmitted as part of the discovery process. @@ -438,7 +445,7 @@ This service may be optional, for example being enabled or disabled when creatin .. TODO:: How can we detect when a remote node is not offering this service? - It's difficult to differentiate between "the Service has not been created yet, but will" and "the Service will never be created". + It's difficult to differentiate between "the Service has not been created yet, but will be" and "the Service will never be created". Should we use a ROS Parameter to indicate this? But then what if remote access to Parameters (another optional Service) is disabled? Perhaps we need a "services offered" list which is part of the Node metadata, which is sent for each node in the rmw implementation, but that's out of scope for this REP. @@ -533,7 +540,7 @@ These examples of the interfaces just give an idea of the structure but perhaps .. TODO:: - (wjwwood) Add text about how to handle Service types, which are formed as a Request and a Response path, each of which is kind of like a Message. 
+ (wjwwood) Add text about how to handle Service types, which are formed as a Request and a Response part, each of which is kind of like a Message. We could just treat the Request and Response separately, or we could extend this scheme to include explicit support for Services. Since Actions are composed of Topics and Services, it is less so impacted, but we could similarly consider officially supporting them in this scheme. @@ -550,7 +557,7 @@ The implementation of the type description distribution feature will be made in A hook will be added to ``rcl_node_init()`` to initialize the type description distribution service with the appropriate ``rcl_service_XXX()`` functions. This hook should also keep a map of published and subscribed types which will be populated on each initialization of a publisher or subscription in the respective ``rcl_publisher_init()`` and ``rcl_subscription_init()`` function calls. -The passed ``rosidl_message_type_support_t`` in the init call can be to obtain the relevant information, alongside any new methods added to support type version hashing. +The passed ``rosidl_message_type_support_t`` in the init call can be used to obtain the relevant information, alongside any new methods added to support type version hashing. There will be an option to opt-out of creating this service, and a way to start and stop the service after node creation as well. 
@@ -621,7 +628,7 @@ Therefore, when initializing a Run-Time Interface Reflection instance it will: These serialization library plugins must implement an interface which provides methods to: - determine if the plugin can be used for a given serialization type and version, -- provide an interface for reflection of a specific type, given information provided by the "get type description" ROS Service described in the "Type Description Distribution" section, +- provide an API for reflection of a specific type, given the type's description, - mainly including the parsed ``TypeDescription`` instance, but also - perhaps the original ``.idl`` / ``.msg`` text, and @@ -656,11 +663,272 @@ In the case where there exists multiple applicable plugins for a particular seri Tools for Evolving Types ------------------------ -TODO +The previous sections of the specification have all been working up to supporting the new tools described in this section. + +The goal of these new tools is to help users reason about type versions of ROS interfaces, define transfer functions between two versions of a type when necessary, and put the transfer functions to use in their system to handle changes in types that have occurred. + +These tools will come in the form of new APIs, integrations with the build system, command-line tools, and integration with existing concepts like launch files. + +This section will describe a vision of what is possible with some new tools, but it should not be considered completely prescriptive. +That is to say, after this REP is accepted, tools described in this section may continue to evolve and improve, and even more tools not described here may be added to enable more things. +As always, consider the latest documentation for the various pieces of the reference implementation to get the most accurate details.
+ +Tools for Interacting with Type Version Hashes and Type Descriptions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ros2 command line tools, and the APIs that support them, should be updated to provide access to the type version hash wherever the type name is currently available, and to the type description on demand as well. +For example: + +- ``ros2 interface`` should be extended with a way to calculate the version hash for a type +- ``ros2 topic info`` should include the type version hash used by each endpoint +- ``ros2 topic info --verbose`` should include the type description used by each endpoint +- a new command to compare the types used by two endpoints, e.g. ``ros2 topic compare `` +- ``ros2 service ...`` commands should also be extended in this way +- ``ros2 action ...`` commands should also be extended in this way + +Again, this list should not be considered prescriptive or exhaustive, but it gives an idea of what should change. + +Notifying Users when Types Do Not Match +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a warning to users when two endpoints (publisher/subscription or Service server/client, etc.) on the same topic (or service/action) are using different type names or versions and therefore will not communicate. +This warning should be a console logging message, but it should also be possible for the user to get a callback in their own application when this occurs. +How this should happen in the API is described in the Type Version Enforcement section, and it uses the "QoS Event" part of the API. + +The warning should include important information, like the GUID of the endpoints involved, the type name(s) and type version hash(es), and of course the topic name. +These pieces of information can then be used by the user, with the above mentioned changes to the command line tools, to investigate what is happening and why the types do not match.
+The warning may even suggest how to compare the types using the previously described ``ros2 topic compare`` command-line tool. + +Tools for Determining if Two Type Versions are Convertible +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a way for a user to determine if it is possible to convert between two versions of a type without writing a transfer function. +This tool would look at the two type descriptions and determine if a conversion can be done automatically with a generated transfer function. +This will help the user know if they need to write a transfer function or check if one already exists. + +This tool might look something like this: + +- ``ros2 interface convertible --from-remote-node --to-local-type `` + +This tool would query the type description from a remote node, and compare it to the type description of the second type version found locally. +If the two types can be converted without writing any code, then this tool would indicate that on the ``stdout`` and by return code. + +There could be ``--from-local-type`` and ``--to-remote-node`` options as well as others too. + +.. TODO:: + + (wjwwood) we need to come back to this section when we start working on the reference implementation, as more details will be clearer then + +Tools for Writing User-Defined Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To help users write their own transfer functions, when conversions cannot be done automatically, we need to provide them with build system support and API support in various languages. + +The transfer functions will be defined (at least) as a C++ plugin, in the style of component nodes, rviz display plugins, and other instances. +These transfer function plugins can then be loaded and chained together as needed inside of a component node. + +.. 
TODO:: + + (wjwwood) we need to figure out if this is how we're gonna handle Services, or if instead they will be plugins rather than component nodes, but component nodes with remapping solves a lot of QoS related issues that arise when you think about instead making the conversions happen as a plugin on the publisher or subscription side with topics. + +At the very least we should: + +- provide a convenient way to build and install transfer functions from CMake projects +- provide an API to concisely define transfer functions in C/C++ +- document how transfer functions are installed, discovered, and run + +Documenting how this works will allow support for other build systems and programming languages to be added in the future. +It will also help users who do not wish to use our helper functions (and the dependencies that come with them) in their projects, e.g. if they would like to use plain CMake and not depend on things like ``ament_cmake``. + +The process will look something like this: + +- compile the transfer function into a library +- put an entry into the ``ament_index`` which includes the transfer function's details, like which type and versions it converts between, and which library contains it + +Putting it into a library allows us to later load the transfer function, perhaps alongside other transfer functions, into a single component node which subscribes to the input type topic and publishes to the output type topic. + +The interface for a transfer function will look like a modified Subscription callback, receiving a single ROS message (or request or response in the case of Services) as input and returning a single ROS message as output. +The input will use the Run-Time Interface Reflection API and therefore will not be a standard message structure, e.g. ``std_msgs::msg::String`` in C++, but the output may be a concrete structure if that type is available when compiling the transfer function.
+ +The transfer function API for C++ may look something like this: + +.. code:: + + void + my_transfer_function( + const DynamicMessage & input, + const MessageInfo & input_info, // type name/version hash, etc. + DynamicMessage & output, + const MessageInfo & output_info) + { + // ... conversion details + } + + REGISTER_TRANSFER_FUNCTION(my_transfer_function) + +Registering a transfer function in an ``ament_cmake`` project might look like this: + +.. code:: + + create_transfer_function( + src/my_transfer_function.cpp + FUNCTION_NAME my_transfer_function + FROM sensor_msgs/msg/Image RIHS1_XXXXXXXXXXXXXXXXX123 + TO sensor_msgs/msg/Image RIHS1_XXXXXXXXXXXXXXXXXabc + ) + +This CMake function would create the library target, link the necessary libraries, and install any files in the ``ament_index`` needed to make the transfer function discoverable by the tools. +This example is for the simplest case, which we should make easy, but other cases like supporting multiple transfer functions per library, creating the target manually, or even supporting other programming languages would require more complex interfaces too. + +.. TODO:: + + (wjwwood) come back with more details of this interface as the reference implementation progresses + +Tools for Interacting with Available Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There should be a way to list and get information about the available transfer functions in your local workspace. +This tool would query the ``ament_index`` and list the available transfer functions found therein. + +The tool might look like this: + +- ``ros2 interface transfer_functions list`` +- ``ros2 interface transfer_functions info `` + +The tool might also have options to filter based on the type name, package name providing the type, package name providing the transfer function (not necessarily the same as the package which provides the type itself), etc. + +..
TODO:: + + (wjwwood) not clear to me if the transfer functions should have names or not. What if you have more than one transfer function for a type versions pair, e.g. more than one conversion for sensor_msgs/msg/Image from ABC to XYZ. I'm not sure why this would happen, but it is technically possible since the packages register them separately. + +Tools for Using Transfer Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Additionally, there should be a tool which will look at the available transfer functions, including the possible automatic conversions, and start the node that will do the conversions if there exists a set of transfer functions which would convert between two different versions of a type. + +It might look like this: + +.. code:: + + ros2 interface convert_topic_types \ + --from /topic/old \ + --to /topic \ + --component-container + +This tool would find the set of transfer functions between the types on ``/topic/old`` and ``/topic``, assuming they are the same type at different versions and that the necessary transfer functions exist, and load them into a component container as a component node. + +The tool could run a stand-alone node instance if provided a ``--stand-alone`` option instead of the ``--component-container`` option. + +It could also have options like ``--check`` which would check if it could convert the types or not, giving detailed information about why it cannot convert between them if it is not possible. +This could help users understand what transfer functions they need to write to bridge the gap. + +This tool would also potentially need to let the user decide which path to take if there exist multiple chains of transfer functions from one version to the other.
+For example, if the types are ``Type@ABC`` and ``Type@XYZ``, and there are transfer functions for ``Type@ABC -> Type@DEF``, ``Type@DEF -> Type@XYZ``, ``Type@ABC -> Type@XYZ``, and possibly other paths, then it might need help from the user to know which path to take. +By default it might choose the shortest path and then if there's a tie, one arbitrarily. + +There would be similar versions of this tool for Services and Actions as well. + +Integration with Launch +^^^^^^^^^^^^^^^^^^^^^^^ + +With the previously described tool ``ros2 interface convert_topic_types``, you could simply remap topics and execute an instance of it with launch: + +.. code:: + + + + + + + + + +However, we can provide some "syntactic sugar" to make it easier, which might look like this: + +.. code:: + + + + + + + + +That syntactic sugar can do a lot of things, like: + +- remap the topic to something new, like ``/topic/XXX`` +- run the node its enclosed in, wait for the topic to come up to check the version +- run any transfer functions in a node to make the conversion from ``/topic/XXX`` to ``/topic`` + +This could also be extended to support component nodes, rather than stand alone nodes, etc. + +Integration with ros2bag +^^^^^^^^^^^^^^^^^^^^^^^^ + +Similar to the integration with launch, you could run ``ros2 bag play ...`` and the ``ros2 interface convert_topic_types ...`` separately, but we could also provide an option to rosbag itself, which might look like this: + +.. code:: + + ros2 bag play /path/to/bag --convert-to-local-type /topic + +That option would handle the conversion on the fly using the same mechanisms as the command line tool. Rationale ========= +This section captures further rationales for why the specification is the way it is, and also offers summaries of alternatives considered but not selected for various parts of the specification. + +Type Version Enforcement +------------------------ + +Alternatives +^^^^^^^^^^^^ + +Use Type Hash from Middleware, e.g. 
from DDS-XTypes +""""""""""""""""""""""""""""""""""""""""""""""""""" + +The type hash can be obtained through the native middleware API. For example, with FastDDS, the type hash can be obtained with ``TypeIdentifier->equivalence_hash()`` during the ``on_type_discovery()`` callback. The rmw layer can choose to use the provided hash to impose the aforementioned type enforcement. + +.. TODO:: + + (wjwwood) this needs to be cleaned up + +Evolving Message Definition with Extensible Type +"""""""""""""""""""""""""""""""""""""""""""""""" + +When defining the ``.idl`` msg file, the user can choose to apply annotations to the message definition (DDS XTypes spec v1.3: 7.3.1.2 Annotation Language). +Evolving a message type can be achieved by leveraging optional fields and inheritance. +For example, the ``Temperature.idl`` below uses ``@optional`` and ``@extensibility`` in the message definition. + +.. code:: + + @extensibility(APPENDABLE) + struct Temperature + { + unsigned long long timestamp; + long long temperature; + @optional double temperature_float; + }; + +Furthermore, an initial test evolving messages with FastDDS, Cyclone, and Connext middleware implementations shows that ``@appendable`` and ``@optional`` are implemented in Cyclone and Connext, but not FastDDS (as of Jul 2022). + +.. TODO:: + + (wjwwood) we need to follow up on this + +Handle Detection of Version Mismatch "Above" rmw Layer +"""""""""""""""""""""""""""""""""""""""""""""""""""""" + +We can choose to utilize ``USER_DATA`` QoS to distribute the message version during the discovery phase. +The message version for each participant will then be accessible across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected.
+ +Prevent Communication of Mismatched Versions "Above" rmw Layer +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + TODO Type Description Distribution @@ -788,28 +1056,82 @@ With the corresponding ``Field`` fields: In order to handle the type of a nested type such as ``A``, the receiver can use the ``referenced_type_descriptions`` array as a lookup table keyed by the value of ``Field.nested_type_name`` or ``IndividualTypeDescription.type_name`` (which will be identical for a given type) to obtain the type information of a referenced type. This type handling process can also support any recursive level of nesting (e.g. while handling A, C is encountered as a nested type, C can then be looked up using the top level ``referenced_type_descriptions`` array). -Run-Time Interface Reflect --------------------------- - -TODO - Alternatives ^^^^^^^^^^^^ -Name of Run-Time Interface Reflection -""""""""""""""""""""""""""""""""""""" +Other Providers of Type Description +""""""""""""""""""""""""""""""""""" -Other names were considered, like "Runtime Type Introspection", but this name was selected for three main reasons: +Several other candidate strategies for distributing the type descriptions were considered but ultimately discarded for one or more reasons like: causing a strong dependency on a particular middleware or a third-party technology, difficulties with resolving the message type description locally, difficulties with finding the correct entity to query, or causing network throughput issues. 
-- to avoid confusion with C++'s Run-time type information (RTTI)\ [3]_, -- to show it was more than RTTI, but instead was also reflection, +These are some of the candidates that were considered, and the reasons for their rejection: - - like how the RTTR C++ library\ [4]_ differs from normal RTTI, and +- Store the type description as a ROS parameter + * Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem +- Store the type description on a centralized node per machine + * Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces. +- Send type description alongside discovery with middlewares + * Works very well if supported, but is only supported by some DDS implementations (which support XTypes or some other way to attach discovery metadata), but causes a strong dependency on DDS. +- Send type description using a different network protocol + * Introduces additional third-party dependencies separate from ROS and the middleware. -- to show that it deals not just with any "type" but specifically ROS's Interfaces +Alternative Type Description Contents and Format +"""""""""""""""""""""""""""""""""""""""""""""""" + +A combination of the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization being sent allows for one to cover the weaknesses of the other. +Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions. 
+As such, it is not a viable option to only send one or the other. + +Additionally, the option to add a configuration option to choose what contents to receive from the service server was disregarded due to how infrequently the type description query is expected to be called. + +As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response. +It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. + +Representing Fields as An Array of Field Types +"""""""""""""""""""""""""""""""""""""""""""""" + +The use of an array of ``Field`` messages was balanced against using two arrays in the ``IndividualTypeDescription`` type to describe the field types and field names instead, e.g.: + +.. code:: + + # Rejected IndividualTypeDescription Variants + + # String variant + string type_name + string field_types[] + string field_names[] + + # uint8_t Variant + string type_name + uint8_t field_types[] + string field_names[] + +The string variant was rejected because using strings to represent primitive types wastes space, and will lead to increased bandwidth usage during the discovery and type distribution process. +The uint8_t variant was rejected because uint8_t enums are insufficiently expressive to support nested message types. + +The use of the ``Field`` type, with a ``nested_type_name`` field that defaults to an empty string mitigates the space issue while allowing for support of nested message types. +Furthermore, it allows the fields to be described in a single array, which is easier to iterate through and also reduces the chances of any errors from mismatching the array lengths. 
+ +Using an Array to Store Referenced Types +"""""""""""""""""""""""""""""""""""""""" + +Some alternatives to using an array of type descriptions to store referenced types in a nested type were considered, including: + +- Storing the referenced types inside the individual type descriptions and accessing them by traversing the type description tree recursively instead of using a lookup table. + + - Rejected because the IDL spec does not allow for a type description to store itself, and also because it could possibly introduce duplicate, redundant type descriptions in the tree, using up unnecessary space. + +- Sending referenced types in a separate service call or message. + + - Rejected because needing to collate all of the referenced types on the receiver end introduces additional implementation complexity, and also increases network bandwidth with all the separate calls that must be made. + +Run-Time Interface Reflect +-------------------------- + +TODO Plugin Matching and Loading API Example -""""""""""""""""""""""""""""""""""""""" +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The following is an example of how this plugin matching and loading interface could look like, defining new ``rcl`` interfaces; with a plugin wrapping FastCDR v1.0.24 for serialization of ``sensor_msgs/msg/LaserScan`` messages: @@ -829,7 +1151,7 @@ The following is an example of how this plugin matching and loading interface co introspection_handle->match_plugin(LaserScanDescription->get_serialization_format(), "microcdr"); Run-Time Interface Reflection API Example -""""""""""""""""""""""""""""""""""""""""" +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The following is an example for how the introspection API could look like. This example will show a read-only interface. 
@@ -934,112 +1256,19 @@ The following should be error cases: - however, ``rcl_field_info_t`` objects **do not** have any lifecycle dependencies, since they are merely descriptors Alternatives ------------- - -TODO - -Use Type Hash from Middleware, e.g. from DDS-XTypes -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Type hash can be obtained by the native middleware api. For example, with fastDDS, the type hash can be obtained with ``TypeIdentifier->equivalence_hash()`` during the ``on_type_discovery()`` callback. the rmw layer can choose to use the provided hash to impose the aforementioned type enforcement. - -Evolving Message Definition with Extensible Type -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -When defining the ``.idl`` msg file, user can choose to apply annotations to the message definition (DDS XTypes spec v1.3: 7.3.1.2 Annotation Language). Evolving message type can be achieved by leveraging optional fields and inheritance. For example, the ``Temperature.idl`` below uses ``@optional`` and ``@extensibility`` in the message definition. - -.. code:: - - @extensibility(APPENDABLE) - struct Temperature - { - unsigned long long timestamp - long long temperature - @optional double temperature_float - }; - -Furthermore, an initial evolving message test with FastDDS, Cyclone and Connext middleware implementation shows that ``@appendable`` and ``@optional`` are implemented in Cyclone and Connext, but not FastDDS (as of Jul 2022). - -Handle Detection of Version Mismatch "Above" rmw Layer -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -We can choose to utilize ``USER_DATA`` QoS to distribute the message version during discovery phase. The message version for each participant will then be accessible across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected. 
- -Prevent Communication of Mismatched Versions "Above" rmw Layer -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -TODO - -Type Description Distribution -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Other Providers of Type Description -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Several other candidate strategies for distributing the type descriptions were considered but ultimately discarded for one or more reasons like: causing a strong dependency on a particular middleware or a third-party technology, difficulties with resolving the message type description locally, difficulties with finding the correct entity to query, or causing network throughput issues. - -These are some of the candidates that were considered, and the reasons for their rejection: - -- Store the type description as a ROS parameter - * Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem -- Store the type description on a centralized node per machine - * Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces. -- Send type description alongside discovery with middlewares - * Works very well if supported, but is only supported by some DDS implementations (which support XTypes or some other way to attach discovery metadata), but causes a strong dependency on DDS. -- Send type description using a different network protocol - * Introduces additional third-party dependencies separate from ROS and the middleware. - -Alternative Type Description Contents and Format -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -A combination of the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization being sent allows for one to cover the weaknesses of the other. 
-Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions. -As such, it is not a viable option to only send one or the other. - -Additionally, the option to add a configuration option to choose what contents to receive from the service server was disregarded due to how infrequently the type description query is expected to be called. - -As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response. -It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call. - -``TypeDescription`` Structure -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Representing Fields as An Array of Field Types -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The use of an array of ``Field`` messages was balanced against using two arrays in the ``IndividualTypeDescription`` type to describe the field types and field names instead, e.g.: - -.. code:: - - # Rejected IndividualTypeDescription Variants - - # String variant - string type_name - string field_types[] - string field_names[] - - # uint8_t Variant - string type_name - uint8_t field_types[] - string field_names[] - -The string variant was rejected because using strings to represent primitive types wastes space, and will lead to increased bandwidth usage during the discovery and type distribution process. 
-The uint8_t variant was rejected because uint8_t enums are insufficiently expressive to support nested message types. - -The use of the ``Field`` type, with a ``nested_type_name`` field that defaults to an empty string mitigates the space issue while allowing for support of nested message types. -Furthermore, it allows the fields to be described in a single array, which is easier to iterate through and also reduces the chances of any errors from mismatching the array lengths. - -Using an Array to Store Referenced Types -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^ -Some alternatives to using an array of type descriptions to store referenced types in a nested type were considered, including: +Name of Run-Time Interface Reflection +""""""""""""""""""""""""""""""""""""" -- Storing the referenced types inside the individual type descriptions and accessing them by traversing the type description tree recursively instead of using a lookup table. +Other names were considered, like "Runtime Type Introspection", but this name was selected for three main reasons: - - Rejected because the IDL spec does not allow for a type description to store itself, and also because it could possibly introduce duplicate, redundant type descriptions in the tree, using up unnecessary space. +- to avoid confusion with C++'s Run-time type information (RTTI)\ [3]_, +- to show it was more than RTTI, but instead was also reflection, -- Sending referenced types in a separate service call or message. + - like how the RTTR C++ library\ [4]_ differs from normal RTTI, and - - Rejected because needing to collate all of the referenced types on the receiver end introduces additional implementation complexity, and also increases network bandwidth with all the separate calls that must be made. 
+- to show that it deals not just with any "type" but specifically ROS's Interfaces Backwards Compatibility From 24492e535bb91d3eca690e9a9e2b9a5bb3459d62 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Tue, 6 Sep 2022 14:00:48 -0700 Subject: [PATCH 18/22] add feature progress section Signed-off-by: William Woodall --- rep-2011.rst | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 100 insertions(+), 2 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 675353486..b488ce13c 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -232,7 +232,7 @@ Tools will assist the user in making these remappings and running the necessary Once the mismatched messages are flowing through the transfer functions, communication should be possible and neither the publishing side nor the subscribing side have any specific knowledge of the conversions taking place or that any conversions are necessary. .. TODO:: - + Extend conceptual overview to describe how this will work with Services and Actions. Services, since they are not sensitive to the many-to-many (many publisher) issue, unlike Topics, and because they do not have as many QoS settings that apply to them, they can probably have transfer functions that are plugins, rather than separate component nodes that repeat the service call, like the ros1_bridge. Actions will be a combination of topics and services, but will have other considerations in the tooling. 
@@ -1276,10 +1276,108 @@ Backwards Compatibility TODO + Feature Progress ================ -TODO +Supporting Feature Development: + +- TypeDescription Message: + + - [ ] Define and place the TypeDescription.msg Message type in a package + +- Enforced Type Versioning: + + - [ ] Create library to calculate version hash from message files + - [ ] Create library to calculate version hash from service files + - [ ] Create library to calculate version hash from action files + - [ ] Create command-line tool to show the version hash of an interface + + - [ ] Add topic type version hash to the "graph" API + - [ ] Implement delivery of type version hash during discovery: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Add rmw API for notifying user when a topic endpoint was not matched due to type mismatch + + - [ ] Detect when Message versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Detect when Service versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Detect when Action versions do not match, prevent matching, notify user: + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + +- Type Description Distribution: + + - [ ] Define and place the GetTypeDescription.srv Service type in a package + - [ ] Implement the "get type description" service in the rcl layer + +- Run-time Interface Reflection: + + - [ ] Prototype "description of type" -> "accessing data fields from buffer" for each rmw vendor: + + - [ ] Fast-DDS + - [ ] CycloneDDS + - [ ] RTI Connext Pro + + - [ ] Prototype "description of type" -> "create datareaders/writers" for each rmw vendor: + + - [ ] Fast-DDS + - [ ] CycloneDDS + - [ ] RTI Connext Pro + + - [ ] Implement reflection API in ``rcl_serialization`` package, providing (de)serialization with just a description + - [ ] 
Implement single "plugin" using Fast-CDR (tentatively) + - [ ] Implement plugin system, allowing for multiple backends for (X)CDR + - [ ] Implement at least one other (X)CDR backend, perhaps based on RTI Connext Pro + + - [ ] Add "description of type" -> ``rosidl_message_type_support_t`` to ``rmw`` API + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Update APIs for creating Publishers and Subscriptions in rclcpp using a TypeDescription instance + + - [ ] Add "description of type" -> ``rosidl_service_type_support_t`` to ``rmw`` API + + - [ ] rmw_fastrtps_cpp + - [ ] rmw_cyclonedds_cpp + - [ ] rmw_connextdds + + - [ ] Update APIs for creating Service Clients and Servers in rclcpp using a TypeDescription instance + +Tooling Development: + +- [ ] Command-line tools for interacting with type version hashes +- [ ] Add a default warning to users when a topic has mismatched types (by type or version) + +- [ ] Command-line tool for determining if two versions of a type are convertible + +- [ ] CMake and C++ APIs for writing transfer functions, including usage documentation +- [ ] Command-line tools for interacting with user-defined transfer functions +- [ ] Command-line tool for starting a conversion node between two versions of a type + +- [ ] Integration with launch, making it easier to set up conversions in launch files + +- [ ] Integration with rosbag2 + + - [ ] Record the type version hash, TypeDescription, and other metadata into bags + - [ ] Use recorded metadata to create subscriptions without needing the type to be built locally + - [ ] Use recorded metadata to create publishers from descriptions in bag References From bd5cdbc26e7a43b0613f4a7c096cc49ec15dcc95 Mon Sep 17 00:00:00 2001 From: William Woodall Date: Fri, 9 Sep 2022 11:23:29 -0700 Subject: [PATCH 19/22] cleanup feature progress list Signed-off-by: William Woodall --- rep-2011.rst | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git 
a/rep-2011.rst b/rep-2011.rst index b488ce13c..680e6b43a 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -1350,15 +1350,23 @@ Supporting Feature Development: - [ ] rmw_cyclonedds_cpp - [ ] rmw_connextdds - - [ ] Update APIs for creating Publishers and Subscriptions in rclcpp using a TypeDescription instance - - [ ] Add "description of type" -> ``rosidl_service_type_support_t`` to ``rmw`` API - [ ] rmw_fastrtps_cpp - [ ] rmw_cyclonedds_cpp - [ ] rmw_connextdds - - [ ] Update APIs for creating Service Clients and Servers in rclcpp using a TypeDescription instance + - [ ] Update APIs for rclcpp to support using TypeDescription instances to create entities: + + - [ ] Topics + - [ ] Services + - [ ] Actions + + - [ ] Update APIs for rclpy to support using TypeDescription instances to create entities: + + - [ ] Topics + - [ ] Services + - [ ] Actions Tooling Development: @@ -1378,6 +1386,7 @@ Tooling Development: - [ ] Record the type version hash, TypeDescription, and other metadata into bags - [ ] Use recorded metadata to create subscriptions without needing the type to be built locally - [ ] Use recorded metadata to create publishers from descriptions in bag + - [ ] Provide a method to playback/update old bag files that do not have type descriptions stored References From ae78cf933d9d28ce2b419c2ce4580af3378da69c Mon Sep 17 00:00:00 2001 From: Emerson Knapp <537409+emersonknapp@users.noreply.github.com> Date: Tue, 31 Jan 2023 14:07:26 -0800 Subject: [PATCH 20/22] Update Type Version Hash to use SHA-256, and details updates based on design discussion (#7) --- rep-2011.rst | 71 +++++++++++++++++++++++++++------------------------- 1 file changed, 37 insertions(+), 34 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 680e6b43a..0435424a6 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -119,7 +119,7 @@ ROS 2 has special considerations on this topic because it can support different Therefore, it is neither desirable to depend on features of a specific technology, 
nor is it desirable suggest patterns that rely on features that only some serialization technologies provide. In either case, that would tie ROS 2 to specific serialization technologies, and that should be avoided if possible. -That being said, this proposal will require some specific features from the middleware and serialization technology, but the goal is to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suites them. +That being said, this proposal will require some specific features from the middleware and serialization technology, but the goal is to choose approaches which give ROS 2 the broadest support across middleware implementations, ideally while not limiting users from using specific features of the underlying technology when that suits them. With those examples and design constraints as motivation, this REP makes a proposal on how to handle evolving message types in the following Specification section, as well as rationales and considered alternatives in the Rationale section and its sub-sections. @@ -248,59 +248,62 @@ These features are described in the following sections. Type Version Enforcement ------------------------ -In order to detect type version mismatches and enforce them, a way to uniquely identify versions is required, and this proposal uses type version hashes. +In order to detect and enforce type version mismatches, as well as communicate information about type descriptions compactly, a way to uniquely identify versions is required. +This proposal uses a hash of the type's description for this purpose. + Type Version Hash ^^^^^^^^^^^^^^^^^ -The type version hashes are not sequential and do not imply any rank among versions of the type. -That is, given two version hashes of a type, there is no way to tell which is "newer". 
- -The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. -Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. - -The type version hashes are calculated in a stable way and are not sensitive to trivial changes like changes in the comments or whitespace of the IDL file. -The IDL file given by the user, which may be a ``.msg`` file, ``.idl`` file, or something else, is parsed and stored into a data structure which excludes things like comments but includes things that impact compatibility on the wire. - -The data structure includes: +The hash must be calculated in a stable way such that it is not changed by trivial differences that do not impact serialization compatibility. +The hash must also be able to be calculated at runtime from information received on the wire. +This allows subscribers to validate the received TypeDescriptions against advertised hashes, and allows dynamic publishers to invent new types and advertise their hash programmatically. +The interface description source provided by the user, which may be a ``.msg``, ``.idl`` or other file type, is parsed into the TypeDescription object as an intermediate representation. +This way, types coming from two sources that have the same stated name and are wire compatible will be given the same hash, even if they are defined using source text that is not exactly equal. +The hash must only be computed using fields that affect communication compatibility. Thus the representation should include: -- a list of field names and types, but not default values -- a recursive list of field names and types for referenced types -- an optional user-defined interface version, or 0 if not provided +This representation includes: -.. 
TODO:: +- the package, namespace, and type name, for example `sensor_msgs/msg/Image` +- a list of field names and types +- a list of all recursively referenced types +- no field default values +- no comments - Should the message name, including package and namespace, be part of this? - Consider a situation where you have the same data structure but two different type names, should those two instances have the same hash? - They are unique when paired with their type name, so it should be ok, but it is a bit weird, perhaps, since we're not using this type hash to check for wire compatibility between differently named types. +Finally, the resulting filled data structure must be represented in a platform-independent format, rather than running the hash function on the in-memory native type representation. +Different languages, architectures, or compilers will produce different in-memory representations, and the hash must be consistently calculable in different contexts. -.. TODO:: - - Related TODO, should we just use the TypeDescription described in a later section? - It's essentially the same thing, but it does include the type name. - I (wjwwood) am leaning in this direction. - -The resulting data structure is hashed using a standard SHA-1 method, resulting in a standard 160-bit (20-byte) hash value which is also generally known as a "message digest". +The resulting data structure is hashed using SHA-256, resulting in a 256-bit (32-byte) hash value which is also generally known as a "message digest". This hash is combined with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS1``. -They are combined using an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<160-bit SHA-1 of data structure>``. -This allows the tooling to know if a hash mismatch is due to a change in this standard (what is being hashed) or due to a difference in the interface types themselves. 
+They are combined using an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<256-bit SHA-256 of data structure>``. +This allows the tooling to know if a hash mismatch is due to a change in this standard (how hash is computed) or due to a difference in the interface types themselves. +In the case of a change in standard, it will be unknown whether the interface types are equal or not. For now, the list of field names and their types are the only contributing factors, but in the future that could change, depending on which "annotations" are supported in ``.idl`` files. The "IDL - Interface Definition and Language Mapping" design document\ [2]_ describes which features of the OMG IDL standard are supported by ROS 2. -If that is extended in the future, then this data structure may need to be updated, and if so the "type version hash standard version" will also need to be incremented. +If that is extended in the future, then this data structure may need to be updated, and if so the "ROS IDL Hashing Standard" version will also need to be incremented. +New sanitizing may be needed on the TypeDescription pre-hash procedure, in the case of these new features. .. TODO:: Re-audit the supported features from OMG IDL according to the referenced design document, including the @key annotation and how it may impact this for the reference implementation. -The optional user-defined interface version makes it possible to change the version hash of a message that only changed in "field semantics" (i.e. without changing field names or types), and therefore makes it possible to write "transfer functions" to handle semantic-only conversions between versions. -There is currently no standard way to specify the user-defined interface version in either ``.msg`` or ``.idl`` files. +Notes: -.. TODO:: +The type version hash is not sequential and does not imply any rank among versions of the type. 
That is, given two version hashes of a type, there is no way to tell which is "newer". + +Because the hash contains the stated name of the type, differently-named types with otherwise identical descriptions will be mismatched as incompatible. +This matches existing ROS precedent of strongly-typed interfaces. + +The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. +Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. - Remove the idea of the user-defined interface version or define how it can be supplied by the user in one or more of the idl file kinds. +It may be desirable, as a user, to change the version hash of a message even when no field types or names have changed, perhaps due to a change in semantics of existing fields. +There is no built-in way to do this manual re-versioning. +However, we suggest the following method, which requires no special tooling: add an extra field to the interface, declared as ``bool versionX = true``. +To manually trigger a hash update, increment the name of the field, for example to ``bool versionY = true``. -Note that this data structure does not include the serialization format being used, nor does it include the version of the serialization technology. +The TypeDescription does not include the serialization format being used, nor does it include the version of the serialization technology. This type version hash is for the *description* of the type, and is not meant to be used to determine wire compatibility by itself. The type version hash must be considered in context, with the serialization format and version in order to determine wire compatibility. 
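Editorial sketch (not part of the patch): the hashing requirements added above (include the type name, field names, field types, and referenced types; exclude defaults and comments; hash a platform-independent representation with SHA-256) can be illustrated roughly as follows. The function name ``type_description_digest``, the dictionary layout, the ``example_msgs`` package, and the choice of canonical JSON as the platform-independent encoding are all assumptions made for illustration; the patch does not specify the encoding.

```python
import hashlib
import json


def type_description_digest(type_name, fields, referenced_types=()):
    """Return the SHA-256 hex digest of a simplified, hypothetical type description.

    ``fields`` is a list of (field_name, field_type) pairs.
    ``referenced_types`` is a list of (type_name, fields) pairs for nested types.
    Default values and comments are never part of the input, so they cannot
    influence the digest.
    """
    description = {
        "type_name": type_name,
        # Field order is preserved because it affects serialization.
        "fields": [{"name": n, "type": t} for n, t in fields],
        # Referenced types are sorted by name so discovery order does not matter.
        "referenced_types": [
            {"type_name": rn, "fields": [{"name": n, "type": t} for n, t in rf]}
            for rn, rf in sorted(referenced_types)
        ],
    }
    # Assumption: canonical JSON (sorted keys, no extra whitespace) stands in
    # for the platform-independent representation.
    canonical = json.dumps(description, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = [("timestamp", "uint64"), ("temperature", "int64")]
v_x = type_description_digest("example_msgs/msg/Temperature", base + [("versionX", "bool")])
v_y = type_description_digest("example_msgs/msg/Temperature", base + [("versionY", "bool")])
print(v_x != v_y)  # renaming the marker field changes the digest
```

In this sketch the manual re-versioning trick works only because the field name is hashed; the ``= true`` default plays no role, consistent with defaults being excluded from the hash.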
From ba2f96c5163b77385956d51065607201ae3f71e7 Mon Sep 17 00:00:00 2001 From: Emerson Knapp <537409+emersonknapp@users.noreply.github.com> Date: Wed, 29 Mar 2023 13:19:34 -0700 Subject: [PATCH 21/22] Update RIHS notes based on latest discussion (#8) Signed-off-by: Emerson Knapp --- rep-2011.rst | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 0435424a6..770797a81 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -274,9 +274,13 @@ Finally, the resulting filled data structure must be represented in a platform-i Different languages, architectures, or compilers will produce different in-memory representations, and the hash must be consistently calculable in different contexts. The resulting data structure is hashed using SHA-256, resulting in a 256-bit (32-byte) hash value which is also generally known as a "message digest". -This hash is combined with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS1``. -They are combined using an ``_`` symbol, resulting in a complete type version hash like ``RIHS1_<256-bit SHA-256 of data structure>``. -This allows the tooling to know if a hash mismatch is due to a change in this standard (how hash is computed) or due to a difference in the interface types themselves. +This hash is paired with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS01``. +RIHS Version 00 is reserved for "Invalid" / "unset", and the RIHS version is limited by this specification to a maximum value of 255. +RIHS hash values must have a well-defined UTF-8 string representation for human readability and for passing over string-only communication channels. 
+The prefix of a well-formed RIHS string will always be ``RIHSXX_``, where ``X`` is one hexadecimal digit, followed by the version-dependent string representation of the hash value. +For ``RIHS01``, the hash value is 64 hexadecimal digits representing the 256-bit message digest, leading to a known ``RIHS01`` string length of 71. + +This versioning allows the tooling to know if a hash mismatch is due to a change in this standard (how hash is computed) or due to a difference in the interface types themselves. In the case of a change in standard, it will be unknown whether the interface types are equal or not. For now, the list of field names and their types are the only contributing factors, but in the future that could change, depending on which "annotations" are supported in ``.idl`` files. @@ -348,7 +352,7 @@ For example, a type name using this approach may look like this: .. code:: - sensor_msgs/msg/Image__RIHS1_XXXXXXXXXXXXXXXXXXXX + sensor_msgs/msg/Image__RIHS01_XXXXXXXXXXXXXXXXXXXX This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or matchmaking process. @@ -776,8 +780,8 @@ Registering a transfer function in an ``ament_cmake`` project might look like th create_transfer_function( src/my_transfer_function.cpp FUNCTION_NAME my_transfer_function - FROM sensor_msgs/msg/Image RIHS1_XXXXXXXXXXXXXXXXX123 - TO sensor_msgs/msg/Image RIHS1_XXXXXXXXXXXXXXXXXabc + FROM sensor_msgs/msg/Image RIHS01_XXXXXXXXXXXXXXXXX123 + TO sensor_msgs/msg/Image RIHS01_XXXXXXXXXXXXXXXXXabc ) This CMake function would create the library target, link the necessary libraries, and install any files in the ``ament_index`` needed to make the transfer function discoverable by the tools. 
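Editorial sketch (not part of the patch): the string layout described in this patch is a ``RIHSXX_`` prefix with two hexadecimal version digits, followed for ``RIHS01`` by the 64 hexadecimal digits of the SHA-256 digest, for 71 characters in total. It can be illustrated with a small formatter and parser. The function names are hypothetical, the input bytes stand in for whatever platform-independent representation of the type description is hashed, and accepting upper-case hex digits is an assumption the patch text does not settle.

```python
import hashlib
import re
from typing import Tuple

# "RIHS", two hex digits for the standard version, "_", then the hash value.
_RIHS_PATTERN = re.compile(r"RIHS([0-9a-fA-F]{2})_(.+)")
_RIHS01_VALUE = re.compile(r"[0-9a-fA-F]{64}")


def format_rihs01(serialized_description: bytes) -> str:
    """Build a RIHS01 string from an already-serialized type description.

    How the description is serialized is outside the scope of this sketch.
    """
    digest = hashlib.sha256(serialized_description).hexdigest()
    return "RIHS01_" + digest


def parse_rihs(text: str) -> Tuple[int, str]:
    """Split a RIHS string into (standard version, hash value string)."""
    match = _RIHS_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("not a well-formed RIHS string")
    version = int(match.group(1), 16)  # two hex digits, so at most 255
    value = match.group(2)
    if version == 0:
        raise ValueError("RIHS version 00 is reserved for invalid/unset")
    if version == 1 and _RIHS01_VALUE.fullmatch(value) is None:
        raise ValueError("RIHS01 requires 64 hexadecimal digits")
    # For other versions the value format is version-dependent, so it is
    # returned without further validation.
    return version, value
```

Keeping the standard version separate from the digest lets a tool tell "hashed under a different standard" apart from "different type description" before comparing digests, which is the distinction the patch draws.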
From a4e1ef57d460b83a9f59c7e979774a5d1fffa1b2 Mon Sep 17 00:00:00 2001 From: Emerson Knapp <537409+emersonknapp@users.noreply.github.com> Date: Tue, 12 Sep 2023 14:02:50 -0700 Subject: [PATCH 22/22] Remove Type Description/Hash content (moving to REP-2016) (#9) * Full pass removing type descriptions/hashes portion for new rep Signed-off-by: Emerson Knapp * Starting a second pass cleaning up the edits Signed-off-by: Emerson Knapp * Tiny changes Signed-off-by: Emerson Knapp * More tinies Signed-off-by: Emerson Knapp --------- Signed-off-by: Emerson Knapp --- rep-2011.rst | 556 ++++----------------------------------------------- 1 file changed, 38 insertions(+), 518 deletions(-) diff --git a/rep-2011.rst b/rep-2011.rst index 770797a81..05d1ceb29 100644 --- a/rep-2011.rst +++ b/rep-2011.rst @@ -20,6 +20,7 @@ Specifically, this REP proposes that we provide tooling to convert from old vers This approach is a good way to achieve backwards compatibility in messages over long periods of time and in a variety of scenarios, e.g. over the wire, converting bag files, or in specialized tools. This approach does not rely on specific features of the serialization technology and instead relies on the ability to communicate the full type descriptions on the wire, beyond their name and version, and use that description to introspect the values of it at runtime using reflection. +The type description design is described in detail in REP-2016\ [#rep2016]_. This approach can be used in conjunction with serialization technology specific features like optional fields, inheritance, etc., but it works even with the simplest serialization technologies. This is true so long as we have the ability to introspect the messages at runtime without prior knowledge of the type description, which is a feature we also need for generic introspection tools like ``rosbag2`` and ``ros2 topic echo``. 
@@ -128,6 +129,8 @@ Terminology TODO +- Type Version + Specification ============= @@ -142,14 +145,9 @@ In order to do this, this proposal describes how ROS 2 can be changed to support - enforcing type compatibility by version - - by providing type version hashes, and - - making it possible to see what versions of types are being used by other endpoints, and + - by detecting type mismatches via type hashes communicated from other endpoints, and - warning users when type enforcement is preventing two endpoints from communicating -- providing access to the complete type description of types being used - - - and making it possible to access the type description from nodes remotely - - making it possible to publish and subscribe to topics using just the type description - even when the type was not available at compile time @@ -160,13 +158,13 @@ This Specification section covers the conceptual overview in more detail, then d Conceptual Overview ------------------- -Users will be able to calculate the "type version hash" for an interface (e.g. a message, service, or action) using the ``ros2 interface hash `` command. -This hash is also used by ROS 2 to determine if the same type name but different type versions are being used on the same topic, so that a warning may be logged that the endpoints that do not match may not communicate. +ROS 2 will provide a "type hash" for an interface. +This type hash is used by ROS 2 to determine if the same type name with different type descriptions are being used on the same topic, so that a warning may be logged that endpoints that do not match may not communicate. .. note:: An exception to this rule is that if the underlying middleware has more sophisticated rules for matching types, for example the type has been extended with an optional field, then they may still match - In that case, ROS 2 will defer to the middleware and not produce warnings when the type version hashes do not match. 
+ In that case, ROS 2 will defer to the middleware and not produce warnings when the type hashes do not match. Instead, ROS 2 will rely on the middleware to notify it when two endpoints do not match based on their types not being compatible, so that a warning can be produced. When a mismatch is detected, the user can use user-defined or automatically generated generic "transfer functions" to convert between versions of the type until it is in the type version they wish to send or receive. @@ -222,7 +220,7 @@ It will continue to do this until it reaches the desired type version or it fail │ │ └─────────────────────────────────┘ -Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version that what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. +Once the set of necessary transfer functions has been identified, the ROS graph can be changed to have one side of the topic be remapped onto a new topic name which indicates it is of a different version than what is desired, and then the transfer function can be run as a component node which subscribes to one version of the message, performs the conversion using the chain of transfer functions, and then publishes the other version of the message. Tools will assist the user in making these remappings and running the necessary component nodes with the appropriate configurations, either from their launch file or from the command line. .. 
TODO:: @@ -237,10 +235,9 @@ Once the mismatched messages are flowing through the transfer functions, communi Services, since they are not sensitive to the many-to-many (many publisher) issue, unlike Topics, and because they do not have as many QoS settings that apply to them, they can probably have transfer functions that are plugins, rather than separate component nodes that repeat the service call, like the ros1_bridge. Actions will be a combination of topics and services, but will have other considerations in the tooling. -In order to support this vision, three missing features will need to be added into ROS 2 (which were also mentioned in the introduction): +In order to support this vision, these missing features will need to be added into ROS 2 (which were also mentioned in the introduction): - enforcing type compatibility by version -- providing access to the complete type description of types being used - making it possible to publish and subscribe to topics using just the type description These features are described in the following sections. @@ -248,106 +245,43 @@ These features are described in the following sections. Type Version Enforcement ------------------------ -In order to detect and enforce type version mismatches, as well as communicate information about type descriptions compactly, a way to uniquely identify versions is required. -This proposal uses a hash of the type's description for this purpose. - - -Type Version Hash -^^^^^^^^^^^^^^^^^ - -The hash must be calculated in a stable way such that it is not changed by trivial differences that do not impact serialization compatibility. -The hash must also be able to be calculated at runtime from information received on the wire. -This allows subscribers to validate the received TypeDescriptions against advertised hashes, and allows dynamic publishers to invent new types and advertise their hash programmatically. 
-The interface description source provided by the user, which may be a ``.msg``, ``.idl`` or other file type, is parsed into the TypeDescription object as an intermediate representation. -This way, types coming from two sources that have the same stated name and are wire compatible will be given the same hash, even if they are defined using source text that is not exactly equal. -The hash must only be computed using fields that affect communication compatibility. Thus the representation should include: - -This representation includes: - -- the package, namespace, and type name, for example `sensor_msgs/msg/Image` -- a list of field names and types -- a list of all recursively referenced types -- no field default values -- no comments - -Finally, the resulting filled data structure must be represented in a platform-independent format, rather than running the hash function on the in-memory native type representation. -Different languages, architectures, or compilers will produce different in-memory representations, and the hash must be consistently calculable in different contexts. - -The resulting data structure is hashed using SHA-256, resulting in a 256-bit (32-byte) hash value which is also generally known as a "message digest". -This hash is paired with a type version hash standard version, which we will call the "ROS IDL Hashing Standard" or "RIHS", the first version of which will be ``RIHS01``. -RIHS Version 00 is reserved for "Invalid" / "unset", and the RIHS version is limited by this specification to a maximum value of 255. -RIHS hash values must have a well-defined UTF-8 string representation for human readability and for passing over string-only communication channels. -The prefix of a well-formed RIHS string will always be ``RIHSXX_``, where ``X`` is one hexadecimal digit, followed by the version-dependent string representation of the hash value. 
-For ``RIHS01``, the hash value is 64 hexadecimal digits representing the 256-bit message digest, leading to a known ``RIHS01`` string length of 71. - -This versioning allows the tooling to know if a hash mismatch is due to a change in this standard (how hash is computed) or due to a difference in the interface types themselves. -In the case of a change in standard, it will be unknown whether the interface types are equal or not. - -For now, the list of field names and their types are the only contributing factors, but in the future that could change, depending on which "annotations" are supported in ``.idl`` files. -The "IDL - Interface Definition and Language Mapping" design document\ [2]_ describes which features of the OMG IDL standard are supported by ROS 2. -If that is extended in the future, then this data structure may need to be updated, and if so the "ROS IDL Hashing Standard" version will also need to be incremented. -New sanitizing may be needed on the TypeDescription pre-hash procedure, in the case of these new features. - -.. TODO:: - - Re-audit the supported features from OMG IDL according to the referenced design document, including the @key annotation and how it may impact this for the reference implementation. - -Notes: - -The type version hash is not sequential and does not imply any rank among versions of the type. That is, given two version hashes of a type, there is no way to tell which is "newer". - -Because the hash contains the stated name of the type, differently-named types with otherwise identical descriptions will be mismatched as incompatible. -This matches existing ROS precedent of strongly-typed interfaces. - -The type version hash can only be used to determine if type versions are equal and if there exists a chain of transfer functions that can convert between them. -Because of this, when a change to a type is made, it may or may not be necessary to write transfer functions in both directions depending on how the interface is used. 
- -It may be desirable, as a user, to change the version hash of a message even when no field types or names have changed, perhaps due to a change in semantics of existing fields. -There is no built-in way to do this manual re-versioning. -However, we suggest the following method which requires no special tooling: provide an extra field within the interface with a name ``bool versionX = true``. -To manually trigger a hash update, change by increment the name of the field, for example ``bool versionY = true``. - -The TypeDescription does not include the serialization format being used, nor does it include the version of the serialization technology. -This type version hash is for the *description* of the type, and is not meant to be used to determine wire compatibility by itself. -The type version hash must be considered in context, with the serialization format and version in order to determine wire compatibility. - Enforcing Type Version ^^^^^^^^^^^^^^^^^^^^^^ -The type version hash may be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. +The Type Hash may be used as an additional constraint to determine if two endpoints (publishers and subscriptions) on a topic should communicate. Again, whether or not it is used will depend on the underlying middleware and how it determines if types are compatible between endpoints. -Simpler middlewares will not do anything other than check that the type names match, in which case the version hash will likely be used. -However, in more sophisticated middlewares type compatibility can be determined using more complex rules and by looking at the type descriptions themselves, and in those cases the type version hash may not be used to determine matching. +Simpler middlewares may not do anything other than check that the type names match, in which case the hash will likely be used as an extra comparison to determine compatibility. 
+However, in more sophisticated middlewares type compatibility can be determined using more complex rules and by looking at the type descriptions themselves, and in those cases the type hash may not be used to determine matching. When creating a publisher or subscription, the caller normally provides: - a topic name, - QoS settings, and -- a topic type +- a topic type name -Where the topic type is represented as a string and is automatically deduced based on the type given to the function that creates the entity. +Where the topic type name is represented as a string and is automatically deduced based on the type given to the function that creates the entity. The type may be passed as a template parameter in C++ or as an argument to the function in Python. -For example, creating a publisher for the C++ type ``std_msgs::msg::String`` using ``rclcpp`` may result in a topic type like the string ``std_msgs/msg/String``. +For example, creating a publisher for the C++ type ``std_msgs::msg::String`` using ``rclcpp`` may result in a topic type name ``"std_msgs/msg/String"``. All of the above items are used by the middleware to determine if two endpoints should communicate or not, and this REP proposes that the type version be added to this list of provided information. Nothing needs to change from the user's perspective, as the type version can be extracted automatically based on the topic type given, either at the ``rcl`` layer or in the ``rmw`` implementation itself. That is to say, how users create things like publishers and subscription should not need to change, no matter which programming language is being used. -However, the type version hash would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription. +However, the type hash would become something that the ``rmw`` implementation is provided and aware of in the course of creating a publisher or subscription. 
The decision of whether or not to use that information to enforce type compatibility would be left to the middleware, rather than implementing it as logic in ``rcl`` or other packages above the ``rmw`` API. The method for implementing the detection and enforcement of type version mismatches is left up to the middleware. -Some middlewares will have tools to handle this without this new type version hash and others will implement something like what would be possible in the ``rcl`` and above layers using the type version hash. +Some middlewares will have tools to handle this without the type hash and others will implement something like what would be possible in the ``rcl`` and above layers using the type hash. By keeping this a detail of the ``rmw`` implementation, we allow the ``rmw`` implementations to make optimizations where they can. Recommended Strategy for Enforcing that Type Versions Match ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type version hash would only be used for debugging and for storing in recordings. +If the middleware has a feature to handle type compatibility already, as is the case with DDS-XTypes which is discussed later, then that can be used to enforce type safety, and then the type hash would only be used for debugging and for storing in recordings. -However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type version hash with double underscores and then use that as the type name given to the underlying middleware. 
+However, if the middleware lacks this kind of feature, then the recommended strategy for accomplishing this in the ``rmw`` implementation is to simply concatenate the type name and the type hash with double underscores and then use that as the type name given to the underlying middleware. For example, a type name using this approach may look like this: .. code:: @@ -356,7 +290,7 @@ For example, a type name using this approach may look like this: This has the benefit of "just working" for most middlewares which at least match based on the name of the type, and it is simple, requiring no further custom hooks into the middleware's discovery or matchmaking process. -However, one downside with this approach is that it makes interoperation between ROS 2 and the "native" middleware more difficult, as appending the version hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. +However, one downside with this approach is that it makes interoperation between ROS 2 and the "native" middleware more difficult, as appending the type hash to the type name is just "one more thing" that you have to contend with when trying to connect non-ROS endpoints to a ROS graph. Alternative Strategy for Enforcing that Type Versions Match ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -368,7 +302,7 @@ However, in order to increase compatibility between rmw implementations, it is r This recommendation is particularly useful in the case where a DDS based ROS 2 endpoint is talking to another DDS-XTypes based ROS 2 endpoint. The DDS-XTypes based endpoint then has a chance to "gracefully degrade" to interoperate with the basic DDS based ROS 2 endpoint. -This would not be the case if the DDS-XTypes based ROS 2 endpoint did not include the type version hash in the type name, as is suggested with the recommended strategy. 
+This would not be the case if the DDS-XTypes based ROS 2 endpoint did not include the type hash in the type name, as is suggested with the recommended strategy. .. TODO:: @@ -382,206 +316,33 @@ One potential downside to delegating type matching to the rmw implementation is If ROS 2 is to provide users a warning that two endpoints will not communicate due to their types not matching, it requires there to be a way for the middleware to notify the ROS layer when a topic is not matched due to the type incompatibility. As some of the following sections describe, it might be that the rules by which the middleware decides on type compatibility are unknown to ROS 2, and so the middleware has to indicate when matches are and are not made. -If the middleware just uses the type name to determine compatibility, then the rmw implementation can just check the type version hash, and if they do not match between endpoints then the rmw implementation can notify ROS 2, and a warning can be produced. +If the middleware just uses the type name to determine compatibility, then the rmw implementation can compare type hashes, and if they do not match between endpoints then the rmw implementation can notify ROS 2, and a warning can be produced. Either way, to facilitate this notice, the ``rmw_event_type_t`` shall be extended to include a new event called ``RMW_EVENT_OFFERED_TYPE_INCOMPATIBLE``. Related functions and structures will also be updated so that the event can be associated with specific endpoints. -.. TODO:: - - It's not clear how we will do this just now, since existing "QoS events" lack a way to communicate this information, I (wjwwood) think. - -Accessing the Type Version Hash -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For debugging and introspection, the type version hash will be accessible via the ROS graph API, by extending the ``rmw_topic_endpoint_info_t`` struct, and related types and functions, to include the type version hash, ``topic_type_version_hash``, as a string. 
-It should be along side the ``topic_type`` string in that struct, but the ``topic_type`` field should not include the concatenated type version hash, even if the recommended approach is used. -Instead the type name and version hash should be separated and placed in the fields separately. - -This information should be transmitted as part of the discovery process. - -This field can be optionally an empty string, in order to support interaction with older versions of ROS where this feature was not yet implemented, but it should be provided if at all possible. - Notes for Implementing the Recommended Strategy with DDS ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The DDS standard provides an ``on_inconsistent_topic()`` method on the ``ParticipantListener`` class as a callback function which is called when an ``INCONSISTENT_TOPIC`` event is detected. This event occurs when the topic type does not match between endpoints, and can be used for this purpose, but at the time of writing (July 2022), this feature is not supported across all of the DDS vendors that can be used by ROS 2. -Sending of the type version hash during discovery should be done using the ``USER_DATA`` QoS setting, even if the type version hash is not used for determining type compatibility, and then provided to the user-space code through the ``rmw_topic_endpoint_info_t`` struct. - Interactions with DDS-XTypes or Similar Implicit Middleware Features ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -When using DDS-Xtypes type compatibility is determined through sophisticated and configurable rules, allowing for things like extensible types, optional fields, implicit conversions, and even inheritance. -Which of these features is supported for use with ROS is out of scope with this REP, but if any of them are in use, then it may be possible for two endpoints to match and communicate even if their ROS 2 type version hashes do not match. 
+When using DDS-Xtypes, type compatibility is determined through sophisticated and configurable rules, allowing for things like extensible types, optional fields, implicit conversions, and even inheritance. +Which of these features is supported for use with ROS is out of scope with this REP, but if any of them are in use, then it may be possible for two endpoints to match and communicate even if their ROS 2 type hashes do not match. In this situation the middleware is responsible for communicating to the rmw layer when an endpoint will not be matched due to type incompatibility. The ``INCONSISTENT_TOPIC`` event in DDS applies for DDS-XTypes as well, and should be useful in fulfilling this requirement. -Type Description Distribution ------------------------------ - -For some use cases the type version hash is insufficient and instead the full type description is required. - -One of those use cases, which is also described in this REP, is "Run-Time Interface Reflection", which is the ability to introspect the contents of a message at runtime when the description for that message, or that version of that message, was unavailable at compile time. -In this use case the type description is used to interpret the serialized data dynamically. - -Another use case, which is not covered in this REP, is using the type description in tooling to either display the type description to the user or to include it in recordings. - -In either case, where the type description comes from doesn't really matter, and so, for example, it could be looked up on the local filesystem or read from a rosbag file. 
-However, in practice, the correct type description may not be found locally, especially in cases where you have different versions of messages in the same system, e.g.: - -- because it's on another computer, or -- because it is from a different distribution of ROS, or -- because it was built in a different workspace, or -- because the application has not been restarted since recompiling a change to the type being used - -In any case, it is useful to have a mechanism to convey the type descriptions from the source of the data to other nodes, which we describe here as "type description distribution". - -Furthermore, this feature should be agnostic to the underlying middleware and serialization library, as two endpoints may not have the same rmw implementation, or the data may have been serialized to a different format in the case of playback of a recording. - -Sending the Type Description -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Type descriptions will be provided by a ROS Service called ``~/_get_type_description``, which will be offered by each node. -There will be a single ROS Service per node, regardless of the number of publishers or subscriptions on that node. - -This service may be optional, for example being enabled or disabled when creating the ROS Node. - -.. TODO:: - - How can we detect when a remote node is not offering this service? - It's difficult to differentiate between "the Service has not been created yet, but will be" and "the Service will never be created". - Should we use a ROS Parameter to indicate this? - But then what if remote access to Parameters (another optional Service) is disabled? - Perhaps we need a "services offered" list which is part of the Node metadata, which is sent for each node in the rmw implementation, but that's out of scope for this REP. 
- -A service request to this ROS Service will comprise of the type name and the type version hash, which is distributed during discovery of endpoints and will be accessible through the ROS Graph API, as described in previous sections. -The ROS Service server will respond with the type description and any necessary metadata needed to do Run-Time Interface Reflection. -This service is not expected to be called frequently, and is likely to only occur when new topic or service endpoints are created, and even then, only if the endpoint type hashes do not match. - -.. TODO:: - - Should each endpoint be asked about their type/version pair or should we assume that the type/version pair guarantees a unique type description and therefore reuse past queries? - (wjwwood) I am leaning towards assuming the type name and type version hash combo as being a unique identifier. - -Type Description Contents and Format -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The response sent by the ROS Service server will contain a combination of the original ``idl`` or ``msg`` file's content, as well as any necessary information to serialize and deserialize the raw message buffers sent on the topic. -The response will contain a version of the description that contains comments from the original type description, as those might be relevant to interpreting the semantic meaning of the message fields. - -Additionally, the response could include the serialization library used, its version, or any other helpful information from the original producer of the data. - -.. TODO:: - - What happens if the message consumer doesn't have access to the serialization library stated in the meta-type? - (wjwwood) It depends on what you're doing with the response. 
If you are going to subscribe to a remote endpoint, then I think we just refuse to create the subscription, as communication will not work, and there's a basic assumption that if you're going to communicate with an endpoint then you are using the same or compatible rmw implementations, which includes the serialization technology. The purpose of this section and ROS Service is to provide the information, not to ensure communication can happen. - -The ROS 2 message that defines the type description must be able to describe any message type, including itself, and since it is describing the message format, it should work independently from any serialization technologies used. -This "meta-type description" message would then be used to communicate the structure of the type as part of the "get type description" service response. -The final form of these interfaces should be found in the reference implementation, but such a Service interface might look like this: - -.. code:: - - string type_name - string type_version_hash - --- - # True if the type description information is available and populated in the response - bool successful - # Empty if 'successful' was true, otherwise contains details on why it failed - string failure_reason - - # The idl or msg file name - string type_description_raw_file_name - # The idl or msg file, with comments and whitespace - # The file extension and/or the contents can be used to determine the format - string type_description_raw - # The parsed type description which can be used programmatically - TypeDescription type_description - - # Key-value pairs of extra information. - string[] extra_information_keys - string[] extra_information_values - -.. TODO:: - - (wjwwood) I propose we use key-value string pairs for the extra information, but I am hoping for discussion on this point. - This type is sensitive to changes, since changes to it will be hard to roll out, and may need manual versioning to handle. 
- We could also consider bounding the strings and sequences of strings, so the message could be "bounded", which is nice for safety and embedded systems. - -Again, the final form of these interfaces should be referenced from the reference implementation, but the ``TypeDescription`` message type might look something like this: - -.. code:: - - IndividualTypeDescription type_description - IndividualTypeDescription[] referenced_type_descriptions - -And the ``IndividualTypeDescription`` type: - -.. code:: - - string type_name - Field[] fields - -And the ``Field`` type: - -.. code:: - - FIELD_TYPE_NESTED_TYPE = 0 - FIELD_TYPE_INT = 1 - FIELD_TYPE_DOUBLE = 2 - # ... and so on - FIELD_TYPE_INT_ARRAY = ... - FIELD_TYPE_INT_BOUNDED_SEQUENCE = ... - FIELD_TYPE_INT_SEQUENCE = ... - # ... and so on - - string field_name - uint8_t field_type - uint64_t field_array_size # Only for Arrays and Bounded Sequences - string nested_type_name # Only for FIELD_TYPE_NESTED_TYPE - -These examples of the interfaces just give an idea of the structure but perhaps do not yet consider some other complications like field annotations or other as yet unconsidered features we want to support. - -.. TODO:: - - (wjwwood) Add text about how to handle Service types, which are formed as a Request and a Response part, each of which is kind of like a Message. - We could just treat the Request and Response separately, or we could extend this scheme to include explicit support for Services. - Since Actions are composed of Topics and Services, it is less so impacted, but we could similarly consider officially supporting them in this scheme. - -Versioning the ``TypeDescription`` Message Type -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Given that the type description message interface has to be generic enough to support anything described in the ROS interfaces, there will be a need to add or remove fields over time in the type description message itself. 
-This should be done in such a way that the fields are tick-tocked and deprecated properly, possibly by having explicitly named versions of this interface, e.g. ``TypeDescriptionV1`` and ``TypeDescriptionV2`` and so on.
-
-Implementation in the ``rcl`` Layer
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The implementation of the type description distribution feature will be made in the ``rcl`` layer as opposed to the ``rmw`` layer to take advantage of the abstraction away from the middleware and to allow for compatibility with the client libraries.
-
-A hook will be added to ``rcl_node_init()`` to initialize the type description distribution service with the appropriate ``rcl_service_XXX()`` functions.
-This hook should also keep a map of published and subscribed types which will be populated on each initialization of a publisher or subscription in the respective ``rcl_publisher_init()`` and ``rcl_subscription_init()`` function calls.
-The passed ``rosidl_message_type_support_t`` in the init call can be used to obtain the relevant information, alongside any new methods added to support type version hashing.
-
-There will be an option to opt-out of creating this service, and a way to start and stop the service after node creation as well.
-
-.. TODO::
-
-    (wjwwood) A thought just occurred to me, which is that we could maybe have "lazy" Service Servers, which watch for any Service Clients to come up on their Service name and when they see it, they could start a Service Server.
-    The benefit of this is that when we're not using this feature (or other features like Parameter get/set Services), then we don't waste resources creating them, but if you know the node name and the corresponding Service name that it should have a server on, you could cause it to be activated by trying to use it.
-    The problem with this would be that you cannot browse the Service servers because they wouldn't advertise until requested, but for "well know service names" that might not be such a problem.
-    I guess the other problem would be there might be a slight delay the first time you go to use the service, but that might be a worthy trade-off too.
-
 Run-Time Interface Reflection
 -----------------------------
 
 Run-Time Interface Reflection allows access to the data in the fields of a serialized message buffer when given:
 
 - the serialized message as a ``rmw_serialized_message_t``, basically just an array of bytes as a ``rcutils_uint8_array_t``,
-- the message's type description, e.g. received from the aforementioned "type description distribution" or from a bag file, and
+- the message's type description, e.g. received from the ``GetTypeDescription`` service or from a bag file, and
 - the serialization format, name and version, which was used to create the serialized message, e.g. ``XCDR2`` for ``Extended CDR encoding version 2``
 
 From these inputs, we should be able to access the fields of the message from the serialized message buffer using some programmatic interface, which allows you to:
@@ -599,7 +360,7 @@ From these inputs, we should be able to access the fields of the message from th
     serialization_format ─┘ └───────────────────────────────┘
 
 Given that the scope of inputs and expected outputs is so limited, this feature should ideally be implemented as a separate package, e.g. ``rcl_serialization``, that can be called independently by any downstream packages that might need Run-Time Interface Reflection, e.g. introspection tools, rosbag transport, etc.
-This feature can then be combined with the ability to detect type mismatches and obtain type descriptions in the previous two sections to facilitate communication between nodes of otherwise incompatible types.
+This feature can then be combined with the ability to detect type mismatches and obtain type descriptions to facilitate communication between nodes of otherwise incompatible types.
 
 Additionally, it is important to note that this feature is distinct from ROS 2's existing "dynamic" type support (``rosidl_typesupport_introspection_c`` and ``rosidl_typesupport_introspection_cpp``).
 The ``rosidl_typesupport_introspection_c*`` generators generate code at compile time for known types that provides reflection for those types.
@@ -680,21 +441,6 @@ This section will describe a vision of what is possible with some new tools, but
 That is to say, after this REP is accepted, tools described in this section may continue to evolve and improve, and even more tools not described here may be added to enable more things.
 As always, consider the latest documentation for the various pieces of the reference implementation to get the most accurate details.
 
-Tools for Interacting with Type Version Hashes and Type Descriptions
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ros2 command line tools, and the APIs that support them, should be updated to provide access to the type version hash where ever the type name is currently available and the type version description on-demand as well.
-For example:
-
-- ``ros2 interface`` should be extended with a way to calculate the version hash for a type
-- ``ros2 topic info`` should include the type version hash used by each endpoint
-- ``ros2 topic info --verbose`` should include the type description used by each endpoint
-- a new command to compare the types used by two endpoints, e.g. ``ros2 topic compare ``
-- ``ros2 service ...`` commands should also be extended in this way
-- ``ros2 action ...`` commands should also be extended in this way
-
-Again this list should not be considered prescriptive or exhaustive, but gives an idea of what should change.
-
 Notifying Users when Types Do Not Match
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -702,7 +448,7 @@ There should be a warning to users when two endpoints (publisher/subscription or
 This warning should be a console logging message, but it should also be possible for the user to get a callback in their own application when this occurs.
 How this should happen in the API is described in the Type Version Enforcement section, and it uses the "QoS Event" part of the API.
 
-The warning should include important information, like the GUID of the endpoints involved, the type name(s) and type version hash(es), and of course the topic name.
+The warning should include important information, like the GUID of the endpoints involved, the type name(s) and type hash(es), and of course the topic name.
 These pieces of information can then be used by the user, with the above mentioned changes to the command line tools, to investigate what is happening and why the types do not match.
 The warning may even suggest how to compare the types using the previously described ``ros2 topic compare`` command-line tool.
 
@@ -715,7 +461,7 @@ This will help the user know if they need to write a transfer function or check
 
 This tool might look something like this:
 
-- ``ros2 interface convertible --from-remote-node --to-local-type ``
+- ``ros2 interface convertible --from-remote-node --to-local-type ``
 
 This tool would query the type description from a remote node, and compare it to the type description of the second type version found locally.
 If the two types can be converted without writing any code, then this tool would indicate that on the ``stdout`` and by return code.
@@ -764,7 +510,7 @@ The transfer function API for C++ may look something like this:
 
     void my_transfer_function(
       const DynamicMessage & input,
-      const MessageInfo & input_info, // type name/version hash, etc.
+      const MessageInfo & input_info, // type name/hash, etc.
       DynamicMessage & output,
       const MessageInfo & output_info)
     {
@@ -800,7 +546,7 @@ This tool would query the ``ament_index`` and list the available transfer functi
 The tool might look like this:
 
 - ``ros2 interface transfer_functions list``
-- ``ros2 interface transfer_functions info ``
+- ``ros2 interface transfer_functions info ``
 
 The tool might also have options to filter based on the type name, package name providing the type, package name providing the transfer function (not necessarily the same as the package which provides the type itself), etc.
 
@@ -895,15 +641,6 @@ Type Version Enforcement
 Alternatives
 ^^^^^^^^^^^^
 
-Use Type Hash from Middleware, e.g. from DDS-XTypes
-"""""""""""""""""""""""""""""""""""""""""""""""""""
-
-Type hash can be obtained by the native middleware api. For example, with fastDDS, the type hash can be obtained with ``TypeIdentifier->equivalence_hash()`` during the ``on_type_discovery()`` callback. the rmw layer can choose to use the provided hash to impose the aforementioned type enforcement.
-
-.. TODO::
-
-    (wjwwood) this needs to be cleaned up
-
 Evolving Message Definition with Extensible Type
 """"""""""""""""""""""""""""""""""""""""""""""""
 
@@ -930,211 +667,17 @@ Furthermore, an initial test evolving messages with FastDDS, Cyclone, and Connex
 Handle Detection of Version Mismatch "Above" rmw Layer
 """"""""""""""""""""""""""""""""""""""""""""""""""""""
 
-We can choose to utilize ``USER_DATA`` QoS to distribute the message version during discovery phase.
-The message version for each participant will then be accessible across all available nodes. By getting the version hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected.
+We can choose to utilize ``USER_DATA`` QoS to distribute the type hash during the discovery phase.
+The type hash for each participant will then be accessible across all available nodes. By getting the hash through ``user_data`` via the ``rmw`` layer, similar type version matching can be detected.
 
 Prevent Communication of Mismatched Versions "Above" rmw Layer
 """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
 TODO
 
-Type Description Distribution
+Run-Time Interface Reflection
 -----------------------------
 
-Using a Single ROS Service per Node
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The node that is publishing the data must already have access to the correct type description, at the correct version, in order to publish it, and therefore it is natural to get the data from that node.
-Similarly, a subscribing node also knows what type they are wanting to receive, both in name and version, and therefore it is again natural to get that information from the subscribing node.
-The type description for a given type, at a given version, could have been retrieved from other places, e.g. a centralized database, but the other alternatives considered would have had to take care to ensure that it had the right version of the message, which is not the case for the node publishing the data.
-
-Because the interface for getting a type description is generic, it is not necessary to have this interface on a per entity, i.e. publisher, subscription, etc, basis, but instead to offer the ROS Service on a per node basis to reduce the number of ROS Services.
-Therefore, the specification dictates that the type description is distributed by single ROS Service for each individual node.
-
-There were also multiple alternatives for how to get this information from each node, but the use of a single ROS Service was selected because the task of requesting the type description from a node is well suited to a request-response style ROS Service.
-Some of the alternatives offered other benefits, but using a ROS Service introduced the fewest dependencies, feature-wise, while accomplishing the task.
-
-.. TODO::
-
-    cite the above, https://en.wikipedia.org/wiki/Request%E2%80%93response
-
-Combining the Raw and Parsed Type Description in the Service Response
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The contents of the "get type description" service response should include information that supports both aforementioned use cases (i.e. tools and Run-Time Interface Reflection).
-These use cases have orthogonal interests, with the former requiring human-readable descriptions, and the latter preferring machine-readable descriptions.
-
-Furthermore, the type description should be useful even across middlewares and serialization libraries and that makes it especially important to send at least the original inputs to the "type support pipeline" (i.e. the process of taking user-defined types and generating all supporting code).
-In this case, because the "type support pipeline" is a lossy process, there is a need to ensure that enough information is sent to completely reproduce the original definition of the type, and therefore it makes sense to just send the original ``idl`` or ``msg`` file.
-
-At the same time, it is useful to send information with the original description that makes it easier to process data at the receiving end, as it is often not trivial to get to the "parsed" version of the type description from the original text description.
-
-Finally, while there could be an argument for sending a losslessly compressed version of the message file, the expected low frequency of queries to the type description service incurs a negligible overhead that heavily reduces the benefit.
-
-Implementing in ``rcl`` versus ``rmw``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-While it is true that implementing the type description distribution on the ``rmw`` layer would allow for much lower level optimization, removing the layer of abstraction avoids having to implement this feature in each rmw implementation.
-
-Given that the potential gains from optimization will be small due to how infrequently the service is expected to be called, this added development overhead was determined to not be worth it.
-Instead the design prefers to have a unified implementation of this feature in ``rcl`` so it is agnostic to any middleware implementations and client libraries.
-
-Nested ``TypeDescription`` Example
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ``TypeDescription`` message type shown above also supports the complete description of a type that contains other types (a nested type), up to an arbitrary level of nesting.
-Consider the following example:
-
-.. code::
-
-    # A.msg
-    B b
-    C c
-
-    # B.msg
-    bool b_bool
-
-    # C.msg
-    D d
-
-    # D.msg
-    bool d_bool
-
-The corresponding ``TypeDescription`` for ``A.msg`` will be as follows, with the referenced type descriptions accessible as ``IndividualTypeDescription`` types in the ``referenced_type_descriptions`` field of ``A``:
-
-.. code::
-
-    # A: TypeDescription
-    type_description: A_IndividualTypeDescription
-    referenced_type_descriptions: [B_IndividualTypeDescription,
-                                   C_IndividualTypeDescription,
-                                   D_IndividualTypeDescription]
-
-Note that the type description for ``A`` itself is found in the ``type_description`` field instead of the ``referenced_type_descriptions`` field.
-Additionally, in the case where a type description contains no referenced types (i.e., when it has no fields, or all of its fields are primitive types), the ``referenced_type_descriptions`` array will be empty.
-
-.. code::
-
-    # A: IndividualTypeDescription
-    type_name: "A"
-    fields: [A_b_Field, A_c_Field]
-
-    # B: IndividualTypeDescription
-    type_name: "B"
-    fields: [B_b_bool_Field]
-
-    # C: IndividualTypeDescription
-    type_name: "C"
-    fields: [C_d_Field]
-
-    # D: IndividualTypeDescription
-    type_name: "D"
-    fields: [D_d_bool_Field]
-
-With the corresponding ``Field`` fields:
-
-.. code::
-
-    # A_b_Field
-    field_type: 0
-    field_name: "b"
-    nested_type_name: "B"
-
-    # A_c_Field
-    field_type: 0
-    field_name: "c"
-    nested_type_name: "C"
-
-    # B_b_bool_Field
-    field_type: 9 # Suppose 9 corresponds to a boolean field
-    field_name: "b_bool"
-    nested_type_name: "" # Empty if primitive type
-
-    # C_d_Field
-    field_type: 0
-    field_name: "d"
-    nested_type_name: "D"
-
-    # D_d_bool_Field
-    field_type: 9
-    field_name: "d"
-    nested_type_name: ""
-
-In order to handle the type of a nested type such as ``A``, the receiver can use the ``referenced_type_descriptions`` array as a lookup table keyed by the value of ``Field.nested_type_name`` or ``IndividualTypeDescription.type_name`` (which will be identical for a given type) to obtain the type information of a referenced type.
-This type handling process can also support any recursive level of nesting (e.g. while handling A, C is encountered as a nested type, C can then be looked up using the top level ``referenced_type_descriptions`` array).
-
-Alternatives
-^^^^^^^^^^^^
-
-Other Providers of Type Description
-"""""""""""""""""""""""""""""""""""
-
-Several other candidate strategies for distributing the type descriptions were considered but ultimately discarded for one or more reasons like: causing a strong dependency on a particular middleware or a third-party technology, difficulties with resolving the message type description locally, difficulties with finding the correct entity to query, or causing network throughput issues.
-
-These are some of the candidates that were considered, and the reasons for their rejection:
-
-- Store the type description as a ROS parameter
-  * Causes a mass of parameter event messages being sent at once on init, worsening the network initialization problem
-- Store the type description on a centralized node per machine
-  * Helps reduce network bandwidth, but makes it non-trivial to find the correct centralized node to query, and introduces issues of resolving the local message package, such as when nodes are started from different sourced workspaces.
-- Send type description alongside discovery with middlewares
-  * Works very well if supported, but is only supported by some DDS implementations (which support XTypes or some other way to attach discovery metadata), but causes a strong dependency on DDS.
-- Send type description using a different network protocol
-  * Introduces additional third-party dependencies separate from ROS and the middleware.
-
-Alternative Type Description Contents and Format
-""""""""""""""""""""""""""""""""""""""""""""""""
-
-A combination of the original ``idl`` / ``msg`` file and any other information needed for serialization and deserialization being sent allows for one to cover the weaknesses of the other.
-Specifically, given that certain use-cases (e.g., ``rosbag``) might encounter situations where consumers of a message are using a different middleware or serialization scheme the message was serialized with, it becomes extremely important to send enough information to both reconstruct the type support, and also allow the message fields to be accessed in a human readable fashion to aid in the writing of transfer functions.
-As such, it is not a viable option to only send one or the other.
-
-Additionally, the option to add a configuration option to choose what contents to receive from the service server was disregarded due to how infrequently the type description query is expected to be called.
-
-As for the format of the type description, using the ROS interfaces to describe the type, as opposed to an alternative format like XML, JSON, or something like the TypeObject defined by DDS-XTypes, makes it easier to embed in the ROS Service response.
-It also prevents unnecessary coupling with third-party specifications that could be subject to change and reduces the formats that need to be considered on the receiving end of the ROS Service call.
-
-Representing Fields as An Array of Field Types
-""""""""""""""""""""""""""""""""""""""""""""""
-
-The use of an array of ``Field`` messages was balanced against using two arrays in the ``IndividualTypeDescription`` type to describe the field types and field names instead, e.g.:
-
-.. code::
-
-    # Rejected IndividualTypeDescription Variants
-
-    # String variant
-    string type_name
-    string field_types[]
-    string field_names[]
-
-    # uint8_t Variant
-    string type_name
-    uint8_t field_types[]
-    string field_names[]
-
-The string variant was rejected because using strings to represent primitive types wastes space, and will lead to increased bandwidth usage during the discovery and type distribution process.
-The uint8_t variant was rejected because uint8_t enums are insufficiently expressive to support nested message types.
-
-The use of the ``Field`` type, with a ``nested_type_name`` field that defaults to an empty string mitigates the space issue while allowing for support of nested message types.
-Furthermore, it allows the fields to be described in a single array, which is easier to iterate through and also reduces the chances of any errors from mismatching the array lengths.
-
-Using an Array to Store Referenced Types
-""""""""""""""""""""""""""""""""""""""""
-
-Some alternatives to using an array of type descriptions to store referenced types in a nested type were considered, including:
-
-- Storing the referenced types inside the individual type descriptions and accessing them by traversing the type description tree recursively instead of using a lookup table.
-
-  - Rejected because the IDL spec does not allow for a type description to store itself, and also because it could possibly introduce duplicate, redundant type descriptions in the tree, using up unnecessary space.
-
-- Sending referenced types in a separate service call or message.
-
-  - Rejected because needing to collate all of the referenced types on the receiver end introduces additional implementation complexity, and also increases network bandwidth with all the separate calls that must be made.
-
-Run-Time Interface Reflect
---------------------------
-
 TODO
 
 Plugin Matching and Loading API Example
@@ -1289,23 +832,7 @@ Feature Progress
 
 Supporting Feature Development:
 
-- TypeDescription Message:
-
-  - [ ] Define and place the TypeDescription.msg Message type in a package
-
-- Enforced Type Versioning:
-
-  - [ ] Create library to calculate version hash from message files
-  - [ ] Create library to calculate version hash from service files
-  - [ ] Create library to calculate version hash from action files
-  - [ ] Create command-line tool to show the version hash of an interface
-
-  - [ ] Add topic type version hash to the "graph" API
-  - [ ] Implement delivery of type version hash during discovery:
-
-    - [ ] rmw_fastrtps_cpp
-    - [ ] rmw_cyclonedds_cpp
-    - [ ] rmw_connextdds
+- Type Version Enforcement
 
   - [ ] Add rmw API for notifying user when a topic endpoint was not matched due to type mismatch
 
@@ -1327,11 +854,6 @@ Supporting Feature Development:
     - [ ] rmw_cyclonedds_cpp
     - [ ] rmw_connextdds
 
-- Type Description Distribution:
-
-  - [ ] Define and place the GetTypeDescription.srv Service type in a package
-  - [ ] Implement the "get type description" service in the rcl layer
-
 - Run-time Interface Reflection:
 
   - [ ] Prototype "description of type" -> "accessing data fields from buffer" for each rmw vendor:
@@ -1377,9 +899,6 @@ Supporting Feature Development:
 
 Tooling Development:
 
-- [ ] Command-line tools for interacting with type version hashes
-- [ ] Add a default warning to users when a topic has mismatched types (by type or version)
-
 - [ ] Command-line tool for determining if two versions of a type are convertible
 
 - [ ] CMake and C++ APIs for writing transfer functions, including usage documentation
@@ -1390,15 +909,16 @@ Tooling Development:
 
 - [ ] Integration with rosbag2
 
-  - [ ] Record the type version hash, TypeDescription, and other metadata into bags
-  - [ ] Use recorded metadata to create subscriptions without needing the type to be built locally
+  - [ ] Use recorded type description metadata to create subscriptions without needing the type to be built locally
   - [ ] Use recorded metadata to create publishers from descriptions in bag
-  - [ ] Provide a method to playback/update old bag files that do not have type descriptions stored
 
 References
 ==========
 
+.. [#rep2016] REP 2016: ROS 2 Interface Type Descriptions
+   (https://www.ros.org/reps/rep-2016.html)
+
 .. [1] DDS-XTYPES 1.3
    (https://www.omg.org/spec/DDS-XTypes/1.3/About-DDS-XTypes/)