refactor: Replace 'screen' use with Docker. WIP #54
base: main
Conversation
I think it's reasonable, with a few notes:
Thanks for the quick response.

I don't really know; it seemed to already be popular when I first joined the lab, and I think it's just a trusted tool for the team, who want to be able to quickly pop into a session to check status. But I never understood why, when `screen` is already used by the `phyto-arm` script.

I'm trying to keep the […]

Good question. Today I usually use development containers, which makes testing faster.

I think I understand what you're saying, though I'm not sure that is in scope for this refactor?

I have put very little thought into a ROS2 transition plan for PhytO-ARM. Maybe it should be a discussion at this upcoming workshop.
Actually, your "reduced" list of layers is what we have now. The container passes […]

I guess on revisiting I'm like -0.5 on this. I don't feel strongly about the design here, and I support the idea of reducing complexity and making management easier. But let me explain why I designed it this way.

As you know, the canonical ROS1 node management system is `roslaunch`. The launch XML file can be parameterized, e.g. with `<arg>` substitutions. So I came up with the `phyto-arm` script. The other advantage of having the outer […]

Finally, Docker came around kind of independently to simplify deployment, and since it provides an alternative service management model (container start/stop), there was no need for […]
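To illustrate the parameterization point, here is a minimal sketch of a ROS1 launch file using arguments; the argument, package, and node names are invented for illustration, not taken from PhytO-ARM:

```xml
<launch>
  <!-- Arguments let one launch file serve many deployments. -->
  <arg name="config_file" default="configs/default.yaml" />
  <arg name="enable_ifcb" default="true" />

  <!-- Load the deployment config onto the parameter server. -->
  <rosparam command="load" file="$(arg config_file)" />

  <!-- Conditionally start a node based on an argument. -->
  <node if="$(arg enable_ifcb)" pkg="phyto_arm" type="arm_ifcb.py" name="arm_ifcb" />
</launch>
```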
I'm just saying I don't want to create more of an obstacle to migrating away from Docker by making Docker's service management more central to the project. We already run all our containers with […]
My two cents here... The extra tmux layer was helpful initially, not only because it provided `nohup`-style persistence but also because it captured the standard output from IFCBacquire and P-A. To my knowledge we aren't capturing this info properly in a log file anywhere, so the tmux panes are often a good/easy way to see where nodes are crashing and to monitor the current status of a deployment.
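Worth noting: Docker captures each container's stdout/stderr through its logging driver, so the information those tmux panes expose should also be recoverable with standard commands. A sketch, assuming a container named `phyto-arm` (the name is hypothetical):

```sh
# Follow live output (stdout + stderr) from a running container:
docker logs -f phyto-arm

# Show the last 200 lines with timestamps, e.g. to locate a node crash:
docker logs --tail 200 --timestamps phyto-arm
```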
Taking a step back and trying to think about it from a user-story POV, I want to be able to: […]

That is essentially my goal with this PR, and I'm flexible on the approach. What makes the most sense to me is to go back to using […]
I agree with that philosophy. There should be some command that makes it easy for operators. The implementation details of subprocess orchestration and monitoring are separate from that concern, and they work decently well as-is.
"Anything beyond this inevitably requires developer intervention." To the extent this is true, we are falling short of the goal for the overall PhytO-ARM effort. Very very few teams who might adopt P-A will have anyone on staff who would call themselves a developer. That shouldn't mean there isn't a path for them to adapt and repurpose this code independently. I think config only interaction is our goal for "day to day" / operational use. In these cases we want process of standing up a replacement host IFCB to be as streamlined as possible. |
Okay, we all seem to be agreed on the goal. In terms of implementation, if we continue with the current approach using […]
Force-pushed from b2a13c9 to 8530ef9.
I've changed this to use a new `processes` structure.
What are "processes" in this context? They seem to be containers? What is going on in the different containers?
I think a […]

I am reluctant to add a 300-line script with new Python package dependencies that essentially implements an alternative `docker compose`.

I am pondering the benefits of running multiple containers with the same container image, just with different launch files. As far as I know, you can just run multiple `roslaunch` instances within one container.

At least if we go ahead with this, we should define the interface clearly, so that if we change out the implementation later, operators aren't disrupted.

I am not up to speed on the "multi-arm" concept.
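For concreteness, a minimal sketch of the multiple-containers-one-image idea; the image name, launch files, and container names are all hypothetical:

```sh
# One image, one container per ROS launch; host networking is assumed
# so every launch can reach the same ROS master.
docker run -d --network host --name pa_main  phyto-arm roslaunch phyto_arm main.launch
docker run -d --network host --name arm_ifcb phyto-arm roslaunch phyto_arm arm_ifcb.launch
```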
A ROS launch file. In this case, ROS core and associated "main" launch file, and then any enabled arms. There is probably a better name than "processes". "launches"?
This is not an alternative `docker compose`.
It can be done with one container. There are tradeoffs, the obvious one being the isolation that containerization brings, with easy kill and reboot as you noted. Another is the ease of attaching to the stdout specific to the "process" you're interested in. Today we actually do multiple ROS launches within a single container, and it seems to work okay. One thing this requires, though, is peering into the container to determine what is running, and trying to manage container internals from the outside. Whereas multiple containers allow each launch to be inspected and managed from outside with ordinary Docker tooling.
"arm" is essentially just an abstraction for a scientific payload and its behavior, since IFCB is no longer the only winched instrument in use. We segment them into different ROS launches because which arms get used is highly mission dependent. |
```yaml
tcp_ports:
  bridge_node: 9090
  web_node: 8098
```
It would be nice to find a more elegant way of expressing this. Here, it is attached to the "process" structure, which is pretty far from the actual node configuration some 250 lines later (where the node is called just `web`).
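One possible shape, sketched with invented keys purely for illustration: attach the port to the node entry itself rather than keeping a separate `tcp_ports` map.

```yaml
processes:
  main:
    nodes:
      web:
        tcp_port: 8098
      bridge:
        tcp_port: 9090
```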
@rgov I want your opinion on this WIP refactor before I finish it.
Today we have accumulated a lot of management layers:

- `phyto-arm` Python script
- `screen` within `phyto-arm`
- `tmux` for monitoring
- Docker containers, which have become standard in the field

If we commit to always using Docker, we can reduce this to:

- `phyto-arm` Python script
- Docker containers

So: remove all the `tmux` and `screen` use, since Docker persists sessions anyway; use `phyto-arm` to launch ROS directly within containers; and simplify the config by also having `phyto-arm` handle mounting of devices/volumes/ports from the config automatically, which will eliminate all the gotchas we see in the field where startup fails because the Docker config does not match the PA config.

This also removes the need to launch arms separately: which processes get launched is all set up in the config YAML. Operators will basically only have to touch the YAML going forward.
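A rough sketch of what that YAML could look like (every key below is hypothetical except `tcp_ports`, which appears in this diff):

```yaml
processes:
  main:
    launch: main.launch
    devices:
      - /dev/ttyUSB0          # e.g. a winch motor controller
    volumes:
      - /data/ifcb:/data      # host_path:container_path
    tcp_ports:
      bridge_node: 9090
      web_node: 8098
```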
Thoughts? It's not working yet as-is, but I think there's enough here for you to get the approach.