
Port to VORC #19

Merged

mabelzhang merged 7 commits into vorc from vorc-docker on Nov 20, 2020

Conversation

@mabelzhang (Collaborator) commented Nov 11, 2020

Dependent on osrf/vrx#228 and osrf/vorc#30.

The repository has been adapted to VORC, to live in a new branch.
All the individual scripts and multi-scripts ran.

Things in this PR that are different from the main branch:

  • Updated README for VORC.
  • Fixed the window size issue in video recording by using --windowid, so the evaluator no longer needs to manually tweak the x, y, width, and height arguments for recordmydesktop (see the sketch after this list).
  • Removed the VRX requirement for a sensor config and WAM-V URDF config. prepare_team.bash still generates an empty file with the team name, because one of the scripts looks at the file names to determine the list of teams.
  • Added new VORC parameters to the task_config YAML files (dependent on the PR in vrx).
    Note 1: Only trial 0 of each task has been updated with coordinates for VORC. I haven't had time to customize the subsequent trials.
    Note 2: Gymkhana will need a new YAML file.
  • Removed generated/, since we don't have permanent example files yet. Once we do, we can add them back and probably add the directory to .gitignore, so our local files don't continuously get committed.
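
A minimal sketch of the window-targeted recording (the window title, output file name, and extra flags are assumptions for illustration; the actual invocation in the script may differ):

# Find the X window id of the Gazebo GUI (assumes the window title is "Gazebo").
$ WINDOW_ID=$(xwininfo -name "Gazebo" | awk '/Window id/ {print $4}')
# Record only that window instead of hand-tuning -x/-y/-width/-height.
$ recordmydesktop --windowid "${WINDOW_ID}" -o trial_video.ogv --no-sound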

Issues

There may still be intermittent seg faults. If you see one, please let me know when it happens. They've shown up occasionally, but not in my most recent runs, so I don't know whether they're fixed.

After realizing I could set gui:=true in the server Docker image (duh) to debug visually, I saw that the boat was actually in the world. That contradicted what I saw in the GUI spun up by the video recording script, which requires a workspace to also exist on the host machine. A number of things there were broken because I don't usually develop on my host machine.

(That itself is a huge problem, because it leads to inconsistencies between what runs in the actual competition server Dockerfile and what gets recorded in the video: some arbitrary environment on an evaluator's own host machine, which could be very different from the reference environment in the server Docker. The whole point of having a Dockerfile is to keep everything consistent, and videos should really be recorded from a window in Docker rather than from the host machine. That really needs to be fixed.)

Other than that, there are a number of things that need to be more rigorous and follow good practices. I’m going to open followup issues for them.

Once those issues are cleaned up, like the video problem, things will be less error-prone and there will be a lot less hair to pull.

To test

Follow the README :) Or this shorter version below.

First, I recommend going into vorc_server/vorc-server/run_vorc_trial.sh and setting gui:=true on the roslaunch vorc_gazebo evaluation.launch line.
This will help the reviewer (and me) confirm that the competition run really works for everyone.
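
If the launch line currently passes gui:=false (an assumption; check the actual script first), a quick way to flip it:

# Assumes run_vorc_trial.sh contains gui:=false on the roslaunch line; verify before running.
$ sed -i 's/gui:=false/gui:=true/' vorc_server/vorc-server/run_vorc_trial.sh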

Then, build the server Docker (-n for NVIDIA):

$ ./vorc_server/build_image.bash -n

Single scripts:

In the trials, please zoom out in the Gazebo GUI (build the image with gui:=true set, see above) and make sure the marina shows up, the robot and task objects show up, and everything looks normal.

Currently, only trial 0 objects are customized to VORC world coordinates. You can try trial 1+, but things probably won’t look right.

With the ghostship solution (specific to VORC), you should see the robot moving forward when the task starts.
With the example_team and example_team_2 solutions (specific to VRX; we will remove them once we have more examples), nothing will happen, but things should still run.

$ ./prepare_team.bash ghostship

$ ./prepare_task_trials.bash perception
$ ./prepare_task_trials.bash stationkeeping
$ ./prepare_task_trials.bash wayfinding

# Each of these will open the Gazebo GUI from the server Docker container.
# Please inspect that things show up correctly.
$ ./run_trial.bash -n ghostship stationkeeping 0
$ ./run_trial.bash -n ghostship wayfinding 0
$ ./run_trial.bash -n ghostship perception 0

# Make sure this runs all the way and you get a video.
# This will run, but the robot doesn't show up when using my host machine workspace. We need to fix video recording to record from inside Docker.
$ ./generate_trial_video.bash example_team stationkeeping 0

Batch scripts:

Note that example_team and example_team_2 won't be able to move CoRa, since they're set up to send commands to WAM-V topics.

$ ./multi_scripts/prepare_all_teams.bash

$ ./multi_scripts/prepare_all_task_trials.bash

$ ./multi_scripts/run_one_team_one_task.bash -n example_team stationkeeping
$ ./multi_scripts/run_one_team_all_tasks.bash -n example_team

# I have run this one all the way. It will take a long time to finish.
$ ./multi_scripts/run_all_teams_all_tasks.bash -n

$ ./multi_scripts/generate_one_team_one_task_videos.bash example_team example_task
# etc.

mabelzhang requested a review from caguero on November 11, 2020 04:35
@mabelzhang (Collaborator, Author) commented:

@crvogt

Minor issue with the ghostship example solution.

On termination, I'm getting a Traceback:

$ ./run_trial.bash -n ghostship stationkeeping 0 
...
---------------------------------
Creating container for crvogt/ghostship:v1

rosmaster already running
/root
gzserver shut down
OK

Killing rosnodes
[ WARN] [1605066935.331941432, 320.000000000]: Shutdown request received.
[ WARN] [1605066935.331989814, 320.000000000]: Reason given for shutdown: [user request]
Starting node!!!
shutdown request: user request
Traceback (most recent call last):
  File "basic_node.py", line 42, in <module>
    sn.sendCmds()
  File "basic_node.py", line 38, in sendCmds
    self.rate.sleep()
  File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/timer.py", line 103, in sleep
    sleep(self._remaining(curr_time))
  File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/timer.py", line 165, in sleep
    raise rospy.exceptions.ROSInterruptException("ROS shutdown request")
rospy.exceptions.ROSInterruptException: ROS shutdown request
killing:
 * /gazebo_gui
 * /record_1605066612195773050
 * /rosout
 * /send_commands
killed
[ INFO] [1605066612.215335434]: Subscribing to /vorc/task/info
[ INFO] [1605066613.132581061, 0.036000000]: Recording to '/home/master/vorc_rostopics.bag'.
Killing roslaunch pid: 55
OK

Trial ended. Logging data
---------------------------------

For reference, with the example_team solution, I get no Traceback between the "Reason given for shutdown: [user request]" and "killing:" lines:

---------------------------------
Creating container for tylerlum/vrx-competitor-example:v2.2019

Running /move_forward.sh
gzserver shut down
OK

Killing rosnodes
[ WARN] [1605060003.425509787, 50.001000000]: Shutdown request received.
[ WARN] [1605060003.425572958, 50.001000000]: Reason given for shutdown: [user request]
[ INFO] [1605059951.272962663]: Subscribing to /vorc/task/info
[ INFO] [1605059952.211171918, 0.019000000]: Recording to '/home/master/vorc_rostopics.bag'.
killing:
 * /gazebo_gui
 * /record_1605059951250528346
 * /rosout
 * /rostopic_29_1605059957326
 * /rostopic_30_1605059957325
killed
shutdown request: user request
shutdown request: user request
Killing roslaunch pid: 56
OK

Trial ended. Logging data
---------------------------------

Probably just a termination cleanup issue. Could you look into it? Not a big problem, but the output would look cleaner without the traceback.

@mabelzhang (Collaborator, Author) commented Nov 11, 2020

I created a meta-ticket #20 tracking the followup issues.
I don't think I'll have time to address all of them by myself. Help is appreciated.

The "VORC Essentials" anyone can do. It's really just playing around with the world and figuring out where to put buoys etc. Fun task. I'm happy to hand it off to someone else.

The "Infrastructure" items I'd really like to get fixed. It will make my life easier and the code more rigorous (which currently bothers me).

@crvogt (Collaborator) commented Nov 11, 2020

@mabelzhang It must be that my node doesn't handle shutdown well. I couldn't reproduce the output, so I simplified it to a bash script publishing on the cora thrust command topic. I'll check out the VORC Essentials and make a note of which ones I'm handling.

@mabelzhang (Collaborator, Author) commented:

Do you think it's worthwhile to commit the example solution to the repo as well? While testing, I wondered a few times what the content of example_team was and wanted to just change the topic names to the VORC ones, but I had no access to any code. I think it'd be helpful, and it would also serve as a reference for the shutdown handling.

@crvogt (Collaborator) commented Nov 12, 2020

I took a look at the content of both example_team* images (you can view it while the container is running with docker exec -it <container_name> bash), wondering the same thing. It's exactly the script from the tutorial page https://github.com/osrf/vrx/wiki/tutorials-Creating%20a%20Dockerhub%20image%20for%20submission. There's no actual node running.

I'm still curious how to properly handle shutdown scenarios, whether they were handled in VRX, or whether the traceback occurred for every team (that presumably didn't know how to handle it). I'll see if I can figure it out going forward, because I think it would be helpful.

@mabelzhang (Collaborator, Author) commented:

Looks like there is a basic_node.py Python script? The rospy.exceptions.ROSInterruptException suggests the shutdown signal needs to be handled with a try/except, something like:

    while not rospy.is_shutdown():
        try:
            ...
            rate.sleep()
        except rospy.ROSInterruptException:
            break

@crvogt (Collaborator) commented Nov 16, 2020

Ah, ok, I'll give that a try right now. Thanks!

@crvogt (Collaborator) commented Nov 16, 2020

Added the try/except. The output looks nicer on mine, let me know if it makes a difference for you (it's uploaded).

EDIT: Trying it with the vorc-docker branch now

@crvogt (Collaborator) commented Nov 16, 2020

So! I ran it with the wayfinding task and got relatively clean output. I get a few Cannot kill container ... is not running messages, but otherwise nothing like what you posted. Specifically, I'm getting:

Killing rosnodes
[ WARN] [1605558728.533756556, 320.000000000]: Shutdown request received.
[ WARN] [1605558728.533790352, 320.000000000]: Reason given for shutdown: [user request]
Starting node!!!
shutdown request: user request
Complete
killing:
 * /gazebo_gui
 * /record_1605558403275765041
 * /rosout
 * /send_commands
killed
[ INFO] [1605558403.290691459]: Subscribing to /vorc/task/info
[ INFO] [1605558403.785919027, 0.070000000]: Recording to '/home/localadmin/vorc_rostopics.bag'.
Killing roslaunch pid: 55
OK

Trial ended. Logging data
---------------------------------
Copying ROS log files from server container...
OK

Creating text file for trial score
Successfully recorded trial score in /home/localadmin/vorc_ws/src/vrx-docker/utils/../generated/logs/ghostship/wayfinding/0/trial_score.txt
OK

Copying ROS log files from competitor container...
OK

Killing containers
Killing any running Docker containers matching 'vorc-competitor-*'...
Error response from daemon: Cannot kill container: 59351dbe9017: Container 59351dbe901765ddda9f2609ce3d943a925b187d7094200dd43a46c2cc455251 is not running
Removing any Docker containers matching 'vorc-competitor-*'...
59351dbe9017
Killing any running Docker containers matching 'vorc-server-*'...
Error response from daemon: Cannot kill container: c214e26e8455: Container c214e26e8455f02bd65cc290a8e77769bbef730f1b5e16bd9a5396210f7df0f4 is not running
Removing any Docker containers matching 'vorc-server-*'...
c214e26e8455
Done.

So your suggestion looks like a good way forward for shutdown handling.

On a side note, I really struggled to get everything working with my Docker environment. I would receive the following error:

+ docker run --name vorc-server-system -e XAUTHORITY=/tmp/.docker.xauth --env=DISPLAY --env=QT_X11_NO_MITSHM=1 -v /tmp/.docker.xauth:/tmp/.docker.xauth -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro -v /tmp/.docker.xauth:/tmp/.docker.xauth -v /dev/log:/dev/log -v /dev/input:/dev/input --runtime=nvidia --privileged --security-opt seccomp=unconfined -u 1000:1000 --net vorc-network --ip 172.16.1.22 -v /home/localadmin/vorc_ws/src/vrx-docker/generated/task_generated/wayfinding:/task_generated -v /home/localadmin/vorc_ws/src/vrx-docker/generated/logs/ghostship/wayfinding/0:/vorc/logs -e ROS_MASTER_URI=http://172.16.1.22:11311 -e ROS_IP=172.16.1.22 -e VRX_EXIT_ON_COMPLETION=true -e VRX_DEBUG=false vorc-server-melodic-nvidia:latest /run_vorc_trial.sh /task_generated/worlds/wayfinding0.world /vorc/logs
docker: Error response from daemon: network vorc-network not found.
ERRO[0000] error waiting for container: context canceled 

This error is covered in maxking/docker-mailman#85 and I believe it occurs when a Docker network persists even after the containers are killed. docker network rm <network> solved it and allowed me to run the trials. If we have a troubleshooting page, it might be good to add this (unless it's common knowledge and I'm out of the loop).
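
For reference, a minimal cleanup sketch (the network name is taken from the error above; the exact cleanup you need may differ):

# List existing Docker networks to check whether a stale vorc-network is still around.
$ docker network ls
# Remove the stale network so the run scripts can recreate it with the expected address range.
$ docker network rm vorc-network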

@mabelzhang (Collaborator, Author) commented Nov 18, 2020

Thanks for trying it out! I tried out the new ghostship solution, and the output is clean now.

I think the Cannot kill container message is normal. My guess is that the cleanup step kills everything just in case some containers weren't stopped cleanly, so it prints that error when a container had already exited.

Hmm, I've seen that vorc-network not found error before too, but I don't remember seeing it for vrx-network. The time I saw it, I already had vrx-network running on the same IP, right before I changed the string to vorc-network, so I had to remove the vrx network manually before vorc-network could use the same IP. I have not seen it with vorc-network in my latest runs.

Normally, utils/vorc_network.bash already takes care of the case where the network exists: it runs docker network rm ${NETWORK} before calling docker network create.
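
A sketch of that remove-before-create pattern (the real utils/vorc_network.bash may differ; the subnet here is only an assumption chosen to be consistent with the 172.16.1.22 address in the docker run line above):

# Remove any leftover network with the same name, ignoring the error if it doesn't exist.
$ docker network rm vorc-network 2> /dev/null || true
# Recreate it so containers can be assigned static IPs such as 172.16.1.22.
$ docker network create --subnet 172.16.0.0/16 vorc-network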

If that network message still happens intermittently, maybe something else still needs to be updated for VORC, but I'm not finding any other vrx leftovers...

@crvogt (Collaborator) commented Nov 18, 2020

It's possible, then, that I was running vrx-docker before switching to vorc-docker. That could have left vrx-network on the IP that vorc-network was meant to use?

@mabelzhang (Collaborator, Author) commented:

Yeah, they use the exact same IP and subnet mask; only the name is different. So that could have been it.

@crvogt (Collaborator) commented Nov 18, 2020

Whoops! Thanks for straightening that out for me :)

@mabelzhang (Collaborator, Author) commented:

Update about the seg fault:
It is still happening - I see the seg fault in verbose_output.txt, but the scoring plugin keeps printing afterwards.
I ran the perception task a few times; sometimes it seg faulted, sometimes it didn't. I watched the GUI, and even when it did seg fault, all 3 prescribed buoys appeared, and the GUI doesn't close until the user requests shutdown. There was a trial score.
So I don't know the implications of the seg fault. I'll keep looking.

@caguero (Contributor) left a review comment:

I managed to run all three individual tasks. Gazebo was terminated at the end of the run and I got scores in all of them. It looks good to me.

@caguero (Contributor) commented Nov 19, 2020

Update about the seg fault:
It is still happening - I see the seg fault in verbose_output.txt, but the scoring plugin keeps printing afterwards.
I ran the perception task a few times; sometimes it seg faulted, sometimes it didn't. I watched the GUI, and even when it did seg fault, all 3 prescribed buoys appeared, and the GUI doesn't close until the user requests shutdown. There was a trial score.
So I don't know the implications of the seg fault. I'll keep looking.

Could that segfault occur while trying to shut down Gazebo? This is an old issue that happens sometimes in Gazebo. In any case, it doesn't seem to affect anything.

@mabelzhang (Collaborator, Author) commented Nov 20, 2020

Maybe? The weird thing is that the segmentation fault printout is not at the very end, but near the beginning or in the middle, before the rest of the scoring plugin printouts. Though it could be a difference in when things are flushed.

@mabelzhang (Collaborator, Author) commented:

@crvogt How much work is it to create a second solution Docker image, just so that we have more than one team to test the multi-scripts? It could just be something trivial again, perhaps the boat moving backwards, i.e. "ghostship is back"...... or something more creative.

(I've deleted example_team and example_team_2 from vrx because they don't publish anything to the vorc topics.)

@mabelzhang (Collaborator, Author) commented:

I'm going to merge this now since it's been approved, so that we have a base to run the competition. Additions and fixes can be in followup PRs. I know we have at least 2 PRs coming up.

mabelzhang merged commit 9ac3612 into vorc on Nov 20, 2020
mabelzhang deleted the vorc-docker branch on November 20, 2020 05:59

@crvogt (Collaborator) commented Nov 23, 2020

@crvogt How much work is it to create a second solution Docker image, just so that we have more than one team to test the multi-scripts? It could just be something trivial again, perhaps the boat moving backwards, i.e. "ghostship is back"...... or something more creative.

Should only take a few minutes! (New employee orientation on Friday, so I didn't get a chance to implement it.) "pihstsohg"? :D

@crvogt (Collaborator) commented Nov 24, 2020

@mabelzhang Added a new team. dockerhub_image.txt should be crvogt/shipghost:v1

@mabelzhang (Collaborator, Author) commented Nov 24, 2020

Thanks! It's working for me. I'll open a new PR and add you as reviewer.
Didn't go with the Dutch name huh? Or I guess in this case it's Gaelic.

@crvogt (Collaborator) commented Nov 25, 2020

Ahaha, it's been anglicized :D
