Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(autoware_probabilistic_occupancy_grid_map): cuda accelerated implementation #9542

Merged

Conversation

knzo25
Copy link
Contributor

@knzo25 knzo25 commented Dec 3, 2024

Description

This PR introduces a CUDA accelerated version of the OGM.
Several improvements are possible, but this PR attempts to replicate whenever possible all the functionalities of the original OGM that were actively used.

Algorithms that were not reimplemented in CUDA:

  • extractCommonPointCloud: this option was off in all configs and launchers, and talking with @YoshiRi, it would not be used

Algorithms that were not tested:

  • Fusion. No project actively uses it, but it should work since it uses a non accelerated version

Related links

Parent Issue:

  • Link

How was this PR tested?

Tested with our internal reference design. The OGM will not give the exact same results, but visual inspection shows almost the same characteristics.
The latency with a VLS128 goes from ~20ms to 2-3ms

Notes for reviewers

@soblin
This PR introduces CUDA dependencies. If this would break any P/C workflows, please let me know and I will try to make a version that works even without CUDA installes (sadly, via macros).

In addition, I was told that I should not accelerate the laserscan method. If this is wrong , please let me know

None.

Interface changes

None.

Effects on system behavior

None.

Copy link

github-actions bot commented Dec 3, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@github-actions github-actions bot added component:perception Advanced sensor data processing and environment understanding. (auto-assigned) tag:require-cuda-build-and-test labels Dec 3, 2024
@knzo25
Copy link
Contributor Author

knzo25 commented Dec 3, 2024

Note to self: run this with the evaluator

@knzo25 knzo25 self-assigned this Dec 3, 2024
@knzo25
Copy link
Contributor Author

knzo25 commented Dec 3, 2024

@YoshiRi
Using CUDA we can delete the 2 subsample filters from the raw and obstacle pointcloud. Would you prefer a different PR for that?

@soblin
Here or internally, can you give a few pointers on how to check that I did not break the laserscan method?

@knzo25 knzo25 added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Dec 3, 2024
Copy link
Member

@youtalk youtalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to be caused by the Autoware development container.

https://github.com/autowarefoundation/autoware.universe/actions/runs/12150740163/job/33884090279?pr=9542#step:5:7660

Error: ponent_container-1] [ERROR] [1733273274.264398228] [occupancy_grid_map_container]: Component constructor threw an exception: cudaErrorInsufficientDriver (35)@/__w/autoware.universe/autoware.universe/install/autoware_cuda_utils/include/autoware/cuda_utils/cuda_unique_ptr.hpp#L43: CUDA driver version is insufficient for CUDA runtime version

The error indicates that the CUDA runtime library is missing when loading into the component container during execution.
Therefore, it is believed that referencing other CUDA-dependent packages and modifying the component loading should resolve the issue.

@knzo25
Copy link
Contributor Author

knzo25 commented Dec 4, 2024

@youtalk
In the past and in my own environment, I had only seen that issue when I messed up my environment (cuda and driver installations).

I tried using the docker environment
https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/

and ran:

colcon test --packages-select autoware_probabilistic_occupancy_grid_map
colcon test

with no errors (to the relevant package). Should I expect the CI/CD to pass build and test if it works in autoware's docker installation?

@soblin
Copy link
Contributor

soblin commented Dec 5, 2024

@knzo25 Sorry I noticed your comments. For P/C workflow, no worry.

@soblin
Copy link
Contributor

soblin commented Dec 5, 2024

@knzo25 For testing laserscan method, run

ros2 launch autoware_launch planning_simulator.launch.xml map_path:=<> vehicle_model:=<> sensor_model:=<> 

, then place ego and a NPC, and visualize the OccupancyGridMap (you can find the checkbox under perception). And check if the region behind of NPC from ego is gray

@YoshiRi
Copy link
Contributor

YoshiRi commented Dec 12, 2024

@knzo25

Using CUDA we can delete the 2 subsample filters from the raw and obstacle pointcloud. Would you prefer a different PR for that?

You mean we can remove obstacle pointcloud filtering with raw data I explained last Friday?
If it is not complicated, I think it's okay to include that feature within this PR.

@knzo25
Copy link
Contributor Author

knzo25 commented Dec 13, 2024

@YoshiRi
I mean, we can delete the Pickup subsample (the extra node). It takes about 10ms for the raw pointcloud currently. If we forego that one, of course, the new implementation will be a bit slower, but not 10ms (sorry, I haven't had the chance to provide all the evidence required here 🙇 )

Copy link
Contributor

@technolojin technolojin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except the conflict, this PR worked in my local environment.

Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
@technolojin technolojin force-pushed the feat/cuda_probabilistic_occupancy_grid_map branch from 64d3030 to 5bfa282 Compare December 27, 2024 00:43
@technolojin
Copy link
Contributor

technolojin commented Dec 27, 2024

I accidentally pushed (-forced) into your repo. I am sorry for this.
The changes are resolving the conflict and update relatively latest main(rebase).

Copy link
Contributor

@YoshiRi YoshiRi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It worked on my environment. I left some trivial comments to fix.

int offset_dx = dx > 0 ? 1 : -1; // sign(dx);
int offset_dy = dy > 0 ? size_x : -size_x; // sign(dy) * size_x;

float scale = (dist == 0.0) ? 1.0 : min(1.f, max_length / dist);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for safety, how about using epsilon?

std::abs(dist) < epsilon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 5c14fc0
(this was essentially a copy from the upstream)

sm_lower_left_y * sm_size_x + sm_lower_left_x + region_y * sm_size_x + region_x;
const int dm_index =
dm_lower_left_y * dm_size_x + dm_lower_left_x + region_y * dm_size_x + region_x;

Copy link
Contributor

@YoshiRi YoshiRi Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[imho]
Add index check for safety

if(dm_index >= dm_size_x * dm_size_y) return;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in ea46130

knzo25 and others added 12 commits January 23, 2025 20:04
…er/log_odds_bayes_filter_updater_kernel.cu

Co-authored-by: Yoshi Ri <[email protected]>
…er/log_odds_bayes_filter_updater_kernel.cu

Co-authored-by: Yoshi Ri <[email protected]>
…om:knzo25/autoware.universe into feat/cuda_probabilistic_occupancy_grid_map
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
…om:knzo25/autoware.universe into feat/cuda_probabilistic_occupancy_grid_map
Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Copy link

codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 0% with 343 lines in your changes missing coverage. Please review.

Project coverage is 29.73%. Comparing base (874dbdd) to head (263feab).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...id_map/lib/costmap_2d/occupancy_grid_map_fixed.cpp 0.00% 51 Missing ⚠️
...p/lib/costmap_2d/occupancy_grid_map_projective.cpp 0.00% 47 Missing ⚠️
...rid_map/lib/costmap_2d/occupancy_grid_map_base.cpp 0.00% 44 Missing ⚠️
...costmap_2d/occupancy_grid_map_projective_kernel.cu 0.00% 38 Missing ⚠️
.../lib/costmap_2d/occupancy_grid_map_fixed_kernel.cu 0.00% 28 Missing ⚠️
...id_map/lib/updater/binary_bayes_filter_updater.cpp 0.00% 21 Missing ⚠️
...istic_occupancy_grid_map/lib/utils/utils_kernel.cu 0.00% 20 Missing ⚠️
...p/src/fusion/synchronized_grid_map_fusion_node.cpp 0.00% 19 Missing ⚠️
...d_map/pointcloud_based_occupancy_grid_map_node.cpp 0.00% 17 Missing ⚠️
...istic_occupancy_grid_map/utils/cuda_pointcloud.hpp 0.00% 13 Missing ⚠️
... and 10 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9542      +/-   ##
==========================================
- Coverage   29.82%   29.73%   -0.09%     
==========================================
  Files        1433     1441       +8     
  Lines      108060   107983      -77     
  Branches    42965    42777     -188     
==========================================
- Hits        32226    32110     -116     
- Misses      72705    72762      +57     
+ Partials     3129     3111      -18     
Flag Coverage Δ *Carryforward flag
differential-cuda 0.00% <0.00%> (?)
total 29.83% <ø> (+0.01%) ⬆️ Carriedforward from 4c13bb0

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@knzo25
Copy link
Contributor Author

knzo25 commented Jan 24, 2025

@soblin
Tested that adding a pedestrian in the planning sim has the expected behavior

…pu version uses the upstream logic

Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Copy link
Contributor

@YoshiRi YoshiRi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@knzo25 knzo25 merged commit 165be35 into autowarefoundation:main Jan 28, 2025
32 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:perception Advanced sensor data processing and environment understanding. (auto-assigned) run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) tag:require-cuda-build-and-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants