parallelize costmap bound updates #2393

simutisernestas · 2021-06-07T18:04:27Z

Basic Info

Info	Please fill out this column
Ticket(s) this addresses	#2042
Primary OS tested on	Ubuntu
Robotic platform tested on	Gazebo simulation of turtlebot

Description of contribution

fun little parallel optimization of costmap bounds update
i've described this previously in Assess parallelism opportunities in Nav2 #2042, but this time i've used default config provided in nav2_bringup. When the number of layers gets smaller (or layers are very light) improvement slowly vanishes, but still even with 2-3 layers and no filters there is 5-10% improvement overall in updateMap measured here: https://github.com/ros-planning/navigation2/blob/b63b11f8a638faf389d6b99911d95899abe1580b/nav2_costmap_2d/src/costmap_2d_ros.cpp#L411-L415

Future work that may be required

probably a parallel version of updateCosts function call would be more interesting and useful. However, that's a bit more tricky to do safely on that shared member variable combined_costmap_ here: https://github.com/ros-planning/navigation2/blob/b63b11f8a638faf389d6b99911d95899abe1580b/nav2_costmap_2d/src/layered_costmap.cpp#L208
any thoughts on how you would go about implementing this one? one obvious way would be to make a copy for each thread, update costs on the copy, and later join them but that seems suboptimal because the costmap copy can become expensive

For Maintainers:

Check that any new parameters added are updated in navigation.ros.org
Check that any significant change is added to the migration guide
Check that any new functions have Doxygen added
Check that any new features have test coverage
Check that any new plugins is added to the plugins page
If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

SteveMacenski · 2021-06-08T17:25:34Z

Conviniently, about 50% of the costmap plugins do the majority of their work in the update bounds section (since you need to know what to extend the bounds by, you have to process information there). So the updateCosts function is primarily just copying over the data, the updateBounds probably represents a non-trivial amount of the available optimization.

I think it would actually be a better use of time to migrate more plugins to do their main "work" in the updateBounds function to use this optimization than to try to parallelize updateCosts since that is majority copying over which can't be easily parallelized. I don't see any reason why the layers that do more work in updateCosts couldn't have it moved to updateBounds for the most part to make the best use of this!

nav2_costmap_2d/CMakeLists.txt

nav2_costmap_2d/src/layered_costmap.cpp

SteveMacenski · 2021-06-08T17:32:39Z

nav2_costmap_2d/src/layered_costmap.cpp

-        prev_minx, prev_miny, prev_maxx, prev_maxy,
-        minx_, miny_, maxx_, maxy_,
-        (*plugin)->getName().c_str());
+  #pragma omp for schedule(static, 1)


From my understanding, this scheduling makes it so that each thread takes 1 iteration at a time statically. Why specify this and not have it use auto or the other options? Genuinely not sure, this might be the right thing to do but asking

Similar to the previous point, if you have a small number of iteration and auto sets a schedule for one thread to take all iterations at once this wouldn't make any sense. I'm just cautious about auto because I don't know what the compiler will do here in that case. Therefore one plugin or filter update allocated to one thread seems most appropriate because the number is usually less than 5

did you see a demonstrable change for your testing with and without this?

i'll get back with some quantitative results on this one, haven't tested with auto actually

(or even just lacking of schedule specification, which I assume is auto. I think it defaults to the number of cores)

Yeah, I think we can leave scheduling for OpenMP.

I would go with the third one since the difference is not noticeable and it seems more flexible. I've pushed code with that version. Probably those variables in front of the loop could be made static member variables.

I'll update you on the speed-up in the near future.

I've got back to the speed-up evaluation. Funny enough, without DEBUG log level parallel version is actually underperforming compared to the plain loop... Log level was influencing my measurements previously somehow.

plain average update time: 0.001102279375405933 parallel average update time: 0.0010157955731191883

These are averaged updated from both local and global cost maps. With this config: https://github.com/simutisernestas/navigation2/blob/parallel_config/nav2_bringup/bringup/params/nav2_params.yaml. I wonder if that would be the case with heavier cost map layers or bigger map.

1.1 ms, why so fast? Oh, is this in the TB3 sandbox world? Yeah, it might be worth trying something larger. This shows an overview of getting nav2 running in the AWS warehouse world https://github.com/ros-planning/navigation2/tree/main/nav2_simple_commander.

Strangely enough, map updates are even faster on AWS warehouse world demo. I've pulled the latest changes from the repo the code must have been upgraded since the last time I've benchmarked this. Both parallel and serial versions do the job in about the same amount of time. With parallel slightly better, but it's insignificant I would say. I'm testing on Intel® Core™ i7-3770 CPU @ 3.40GHz × 8 with the same config as mentioned previously.

serial avg update time: 0.7053497955298016 ms from 2416 samples parallel avg update time: 0.6474942507522566 ms from 2991 samples

That doesn't surprise me, since you if each thread is doing "more" work, then the overhead of spinning up a new thread is more worth it. It's about 8% it seems, do you know why its lower than what you were seeing before when you reported a ~30% increase in speed? I'm also a little dubious since the updates are so fast. Your other experiment reported 12-20ms in update times (I assume more data) while this is less than 1ms. 1ms seems way too fast to be what you'd see in a practical deployed application so I wonder if in a more realistic setup where things are taking longer if we do see that benefit again.

nav2_costmap_2d/src/layered_costmap.cpp

SteveMacenski · 2021-10-05T19:44:23Z

I'm going to close this for now since we're still trying to characterize the benefits and settings. We're not seeing the kind of improvements we'd like and I don't want to keep holding you hostage to working on this after so many months. Its a good reference starting point for an effort another time, but I think it would be good to get this off the review queue until a more dedicated effort is started another time.

parallelize costmap bound updates

892ead2

SteveMacenski requested changes Jun 8, 2021

View reviewed changes

simutisernestas added 2 commits June 8, 2021 20:42

remove whitespaces from cmake

c2bf850

separate parallel block for plugins & threads

e7323cb

SteveMacenski closed this Oct 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

parallelize costmap bound updates #2393

parallelize costmap bound updates #2393

simutisernestas commented Jun 7, 2021 •

edited by SteveMacenski

Loading

SteveMacenski commented Jun 8, 2021 •

edited

Loading

SteveMacenski Jun 8, 2021

simutisernestas Jun 8, 2021

SteveMacenski Jun 8, 2021

simutisernestas Jun 9, 2021

SteveMacenski Jun 9, 2021

simutisernestas Jun 11, 2021

simutisernestas Jul 6, 2021

SteveMacenski Jul 7, 2021

simutisernestas Jul 12, 2021

SteveMacenski Jul 12, 2021

SteveMacenski commented Oct 5, 2021

parallelize costmap bound updates #2393

parallelize costmap bound updates #2393

Conversation

simutisernestas commented Jun 7, 2021 • edited by SteveMacenski Loading

Basic Info

Description of contribution

Future work that may be required

For Maintainers:

SteveMacenski commented Jun 8, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SteveMacenski commented Oct 5, 2021

simutisernestas commented Jun 7, 2021 •

edited by SteveMacenski

Loading

SteveMacenski commented Jun 8, 2021 •

edited

Loading