

PyCall related bottlenecks, and are they addressable? #101

Open
tomerarnon opened this issue Oct 5, 2023 · 2 comments


tomerarnon commented Oct 5, 2023

I don't think this is an issue with RobotOS.jl itself, but rather with its use of PyCall. Still, I'm hoping there's some insight to be gained here, and perhaps things that can be done to mitigate it on the RobotOS side or my own.

What I'm seeing when profiling my ROS project is that almost all of my application's runtime is split between:

  1. The async callback at pubsub.jl line 59. A significant fraction of this time is spent in conversions implemented in PyCall that appear to be inherently type-unstable (e.g. py2array), and much of that is spent in pyerr_check.
    • Side note, but maybe worth mentioning: I've noticed that the longer the gap between two calls to rossleep/the subscriber callbacks, the more time is spent processing these pyerrors. Maybe a list of them accumulates over time and is cleared after the yield? Not sure if this is related.
  2. A type-unstable call to convert stemming from gentypes.jl line 489, which leads to a type-unstable call to NpyArray in PyCall.
  3. A type-unstable call to ismutable stemming from gentypes.jl line 507, which is apparently required for deepcopy and any array allocations.

I'm wondering if you're aware of the bottlenecks above, and if you have any experience dealing with them. I'm hoping there is something to be done here, since the 3 items above account for almost my entire computation budget 😅
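For anyone investigating item 1, one standard PyCall technique for type-unstable conversions is to annotate the return type at the call site, so inference doesn't have to guess what Python hands back. A minimal sketch, using `numpy.mean` as a hypothetical stand-in for whatever Python function the callback invokes (none of this is RobotOS.jl's actual code path):

```julia
using PyCall

np = pyimport("numpy")

# Type-unstable: PyCall converts the result via PyAny, so the compiler
# cannot infer what this returns.
unstable(x) = np.mean(x)

# Annotating the expected return type with `pycall` makes the call
# inferable past the Python boundary:
stable(x) = pycall(np.mean, Float64, x)

# For arrays, `PyArray` wraps the NumPy buffer without copying, which
# also sidesteps the py2array conversion cost:
as_pyarray(o::PyObject) = PyArray(o)
```

This doesn't remove the pyerr_check overhead inherent to every Python call, but it can restore type stability downstream of the conversion.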

jdlangs (Owner) commented Oct 5, 2023

@tomerarnon thanks for the report and the effort investigating it! Maybe you could share some of your benchmarking and profiling setup in case anyone wants to pursue optimizations? I don't have any immediate ideas or suggestions except that it's been so long since the package was written that I wouldn't be surprised if it's possible to get much better performance out of PyCall now.

tomerarnon (Author)

I can't share my particular profiling data or code, but the setup is quite simple. I used ProfileView.jl and ran my "ros loop" for a set number of iterations (I believe 50 at 10 Hz) under @profview. After that, it's just a matter of inspecting the flame graph to find the bottlenecks and see what can be addressed. Because the callbacks involve an async task, there are two "sources"/lowest levels in the flame graph, one for the main program and another for the background task, but both contribute to the overall runtime.
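The setup described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the author's code: `ros_loop` stands in for the real publish/subscribe cycle, and it needs a running ROS master to actually execute (`Rate` and `rossleep` are from RobotOS.jl):

```julia
using RobotOS, ProfileView

# Hypothetical stand-in for the "ros loop": run the pub/sub cycle for a
# fixed number of iterations at a fixed rate.
function ros_loop(n)
    loop_rate = Rate(10.0)      # 10 Hz
    for _ in 1:n
        # ... publish messages; rossleep yields so subscriber
        # callbacks get a chance to run ...
        rossleep(loop_rate)
    end
end

ros_loop(1)              # warm-up run so compilation doesn't dominate
@profview ros_loop(50)   # collect samples and open the flame graph
```

Profiling the warm second run is what makes the PyCall conversion and pyerr_check frames stand out rather than compilation frames.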
