GPU plan? #52
Comments
I'd say it is pretty hard: the algorithms are very sequential by nature, which GPUs do not handle well. The most promising approach might be to parallelise across events rather than within an event. Alternatively, one could investigate more parallelisable clustering approaches, like cellular automata or maybe even an ML inference algorithm for clustering. A stretch goal, but interesting...
I was thinking parallel over events! Which should boil down to writing some loops with KernelAbstractions.jl, maybe?
Sort of... but one would need to check whether the CPU code could actually run on a GPU to process each event. The data structures for the plain algorithm would be a lot easier and would probably accelerate well. For the tiled algorithm my gut feeling is that it would be a lot of work, as the whole thrust there is to use more complex logic to reduce the computational burden, e.g. linked lists and the whole tiling setup, so the data layouts are not at all GPU friendly. So the plain algorithm would be the one to target IMO.
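To make the "plain algorithm, one event per work-item" idea concrete, here is a minimal sketch in Python (not the package's Julia code). It assumes an anti-kt-style distance measure and E-scheme recombination, which this thread does not specify, and the names `cluster_event` and `cluster_events` are hypothetical. Each event lives in flat `px, py, pz, e` arrays with an active count, and removal is a swap with the last slot, so there are no linked lists or tiles. It also assumes every particle has non-zero transverse momentum and `e > |pz|`.

```python
import math

def cluster_event(px, py, pz, e, R=0.4):
    """Plain O(N^2)-per-step sequential recombination for ONE event.

    Assumed (not from the thread): anti-kt-style distances, E-scheme merging.
    """
    px, py, pz, e = list(px), list(py), list(pz), list(e)  # private copies
    n = len(px)          # number of active slots; slots 0..n-1 are live
    jets = []
    while n > 0:
        # Per-particle quantities for the live slots.
        inv_pt2 = [1.0 / (px[i] ** 2 + py[i] ** 2) for i in range(n)]
        rap = [0.5 * math.log((e[i] + pz[i]) / (e[i] - pz[i])) for i in range(n)]
        phi = [math.atan2(py[i], px[i]) for i in range(n)]

        # Brute-force search for the smallest distance; bj == -1 means "beam".
        best, bi, bj = math.inf, -1, -1
        for i in range(n):
            if inv_pt2[i] < best:
                best, bi, bj = inv_pt2[i], i, -1
            for j in range(i + 1, n):
                dphi = abs(phi[i] - phi[j])
                if dphi > math.pi:
                    dphi = 2.0 * math.pi - dphi
                dr2 = (rap[i] - rap[j]) ** 2 + dphi ** 2
                dij = min(inv_pt2[i], inv_pt2[j]) * dr2 / (R * R)
                if dij < best:
                    best, bi, bj = dij, i, j

        if bj < 0:
            # Beam distance is smallest: slot bi becomes a final jet.
            jets.append((px[bi], py[bi], pz[bi], e[bi]))
            gone = bi
        else:
            # Pair distance is smallest: merge bj into bi (bi < bj always).
            px[bi] += px[bj]
            py[bi] += py[bj]
            pz[bi] += pz[bj]
            e[bi] += e[bj]
            gone = bj

        # Swap-remove: overwrite the freed slot with the last live slot.
        last = n - 1
        px[gone], py[gone], pz[gone], e[gone] = px[last], py[last], pz[last], e[last]
        n -= 1
    return jets

def cluster_events(events, R=0.4):
    # Events are independent, so this outer loop is the unit of parallelism:
    # on a GPU each iteration would map to one work-item (or one block).
    return [cluster_event(px, py, pz, e, R) for (px, py, pz, e) in events]
```

The inner loop stays fully sequential within an event; only the outer loop over events would be distributed. Whether the real CPU implementation ports this directly is exactly the open question above.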
Although I should say from the outset that at the particle densities where N2Plain is typically used (Z->ee), the typical jet reconstruction time is O(10 μs), so this is not, per se, a real target for GPU running.
Is there a way to somehow make the algorithms GPU-able?