CUFFT plans seem to leak GPU memory #2400

Open
david-macmahon opened this issue May 27, 2024 · 0 comments
Labels: bug Something isn't working

@david-macmahon (Contributor) commented:

The memory associated with CUFFT plans is not always reclaimed. This was a big problem with CUDA.jl v5.3.4 because plan memory was not consistently reclaimed or reused. On master (as of a2a9b13) the situation is much improved, but one plan's worth of memory still appears to leak.

Here is the behavior I'm seeing when using master (a2a9b13):

julia> using CUDA, CUDA.CUFFT

julia> CUDA.memory_status()
Effective GPU memory usage: 1.02% (163.438 MiB/15.724 GiB)
Memory pool usage: 0 bytes (0 bytes reserved)

julia> x=CUDA.zeros(2^28); CUDA.memory_status()
Effective GPU memory usage: 7.37% (1.160 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=plan_rfft(x); CUDA.memory_status()
Effective GPU memory usage: 13.81% (2.171 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=nothing; GC.gc(); CUDA.memory_status(); CUDA.reclaim(); CUDA.memory_status()
Effective GPU memory usage: 13.81% (2.171 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)
Effective GPU memory usage: 13.81% (2.171 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=plan_rfft(x); CUDA.memory_status()
Effective GPU memory usage: 20.18% (3.173 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=nothing; GC.gc(); CUDA.memory_status(); CUDA.reclaim(); CUDA.memory_status()
Effective GPU memory usage: 20.18% (3.173 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)
Effective GPU memory usage: 13.81% (2.171 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=plan_rfft(x); CUDA.memory_status()
Effective GPU memory usage: 20.18% (3.173 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

julia> p=nothing; GC.gc(); CUDA.memory_status(); CUDA.reclaim(); CUDA.memory_status()
Effective GPU memory usage: 20.18% (3.173 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)
Effective GPU memory usage: 13.81% (2.171 GiB/15.724 GiB)
Memory pool usage: 1024.000 MiB (1024.000 MiB reserved)

The data array uses 1 GiB of GPU memory. The first plan also uses 1 GiB, but that memory is not reclaimed after the plan is (presumably) GC'd. The second plan does not reuse the first plan's memory, so GPU memory usage climbs to 3 GiB, but that extra 1 GiB is reclaimed when the second plan is GC'd. The third plan behaves the same as the second.
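
For reference, a minimal loop along these lines should reproduce the steady state (sketch only, not re-run here; it uses just the calls from the transcript above, and the array size and iteration count are arbitrary). Based on the transcript, I'd expect effective memory usage to settle roughly one plan's worth (~1 GiB here) above the data-only baseline rather than growing on every iteration:

using CUDA, CUDA.CUFFT

x = CUDA.zeros(2^28)          # ~1 GiB of Float32 data
CUDA.memory_status()          # baseline: data array only

for i in 1:5
    p = plan_rfft(x)          # allocates roughly one plan's worth (~1 GiB) of work area
    p = nothing               # drop the only reference to the plan
    GC.gc()                   # let the plan's finalizer run
    CUDA.reclaim()            # hand freed memory back to the driver
    CUDA.memory_status()      # stays ~1 GiB above the baseline after the first iteration
end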

Curiously, p.handle goes back and forth between 1 and 2 with each plan creation. I'm not sure whether that's relevant, but I think the desired behavior might be to keep reusing handle 1.
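
A quick way to watch the handle values (untested sketch; it only assumes the raw cuFFT handle is reachable through the plan's handle field, as mentioned above):

using CUDA, CUDA.CUFFT

x = CUDA.zeros(2^28)
for i in 1:4
    p = plan_rfft(x)
    println("iteration $i: p.handle = ", p.handle)   # alternates between 1 and 2 here
    p = nothing
    GC.gc(); CUDA.reclaim()                          # release the plan before the next iteration
end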

david-macmahon added the bug label on May 27, 2024