Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow? #21

Open
CR7273868 opened this issue Sep 10, 2024 · 7 comments
Open

Stack overflow? #21

CR7273868 opened this issue Sep 10, 2024 · 7 comments

Comments

@CR7273868
Copy link

I found this minimal example to be able to cause a stack overflow. It only happens when the destructor is called of async drop within my project it just stack overflows and aborts the program.

Do note these supervisors are stored into a DashMap that's properly cleared out. So, async-drop is called right after the program enters it's exiting stage.

I just clear the map, after awaiting a tokio::signal::ctrl_c().await signal. I clear the map, the async-drop gets called & then async-dropper panics with:

thread 'tokio-runtime-worker' has overflowed its stack
fatal runtime error: stack overflow
Aborted

I use the derive variant.

Rust ENV:

Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye
rustc 1.76.0 (07dca489a 2024-02-04) (STABLE)
#[derive(Default, async_dropper::AsyncDrop)]
pub struct Supervisor {
    pub socket_bot_id: uuid::Uuid,
    pub room_id: String,
    pub private_room: std::sync::atomic::AtomicBool,
    pub state: std::sync::Weak<crate::state::State>,
}
#[async_trait]
impl async_dropper::AsyncDrop for Supervisor {
    async fn async_drop(&mut self) -> Result<(), async_dropper::AsyncDropError> {
        println!("Dropping supervisor: {:?}", self.socket_bot_id);
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
        println!("Dropped supervisor: {:?}", self.socket_bot_id);
        Ok(())
    }
    fn drop_timeout(&self) -> std::time::Duration {
        std::time::Duration::from_secs(5) // extended from default 3 seconds, as an example
    }
}
@t3hmrman
Copy link
Owner

t3hmrman commented Sep 11, 2024

Hey @CR7273868 Thanks for the issue and minimal repro!

If you implement async_drop, you need to return the Supervisor to the state where it is identical to the output of a Supervisor::default() -- the way that the derived version checks is that it compares the object after calling async_drop() to the default instance of itself.

Just a guess here -- but can you confirm that the Supervisor is equivalent to a default instance? I'd imagine that's where the issue is coming from -- what stops the infinite loop of attempting to drop is equivalence to the default instance.

The code should have panic'd if it wasn't equivalent, but you could try overriding the reset implementation as a test before I get to the bottom of this. An easy (although somewhat clunky) fix for anything that doesn't Default well is to wrap it in an Option<T>.

@CR7273868
Copy link
Author

So I either call reset? Since, async-dropper relies on Default. I will do a std::mem::take(&mut T); to reset it. And I’ll try diagnosing further with the reset impl.

@t3hmrman
Copy link
Owner

Ah I wasn't clear here -- actually async-dropper is doing the Default and std::mem::take. I'm wondering if it's possibly that the default generated reset() doesn't work for the Supervisor object you have...

If you override reset() you can do a check yourself inside there and see what is happening/debug directly.

@CR7273868
Copy link
Author

I must also say, this only happens when the program exits. If the code keeps running, it behaves as expected. I forgot to mention this. Try to replica using tokio signal ctrl_c.

I will do your mentioned steps tonight, when I get home.

@t3hmrman
Copy link
Owner

t3hmrman commented Sep 11, 2024

I must also say, this only happens when the program exits. If the code keeps running, it behaves as expected. I forgot to mention this. Try to replica using tokio signal ctrl_c.

OH! this is really interesting/important information -- yeah will try and reproduce this -- I assume you have a [tokio::main] somewhere. Are you explicitly calling drop in the task that is waiting for the ctrl_c signal?

@CR7273868
Copy link
Author

Yes, I have tokio::main macro with an async fn main();

Yes, I prepare a DashMap (https://docs.rs/dashmap/latest/dashmap/) within there uuid::Uuid as key & Arc inside another struct that’s called SelfBot (does not implement async-dropper).

When signal ctrlc hits, I perform a clean up. I cancel the self_bot tokio task handle, then clear the dashmap that have the entries that implement: Default, async_dropper:AsyncDrop.

when those values get dropped, that’s when it starts to panic with a tokio runtime worker, stack overflow.

@t3hmrman
Copy link
Owner

t3hmrman commented Sep 21, 2024

Hey @CR7273868 I'm having a hard time getting your example to compile -- the PartialEq and Eq traits aren't present for Weak and AtomicBool, and AsyncDrop requires those -- how did you work around this? Did you wrap those in something that can implement PartialEq/Eq?

I've tried to replicate here:

#22

Please let me know if I'm missing something -- it doesn't compile, so now I'm wondering how you even got to that state!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants