
Allow to temporarily set the current registry even if it is not associated with a worker thread #1166

Open · wants to merge 3 commits into base: main
Conversation

adamreichold (Collaborator)

Reproducing the issue in #1165.

Will look into whether this can be changed without introducing deadlocks...

@adamreichold (Collaborator, Author)

Will look into whether this can be changed without introducing deadlocks...

While the change to temporarily stash a reference to a "foreign" registry in a TLS variable appears to pass the test suite, I am admittedly not confident about the implications of having such a current registry that is not associated with a worker thread in the first place.

I am also unsure about the performance implications of having the TLS access in Registry::current. I did try to mitigate this by at least also using it to fetch the global registry instead of doing that in a separate step, though.
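To illustrate the approach being discussed, here is a minimal, self-contained sketch of stashing a "foreign" registry in a TLS variable and using the same TLS access to fall back to the global registry. The `Registry` struct and the names `current_registry` and `with_current` are stand-ins for illustration, not rayon-core's actual API:

```rust
use std::cell::RefCell;
use std::sync::{Arc, OnceLock};

// Hypothetical stand-in for rayon-core's Registry type.
struct Registry {
    name: &'static str,
}

// The global registry, created lazily on first use.
static GLOBAL: OnceLock<Arc<Registry>> = OnceLock::new();

thread_local! {
    // Stash for a "foreign" registry temporarily made current on this
    // thread, e.g. for the duration of an in_place_scope call.
    static CURRENT: RefCell<Option<Arc<Registry>>> = RefCell::new(None);
}

fn global_registry() -> Arc<Registry> {
    Arc::clone(GLOBAL.get_or_init(|| Arc::new(Registry { name: "global" })))
}

// Registry::current analogue: a single TLS access yields either the
// stashed registry or falls back to the global one.
fn current_registry() -> Arc<Registry> {
    CURRENT
        .with(|cell| cell.borrow().clone())
        .unwrap_or_else(global_registry)
}

// in_place_scope analogue: temporarily install `registry` as current,
// restoring the previous value (usually None) on exit. (A real
// implementation would restore on panic too, e.g. via a drop guard.)
fn with_current<R>(registry: Arc<Registry>, f: impl FnOnce() -> R) -> R {
    let prev = CURRENT.with(|cell| cell.borrow_mut().replace(registry));
    let result = f();
    CURRENT.with(|cell| *cell.borrow_mut() = prev);
    result
}

fn main() {
    assert_eq!(current_registry().name, "global");
    let pool = Arc::new(Registry { name: "pool" });
    assert_eq!(with_current(pool, || current_registry().name), "pool");
    assert_eq!(current_registry().name, "global");
    println!("ok");
}
```

The single `CURRENT.with` per lookup is what keeps the added cost to one TLS access on the hot path.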

@awused Could you give this a try to see whether it works for your use case, at least?

@adamreichold (Collaborator, Author)

I am admittedly not confident about the implications of having such a current registry that is not associated with a worker thread in the first place.

But then again, this should be fine as, for example, the main thread is always in this relation w.r.t. the global pool, right?

@adamreichold adamreichold changed the title Add test case demonstrating that the global pool is used to spawn work from within in_place_scope. Allow to temporarily set the current registry even if it is not associated with a worker thread May 12, 2024
@adamreichold adamreichold marked this pull request as ready for review May 12, 2024 09:30
@awused commented May 12, 2024

@awused Could you give this a try to see whether it works for your use case, at least?

Here are tests I added to iter/test.rs. Based on the documentation, I initially would have expected both of these tests to pass, but I don't think their outcomes have changed with this PR.

#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.scope(|_scope| {
        let (name_send, name_recv) = channel();

        let v = [0; 1];

        v.par_iter().for_each(|_| {
            let name = thread::current().name().map(ToOwned::to_owned);

            name_send.send(name).unwrap();
        });

        let name = name_recv.recv().unwrap();

        assert_eq!(name.as_deref(), Some("worker"));
    });
}

#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn in_place_scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.in_place_scope(|_scope| {
        let (name_send, name_recv) = channel();

        let v = [0; 1];

        v.par_iter().for_each(|_| {
            let name = thread::current().name().map(ToOwned::to_owned);

            name_send.send(name).unwrap();
        });

        let name = name_recv.recv().unwrap();

        assert_eq!(name.as_deref(), Some("worker"));
    });
}

@adamreichold (Collaborator, Author)

As you can infer from the assertion failure

---- iter::test::in_place_scope_par_iter_which_pool stdout ----
thread 'iter::test::in_place_scope_par_iter_which_pool' panicked at src/iter/test.rs:2387:9:
assertion `left == right` failed
  left: Some("iter::test::in_place_scope_par_iter_which_pool")
 right: Some("worker")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

the second test does not pass because it assumes that all work ends up on the worker threads. For parallel iterators (or rather for any join-based interface) this is not the case: some of the work can be executed directly on the main thread (which is the test thread here).
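The point about join-based interfaces can be sketched with std threads alone. This is an illustration of the general `join(a, b)` pattern, not rayon's actual implementation (rayon's join offers `b` to idle workers via work-stealing and the caller may steal it back), but it shows why the first branch always runs on the calling thread:

```rust
use std::thread;

fn main() {
    let caller = thread::current().id();

    // Sketch of the join(a, b) pattern: the first closure runs inline on
    // the calling thread while the second is offered to another thread.
    let (id_a, id_b) = thread::scope(|s| {
        let hb = s.spawn(|| thread::current().id());
        let id_a = thread::current().id(); // "a" executes on the caller
        (id_a, hb.join().unwrap())
    });

    assert_eq!(id_a, caller); // a ran on the calling thread
    assert_ne!(id_b, caller); // b ran elsewhere
    println!("ok");
}
```

Since parallel iterators bottom out in join, any splitting scheme built on it will execute part of the work on whichever thread drives the iterator.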

The test as written does not check which worker pool ends up being used. Please have a look at the tests I added here, which use the (global) spawn function instead of the parallel iterators for exactly this reason.

@adamreichold (Collaborator, Author)

However, extending the tests to use more work and to check for either the main thread or the worker threads still fails. So while the original test did not check this, the work does not seem to end up on the right pool after all. Will investigate...

@adamreichold (Collaborator, Author)

I was missing more direct usages of global_registry, especially when called via join, and now both tests pass. Note that I used the following version of your second test case:

#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn in_place_scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.in_place_scope(|_scope| {
        let (name_send, name_recv) = std::sync::mpsc::channel();

        let v = [0; 128];

        v.par_iter().for_each(|_| {
            let name = std::thread::current().name().map(ToOwned::to_owned);

            name_send.send(name).unwrap();
        });

        drop(name_send);

        for name in name_recv {
            let name = name.unwrap();
            assert!(name.contains("in_place_scope_par_iter_which_pool") || name == "worker");
        }
    });
}

which does end up submitting work into the pool. But I think I would still prefer a more targeted test case using join in rayon-core. Will work on that...

@awused commented May 12, 2024

the second does not pass because it makes the assumption that all work would end up on the worker threads, but for the parallel iterators (and rather for any join-based interface) this will not be the case

That is the entire bug report in #1165. The documentation is unclear on this point and makes it sound like it will be the case, which is why I suggested updating the documentation for clarity.

some of the work can be executed directly on the main thread (which is the test thread in this case).

I don't think that's the case; at least, the documentation for rayon as a whole doesn't state that, and I've never observed a parallel iterator using the current thread unless it's already in a rayon threadpool. I would only expect that behaviour if I deliberately set use_current_thread on the threadpool when it was built. If I run a bare par_iter with no scope/install/etc., I never see work execute on the main thread.

Huh, I guess it can do that, I have no idea what is different between my code and this test.

@adamreichold (Collaborator, Author)

Huh, I guess it can do that, I have no idea what is different between my code and this test.

Except for asserting that each name is one of two choices, the only real difference in the workload is that you used a vector of length one whereas I used one of length 128 to ensure that some of the work would end up on the worker threads (which have None as their thread names for the default global pool).
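Why the vector length matters can be sketched with a toy splitter. This is a simplification (rayon's splitting is adaptive, driven by demand from idle threads), but it shows that a length-1 input never splits and therefore runs entirely wherever the caller is:

```rust
// Toy model of parallel-iterator splitting: a range is halved
// recursively down to some minimum chunk size. A length-1 input
// produces exactly one chunk, i.e. no splitting at all.
fn split_count(len: usize, min: usize) -> usize {
    if len <= min {
        1 // executed as a single sequential chunk
    } else {
        let half = len / 2;
        split_count(half, min) + split_count(len - half, min)
    }
}

fn main() {
    assert_eq!(split_count(1, 1), 1); // length 1: single chunk, stays on the caller
    assert!(split_count(128, 1) > 1); // length 128: many chunks, some land on workers
    println!("ok");
}
```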

@adamreichold (Collaborator, Author)

That is the entire bug report in #1165.

I tend to disagree. I think the bug here is that when work is dispatched onto worker threads via global interfaces like spawn and join (and thereby ParallelIterator), it ends up on the global thread pool and not on the one executing the scope, i.e. there is an inconsistency between scope.join and crate::join which this PR tries to remove.

@awused commented May 12, 2024

Huh, I guess it can do that, I have no idea what is different between my code and this test.

Except for asserting that each name is one of two choices, the only real difference in the workload is that you used a vector of length one whereas I used one of length 128 to ensure that some of the work would end up on the worker threads (which have None as their thread names for the default global pool).

I was running it with enough items (I even tried adding a sleep to make sure all available threads were finding work). These asserts all pass when run against the current rayon release; with this PR, the second run should assert "outer" instead.

    ThreadPoolBuilder::new()
        .thread_name(|_| "global".to_owned())
        .build_global()
        .unwrap();

    let stuff = vec![0; 50000];
    stuff.par_iter().for_each(|_| {
        assert_eq!(std::thread::current().name(), Some("global"));
        std::thread::sleep(Duration::from_millis(1));
    });

    let outer_pool = ThreadPoolBuilder::new()
        .thread_name(|_| "outer".to_owned())
        .build()
        .unwrap();

    outer_pool.in_place_scope(|_| {
        stuff.par_iter().for_each(|_| {
            assert_eq!(std::thread::current().name(), Some("global"));
            std::thread::sleep(Duration::from_millis(1));
        })
    });

I still can't figure out what exactly is different with the test being run compared to this code. Based on that test, my code should fail.

@adamreichold (Collaborator, Author)

with this PR the second run should assert "outer" instead.

Exactly, and with the currently pushed version, the code above does indeed fail with

thread 'outer' panicked at src/iter/test.rs:2352:13:
assertion `left == right` failed
  left: Some("outer")
 right: Some("global")

I still can't figure out what exactly is different with the test being run compared to this code. Based on that test, my code should fail.

I think I lost you on which test code we are talking about exactly. At least for the code posted in #1166 (comment), the problem was the vector length of one, which meant there was no splitting at all and the single invocation happened directly on the main/test thread. Using more work meant join was called at least once, which ended up on the global thread pool, i.e. had None as its thread name.

@awused commented May 12, 2024

I think I lost you on which test code we are talking about exactly.

In the end I don't think it matters much; it's a tangential issue that shouldn't make a material difference in program execution, since the calling thread is still blocked until the parallel iterator ends anyway.

This PR does seem to address the issue.
