Allow to temporarily set the current registry even if it is not associated with a worker thread #1166
Conversation
While the change to temporarily stash a reference to a "foreign" registry in a TLS variable appears to pass the test suite, I am admittedly not confident about the implications of having a current registry that is not associated with a worker thread in the first place. I am also unsure about the performance implications of having the TLS access in

@awused Could you give this a try to see whether it would work for your use case at least?
But then again, this should be fine: for example, the main thread is always in this relation w.r.t. the global pool, right?
Here are the tests I added:

```rust
#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.scope(|_scope| {
        let (name_send, name_recv) = channel();
        let v = [0; 1];
        v.par_iter().for_each(|_| {
            let name = thread::current().name().map(ToOwned::to_owned);
            name_send.send(name).unwrap();
        });
        let name = name_recv.recv().unwrap();
        assert_eq!(name.as_deref(), Some("worker"));
    });
}

#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn in_place_scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.in_place_scope(|_scope| {
        let (name_send, name_recv) = channel();
        let v = [0; 1];
        v.par_iter().for_each(|_| {
            let name = thread::current().name().map(ToOwned::to_owned);
            name_send.send(name).unwrap();
        });
        let name = name_recv.recv().unwrap();
        assert_eq!(name.as_deref(), Some("worker"));
    });
}
```
As you can infer from the assertion failure, the second does not pass because it assumes that all work would end up on the worker threads. For the parallel iterators (and indeed for any join-based interface) this will not be the case: some of the work can be executed directly on the main thread (which is the test thread in this case). The test as written does not check which worker pool ends up being used. Please have a look at the tests I added here which use the (global)

However, extending the tests to use more work and check for either the main thread or the worker threads still fails. So while the original test did not check this, the work does not seem to end up on the right pool after all. Will investigate...
36e5d84
to
301b603
Compare
I was missing more direct usages of

```rust
#[test]
#[cfg_attr(any(target_os = "emscripten", target_family = "wasm"), ignore)]
fn in_place_scope_par_iter_which_pool() {
    let pool = ThreadPoolBuilder::new()
        .num_threads(1)
        .thread_name(|_| "worker".to_owned())
        .build()
        .unwrap();

    // Determine which pool is currently installed here
    // by checking the thread name seen by spawned work items.
    pool.in_place_scope(|_scope| {
        let (name_send, name_recv) = std::sync::mpsc::channel();
        let v = [0; 128];
        v.par_iter().for_each(|_| {
            let name = std::thread::current().name().map(ToOwned::to_owned);
            name_send.send(name).unwrap();
        });
        drop(name_send);
        for name in name_recv {
            let name = name.unwrap();
            assert!(name.contains("in_place_scope_par_iter_which_pool") || name == "worker");
        }
    });
}
```

which does end up submitting work into the pool. But I think I would still prefer to have a more targeted test case using
That is the entire bug report in #1165. The documentation is unclear on this point and makes it sound like it will be the case, which is why I suggested updating the documentation for clarity.
Huh, I guess it can do that; I have no idea what is different between my code and this test.
Except for asserting that multiple names are one of two choices, the only real difference in the workload is that you used a vector of length one whereas I used one of length 128, to ensure that some of the work would end up on the worker threads (having
I tend to disagree. I think the bug here is that when work is dispatched onto worker threads via global interfaces like
I was running it with enough items (I even tried adding a sleep to make sure all available threads were finding work). These asserts all pass when run against the current rayon release; with this PR, the second run should assert "outer" instead.

```rust
ThreadPoolBuilder::new()
    .thread_name(|u| format!("global"))
    .build_global()
    .unwrap();

let stuff = vec![0; 50000];
stuff.par_iter().for_each(|_| {
    assert_eq!(std::thread::current().name(), Some("global"));
    std::thread::sleep(Duration::from_millis(1));
});

let outer_pool = ThreadPoolBuilder::new()
    .thread_name(|u| "outer".to_owned())
    .build()
    .unwrap();
outer_pool.in_place_scope(|_| {
    stuff.par_iter().for_each(|_| {
        assert_eq!(std::thread::current().name(), Some("global"));
        std::thread::sleep(Duration::from_millis(1));
    })
});
```

I still can't figure out what exactly is different with the test being run compared to this code. Based on that test, my code should fail.
Exactly, and with the currently pushed version, the code above does indeed fail with
I think I lost you on which test code we are talking about exactly. At least for the code posted in #1166 (comment), the problem was the vector length of one, which meant there was no splitting at all and the single invocation happened directly on the main/test thread. Using more work meant
In the end, I don't think it matters much really; it's a tangential issue that shouldn't make a material difference in program execution, since the calling thread is still blocked until the parallel iterator ends anyway. This PR does seem to address the issue.
Reproducing the issue in #1165.
Will look into whether this can be changed without introducing deadlocks...