-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable CPU execution of IncrementalPCA #6254
base: branch-25.02
Are you sure you want to change the base?
Conversation
from cuml.internals.safe_imports import gpu_only_import | ||
|
||
cp = gpu_only_import("cupy") | ||
|
||
|
||
def _create_rs_generator(random_state): | ||
""" | ||
This is a utility function that returns an instance of CuPy RandomState | ||
This is a utility function that returns an instance of CuPy/numpy | ||
RandomState depending on the current globally-selected device type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RandomState depending on the current globally-selected device type | |
RandomState depending on the current globally-selected device type | |
return random_state | ||
else: | ||
raise ValueError("random_state type must be int or CuPy RandomState") | ||
raise ValueError("random_state type must be int or RandomState") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it useful to output the actual type of the thing that is of the wrong type in the error message. Makes it simpler to figure out what is wrong as a user (because presumably they think they are passing the right thing, but it turns out they aren't).
return cp.random.RandomState(seed=random_state) | ||
elif isinstance(random_state, cp.random.RandomState): | ||
return GlobalSettings().xpy.random.RandomState(seed=random_state) | ||
elif isinstance(random_state, GlobalSettings().xpy.random.RandomState): | ||
return random_state | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the global settings thing well enough (and this is an internal function), but I think it can happen that xpy
is cupy and the caller passes in a numpy random state. Maybe useful to have an explicit case for this to tell the caller that "yes you passed a random state, but the wrong kind"
@@ -195,6 +200,9 @@ class IncrementalPCA(PCA): | |||
0.0037122774558343763 | |||
""" | |||
|
|||
_cpu_estimator_import_path = "sklearn.decomposition.IncrementalPCA" | |||
|
|||
@device_interop_preparation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education, why do we need this decorator? All it seems to do is process the constructor arguments to remove parameters that aren't arguments to the constructor. But why do we need this? What is the use-case where we want to allow someone to call the constructor with invalid keyword arguments and not error?
if sparse_format == "csc": | ||
pytest.skip( | ||
"cupyx.scipy.sparse.csc.csc_matrix does not support" | ||
" indexing as of cupy 7.6.0" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we remove the cases where sparse_format == "csc"
from the parametrisation? And then also remove this skipping. One less bit of noise in the test output log :D
At least it sounds like that indexing has been remove in an old cupy version -> it ain't coming back any time soon
reco_true = cp.dot(u_true * s, v_true) | ||
u_false, v_false = _svd_flip(u, v, u_based_decision=False) | ||
reco_false = cp.dot(u_false * s, v_false) | ||
@pytest.mark.parametrize("device", ["gpu", "cpu"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make a fixture that we can use in all tests that does this. That way we only have one place to change if we want to add a new device or select a subset of the devices, etc
Allow IncrementalPCA to run on CPU.
This PR depends on #6253 and should not be merged until after 6253 is merged. It is otherwise ready for review.