feat: Support when-then-otherwise #2258
base: main
Looks good so far! The TODOs seem to be in the right direction. Let me know if it would help to hop on a call and work on this together.
The condition column should always be bool.
Polars does implicit fallible casting in some cases, so checks to ensure the dtypes are the same (or at least the casts are infallible) are needed.
truthy: Arc::new(truthy.expr),
falsy: Arc::new(falsy.expr),
},
fill: None, // TODO
This is actually right: this is run before aggregation, so there's no empty group that needs a default.
Looking good, some comments! Thanks.
python/test/test_polars_ternary.py
Outdated
lf_domain,
dp.symmetric_distance(),
lf.select(
pl.when(pl.col("A") == 1).then(1).alias('fifty'),
Here you can allow the computation to run, but the output domain may now contain null.
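To make the nullability point concrete, here is a minimal plain-Python model of the ternary semantics. It is not the Polars API: `when_then_otherwise` and its parameters are hypothetical names, and the sketch assumes the behavior described in this comment, namely that rows failing the condition become null when no `otherwise` branch is given.

```python
from typing import Optional, Sequence


def when_then_otherwise(
    mask: Sequence[bool],
    truthy: Sequence[int],
    falsy: Optional[Sequence[int]] = None,
) -> list:
    """Plain-Python model of a ternary expression (hypothetical helper, not Polars)."""
    out = []
    for i, keep in enumerate(mask):
        if keep:
            out.append(truthy[i])
        elif falsy is not None:
            out.append(falsy[i])
        else:
            # No `otherwise` branch: the row becomes null (None here).
            out.append(None)
    return out


col_a = [1, 2, 1]
mask = [a == 1 for a in col_a]

# Mirrors pl.when(pl.col("A") == 1).then(1) with and without an otherwise branch.
without_otherwise = when_then_otherwise(mask, [1, 1, 1])
with_otherwise = when_then_otherwise(mask, [1, 1, 1], [0, 0, 0])
```

Under this model, a non-nullable input can still produce nulls, which is why the output domain has to be marked nullable whenever `otherwise` is absent.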
let (truthy_domain, _truthy_metric) = t_truthy.output_space();
let (falsy_domain, _falsy_metric) = t_falsy.output_space();
Would be good to check that the metrics match too!
let (truthy_domain, _truthy_metric) = t_truthy.output_space();
let (falsy_domain, _falsy_metric) = t_falsy.output_space();

if truthy_domain != falsy_domain {
Only need to check that dtypes match. It's ok if the names of the columns in the branch arms are different, and similarly if nullability differs between them, and so on.
This is just to prevent polars fallible casting!
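A rough sketch of the check being asked for, in plain Python rather than the library's Rust. `ColumnDomain` and `check_branch_dtypes` are hypothetical stand-ins, not the project's actual types; the only point illustrated is that the comparison looks at dtype alone and ignores column name and nullability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDomain:
    """Hypothetical stand-in for a column domain; not the library's type."""
    name: str
    dtype: str
    nullable: bool = False


def check_branch_dtypes(truthy: ColumnDomain, falsy: ColumnDomain) -> None:
    """Reject branches whose dtypes differ; ignore name and nullability."""
    if truthy.dtype != falsy.dtype:
        raise ValueError(
            f"branch dtypes must match: {truthy.dtype} != {falsy.dtype}"
        )


# Different names and nullability are fine as long as the dtype agrees.
check_branch_dtypes(
    ColumnDomain("a", "i64", nullable=False),
    ColumnDomain("b", "i64", nullable=True),
)

# A dtype mismatch is rejected instead of relying on an implicit cast.
try:
    check_branch_dtypes(ColumnDomain("a", "i64"), ColumnDomain("b", "str"))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Comparing whole domains with `!=` would be stricter than necessary, since it would also reject branches that differ only in name or nullability.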
let mut output_domain = truthy_domain.clone();
output_domain.column.drop_bounds().ok();
output_domain.column.nullable = false;
Suggested change:
- output_domain.column.nullable = false;
+ output_domain.column.nullable |= falsy_domain.column.nullable;
}

let mut output_domain = truthy_domain.clone();
output_domain.column.drop_bounds().ok();
In this case, if the truthy domain excludes NaNs but the falsy domain allows them, the resulting output domain would wrongly claim that the data contains no NaNs. Instead, let's clear those descriptors:
This is a shorthand to completely replace the element domain with the loosest descriptors.
Suggested change:
- output_domain.column.drop_bounds().ok();
+ output_domain.column.set_dtype(output_domain.column.dtype())?;
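The reasoning behind the two suggestions (OR-ing nullability, and resetting the element descriptors) can be sketched in plain Python. Everything here is a hypothetical model: `ColumnDomain`, its `nan` and `bounds` fields, and `merge_branches` are illustrative names, not the library's API. The assumption is that the "loosest descriptors" mean no bounds and NaN allowed.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ColumnDomain:
    """Hypothetical stand-in for a column domain; not the library's type."""
    dtype: str
    nullable: bool = False
    nan: bool = True  # True means NaN may be present (the loosest claim)
    bounds: Optional[Tuple[float, float]] = None


def merge_branches(truthy: ColumnDomain, falsy: ColumnDomain) -> ColumnDomain:
    """Build an output domain that makes no claim either branch could violate."""
    return replace(
        truthy,
        # Output may contain null if either branch may.
        nullable=truthy.nullable or falsy.nullable,
        # Reset element descriptors to the loosest setting.
        nan=True,
        bounds=None,
    )


truthy = ColumnDomain("f64", nullable=False, nan=False, bounds=(0.0, 1.0))
falsy = ColumnDomain("f64", nullable=True, nan=True)
merged = merge_branches(truthy, falsy)
```

Cloning the truthy domain and only dropping bounds would leave `nan=False` in place here, which is exactly the unsound claim this comment warns about.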
@@ -0,0 +1 @@
Some tests would be good to add!
Tests pass, but it was mostly copy and paste; I need to understand better what's going on.
Things to try:
- `otherwise` is missing?