ENH: enable skipna on groupby reduction ops #43671
Thanks for reheating this @geoffrey-eisenbarth! Might not have much time this week, but will look deeper at the cython and testing when I can. For now, I'd recommend fixing up the easier issues like With respect to testing, I'd take a look at the tests we have for groupby ops which do accept Also, until we add tests, probably best to put the pr in draft state so others know it's not fully ready for review
Just some initial syntax comments, will take a more thorough look at logic when I can
pandas/_libs/groupby.pyx
Outdated
@@ -709,6 +723,11 @@ def group_mean(floating[:, ::1] out,
t = sumx[lab, j] + y
compensation[lab, j] = t - sumx[lab, j] - y
sumx[lab, j] = t
# don't skip nan
elif skipna == False:
This skipna is missing from the function arguments, causing a build failure.
Also should be elif not skipna
pandas/_libs/groupby.pyx
Outdated
@@ -603,6 +612,10 @@ def group_prod(floating[:, ::1] out,
if val == val:
nobs[lab, j] += 1
prodx[lab, j] *= val
# don't skip nan
elif skipna == False:
Suggested change:
- elif skipna == False:
+ elif not skipna:
pandas/_libs/groupby.pyx
Outdated
@@ -555,6 +559,10 @@ def group_add(add_t[:, ::1] out,
t = sumx[lab, j] + y
compensation[lab, j] = t - sumx[lab, j] - y
sumx[lab, j] = t
# don't skip nan
elif skipna == False:
Suggested change:
- elif skipna == False:
+ elif not skipna:
pandas/_libs/groupby.pyx
Outdated
@@ -530,6 +531,9 @@ def group_add(add_t[:, ::1] out,
else:
t = sumx[lab, j] + val
sumx[lab, j] = t
elif skipna == False:
# NOTE: Does this case need to be considered?
Yes. If skipna is False and not checknull(val) (L524 above), then sumx[lab, j] needs to be incremented by val (so it will either become NaN or raise).
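To make the intended semantics concrete, here is a minimal pure-Python sketch of a per-group, Kahan-compensated sum with a skipna flag. It is not the pandas implementation: the function name group_sum, the one-dimensional inputs, and the omission of nobs and min_count handling are all simplifying assumptions for illustration, and it covers only the float case rather than the object branch discussed above.

```python
import math


def group_sum(values, labels, ngroups, skipna=True):
    """Hypothetical stand-in for a groupby sum kernel (float values only).

    values  : iterable of floats, possibly containing NaN
    labels  : group index for each value; negative means "no group"
    ngroups : number of groups
    """
    sumx = [0.0] * ngroups
    compensation = [0.0] * ngroups  # running Kahan correction per group

    for val, lab in zip(values, labels):
        if lab < 0:
            continue  # row does not belong to any group
        if not math.isnan(val):
            # Kahan summation step, mirroring the hunks quoted in the review
            y = val - compensation[lab]
            t = sumx[lab] + y
            compensation[lab] = t - sumx[lab] - y
            sumx[lab] = t
        elif not skipna:
            # don't skip NaN: adding it makes the group's sum NaN, and every
            # later addition to that group stays NaN
            sumx[lab] += val
    return sumx


nan = float("nan")
kept = group_sum([1.0, nan, 2.0, 3.0], [0, 0, 0, 1], 2, skipna=False)
skipped = group_sum([1.0, nan, 2.0, 3.0], [0, 0, 0, 1], 2, skipna=True)
```

With skipna=False the first group's result is NaN while the second group is unaffected; with skipna=True the NaN is ignored and both groups sum to 3.0.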
Is CircleCI a new addition to the pandas contribution flow? I don't recall running into this when I worked on PR 41321. When I click on the details, it says I need a configuration specified for this project. Is that something I need to do on my own?
If you merge master it should pick up the config and fix this failure
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.
@geoffrey-eisenbarth planning to pick this back up? it looked promising
@jbrockmendel I was hoping to, since I rely on the old |
No sweat, totally reasonable to have higher priorities under the circumstances. Not sure about the circleci bit; oftentimes I re-push and hope for CI issues to resolve themselves.
Hello @geoffrey-eisenbarth! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
would likely take this, but needs to be passing and fully code-reviewed. closing as stale.
@mzeitlin11 Any help you can provide on what tests I should add, or where to start digging for similar GroupBy tests, would be appreciated. As stated in PR #41399, I'm not super familiar with Cython, so I'm hoping that the previous contributor's (stale) PR is close to what's needed (I suppose the tests will tell us). Thanks!