Skip to content
This repository has been archived by the owner on May 21, 2022. It is now read-only.

Propose some changes to DataFrames.jl API #68

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

bkamins
Copy link

@bkamins bkamins commented Sep 14, 2021

I will comment on the rationale of the changes inline (as I might not understand all the details of the intended design).

@@ -13,7 +13,7 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

[compat]
DataFrames = "0.19, 0.20, 0.21, 0.22, 1"
DataFrames = "0.22, 1"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.22 introduces only

@@ -6,7 +6,7 @@ LearnBase.getobs(dt::AbstractDataFrame, idx) = dt[idx,:]

LearnBase.nobs(dt::DataFrameRow) = 1 # it is a observation
function LearnBase.getobs(dt::DataFrameRow, idx)
idx == 1:1 || throw(ArgumentError(
only(idx) == 1 || throw(ArgumentError(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that idx can be any collection as long as it has a single element equal to 1. Right?

@@ -23,9 +23,12 @@ LearnBase.gettarget(::typeof(identity), dt::DataFrameRow) =
_throw_table_error()

# convenience syntax to allow column name
LearnBase.gettarget(col::Symbol, dt::AbstractDataFrame) = dt[1, col]
LearnBase.gettarget(col::Symbol, dt::AbstractDataFrame) =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that for this to work we should make sure that the data frame has only one row. Right?

Also the question is why do we only allow Symbol for col. Also integer or string index is allowed in DataFrames.jl.

LearnBase.gettarget(col::Symbol, dt::DataFrameRow) = dt[col]
LearnBase.gettarget(fun, dt::AbstractDataFrame) = fun(dt)
LearnBase.gettarget(fun, dt::AbstractDataFrame) =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the intention correctly the fun form should also normally take a DataFrameRow as an argument. Right? This change is breaking (and I might have misunderstood the intention here).

@bkamins
Copy link
Author

bkamins commented Sep 14, 2021

CI is failing because only is not defined in Julia 1.0 but this can be fixed later after we confirm the design intention.

@johnnychen94
Copy link
Member

Looks okay to me. Since I don't use DF very much, @oxinabox how does this look to you?

@oxinabox
Copy link
Member

Seems right to me though I haven't looked at LearnBase in literally years, and doing #46 seems like a better idea, but this is a net improvement.

@bkamins
Copy link
Author

bkamins commented Sep 18, 2021

Ah - I have not seen #46. Indeed this PR should be closed if someone is working on #46.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants