Categorical/enum datatypes cannot be parsed in polars #15281

Open
mjclarke94 opened this issue Mar 10, 2025 · 3 comments · May be fixed by #15292
Labels
dataframe Work related to the polars dataframe implementation needs-triage An issue that hasn't had any proper look

Comments

@mjclarke94

Describe the bug

Nu cannot collect polars dataframes which have categorical datatypes.

How to reproduce

  1. Use polars to write a simple parquet file with a categorical data type
import polars as pl

pl.DataFrame({"foo": ['a', 'b']}, schema_overrides={'foo': pl.Categorical}).write_parquet('example.parquet')
  2. Attempt to read the file
❯ polars open example.parquet | polars collect
Error:
  × Error creating Dataframe
  help: Value not supported in nushell: cat
  3. Show that the file is valid by casting the column to a string
❯ polars open example.parquet | polars cast str foo | polars collect
╭───┬─────╮
│ # │ foo │
├───┼─────┤
│ 0 │ a   │
│ 1 │ b   │
╰───┴─────╯

Expected behavior

Nu dataframes should be able to handle categorical data types and other types that are part of the parquet standard/native polars data types.

Configuration

| key | value |
| --- | --- |
| version | 0.101.0 |
| major | 0 |
| minor | 101 |
| patch | 0 |
| branch | |
| commit_hash | |
| build_os | macos-aarch64 |
| build_target | aarch64-apple-darwin |
| rust_version | rustc 1.83.0 (90b35a623 2024-11-26) (Homebrew) |
| cargo_version | cargo 1.83.0 |
| build_time | 2024-12-22 14:10:19 +00:00 |
| build_rust_channel | release |
| allocator | mimalloc |
| features | default, sqlite, trash |
| installed_plugins | polars 0.101.0 |
mjclarke94 added the needs-triage label Mar 10, 2025
fdncred added the dataframe label Mar 10, 2025
@fdncred
Contributor

fdncred commented Mar 10, 2025

@ayax79 when you get a minute, it would be good to get your take on this.

@ayax79
Contributor

ayax79 commented Mar 10, 2025

It looks like I will need to add the "dtype-categorical" feature to polars and add nushell conversion. I'll try to get to this this week.
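
For readers unfamiliar with categorical columns, here is a minimal sketch of what converting one to plain strings amounts to, assuming the usual dictionary encoding (a table of distinct strings plus one integer code per row). The names `CategoricalColumn`, `encode`, and `to_strings` are hypothetical and purely illustrative; this is not the plugin's Rust code and not necessarily the approach taken in the linked pull request.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CategoricalColumn:
    """Hypothetical stand-in for a dictionary-encoded column."""
    categories: List[str]        # distinct values, in order of first appearance
    codes: List[Optional[int]]   # one index into `categories` per row; None is a null


def encode(values: List[Optional[str]]) -> CategoricalColumn:
    """Build the codes/categories pair from plain strings."""
    categories: List[str] = []
    index: Dict[str, int] = {}
    codes: List[Optional[int]] = []
    for value in values:
        if value is None:
            codes.append(None)
            continue
        if value not in index:
            index[value] = len(categories)
            categories.append(value)
        codes.append(index[value])
    return CategoricalColumn(categories, codes)


def to_strings(column: CategoricalColumn) -> List[Optional[str]]:
    """Map each code back to its string, which is what a cast to str does conceptually."""
    return [None if code is None else column.categories[code] for code in column.codes]


column = encode(["a", "b", "a", None])
print(column.codes)        # [0, 1, 0, None]
print(to_strings(column))  # ['a', 'b', 'a', None]
```

Under this assumption, the `polars cast str foo` workaround from the report performs the same lookup before the data reaches Nu, which would explain why the cast column collects without error.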

ayax79 linked a pull request that will close this issue Mar 11, 2025
@ayax79
Contributor

ayax79 commented Mar 11, 2025

Fixed via #15292

I will provide the ability to write categorical and enum data later.
