This repository has been archived by the owner on Nov 16, 2023. It is now read-only.
Hi, I'm trying to optimize reading from Parquet files as outlined here: https://github.com/segmentio/parquet-go#optimizing-reads
I am using a schema-less reading approach, so there are no data classes or schemas defined in my application. I am trying to read columns (pages) of data in the form of Go slices, e.g. []float64.
My problem is this: in my Parquet file, all columns are defined as optional, so I get a *parquet.optionalPageValues when reading a page. This type does not implement DoubleReader, and there does not seem to be any way to get at the underlying "base" ValueReader (which presumably is a DoubleReader). So at present, it does not seem possible to use optimized reads directly into Go slices for optional columns.
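To make the situation concrete, here is a minimal, self-contained sketch of what I think is happening. It does not use parquet-go at all: `Value`, `ValueReader`, `DoubleReader`, `requiredPage`, `optionalPage` and `readDoubles` are all stand-ins I made up to model the behaviour, and the real library types may well differ. The point is only that a wrapper which adds definition levels and exposes nothing but the generic reader hides the typed fast path, so the type assertion fails and the caller has to fall back to reading boxed values.

```go
package main

import "fmt"

// Value is a stand-in for a generic boxed value; IsNull marks a missing
// optional value. (Hypothetical type, not the parquet-go one.)
type Value struct {
	Double float64
	IsNull bool
}

// ValueReader models the generic, boxed read path.
type ValueReader interface {
	ReadValues(dst []Value) int
}

// DoubleReader models the typed fast path that fills a []float64 directly.
type DoubleReader interface {
	ReadDoubles(dst []float64) int
}

// requiredPage models a page of a required column: dense values, both paths.
type requiredPage struct {
	values []float64
}

func (p *requiredPage) ReadDoubles(dst []float64) int { return copy(dst, p.values) }

func (p *requiredPage) ReadValues(dst []Value) int {
	n := 0
	for n < len(dst) && n < len(p.values) {
		dst[n] = Value{Double: p.values[n]}
		n++
	}
	return n
}

// optionalPage wraps a dense base page and adds definition levels
// (1 = value present, 0 = null). It only exposes the generic path.
type optionalPage struct {
	base             *requiredPage
	definitionLevels []byte
}

func (p *optionalPage) ReadValues(dst []Value) int {
	n, next := 0, 0
	for n < len(dst) && n < len(p.definitionLevels) {
		if p.definitionLevels[n] == 0 {
			dst[n] = Value{IsNull: true}
		} else {
			dst[n] = Value{Double: p.base.values[next]}
			next++
		}
		n++
	}
	return n
}

// readDoubles tries the typed fast path first and otherwise falls back to
// boxed values, dropping nulls. The second result reports which path ran.
func readDoubles(r ValueReader, numValues int) ([]float64, bool) {
	if dr, ok := r.(DoubleReader); ok {
		out := make([]float64, numValues)
		return out[:dr.ReadDoubles(out)], true
	}
	values := make([]Value, numValues)
	n := r.ReadValues(values)
	out := make([]float64, 0, n)
	for _, v := range values[:n] {
		if !v.IsNull {
			out = append(out, v.Double)
		}
	}
	return out, false
}

func main() {
	req := &requiredPage{values: []float64{1.5, 2.5, 3.5}}
	opt := &optionalPage{base: req, definitionLevels: []byte{1, 0, 1, 1}}

	fmt.Println(readDoubles(req, 3)) // typed fast path
	fmt.Println(readDoubles(opt, 4)) // falls back to boxed values
}
```

In this model the dense values are still sitting inside the wrapper, which is why I would expect some way to reach them without going through the boxed path.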
Are there any overrides that could be used, for example to ignore that the columns are optional in parquet?
Many thanks.
Does the same issue apply to optimizing writes? It seems that it's not possible to write optional fields even with parquet.Value, since the underlying type of the ColumnBuffers for an int64 column is []int64 rather than []*int64.