Replace file readers with ingest library's read_df #749

Open
znatty22 opened this issue Sep 21, 2021 · 0 comments
Labels: bug (Something isn't working)

znatty22 (Member) commented Sep 21, 2021

We need to replace the individual file readers in creator.analyses.analyzer with the ingest library's read_df method.

The ingest library's read_df can already read every file format covered by the individual read functions in creator.analyses.analyzer, so those read functions are redundant. More importantly, read_df automatically detects the file's encoding and decodes the content correctly. Right now the CSVDictReader assumes utf-8 encoding, which can be wrong and can leave the BOM in the decoded content. This has caused silent validation failures: the BOM ends up attached to the name of the first column, which the validator then does not recognize.
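
To illustrate the failure mode, here is a minimal sketch using only the Python standard library. It is not the code in creator.analyses.analyzer, and the column names and file content are made up for the example. It shows how decoding a BOM-prefixed file as plain utf-8 leaves the BOM on the first header, while a BOM-aware codec (utf-8-sig) strips it.

```python
import csv
import io

# Hypothetical file content: a CSV saved as UTF-8 with a byte order mark (BOM).
raw = b"\xef\xbb\xbfparticipant_id,gender\nP1,Female\n"


def first_column(raw_bytes, encoding):
    """Decode the bytes with the given encoding and return the first header."""
    text = raw_bytes.decode(encoding)
    reader = csv.DictReader(io.StringIO(text))
    return reader.fieldnames[0]


plain = first_column(raw, "utf-8")          # BOM survives as "\ufeff"
bom_aware = first_column(raw, "utf-8-sig")  # BOM is stripped during decoding

print(repr(plain))                # '\ufeffparticipant_id'
print(repr(bom_aware))            # 'participant_id'
print(plain == "participant_id")  # False, so a name-based check misses it
```

A validator that looks up columns by exact name would skip the first column in the plain utf-8 case without raising an error, which matches the silent failures described above.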

Since this change would affect critical functionality in study creator, we can start by using read_df only in the validation code.

znatty22 added the bug (Something isn't working) label Sep 21, 2021
znatty22 self-assigned this Sep 21, 2021