Detect and report missing data in GWO ingest #626
Comments
Should any of these three changes lead to a stop in the ingestion process? Or do we just want to report these things?
I think we should just report these things, but maybe we should ask @allisonheath.
I've been thinking through the 3rd checks here and it seems to me some parts of it should be done elsewhere. For example, checking whether a harmonized genomic file's corresponding genomic file is missing is something we can do immediately, using just the GWO manifest itself: query the Dataservice for genomic-files which match the source file column entries, and if any are missing then we have a problem. EDIT: On second thought it probably is better to do it in the
@gsantia Yeah, the issue I wrote up might not be exactly how it turns out to be implemented. You will probably have a better idea since you're doing the implementation. The important thing is that we're able to record and report any missing data which we feel is important for the user to know about.
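The manifest check discussed in the comments above can be sketched as a set difference. This is only an illustration, not the study creator's actual code: the function name `find_missing_source_files`, the `source_file` column name, and the idea that the genomic files known to the Dataservice have already been fetched into a set are all assumptions.

```python
from typing import Dict, Iterable, List, Set


def find_missing_source_files(
    manifest_rows: Iterable[Dict[str, str]],
    known_files: Set[str],
    source_column: str = "source_file",  # hypothetical manifest column name
) -> List[str]:
    """Return manifest source files with no matching genomic file.

    `known_files` is assumed to hold the identifiers of genomic files
    already fetched from the Dataservice; no network call is made here.
    """
    missing: List[str] = []
    seen: Set[str] = set()
    for row in manifest_rows:
        source = row.get(source_column, "").strip()
        # Skip blank entries and sources that were already checked
        if not source or source in seen:
            continue
        seen.add(source)
        if source not in known_files:
            missing.append(source)
    return missing


# Example with made-up manifest rows
rows = [
    {"source_file": "s3://bucket/a.cram"},
    {"source_file": "s3://bucket/b.cram"},
    {"source_file": "s3://bucket/a.cram"},
]
missing = find_missing_source_files(rows, {"s3://bucket/a.cram"})
```

Returning the missing entries instead of raising leaves the decision to stop or continue the ingest with the caller, which is still an open question in this thread.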
The study creator's `GenomicDataLoader` currently does not detect any discrepancies between the GWO manifest and S3 or between the GWO manifest and the Dataservice. Detecting such discrepancies is an important part of the analysts' current manual process of loading the harmonized genomic file info into the Dataservice.

Each of the 3 load functions in the `GenomicDataLoader` should be modified to detect discrepancies and report them through log statements, event firing, or both.

Specifics:
In `load_harmonized_genomic_files` method:
In `load_specimen_harmonized_gf_links` method:
In `load_seq_exp_harmonized_genomic_files` method:
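One possible way to report discrepancies from each load step without halting the ingest is a small collector that logs as it records. This is a rough sketch under assumptions: `DiscrepancyReport`, its methods, and the `gwo_ingest` logger name are hypothetical and not part of the existing `GenomicDataLoader`, and event firing is left out.

```python
import logging
from dataclasses import dataclass, field
from typing import List

logger = logging.getLogger("gwo_ingest")  # hypothetical logger name


@dataclass
class DiscrepancyReport:
    """Collects discrepancies found during loading without stopping the load."""

    entries: List[str] = field(default_factory=list)

    def record(self, step: str, message: str) -> None:
        # Tag each entry with the load step that found it, then log it
        entry = f"[{step}] {message}"
        self.entries.append(entry)
        logger.warning(entry)

    @property
    def has_discrepancies(self) -> bool:
        return bool(self.entries)

    def summary(self) -> str:
        if not self.entries:
            return "No discrepancies detected"
        header = f"Discrepancies detected: {len(self.entries)}"
        return "\n".join([header] + self.entries)


# Example: a load step records a problem and carries on
report = DiscrepancyReport()
report.record(
    "load_harmonized_genomic_files",
    "file listed in GWO manifest not found in S3",  # made-up message
)
print(report.summary())
```

Sharing one report across the three load functions would give the user a single summary at the end of the ingest, and the same `record` call is a natural place to fire an event if that route is chosen.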