BGE gCNV #206

Merged
merged 14 commits into from
Mar 6, 2025

Conversation

kachulis
Collaborator

Pull BGE production pipeline from gatk branch to here.

@kachulis
Collaborator Author

I will test for identical results, and probably add a test as well

@kachulis
Collaborator Author

kachulis commented Mar 6, 2025

Results confirmed to be identical to gatk wdl

Contributor

@rickymagner rickymagner left a comment

Looks good, assuming the WDL has been validated elsewhere. Just had a question below out of curiosity.

"CNVCallingAndMergeForFabric.filtered_cnv_genotyped_segments_vcf_md5sum": null,
"CNVCallingAndMergeForFabric.merged_vcf": {
"file": "gs://palantir-workflows-test-data/CNVCallingAndMergeForFabric/0437227296_subset.merged.vcf.gz",
"line_skip_regex": "^##bcftools|^##GATKCommandLine=|^##FORMAT=<ID=GT|^##source="
Contributor

Why skip the FORMAT=<ID=GT line in the header? Is this not standardized so it fails because there's a different description or something?

Collaborator Author

@kachulis kachulis Mar 6, 2025

Yes, essentially. It turns out that MergeVcfs in Picard generates a non-deterministic header with respect to which input vcf it will pull the FORMAT=<ID=GT line from. In our case, the header line includes a different description depending on whether it gets pulled from the gcnv vcf or the dragen vcf.
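
For anyone reading along, here is a rough Python sketch of what skipping those header lines amounts to. It is not the test harness's actual code: it assumes `line_skip_regex` drops every line matching the pattern from both files before they are compared. The helper names (`filtered_lines`, `vcfs_equivalent`) and the two `Description` strings are made up for illustration; only the regex is copied from the test config above.

```python
import re

# Pattern copied verbatim from the test config in this PR.
LINE_SKIP_REGEX = r"^##bcftools|^##GATKCommandLine=|^##FORMAT=<ID=GT|^##source="


def filtered_lines(text, skip_regex=LINE_SKIP_REGEX):
    """Return the lines of `text` that do not match `skip_regex`."""
    pattern = re.compile(skip_regex)
    return [line for line in text.splitlines() if not pattern.search(line)]


def vcfs_equivalent(expected_text, actual_text, skip_regex=LINE_SKIP_REGEX):
    """Compare two VCF texts line by line, ignoring lines that match `skip_regex`."""
    return filtered_lines(expected_text, skip_regex) == filtered_lines(
        actual_text, skip_regex
    )


# Hypothetical merged VCFs that differ only in which input supplied the
# FORMAT=<ID=GT header line (the Description values are invented).
BODY = (
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample\n"
    "chr1\t1000\t.\tN\t<DEL>\t.\tPASS\tEND=2000\tGT\t0/1\n"
)
header_from_gcnv = (
    "##fileformat=VCFv4.2\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
)
header_from_dragen = (
    "##fileformat=VCFv4.2\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Segment genotype">\n'
)

run_a = header_from_gcnv + BODY
run_b = header_from_dragen + BODY

same_after_skip = vcfs_equivalent(run_a, run_b)
```

With the GT header line skipped, two runs that differ only in that line compare as equal, while any difference in the records themselves still fails the comparison.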

@kachulis kachulis merged commit 02882bd into main Mar 6, 2025
6 checks passed
@kachulis kachulis deleted the ck_gcnv branch March 6, 2025 21:42