Update pb deepvariant #7363

Open
wants to merge 3 commits into
base: master
30 changes: 15 additions & 15 deletions modules/nf-core/parabricks/deepvariant/main.nf
@@ -7,35 +7,35 @@ process PARABRICKS_DEEPVARIANT {
container "nvcr.io/nvidia/clara/clara-parabricks:4.4.0-1"

input:
tuple val(meta), path(input), path(input_index), path(interval_file)
tuple val(meta), path(bam), path(bai), path(interval_file)
tuple val(ref_meta), path(fasta)
path model_file
Contributor:

Can you also include a test that uses this model file?

Author:

Yes, good idea. @gburnett-nvidia and I will find a model file and add a test. We will also have to find somewhere to host the model file.

Contributor:

The model file can go into nf-core/test-datasets.


output:
tuple val(meta), path("*.vcf"), optional: true, emit: vcf
tuple val(meta), path("*.g.vcf.gz"), optional: true, emit: gvcf
path "versions.yml", emit: versions
tuple val(meta), path("*.vcf"), optional: true, emit: vcf
tuple val(meta), path("*.g.vcf"), optional: true, emit: gvcf
Comment on lines +15 to +16
Contributor:

Is it possible to return gzipped VCF files by default?

Contributor (@gburnett-nvidia, Jan 29, 2025):

Right now Parabricks cannot return gzipped VCF files directly. Compression would have to be added as an additional step.

Contributor:

Maybe you can add bgzip to the Parabricks Docker/Singularity container? Then we could easily do it in the script section of the module. That would be very helpful!
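
For illustration, the additional compression step discussed above might look roughly like the shell sketch below. It is not part of this PR. The function name `compress_vcf` and the `COMPRESSOR` variable are hypothetical, and the sketch assumes that `bgzip` (from htslib) has been made available in the container, which is not the case today according to the thread.

```shell
# Compress a VCF after the variant caller has written it.
# COMPRESSOR is an assumption of this sketch: it defaults to bgzip, which
# would have to be present in the container. Any tool that accepts `-c` and
# writes to stdout (for example gzip) can be substituted for a dry run.
compress_vcf() {
    local vcf="$1"
    local compressor="${COMPRESSOR:-bgzip}"

    if ! command -v "$compressor" >/dev/null 2>&1; then
        echo "error: '$compressor' not found on PATH" >&2
        return 1
    fi

    # Write <name>.gz, then remove the uncompressed file only on success.
    "$compressor" -c "$vcf" > "${vcf}.gz" && rm -f "$vcf"
}
```

Calling `compress_vcf test.g.vcf` after `pbrun` finishes would leave `test.g.vcf.gz` in the work directory. Note that plain gzip output cannot be indexed with tabix, so the gzip substitution is only useful for trying the sketch out.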

path "versions.yml", emit: versions
Comment on lines +15 to +17
Contributor:

Can you align these nicely? (as it was before)


when:
task.ext.when == null || task.ext.when

script:
// Exit if running this module with -profile conda / -profile mamba
if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
exit 1, "Parabricks module does not support Conda. Please use Docker / Singularity / Podman instead."
}
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
def output_file = ("--gvcf" =~ task.ext.args)? "${prefix}.g.vcf.gz" : "${prefix}.vcf"
def interval_file_command = interval_file ? interval_file.collect{"--interval-file $it"}.join(' ') : ""
def prefix = task.ext.suffix ? "${meta.id}${task.ext.suffix}" : "${meta.id}"
def output_file = ("--gvcf" =~ task.ext.args)? "${prefix}.g.vcf" : "${prefix}.vcf"
def interval_file_option = interval_file ? interval_file.collect{"--interval-file $it"}.join(' ') : ""
def model_command = model_file ? "--pb-model-file $model_file" : ""
def num_gpus = task.accelerator ? "--num-gpus $task.accelerator.request" : ''

"""
pbrun \\
deepvariant \\
--ref $fasta \\
--in-bam $input \\
--in-bam $bam \\
--out-variants $output_file \\
$interval_file_command \\
$num_gpus \\
${interval_file_option} \\
${num_gpus} \\
${model_command} \\
$args

cat <<-END_VERSIONS > versions.yml
@@ -46,7 +46,7 @@ process PARABRICKS_DEEPVARIANT {

stub:
def prefix = task.ext.prefix ?: "${meta.id}"
def output_cmd = ("--gvcf" =~ task.ext.args)? "echo '' | gzip > ${prefix}.g.vcf.gz" : "touch ${prefix}.vcf"
def output_cmd = ("--gvcf" =~ task.ext.args)? "echo '' | gzip > ${prefix}.g.vcf" : "touch ${prefix}.vcf"
"""
$output_cmd
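
As a rough illustration of what the script section above assembles, the following shell sketch rebuilds the same command line outside Nextflow. The function names `output_name` and `build_deepvariant_cmd` and the file names in the usage note are hypothetical. Only the `pbrun deepvariant` flags that appear in the diff are used, and the command is printed rather than executed so the sketch runs without Parabricks installed.

```shell
# Pick the output file name: .g.vcf when the extra args request a gVCF,
# otherwise .vcf (same rule as output_file in the module).
output_name() {
    local prefix="$1"
    local args="$2"
    case "$args" in
        *--gvcf*) echo "${prefix}.g.vcf" ;;
        *)        echo "${prefix}.vcf" ;;
    esac
}

# Assemble the command line. An empty model or gpus value means
# "not provided", mirroring the ternaries in the module.
build_deepvariant_cmd() {
    local fasta="$1"
    local bam="$2"
    local out="$3"
    local model="$4"
    local gpus="$5"
    shift 5                      # remaining arguments: interval files
    local cmd="pbrun deepvariant --ref $fasta --in-bam $bam --out-variants $out"
    local f

    for f in "$@"; do
        cmd="$cmd --interval-file $f"
    done
    if [ -n "$gpus" ]; then
        cmd="$cmd --num-gpus $gpus"
    fi
    if [ -n "$model" ]; then
        cmd="$cmd --pb-model-file $model"
    fi

    # Print instead of running the command.
    echo "$cmd"
}
```

For example, `build_deepvariant_cmd genome.fasta test.bam "$(output_name test --gvcf)" custom.model 2 a.bed` prints a command that writes `test.g.vcf` and carries one `--interval-file`, `--num-gpus 2`, and `--pb-model-file custom.model`. Passing empty strings for the model and GPU count drops those flags, which is what `input[2] = []` achieves in the tests.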

14 changes: 9 additions & 5 deletions modules/nf-core/parabricks/deepvariant/meta.yml
@@ -21,11 +21,11 @@ input:
description: |
Groovy Map containing tumor sample information - id must match read groups for this sample.
[ id:'test']
- input:
- bam:
type: file
description: bam file for sample to be variant called.
pattern: "*.bam"
- input_index:
- bai:
type: file
description: bai index corresponding to input bam file. Only necessary if intervals
are provided.
@@ -44,6 +44,10 @@ input:
type: file
description: reference fasta - must be unzipped.
pattern: "*.fasta"
- - model_file:
type: file
description: custom deepvariant model file
pattern: "*.model"
output:
- vcf:
- meta:
@@ -56,15 +60,15 @@ output:
description: vcf file created with deepvariant (does not support .gz for normal vcf), optional
pattern: "*.vcf"
- gvcf:
- meta:
- meta:
type: map
description: |
Groovy Map containing sample information.
e.g. [ id:'test' ]
- "*.g.vcf.gz":
- "*.g.vcf":
type: file
description: bgzipped gvcf created with deepvariant, optional
pattern: "*.g.vcf.gz"
pattern: "*.g.vcf"
- versions:
- versions.yml:
type: file
21 changes: 20 additions & 1 deletion modules/nf-core/parabricks/deepvariant/tests/main.nf.test
@@ -12,7 +12,12 @@ nextflow_process {

test("human - bam") {

config './nextflow.config'

when {
params {
module_args = ''
}
process {
"""
input[0] = [
@@ -25,6 +30,7 @@ nextflow_process {
[ id:'test'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
@@ -44,9 +50,15 @@ nextflow_process {

test("human - bam - intervals") {

config './nextflow.config'
Contributor:

If this is now added to all tests, we can also move it to the top of the file to avoid code duplication :)

Author:

Yes, good idea, we can do that! This seemed to be the pattern used in other module tests, but happy to adjust.


when {
params {
module_args = ''
}
process {
"""

input[0] = [
[ id:'test'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true),
@@ -57,6 +69,7 @@ nextflow_process {
[ id:'ref'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
@@ -94,6 +107,7 @@ nextflow_process {
[ id:'ref'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
@@ -132,6 +146,7 @@ nextflow_process {
[ id:'ref'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
@@ -155,6 +170,9 @@ nextflow_process {
options "-stub"

when {
params {
module_args = ''
}
process {
"""
input[0] = [
@@ -167,6 +185,7 @@ nextflow_process {
[ id:'test'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
@@ -185,7 +204,6 @@ nextflow_process {

test("human - bam - intervals - gvcf - stub") {

config './nextflow.config'
options "-stub"

when {
@@ -204,6 +222,7 @@ nextflow_process {
[ id:'test'],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)
]
input[2] = []
"""
}
}
18 changes: 9 additions & 9 deletions modules/nf-core/parabricks/deepvariant/tests/main.nf.test.snap
@@ -404,29 +404,29 @@
"content": [
{
"0": [

],
"1": [
[
{
"id": "test"
},
"test.g.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940"
"test.vcf:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"1": [

],
"2": [
"versions.yml:md5,c7bcf80d609e2951eb99f0b774cd2f6d"
],
"gvcf": [

],
"vcf": [
[
{
"id": "test"
},
"test.g.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940"
"test.vcf:md5,d41d8cd98f00b204e9800998ecf8427e"
]
],
"vcf": [

],
"versions": [
"versions.yml:md5,c7bcf80d609e2951eb99f0b774cd2f6d"
@@ -442,7 +442,7 @@
"nf-test": "0.9.2",
"nextflow": "24.10.2"
},
"timestamp": "2024-12-16T11:13:01.9854302"
"timestamp": "2025-01-23T20:31:31.427166739"
},
"human - bam - gvcf": {
"content": [
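
A note on the `test.vcf:md5,d41d8cd98f00b204e9800998ecf8427e` entry in the snapshot above: that value is the MD5 digest of an empty file, which is what the stub's `touch ${prefix}.vcf` branch produces. The shell sketch below reproduces it. The function name `stub_md5` is hypothetical, and the sketch assumes GNU coreutils `md5sum` is available (on macOS the equivalent tool is `md5`).

```shell
# Compute the MD5 that a stub output created with `touch` would have.
stub_md5() {
    local dir
    dir=$(mktemp -d)
    touch "$dir/test.vcf"                      # same operation as the stub's VCF branch
    md5sum "$dir/test.vcf" | cut -d' ' -f1     # keep only the digest column
    rm -rf "$dir"
}

stub_md5   # prints d41d8cd98f00b204e9800998ecf8427e
```

A snapshot entry with this digest therefore only confirms that the file was created, not that it has any particular content.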
3 changes: 2 additions & 1 deletion modules/nf-core/parabricks/deepvariant/tests/nextflow.config
@@ -2,6 +2,7 @@ process {

withName: 'PARABRICKS_DEEPVARIANT' {
ext.args = params.module_args
containerOptions = '--gpus all'
}

}
}