Routine Assembly

A generic pipeline for creating routine draft assemblies

Analyses

Read trimming & QC: fastp
Genome Assembly: shovill or unicycler
Gene Annotation: prokka or bakta
Assembly QC: quast

Usage

By default, shovill and prokka will be used:

nextflow run BCCDC-PHL/routine-assembly-nf \
  --fastq_input <fastq input directory> \
  --outdir <output directory>

Unicycler and/or bakta can be used with the --unicycler and --bakta flags:

nextflow run BCCDC-PHL/routine-assembly-nf \
  --fastq_input <fastq input directory> \
  --unicycler \
  --bakta \
  --outdir <output directory>

Any combination of shovill/unicycler and prokka/bakta is supported.

Shovill with bakta:

nextflow run BCCDC-PHL/routine-assembly-nf \
  --fastq_input <fastq input directory> \
  --bakta \
  --outdir <output directory>

Unicycler with prokka:

nextflow run BCCDC-PHL/routine-assembly-nf \
  --fastq_input <fastq input directory> \
  --unicycler \
  --outdir <output directory>
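
When the pipeline is launched from a script, the four combinations above can be assembled programmatically. The following is a minimal sketch, not part of the pipeline: the helper name `build_command` is hypothetical, and it uses only the flags shown in this section (`--fastq_input`, `--unicycler`, `--bakta`, `--outdir`).

```python
from typing import List


def build_command(fastq_input: str, outdir: str,
                  assembler: str = "shovill",
                  annotator: str = "prokka") -> List[str]:
    """Return the argument list for one assembler/annotator combination."""
    if assembler not in ("shovill", "unicycler"):
        raise ValueError(f"unsupported assembler: {assembler}")
    if annotator not in ("prokka", "bakta"):
        raise ValueError(f"unsupported annotator: {annotator}")

    cmd = ["nextflow", "run", "BCCDC-PHL/routine-assembly-nf",
           "--fastq_input", fastq_input]
    # shovill and prokka are the defaults, so only the alternatives need a flag.
    if assembler == "unicycler":
        cmd.append("--unicycler")
    if annotator == "bakta":
        cmd.append("--bakta")
    cmd += ["--outdir", outdir]
    return cmd


# Print the command line for unicycler with bakta.
print(" ".join(build_command("reads", "results", "unicycler", "bakta")))
```

The returned list can be passed directly to `subprocess.run` if the command should be executed rather than printed.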

The pipeline also supports a 'samplesheet input' mode. Pass a samplesheet.csv file with the headers ID, R1, R2:

nextflow run BCCDC-PHL/routine-assembly-nf \
  --samplesheet_input <samplesheet.csv> \
  --outdir <output directory>
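
A samplesheet with the ID, R1, R2 headers can be generated from a directory of paired fastq files. The sketch below is illustrative only: the function name `write_samplesheet` is hypothetical, and it assumes files are named `<ID>_R1*.fastq.gz` and `<ID>_R2*.fastq.gz`. Writing absolute paths is also an assumption; the pipeline's requirements for path style are not stated here.

```python
import csv
from pathlib import Path


def write_samplesheet(fastq_dir: str, samplesheet_path: str) -> int:
    """Write a samplesheet with headers ID, R1, R2; return the number of rows."""
    rows = []
    for r1 in sorted(Path(fastq_dir).glob("*_R1*.fastq.gz")):
        # Assumed naming convention: the R2 file differs only in "_R1" -> "_R2".
        r2 = r1.with_name(r1.name.replace("_R1", "_R2", 1))
        if not r2.exists():
            continue  # skip reads without a mate file
        # Sample ID: all characters before the first underscore.
        sample_id = r1.name.split("_", 1)[0]
        rows.append({"ID": sample_id,
                     "R1": str(r1.resolve()),
                     "R2": str(r2.resolve())})

    with open(samplesheet_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["ID", "R1", "R2"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `write_samplesheet("reads", "samplesheet.csv")` writes one row per complete read pair; the resulting file can then be passed to `--samplesheet_input`.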

Output

An output directory is created for each sample under the directory provided with the --outdir flag. Each directory is named by sample ID, which is inferred from the fastq filenames (all characters before the first underscore).

If we have sample-01_R{1,2}.fastq.gz, the output directory will be:

sample-01
├── sample-01_20211125165316_provenance.yml
├── sample-01_fastp.csv
├── sample-01_fastp.json
├── sample-01_shovill_prokka.gbk
├── sample-01_shovill_prokka.gff
├── sample-01_shovill_quast.csv
├── sample-01_shovill.fa
└── sample-01_shovill.log
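
The sample ID rule described above (all characters before the first underscore in the fastq filename) can be sketched as follows. The function name `sample_id_from_fastq` is hypothetical and is not taken from the pipeline's code.

```python
from pathlib import Path


def sample_id_from_fastq(fastq_path: str) -> str:
    """Return all characters before the first underscore of the filename."""
    return Path(fastq_path).name.split("_", 1)[0]


print(sample_id_from_fastq("reads/sample-01_R1.fastq.gz"))  # sample-01
```

One consequence of this rule is that a sample ID containing an underscore is truncated: under this sketch, `sample_01_R1.fastq.gz` yields `sample`.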

Including tool-name suffixes in the output filenames allows the same sample to be re-analysed with different tools without filename conflicts:

sample-01
├── sample-01_20211125165316_provenance.yml
├── sample-01_20211128122118_provenance.yml
├── sample-01_unicycler_bakta.gbk
├── sample-01_unicycler_bakta.gff
├── sample-01_unicycler_bakta.json
├── sample-01_unicycler_bakta.log
├── sample-01_fastp.csv
├── sample-01_fastp.json
├── sample-01_shovill_prokka.gbk
├── sample-01_shovill_prokka.gff
├── sample-01_shovill_quast.csv
├── sample-01_unicycler_quast.csv
├── sample-01_shovill.fa
├── sample-01_shovill.log
├── sample-01_unicycler.fa
├── sample-01_unicycler.gfa
└── sample-01_unicycler.log
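
The naming pattern visible in the listings above can be sketched as a small helper. This is an inference from the example filenames only, not the pipeline's own logic. The function name `expected_outputs` is hypothetical, and the sketch covers only files that the listings show for both tool choices. It leaves out tool-specific extras such as the unicycler .gfa, the bakta .json and .log, and the fastp and provenance files.

```python
from typing import List


def expected_outputs(sample_id: str, assembler: str, annotator: str) -> List[str]:
    """Build tool-suffixed filenames following the pattern in the listings."""
    return [
        f"{sample_id}_{assembler}.fa",
        f"{sample_id}_{assembler}.log",
        f"{sample_id}_{assembler}_quast.csv",
        f"{sample_id}_{assembler}_{annotator}.gbk",
        f"{sample_id}_{assembler}_{annotator}.gff",
    ]


first = expected_outputs("sample-01", "shovill", "prokka")
second = expected_outputs("sample-01", "unicycler", "bakta")
# No filename appears in both runs, so the two analyses can share a directory.
print(set(first).isdisjoint(second))
```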

Provenance files

For each pipeline invocation, a provenance.yml file is written for each sample, with contents like the following:

- tool_name: fastp
  tool_version: 0.23.1
- tool_name: shovill
  tool_version: 1.1.0
- tool_name: prokka
  tool_version: 1.14.5
- tool_name: quast
  tool_version: 5.0.2
- input_filename: sample-01_R1.fastq.gz
  sha256: 4ac3055ac5f03114a005aff033e7018ea98486cbebdae669880e3f0511ed21bb
- input_filename: sample-01_R2.fastq.gz
  sha256: 8db388f56a51920752319c67b5308c7e99f2a566ca83311037a425f8d6bb1ecc
- pipeline_name: BCCDC-PHL/routine-assembly
  pipeline_version: 0.1.0
- timestamp_analysis_start: 2021-11-25T16:53:10.549863

The filename of the provenance file includes a timestamp with format YYYYMMDDHHMMSS to ensure that re-analysis of the same sample will create a unique provenance.yml file.
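
The YYYYMMDDHHMMSS format corresponds to the `strftime` pattern `%Y%m%d%H%M%S`. The sketch below shows how such a filename could be composed. The function name `provenance_filename` is hypothetical, and which moment of the run the pipeline actually uses for the timestamp is not stated here.

```python
from datetime import datetime


def provenance_filename(sample_id: str, when: datetime) -> str:
    """Compose <sample_id>_<YYYYMMDDHHMMSS>_provenance.yml."""
    return f"{sample_id}_{when.strftime('%Y%m%d%H%M%S')}_provenance.yml"


print(provenance_filename("sample-01", datetime(2021, 11, 25, 16, 53, 16)))
# sample-01_20211125165316_provenance.yml
```

Because the timestamp has one-second resolution, two invocations for the same sample get distinct filenames as long as they do not fall within the same second.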
