This Nextflow pipeline is designed for the analysis of Twist NGS Methylation data, including quality control, alignment, methylation calling, differential methylation analysis, and post-processing. It integrates various tools and custom scripts to provide a comprehensive analysis workflow.
Step | Tool |
---|---|
Generate Reference Genome Index (optional) | Bismark |
Raw data QC | FastQC |
Adapter sequence trimming | Trim Galore |
Align Reads | Bismark (bowtie2) |
Deduplicate Alignments | Bismark |
Sort and index alignments | Samtools |
Extract Methylation Calls | Bismark |
Sample Report | Bismark |
Summary Report | Bismark |
Alignment QC | Qualimap |
QC Reporting | MultiQC |
Differential Methylation Analysis | edgeR/methylKit |
Post processing | ggplot2 |
GO analysis | Gene Ontology |
- Nextflow (>=21.10.3)
- Docker or Singularity (for containerized execution)
- Java (>=8)
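The prerequisites above can be checked before the first run. The following is a minimal sketch that only tests whether each tool is on the `PATH`; it does not verify the minimum versions listed above. The helper name `have` is hypothetical and not part of the pipeline.

```shell
# Hypothetical helper: succeeds if the given command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

missing=0

# Nextflow and Java are both required.
for tool in nextflow java; do
    if have "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
        missing=1
    fi
done

# Either container engine is sufficient for containerized execution.
if have docker || have singularity; then
    echo "found: container engine"
else
    echo "missing: docker or singularity"
    missing=1
fi

echo "missing=$missing"
```

A final `missing=1` means at least one prerequisite was not found; versions can then be checked by hand with each tool's own version flag.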
- Start from either FASTQ files or Bismark-aligned BAM files. See the manual for details.
- Run the differential methylation analysis with edgeR, methylKit, or both. See the manual for details.
- Use `--skip_diff_meth` to skip the differential methylation analysis.
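As an illustration of the last point, the sketch below assembles a run that stops before differential methylation and prints the command for review. It uses only flags documented in this README; the paths are placeholders, the output directory name `Results/TwistNext_qc_only` is hypothetical, and omitting the annotation files when `--skip_diff_meth` is set is an assumption that should be checked against the manual.

```shell
# Assemble the command as positional parameters so it can be inspected first.
# Paths are placeholders; Results/TwistNext_qc_only is a hypothetical name.
set -- nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --bismark_index /data/reference_genome/hg38/ \
    --skip_diff_meth \
    --outdir Results/TwistNext_qc_only

# Print the assembled command (dry run).
cmd="$*"
echo "$cmd"

# To launch the pipeline instead of printing, run: "$@"
```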
# edgeR and methylKit together, building the Bismark index from a reference FASTA (--genome_fasta)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--genome_fasta /data/reference_genome/hg38/hg38.fa \
--run_both_methods \
--gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
--refseq_file /data/hg19_RefSeq.bed.gz \
--outdir Results/TwistNext_both
# edgeR and methylKit together, reusing an existing bisulfite genome index (--bismark_index)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--bismark_index /data/reference_genome/hg38/ \
--run_both_methods \
--gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
--refseq_file /data/hg19_RefSeq.bed.gz \
--outdir /mnt/Results/TwistNext_both
# edgeR only, building the Bismark index from a reference FASTA (--genome_fasta)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--genome_fasta /data/reference_genome/hg38/hg38.fa \
--diff_meth_method edger \
--refseq_file /data/hg19_RefSeq.bed.gz \
--outdir Results/TwistNext_edgeR
# edgeR only, reusing an existing bisulfite genome index (--bismark_index)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--bismark_index /data/reference_genome/hg38/ \
--diff_meth_method edger \
--refseq_file /data/hg19_RefSeq.bed.gz \
--outdir /mnt/Results/TwistNext_edgeR
# methylKit only, building the Bismark index from a reference FASTA (--genome_fasta)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--genome_fasta /data/reference_genome/hg38/hg38.fa \
--diff_meth_method methylkit \
--gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
--outdir Results/TwistNext_methylKit
# methylKit only, reusing an existing bisulfite genome index (--bismark_index)
nextflow run JD2112/TwistNext \
-profile singularity \
--sample_sheet Sample_sheet_twist.csv \
--bismark_index /data/reference_genome/hg38/ \
--diff_meth_method methylkit \
--gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
--outdir Results/TwistNext_methylKit
Option | Description |
---|---|
`--sample_sheet` | Path to the sample sheet CSV file (required) |
`--bismark_index` | Path to the Bismark index directory (required unless `--genome` or `--aligned_bams` is provided) |
`--genome` | Path to the reference genome FASTA file (required if `--bismark_index` is not provided) |
`--aligned_bams` | Path to aligned BAM files (use this to start from aligned BAM files instead of FASTQ files) |
`--refseq_file` | Path to RefSeq file for annotation (required to run both or `methylkit`) |
`--gtf_file` | Path to GTF file for annotation (required to run both or `edger`) |
`--outdir` | Output directory (default: `./results`) |
`--diff_meth_method` | Differential methylation method to use: `edger` or `methylkit` (default: `edger`) |
`--run_both_methods` | Run both edgeR and methylKit for differential methylation analysis (default: false) |
`--skip_diff_meth` | Skip differential methylation analysis (default: false) |
`--coverage_threshold` | Minimum read coverage to consider a CpG site (default: 10) |
`--logfc_cutoff` | Differential methylation cut-off for Volcano or MA plot (default: 1.5) |
`--pvalue_cutoff` | Differential methylation P-value cut-off for Volcano or MA plot (default: 0.05) |
`--hyper_color` | Hypermethylation color for Volcano or MA plot (default: red) |
`--hypo_cutoff` | Hypomethylation color for Volcano or MA plot (default: blue) |
`--nonsig_color` | Non-significant color for Volcano or MA plot (default: black) |
`--compare_str` | Comparison string for differential analysis (e.g. "Group1-Group2") |
`--top_n_genes` | Number of top differentially methylated genes to report for GOplot (default: 100) |
`--help` | Show this help message and exit |
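The threshold and plotting options in the table can be combined with any of the run modes. The sketch below overrides several defaults in a run that uses both methods and prints the resulting command for review. The numeric values and the output directory `Results/TwistNext_custom` are illustrative only, and it is an assumption that the group names in `--compare_str` must match the groups defined in the sample sheet.

```shell
# Assemble a run that overrides several defaults from the options table.
# All values are illustrative; paths are the placeholders used elsewhere in this README.
set -- nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --bismark_index /data/reference_genome/hg38/ \
    --run_both_methods \
    --gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
    --refseq_file /data/hg19_RefSeq.bed.gz \
    --coverage_threshold 20 \
    --logfc_cutoff 2 \
    --pvalue_cutoff 0.01 \
    --compare_str Group1-Group2 \
    --top_n_genes 50 \
    --outdir Results/TwistNext_custom

# Print the assembled command (dry run).
cmd="$*"
echo "$cmd"

# To launch the pipeline instead of printing, run: "$@"
```

Options that are left out keep the defaults listed in the table.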
nextflow run JD2112/TwistNext --help --outdir .
See the manual for details.
- Main author:
  - Jyotirmoy Das (@JD2112)
- Collaborators:
  - Debojyoti Das (@BioDebojyoti)
  - Leila Nasirzadeh (@Lailanasd)
Das, J. (2024). TwistNext (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.14204261
Please check the manual for details.
Please report issues on GitHub.
We acknowledge the Core Facility, Faculty of Medicine and Health Sciences, Linköping University, Linköping, Sweden, and Clinical Genomics Linköping, Science for Life Laboratory, Sweden, for their support. We are grateful to PDC (KTH, Sweden) for computational support in testing and validating the pipeline on the Dardel HPC.