JD2112/TwistNext

Overview

This Nextflow pipeline is designed for the analysis of Twist NGS Methylation data, including quality control, alignment, methylation calling, differential methylation analysis, and post-processing. It integrates various tools and custom scripts to provide a comprehensive analysis workflow.

Features

Step                                        Tool
Generate Reference Genome Index (optional)  Bismark
Raw data QC                                 FastQC
Adapter sequence trimming                   Trim Galore
Align Reads                                 Bismark (bowtie2)
Deduplicate Alignments                      Bismark
Sorting and indexing                        Samtools
Extract Methylation Calls                   Bismark
Sample Report                               Bismark
Summary Report                              Bismark
Alignment QC                                Qualimap
QC Reporting                                MultiQC
Differential Methylation Analysis           edgeR / methylKit
Post-processing                             ggplot2
GO analysis                                 Gene Ontology

Pipeline Schema

Requirements

Usage

  1. You can start from FASTQ files or from Bismark-aligned BAM files. See the manual for details.

  2. You can run the differential methylation analysis with edgeR, methylKit, or both. See the manual for details.

  3. Use --skip_diff_meth to skip the differential methylation analysis.
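
The choices in the list above map onto a small set of command-line flags. As a minimal sketch, the helper below prints the flag(s) for a given mode; `diff_meth_args` and its mode names (`both`, `edger`, `methylkit`, `skip`) are hypothetical conveniences for illustration and are not part of the pipeline.

```shell
# Hypothetical helper, not part of TwistNext: print the differential
# methylation flag(s) that correspond to a chosen mode.
diff_meth_args() {
    case "$1" in
        both)      echo "--run_both_methods" ;;
        edger)     echo "--diff_meth_method edger" ;;
        methylkit) echo "--diff_meth_method methylkit" ;;
        skip)      echo "--skip_diff_meth" ;;
        *)         echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

The output can be spliced into a `nextflow run` command as `$(diff_meth_args edger)`, left unquoted so that the flag and its value are passed as two separate words.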

--run_both_methods

# To build the bisulfite genome index from a reference FASTA, use --genome_fasta

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --genome_fasta /data/reference_genome/hg38/hg38.fa \
    --run_both_methods \
    --gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
    --refseq_file /data/hg19_RefSeq.bed.gz \
    --outdir Results/TwistNext_both


# If you already have a bisulfite genome index, use --bismark_index

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --bismark_index /data/reference_genome/hg38/ \
    --run_both_methods \
    --gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
    --refseq_file /data/hg19_RefSeq.bed.gz \
    --outdir /mnt/Results/TwistNext_both

--diff_meth_method edger

# To build the bisulfite genome index from a reference FASTA, use --genome_fasta

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --genome_fasta /data/reference_genome/hg38/hg38.fa \
    --diff_meth_method edger \
    --refseq_file /data/hg19_RefSeq.bed.gz \
    --outdir Results/TwistNext_edgeR


# If you already have a bisulfite genome index, use --bismark_index

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --bismark_index /data/reference_genome/hg38/ \
    --diff_meth_method edger \
    --refseq_file /data/hg19_RefSeq.bed.gz \
    --outdir /mnt/Results/TwistNext_edgeR

--diff_meth_method methylkit

# To build the bisulfite genome index from a reference FASTA, use --genome_fasta

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --genome_fasta /data/reference_genome/hg38/hg38.fa \
    --diff_meth_method methylkit \
    --gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
    --outdir Results/TwistNext_methylKit


# If you already have a bisulfite genome index, use --bismark_index

nextflow run JD2112/TwistNext \
    -profile singularity \
    --sample_sheet Sample_sheet_twist.csv \
    --bismark_index /data/reference_genome/hg38/ \
    --diff_meth_method methylkit \
    --gtf_file /data/Homo_sapiens.GRCh38.104.gtf \
    --outdir Results/TwistNext_methylKit

Options:


Option                Description
--sample_sheet        Path to the sample sheet CSV file (required)
--bismark_index       Path to the Bismark index directory (required unless --genome_fasta or --aligned_bams is provided)
--genome_fasta        Path to the reference genome FASTA file (required if --bismark_index is not provided)
--aligned_bams        Path to aligned BAM files (use this to start from aligned BAM files instead of FASTQ files)
--refseq_file         Path to RefSeq file for annotation (required to run both or methylkit)
--gtf_file            Path to GTF file for annotation (required to run both or edger)
--outdir              Output directory (default: ./results)
--diff_meth_method    Differential methylation method to use: 'edger' or 'methylkit' (default: edger)
--run_both_methods    Run both edgeR and methylKit for differential methylation analysis (default: false)
--skip_diff_meth      Skip differential methylation analysis (default: false)
--coverage_threshold  Minimum read coverage to consider a CpG site (default: 10)
--logfc_cutoff        Differential methylation cut-off for Volcano or MA plot (default: 1.5)
--pvalue_cutoff       Differential methylation P-value cut-off for Volcano or MA plot (default: 0.05)
--hyper_color         Hypermethylation color for Volcano or MA plot (default: red)
--hypo_cutoff         Hypomethylation color for Volcano or MA plot (default: blue)
--nonsig_color        Non-significant color for Volcano or MA plot (default: black)
--compare_str         Comparison string for differential analysis (e.g. "Group1-Group2")
--top_n_genes         Number of top differentially methylated genes to report for GOplot (default: 100)
--help                Show this help message and exit
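
The input requirements listed above can be checked before launching a run. The following is a minimal sketch of such a pre-flight check, assuming only the two rules stated in the table: the sample sheet is always required, and a Bismark index is required unless a reference FASTA or aligned BAM files are supplied. The function name `check_inputs` and its argument order are hypothetical, and the pipeline may perform its own, different validation.

```shell
# Hypothetical pre-flight check, not part of TwistNext.
# Arguments: sample_sheet bismark_index genome_fasta aligned_bams
# Pass an empty string for any option you are not using.
check_inputs() {
    # $1 = sample sheet, which is always required
    if [ -z "$1" ]; then
        echo "error: --sample_sheet is required" >&2
        return 1
    fi
    # $2 = Bismark index, $3 = reference FASTA, $4 = aligned BAMs.
    # The index is only mandatory when neither of the other two is given.
    if [ -z "$2" ] && [ -z "$3" ] && [ -z "$4" ]; then
        echo "error: provide a Bismark index, a reference FASTA, or aligned BAMs" >&2
        return 1
    fi
    return 0
}
```

For example, `check_inputs Sample_sheet_twist.csv /data/reference_genome/hg38/ "" ""` succeeds, while calling it with an empty first argument prints an error and returns a non-zero status.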

HELP

nextflow run JD2112/TwistNext --help --outdir .

See the manual for details.

Credits

Citation

Das, J. (2024). TwistNext (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.14204261

HELP/FAQ/Troubleshooting

Please check the manual for details.

Please report problems by opening an issue on GitHub.

License(s)

GNU General Public License v3.0.

Acknowledgement

We would like to acknowledge the Core Facility, Faculty of Medicine and Health Sciences, Linköping University, Linköping, Sweden and Clinical Genomics Linköping, Science for Life Laboratory, Sweden for their support. We are grateful to PDC (KTH, Sweden) for the computational support used to test and validate the pipeline on the Dardel HPC.