To use STAR module, how should the GTF file look like #1347

Open
huma1995 opened this issue Jul 24, 2024 · 1 comment

Comments

@huma1995

As far as I can tell, the NF1 GTF file matches the NM_001042492.2.fasta file.
However, I am still getting this error:
ERROR ~ Error executing process > 'STAR_GENOMEGENERATE (NM_001042492.2.fasta)'

Caused by:
Process STAR_GENOMEGENERATE (NM_001042492.2.fasta) terminated with an error exit status (104)

Command executed:

samtools faidx NM_001042492.2.fasta
NUM_BASES=`gawk '{sum = sum + $2}END{if ((log(sum)/log(2))/2 - 1 > 14) {printf "%.0f", 14} else {printf "%.0f", (log(sum)/log(2))/2 - 1}}' NM_001042492.2.fasta.fai`

mkdir star
STAR \
--runMode genomeGenerate \
--genomeDir star/ \
--genomeFastaFiles NM_001042492.2.fasta \
--sjdbGTFfile NF1.gtf \
--runThreadN 4 \
--genomeSAindexNbases $NUM_BASES \
--limitGenomeGenerateRAM 17079869184 \

cat <<-END_VERSIONS > versions.yml
"STAR_GENOMEGENERATE":
star: $(STAR --version | sed -e "s/STAR_//g")
samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
END_VERSIONS

Command exit status:
104

Command output:
STAR --runMode genomeGenerate --genomeDir star/ --genomeFastaFiles NM_001042492.2.fasta --sjdbGTFfile NF1.gtf --runThreadN 4 --genomeSAindexNbases 6 --limitGenomeGenerateRAM 17079869184
STAR version: 2.7.9a compiled: 2021-05-04T09:43:56-0400 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
Jul 24 16:03:26 ..... started STAR run
Jul 24 16:03:26 ... starting to generate Genome files
Jul 24 16:03:26 ..... processing annotations GTF

Command error:

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29670027 29670153 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29670153 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29676138 29676269 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29676269 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29677201 29677336 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29677336 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29679275 29679432 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29679432 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29683478 29683600 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29683600 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29683978 29684108 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29684108 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29684287 29684387 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29684387 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29685498 29685640 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29685640 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29685987 29686033 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29686033 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29687505 29687721 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29687721 is larger than the chromosome 17 length = 8520 , will skip this exon

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29701031 29704695 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29704695 is larger than the chromosome 17 length = 8520 , will skip this exon

Fatal INPUT FILE error, no valid exon lines in the GTF file: NF1.gtf
Solution: check the formatting of the GTF file. One likely cause is the difference in chromosome naming between GTF and FASTA file.

Jul 24 16:03:26 ...... FATAL ERROR, exiting

Work dir:
/home/hz1/git/NF1_cDNA_Pipeline/work/30/fd85bc72f459bed4bf08b7aca5b43b

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

-- Check '.nextflow.log' file for details

Could you please look into this for me? Thank you.
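
For reference, the `gawk` one-liner in the executed command derives `--genomeSAindexNbases` from the total number of bases listed in the `.fai` index. A minimal Python sketch of the same arithmetic follows, assuming the index holds a single sequence of 8520 bases (the length STAR reports for chromosome 17 in the log). The function name `genome_sa_index_nbases` is made up for this illustration and is not part of STAR or the pipeline.

```python
import math


def genome_sa_index_nbases(total_bases: int, cap: int = 14) -> int:
    """Mirror the gawk expression: min(cap, log2(total_bases) / 2 - 1), rounded.

    total_bases is the sum of column 2 of the .fai file.
    """
    value = math.log2(total_bases) / 2 - 1
    if value > cap:
        return cap
    # Format with "%.0f"-style rounding, as the gawk printf call does.
    return int(f"{value:.0f}")


# Assumption: the FASTA contains one 8520-base sequence, as the STAR log suggests.
print(genome_sa_index_nbases(8520))
```

With 8520 bases the expression evaluates to roughly 5.53, which rounds to the `--genomeSAindexNbases 6` shown in the command output. So this parameter behaves as expected for a reference of that size, and the failure comes from the GTF processing step instead.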

@pinin4fjords
Member

The errors here are pretty clear:

!!!!! WARNING: while processing sjdbGTFfile=NF1.gtf, line:
17 hg19_refGene exon 29701031 29704695 0.000000 + . gene_id "NM_000267"; transcript_id "NM_000267";
exon end = 29704695 is larger than the chromosome 17 length = 8520 , will skip this exon

Have you, for example, checked chromosome 17 in your FASTA and found that it's longer than 8520?
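
One way to run that check before launching the pipeline is to compare every exon line in the GTF against the sequence names and lengths in the FASTA index. The sketch below is only an illustration: the function names are hypothetical, the `.fai` offsets are invented, and the GTF is assumed to be tab-separated. The first exon line is copied from the log above; the other two are made-up examples of a valid line and of a naming mismatch.

```python
from typing import Dict, List, Tuple


def read_fai_lengths(fai_text: str) -> Dict[str, int]:
    """Map sequence name -> length from the first two columns of a .fai index."""
    lengths = {}
    for line in fai_text.splitlines():
        if line.strip():
            fields = line.split("\t")
            lengths[fields[0]] = int(fields[1])
    return lengths


def find_bad_exons(gtf_text: str, lengths: Dict[str, int]) -> List[Tuple[str, int, int, str]]:
    """Return (seqname, start, end, reason) for exon lines the FASTA cannot hold."""
    problems = []
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqname not in lengths:
            problems.append((seqname, start, end, "sequence name not in FASTA"))
        elif end > lengths[seqname]:
            reason = f"exon end exceeds sequence length {lengths[seqname]}"
            problems.append((seqname, start, end, reason))
    return problems


# Assumption: one sequence named "17" of length 8520, as reported by STAR.
# The remaining .fai columns (offset, line width) are invented.
FAI_TEXT = "17\t8520\t4\t60\t61\n"

ATTRS = 'gene_id "NM_000267"; transcript_id "NM_000267";'
GTF_TEXT = "\n".join([
    # Exon line taken from the log: hg19 genomic coordinates.
    "\t".join(["17", "hg19_refGene", "exon", "29670027", "29670153", "0.000000", "+", ".", ATTRS]),
    # Hypothetical exon that fits inside the 8520-base sequence.
    "\t".join(["17", "example", "exon", "1", "500", ".", "+", ".", ATTRS]),
    # Hypothetical exon with a sequence name that is absent from the FASTA.
    "\t".join(["chr17", "example", "exon", "1", "500", ".", "+", ".", ATTRS]),
])

for problem in find_bad_exons(GTF_TEXT, read_fai_lengths(FAI_TEXT)):
    print(problem)
```

In the log, every exon sits at a position around 29.6 million on a sequence that STAR measures as 8520 bases long. That suggests the GTF uses hg19 genomic coordinates while the FASTA holds only the transcript sequence. STAR needs the GTF coordinates to refer to positions within the sequences in the FASTA it is indexing, so either the annotation or the reference would have to change for the two to agree.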
