Changes between Version 63 and Version 64 of SOP/CallingVariantsRNAseq

Timestamp:: 09/06/17 09:57:11 (8 years ago)
Author:: krichard
Comment:: --

SOP/CallingVariantsRNAseq

-              v63
+              v64
 \\
 '''1 - Run the STAR 2-pass procedure to map reads to the reference genome.'''
+'''Index the reference genome for First Pass.'''
+        Create a folder named "FirstPass" before running these commands.
+To generate genome index files for STAR:
+{{{
+bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirFirstPass --genomeFastaFiles /path/to/genome/fasta --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8
+}}}
+To map your reads:
+    '''Run this command within the FirstPass directory'''
+{{{
+bsub STAR --genomeDir /path/to/GenomeDirFirstPass --readFilesIn /path/to/Reads_1.fastq  --outFileNamePrefix whateverPrefix --runThreadN 8
+}}}
+'''Index the reference genome for Second Pass.'''
+        Create a folder named "SecondPass" before running these commands.
+To generate the 2nd pass genome index files for STAR:
+{{{
+bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirSecondPass --genomeFastaFiles /path/to/genome/fasta --sjdbFileChrStartEnd /path/to/first/pass/directory/SJ.out.tab --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8
+}}}
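Because the second-pass index is built from the first-pass junction file, it can help to confirm that the file exists and is non-empty before submitting the job. The sketch below is only an illustration: the function name `summarize_junctions` is hypothetical, and it assumes the standard tab-separated SJ.out.tab layout in which column 6 is 0 for unannotated junctions and 1 for annotated ones.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print "<total junctions><TAB><unannotated junctions>"
# for a first-pass SJ.out.tab file, or fail if the file is missing or empty.
summarize_junctions() {
    local sj=$1
    if [ ! -s "$sj" ]; then
        echo "missing or empty junction file: $sj" >&2
        return 1
    fi
    # Column 6 of SJ.out.tab: 0 = unannotated, 1 = annotated in the GTF.
    awk -F '\t' '{ total++; if ($6 == 0) novel++ }
                 END { printf "%d\t%d\n", total, novel }' "$sj"
}

# Example call (placeholder path taken from the command above):
# summarize_junctions /path/to/first/pass/directory/SJ.out.tab
```

A non-zero exit status here suggests the first-pass mapping has not finished or did not write its junctions where expected.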
+To map your reads:
+    '''Run this command within the SecondPass directory'''
+{{{
+# Input format: FASTQ; output format: SAM
+bsub STAR --genomeDir /path/to/GenomeDirSecondPass --readFilesIn /path/to/Reads_1.fastq  --outFileNamePrefix whateverPrefix --runThreadN 8
+}}}
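Each bsub call returns as soon as the job is queued, so the four steps above have to be run in order, each one after the previous job has finished. One possible way to express that ordering is with LSF job names and dependencies. The sketch below only prints the submissions for review. The STAR arguments are copied from the commands above; the job names (`star_idx1` and so on), the use of `-J`, `-w` and `-cwd`, and the second-pass directory placeholder are assumptions that should be checked against the local LSF setup.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the four submissions of the 2-pass procedure in order.
# Job names and the -J/-w/-cwd options are illustrative assumptions;
# all paths are placeholders, as in the commands above.
print_two_pass_plan() {
    local prefix=$1 threads=$2
    # STAR prepends the --outFileNamePrefix value to its output file names.
    local sj="/path/to/first/pass/directory/${prefix}SJ.out.tab"
    cat <<EOF
bsub -J star_idx1 STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirFirstPass --genomeFastaFiles /path/to/genome/fasta --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN ${threads}
bsub -J star_map1 -w "done(star_idx1)" -cwd /path/to/first/pass/directory STAR --genomeDir /path/to/GenomeDirFirstPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix ${prefix} --runThreadN ${threads}
bsub -J star_idx2 -w "done(star_map1)" STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirSecondPass --genomeFastaFiles /path/to/genome/fasta --sjdbFileChrStartEnd ${sj} --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN ${threads}
bsub -J star_map2 -w "done(star_idx2)" -cwd /path/to/second/pass/directory STAR --genomeDir /path/to/GenomeDirSecondPass --readFilesIn /path/to/Reads_1.fastq --outFileNamePrefix ${prefix} --runThreadN ${threads}
EOF
}

print_two_pass_plan whateverPrefix 8
```

Printing rather than submitting keeps the sketch safe to run anywhere; the printed lines can be reviewed and pasted into a shell on the cluster.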
+The parameters included in the above sample commands are:
+  * '''--sjdbOverhang <n>''' Specifies the length of the genomic sequence on each side of an annotated junction to be used in constructing the splice junctions database.  For short reads (<50 bases) use readLength - 1; otherwise the generic value of 100 works as well (see the STAR manual for more information).
+  * '''--sjdbGTFfile <GTF_file.gtf>''' Supplies STAR with a GTF file during the genomeGenerate step.  Combined with the --sjdbScore <n> option during mapping, this biases the alignment toward annotated junctions and reduces alignment to pseudogenes.
+  * '''--runMode <alignReads, genomeGenerate>'''   "alignReads" does the actual mapping. "genomeGenerate" generates the genomeDir required for mapping (default = alignReads).
+  * '''--genomeDir </path/to/GenomeDir>'''  Specifies the path to the directory used for storing the genome information created in the genomeGenerate step.
+  * '''--genomeFastaFiles <genome FASTA files>''' Specifies genome FASTA files to be used.
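+  * '''--sjdbFileChrStartEnd <output from first pass>''' Specifies the path to the file with the genomic coordinates of introns, here the SJ.out.tab file written by the first-pass mapping.  STAR prepends the --outFileNamePrefix value to this file name.
+  * '''--readFilesIn <Reads_1.fastq Reads_2.fastq>''' Specifies the FASTQ files containing the reads: one file for single-end reads, two for paired-end reads.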
+  * '''--runThreadN <n> ''' Specifies the number of threads to use.
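The --sjdbOverhang rule described above can be sketched as a small shell helper. This is an illustration only: the function name `suggest_sjdb_overhang` is hypothetical, it reads an uncompressed FASTQ file, and it inspects only the first read, so it assumes all reads have the same length.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print a suggested --sjdbOverhang for a FASTQ file,
# using readLength - 1 for reads shorter than 50 bases and 100 otherwise.
suggest_sjdb_overhang() {
    local fastq=$1 read_len
    # Line 2 of the first FASTQ record holds the read sequence.
    read_len=$(awk 'NR == 2 { print length($0); exit }' "$fastq")
    if [ -z "$read_len" ]; then
        echo "could not read a sequence from $fastq" >&2
        return 1
    fi
    if [ "$read_len" -lt 50 ]; then
        echo $(( read_len - 1 ))
    else
        echo 100
    fi
}

# Example call (placeholder path taken from the commands above):
# suggest_sjdb_overhang /path/to/Reads_1.fastq
```

The printed value can then be passed to --sjdbOverhang in both genomeGenerate commands.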
+For the STAR 2-pass mapping procedure, please see our mapping SOP, under STAR.
 '''2 - Replace ReadGroups, mark duplicate reads, clip intron overhangs and reassign mapping qualities with [http://broadinstitute.github.io/picard/ Picard Tools] and [https://software.broadinstitute.org/gatk/download/ GATK]'''