Changes between Version 63 and Version 64 of SOP/CallingVariantsRNAseq


Timestamp:
09/06/17 09:57:11 (7 years ago)
Author:
krichard
Comment:

--

  • SOP/CallingVariantsRNAseq

    v63 v64  
    1515\\
    1616
     17
    1718'''1 - Run the STAR 2-pass procedure to map reads to the reference genome.'''
    1819
    19 '''Index the reference genome for First Pass.'''
    20 
    21 
    22 
    23         Create a folder named "FirstPass" before running these commands.
    24 
    25 To generate genome index files for STAR:
    26 {{{
    27 
    28 bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirFirstPass --genomeFastaFiles /path/to/genome/fasta --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8
    29 }}}
    30 
    31 
    32 To map your reads:
    33     '''Run this command within the FirstPass directory'''
    34 {{{
    35 bsub STAR --genomeDir /path/to/GenomeDirFirstPass --readFilesIn /path/to/Reads_1.fastq  --outFileNamePrefix whateverPrefix --runThreadN 8
    36 }}}
    37 
    38 '''Index the reference genome for Second Pass.'''
    39         Create a folder named "SecondPass" before running these commands.
    40 
    41 To generate the 2nd pass genome index files for STAR:
    42 {{{
    43 
    44 bsub STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDirSecondPass --genomeFastaFiles /path/to/genome/fasta --sjdbFileChrStartEnd /path/to/first/pass/directory/SJ.out.tab --sjdbGTFfile /path/to/GTF/FileName.gtf --sjdbOverhang 100 --runThreadN 8
    45 }}}
    46 
    47 To map your reads:
    48     '''Run this command within the SecondPass directory'''
    49 {{{
    50 # Input format: FASTQ; output format: SAM
    51 bsub STAR --genomeDir /path/to/GenomeDirSecondPass --readFilesIn /path/to/Reads_1.fastq  --outFileNamePrefix whateverPrefix --runThreadN 8
    52 }}}
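The four commands above run in a fixed order: index, map, re-index with the first-pass junctions, then map again. The following is a minimal dry-run sketch of that order, assuming the placeholder paths and prefix from the commands above. The function names (`star_index`, `star_map`, `two_pass_plan`) are hypothetical helpers, not part of STAR or LSF. It only prints the `bsub` lines; because each step needs the output of the previous one, submit a step only after the earlier job has finished.

```shell
#!/bin/sh
# Dry-run sketch: prints the STAR 2-pass commands in order instead of submitting them.
# All paths and the prefix are placeholders taken from the SOP text.
GENOME_FASTA=/path/to/genome/fasta
GTF=/path/to/GTF/FileName.gtf
READS=/path/to/Reads_1.fastq
THREADS=8

star_index() {
    # $1 = genome directory; any remaining arguments are appended as extra STAR options
    dir=$1
    shift
    echo bsub STAR --runMode genomeGenerate --genomeDir "$dir" \
        --genomeFastaFiles "$GENOME_FASTA" --sjdbGTFfile "$GTF" \
        --sjdbOverhang 100 --runThreadN "$THREADS" "$@"
}

star_map() {
    # $1 = genome directory, $2 = output file name prefix
    echo bsub STAR --genomeDir "$1" --readFilesIn "$READS" \
        --outFileNamePrefix "$2" --runThreadN "$THREADS"
}

two_pass_plan() {
    star_index /path/to/GenomeDirFirstPass
    star_map /path/to/GenomeDirFirstPass whateverPrefix
    # The second index adds the splice junctions found during the first pass.
    star_index /path/to/GenomeDirSecondPass \
        --sjdbFileChrStartEnd /path/to/first/pass/directory/SJ.out.tab
    star_map /path/to/GenomeDirSecondPass whateverPrefix
}

two_pass_plan
```

Review the printed lines, then paste them one at a time once the preceding job has completed.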
    53 
    54 
    55 The parameters included in the above sample commands are:
    56   * '''--sjdbOverhang <n>''' Specifies the length of the genomic sequence on each side of an annotated junction used to build the splice junction database. For short reads (<50 bases), use readLength - 1; otherwise, a generic value of 100 works well (see the STAR manual for more information).
    57   * '''--sjdbGTFfile <GTF_file.gtf>''' Supplies STAR with a GTF file during the genomeGenerate step. Combined with the --sjdbScore <n> option during mapping, this biases alignment toward annotated junctions and reduces alignment to pseudogenes.
    58   * '''--runMode <alignReads, genomeGenerate>'''   "alignReads" does the actual mapping. "genomeGenerate" generates the genomeDir required for mapping (default = alignReads).
    59   * '''--genomeDir </path/to/GenomeDir>'''  Specifies the path to the directory used for storing the genome information created in the genomeGenerate step.
    60   * '''--genomeFastaFiles <genome FASTA files>''' Specifies genome FASTA files to be used.
    61   * '''--sjdbFileChrStartEnd <output from first pass>''' Specifies the path to the SJ.out.tab file written by the first pass, which holds the genomic coordinates of the introns (splice junctions).
    62   * '''--readFilesIn <Reads_1.fastq Reads_2.fastq>''' Specifies the FASTQ files containing the reads: one file for single-end data, two for paired-end data.
    63   * '''--runThreadN <n> ''' Specifies the number of threads to use.
    64 
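The rule of thumb given for --sjdbOverhang can be written as a small shell helper. This is a sketch of the rule exactly as stated in the list above (readLength - 1 for reads shorter than 50 bases, otherwise 100); the function name `sjdb_overhang` is hypothetical and is not a STAR option.

```shell
#!/bin/sh
# Hypothetical helper: picks an --sjdbOverhang value from the read length,
# following the rule of thumb stated in this SOP.
sjdb_overhang() {
    # $1 = read length in bases
    if [ "$1" -lt 50 ]; then
        # Short reads: use readLength - 1
        echo $(( $1 - 1 ))
    else
        # Longer reads: the generic value of 100
        echo 100
    fi
}
```

For example, with 36-base reads the value can be passed inline as `--sjdbOverhang "$(sjdb_overhang 36)"`, which expands to `--sjdbOverhang 35`.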
     20For the STAR 2-pass mapping procedure, please see our mapping SOP, under STAR. 
    6521
    6622'''2 - Replace ReadGroups, mark duplicate reads, clip intron overhangs, and reassign mapping qualities with [http://broadinstitute.github.io/picard/ Picard Tools] and [https://software.broadinstitute.org/gatk/download/ GATK]'''