Changes between Version 9 and Version 10 of SOPs/mapping

Timestamp:: 09/12/13 10:02:17 (12 years ago)
Author:: gbell
Comment:: --

SOPs/mapping

-              v9
+              v10
 }}}
 The parameters included in the sample command are:
+The parameters included in the sample command:
   * '''-L <int>'''     length of seed substrings; must be >3 and <32 (default=22)
   * '''-N <int>'''     max # mismatches in seed alignment; can be 0 or 1 (default=0)
-  * '''--phred64'''    (if input quals are from GA Pipeline ver. >= 1.3 and before Illumina 1.8)  See the table at the top of FastQC output to identify the "encoding" scale [[br]]
   * '''-S'''           name of SAM output file
+Choices for fastq encoding (which is listed as "Encoding" in the top "Basic Statistics" table of the FastQC output file).  See the [http://en.wikipedia.org/wiki/FASTQ_format FASTQ format page] for more details.
+  * '''--solexa-quals'''         (for input quality scores from Illumina versions 1.2 and earlier)
+  * '''--phred64'''     (for input quality scores from Illumina versions 1.3-1.7)
+  * '''--phred33'''         (default "Sanger format"; for input quality scores from Illumina versions 1.8 and later)
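The version ranges in the list above can be written as a small lookup. This is only a sketch: the function name `bowtie2_quality_flag` and the `(major, minor)` tuple input are assumptions made for illustration and are not part of bowtie2 itself.

```python
def bowtie2_quality_flag(illumina_version):
    """Return the bowtie2 quality-encoding flag for an Illumina version.

    illumina_version is a (major, minor) tuple, e.g. (1, 5) for version 1.5.
    The cut-offs mirror the list above; this helper is hypothetical.
    """
    if illumina_version <= (1, 2):
        return "--solexa-quals"   # Illumina 1.2 and earlier
    if illumina_version <= (1, 7):
        return "--phred64"        # Illumina 1.3-1.7
    return "--phred33"            # Illumina 1.8 and later (bowtie2 default)


for version in [(1, 0), (1, 3), (1, 7), (1, 8)]:
    print(version, bowtie2_quality_flag(version))
```

The version itself still has to be read from the "Encoding" row of the FastQC "Basic Statistics" table.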
 bowtie2 can also perform local alignments where the unaligned end(s) of a read are clipped (so, for example, remaining adapter won't prevent alignment) by adding the argument '''--local'''.
 …
 Sample command:
 {{{
 bsub tophat -o s_7_tophat_out --phred64-quals --no-novel-juncs --segment-length 20 -G /nfs/genomes/mouse_gp_jul_07_no_random/gtf/Mus_musculus.NCBIM37.67_noNT.gtf /nfs/genomes/mouse_gp_jul_07_no_random/bowtie/mm9 s_7.txt
+bsub tophat -o s_7_tophat_out --phred64-quals --segment-length 20 -G /nfs/genomes/mouse_gp_jul_07_no_random/gtf/Mus_musculus.NCBIM37.67_noNT.gtf --no-novel-juncs /nfs/genomes/mouse_gp_jul_07_no_random/bowtie/mm9 s_7.txt
 }}}
 The parameters included in the sample command are:
   * '''-o/--output-dir <word>'''     All output files will be created in this directory (default = tophat_out)
+  * '''--solexa-quals'''             (if input quals are from GA Pipeline ver. < 1.3)  See the table at the top of FastQC output to identify the "encoding" scale [[br]]
+  * '''--phred64-quals''' or '''--solexa1.3-quals'''   (if input quals are from GA Pipeline ver. >= 1.3 and before Illumina 1.8)  See the table at the top of FastQC output to identify the "encoding" scale [[br]]
+  * '''--segment-length <int>'''  Shortest length of a spliced read that can map to one side of the junction.  For reads shorter than ~45 nt, set this to half the read length (so set '--segment-length 20' for 40-nt reads).  For longer reads, the default length (25) can be used.
+  * '''-G <GTF file>''' Supply TopHat with a GTF file of transcript models.  This can help TopHat identify junctions that may otherwise be missed.
+  * '''--no-novel-juncs''' Only look for spliced reads across junctions in the supplied GTF file.  Typically not used.
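The rule of thumb for '''--segment-length''' given above can be sketched as a short helper. The function and constant names are hypothetical, and the 45-nt cutoff is only approximate ("~45 nt" in the text), so treat the result as a starting point rather than a fixed rule.

```python
DEFAULT_SEGMENT_LENGTH = 25  # TopHat default quoted in the list above
SHORT_READ_CUTOFF = 45       # approximate cutoff ("~45 nt") from the list above


def suggested_segment_length(read_length):
    """Suggest a --segment-length value for a given read length (in nt)."""
    if read_length < SHORT_READ_CUTOFF:
        # Short reads: use half the read length (integer division).
        return read_length // 2
    # Longer reads: the default is fine.
    return DEFAULT_SEGMENT_LENGTH


for read_length in (36, 40, 75):
    print(read_length, suggested_segment_length(read_length))
```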
+Choices for fastq encoding (which is listed as "Encoding" in the top "Basic Statistics" table of the FastQC output file).  See the [http://en.wikipedia.org/wiki/FASTQ_format FASTQ format page] for more details.
+  * '''--solexa-quals'''         (for input quality scores from Illumina versions 1.2 and earlier)
+  * '''--solexa1.3-quals''' or '''--phred64-quals'''     (for input quality scores from Illumina versions 1.3-1.7)
+'''[http://tophat.cbcb.umd.edu/ tophat version 2]'''
+TopHat version 2 uses bowtie2, rather than bowtie, for its mapping.
+Sample command:
+{{{
+bsub tophat -o s_7_tophat_out --phred64-quals --segment-length 20 -G /nfs/genomes/mouse_gp_jul_07_no_random/gtf/Mus_musculus.NCBIM37.67_noNT.gtf --no-novel-juncs /nfs/genomes/mouse_gp_jul_07_no_random/bowtie/mm9 s_7.txt
+}}}
+The parameters included in the sample command are:
+  * '''-o/--output-dir <word>'''     All output files will be created in this directory (default = tophat_out)
+  * '''--segment-length <int>'''  Shortest length of a spliced read that can map to one side of the junction.  For reads shorter than ~45 nt, set this to half the read length (so set '--segment-length 20' for 40-nt reads).  For longer reads, the default length (25) can be used.
+  * '''-G <GTF file>''' Supply TopHat with a GTF file of transcript models.  This can help TopHat identify junctions that may otherwise be missed.
+  * '''--no-novel-juncs''' Only look for spliced reads across junctions in the supplied GTF file.  Typically not used.
+Choices for fastq encoding (which is listed as "Encoding" in the top "Basic Statistics" table of the FastQC output file).  See the [http://en.wikipedia.org/wiki/FASTQ_format FASTQ format page] for more details.
+  * '''--solexa-quals'''         (for input quality scores from Illumina versions 1.2 and earlier)
+  * '''--solexa1.3-quals''' or '''--phred64-quals'''     (for input quality scores from Illumina versions 1.3-1.7)
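When the same command is run for many lanes, it can help to assemble it from its parts. The sketch below rebuilds the sample command from the options described above; the function name `tophat_command` and its argument names are assumptions for illustration, and the paths are simply those of the sample command.

```python
import shlex


def tophat_command(output_dir, quality_flag, segment_length, gtf, index, reads,
                   no_novel_juncs=False):
    """Build a bsub/tophat argument list (hypothetical helper)."""
    args = ["bsub", "tophat", "-o", output_dir, quality_flag,
            "--segment-length", str(segment_length), "-G", gtf]
    if no_novel_juncs:
        # Restrict spliced alignments to junctions in the GTF file.
        args.append("--no-novel-juncs")
    # Positional arguments come last: index basename, then the reads file.
    args += [index, reads]
    return args


command_line = shlex.join(tophat_command(
    output_dir="s_7_tophat_out",
    quality_flag="--phred64-quals",
    segment_length=20,
    gtf="/nfs/genomes/mouse_gp_jul_07_no_random/gtf/Mus_musculus.NCBIM37.67_noNT.gtf",
    index="/nfs/genomes/mouse_gp_jul_07_no_random/bowtie/mm9",
    reads="s_7.txt",
    no_novel_juncs=True,
))
print(command_line)
```

Building the command as a list and joining it with `shlex.join` (Python 3.8 or later) keeps any file names containing spaces correctly quoted.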