Changes between Version 19 and Version 20 of SOPs/SAMBAMqc
- Timestamp: 04/26/17 11:15:41
SOPs/SAMBAMqc
v19 v20
17  17  If paired-end insert size or distance is unknown or need to be verified, it can be extracted from a BAM/SAM file after running Bowtie.
18  18
19      When mapping with bowtie (or another mapper), the insert size can often be included as an input parameter (example for bowtie: -X 500), which can help with mapping. See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping |mapping SOP]] for mapping details.
    19  When mapping with bowtie (or another mapper), the insert size can often be included as an input parameter (example for bowtie: -X 500), which can help with mapping. See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping mapping SOP]] for mapping details.
20  20
21  21
…   …
37  37  }}}
38  38
39      Method 2: Calculate insert sizes with CollectInsertSizeMetrics function from picard (http://picard.sourceforge.net). This is also a good approximation for RNA samples.
    39  Method 2: Calculate insert sizes with CollectInsertSizeMetrics function from [http://broadinstitute.github.io/picard/ picard]. This is also a good approximation for RNA samples.
40  40  {{{
41  41  #
…   …
48  48  }}}
49  49
50      You might need to specify a different java path if above command is not working. On local tak, you can use /usr/local/jre1.8.0_72/bin/java.
    50  You might need to specify a different java path if above command is not working. On local tak, you can use /usr/local/jre1.8/bin/java
51  51
52  52  == QC to get a (visual) summary of mapping statistics. For eg. coverage/distribution of mapped reads across the genome or transcriptome ==
53  53
54      ==== [http://broadinstitute.github.io/picard/ | Picard]:CollectRnaSeqMetrics.jar to find coverage across gene body for 5' or 3' bias ====
    54  ==== Use [http://broadinstitute.github.io/picard/ Picard] CollectRnaSeqMetrics.jar to find coverage across gene body for 5' or 3' bias ====
55  55  ==== [RNA-seq only] Get global coverage profile across transcripts ====
56  56
…   …
71  71
72  72
73      ==== [http://qualimap.bioinfo.cipf.es/ | QualiMap]:can be used on DNA or RNA-Seq to get summary of mapping and coverage/distribution ====
    73  ==== [http://qualimap.bioinfo.cipf.es/ QualiMap] can be used on DNA or RNA-Seq to get summary of mapping and coverage/distribution ====
74  74  {{{
75  75  # Graphical interface: enter 'qualimap' on the command line
…   …
98  98
99  99
100     ==== [http://rseqc.sourceforge.net/ | RSeQC]:RNA-Seq quality control package for getting mapping statistics (eg. unique/multi-mapped reads) ====
    100 ==== [http://rseqc.sourceforge.net/ RSeQC] is a RNA-Seq quality control package for getting mapping statistics (eg. unique/multi-mapped reads) ====
101 101 {{{
102 102 bam_stat.py -i myFile.bam
…   …
104 104 for bamFile in `/bin/ls *.bam`; do bsub "bam_stat.py -i $bamFile > $bamFile.bam_stat.txt"; done
105 105 }}}
106     ==== infer_experiment.py from RseQC package: Check if/how your RNA-seq reads are stranded. ====
    106 ==== Use infer_experiment.py from the RseQC package to check if/how your RNA-seq reads are stranded. ====
107 107 {{{
108 108
…   …
177 177 }}}
178 178
179     Figure from [http://onetipperday.sterding.com/2012/07/how-to-tell-which-library-type-to-use.html |Tophat/Bowtie library options ]
    179 Figure from [http://onetipperday.sterding.com/2012/07/how-to-tell-which-library-type-to-use.html Tophat/Bowtie library options ]
180 180 [[Image(tophat_library.png,500px)]]
181 181
…   …
184 184 == Graphically analyze read duplication ==
185 185
186     The R/Bioconductor package [https://www.bioconductor.org/packages/release/bioc/html/dupRadar.html |dupRadar] can do this, analyzing a BAM file that has had duplicates flagged (such as with Picard's MarkDuplicates tool).
187
188     A set of commands can be run with an R script by the package authors available from their [https://www.bioconductor.org/packages/release/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#including-dupradar-into-pipelines |Using the dupRadar package] page.
    186 The R/Bioconductor package [https://www.bioconductor.org/packages/release/bioc/html/dupRadar.html dupRadar] can do this, analyzing a BAM file that has had duplicates flagged (such as with Picard's MarkDuplicates tool).
    187
    188 A set of commands can be run with an R script by the package authors available from their [https://www.bioconductor.org/packages/release/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#including-dupradar-into-pipelines Using the dupRadar package] page.
189 189
190 190 A BaRC script (/nfs/BaRC_Public/BaRC_code/R/dupRadar/dupRadar.R) does both the duplicate marking and the analysis with a command like
…   …
196 196 == Interpreting quality control issues ==
197 197
198     See [[https://sequencing.qcfail.com/ |QC Fail Sequencing]] from the Babraham Institute
    198 See [[https://sequencing.qcfail.com/ QCFAIL.com]] from the Babraham Institute
199 199
200 200 \\
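
Editor's illustration (not part of either wiki version): line 17 of the page says an unknown paired-end insert size can be extracted from a BAM/SAM file. The SOP's own first method is elided in this diff, so the shell sketch below is only one possible approach, assuming the mapper filled in the TLEN field (column 9 of the SAM format). It reports the median TLEN over properly paired reads. The function name median_insert_size and the file name myFile.sam are hypothetical.

```shell
# Hypothetical helper (not from the SOP): median insert size of an uncompressed SAM file.
# Assumes TLEN (SAM column 9) was set by the mapper.
median_insert_size() {
    # Skip header lines; keep records flagged "properly paired" (FLAG bit 0x2)
    # with a positive TLEN, so each pair is counted once (via its leftmost mate).
    awk -F'\t' '!/^@/ && int($2 / 2) % 2 == 1 && $9 > 0 { print $9 }' "$1" |
        sort -n |
        awk '{ v[NR] = $1 }
             END {
                 if (NR == 0) { exit 1 }   # no usable pairs found
                 if (NR % 2) { print v[(NR + 1) / 2] }
                 else { print (v[NR / 2] + v[NR / 2 + 1]) / 2 }
             }'
}

# Example call (myFile.sam is a placeholder name):
# median_insert_size myFile.sam
```

For a BAM file, the records would first need to be converted to text, for example with samtools view -h myFile.bam > myFile.sam. The median is used here rather than the mean because a few chimeric or mis-mapped pairs can have very large TLEN values.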