Changes between Version 19 and Version 20 of SOPs/SAMBAMqc


Timestamp: 04/26/17 11:15:41 (8 years ago)
Author: gbell
Comment: --

  • SOPs/SAMBAMqc

    v19 v20
     17  17  If paired-end insert size or distance is unknown or need to be verified, it can be extracted from a BAM/SAM file after running Bowtie.
     18  18
     19      When mapping with bowtie (or another mapper), the insert size can often be included as an input parameter (example for bowtie: -X 500), which can help with mapping.  See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping|mapping SOP]] for mapping details.
         19  When mapping with bowtie (or another mapper), the insert size can often be included as an input parameter (example for bowtie: -X 500), which can help with mapping.  See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping mapping SOP]] for mapping details.
     20  20
     21  21

     37  37  }}}
     38  38
     39      Method 2: Calculate insert sizes with CollectInsertSizeMetrics function from picard (http://picard.sourceforge.net).  This is also a good approximation for RNA samples.
         39  Method 2: Calculate insert sizes with CollectInsertSizeMetrics function from [http://broadinstitute.github.io/picard/ picard].  This is also a good approximation for RNA samples.
     40  40  {{{
     41  41     #

     48  48  }}}
     49  49
     50      You might need to specify a different java path if above command is not working. On local tak, you can use /usr/local/jre1.8.0_72/bin/java.
         50  You might need to specify a different java path if above command is not working. On local tak, you can use /usr/local/jre1.8/bin/java
     51  51
     52  52  == QC to get a (visual) summary of mapping statistics.  For eg. coverage/distribution of mapped reads across the genome or transcriptome ==
     53  53
     54      ==== [http://broadinstitute.github.io/picard/ | Picard]: CollectRnaSeqMetrics.jar to find coverage across gene body for 5' or 3' bias ====
         54  ==== Use [http://broadinstitute.github.io/picard/ Picard] CollectRnaSeqMetrics.jar to find coverage across gene body for 5' or 3' bias ====
     55  55  ==== [RNA-seq only] Get global coverage profile across transcripts ====
     56  56

     71  71
     72  72
     73      ==== [http://qualimap.bioinfo.cipf.es/ | QualiMap]: can be used on DNA or RNA-Seq to get summary of mapping and coverage/distribution ====
         73  ==== [http://qualimap.bioinfo.cipf.es/ QualiMap] can be used on DNA or RNA-Seq to get summary of mapping and coverage/distribution ====
     74  74  {{{
     75  75  # Graphical interface: enter 'qualimap' on the command line

     98  98
     99  99
    100      ==== [http://rseqc.sourceforge.net/ | RSeQC]: RNA-Seq quality control package for getting mapping statistics (eg. unique/multi-mapped reads) ====
        100  ==== [http://rseqc.sourceforge.net/ RSeQC] is a RNA-Seq quality control package for getting mapping statistics (eg. unique/multi-mapped reads) ====
    101 101  {{{
    102 102  bam_stat.py -i myFile.bam

    104 104  for bamFile in `/bin/ls *.bam`; do bsub "bam_stat.py -i $bamFile > $bamFile.bam_stat.txt"; done
    105 105  }}}
    106      ==== infer_experiment.py from RseQC package: Check if/how your RNA-seq reads are stranded. ====
        106  ==== Use infer_experiment.py from the RseQC package to check if/how your RNA-seq reads are stranded. ====
    107 107  {{{
    108 108

    177 177  }}}
    178 178
    179      Figure from [http://onetipperday.sterding.com/2012/07/how-to-tell-which-library-type-to-use.html | Tophat/Bowtie library options ]
        179  Figure from [http://onetipperday.sterding.com/2012/07/how-to-tell-which-library-type-to-use.html Tophat/Bowtie library options ]
    180 180         [[Image(tophat_library.png,500px)]]
    181 181

    184 184  == Graphically analyze read duplication ==
    185 185
    186      The R/Bioconductor package [https://www.bioconductor.org/packages/release/bioc/html/dupRadar.html | dupRadar] can do this, analyzing a BAM file that has had duplicates flagged (such as with Picard's MarkDuplicates tool).
    187
    188      A set of commands can be run with an R script by the package authors available from their [https://www.bioconductor.org/packages/release/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#including-dupradar-into-pipelines | Using the dupRadar package] page.
        186  The R/Bioconductor package [https://www.bioconductor.org/packages/release/bioc/html/dupRadar.html dupRadar] can do this, analyzing a BAM file that has had duplicates flagged (such as with Picard's MarkDuplicates tool).
        187
        188  A set of commands can be run with an R script by the package authors available from their [https://www.bioconductor.org/packages/release/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#including-dupradar-into-pipelines Using the dupRadar package] page.
    189 189
    190 190  A BaRC script (/nfs/BaRC_Public/BaRC_code/R/dupRadar/dupRadar.R) does both the duplicate marking and the analysis with a command like

    196 196  == Interpreting quality control issues ==
    197 197
    198      See [[https://sequencing.qcfail.com/|QC Fail Sequencing]] from the Babraham Institute
        198  See [[https://sequencing.qcfail.com/ QCFAIL.com]] from the Babraham Institute
    199 199
    200 200  \\
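
Reviewer's illustration (not part of either page version): the first paragraph of the page says the insert size can be extracted from a BAM/SAM file. The sketch below shows one way that could be done from SAM text, assuming the standard SAM layout in which column 2 is FLAG and column 9 is TLEN (template length). The function name `insert_size_summary` and the `max_size` parameter are hypothetical, and this is not the elided "Method 1" of the SOP nor a replacement for Picard's CollectInsertSizeMetrics.

```python
import statistics


def insert_size_summary(sam_lines, max_size=None):
    """Summarize template lengths (TLEN) from SAM-formatted text lines.

    Assumptions (not taken from the SOP): only properly paired reads
    (FLAG bit 0x2) are counted, and only the mate with a positive TLEN,
    so that each pair contributes exactly one value.
    """
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:          # not a complete alignment record
            continue
        flag = int(fields[1])         # column 2: bitwise FLAG
        tlen = int(fields[8])         # column 9: signed template length
        if not flag & 0x2:            # keep properly paired reads only
            continue
        if tlen <= 0:                 # the other mate carries -TLEN
            continue
        if max_size is not None and tlen > max_size:
            continue                  # optional cap on extreme values
        sizes.append(tlen)
    if not sizes:
        return None
    return {
        "pairs": len(sizes),
        "median": statistics.median(sizes),
        "mean": statistics.mean(sizes),
    }
```

One possible use is to pipe the output of `samtools view myFile.bam` into a small script that calls `insert_size_summary(sys.stdin)`. For RNA-seq samples, pairs spanning introns inflate TLEN, so the median is likely a more useful summary than the mean there.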