Changes between Version 44 and Version 45 of SOPs/rna-seq-diff-expressions

Timestamp: 06/07/17 15:04:22
Author: gbell
Comment: --

SOPs/rna-seq-diff-expressions

-              v44
+              v45
       * featureCounts is much faster than htseq-count, but the details of its counting method are quite different from those of htseq-count, especially for paired-end reads
       * See [[http://www.ncbi.nlm.nih.gov/pubmed/24227677|Liao et al., 2014]] for details of the method (and comparisons with other counting tools)
+      * featureCounts needs a paired-end BAM file to be sorted by read ID; if the file is not sorted that way, featureCounts does the sorting itself.
       * Sample commands:
 {{{
+#default: unstranded
+#single-end reads
+# single-end reads (unstranded)
 featureCounts -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
 #PE reads
+# paired-end reads (unstranded)
 featureCounts -p -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
+#stranded (fwd)
+featureCounts -p -s 1 -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
+# paired-end reads (forward stranded)
+featureCounts -p -s 1 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
+# paired-end reads (reverse stranded)
+featureCounts -p -s 2 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
 }}}
     * For some analyses (or for visualization), you can add a pseudocount (such as 1 or another small number) to every gene in every sample. This avoids dividing by 0 when computing log2 ratios and reduces noise from background counts. Be aware, however, that some statistical methods (like DESeq) require raw input values without any pseudocounts or normalization.
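As a rough illustration of the pseudocount idea, here is a minimal sketch that computes per-gene log2 ratios from a featureCounts table. It assumes the table was produced from two BAM files, so that the per-sample counts sit in columns 7 and 8 after the annotation columns. The helper name `log2_ratio_with_pseudocount`, the demo file, the gene IDs, and the counts are all made up for illustration. The result is meant for exploration or plotting only, not as input to methods such as DESeq that need raw counts.

```shell
# Hypothetical helper: log2(sample2 / sample1) per gene after adding a pseudocount.
# Assumes featureCounts layout: a '#' comment line, a header line starting with
# "Geneid", then counts from column 7 onward (two samples: columns 7 and 8).
log2_ratio_with_pseudocount() {
  # $1 = featureCounts table, $2 = pseudocount (defaults to 1)
  awk -F'\t' -v pc="${2:-1}" '
    /^#/ || $1 == "Geneid" { next }   # skip comment and header lines
    { printf "%s\t%.3f\n", $1, log(($8 + pc) / ($7 + pc)) / log(2) }
  ' "$1"
}

# Made-up demo table (not real data), in the same column layout.
demo=$(mktemp)
printf '# Program:featureCounts (demo)\n' > "$demo"
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tA.bam\tB.bam\n' >> "$demo"
printf 'geneA\tchr1\t1\t100\t+\t100\t0\t7\n' >> "$demo"
printf 'geneB\tchr1\t200\t300\t+\t101\t3\t3\n' >> "$demo"
printf 'geneC\tchr2\t1\t50\t-\t50\t15\t0\n' >> "$demo"

log2_ratio_with_pseudocount "$demo" 1
```

With a pseudocount of 1, the demo gene with 0 reads in the first sample and 7 in the second gives log2(8/1) = 3 instead of a division by zero. A larger pseudocount pulls ratios for low-count genes further toward 0, so the value chosen is a judgment call for the analysis at hand.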
     * **NOTE:**