Changes between Version 44 and Version 45 of SOPs/rna-seq-diff-expressions


Timestamp:
06/07/17 15:04:22 (8 years ago)
Author:
gbell
Comment:

--

  • SOPs/rna-seq-diff-expressions

    v44 v45  
    5757      * featureCounts is much faster than htseq-count, but the details of its counting method are quite different from those of htseq-count, especially for paired-end reads
    5858      * See [[http://www.ncbi.nlm.nih.gov/pubmed/24227677|Liao et al., 2014]] for details of the method (and comparisons with other counting tools)
     59      * featureCounts needs the paired-read BAM file to be sorted by read ID, but if it isn't, it'll do the sorting.
    5960      * Sample commands:
    6061{{{
    61 #default: unstranded
    62 #single-end reads
     62# single-end reads (unstranded)
    6363featureCounts -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
    64 #PE reads
     64# paired-end reads (unstranded)
    6565featureCounts -p -a gene_annotations.gtf -o MySample.featureCounts.txt MySample.bam
    6666
    67 #stranded (fwd)
    68 featureCounts -p -s 1 -a gene_anotations.gtf -o MySample.featureCounts.txt MySample.bam
    69 
     67# paired-end reads (forward stranded)
     68featureCounts -p -s 1 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
     69# paired-end reads (reverse stranded)
     70featureCounts -p -s 2 -a gene_annotations.gtf -o MySamples.featureCounts.txt *sortedByName.bam
    7071}}}
     72
    7173    * For some analyses (or for visualization), you can add a pseudocount (such as 1 or another small number) to all genes in all samples. This avoids log2 ratios that would require dividing by 0 and dampens noise from low background counts. BUT be aware that some statistical methods (like DESeq) require raw counts as input, without any pseudocounts or normalization.
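The pseudocount idea above can be sketched with a small shell helper. This is only an illustration for visualization-style log2 ratios, not input for DESeq: the function name `log2_ratio_with_pseudocount`, the three-column layout (gene, count in sample A, count in sample B), and the demo counts are all assumptions made for this example and are not part of featureCounts or its output format.

```shell
# Hypothetical helper (not part of featureCounts): reads tab-delimited lines of
#   gene <TAB> count_in_sample_A <TAB> count_in_sample_B
# from stdin and prints gene and log2((B + pc) / (A + pc)).
# The pseudocount pc is the first argument and defaults to 1.
log2_ratio_with_pseudocount() {
    awk -v pc="${1:-1}" 'BEGIN { FS = OFS = "\t" }
        { printf "%s\t%.3f\n", $1, log(($3 + pc) / ($2 + pc)) / log(2) }'
}

# Made-up counts: without a pseudocount, geneA would divide by 0
# and geneB would take the log of 0.
printf 'geneA\t0\t7\ngeneB\t15\t0\ngeneC\t100\t400\n' | log2_ratio_with_pseudocount 1
```

With a pseudocount of 1, geneA becomes log2(8/1) and geneB becomes log2(1/16), so both ratios are finite. A larger pseudocount pulls low-count genes more strongly toward a ratio of 0, which is the noise-dampening effect described above.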
    7274    * **NOTE:**