Changes between Version 3 and Version 4 of SOPs/rna-seq-diff-expressions


Timestamp:
03/25/13 15:26:52
Author:
gbell
Comment:

--

  • SOPs/rna-seq-diff-expressions

    v3 v4  
    40 40         * ''bsub  "samtools sort -n -m 5000000000 accepted_hits.bam accepted_hitsSortedByname"''
    41 41         * ''bsub  "samtools view accepted_hitsSortedByname.bam | htseq-count -m intersection-strict --stranded=no - gene_models.gtf >| gene_model.counts.txt"''
    42     * Add a pseudocount (such as 1) to all genes in all samples to
    43       * prevent log2 ratios that require dividing by 0
    44       * reduce background count noise
    45       * reduce problems with statistical methods that don't like 0's
    46 42    * Remove the rows at the bottom with descriptions like no_feature, ambiguous, etc.
     43    * For custom analyses, you can add a pseudocount (such as 1) to all genes in all samples. This prevents log2 ratios that require dividing by 0 and reduces background count noise. Be aware, however, that some statistical methods (like DESeq) require raw input values without any pseudocounts or normalization.
    47 44
    48 45  * **Quantification of FPKM values**
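The count cleanup and pseudocount steps above can be sketched in Python. This is only an illustration for custom analyses, not part of the SOP itself: the function names `parse_counts` and `log2_ratios` and the toy gene IDs are hypothetical, and the list of summary rows assumes the usual htseq-count labels (newer versions prefix them with `__`). Tools such as DESeq should still be given the raw counts.

```python
import math

# Summary rows that htseq-count appends after the per-gene counts (assumed list;
# newer versions prefix these labels with "__").
SUMMARY_ROWS = {"no_feature", "ambiguous", "too_low_aQual",
                "not_aligned", "alignment_not_unique"}


def parse_counts(text):
    """Parse htseq-count output (gene<TAB>count), skipping the summary rows."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, value = line.split("\t")
        if gene.lstrip("_") in SUMMARY_ROWS:
            continue
        counts[gene] = int(value)
    return counts


def log2_ratios(treated, control, pseudocount=1):
    """log2((treated + pseudocount) / (control + pseudocount)) for shared genes."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


# Toy input standing in for two gene_model.counts.txt files (hypothetical values).
control_text = "geneA\t0\ngeneB\t7\nno_feature\t100\n"
treated_text = "geneA\t3\ngeneB\t15\n__no_feature\t90\n"

control = parse_counts(control_text)
treated = parse_counts(treated_text)
ratios = log2_ratios(treated, control)
print(ratios)
```

Without the pseudocount, geneA (0 reads in the control) would require dividing by 0; with it, the ratio is simply (3 + 1) / (0 + 1).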
     
    58 55
    59 56  * **Gene filtering**
    60     * Remove from the analysis any genes with 0 counts (or counts = pseudocounts).
    61     * Remove counts for any genes we want to classify as contaminants such as ribosomal RNAs (if these are included in the GTF gene annotation file).
     57    * Remove from the analysis any genes with 0 counts.
     58    * Remove counts for any genes we want to classify as contaminants or simply "not interesting" such as ribosomal RNAs (if these are included in the GTF gene annotation file).
    62 59
    63 60  * **Normalization**
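A minimal sketch of the gene filtering step, under two assumptions that the SOP does not spell out: "0 counts" is taken to mean zero in every sample, and the unwanted genes (for example ribosomal RNAs) are supplied as a set of gene IDs. The function name `filter_genes` and the gene IDs are hypothetical.

```python
def filter_genes(counts, unwanted=frozenset()):
    """Keep genes that are not flagged as unwanted and have a nonzero count
    in at least one sample.

    counts: dict mapping gene ID -> list of raw counts, one per sample.
    unwanted: gene IDs to drop (contaminants or "not interesting" genes).
    """
    return {
        gene: values
        for gene, values in counts.items()
        if gene not in unwanted and any(v > 0 for v in values)
    }


# Toy raw counts for three samples (hypothetical values).
raw_counts = {
    "geneA": [0, 0, 0],        # no reads in any sample: removed
    "geneB": [12, 0, 30],      # kept
    "rRNA_1": [900, 850, 990], # flagged as unwanted: removed
}

kept = filter_genes(raw_counts, unwanted={"rRNA_1"})
print(kept)
```

How the unwanted set is built depends on the annotation; one option is to collect gene IDs by biotype from the GTF file, if the GTF carries that attribute.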