Changes between Version 3 and Version 4 of SOPs/rna-seq-diff-expressions
- Timestamp: 03/25/13 15:26:52
SOPs/rna-seq-diff-expressions
v3  v4
40  40    * ''bsub "samtools sort -n -m 5000000000 accepted_hits.bam accepted_hitsSortedByname"''
41  41    * ''bsub "samtools view accepted_hitsSortedByname.bam | htseq-count -m intersection-strict --stranded=no - gene_models.gtf >| gene_model.counts.txt"''
42      - * Add a pseudocount (such as 1) to all genes in all samples to
43      -   * prevent log2 ratios that require dividing by 0
44      -   * reduce background count noise
45      -   * reduce problems with statistical methods that don't like 0's
46  42    * Remove the rows at the bottom with descriptions like no_feature, ambiguous, etc.
    43  + * For custom analyses, you can add a pseudocount (such as 1) to all genes in all samples to prevent log2 ratios that require dividing by 0 AND reduce background count noise -- BUT be aware that some statistical methods (like DESeq) require raw input values without any pseudocounts or normalization.
47  44
48  45    * **Quantification of FPKM values**
…   …
58  55
59  56    * **Gene filtering**
60      - * Remove from the analysis any genes with 0 counts (or counts = pseudocounts).
61      - * Remove counts for any genes we want to classify as contaminants such as ribosomal RNAs (if these are included in the GTF gene annotation file).
    57  + * Remove from the analysis any genes with 0 counts.
    58  + * Remove counts for any genes we want to classify as contaminants or simply "not interesting" such as ribosomal RNAs (if these are included in the GTF gene annotation file).
62  59
63  60    * **Normalization**
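
The Version 4 wording separates two uses of the count table: raw counts for tools such as DESeq, and pseudocounted values for custom log2 ratios. A minimal Python sketch of that custom path is given here, assuming two-column, tab-separated htseq-count output. The function names, gene names, and counts are hypothetical and chosen only for illustration. The summary rows are matched both bare (`no_feature`) and with the `__` prefix that newer htseq-count releases use.

```python
import math

# Summary rows that htseq-count appends after the per-gene rows.
SPECIAL_ROWS = {"no_feature", "ambiguous", "too_low_aQual",
                "not_aligned", "alignment_not_unique"}


def parse_counts(lines):
    """Parse 'gene<TAB>count' lines into a dict, dropping the summary rows."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        gene, value = line.split("\t")
        # Newer htseq-count releases prefix summary rows with "__".
        name = gene[2:] if gene.startswith("__") else gene
        if name in SPECIAL_ROWS:
            continue
        counts[gene] = int(value)  # raw count, no pseudocount applied
    return counts


def log2_ratios(treated, control, pseudocount=1):
    """Return log2((treated + p) / (control + p)) per gene.

    Genes with a raw count of 0 in both samples are filtered out first,
    so the filter never has to compare against the pseudocount.
    """
    ratios = {}
    for gene in sorted(set(treated) | set(control)):
        t = treated.get(gene, 0)
        c = control.get(gene, 0)
        if t == 0 and c == 0:
            continue
        ratios[gene] = math.log2((t + pseudocount) / (c + pseudocount))
    return ratios


# Made-up example data; real input would be read from gene_model.counts.txt.
control_lines = ["geneA\t0", "geneB\t7", "geneC\t0",
                 "no_feature\t120", "ambiguous\t4"]
treated_lines = ["geneA\t3", "geneB\t15", "geneC\t0",
                 "__no_feature\t98", "__ambiguous\t6"]

control = parse_counts(control_lines)
treated = parse_counts(treated_lines)
ratios = log2_ratios(treated, control)
print(ratios)
```

The pseudocount is applied only inside `log2_ratios`, so the dictionaries returned by `parse_counts` remain raw counts that could still be written out for a method that needs unmodified input.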