Changes between Version 63 and Version 64 of SOPs/rna-seq-diff-expressions
- Timestamp: 11/03/22 09:53:38 (2 years ago)
SOPs/rna-seq-diff-expressions
v63  v64
118  118
119  119    * **Gene filtering**
120       - * Remove from the analysis any genes with 0 counts across all samples. Some analysis tools do this themselves.
121       - * Remove counts for any genes we want to classify as contaminants or simply "not interesting" such as ribosomal RNAs (if these are included in the GTF gene annotation file).
     120  + * One used to remove from the analysis any genes with 0 counts across all samples. Now most analysis tools do this themselves, so we skip this step.
     121  + * Remove counts for any genes we want to classify as contaminants or simply "not interesting" such as ribosomal RNAs or pseudogenes (if these are included in the GTF gene annotation file). If in doubt, leave them in.
122  122
123  123    * **Normalization**
…    …
132  132    * Effective library size (a more complex, but probably more valid method included in programs such as edgeR and DESeq)
133  133    * '''Differential expression statistics packages can output a matrix of normalized counts (typically using the method recommended for the accompanying statistics), so typically no additional normalization needs to be done''' (unless one want to perform further normalization, such as using gene length).
134       - * As with microarray normalization, be aware of the assumptions of each method and choose the method(s) which are most valid with your experiment.
     134  + * As with all types of normalization, be aware of the assumptions of each method and choose the method(s) which are most valid with your experiment.
135  135    * Origin of recommendation for upper-quartile normalization: [[http://www.ncbi.nlm.nih.gov/pubmed/20167110|Bullard et al., 2010]]
136  136    * Use DESeq, DESeq2, or see [[http://jura.wi.mit.edu/bio/scripts/R/normalize_DGE_matrix.R|normalize_DGE_matrix.R]] for a command-line script for count normalization
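The gene-filtering guidance in the v64 text could be sketched roughly as follows. This is a minimal illustration in plain Python, not the SOP's own script: the function name `filter_genes`, its parameters, the gene IDs, and the biotype labels are all hypothetical, and in practice the biotype would presumably come from the GTF annotation. Zero-count removal is off by default, since v64 leaves that to the analysis tools, and genes with no known biotype are kept, following "if in doubt, leave them in".

```python
def filter_genes(counts, biotypes, excluded_biotypes=frozenset({"rRNA"}),
                 drop_all_zero=False):
    """Return a filtered copy of a gene -> per-sample counts mapping.

    counts:            dict of gene ID -> list of raw counts (one per sample)
    biotypes:          dict of gene ID -> biotype label (hypothetical labels)
    excluded_biotypes: biotypes treated as contaminants / "not interesting"
    drop_all_zero:     also drop genes with 0 counts in every sample
    """
    kept = {}
    for gene, sample_counts in counts.items():
        # Genes missing from `biotypes` map to None and are therefore kept.
        if biotypes.get(gene) in excluded_biotypes:
            continue
        if drop_all_zero and not any(sample_counts):
            continue
        kept[gene] = list(sample_counts)
    return kept


# Toy data, made up for illustration only.
counts = {
    "geneA": [10, 0, 25],
    "geneB": [0, 0, 0],
    "rRNA1": [5000, 4200, 6100],
    "pseudo1": [3, 1, 0],
}
biotypes = {
    "geneA": "protein_coding",
    "geneB": "protein_coding",
    "rRNA1": "rRNA",
    "pseudo1": "pseudogene",
}

# Default: only rRNAs are removed; all-zero genes are left for the DE tool.
default_filtered = filter_genes(counts, biotypes)

# Stricter variant: also drop pseudogenes and all-zero genes.
strict_filtered = filter_genes(
    counts, biotypes,
    excluded_biotypes={"rRNA", "pseudogene"},
    drop_all_zero=True,
)

print(sorted(default_filtered))
print(sorted(strict_filtered))
```

Returning a new mapping rather than editing `counts` in place keeps the unfiltered matrix available, which helps when comparing results with and without a given class of genes.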