
Using edgeR

  • Only for experiments with replication.
  • See the edgeR home page for official documentation.
  • The algorithm comes from the lab of Gordon Smyth, the main developer of limma.
  • Input for edgeR should be a matrix of counts, not RPKM values.
    • From the edgeR manual: "RPKM values should not be used for assessing differential expression of genes between samples in edgeR. We use the raw counts, because the methods implemented in edgeR are based on the negative binomial distribution, a discrete distribution."
  • See get_DE_genes_with_edgeR.R for a command-line script to use edgeR.
  • A major edgeR upgrade (April 2010?) required a change from version 0.1 to 0.2 of the above script.
  • Output also includes image files: a boxplot, an MA plot, and a volcano plot.
  • Sample command:
    • bsub "R --vanilla < get_DE_genes_with_edgeR.R edgeR_sample_input.txt 2 5 edgeR_sample_output.txt"
  • When running edgeR, choose a value for the parameter prior.n (default: 10). prior.n determines how strongly the tagwise dispersions are smoothed towards the common dispersion.
    The suggested method is to choose prior.n so that prior.n * df is approximately 50, where degrees of freedom (df) = number of libraries - number of groups. prior.n should generally be greater than 5. For more details, see section 6.4 of the edgeR manual.
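
The prior.n rule of thumb above can be worked through in a few lines of R. This is only a minimal sketch, not part of get_DE_genes_with_edgeR.R: the function name suggest_prior_n and the example design (2 groups of 3 replicates) are hypothetical, and it assumes df means the number of libraries minus the number of groups.

```r
# Rule of thumb from the text: prior.n * df should be roughly 50,
# where df = number of libraries - number of groups (assumed reading of df).
# suggest_prior_n is a hypothetical helper, not an edgeR function.
suggest_prior_n <- function(n_libraries, n_groups, target = 50) {
  df <- n_libraries - n_groups
  if (df <= 0) {
    # No replication means no residual degrees of freedom.
    stop("Need more libraries than groups (df must be positive).")
  }
  target / df
}

# Hypothetical design: 2 groups with 3 replicates each (6 libraries).
prior_n <- suggest_prior_n(n_libraries = 6, n_groups = 2)
print(prior_n)  # 50 / 4 = 12.5
```

The resulting value would then be supplied as prior.n in the tagwise dispersion step. With few replicates the suggested value is larger, meaning more smoothing towards the common dispersion.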