| 1 | === Using DESeq === |
| 2 | * For experiments with or without replication |
| 3 | * See [[http://www.bioconductor.org/packages/devel/bioc/html/DESeq.html|the DESeq home page]] for official documentation |
| 4 | * Also see our [[http://jura.wi.mit.edu/bio/education/R2011/|Hot Topics RNA-Seq vignette]] (session 3, third part) for another discussion of the use of DESeq. |
| 5 | * Package summary: "Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution" |
| 6 | * From the DESeq publication: "DESeq owes its basic idea to edgeR, yet differs in several aspects." |
| 7 | * Input is also a matrix of counts, with gene identifiers used as row names. |
| 8 | * Sample code to use __with__ replication: |
| 9 | |
| 10 | {{{ |
| 11 | # Use package |
| 12 | library(DESeq) |
| 13 | # Read counts |
| 14 | countsTable = read.delim("Gene_counts.txt") |
| 15 | # Describe groups |
| 16 | groups = c(rep("C",3), rep("T", 3)) |
| 17 | # Make a CountDataSet |
| 18 | cds = newCountDataSet(countsTable, groups) |
| 19 | # Estimate effective library size |
| 20 | cds = estimateSizeFactors(cds) |
| 21 | # Core assumption of the method: variance is a function of expression level |
| 22 | # Estimate variance for each gene |
| 23 | cds = estimateDispersions(cds) |
| 24 | # Do stats based on a negative binomial distribution |
| 25 | results = nbinomTest(cds, "T", "C") |
| 26 | write.table(results, file="DESeq_output.txt", sep="\t", quote=F) |
| 27 | }}} |
| 28 | |
| 29 | * Sample code to use __without__ replication: |
| 30 | |
| 31 | {{{ |
| 32 | # Show that all samples are different |
| 33 | conds = c("a", "b", "c", "d", "e", "f") |
| 34 | }}} |
| 35 | |
| 36 | |
| 37 | * Then apply the same code as above except for ''estimateVarianceFunctions()'': |
| 38 | * ''cds = estimateVarianceFunctions(cds, method='blind')'' |
| 39 | |
| 40 | * Make desired comparison |
| 41 | * ''results = nbinomTest(cds, "a", "b")'' |