Changes between Initial Version and Version 1 of SOPs/diff_microarry


Ignore:
Timestamp:
01/23/13 16:49:43 (12 years ago)
Author:
trac
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • SOPs/diff_microarry

    v1 v1  
     1=== Processing After Normalization ===
     2  * Two packages in Bioconductor can be used to determined differentially expressed genes.  Significant Analysis of Microarrays (SAM) is implement by the package //siggenes// and the other is //limma// (recommended).  Also see, [[http://bioconductor.org/packages/2.7/bioc/vignettes/limma/inst/doc/usersguide.pdf | Limma User's Guide (Oct 28, 2010)]] (or a more recent version) for examples of lots of experimental designs.
     3  * Processing Affy data:
     4
     5Create an expression set (ie. an eset object) from normalized expression values or ratios, either can be used but it needs to be log-transformed (see [[http://www.bioconductor.org/help/course-materials/2005/BioC2005/labs/lab01/printing/6.appendix.pdf |limma: Linear Models for Microarray Data, Section 23.2]]).  If using the rma function, output values are log2 intensities.
     6
     7{{{
     8  library(limma)
     9  eset=rma(Data)
     10  eset.rma.expr=exprs(eset)
     11 }}}
     12 
     13
     14Create a design file which shows your targets, example
     15
     16{{{
     17  #If using read.delim, then column/header names are shifted to the left (see below)
     18  Control Experiment
     19  Sample1       0       1
     20  Sample2       0       1
     21  Sample3       1       0
     22  Sample4       1       0
     23   
     24  #read in the design file
     25  design <- read.delim("myDesignFile.txt")
     26 }}}
     27 
     28For other examples of design: [[http://iona.wi.mit.edu/bio/bioinfo/Rscripts/miRNA_1color_arrays.txt|Affy]] and [[http://iona.wi.mit.edu/bio/bioinfo/Rscripts/Affy_design_1_GB.txt|Agilent]]
     29
     30Fit linear model
     31
     32 {{{
     33rma.fit = lmFit(eset.rma.expr,design)
     34}}}
     35 
     36Set Contrast depending on question(s) being asked, eg. experiment vs control
     37
     38{{{
     39  contrast.matrix=makeContrasts(Experiment-Control, levels=design)
     40  #check if contrast is correct
     41  contrast
     42}}}
     43 
     44Get linear model fit for the contrasts.  Use eBayes to estimate the variance using emperical Bayes approach.
     45
     46{{{
     47  rma.fit2=contrasts.fit(rma.fit, contrast.matrix)
     48  rma.fit2.ebayes = eBayes(rma.fit2)
     49}}}
     50
     51Create MA and Volcano Plot to look for outliers (ie. differentially expressed genes)
     52
     53{{{
     54  plot(rma.fit.ebayes$Amean, rma.fit2.ebayes$coefficients[,1]
     55  plot(rma.fit.ebayes$coeficients[,1], -log10(rma.fit2.ebayes$p.value[,1])
     56}}}
     57 
     58Write results to file after adjusting using FDR
     59
     60{{{
     61write.fit(rma.fit2.ebayes, file="limma.out", adjust="fdr")
     62 }}}
     63 
     64To find differentially expressed genes, sort the data by both log2 ratio and adjusted p-value.  Choose a log2 ratio and p-value cutoff (eg. abs(log2 ratio) > 1 and adjusted p-value of < 0.05), a combination of both p-values and FC should be used to select DE genes [[ https://stat.ethz.ch/pipermail/bioconductor/2007-October/019479.html | [BioC] topTable threshold on p-value and logFC, Reply from Gordon Smyth]].  Another way to examine DE genes is to look at the volcano-plot, the //tips// of "V" in the volcano would generally have these DE genes.  [[br]]
     65 
     66
     67Another Example: [[http://iona.wi.mit.edu/bio/bioinfo/Rscripts/Affy_hairpin_analysis_GB.R | Affy_hairpin_analysis_GB.R ]]
     68
     69=== Other Issues  ===
     70  * For limma, the design file that lists the files should be in the same order as what is read-in.
     71
     72==== Correcting for dye bias ====
     73
     74Example from the User's Guide (8.1.2):
     75  * design = cbind(DyeEffect=1,MUvsWT=c(1,-1,1,-1))
     76  * fit = lmFit(MA, design); fit = eBayes(fit)
     77  * To get genes with a dye effect: topTable(fit, coef="DyeEffect")
     78  * To get MUvsWT diff: topTable(fit, coef="MUvsWT")
     79  * "The fold changes and significant tests in this list are corrected for dye-effects. Including the dye-effect in the model in this way uses up one degree of freedom which might otherwise be used to estimate the residual variability, but it is valuable if many genes show non-negligible dye-effects."
     80
     81For more information, see the third session of [http://iona.wi.mit.edu/bio/education/R2011/ | An introduction to R and Bioconductor: A BaRC Short Course]