Changes between Version 21 and Version 22 of SOPs/go_annotation
- Timestamp: 09/29/20 13:19:25
SOPs/go_annotation
--- v21
+++ v22
@@ -3,30 +3,27 @@

 === DAVID ===
-[[http://david.abcc.ncifcrf.gov/home.jsp | DAVID]]

- * DAVID is generally the best place to start your enrichment analysis.
- * Instructions for using DAVID can be found under //Functional Annotation// on the DAVID web site.
- * You'll probably end up running DAVID multiple times, with different types of annotations, to get the most informative combination.
- * Full output can be downloaded and viewed as a spreadsheet.
+ * [[http://david.abcc.ncifcrf.gov/home.jsp | DAVID]] is generally the best place to start your enrichment analysis.
+ * DAVID is a tool that analyzes a subset of assayed genes, asking the general question, "What's special about these genes compared to a random list of genes of the same size?"
+ * Instructions for using DAVID can be found under //Functional Annotation// on the DAVID web site.
+ * You'll probably end up running DAVID multiple times, with different types of annotations, to get the most informative combination.
+ * Full output can be downloaded and viewed as a spreadsheet.

 === Gene Set Enrichment Analysis (GSEA) ===
-[[http://www.broadinstitute.org/gsea/index.jsp|Broad GSEA]]

-[[https://www.gsea-msigdb.org/gsea/login.jsp|Download the GSEA software and additional resources to analyze, annotate and interpret enrichment results.]]
+[[http://www.broadinstitute.org/gsea/index.jsp|GSEA]] is very different from tools like DAVID. GSEA takes as input all assayed genes, along with a metric that GSEA uses to order the genes. Then it asks the general question, "What's special about the order of these genes compared to a randomly ordered list of the same genes?" In other words, it looks for gene annotations that are enriched at the top or bottom of your ordered genes.

-[[https://www.gsea-msigdb.org/gsea/msigdb/index.jsp|Explore the Molecular Signatures Database (MSigDB), ]]a collection of annotated gene sets for use with GSEA software.
-
-[[http://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page | GSEA and MSigDB documentation]]
-
-[[http://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Using_RNA-seq_Datasets_with_GSEA | Guidelines for using RNA-seq datasets with GSEA]]
+GSEA can be run on just about any operating system (so on your own computer or on a Whitehead Linux server like tak).

-==== Ranked List====
+==== GSEAPreranked: start with a list of genes and values ====
  1. Create a two-column file with gene names as the first column and numeric values as the second column (e.g. log2 fold change, log2 ratio). The file does not need to be sorted and it should have the extension ".rnk".
    * The second column, used to rank genes, could be log2 fold change, t-statistic, or another scoring scheme that takes into account both log ratio and p-value.
  2. To run using the GUI
-   * 1. Upload your ranked file "file.rnk". Click on "Steps in GSEA analysis -> Load data"
-   * 2. Click on "Tools -> GseaPreranked"
-   * 3. Select one of the gene sets from the "Gene sets database". We recommend starting with the Hallmarks set (h.all). You can find more information about the sets [[https://www.gsea-msigdb.org/gsea/msigdb/index.jsp|here ]]
-   * 4. Select your uploaded ranked list and click the run button.
+
+   * 1. Start GSEA. On tak, the command is 'gsea'.
+   * 2. Upload your ranked file "file.rnk". Click on "Steps in GSEA analysis -> Load data"
+   * 3. Click on "Tools -> GseaPreranked"
+   * 4. Select one of the gene sets from the "Gene sets database". We recommend starting with the Hallmarks set (h.all). You can find more information about the sets [[https://www.gsea-msigdb.org/gsea/msigdb/index.jsp|here ]]
+   * 5. Select your uploaded ranked list and click the run button.
  3. To run the same type of analysis on the command line, you can see the command the GUI used by clicking the "Command" button, then run that command on your Linux machine. You will need Java 11. For Whitehead users: switch to Java 11 with this command "ml java/10" (after this, if you run "java --version" you will get "openjdk 11.0.8 2020-07-14")
 {{{
…
@@ -51,3 +48,10 @@
 [[http://rowley.mit.edu/caw_web/ssGSEAProjection/ssGSEA_caw_BIG_120314.pptx | ssGSEA Presentation from MIT BMC ]]

+[[https://www.gsea-msigdb.org/gsea/login.jsp|Download the GSEA software and additional resources to analyze, annotate and interpret enrichment results.]]
+
+[[https://www.gsea-msigdb.org/gsea/msigdb/index.jsp|Explore the Molecular Signatures Database (MSigDB), ]]a collection of annotated gene sets for use with GSEA software.
+
+[[http://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page | GSEA and MSigDB documentation]]
+
+[[http://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Using_RNA-seq_Datasets_with_GSEA | Guidelines for using RNA-seq datasets with GSEA]]
 [[http://software.broadinstitute.org/webservices/gpModuleRepository/download/prod/module/?file=/ssGSEAProjection/broad.mit.edu:cancer.software.genepattern.module.analysis/00270/7.6/ssGSEAProjection.zip | Broad's ssGSEA from GenePattern R/jar scripts]]
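Step 1 of the GSEAPreranked procedure in this page asks for a two-column ".rnk" file (gene name, numeric score). The sketch below is one possible way to produce such a file in Python, not part of the SOP itself. It assumes a tab-delimited layout with no header line, and it uses a signed -log10(p) value as one example of a score that combines fold-change direction with p-value. The gene names, input values, and the helper names `signed_score` and `write_rnk` are hypothetical and exist only for illustration.

```python
import csv
import math
from pathlib import Path


def signed_score(log2_fc: float, p_value: float) -> float:
    """One possible ranking metric (an assumption, not prescribed by the SOP):
    -log10(p) carrying the sign of the log2 fold change."""
    # Clamp p so that an underflowed p-value of 0 does not break log10.
    p = max(p_value, 1e-300)
    return math.copysign(-math.log10(p), log2_fc)


def write_rnk(rows, path):
    """Write (gene, score) pairs as a two-column, tab-delimited file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for gene, score in rows:
            writer.writerow([gene, f"{score:.6g}"])


# Hypothetical differential-expression results: (gene, log2 fold change, p-value).
results = [
    ("GENE_A", 2.0, 0.001),
    ("GENE_B", -1.5, 0.01),
    ("GENE_C", 0.1, 1.0),
]

ranked = [(gene, signed_score(fc, p)) for gene, fc, p in results]
write_rnk(ranked, "file.rnk")
print(Path("file.rnk").read_text())
```

The rows are written in input order because the file does not need to be sorted. Replacing `signed_score` with the raw log2 fold change or a t-statistic only changes how the `ranked` list is built.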