wiki:SOPs/go_annotation

Identifying enriched biological themes in gene sets

DAVID

DAVID

  • DAVID is generally the best place to start your enrichment analysis.
  • Instructions for using DAVID can be found under Functional Annotation on the DAVID web site.
  • You'll probably end up running DAVID multiple times, with different types of annotations, to find the most informative combination.
  • Full output can be downloaded and viewed as a spreadsheet.

Gene Set Enrichment Analysis (GSEA)

Broad GSEA

GSEA Wiki

Ranked List

  1. Create a two-column file with gene names in the first column and numeric values (e.g. weight, p-value) in the second. The file does not need to be sorted.
    • Assigning weights: There is no standard way to assign weights, but the weights should reflect some logical order. When the list is not pre-ordered or ranked, GSEA uses the correlation between expression and phenotype to assign weights. A similar scheme can be used to rank genes by log2 ratio, t-statistic, or a score that takes into account both log ratio and p-value.
    • If a gene list is not unique, duplicate genes can be given a shared weight: e.g. a gene that occurs four times in the list is given a weight of 0.25, while a unique gene is given a weight of 1.
  2. Run GSEA: Tools -> GseaPreranked
  3. To run the same type of analysis on the command line, use a command like:
    java -Xmx512m -cp /usr/lib/share/gsea2/gsea2-2.2.2.jar xtools.gsea.GseaPreranked \
      -gmx gseaftp.broadinstitute.org://pub/gsea/gene_sets/h.all.v5.2.symbols.gmt \
      -collapse false -mode Max_probe -norm meandiv -nperm 1000 \
      -rnk ./MY_COMPARISON.rnk -scoring_scheme weighted -rpt_label GSEA_out_v1 \
      -chip gseaftp.broadinstitute.org://pub/gsea/annotations/GENE_SYMBOL.chip \
      -include_only_symbols true -make_sets true -plot_top_x 20 -rnd_seed timestamp \
      -set_max 500 -set_min 2 -zip_report false -out GSEA_OUT.TEST_v1 -gui false
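The ranked-list preparation in step 1 can be sketched in Python. This is only an illustration under stated assumptions: the scoring scheme (sign of the log2 ratio times -log10 of the p-value) is one possible way to combine log ratio and p-value, not a GSEA requirement, and the function names `signed_score`, `write_rnk`, and `shared_weights` are hypothetical helpers, as are the example genes and values.

```python
import csv
import math
import sys
from collections import Counter


def signed_score(log2_ratio, p_value, floor=1e-300):
    """Combine direction and significance: sign(log2 ratio) * -log10(p-value).

    The floor avoids log10(0) for p-values reported as exactly zero.
    """
    sign = 1.0 if log2_ratio >= 0 else -1.0
    return sign * -math.log10(max(p_value, floor))


def write_rnk(rows, handle):
    """Write a two-column, tab-separated ranked list: gene name, numeric score.

    rows is an iterable of (gene, log2_ratio, p_value) tuples; sorting is not required.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for gene, log2_ratio, p_value in rows:
        writer.writerow([gene, f"{signed_score(log2_ratio, p_value):.4f}"])


def shared_weights(genes):
    """Give each gene a weight of 1/n, where n is how often it occurs in the list."""
    counts = Counter(genes)
    return {gene: 1.0 / n for gene, n in counts.items()}


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = [("TP53", 2.0, 0.01), ("MYC", -1.0, 0.1)]
    write_rnk(example, sys.stdout)
    print(shared_weights(["A", "A", "A", "A", "B"]))
```

Passing an open file handle instead of `sys.stdout` would produce a file such as the `MY_COMPARISON.rnk` used in the command above.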
    

Fast gene set enrichment analysis (fgsea)

fgsea

Unranked List

GSEA will rank the genes

  1. Create the necessary files in the correct format for expression, phenotype, and chip annotation (see the GSEA wiki)
  2. Use MSigDB for gene sets, or create custom gene sets in the correct format
  3. Run GSEA, using the default options to start
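As a rough illustration of steps 1 and 2, the sketch below writes minimal expression (GCT), phenotype (CLS), and gene set (GMT) files. The layouts follow the formats as commonly described in the GSEA documentation, but check the GSEA wiki before relying on them. The function names, the "na" description placeholder, and the idea of passing data as plain dicts and lists are assumptions made for this example.

```python
import io


def write_gct(expression, samples, handle):
    """Write an expression matrix in GCT layout.

    expression maps gene name -> list of values, one per sample.
    """
    handle.write("#1.2\n")                                   # format version line
    handle.write(f"{len(expression)}\t{len(samples)}\n")     # row and column counts
    handle.write("\t".join(["NAME", "Description", *samples]) + "\n")
    for gene, values in expression.items():
        if len(values) != len(samples):
            raise ValueError(f"{gene}: expected {len(samples)} values, got {len(values)}")
        handle.write("\t".join([gene, "na", *map(str, values)]) + "\n")


def write_cls(labels, handle):
    """Write categorical phenotype labels (one per sample, in sample order) in CLS layout."""
    classes = list(dict.fromkeys(labels))                    # unique, in order of first appearance
    handle.write(f"{len(labels)} {len(classes)} 1\n")
    handle.write("# " + " ".join(classes) + "\n")
    handle.write(" ".join(labels) + "\n")


def write_gmt(gene_sets, handle):
    """Write custom gene sets in GMT layout: name, description, then one gene per column."""
    for name, genes in gene_sets.items():
        handle.write("\t".join([name, "na", *genes]) + "\n")


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    out = io.StringIO()
    write_gct({"TP53": [1.2, 3.4], "MYC": [5.6, 7.8]}, ["s1", "s2"], out)
    write_cls(["control", "treated"], out)
    write_gmt({"MY_SET": ["TP53", "MYC"]}, out)
    print(out.getvalue())
```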

Single-sample GSEA (ssGSEA)

An extension of GSEA that can be used to determine enrichment of gene sets in individual samples.

ssGSEA Presentation from MIT BMC

Broad's ssGSEA from GenePattern R/jar scripts

BiNGO

BiNGO Plugin
You need to have Cytoscape installed to use BiNGO

  1. Start BiNGO via Cytoscape: Plugins -> Start BiNGO
  2. Get genes from a cluster/network or paste a gene list
  3. Select the correct options (e.g. species)
  4. Run BiNGO

GeneGO

GeneGO Login (Password Required)

  1. Upload gene list and activate
  2. One-click analysis -> Select GeneGo Pathway Maps

GO Term Finder: finds significant GO terms shared among a list of genes from your organism.
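To make "significant" concrete, the sketch below computes an over-representation p-value with the hypergeometric distribution: the chance of seeing at least the observed number of annotated genes in a list drawn at random from the background. This is a generic illustration, not a reproduction of any particular tool's statistic or multiple-testing correction. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(list_hits, list_size, background_hits, background_size):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    list_hits:        genes in the list annotated with the term
    list_size:        total genes in the list
    background_hits:  genes in the background annotated with the term
    background_size:  total genes in the background
    """
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k) * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up counts: 8 of 50 list genes carry a term that annotates 100 of 6000 background genes.
    print(enrichment_p_value(8, 50, 100, 6000))
```

In practice many terms are tested at once, so raw p-values like this one normally need a multiple-testing correction before they are interpreted.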

GO Term Mapper: maps the granular GO annotations for genes in a list to a set of GO slim terms, allowing you to bin your genes into broad categories.

Ingenuity IPA, subscription required.

Advaita iPathwayGuide, login required - subscription required for downloading.

More Information

Hot Topics: Gene List Enrichment