Changes between Version 11 and Version 12 of SOP/scRNA-seq


Timestamp:
07/13/20 11:36:30 (4 years ago)
Author:
gbell
Comment:

--

  • SOP/scRNA-seq

    v11 v12  
    11
    22== Single cell RNA-Seq to quantify gene levels and assay for differential expression ==
     3
     4=== Create a matrix of gene counts by cells ===
     5
     6    * For 10x Genomics experiments, we use [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger Cell Ranger] to generate this counts matrix.
     7      * The main command is [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count 'cellranger count'], which requires a reference transcriptome indexed specifically for cellranger.
     8      * [https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#grch38_3.0.0 Pre-built reference transcriptomes] are available from 10x Genomics.  Several of them are available at Whitehead on tak under /nfs/genomes/[ASSEMBLY]/10x where ASSEMBLY is specific to our nomenclature.  Note that only certain gene types are included in these pre-built references.
     9      * Custom reference transcriptomes can be created with cellranger commands:
     10        * Filter the gtf to include only a subset of the annotated gene biotypes, for example,
     11{{{
     12bsub cellranger mkgtf Homo_sapiens.GRCh38.93.gtf Homo_sapiens.GRCh38.93.filtered.gtf --attribute=gene_biotype:protein_coding
     13}}}
     14        * Create the cellranger index using a command such as
     15{{{
     16bsub cellranger mkref --genome=MyGenome --fasta=genome.fa --genes=Genes.filtered.gtf --ref-version=1.0
     17}}}
     18      * Run the actual [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count 'cellranger count'] command using syntax like
     19{{{
     20bsub cellranger count --id=ID --fastqs=PATH --transcriptome=DIR --sample=SAMPLE_LIST --project=PROJECT
     21}}}
     22
     23      * The [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/overview output] of 'cellranger count' includes
     24        * An indexed BAM file of all mapped reads (possorted_genome_bam.bam)
     25        * A Loupe Browser visualization and analysis file (cloupe.cloupe)
     26
     27      * The quality control summary is "web_summary.html" in the 'outs' folder and has important quality metrics and graphs such as Estimated Number of Cells, Mean Reads per Cell, Median Genes per Cell, and Sequencing Saturation.
     28
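Before moving on, it can help to confirm that a run finished and left the files named above. The following is a minimal shell sketch, not part of cellranger itself: the function name `check_cellranger_outs` and the run directory argument are hypothetical, and the file list covers only the outputs mentioned in this section.

```shell
# Hypothetical helper: report which of the outputs named above exist in RUN_DIR/outs.
check_cellranger_outs() {
  run_dir=$1
  missing=0
  for f in possorted_genome_bam.bam cloupe.cloupe web_summary.html; do
    if [ -e "$run_dir/outs/$f" ]; then
      echo "found    $f"
    else
      echo "MISSING  $f"
      missing=1
    fi
  done
  # Non-zero exit status if any expected file is absent.
  return "$missing"
}

# Example usage (the path is a placeholder for the directory named by --id):
# check_cellranger_outs /path/to/ID
```

The non-zero return status makes the check easy to chain in a job script, for example to stop before downstream analysis when a run is incomplete.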
     29      * The "matrix" output files are stored in a sparse (Matrix Market) format rather than as a standard 2-dimensional table. To create a standard 2-dimensional matrix, one can use R commands such as
     30{{{
     31library(monocle)
     32library(cellrangerRkit)
     33cellranger_data_path = "/path/to/dir/with/outs/dir"
     34crm = load_cellranger_matrix(cellranger_data_path)
     35crm.matrix = as.matrix(exprs(crm))
     36write.table(crm.matrix, "My.cellranger.matrix.txt", sep="\t", quote=F)
     37}}}
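To see why the conversion is needed, note that the matrix file lists only the non-zero counts as "row column value" triplets after a header and a dimension line. The sketch below expands such a file into a dense tab-delimited table with awk. It is an illustration of the format under stated assumptions, not a replacement for the R commands: the function name `mtx_to_dense` is hypothetical, it emits no gene or barcode labels, it expects an uncompressed file (recent Cell Ranger versions gzip the file, so decompress it first), and a dense table of a full-size experiment can be very large.

```shell
# Hypothetical helper: expand a Matrix Market coordinate file into a dense,
# tab-delimited table (rows = features, columns = barcodes), without labels.
mtx_to_dense() {
  awk '
    /^%/ { next }                                   # skip header and comment lines
    !have_dims { nrow = $1; ncol = $2; have_dims = 1; next }   # dimension line: rows cols nonzeros
    { count[$1, $2] = $3 }                          # remaining lines: row column value
    END {
      for (i = 1; i <= nrow; i++) {
        line = ""
        for (j = 1; j <= ncol; j++) {
          value = ((i, j) in count) ? count[i, j] : 0   # absent entries are zero
          line = line (j > 1 ? "\t" : "") value
        }
        print line
      }
    }
  ' "$1"
}

# Example usage (file names are placeholders):
# mtx_to_dense matrix.mtx > dense.counts.txt
```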
     38
     39
     40=== Run quality control and filter cells ===
     41
     42We typically use the [https://satijalab.org/seurat/ Seurat] R package for these steps.
     43
     44    * Start out by loading the counts matrix from cellranger:
     45{{{
     46library("Seurat")
     47message("Loaded Seurat version ", packageDescription("Seurat")$Version)
     48# Load the barcodes*, features*, and matrix* files in your 10x Genomics directory
     49counts.all <- Read10X(data.dir = input.counts.filename)
     50}}}
     51
     52
     53=== Export expression and dimensional analysis data for interactive viewing ===
     54
     55We prefer using [https://cellbrowser.readthedocs.io/ UCSC's Cell Browser] environment for this task.
     56
     57  * Prerequisites.  To make the most of this interactive viewing tool,
     58    * Run dimensional reduction (such as PCA, tSNE, UMAP).
     59    * Cluster/partition the cells (such as with Seurat's FindClusters()).
     60    * Identify cluster-specific marker genes (such as with Seurat's FindAllMarkers()) and assemble/print information about them with commands such as
     61{{{
     62all.markers.forCB = cbind(as.numeric(all.markers$cluster), all.markers$gene, all.markers$p_val_adj, all.markers$avg_logFC,
     63                      all.markers$pct.1, all.markers$pct.2)
     64write.table(all.markers.forCB, file="all.markers.exported.txt", quote = FALSE, sep = "\t", row.names=F)
     65}}}
     66    * Add info/links about the marker genes with the CellBrowser command
     67{{{
     68cbMarkerAnnotate all.markers.exported.txt markers.txt
     69}}}
     70
     71  * Export the key data from the Seurat object:
     72{{{
     73ExportToCellbrowser(seurat, dir=export.dir, dataset.name=dataset.name, markers.file=markers.file, reductions=c("pca", "tsne", "umap"))
     74}}}
     75  * Run Cell Browser's [https://cellbrowser.readthedocs.io/basic_usage.html cbBuild] to create the web-viewable directory of files.
     76  * Copy the cbBuild output to a directory served by a web server; the resulting page looks something like https://cells.ucsc.edu/
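The build step can be sketched as a small shell wrapper. This is a sketch under assumptions rather than a prescribed procedure: the function name `build_cell_browser` and both directory arguments are hypothetical, and it assumes cbBuild is run from the export directory (where ExportToCellbrowser writes cellbrowser.conf) with `-o` pointing at the output directory, as described in the Cell Browser basic usage page. By default it only prints the command so that the paths can be checked first.

```shell
# Hypothetical wrapper: build the Cell Browser site from an export directory.
# With DRY_RUN=1 (the default here) it only prints the command it would run.
build_cell_browser() {
  export_dir=$1   # directory written by ExportToCellbrowser (contains cellbrowser.conf)
  web_dir=$2      # directory served by your web server (placeholder)
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "cd $export_dir && cbBuild -o $web_dir"
  else
    (cd "$export_dir" && cbBuild -o "$web_dir")
  fi
}

# Example usage (both paths are placeholders):
# DRY_RUN=0 build_cell_browser /path/to/export.dir /path/to/public_html/my_dataset
```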
    377
    478=== These are links to scRNA-seq analysis tutorials and resources  ===