Single cell RNA-Seq to quantify gene levels and assay for differential expression
Create a matrix of gene counts by cells
- For 10x Genomics experiments, we use Cell Ranger to get this counts matrix.
- The main command is cellranger count, which requires a reference transcriptome indexed specifically for cellranger.
- Pre-built reference transcriptomes are available from 10x Genomics. Several of them are available at Whitehead on tak under /nfs/genomes/[ASSEMBLY]/10x where ASSEMBLY is specific to our nomenclature. Note that only certain gene types are included in these pre-built references.
- Custom reference transcriptomes can be created with cellranger commands:
- Filter the gtf to include only a subset of the annotated gene biotypes, for example,
bsub cellranger mkgtf Homo_sapiens.GRCh38.93.gtf Homo_sapiens.GRCh38.93.filtered.gtf --attribute=gene_biotype:protein_coding
- Create the cellranger index using a command such as
bsub cellranger mkref --genome=MyGenome --fasta=genome.fa --genes=Genes.filtered.gtf --ref-version=1.0
- Run the actual cellranger count command using syntax like
bsub cellranger count --id=ID --fastqs=PATH --transcriptome=DIR --sample=SAMPLE_LIST --project=PROJECT
- The output of 'cellranger count' includes
- An indexed BAM file of all mapped reads (possorted_genome_bam.bam)
- A Loupe Browser visualization and analysis file (cloupe.cloupe)
- A quality control summary ("web_summary.html" in the 'outs' folder) with important quality metrics and graphs such as Estimated Number of Cells, Mean Reads per Cell, Median Genes per Cell, Sequencing Saturation, etc.
- The "matrix" output files are not in the usual matrix structure. To create a standard 2-dimensional matrix, one can use R commands such as
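library(monocle)
library(cellrangerRkit)
# Directory that contains the 'outs' folder written by cellranger count
cellranger_data_path = "/path/to/dir/with/outs/dir"
crm = load_cellranger_matrix(cellranger_data_path)
# Convert the sparse expression data to a dense genes-by-cells matrix
crm.matrix = as.matrix(exprs(crm))
write.table(crm.matrix, "My.cellranger.matrix.txt", sep="\t", quote=F)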
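The reference-building and counting commands earlier in this section can be chained so that each LSF job waits for the previous one. The following is only a sketch under stated assumptions: the job names, the `run` helper, and the `DRY_RUN` switch are illustrative and not part of Cell Ranger, and the file names are the placeholders used above. It also assumes that `cellranger mkref` writes its index to a directory named after `--genome`.

```shell
#!/usr/bin/env bash
# Sketch: chain mkgtf -> mkref -> count as dependent LSF jobs.
# Job names, the run() helper, and DRY_RUN are assumptions for illustration.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command; 0 = also execute it

# Log a command (as plain text) and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

GTF_IN="Homo_sapiens.GRCh38.93.gtf"
GTF_OUT="Homo_sapiens.GRCh38.93.filtered.gtf"

# 1. Keep only protein-coding genes in the annotation.
run bsub -J mkgtf cellranger mkgtf "$GTF_IN" "$GTF_OUT" \
    --attribute=gene_biotype:protein_coding

# 2. Build the index after the filtered GTF is written (-w waits on the named job).
run bsub -J mkref -w "done(mkgtf)" cellranger mkref --genome=MyGenome \
    --fasta=genome.fa --genes="$GTF_OUT" --ref-version=1.0

# 3. Count against the new reference (assumed to be the MyGenome directory).
run bsub -J count -w "done(mkref)" cellranger count --id=ID --fastqs=PATH \
    --transcriptome=MyGenome --sample=SAMPLE_LIST
```

With the default `DRY_RUN=1` the script only prints the three submissions, which makes it easy to review the arguments before setting `DRY_RUN=0`.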
Run quality control and filter cells
We typically use the Seurat R package for these steps.
- Start out by loading the counts matrix from cellranger:
library("Seurat")
message("Loaded Seurat version ", packageDescription("Seurat")$Version)
# Load the barcodes*, features*, and matrix* files in your 10x Genomics directory
counts.all <- Read10X(data.dir = input.counts.filename)
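Before loading, it can help to confirm that the directory passed to Read10X() really contains the three expected files. The check below is a minimal sketch: `check_10x_dir` is a hypothetical helper name, and the gzipped file names assume output from Cell Ranger version 3 or later (older versions wrote uncompressed files, including genes.tsv instead of features.tsv.gz).

```shell
# Sketch: verify that a 10x matrix directory holds the files Read10X() reads.
# check_10x_dir is a hypothetical helper; file names assume Cell Ranger v3+.
check_10x_dir() {
    tenx_dir="$1"
    tenx_rc=0
    for f in barcodes.tsv.gz features.tsv.gz matrix.mtx.gz; do
        # -s is true only for files that exist and are not empty
        if [ ! -s "$tenx_dir/$f" ]; then
            echo "missing or empty: $tenx_dir/$f" >&2
            tenx_rc=1
        fi
    done
    return "$tenx_rc"
}

# Example call (placeholder path):
# check_10x_dir /path/to/dir/with/outs/dir/outs/filtered_feature_bc_matrix
```

The function returns a non-zero status if any file is missing, so it can gate a later step in a wrapper script.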
Export expression and dimensional analysis data for interactive viewing
We prefer using UCSC's Cell Browser environment for this task.
- Prerequisites. To make the most of this interactive viewing tool,
- Run dimensional reduction (such as PCA, tSNE, UMAP).
- Cluster/partition the cells (such as with Seurat's FindClusters()).
- Identify cluster-specific marker genes (such as with Seurat's FindAllMarkers()) and assemble/print information about them with commands such as
# Collect cluster, gene, adjusted p-value, log fold change, and detection rates
all.markers.forCB = cbind(as.numeric(all.markers$cluster), all.markers$gene, all.markers$p_val_adj, all.markers$avg_logFC, all.markers$pct.1, all.markers$pct.2)
write.table(all.markers.forCB, file="all.markers.exported.txt", quote = FALSE, sep = "\t", row.names=F)
- Add info/links about the marker genes with the CellBrowser command
cbMarkerAnnotate all.markers.exported.txt markers.txt
- Export the key data from the Seurat object:
ExportToCellbrowser(seurat, dir=export.dir, dataset.name=dataset.name, markers.file=markers.file, reductions=c("pca", "tsne", "umap"))
- Run Cell Browser's cbBuild to create the web-viewable directory of files.
- Move the cbBuild output to a web server; the resulting page looks something like https://cells.ucsc.edu/
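The last two steps can be wrapped in a small shell function. This is a sketch only: `publish_cellbrowser` and its two arguments are hypothetical names, and it assumes that ExportToCellbrowser() left a cellbrowser.conf file in the export directory and that cbBuild is run from that directory with `-o` naming the web-viewable output directory.

```shell
# Sketch: build the Cell Browser site from a Seurat export directory.
# publish_cellbrowser and its arguments are hypothetical names.
publish_cellbrowser() {
    cb_export_dir="$1"   # directory written by ExportToCellbrowser()
    cb_site_dir="$2"     # absolute path of the web-served output directory

    # Stop early if the export does not look complete.
    if [ ! -f "$cb_export_dir/cellbrowser.conf" ]; then
        echo "no cellbrowser.conf in $cb_export_dir" >&2
        return 1
    fi
    if ! command -v cbBuild >/dev/null 2>&1; then
        echo "cbBuild not found on PATH" >&2
        return 1
    fi

    # Run cbBuild inside the export directory; the subshell keeps the
    # caller's working directory unchanged.
    (cd "$cb_export_dir" && cbBuild -o "$cb_site_dir")
}

# Example call (placeholder paths):
# publish_cellbrowser /path/to/export.dir /path/to/public_html/cb
```

Using an absolute path for the output directory avoids surprises from the change of directory inside the function.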
Links to recommended scRNA-seq analysis tutorials and resources
- Seurat vignettes and guided analysis
- Analysis of single cell RNA-seq data course, Hemberg Group.
- Analysis of single cell RNA-seq data workshop, Broad Institute
- 2017/2018 Single Cell RNA Sequencing Analysis Workshop at UCD, UCB, UCSF
- Single cell RNA sequencing, NYU.
- Awesome-single-cell, Sean Davis