=== Create a matrix of gene counts by cells ===

 * For 10x Genomics experiments, we use [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger Cell Ranger] to generate this count matrix.
 * The main command is [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count 'cellranger count'], which requires a reference transcriptome indexed specifically for cellranger.
 * [https://support.10xgenomics.com/single-cell-gene-expression/software/release-notes/build#grch38_3.0.0 Pre-built reference transcriptomes] are available from 10x Genomics. Several of them are available at Whitehead on tak under /nfs/genomes/[ASSEMBLY]/10x, where ASSEMBLY is the assembly name in our local nomenclature. Note that only certain gene types are included in these pre-built references.
 * Custom reference transcriptomes can be created with cellranger commands:
   * Filter the GTF to include only a subset of the annotated gene biotypes, for example:
{{{
bsub cellranger mkgtf Homo_sapiens.GRCh38.93.gtf Homo_sapiens.GRCh38.93.filtered.gtf --attribute=gene_biotype:protein_coding
}}}
   * Create the cellranger index using a command such as
{{{
bsub cellranger mkref --genome=MyGenome --fasta=genome.fa --genes=Genes.filtered.gtf --ref-version=1.0
}}}
 * Run the actual [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count 'cellranger count'] command using syntax like
{{{
bsub cellranger count --id=ID --fastqs=PATH --transcriptome=DIR --sample=SAMPLE_LIST --project=PROJECT
}}}

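Before submitting a long job, it can help to preview the exact command line. The following is a minimal sketch, not part of Cell Ranger: the function name build_count_cmd and all example values are hypothetical placeholders, and only the options already shown above (--id, --fastqs, --transcriptome, --sample) are used. Each option starts with two plain ASCII hyphens.

```shell
#!/usr/bin/env bash
# Dry-run helper (hypothetical name): prints a 'cellranger count' command
# instead of running it, so the options can be checked before submission.
build_count_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: build_count_cmd ID FASTQ_DIR TRANSCRIPTOME_DIR SAMPLE" >&2
        return 1
    fi
    local id="$1" fastqs="$2" transcriptome="$3" sample="$4"
    printf 'cellranger count --id=%s --fastqs=%s --transcriptome=%s --sample=%s\n' \
        "$id" "$fastqs" "$transcriptome" "$sample"
}

# Example with placeholder values; prepend 'bsub' to match the commands above
cmd=$(build_count_cmd MyRun /path/to/fastqs /path/to/reference MySample)
echo "bsub $cmd"
```

Once the printed line looks right, it can be pasted into the shell or submitted as is.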
 * The [https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/overview output] of 'cellranger count' includes
   * An indexed BAM file of all mapped reads (possorted_genome_bam.bam)
   * A Loupe Browser visualization and analysis file (cloupe.cloupe)

 * The quality control summary is "web_summary.html" in the 'outs' folder. It reports important quality metrics and graphs such as Estimated Number of Cells, Mean Reads per Cell, and Sequencing Saturation.

 * The "matrix" output files are stored in a sparse (Matrix Market) format rather than as a standard 2-dimensional table. To create a standard 2-dimensional matrix, one can use R commands such as
{{{
library(monocle)
library(cellrangerRkit)
# Directory that contains the 'outs' folder written by 'cellranger count'
cellranger_data_path = "/path/to/dir/with/outs/dir"
crm = load_cellranger_matrix(cellranger_data_path)
# Convert the sparse counts to a dense genes-by-cells matrix (can use a lot of memory)
crm.matrix = as.matrix(exprs(crm))
write.table(crm.matrix, "My.cellranger.matrix.txt", sep="\t", quote=FALSE)
}}}


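To see why the "matrix" file is not a plain table, it can help to look at its header. In the Matrix Market coordinate format, lines starting with '%' are header or comment lines, and the first remaining line gives the number of rows (features), columns (barcodes), and nonzero entries. The sketch below reads that line; the function name mtx_dims is hypothetical, and the demonstration uses a small made-up file rather than real Cell Ranger output.

```shell
#!/usr/bin/env bash
# Print "rows columns nonzero_entries" from a Matrix Market file.
# Hypothetical helper; accepts plain or gzip-compressed files.
mtx_dims() {
    local file="$1"
    case "$file" in
        *.gz) gzip -dc "$file" ;;   # decompress on the fly
        *)    cat "$file" ;;
    esac | awk '!/^%/ { print $1, $2, $3; exit }'   # first non-comment line
}

# Demonstration on a tiny made-up matrix: 3 features, 2 barcodes, 4 nonzero counts
demo=$(mktemp)
printf '%%%%MatrixMarket matrix coordinate integer general\n%%\n3 2 4\n1 1 5\n2 1 1\n3 2 7\n1 2 2\n' > "$demo"
mtx_dims "$demo"
rm -f "$demo"
```

Every line after the dimensions holds one nonzero count as "row column value", which is why a conversion step is needed to get a 2-dimensional table.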
=== Run quality control and filter cells ===

We typically use the [https://satijalab.org/seurat/ Seurat] R package for these steps.

 * Start out by loading the counts matrix from cellranger:
{{{
library("Seurat")
message("Loaded Seurat version ", packageDescription("Seurat")$Version)
# Load the barcodes*, features*, and matrix* files in your 10x Genomics directory
# (input.counts.filename holds the path to that directory)
counts.all <- Read10X(data.dir = input.counts.filename)
}}}


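Since the loading step expects the barcodes*, features*, and matrix* files in one directory, a quick check before starting R can save a confusing error later. This is a minimal sketch under that assumption; the function name check_10x_dir is hypothetical, and the demonstration uses empty stand-in files in a temporary directory.

```shell
#!/usr/bin/env bash
# Return success only if the directory holds at least one file for each
# of the three prefixes named above (hypothetical helper).
check_10x_dir() {
    local dir="$1" prefix missing=0
    for prefix in barcodes features matrix; do
        # Expand the glob; if nothing matches, the pattern stays literal
        set -- "$dir"/"$prefix"*
        if [ ! -e "$1" ]; then
            echo "missing: $dir/$prefix*" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demonstration on a throwaway directory with empty stand-in files
demo_dir=$(mktemp -d)
touch "$demo_dir/barcodes.tsv.gz" "$demo_dir/features.tsv.gz" "$demo_dir/matrix.mtx.gz"
if check_10x_dir "$demo_dir"; then
    echo "all three file types present"
fi
rm -rf "$demo_dir"
```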
=== Export expression and dimensional analysis data for interactive viewing ===

We prefer using [https://cellbrowser.readthedocs.io/ UCSC's Cell Browser] environment for this task.

 * Prerequisites. To make the most of this interactive viewing tool, first
   * Run dimensional reduction (such as PCA, tSNE, UMAP).
   * Cluster/partition the cells (such as with Seurat's FindClusters()).
   * Identify cluster-specific marker genes (such as with Seurat's FindAllMarkers()) and assemble/print information about them with commands such as
{{{
# Note: if all.markers$cluster is a factor, as.numeric() returns the level index, not the printed cluster label
# Note: the fold-change column is named avg_log2FC in Seurat v4 and later
all.markers.forCB = cbind(as.numeric(all.markers$cluster), all.markers$gene, all.markers$p_val_adj, all.markers$avg_logFC,
   all.markers$pct.1, all.markers$pct.2)
write.table(all.markers.forCB, file="all.markers.exported.txt", quote = FALSE, sep = "\t", row.names=FALSE)
}}}
   * Add info/links about the marker genes with the Cell Browser command
{{{
cbMarkerAnnotate all.markers.exported.txt markers.txt
}}}

 * Export the key data from the Seurat object:
{{{
ExportToCellbrowser(seurat, dir=export.dir, dataset.name=dataset.name, markers.file=markers.file, reductions=c("pca", "tsne", "umap"))
}}}
 * Run Cell Browser's [https://cellbrowser.readthedocs.io/basic_usage.html cbBuild] to create the web-viewable directory of files.
 * Move the cbBuild output to a web server to serve a page that looks something like https://cells.ucsc.edu/