== Using reduced representation bisulfite sequencing (RRBS) to characterize genomic DNA methylation ==

=== Background ===

Reduced-representation bisulfite sequencing (RRBS) is an experimental [https://academic.oup.com/nar/article/33/18/5868/2401288 protocol] for measuring and comparing genomic methylation patterns. These experiments digest genomic DNA with one or more restriction enzymes to produce sequence-specific fragments, which are subsequently treated with bisulfite and sequenced. The workflow below illustrates how to use [https://academic.oup.com/bioinformatics/article/27/11/1571/216956 bismark] to analyze RRBS data on the resources at the Whitehead Institute.

=== Step by step analysis ===

 * **Genome indexing**
   * Bismark relies on indexed reference genomes (for short read mapping using [https://genomebiology.biomedcentral.com/articles/10.1186/gb-2009-10-3-r25 bowtie] or [https://www.nature.com/articles/nmeth.1923 bowtie2]) that have been ''in silico'' bisulfite treated to create C-to-T and G-to-A versions of the reference.
   * Several reference genomes (e.g. human, mouse, fly) are already available in /nfs/genomes, so this indexing step is only necessary if you are not using one of these assemblies.
   * For a custom reference genome, the Bismark indexing can be done with:
   {{{
bsub bismark_genome_preparation --bowtie2 /path/to/organism/reference/assembly/fasta
   }}}
   * The above command writes the indices needed by bismark to /path/to/organism/reference/assembly/fasta/Bisulfite_Genome. When running bismark, however, point to one directory level above this (i.e. /path/to/organism/reference/assembly/fasta). The alignment example below uses a reference in /nfs/genomes.
 * **Quality control**
   * Start by using [https://www.bioinformatics.babraham.ac.uk/projects/trim_galore Trim Galore] or [http://barcwiki.wi.mit.edu/wiki/SOPs/qc_shortReads another] read trimmer to apply quality filters and remove adapters.
   * See our [http://barcwiki.wi.mit.edu/wiki/SOPs/qc_shortReads QC and preprocessing guidelines] for details on running Trim Galore.
   * A command for running Trim Galore on paired-end reads from RRBS experiments can look like:
   {{{
bsub trim_galore --paired --rrbs --fastqc -o trimmedReads /path/to/raw/data/reads_1.fq.gz /path/to/raw/data/reads_2.fq.gz
   }}}
 * **Alignment and methylation calls**
   * Bismark can (since version 0.6.beta1) use [https://www.nature.com/articles/nmeth.1923 bowtie2] to map short reads to reference genome(s).
   * In the command below, bismark expects to find C-to-T and G-to-A versions of the reference genome. Following the steps above, these will be located within /nfs/genomes/myOrganism/bowtie/ (for Institute-wide reference genomes) or /path/to/custom/genome/bowtie for other reference genomes.
   * [https://www.bioinformatics.babraham.ac.uk/projects/bismark/ Bismark] produces BAM file(s) of aligned reads and methylation calls.
   {{{
bsub bismark --genome /nfs/genomes/myOrganism/bowtie/ -1 trimmedReads_1.fq.gz -2 trimmedReads_2.fq.gz
   }}}
 * **Extract methylation percentages**
   * After running bismark, the methylation percentages can be extracted from the BAM output using bismark_methylation_extractor. Bismark names the BAM file after the first read file, so adjust the file name below to match your input.
   * For paired-end data, use the "-p" flag:
   {{{
bsub bismark_methylation_extractor -p --gzip --bedGraph trimmedReads_1_bismark_bt2_pe.bam
   }}}
   * The resulting file, trimmedReads_1_bismark_bt2_pe.bismark.cov.gz, summarizes coverage in the following [https://github.com/FelixKrueger/Bismark/blob/master/Docs/README.md format]:
   {{{
<chromosome> <start position> <end position> <methylation percentage> <count methylated> <count unmethylated>
   }}}
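
For downstream work, the coverage file can be read with a short script. The following is a minimal Python sketch, not part of Bismark. It assumes the tab-separated, six-column layout described in the Bismark documentation (chromosome, start, end, methylation percentage, count methylated, count unmethylated). The names `CovRecord`, `parse_cov`, `filter_min_coverage`, `pooled_methylation` and `read_cov` are hypothetical helpers, and the minimum coverage of 10 reads is an illustrative threshold, not a recommendation from this page.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class CovRecord(NamedTuple):
    """One line of a Bismark coverage file (assumed six-column layout)."""
    chrom: str
    start: int
    end: int
    pct_meth: float
    n_meth: int
    n_unmeth: int


def parse_cov(lines: Iterable[str]) -> Iterator[CovRecord]:
    """Parse tab-separated coverage lines, skipping blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        chrom, start, end, pct, n_meth, n_unmeth = line.split("\t")
        yield CovRecord(chrom, int(start), int(end),
                        float(pct), int(n_meth), int(n_unmeth))


def filter_min_coverage(records: Iterable[CovRecord],
                        min_cov: int = 10) -> List[CovRecord]:
    """Keep positions covered by at least min_cov reads (assumed threshold)."""
    return [r for r in records if r.n_meth + r.n_unmeth >= min_cov]


def pooled_methylation(records: Iterable[CovRecord]) -> float:
    """Percent methylation pooled over all reads; NaN if there are no reads."""
    records = list(records)
    meth = sum(r.n_meth for r in records)
    total = meth + sum(r.n_unmeth for r in records)
    return 100.0 * meth / total if total else float("nan")


def read_cov(path: str) -> List[CovRecord]:
    """Read a gzipped coverage file such as *.bismark.cov.gz."""
    with gzip.open(path, "rt") as handle:
        return list(parse_cov(handle))
```

A typical use would be `kept = filter_min_coverage(read_cov("trimmedReads_1_bismark_bt2_pe.bismark.cov.gz"))` followed by `pooled_methylation(kept)`. Pooling read counts, rather than averaging the per-position percentages, weights each position by its coverage; which summary is appropriate depends on the analysis.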