
Using reduced representation bisulfite sequencing (RRBS) to characterize genomic DNA methylation

Background

Reduced-representation bisulfite sequencing (RRBS) is an experimental protocol to measure and compare genomic methylation patterns. These experiments digest genomic DNA with one or more restriction enzymes to produce sequence-specific fragments, which are then treated with bisulfite and sequenced. The workflow below illustrates how to use Bismark to analyze RRBS data on the resources at the Whitehead Institute.

Step by step analysis

  • Genome indexing
    • Bismark relies on indexed reference genomes (for short read mapping using bowtie or bowtie2) that have been in silico bisulfite treated to create C-to-T and G-to-A versions of the reference.
    • Several reference genomes (e.g. human, mouse, fly) are already available in /nfs/genomes, so this indexing step will only be necessary if you are not using one of these assemblies.
    • For a custom reference genome, the Bismark indexing can be done with:
sbatch --partition=20 --job-name=bismark --mem=32G --wrap "bismark_genome_preparation --bowtie2 /path/to/organism/reference/assembly/fasta"
    • The above command will write the indices needed by Bismark to /path/to/organism/reference/assembly/fasta/Bisulfite_Genome. When running Bismark, however, point to one directory level above this (i.e. /path/to/organism/reference/assembly/fasta). In the alignment example further below, a reference on /nfs/genomes is indicated.
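To make the idea of an in silico bisulfite-treated reference concrete, the toy sketch below applies the two conversions to a short made-up sequence. It only illustrates the C-to-T and G-to-A substitutions described above; it is not how bismark_genome_preparation is implemented, and the variable names are hypothetical.

```shell
# Toy sequence (made up for illustration, not taken from any genome)
seq="ACGTTCGGCA"

# C-to-T version: every C is replaced by T
ct=$(printf '%s' "$seq" | tr 'C' 'T')

# G-to-A version: every G is replaced by A
ga=$(printf '%s' "$seq" | tr 'G' 'A')

echo "original: $seq"
echo "C-to-T:   $ct"
echo "G-to-A:   $ga"
```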
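  • Adapter and quality trimming
    • Before alignment, trim adapters and low-quality bases with Trim Galore. The --rrbs option applies RRBS-specific trimming, --paired keeps the two read files in sync, and --fastqc runs FastQC on the trimmed reads.
sbatch --partition=20 --job-name=TG --mem=32G --wrap "trim_galore --paired --rrbs --fastqc -o trimmedReads /path/to/raw/data/reads_1.fq.gz /path/to/raw/data/reads_2.fq.gz"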
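Paired-end alignment expects the two trimmed files to contain the same number of reads in the same order. A quick sanity check is sketched below, assuming standard four-line FASTQ records. The helper name count_reads and the toy files are hypothetical and exist only so the sketch runs on its own.

```shell
# Hypothetical helper: count the reads in a gzipped FASTQ file (4 lines per read)
count_reads() {
  gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Toy paired files (made-up reads) standing in for the trimmed output
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTGCA\n+\nIIII\n' | gzip -c > "$tmpdir/toy_1.fq.gz"
printf '@r1\nCGTA\n+\nIIII\n@r2\nGCAT\n+\nIIII\n' | gzip -c > "$tmpdir/toy_2.fq.gz"

n1=$(count_reads "$tmpdir/toy_1.fq.gz")
n2=$(count_reads "$tmpdir/toy_2.fq.gz")

# Report whether the two files hold the same number of reads
if [ "$n1" -eq "$n2" ]; then
  echo "paired files in sync: $n1 reads each"
else
  echo "read count mismatch: $n1 vs $n2" >&2
fi

rm -rf "$tmpdir"
```

On real data, count_reads would be pointed at the two trimmed files in the trimmedReads directory instead of the toy files.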
  • Alignment and methylation calls
    • Bismark can (since version 0.6.beta1) use bowtie2 to map short reads to reference genome(s).
    • In the command below, Bismark will expect to find C-to-T and G-to-A versions of the reference genome. Following the steps above, these will be located within /nfs/genomes/myOrganism/bowtie/ (for Institute-wide reference genomes) or /path/to/custom/genome/bowtie for other reference genomes.
    • Bismark produces BAM file(s) of aligned reads and methylation calls.

sbatch --partition=20 --job-name=bismark --mem=32G --wrap "bismark --genome /nfs/genomes/myOrganism/bowtie/ -1 trimmedReads_1.fq.gz -2 trimmedReads_2.fq.gz"
  • Extract methylation percentages
    • After running bismark, the methylation percentages can be extracted from the BAM output using bismark_methylation_extractor.
    • For paired-end data, use the "-p" flag:
sbatch --partition=20 --job-name=bme --mem=32G --wrap "bismark_methylation_extractor -p --gzip --bedGraph trimmedReads_bismark_bt2_pe.bam"
    • The resulting file, trimmedReads_bismark_bt2_pe.bismark.cov.gz, summarizes coverage in the following format:

<chromosome> <start position> <end position> <methylation percentage> <count methylated> <count unmethylated>
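As an illustration of working with this tab-delimited format, the sketch below keeps only sites covered by a minimum number of reads and recomputes the methylation percentage from the two count columns. The helper name filter_cov, the default threshold of 10 reads, and the toy input values are assumptions made for the example and are not part of Bismark.

```shell
# Hypothetical helper: keep sites with at least min_cov reads (default 10).
# Reads coverage lines on stdin and prints:
# chromosome, start, end, recomputed methylation percentage, total coverage
filter_cov() {
  awk -F'\t' -v min_cov="${1:-10}" 'BEGIN { OFS = "\t" }
    ($5 + $6) >= min_cov { print $1, $2, $3, 100 * $5 / ($5 + $6), $5 + $6 }'
}

# Toy input with made-up values; the second site has only 2 reads and is dropped
toy=$(printf 'chr1\t100\t100\t80\t8\t2\nchr1\t200\t200\t50\t1\t1\nchr2\t300\t300\t25\t5\t15\n' | filter_cov 10)
echo "$toy"
```

On real output the same helper could be fed with gzip -dc trimmedReads_bismark_bt2_pe.bismark.cov.gz | filter_cov 10, with the threshold chosen to suit the experiment.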