Changes between Version 66 and Version 67 of SOPs/atac_Seq

Timestamp: 07/07/21 11:47:32
Author: byuan
Comment: --

SOPs/atac_Seq

-              v66
+              v67
 MACS v2 is applicable for ATAC-Seq using the appropriate options/parameters.
   * If you have human (hg38, hg19) or mouse (mm10, mm9) samples with biological replicates, you can run the [[https://github.com/ENCODE-DCC/atac-seq-pipeline|ENCODE ATAC-seq Pipeline]]. The pipeline takes fastq files, cleans and maps the reads, filters the aligned reads, calls peaks, and performs quality control. Here is the [[https://www.encodeproject.org/pipelines/ENCPL787FUN/|workflow]]. The steps below show you how to run it on our Whitehead server.
       * content in sample.json:
+  * If you have human (hg38, hg19) or mouse (mm10, mm9) samples with biological replicates, you can run the [[https://github.com/ENCODE-DCC/atac-seq-pipeline|ENCODE ATAC-seq Pipeline]]. The pipeline takes fastq files, cleans and maps the reads, filters the aligned reads, calls peaks, and performs quality control. Here is the [[https://www.encodeproject.org/pipelines/ENCPL787FUN/|schema of the workflow]]. The steps below show you how to run it on our Whitehead server.
+      * content in input sample.json:
 {{{
 …
 # convert bam to bed
 bedtools bamtobed -i foo.bam > foo_pe.bed
 # shift reads
+# shift reads: Tn5 leaves a 9-base 5' overhang, so shift + strand reads by +4 and - strand reads by -5
 cat foo_pe.bed | awk -F $'\t' 'BEGIN {OFS = FS}{ if ($6 == "+") {$2 = $2 + 4} else if ($6 == "-") {$3 = $3 - 5} print $0}' >| foo_tn5_pe.bed
 # call peaks.
 # --keep-dup all: since duplicates have been removed in previous step
 macs2 callpeak -t foo_tn5_pe.bed -n foo -f BED -g mm -q 0.01 --nomodel --shift -75 --extsize 150 --call-summits --keep-dup all
+}}}
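The Tn5 shift in the awk step above can be checked on a couple of hand-made BED6 records before running it on real data. This is only a sketch: the two reads and the `shift_tn5` helper name are made up for illustration, and the awk body is the same one used in the SOP (with `-F '\t'` instead of the bash-only `$'\t'`).

```shell
# Hypothetical helper wrapping the awk shift from the SOP:
# + strand: start ($2) moves +4; - strand: end ($3) moves -5.
shift_tn5() {
  awk -F '\t' 'BEGIN {OFS = FS} { if ($6 == "+") {$2 = $2 + 4} else if ($6 == "-") {$3 = $3 - 5} print $0 }'
}

# Two made-up BED6 records, one per strand.
shifted=$(printf 'chr1\t100\t150\tread1\t60\t+\nchr1\t200\t250\tread2\t60\t-\n' | shift_tn5)

# Show the shifted records: the + read now starts at 104, the - read now ends at 245.
printf '%s\n' "$shifted"
```

Only one coordinate per read changes, so the interval gets shorter on the side where Tn5 cut.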
+    * If you are working with NFR, the macs2 BAMPE option also works. When using the 'BAMPE' option with paired-end reads, we let MACS run the pileup and calculate 'extsize'.
+}}}
+       * If you have biological replicates, you can identify reproducible peaks with idr ([[http://barcwiki.wi.mit.edu/wiki/SOPs/chip_seq_peaks#repro | detailed information]])
+{{{
+idr --samples rep1.narrowPeak rep2.narrowPeak --input-file-type narrowPeak --output-file IDR.txt --plot
+}}}
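Once idr has run, the reproducible peaks still have to be selected from its output. A minimal sketch, assuming the column layout described in the idr README, where column 5 holds a scaled score of int(-125 * log2(IDR)) capped at 1000, so that IDR 0.05 corresponds to a score of 540. Please confirm this against the README of your installed idr version. The rows below are toy data, not real output.

```shell
# Toy stand-in for IDR.txt: only the first five columns are filled in
# (real idr output has more columns). All values are made up.
toy_idr=$(printf 'chr1\t100\t300\t.\t1000\nchr1\t900\t1100\t.\t120\nchr2\t50\t250\t.\t540\n')

# Keep peaks whose scaled score (column 5) is >= 540,
# i.e. IDR <= 0.05 under the assumed scaling.
passed=$(printf '%s\n' "$toy_idr" | awk -F '\t' 'BEGIN {OFS = FS} $5 >= 540 {print $1, $2, $3, $5}')

printf '%s\n' "$passed"
```

On real output the same awk filter would read IDR.txt directly instead of the toy variable.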
+    * If you are working with the nucleosome-free region (NFR; details are at the bottom of the page), the macs2 BAMPE option also works. When using the 'BAMPE' option with paired-end reads, we let MACS build the pileup and determine 'extsize' itself.
        * BAMPE format: there is no special file format for BAMPE. MACS simply treats the two reads of a pair as coming from the same fragment. From the manual:
        * "If the BAM file is generated for paired-end data, MACS will only keep the left mate(5' end) tag. However, when format BAMPE is specified, MACS will use the real fragments inferred from alignment results for reads pileup."
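A BAMPE call corresponding to the description above might look like the following. This is a sketch, not the command used on our server: the file names `foo.bam` and `foo_bampe` are placeholders, the `-g`, `-q` and `--keep-dup` values are copied from the BED-based call, and `--nomodel`, `--shift` and `--extsize` are left out on the assumption that MACS takes the fragment from the alignment in this mode.

```shell
# Placeholder names; foo.bam is assumed to be a filtered, deduplicated
# paired-end BAM (no bamtobed conversion or Tn5 shift in this variant).
cmd="macs2 callpeak -t foo.bam -f BAMPE -n foo_bampe -g mm -q 0.01 --keep-dup all"

# Print the command for review; run it with: eval "$cmd"
echo "$cmd"
```

Printing the command first makes it easy to compare against the BED-based call before committing compute time.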
 …
  *  [[https://github.com/LiuLabUB/HMMRATAC | HMMRATAC]]
  *  [[https://github.com/jsh58/Genrich | Genrich]]
- * Run MACS2 with BAMPE option for PE reads, i.e. -f BAMPE, this asks MACS to pileup and infer the real fragment size from the alignment.  Note: this option can be used to find accessible regions but not suitable for finding exact cut sites.  Also, comparing this option with converting to BED format (above), the BAMPE peaks calls are more conservative and may miss real peaks.