Changes between Version 66 and Version 67 of SOPs/atac_Seq


Timestamp:
07/07/21 11:47:32 (4 years ago)
Author:
byuan
Comment:

--

  • SOPs/atac_Seq

    v66 v67  
    118118MACS v2 is applicable for ATAC-Seq using the appropriate options/parameters.
    119119
    120   * If you have human (hg38, hg19) or mouse (mm10, mm9) samples with biological replicates, you can run the [[https://github.com/ENCODE-DCC/atac-seq-pipeline|ENCODE ATAC-seq Pipeline]]. The pipeline takes fastq files, cleans and maps the reads, filters the aligned reads, calls peaks, and runs quality control. Here is the [[https://www.encodeproject.org/pipelines/ENCPL787FUN/|workflow]]. The steps below show you how to run it on our Whitehead server.
    121       * content of sample.json:
     120  * If you have human (hg38, hg19) or mouse (mm10, mm9) samples with biological replicates, you can run the [[https://github.com/ENCODE-DCC/atac-seq-pipeline|ENCODE ATAC-seq Pipeline]]. The pipeline takes fastq files, cleans and maps the reads, filters the aligned reads, calls peaks, and runs quality control. Here is the [[https://www.encodeproject.org/pipelines/ENCPL787FUN/|schema of the workflow]]. The steps below show you how to run it on our Whitehead server.
     121      * content of the input sample.json:
    122122
    123123{{{
     
    175175# convert bam to bed
    176176bedtools bamtobed -i foo.bam > foo_pe.bed
    177 # shift reads
     177# shift reads: Tn5 leaves a 9-base 5' overhang, so shift + strand reads by +4 and - strand reads by -5
    178178cat foo_pe.bed | awk -F $'\t' 'BEGIN {OFS = FS}{ if ($6 == "+") {$2 = $2 + 4} else if ($6 == "-") {$3 = $3 - 5} print $0}' >| foo_tn5_pe.bed
    179179# call peaks.
    180180# --keep-dup all: since duplicates have been removed in previous step
    181181macs2 callpeak -t foo_tn5_pe.bed -n foo -f BED -g mm -q 0.01 --nomodel --shift -75 --extsize 150 --call-summits --keep-dup all
    182  
    183 }}}
    184 
    185     * If you are working with NFR, the macs2 BAMPE option also works. When using the 'BAMPE' option with paired-end reads, we let MACS run the pileup and calculate 'extsize'.
     182}}}
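To make the read-shifting step above concrete, here is a minimal sketch that applies the same awk rule to a two-line toy BED6 file. The file names `toy_pe.bed` and `toy_tn5_pe.bed` and the two reads are made up for illustration; they are not part of the SOP.

```shell
# Toy BED6 input (hypothetical reads, not real data): one + strand and one - strand read
printf 'chr1\t100\t150\tr1\t60\t+\nchr1\t200\t250\tr2\t60\t-\n' > toy_pe.bed

# Same shift rule as above: + strand start +4, - strand end -5
awk -F '\t' 'BEGIN {OFS = FS} {
    if ($6 == "+") { $2 = $2 + 4 }
    else if ($6 == "-") { $3 = $3 - 5 }
    print
}' toy_pe.bed > toy_tn5_pe.bed

cat toy_tn5_pe.bed
```

The + strand read should now start at 104 instead of 100, and the - strand read should end at 245 instead of 250; all other columns are unchanged.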
     183       * If you have biological replicates, you can identify reproducible peaks with idr ([[http://barcwiki.wi.mit.edu/wiki/SOPs/chip_seq_peaks#repro | detailed information]])
     184{{{
     185idr --samples rep1.narrowPeak rep2.narrowPeak --input-file-type narrowPeak --output-file IDR.txt --plot
     186}}}
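As a possible follow-up to the idr command above, the sketch below keeps only peaks that pass an IDR cutoff. It assumes the output layout described in the idr README, where column 5 holds a scaled score, min(int(-125*log2(IDR)), 1000), so a score of 540 corresponds to IDR 0.05. The 0.05 cutoff, the file names, and the toy rows are assumptions for illustration, not values from this SOP.

```shell
# Toy stand-in for IDR.txt (made-up rows; only the first 12 columns are shown)
printf 'chr1\t100\t300\t.\t1000\t.\t50.0\t-1\t8.0\t100\t3.0\t3.0\n'  > toy_IDR.txt
printf 'chr1\t900\t1100\t.\t540\t.\t20.0\t-1\t4.0\t90\t1.4\t1.3\n' >> toy_IDR.txt
printf 'chr2\t500\t700\t.\t120\t.\t5.0\t-1\t1.0\t80\t0.3\t0.3\n'   >> toy_IDR.txt

# Keep peaks whose scaled IDR score (column 5) is >= 540, roughly IDR <= 0.05
awk -F '\t' '$5 >= 540' toy_IDR.txt > toy_IDR_0.05.txt

# Count the peaks that passed the cutoff
wc -l < toy_IDR_0.05.txt
```

On real data, replace `toy_IDR.txt` with the `IDR.txt` written by idr, and check the column layout against the version of idr you have installed before relying on the cutoff.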
     187
     188    * If you are working with the nucleosome-free region (NFR) (detailed information can be found at the bottom of the page), the macs2 BAMPE option also works. When using the 'BAMPE' option with paired-end reads, we let MACS run the pileup and calculate 'extsize'.
    186189       * BAMPE format: there is no special format for BAMPE; MACS treats the paired-end reads as coming from the same fragment. From the manual:
    187190       * "If the BAM file is generated for paired-end data, MACS will only keep the left mate(5' end) tag. However, when format BAMPE is specified, MACS will use the real fragments inferred from alignment results for reads pileup."
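To illustrate what "real fragments inferred from alignment results" means, the sketch below collapses each read pair into a single fragment interval. It only illustrates the idea and is not how MACS implements BAMPE internally. It assumes BEDPE-style input (for example from `bedtools bamtobed -bedpe` on a name-sorted BAM); the file names and the two toy pairs are hypothetical.

```shell
# Toy BEDPE rows (hypothetical): chrom1 start1 end1 chrom2 start2 end2 name score strand1 strand2
printf 'chr1\t100\t150\tchr1\t230\t280\tpairA\t60\t+\t-\n'  > toy.bedpe
printf 'chr1\t500\t550\tchr1\t520\t570\tpairB\t60\t+\t-\n' >> toy.bedpe

# One interval per pair on the same chromosome: leftmost start to rightmost end, plus its length
awk -F '\t' 'BEGIN {OFS = FS} $1 == $4 {
    s = ($2 < $5) ? $2 : $5
    e = ($3 > $6) ? $3 : $6
    print $1, s, e, $7, e - s
}' toy.bedpe > toy_fragments.bed

cat toy_fragments.bed
```

Here pairA becomes one 180-base fragment (100 to 280) and pairB one 70-base fragment (500 to 570). Without BAMPE, only the 5' tag of the left mate would be kept and extended by a fixed 'extsize'.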
     
    201204 *  [[https://github.com/LiuLabUB/HMMRATAC | HMMRATAC]]
    202205 *  [[https://github.com/jsh58/Genrich | Genrich]]
    203  * Run MACS2 with the BAMPE option for PE reads, i.e. -f BAMPE; this asks MACS to pile up the reads and infer the real fragment size from the alignment.  Note: this option can be used to find accessible regions but is not suitable for finding exact cut sites.  Also, compared with converting to BED format (above), the BAMPE peak calls are more conservative and may miss real peaks.
    204 
    205206
    206207