 * If you have human (hg38, hg19) or mouse (mm10, mm9) samples with biological replicates, run the [[https://github.com/ENCODE-DCC/atac-seq-pipeline|ENCODE ATAC-seq Pipeline]]. The pipeline takes fastq files, trims and maps the reads, filters the aligned reads, and calls peaks. Here is the [[https://www.encodeproject.org/pipelines/ENCPL787FUN/|schema of the workflow]]. It also performs quality control; here is a [[http://barc.wi.mit.edu/education/hot_topics/ChIPseq_ATACseq_2021/qc.html | sample QC report]]. The steps below show how to run it on our Whitehead server. Note: it only works with Python 2.
 * Contents of the input sample.json (a quick syntax check is sketched after the block):
{{{
{
    "atac.pipeline_type" : "atac",
    "atac.genome_tsv" : "/nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/mm10/mm10.tsv",
    "atac.fastqs_rep1_R1" : [
        "/fullpath/sample_rep1_1.fastq.gz"
    ],
    "atac.fastqs_rep1_R2" : [
        "/fullpath/sample_rep1_2.fastq.gz"
    ],
    "atac.fastqs_rep2_R1" : [
        "/fullpath/sample_rep2_1.fastq.gz"
    ],
    "atac.fastqs_rep2_R2" : [
        "/fullpath/sample_rep2_2.fastq.gz"
    ],
    "atac.paired_end" : true,
    "atac.auto_detect_adapter" : true,
    "atac.enable_tss_enrich" : true,
    "atac.title" : "sample",
    "atac.description" : "ATAC-seq mouse sample"
}
}}}
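 * Optional sanity check (a suggestion, not part of the pipeline documentation): JSON is strict about commas and quotes, so validating sample.json before launching can save a failed run. Python's built-in json.tool prints the parsed file on success and points at the offending line on failure:
{{{
# Validate the JSON syntax of sample.json
python -m json.tool sample.json
}}}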
 * Supported genome files for hg19, hg38, mm9, and mm10 are in /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline; the atac.genome_tsv value to use in the .json is one of the following (a quick existence check is sketched after this list):
  * hg19: /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/hg19/hg19.tsv
  * hg38: /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/hg38/hg38.tsv
  * mm9: /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/mm9/mm9.tsv
  * mm10: /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/mm10/mm10.tsv
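 * A quick check (illustrative; assumes the NFS share is mounted on your node) that the genome bundle you plan to use is readable, using mm10 as the example:
{{{
# Confirm the .tsv exists and peek at the reference files it points to
ls -lh /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/mm10/mm10.tsv
head /nfs/BaRC_datasets/ENCODE_ATAC-seq_Pipeline/mm10/mm10.tsv
}}}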

 * To initialize conda on the Whitehead servers:
{{{
# Be sure to keep the first dot in the command below:
. /nfs/BaRC_Public/conda/start_barc_conda
}}}
 * Before running the ENCODE pipeline, check for preexisting conda startup code with the command below:
{{{
conda env list
}}}
If you get "conda: command not found", you have no preexisting conda. Otherwise, log out, log back in, start the BaRC conda instance as above, and then activate encode-atac-seq-pipeline.
 * Ignore the developer's instructions to install conda and the pipeline in your home directory; the shared installations under /nfs/BaRC_Public are used instead (an optional check is sketched after the block below).
{{{
conda activate encode-atac-seq-pipeline
}}}
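 * Optional check (illustrative) that the environment activated cleanly; the active environment is flagged with an asterisk, and if caper is installed inside the environment it should now be on your PATH:
{{{
# The active environment is marked with '*'
conda env list
# caper is the workflow runner used in the next step
which caper
}}}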
 * Run the pipeline (a background-run option is sketched after the block below). Fastq files in the .json can be given as URLs or full paths. [[https://github.com/ENCODE-DCC/atac-seq-pipeline/blob/master/docs/input.md | Detailed information about the .json file]]
{{{
caper run /nfs/BaRC_Public/atac-seq-pipeline/atac.wdl -i sample.json
# After the job finishes, you can deactivate conda with
conda deactivate
}}}
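 * The pipeline can take many hours. One option (a suggestion only; caper_run.log is an arbitrary name) is to launch it with nohup, or inside screen/tmux, so the run survives a closed terminal:
{{{
# Run in the background and keep a log of the run
nohup caper run /nfs/BaRC_Public/atac-seq-pipeline/atac.wdl -i sample.json > caper_run.log 2>&1 &
# Follow progress
tail -f caper_run.log
}}}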
 * The QC report is at call-qc_report/execution/qc.html
 * IDR peak files (a quick inspection is sketched after this list):
  * rep1: call-idr_pr/shard-0/execution/rep1-pr1_vs_rep1-pr2.idr0.05.bfilt.narrowPeak.gz
  * rep2: call-idr_pr/shard-1/execution/rep2-pr1_vs_rep2-pr2.idr0.05.bfilt.narrowPeak.gz
  * Note: shard-0 refers to the first biological replicate, shard-1 refers to the second biological replicate, and so on
  * rep1 and rep2: call-idr/shard-1/execution/rep1_vs_rep2.idr0.05.bfilt.narrowPeak.gz
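 * A minimal sketch for inspecting the IDR output (narrowPeak files are gzipped, BED-like text; paths shortened to the file name here):
{{{
# Peek at the first peaks and count how many passed the IDR 0.05 / blacklist filter
zcat rep1_vs_rep2.idr0.05.bfilt.narrowPeak.gz | head
zcat rep1_vs_rep2.idr0.05.bfilt.narrowPeak.gz | wc -l
}}}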
Follow this for species other than human/mouse, or if there are no replicates: