== CUT&Tag ==

Cleavage Under Targets & Tagmentation (CUT&Tag) is a tethering method that uses a protein A-Tn5 (pA-Tn5) transposome fusion protein. It is an alternative to ChIP-seq and CUT&RUN for detecting enrichment of protein-DNA interactions or histone modifications. A detailed description of the [https://yezhengstat.github.io/CUTTag_tutorial/index.html experimental method], together with a [https://yezhengstat.github.io/CUTTag_tutorial/ protocol for computational analysis], has been published by the Henikoff laboratory.

Our preferences for specific steps include:

* As in the analysis protocol from the Henikoff lab, we recommend using an IgG control.
* As an alternative to calling peaks with [https://epigeneticsandchromatin.biomedcentral.com/articles/10.1186/s13072-019-0287-4 SEACR], we recommend using [https://hbctraining.github.io/Intro-to-ChIPseq/lessons/05_peak_calling_macs.html MACS2] because the resulting peaks tend to be narrower and better capture the tagged regions.
{{{
macs2 callpeak --keep-dup all -t sample.mapped.bam -g hs -f BAMPE -n OutputName
}}}
* We recommend not removing duplicates from any of the samples.
* Spike-in calibration using the number of fragments mapped to the E. coli genome, as described in the [https://yezhengstat.github.io/CUTTag_tutorial/ analysis protocol] published by the Henikoff lab, is useful for visualizing the CUT&Tag profile in a genome browser.
* Spike-in normalized bedgraph files are not an appropriate input for MACS2, since MACS2 will renormalize to the library size.
* Spike-in normalization using the commands described in [https://github.com/macs3-project/MACS/wiki/Advanced:-Call-peaks-using-MACS2-subcommands/ Call peaks using MACS2 subcommands, step 4] hasn't worked well for us.
* We recommend using the spike-in scale factors in subsequent steps when comparing binding between conditions with tools like DESeq2.
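The spike-in calibration mentioned in the list above can be sketched in shell. This is a minimal sketch, not the published workflow: it assumes the scale factor is a constant divided by the number of fragments mapped to the E. coli genome. The constant 10000, the function name `spikein_scale_factor`, and the file names in the commented usage are assumptions for illustration; check them against the analysis protocol before use.

```shell
#!/usr/bin/env bash
# Sketch: spike-in scale factor = constant / (fragments mapped to the E. coli genome).
# The constant is an arbitrary multiplier; 10000 is an assumed default.
spikein_scale_factor() {
    local spikein_fragments="$1"
    local constant="${2:-10000}"
    awk -v n="$spikein_fragments" -v c="$constant" 'BEGIN {
        if (n <= 0) {
            print "spike-in fragment count must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.6f\n", c / n
    }'
}

# Example usage (hypothetical file names; requires samtools and bedtools):
#   spikein_reads=$(samtools view -c -F 0x04 sample.spikein.bam)
#   scale=$(spikein_scale_factor "$((spikein_reads / 2))")   # paired-end: two reads per fragment
#   bedtools genomecov -bg -scale "$scale" -i sample.fragments.bed -g hg38.chrom.sizes > sample.spikein_norm.bedgraph
```

The resulting bedgraph is intended for genome browser visualization only; as noted above, it should not be passed to MACS2.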
For a working example of how to run the published analysis workflow on the computing resources at the Whitehead Institute, follow /nfs/BaRC_Public/BaRC_code/pipelines/analyze_CUTnTag/README; the associated scripts are in its parent directory.

To run the analysis on the same example input with a single command using **nextflow**, run the following commands on fry:
{{{
mkdir /nfs/BaRC_training/CUTTAG/yourUserName
cd /nfs/BaRC_training/CUTTAG/yourUserName

sbatch --partition=20 --job-name=NextF_CT --output=NextF_CT_1sample-%j.out --mem=150gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "/nfs/BaRC_Public/apps/nextflow/nextflow run nf-core/cutandrun -profile singularity --input /nfs/BaRC_Public/Hot_Topics/CUTandTag/nextFlow/samplesheet.csv --normalisation_mode CPM --igg_scale_factor 1 --peakcaller 'MACS2' --multiqc_title 'multiQCReport' --skip_removeduplicates true --skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'userName@wi.mit.edu' --genome GRCh38 --extend_fragments false --macs2_qvalue 0.1 --minimum_alignment_q_score 0 --outdir ./OutNextF_keepAllReads_CPM_q0"

### Alternative: more stringent peak calling
# Change these parameters to increase the stringency:
#   --minimum_alignment_q_score 20   # filter out low-quality mappings
# and
#   --macs2_qvalue 0.01 or 0.001     # increase MACS2 stringency

sbatch --partition=20 --job-name=NextF_CT --output=NextF_CT_1sample-%j.out --mem=150gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "/nfs/BaRC_Public/apps/nextflow/nextflow run nf-core/cutandrun -profile singularity --input /nfs/BaRC_Public/Hot_Topics/CUTandTag/nextFlow/samplesheet.csv --normalisation_mode CPM --igg_scale_factor 1 --peakcaller 'MACS2' --multiqc_title 'multiQCReport' --skip_removeduplicates true --skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'userName@wi.mit.edu' --genome GRCh38 --extend_fragments false --macs2_qvalue 0.01 --minimum_alignment_q_score 20 --outdir ./OutNextF_keepAllReads_CPM_q20"
}}}

These are our recommended options:
{{{
--end_to_end FALSE
--save_spikein_aligned TRUE
--save_align_intermed TRUE
--skip_removeduplicates true
--skip_preseq false
--skip_dt_qc false
--skip_multiqc false
--skip_reporting false
--dump_scale_factors true
--normalisation_binsize 1 (default 50)
}}}

To run MACS2 with the "--keep-dup auto" setting, you can supply a custom configuration file like the one below (macs2CustomCUTRUN.config):
{{{
process {
    withName: '.*:CUTANDRUN:MACS2_.*' {
        ext.args = [
            '--keep-dup auto',
            '--nomodel',
            '--shift -75',
            '--extsize 150',
            '--format BAM',
            '--bdg ',
            '--qvalue 0.01'
        ].join(' ').trim()
    }
}
}}}

The command to run with that configuration file is:
{{{
sbatch --partition=20 --job-name=NextF --output=NextF-%j.out --mem=300gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "\
nextflow run nf-core/cutandrun -profile singularity --normalisation_binsize 1 --input ./samplesheet.csv -c macs2CustomCUTRUN.config --normalisation_mode CPM \
--save_align_intermed TRUE --peakcaller 'MACS2' --replicate_threshold 2 --end_to_end FALSE --multiqc_title 'multiQCReport' --skip_removeduplicates true \
--skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'username@wi.mit.edu' --genome GRCh38 \
--extend_fragments false --macs2_qvalue 0.01 --outdir ./nextFlow_macs2auto"
}}}

Pipeline reference pages:

[https://github.com/nf-core/cutandrun/ nf-core CUT&Tag pipeline]

[https://nf-co.re/cutandrun/3.2.2/parameters/ CUT&Tag pipeline parameters]
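The nextflow commands above read their samples from the file given to `--input`. The following is a minimal sketch of how such a samplesheet might be created and checked. It assumes the five-column layout (group, replicate, fastq_1, fastq_2, control) that we understand version 3.x of the pipeline to expect; confirm it against the pipeline reference pages. The sample names and FASTQ paths are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical sample names and FASTQ paths; replace them with your own.
# The "control" column names the IgG group to use for a sample;
# it is left empty on the IgG row itself.
cat > samplesheet.csv <<'EOF'
group,replicate,fastq_1,fastq_2,control
H3K27me3,1,/path/to/H3K27me3_rep1_R1.fastq.gz,/path/to/H3K27me3_rep1_R2.fastq.gz,IgG
IgG,1,/path/to/IgG_rep1_R1.fastq.gz,/path/to/IgG_rep1_R2.fastq.gz,
EOF

# Sanity check: every row must have as many comma-separated fields as the header.
# Exits non-zero and reports the offending line otherwise.
awk -F',' '
    NR == 1 { n = NF }
    NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n; bad = 1 }
    END     { exit bad }
' samplesheet.csv
```

Listing the IgG sample in the control column of each target sample matches the recommendation at the top of this page to include an IgG control.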