== CUT&Tag ==
Cleavage Under Targets & Tagmentation (CUT&Tag) is a tethering method that uses a protein-A-Tn5 (pA-Tn5) transposome fusion protein. It is an alternative to ChIP-seq and CUT&RUN for detecting enrichment of protein-DNA interactions or histone modifications.
The Henikoff laboratory has published a detailed description of the [https://yezhengstat.github.io/CUTTag_tutorial/index.html experimental method] together with a
[https://yezhengstat.github.io/CUTTag_tutorial/ protocol for computational analysis]. Our preferences for specific steps are:
  * As in the analysis protocol from the Henikoff lab, we recommend using an IgG control.
  * As an alternative to calling peaks with [https://epigeneticsandchromatin.biomedcentral.com/articles/10.1186/s13072-019-0287-4 SEACR], we recommend using [https://hbctraining.github.io/Intro-to-ChIPseq/lessons/05_peak_calling_macs.html MACS2] because the resulting peaks tend to be narrower and better capture the tagged regions.
{{{
 macs2 callpeak --keep-dup all -t sample.mapped.bam -g hs -f BAMPE -n OutputName
}}}

  * We recommend not removing duplicates from any of the samples.
  * Spike-in calibration using the number of fragments mapped to the E. coli genome, as described in the [https://yezhengstat.github.io/CUTTag_tutorial/ analysis protocol] published by the Henikoff lab, is useful for visualization of the CUT&Tag profile with a genome browser. 
  * Spike-in normalized bedgraph files are not an appropriate input for MACS2, since MACS2 will renormalize to the library size.
  * Spike-in normalization using the commands described in [https://github.com/macs3-project/MACS/wiki/Advanced:-Call-peaks-using-MACS2-subcommands/ Call peaks using MACS2 subcommands, step 4 ] hasn't worked well for us. 
  * We recommend using the spike-in scale factors in subsequent steps when comparing binding between conditions using tools like DESeq2.
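As a minimal sketch of the spike-in calibration mentioned above: the scale factor is a constant divided by the number of fragments mapped to the E. coli genome. The constant (10000), the function name spikein_scale_factor and the fragment count below are assumptions for illustration only; use the counts from your own spike-in alignments.

```shell
#!/usr/bin/env bash
# Sketch: spike-in scale factor = C / (fragments mapped to the spike-in genome).
# C is an arbitrary constant (10000 is assumed here); it only needs to be
# the same for every sample that will be compared.

spikein_scale_factor() {
    local spikein_fragments="$1"   # fragments mapped to the E. coli genome
    local constant="${2:-10000}"   # assumed constant, override if needed
    if [ "$spikein_fragments" -le 0 ]; then
        echo "spike-in fragment count must be a positive integer" >&2
        return 1
    fi
    awk -v c="$constant" -v n="$spikein_fragments" \
        'BEGIN { printf "%.6f\n", c / n }'
}

# Hypothetical sample with 25000 fragments mapped to E. coli
scale=$(spikein_scale_factor 25000)
echo "scale factor: $scale"

# The factor can then be used for browser tracks, for example
# (file names are placeholders):
# bedtools genomecov -bg -scale "$scale" -i sample.fragments.bed \
#     -g chrom.sizes > sample.spikein_norm.bedgraph
```

The resulting bedgraph is meant for visualization only; as noted above, it should not be given to MACS2.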
For a working example of how to run the published analysis workflow using the computing resources at the Whitehead Institute, please follow /nfs/BaRC_Public/BaRC_code/pipelines/analyze_CUTnTag/README and find the associated scripts within the parent directory.

To run the analysis on the same example input with a single **nextflow** command, run the following commands on fry:
{{{
mkdir /nfs/BaRC_training/CUTTAG/yourUserName
cd /nfs/BaRC_training/CUTTAG/yourUserName
sbatch --partition=20 --job-name=NextF_CT --output=NextF_CT_1sample-%j.out --mem=150gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "/nfs/BaRC_Public/apps/nextflow/nextflow run nf-core/cutandrun -profile singularity --input /nfs/BaRC_Public/Hot_Topics/CUTandTag/nextFlow/samplesheet.csv --normalisation_mode CPM --igg_scale_factor 1 --peakcaller 'MACS2' --multiqc_title 'multiQCReport' --skip_removeduplicates true --skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'userName@wi.mit.edu' --genome GRCh38 --extend_fragments false --macs2_qvalue 0.1 --minimum_alignment_q_score 0 --outdir ./OutNextF_keepAllReads_CPM_q0"

###Alternative more stringent peak calling
#Change these parameters to increase the stringency:
# --minimum_alignment_q_score 20  #to filter out low quality mapping 
#and
# --macs2_qvalue 0.01 or 0.001 #to increase macs2 stringency

sbatch --partition=20 --job-name=NextF_CT --output=NextF_CT_1sample-%j.out --mem=150gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "/nfs/BaRC_Public/apps/nextflow/nextflow run nf-core/cutandrun -profile singularity --input /nfs/BaRC_Public/Hot_Topics/CUTandTag/nextFlow/samplesheet.csv --normalisation_mode CPM --igg_scale_factor 1 --peakcaller 'MACS2' --multiqc_title 'multiQCReport' --skip_removeduplicates true --skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'userName@wi.mit.edu' --genome GRCh38 --extend_fragments false --macs2_qvalue 0.01 --minimum_alignment_q_score 20 --outdir ./OutNextF_keepAllReads_CPM_q20"
}}}
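The --input option in the commands above points to a samplesheet in CSV format. The sketch below writes a small samplesheet and checks that every row has the expected number of fields. The column layout (group, replicate, fastq_1, fastq_2, control) is what we understand the 3.x releases of nf-core/cutandrun to expect; treat it as an assumption and check the usage page for the version you run. The group names and FASTQ paths are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal samplesheet for nf-core/cutandrun.
# Assumed columns: group,replicate,fastq_1,fastq_2,control
# The last column names the IgG control group for a target sample
# and is left empty on the control row itself.
cat > samplesheet.csv <<'EOF'
group,replicate,fastq_1,fastq_2,control
h3k27me3,1,/path/to/h3k27me3_rep1_R1.fastq.gz,/path/to/h3k27me3_rep1_R2.fastq.gz,igg_ctrl
igg_ctrl,1,/path/to/igg_rep1_R1.fastq.gz,/path/to/igg_rep1_R2.fastq.gz,
EOF

# Sanity check: every line must have exactly 5 comma-separated fields
awk -F',' '
    NF != 5 { printf "line %d has %d fields\n", NR, NF; bad = 1 }
    END     { exit bad + 0 }
' samplesheet.csv && echo "samplesheet.csv looks well formed"
```

The check only validates the field count; it does not confirm that the FASTQ files exist or that the control group names match.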

These are our recommended options:
{{{
--end_to_end FALSE  
--save_spikein_aligned  TRUE  
--save_align_intermed  TRUE 
--skip_removeduplicates true  
--skip_preseq false   
--skip_dt_qc false 
--skip_multiqc false 
--skip_reporting false 
--dump_scale_factors true
--normalisation_binsize 1 (default 50) 
}}}

To run macs2 using the "--keep-dup auto" setting, you can pass a custom configuration file like the one below (here named macs2CustomCUTRUN.config):
{{{
process {
    withName: '.*:CUTANDRUN:MACS2_.*' {
        ext.args   = [
            '--keep-dup auto',
            '--nomodel',
            '--shift -75',
            '--extsize 150',
            '--format BAM',
            '--bdg ',
            '--qvalue 0.01'
        ].join(' ').trim()

    }
}
}}}

The command to be run using that configuration file is:
{{{
sbatch --partition=20 --job-name=NextF --output=NextF-%j.out --mem=300gb --nodes=1 --ntasks=1 --cpus-per-task=20 --wrap "\
nextflow run nf-core/cutandrun -profile singularity --normalisation_binsize 1 --input ./samplesheet.csv -c macs2CustomCUTRUN.config --normalisation_mode CPM \
--save_align_intermed TRUE --peakcaller 'MACS2' --replicate_threshold 2 --end_to_end FALSE --multiqc_title 'multiQCReport' --skip_removeduplicates true \
--skip_preseq false --skip_dt_qc false --skip_multiqc false --skip_reporting false --dump_scale_factors true --email 'username@wi.mit.edu' --genome GRCh38 \
--extend_fragments false --macs2_qvalue 0.01 --outdir ./nextFlow_macs2auto"
}}}


Pipeline reference pages:

[https://github.com/nf-core/cutandrun/ nf-core/cutandrun pipeline (used here for CUT&Tag)]

[https://nf-co.re/cutandrun/3.2.2/parameters/ CUT&Tag pipeline parameters]