Using HiChIP experiments to characterize genome-wide chromatin contacts between regulatory elements
The HiChIP method combines Hi-C, a technique for high-throughput chromosome conformation capture, with chromatin immunoprecipitation-sequencing (ChIP-seq) to characterize genome-wide chromatin contacts between regulatory elements, such as those marked by specific histone modifications or bound by other proteins (e.g. cohesin). MAPS is an analysis pipeline that identifies significant interactions in HiChIP (or the closely related PLAC-seq) data and produces output that can be visualized in genome browsers.
Analysis outline
Set up the configuration file
The MAPS pipeline is run from a shell script that specifies important configuration settings, including the file paths to the interpreters and software used for manipulating sequencing data. For data collected using the HiChIP kit from Arima Genomics, the pipeline comes with the Arima-MAPS_v2.0.sh shell script. Before running it on the Whitehead cluster, edit the script so that it includes the following settings:
python_path=/usr/bin/python
Rscript_path=/usr/bin/Rscript
MACS2_path=/usr/local/bin/python3.6/macs2
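Before launching a long run, it can help to confirm that each configured path points to an executable file. The sketch below is only an illustration: check_tool is a hypothetical helper that is not part of MAPS, and the three paths are copied from the settings above.

```shell
# Hypothetical helper (not part of MAPS): report whether a tool path is executable.
check_tool() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Paths as set in Arima-MAPS_v2.0.sh for the Whitehead cluster.
python_path=/usr/bin/python
Rscript_path=/usr/bin/Rscript
MACS2_path=/usr/local/bin/python3.6/macs2

# Check every path and keep going even if one is missing.
for tool in "$python_path" "$Rscript_path" "$MACS2_path"; do
    check_tool "$tool" || true
done
```

Any line reported as "missing" indicates a setting that should be corrected in the script before the pipeline is started.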
Submit data processing to the LSF batch queue
The command below is an example of how to run MAPS; use it as the basis for the job you submit to the LSF batch queue. In this example, ChIP peaks are supplied to the pipeline (-m) rather than being called by it with MACS2, and the reference genome is the human assembly hg19 (-o specifies the "organism" here).
Arima-MAPS_v2.0.sh -C 0 -I /path/to/fastqFiles/fastqFileNamePrefix -O /path/to/output -m /path/to/peaks/peaks.bed -o hg19 -b /nfs/genomes/human_gp_feb_09_no_random/bwa_alt_name/hg19.fa -t 8 -f 1
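One way to send this command to LSF is to wrap it in bsub. The sketch below only prints the submission line so that it can be reviewed first. It rests on several assumptions: maps_bsub_cmd is a hypothetical helper, the job name and log file name (maps_hichip) are placeholders, -t 8 is taken to be the MAPS thread count, and the slot request (-n 8 on a single host) is chosen to match it. Queue names and resource limits on your cluster may require additional options.

```shell
# Hypothetical helper: print a bsub command line for a given slot count and MAPS command.
# %J in the log file name is expanded by LSF to the job ID.
maps_bsub_cmd() {
    printf 'bsub -n %s -R "span[hosts=1]" -J maps_hichip -o maps_hichip.%%J.log "%s"\n' "$1" "$2"
}

# The MAPS command from the example above, with placeholder paths left unchanged.
maps_cmd="Arima-MAPS_v2.0.sh -C 0 -I /path/to/fastqFiles/fastqFileNamePrefix -O /path/to/output -m /path/to/peaks/peaks.bed -o hg19 -b /nfs/genomes/human_gp_feb_09_no_random/bwa_alt_name/hg19.fa -t 8 -f 1"

# Request 8 slots on one host (assumed to match -t 8) and print the line for review.
maps_bsub_cmd 8 "$maps_cmd"
```

After checking the printed line, paste it into the shell to submit the job.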