4 | | The [[https://science.sciencemag.org/content/326/5950/289 | Hi-C method]] Hi-C method generalizes earlier experimental techniques, such as 3C or 5C, for characterizing contacts between specific chromosomal loci, to enable unbiased identification of chromatin interactions across an entire genome. Two software pipelines for analyzing data from Hi-C experiments are [[https://github.com/aidenlab/juicer | juicer]] and [[https://github.com/nservant/HiC-Pro | HiC-Pro]]. An example for using HiC-Pro on Whitehead computing resources with data collected with a kit from Arima Genomics is outlined below. Note that in this example, a reference genome is used that excludes unlocalized or unplaced contigs from the assembly. This choice is taken to ease downstream analysis. Please see the HiC-Pro [[http://nservant.github.io/HiC-Pro/ | documentation]] for additional examples. |
| 4 | The [[https://science.sciencemag.org/content/326/5950/289 | Hi-C method]] generalizes earlier experimental techniques, such as 3C and 5C, which characterize contacts between specific chromosomal loci, to enable unbiased identification of chromatin interactions across an entire genome. Two software pipelines for analyzing data from Hi-C experiments are [[https://github.com/aidenlab/juicer | juicer]] and [[https://github.com/nservant/HiC-Pro | HiC-Pro]]. |
| 5 | |
| 6 | Instructions for running juicer on Whitehead computing resources are as follows, using juicer 1.6 with a shell script designed to run on a Slurm cluster. Start in the folder that contains your fastq files. |
| 7 | |
| 8 | {{{ |
| 9 | # 1 - Set up the directory and file structure expected by juicer |
| 10 | |
| 11 | # Link to the slurm scripts folder |
| 12 | ln -s /nfs/BaRC_Public/apps/juicer/juicer-1.6/scripts |
| 13 | # Link to the genome files (for TAIR10) or create your own, with a similar organization |
| 14 | ln -s /nfs/BaRC_Public/apps/juicer/juicer-1.6/genome |
| 15 | ln -s /nfs/BaRC_Public/apps/juicer/juicer-1.6/references |
| 16 | ln -s /nfs/BaRC_Public/apps/juicer/juicer-1.6/restriction_sites |
| 17 | # Create a folder called 'fastq' and symlink to your fastq sequences |
| 18 | # Fastq sequence files need to be plain text (not gzipped) and have names like [SAMPLE]_R1.fastq and [SAMPLE]_R2.fastq |
| 19 | mkdir fastq; cd fastq |
| 20 | ln -s ../*.fastq . |
| 21 | # Go back to your original working directory |
| 22 | cd .. |
| 23 | |
| 24 | # 2 - Run the main juicer command, replacing MY_WORKING_DIR with your current working directory (which contains the scripts, fastq, genome, etc. folders) and using the names of your desired genome files |
| 25 | ./scripts/juicer.sh -g TAIR10 -z MY_WORKING_DIR/references/TAIR10.fa -q 20 -l 20 -s DpnII -p MY_WORKING_DIR/genome/TAIR10.chrom.sizes -y MY_WORKING_DIR/restriction_sites/TAIR10_DpnII.txt -D MY_WORKING_DIR -t 8 |
| 26 | }}} |
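The setup above requires uncompressed fastq files named like [SAMPLE]_R1.fastq and [SAMPLE]_R2.fastq. If your files arrive gzipped, a preparation step along the following lines may help. This is a minimal sketch, not part of juicer: the function names `decompress_fastq` and `check_fastq_names` are hypothetical, and it assumes the compressed files end in `.fastq.gz`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of juicer): decompress every *.fastq.gz in a
# directory to a plain-text *.fastq next to it, keeping the compressed original.
decompress_fastq() {
    local dir="${1:-.}"
    local gz out
    for gz in "$dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when no gzipped files are present
        [ -e "$gz" ] || continue
        out="${gz%.gz}"
        # Do not overwrite an existing uncompressed copy
        [ -e "$out" ] && continue
        gunzip -c "$gz" > "$out"
    done
}

# Hypothetical check (not part of juicer): report fastq files whose names do
# not end in _R1.fastq or _R2.fastq. Returns non-zero if any are found.
check_fastq_names() {
    local dir="${1:-.}"
    local f status=0
    for f in "$dir"/*.fastq; do
        [ -e "$f" ] || continue
        case "$f" in
            *_R1.fastq|*_R2.fastq) ;;
            *) echo "unexpected fastq name: $f" >&2; status=1 ;;
        esac
    done
    return "$status"
}
```

For example, running `decompress_fastq . && check_fastq_names .` in the folder with your sequence files, before creating the 'fastq' folder and its symlinks, would catch naming problems before the job is submitted.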
| 27 | |
| 28 | An example of using HiC-Pro on Whitehead computing resources, with data collected using a kit from Arima Genomics, is outlined below. Note that this example uses a reference genome that excludes unlocalized and unplaced contigs from the assembly. This choice is made to ease downstream analysis. Please see the HiC-Pro [[http://nservant.github.io/HiC-Pro/ | documentation]] for additional examples. |