Changes between Version 8 and Version 9 of SOPs/chip_seq_peaks
- Timestamp:
- 05/14/14 16:03:50 (11 years ago)
SOPs/chip_seq_peaks
v8 v9
13 13
14 14 * See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping|mapping SOP]] for more details.
15
16 === Step 2: Call peaks (bound regions) ===
15 === Step 2: Strand cross correlation analysis ===
16 * The goal of this step is to assess the quality of the IP and to estimate the fragment size of the immunoprecipitated DNA.
17 * For a detailed explanation of strand cross-correlation analysis, see Box 2 of this paper ([[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003326|Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data]]).
18 {{{
19 /nfs/BaRC_Public/phantompeakqualtools/run_spp.R -c=TreatmentIP.bam -savp -out=TreatmentIP.run_spp.out
20 }}}
21
22 * After this analysis, a good ChIP-seq experiment will have a second peak (reflecting the fragment size) at least as tall as the first peak (reflecting the read length). This is how the graph should look: ([[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/figure/F4/|Fig4E]]). If the second peak is smaller than the first ([[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/figure/F4/|like the example shown in Fig4G Marginal]]), macs will not estimate the fragment size correctly. In that case we recommend running macs with the parameters "--nomodel" and "--shiftsize=half_of_the_fragment_size", where the fragment size is the one estimated by the strand cross-correlation analysis.
23
24 === Step 3: Call peaks (bound regions) ===
17 25 Some of the parameters to consider when comparing programs are:
18 26 * Adjustment of sequence tags to better represent the original DNA fragment (by shifting tags in the 3′ direction or by extending tags to the estimated length of the original fragments)
… …
24 32 Based on our ChIP-Seq bake off and on a published review ([[http://www.plosone.org/article/info:doi/10.1371/journal.pone.0011471|Evaluation of ChIP-Seq performance]]), MACS and SISSRs are good programs to try.
25 33
26 ==== MACS ====
34 ==== MACS14 ====
27 35 * For MACS to work, the sequence headers must not contain spaces.
28 36 * macs points to macs14 on WIBR local machines
… …
32 40 }}}
33 41
42
34 43 [[http://liulab.dfci.harvard.edu/MACS/|MACS]] ([[http://liulab.dfci.harvard.edu/MACS/00README.html|README]]) ([[http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=pubmed&term=18798982|Zhang et al., 2008]])[[br]][[br]]
35 44
… …
37 46
38 47 Sample commands to run MACS (current version as of March 5 2012: 1.4.2) using mapped reads in map or sam format:
39
40 48 {{{
41 49 macs -t IP_mapped.map -c Control_mapped.map -g 1.87e9 --name=outputName --format=BOWTIE --tsize=36 --wig --space=25 --mfold=10,30
42 50 macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM --tsize=36 --wig --space=25 --mfold=10,30
43 }}}
51 macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM --tsize=36 --wig --space=25 --nomodel --shiftsize=100
52 }}}
53
44 54
45 55 The parameters used in the commands are:
… …
52 62 * --mfold=MFOLD Select the regions within MFOLD range of high-confidence enrichment ratio against background to build model. The regions must be lower than upper limit, and higher than the lower limit. DEFAULT:10,30
53 63 * -g GSIZE Effective genome size. It can be 1.0e+9 or 1000000000, or shortcuts:'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) and 'dm' for fruitfly (1.2e8), Default:hs
54 * --keep-dup=1 Controls the MACS behavior towards duplicate tags at the exact same location. DEFAULT: 1 in MACS 1.4; auto in MACS2.
64 * --keep-dup=1 Controls the MACS behavior towards duplicate tags at the exact same location. DEFAULT: 1 in MACS 1.4; auto in MACS2.
65 * --nomodel Whether or not to build the shifting model. If set, MACS will not build the model; by default this means a shift size of 100.
66 * --shiftsize The arbitrary shift size in bp. When --nomodel is set, MACS will use this value as 1/2 of the fragment size. DEFAULT: 100.
55 67
56 68 {{{
… …
59 71 }}}
60 72
61 '''Note:''' The wig files that macs14 generates are not normalized.
73
74 ''Note'': The wig files that macs14 generates are not normalized.
75
76
77 ==== MACS2 ====
78
79 * MACS2 is appropriate both for proteins like transcription factors that may have narrow peaks and for histone modifications that may affect broader regions. For broader peaks we recommend using --nomodel, --nolambda (if there's no control), and the fragment size calculated in the strand cross-correlation analysis. We recommend using macs2 rather than macs14 for broad peaks.
80 {{{
81 macs2 callpeak -t IP_reads.mapped_only.bam -c Control_reads.mapped_only.bam -f BAM -g mm -n Name --nomodel -B
82 }}}
83 * --nomodel Whether or not to build the shifting model. If set, MACS will not build the model; by default this means a shift size of 100.
84 * -f Input format
85 * -B Create bedgraph output files
62 86 ==== SISSRs ====
63 87 [[http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/|SiSSRs]] ([[http://nar.oxfordjournals.org/cgi/content/full/36/16/5221|Reference]]) ([[http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/SISSRs-Manual.pdf|Manual]])
… …
96 120
97 121
98 === Step 3: Linking bound regions to genes ===
122 === Step 4: Linking bound regions to genes ===
99 123 Both MACS and SISSRs provide bed files with the set of peaks, presumably indicating bound regions.
100 124
… …
195 219
196 220
221
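
The recommendation added in Step 2 (run macs with "--nomodel" and "--shiftsize=half_of_the_fragment_size") can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `shiftsize_from_spp` is hypothetical, and the sketch assumes that the file written by `run_spp.R -out=...` is tab-separated with the fragment-length estimates in the third column as a comma-separated list, best estimate first. Check the file produced on your system before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SOP): print half of the estimated
# fragment size, suitable as the value for the macs14 --shiftsize option.
# ASSUMPTION: column 3 of the run_spp.R -out file holds comma-separated
# fragment-length estimates, with the best estimate listed first.
shiftsize_from_spp() {
    # $1: path to the file written by run_spp.R -out=...
    awk -F'\t' 'NR == 1 {
        split($3, est, ",")      # est[1] is the first fragment-length estimate
        print int(est[1] / 2)    # half of the fragment size, truncated to an integer
    }' "$1"
}

# Usage sketch (file names are the placeholders used in the SOP):
#   shift=$(shiftsize_from_spp TreatmentIP.run_spp.out)
#   macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName \
#        --format=SAM --tsize=36 --wig --space=25 --nomodel --shiftsize="$shift"
```

Only the first estimate is used here; if the cross-correlation plot looks like the marginal example, it is worth inspecting the plot and the other estimates by eye before choosing a value.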