Changes between Version 8 and Version 9 of SOPs/chip_seq_peaks


Timestamp:
05/14/14 16:03:50 (11 years ago)
Author:
ibarrasa
Comment:

--

  • SOPs/chip_seq_peaks

    v8 v9  
    1313
    1414 * See the [[http://barcwiki.wi.mit.edu/wiki/SOPs/mapping|mapping SOP]] for more details.
    15 
    16 === Step 2: Call peaks (bound regions) ===
     15=== Step 2: Strand cross correlation analysis ===
     16 * The goal of this step is to assess the quality of the IP and to estimate the fragment size of the immunoprecipitated DNA.
     17 * For a detailed explanation of strand cross-correlation analysis, see Box 2 of this paper ([[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003326|Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data]]).
     18{{{
     19/nfs/BaRC_Public/phantompeakqualtools/run_spp.R -c=TreatmentIP.bam -savp -out=TreatmentIP.run_spp.out
     20}}}
     21
     22  * In the cross-correlation plot from this analysis, a good ChIP-seq experiment has a second peak (reflecting the fragment size) at least as tall as the first peak (reflecting the read length). This is how the graph should look: ([[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/figure/F4/|Fig4E]]). If the second peak is smaller than the first ([[http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3431496/figure/F4/|like the example shown in Fig4G Marginal]]), macs will not estimate the fragment size correctly. In that case we recommend running macs with the parameters "--nomodel" and "--shiftsize=half_of_the_fragment_size", where the fragment size is the one estimated by the strand cross-correlation analysis.
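
The "half of the fragment size" value can be derived from the run_spp.R output with a small helper. This is only a sketch: `half_fragment_size` is a hypothetical function name, and it assumes the output file is tab-delimited with the estimated fragment lengths in column 3 as a comma-separated list, best estimate first. Check that layout against your own TreatmentIP.run_spp.out, and against the cross-correlation plot, before relying on the number.

```shell
# Hypothetical helper: print half of the top estimated fragment length
# found in a run_spp.R output file.
# ASSUMPTION: the file is tab-delimited and column 3 holds one or more
# comma-separated fragment length estimates, with the best one first.
half_fragment_size() {
    awk -F'\t' 'NR == 1 {
        split($3, est, ",")      # est[1] is the first (top) estimate
        print int(est[1] / 2)    # integer half, for use as a shift size
    }' "$1"
}
```

The printed value could then be passed to macs, for example with `--nomodel --shiftsize="$(half_fragment_size TreatmentIP.run_spp.out)"`.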
     23 
     24=== Step 3: Call peaks (bound regions) ===
    1725Some of the parameters to consider when comparing programs are:
    1826  * Adjustment of sequence tags to better represent the original DNA fragment (by shifting tags in the 3′ direction or by extending tags to the estimated length of the original fragments)
     
    2432Based on our ChIP-Seq bake off and on a published review ([[http://www.plosone.org/article/info:doi/10.1371/journal.pone.0011471|Evaluation of ChIP-Seq performance]]), MACS and SISSRs are good programs to try.
    2533
    26 ==== MACS ====
     34==== MACS14 ====
    2735  * For MACS to work the header of the sequences has to have no spaces.
    2836  * macs points to macs14 on WIBR local machines
     
    3240}}}
    3341
     42
    3443[[http://liulab.dfci.harvard.edu/MACS/|MACS]] ([[http://liulab.dfci.harvard.edu/MACS/00README.html|README]]) ([[http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=pubmed&term=18798982|Zhang et al., 2008]])[[br]][[br]]
    3544
     
    3746
    3847Sample commands to run MACS (current version as of March 5 2012: 1.4.2) using mapped reads in map or SAM format:
    39 
    4048{{{
    4149macs -t IP_mapped.map -c Control_mapped.map -g 1.87e9 --name=outputName --format=BOWTIE --tsize=36 --wig --space=25  --mfold=10,30
    4250macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM    --tsize=36 --wig --space=25  --mfold=10,30
    43 }}}
     51macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM    --tsize=36 --wig --space=25  --nomodel --shiftsize=100
     52}}}
     53
    4454
    4555The parameters used on the command are:
     
    5262  * --mfold=MFOLD      Select the regions within MFOLD range of high-confidence enrichment ratio against background to build model. The regions must be lower than upper limit, and higher than the lower limit. DEFAULT:10,30
    5363  * -g GSIZE  Effective genome size. It can be 1.0e+9 or 1000000000, or shortcuts:'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) and 'dm' for fruitfly (1.2e8), Default:hs
    54   * --keep-dup=1   Controls the MACS behavior towards duplicate tags at the exact same location.  DEFAULT: 1 in MACS 1.4; auto in MACS2. 
     64  * --keep-dup=1   Controls the MACS behavior towards duplicate tags at the exact same location.  DEFAULT: 1 in MACS 1.4; auto in MACS2.   
     65  * --nomodel  Skip building the shifting model. When set, MACS uses the value of --shiftsize (DEFAULT: 100) instead of estimating the shift from the data.
     66  * --shiftsize  The arbitrary shift size in bp. When --nomodel is set, MACS uses this value as 1/2 of the fragment size. DEFAULT: 100.
    5567
    5668{{{
     
    5971}}}
    6072
    61 '''Note:''' The wig files that macs14 generates are not normalized.
     73
     74''Note'': The wig files that macs14 generates are not normalized.
     75
     76
     77==== MACS2 ====
     78
     79  * MACS2 is appropriate both for proteins such as transcription factors, which may have narrow peaks, and for histone modifications, which may affect broader regions. For broader peaks we recommend using --nomodel, --nolambda (if there's no control), and the fragment size calculated by the strand cross-correlation analysis. We recommend using macs2 rather than macs14 for broad peaks.
     80{{{
     81macs2  callpeak -t IP_reads.mapped_only.bam  -c Control_reads.mapped_only.bam -f BAM -g mm  -n Name --nomodel   -B
     82}}}
     83  * --nomodel  Skip building the shifting model. By default this means a shifting size of 100.
     84  * -f Input format
     85  * -B create bedgraph output files
    6286==== SISSRs ====
    6387[[http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/|SiSSRs]] ([[http://nar.oxfordjournals.org/cgi/content/full/36/16/5221|Reference]]) ([[http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/SISSRs-Manual.pdf|Manual]])
     
    96120
    97121
    98 === Step 3: Linking bound regions to genes ===
     122=== Step 4: Linking bound regions to genes ===
    99123Both MACS and SISSRs provide bed files with the set of peaks, presumably indicating bound regions.
    100124
     
    195219
    196220
     221