Changes between Version 19 and Version 20 of SOPs/InProgress


Timestamp:
05/14/14 10:03:10
Author:
gbell
Comment:

--

  • SOPs/InProgress

    v19 v20  
    1 == Get reproducible peaks from multiple chip-seq replicates with IDR if macs is used for peak calls. ==
     1== Get reproducible peaks from multiple ChIP-seq replicates with IDR (after using macs for peak calling) ==
    22
    33For more information about the method, see the main [[https://sites.google.com/site/anshulkundaje/projects/idr IDR page]]
     
    1111# -p 1e-3 => Set p-value cutoff to 1e-3 (which is more relaxed than the default setting)
    1212
    13 bsub macs2 callpeak -t IP_1.bam -c control_1.bam  -f BAM -g hs -n IP.1_vs_control.1 -B -p 1e-3
    14 bsub macs2 callpeak -t IP_2.bam -c control_2.bam  -f BAM -g hs -n IP.2_vs_control.2 -B -p 1e-3
     13bsub macs2 callpeak -t IP_1.bam -c control_1.bam -f BAM -g hs -n IP.1_vs_control.1 -B -p 1e-3
     14bsub macs2 callpeak -t IP_2.bam -c control_2.bam -f BAM -g hs -n IP.2_vs_control.2 -B -p 1e-3
    1515}}}
    1616
    17 === Sort peaks in .narrowPeak files (created by macs2) from best to worst using the -log10(p-value) column (column 8), and only keep the top 100k peaks (at most) ===
     17=== Sort peaks in .narrowPeak files (created by macs2) ===
     18
     19Sort from best to worst using the -log10(p-value) column (column 8), and only keep the top 100k peaks (at most).
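As a rough sketch of this step, the sort-and-truncate can be wrapped in a small shell function. The function name `top_peaks` is hypothetical; the input name assumes macs2's usual `<name>_peaks.narrowPeak` output naming, and the output name matches the `.regionPeak.gz` files used in the IDR step.

```shell
# Hypothetical helper (not part of macs2 or the IDR scripts):
# keep at most N peaks, ranked best-first by column 8 (-log10 p-value).
top_peaks() {
    in_file=$1      # .narrowPeak file produced by macs2
    max_peaks=$2    # maximum number of peaks to keep, e.g. 100000
    out_file=$3     # gzipped output, e.g. a .regionPeak.gz file
    # Reverse numeric sort on column 8 only; LC_ALL=C keeps "." as the decimal point.
    LC_ALL=C sort -k8,8nr "$in_file" | head -n "$max_peaks" | gzip -c > "$out_file"
}

# Example call (assumed file names):
# top_peaks IP.1_vs_control.1_peaks.narrowPeak 100000 IP.1_vs_control.1.regionPeak.gz
```

Files with fewer than 100k peaks pass through unchanged apart from the reordering, which is why the cutoff is "at most" 100k.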
    1820
    1921{{{
     
    3739# ranking.measure => p.value is recommended
    3840#
     41
    3942batch-consistency-analysis.r IP.1_vs_control.1.regionPeak.gz IP.2_vs_control.2.regionPeak.gz -1 rep1_vs_rep2_IDR 0 F p.value chromInfo.txt
    4043}}}
     
    4447# USAGE: batch-consistency-plot.r [npairs] [output.prefix] [input.file.prefix1] [input.file.prefix2] [input.file.prefix3] ....
    4548# This method can plot 1 or more pairs of replicates
     49
    4650batch-consistency-plot.r 1 rep1_vs_rep2_IDR_plot rep1_vs_rep2_IDR
    4751}}}
     
    4953=== Generate a conservative and an optimal final set of peak calls ===
    5054{{{
    51  # Use an IDR cutoff of 0.01 to 0.05, depending on the number of pre-IDR peaks and size of the genome:
    52  # IDR <=0.05 for < 100K pre-IDR peaks for large genomes (human/mouse)
    53  # IDR <= 0.01 or 0.02 for ~15K to 40K peaks in smaller genomes such as worm
     55# Use an IDR cutoff of 0.01 to 0.05, depending on the number of pre-IDR peaks and size of the genome:
     56# IDR <= 0.05 for < 100K pre-IDR peaks for large genomes (human/mouse)
     57# IDR <= 0.01 or 0.02 for ~15K to 40K peaks in smaller genomes such as worm
    5458
    5559awk '{ if($NF < 0.05) print $0 }' rep1_vs_rep2_IDR-overlapped-peaks.txt > rep1_vs_rep2_conserved_peaks_by_IDR.txt