Changes between Version 56 and Version 57 of SOPs/chip_seq_peaks


Timestamp:
06/15/21 13:03:47 (4 years ago)
Author:
ibarrasa
Comment:

--

  • SOPs/chip_seq_peaks

    v56 v57  
    80 80 [[https://github.com/taoliu/MACS/wiki | MACS2 wiki]]
    81 81
    82 ==== MACS14 ====
    83 We now recommend using MACS2, which is more effective at peak finding, especially for broad peaks. If you still want to use MACS 1.4, our previous recommendations on how to use it follow.
    84   * For MACS to work, the sequence headers must not contain spaces.
    85   * The command 'macs' points to macs14 on WIBR local machines.
    86   * MACS may have trouble with a SAM file from bowtie if it contains unmapped reads (which it generally does).  As a result, you may need to filter out unmapped reads with a command like:
    87 {{{
    88 samtools view -hS -F 4 all_reads.sam > mapped_reads.sam
    89 }}}
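For reference, -F 4 drops reads whose FLAG field has bit 4 (read unmapped) set, while -h keeps the header. The awk sketch below illustrates that test on plain-text SAM; the function name drop_unmapped is hypothetical, and samtools remains the tool to use on real data.

```shell
# Hypothetical helper: read SAM on stdin, keep header lines and every read
# whose FLAG (column 2) does not have bit 4 (unmapped) set.
drop_unmapped() {
  awk -F '\t' '
    /^@/ { print; next }             # header lines pass through unchanged
    int($2 / 4) % 2 == 0 { print }   # bit 4 clear: the read is mapped
  '
}

# Usage (file names taken from the command above):
#   drop_unmapped < all_reads.sam > mapped_reads.sam
```

The integer arithmetic stands in for a bitwise AND so the sketch does not depend on awk extensions.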
    90 
    91 
    92 [[http://liulab.dfci.harvard.edu/MACS/|MACS]] ([[http://liulab.dfci.harvard.edu/MACS/00README.html|README]]) ([[http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=pubmed&term=18798982|Zhang et al., 2008]])[[br]][[br]]
    93 
    94   * MACS is appropriate both for proteins such as transcription factors, which may have narrow peaks, and for histone modifications, which may affect broader regions.  However, broader peaks may require changing parameter values, e.g. using --nomodel, --nolambda (if there's no control), or --call-subpeaks.  Running MACS with different parameters and viewing the results in IGV can help in choosing the optimal values.
    95 
    96 Sample commands to run MACS 1.4.2 (the current version as of March 5, 2012) on mapped reads in map or SAM format:
    97 {{{
    98 macs -t IP_mapped.map -c Control_mapped.map -g 1.87e9 --name=outputName --format=BOWTIE --tsize=36 --wig --space=25  --mfold=10,30
    99 macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM    --tsize=36 --wig --space=25  --mfold=10,30
    100 macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM    --tsize=36 --wig --space=25  --nomodel --shiftsize=100
    101 }}}
    102 
    103 
    104 The parameters used in these commands are:
    105   * -t TFILE Treatment file
    106   * -c CFILE Control file
    107   * --name=NAME Experiment name, which will be used to generate output file names. DEFAULT: "NA"
    108   * --format=FORMAT       Format of tag file, "BED" or "ELAND" or "ELANDMULTI" or "ELANDMULTIPET" or "SAM" or "BAM" or "BOWTIE". DEFAULT: "BED"
    109   * --tsize=TSIZE         Tag size. DEFAULT: auto detected tag size
    110   * --wig                 Whether or not to save shifted raw tag count at every bp into a wiggle file
    111   * --mfold=MFOLD      Select the regions within MFOLD range of high-confidence enrichment ratio against background to build model. The regions must be lower than upper limit, and higher than the lower limit. DEFAULT:10,30
    112   * -g GSIZE  Effective genome size. It can be 1.0e+9 or 1000000000, or shortcuts:'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) and 'dm' for fruitfly (1.2e8), Default:hs
    113   * --keep-dup=1   Controls the MACS behavior towards duplicate tags at the exact same location.  DEFAULT: 1 in both MACS 1.4 and MACS2.   
    114   * --nomodel  Skip building the shifting model. When set, MACS uses the value of --shiftsize instead (100 by default).
    115   * --shiftsize  The arbitrary shift size in bp. When --nomodel is set, MACS uses this value as 1/2 of the fragment size. DEFAULT: 100.
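
Because --shiftsize is half the fragment size, it can be derived from an estimated fragment length when running with --nomodel. A minimal sketch, assuming the fragment length is known from elsewhere (for example, from library preparation); the helper name shiftsize_for_fragment is hypothetical, and the script only prints the resulting command rather than running MACS.

```shell
# Hypothetical helper: --shiftsize is half the fragment size (in bp).
shiftsize_for_fragment() {
  echo $(( $1 / 2 ))
}

# Assumed fragment length; 200 bp corresponds to the default shift size of 100.
frag=200

# Print the command for inspection instead of executing it.
echo "macs14 -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM --nomodel --shiftsize=$(shiftsize_for_fragment "$frag")"
```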
    116 
    117 {{{
    118 bsub "macs14 -t IP_mapped.map -c Control_mapped.map --name=outputName --format=BOWTIE --tsize=36 --wig --space=25 --mfold=10,30"
    119 bsub "macs14 -t IP_mapped.sam -c Control_mapped.sam --name=outputName --format=SAM    --tsize=36 --wig --space=25 --mfold=10,30"
    120 }}}
    121 
    122 
    123 ''Note'': The wig files that macs14 generates are not normalized.
    124 
    125 
    126 ==== SISSRs ====
    127 [http://sissrs.rajajothi.com/ SISSRs] ([https://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/SISSRs-Manual.pdf manual])
    128 [[chipSeqBakeOff|ChIP-Seq bake off]]
    129 
    130 SISSRs uses strand bimodality to try to find the summit of the peak. The summit should be very close to the DNA bound by the TF.  It is more appropriate for TFs because they tend to bind in specific narrow regions.
    131 
    132 Map with Bowtie, using the --sam parameter to get a SAM output file:
    133 {{{
    134 bsub "bowtie -t -m 3 -n 3 -l 36 --strata --best --solexa1.3-quals --sam inputSeq bowtieOutput.sam"
    135 }}}
    136 
    137 SISSRs input is a bed file. Convert the mapped reads from SAM to BAM, and then from BAM to bed format:
    138 
    139 {{{
    140 bsub "samtools view -S -b -o bowtieOutput.bam bowtieOutput.sam"
    141 # -S       input is SAM
    142 # -b       output BAM
    143 }}}
    144 
    145 {{{
    146 bsub "bamToBed -i bowtieOutput.bam > bowtieOutput.bed"
    147 }}}
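
The two conversion steps have to run in order, since bamToBed needs the BAM file written by samtools. The sketch below only prints the two commands for a given SAM file so they can be reviewed or submitted one after the other; the function name sam_to_bed_cmds is hypothetical, and the output names are assumed to reuse the input file's base name.

```shell
# Hypothetical dry-run helper: print the SAM -> BAM -> bed commands
# for one SAM file, deriving output names from the input name.
sam_to_bed_cmds() {
  sam=$1
  base=${sam%.sam}   # strip the .sam extension
  printf 'samtools view -S -b -o %s.bam %s\n' "$base" "$sam"
  printf 'bamToBed -i %s.bam > %s.bed\n' "$base" "$base"
}

sam_to_bed_cmds bowtieOutput.sam
```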
    148 
    149 Run SISSRs with a sample command like:
    150 {{{
    151 sissrs.pl -i bowtieOutput.bed -o outputFile -s 2716965481 -b Background.bed -L 200
    152 }}}
    153    
    154 The parameters used in the sample command:
    155   * -s is the size of the genome
    156   * -L is the maximum length of the fragment
    157   * -m (not used in the sample command) is the fraction of mappable bps. Default is 0.8 for Eland in human.
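
The value passed to -s can be obtained by summing chromosome lengths. A minimal sketch, assuming a two-column, tab-separated chromosome sizes file (name, length); the function name genome_size and the file name genome.chrom.sizes are hypothetical.

```shell
# Hypothetical helper: sum column 2 (chromosome length) of a
# tab-separated sizes file read from stdin.
genome_size() {
  # %.0f avoids scientific notation for totals above 2^31 in some awk versions
  awk -F '\t' '{ total += $2 } END { printf "%.0f\n", total }'
}

# Usage (assumed file name):
#   sissrs.pl -i bowtieOutput.bed -o outputFile -s "$(genome_size < genome.chrom.sizes)" -b Background.bed -L 200
```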
    158 
    159 For more detailed description of parameters see our [chipSeqBakeOff ChIP-Seq bake off].
    160 82
    161 83