==== MACS14 ====
We now recommend MACS2, which is more effective at peak finding, especially for broad peaks. If you still want to use MACS 1.4, our previous recommendations are below.
 * MACS requires that sequence headers contain no spaces.
 * The command 'macs' points to macs14 on WIBR local machines.
 * MACS may have trouble with a SAM file from bowtie if it contains unmapped reads (which it generally does). As a result, you may need to filter out unmapped reads with a command such as the following (-F 4 drops reads flagged as unmapped; -h keeps the header):
{{{
samtools view -hS -F 4 all_reads.sam > mapped_reads.sam
}}}
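The no-spaces requirement above can be met by rewriting headers before mapping. The following is a minimal sketch, not part of the original recommendations: the page does not say whether read headers or reference headers are meant, so both cases are covered, and the function names are hypothetical. Whitespace is replaced with underscores rather than truncated so that names stay unique.

```shell
#!/bin/sh
# Sketch only: replace runs of spaces/tabs in sequence headers with "_".
# Function names are illustrative, not part of MACS or bowtie.

# FASTQ: assumes strict 4-line records, so the header is every 4th line
# starting at line 1 (quality lines that begin with "@" are left alone).
fastq_headers_no_spaces() {
    awk 'NR % 4 == 1 { gsub(/[ \t]+/, "_") } { print }' "$@"
}

# FASTA: header lines are the ones starting with ">".
fasta_headers_no_spaces() {
    awk '/^>/ { gsub(/[ \t]+/, "_") } { print }' "$@"
}

# Example usage (hypothetical file names):
#   fastq_headers_no_spaces reads.fastq > reads.nospace.fastq
#   fasta_headers_no_spaces genome.fa   > genome.nospace.fa
```

Both functions read from the named files or, with no arguments, from standard input.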

[[http://liulab.dfci.harvard.edu/MACS/|MACS]] ([[http://liulab.dfci.harvard.edu/MACS/00README.html|README]]) ([[http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Search&db=pubmed&term=18798982|Zhang et al., 2008]])[[br]][[br]]

 * MACS is appropriate both for proteins such as transcription factors, which tend to give narrow peaks, and for histone modifications, which may affect broader regions. For broader peaks, however, parameter values may need to be changed, e.g. by using --nomodel, --nolambda (if there's no control) or --call-subpeaks. Running MACS with different parameters and viewing the results in IGV can help in choosing the optimal values.

Sample commands to run MACS (version 1.4.2, current as of March 5, 2012) using mapped reads in Bowtie map or SAM format:
{{{
macs -t IP_mapped.map -c Control_mapped.map -g 1.87e9 --name=outputName --format=BOWTIE --tsize=36 --wig --space=25 --mfold=10,30
macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM --tsize=36 --wig --space=25 --mfold=10,30
macs -t IP_mapped.sam -c Control_mapped.sam -g 1.87e9 --name=outputName --format=SAM --tsize=36 --wig --space=25 --nomodel --shiftsize=100
}}}

The parameters used in these commands are:
 * -t TFILE: Treatment file
 * -c CFILE: Control file
 * --name=NAME: Experiment name, which will be used to generate output file names. DEFAULT: "NA"
 * --format=FORMAT: Format of tag file, "BED" or "ELAND" or "ELANDMULTI" or "ELANDMULTIPET" or "SAM" or "BAM" or "BOWTIE". DEFAULT: "BED"
 * --tsize=TSIZE: Tag size. DEFAULT: auto detected tag size
 * --wig: Whether or not to save shifted raw tag count at every bp into a wiggle file
 * --space=SPACE: Resolution in bp for the wiggle file; usable only with --wig. DEFAULT: 10
 * --mfold=MFOLD: Select regions whose high-confidence enrichment ratio against background falls within the MFOLD range to build the model; the ratio must be above the lower limit and below the upper limit. DEFAULT: 10,30
 * -g GSIZE: Effective genome size. It can be a number such as 1.0e+9 or 1000000000, or a shortcut: 'hs' for human (2.7e9), 'mm' for mouse (1.87e9), 'ce' for C. elegans (9e7) or 'dm' for fruit fly (1.2e8). DEFAULT: hs
 * --keep-dup=1: Controls the MACS behavior towards duplicate tags at the exact same location. DEFAULT: 1 in both MACS 1.4 and MACS2.
 * --nomodel: Whether or not to build the shifting model. If set, MACS will not build a model and instead uses the shift size given by --shiftsize (100 by default).
 * --shiftsize: The arbitrary shift size in bp. When --nomodel is set, MACS will use this value as 1/2 of the fragment size. DEFAULT: 100.
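Because --shiftsize is taken as half the fragment size, it can be derived from the fragment length of the library. The sketch below assumes a 200 bp average fragment (an illustrative value, not from this page) and uses a hypothetical helper name; it only prints the resulting command rather than running MACS.

```shell
#!/bin/sh
# Sketch only: derive --shiftsize as half the fragment size (integer bp).
# half_fragment is an illustrative helper, not a MACS command.
half_fragment() {
    echo $(( $1 / 2 ))   # integer division: odd sizes round down
}

fragment_size=200        # assumed average fragment length in bp
shiftsize=$(half_fragment "$fragment_size")

# Print the command instead of running it, so it can be inspected first.
echo "macs14 -t IP_mapped.sam -c Control_mapped.sam -g mm --name=outputName --format=SAM --nomodel --shiftsize=${shiftsize}"
```

With a 200 bp fragment this reproduces the --shiftsize=100 used in the third sample command.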

MACS can also be submitted as a cluster job with bsub. No -g is given in these examples, so the default genome size ('hs') applies:
{{{
bsub "macs14 -t IP_mapped.map -c Control_mapped.map --name=outputName --format=BOWTIE --tsize=36 --wig --space=25 --mfold=10,30"
bsub "macs14 -t IP_mapped.sam -c Control_mapped.sam --name=outputName --format=SAM --tsize=36 --wig --space=25 --mfold=10,30"
}}}

''Note'': The wig files that macs14 generates are not normalized.
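Since the wig files are not normalized, samples with different sequencing depths are not directly comparable. One common convention, assumed here and not prescribed by this page, is to scale counts to reads per million. The sketch below uses a hypothetical function name, expects the total read count to be supplied by the user, and assumes variableStep wig data lines of the form "position value". If the wig files are gzip-compressed, decompress them first (e.g. with gunzip -c).

```shell
#!/bin/sh
# Sketch only: scale wig values to reads per million (RPM).
# Usage: normalize_wig TOTAL_READS [FILE...]
# normalize_wig is an illustrative name; TOTAL_READS must come from the user
# (for example, the number of mapped reads in the sample).
normalize_wig() {
    total_reads=$1
    shift
    awk -v total="$total_reads" '
        BEGIN { OFS = "\t"; scale = 1000000 / total }
        # Data lines have two fields and start with a digit: "position value".
        NF == 2 && /^[0-9]/ { $2 = $2 * scale }
        # Track and variableStep header lines are printed unchanged.
        { print }
    ' "$@"
}

# Example usage (hypothetical file name and read count):
#   normalize_wig 12500000 treat_chr1.wig > treat_chr1.rpm.wig
```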

==== SISSRs ====
[http://sissrs.rajajothi.com/ SISSRs] ([https://dir.nhlbi.nih.gov/papers/lmi/epigenomes/sissrs/SISSRs-Manual.pdf manual])
[[chipSeqBakeOff|ChIP-Seq bake off]]

SISSRs uses strand bimodality (forward-strand reads accumulate on one side of a binding site and reverse-strand reads on the other) to try to find the summit of the peak. The summit should be very close to the DNA bound by the transcription factor (TF). SISSRs is best suited to TFs because they tend to bind specific, narrow regions.

Map with Bowtie, using the --sam parameter to get a SAM output file (genomeIndex is a placeholder for the basename of your Bowtie index, which bowtie expects before the reads file):
{{{
bsub "bowtie -t -m 3 -n 3 -l 36 --strata --best --solexa1.3-quals --sam genomeIndex inputSeq bowtieOutput.sam"
}}}

SISSRs takes a BED file as input. Convert the mapped reads from SAM to BAM, and then from BAM to BED:

{{{
bsub "samtools view -S -b -o bowtieOutput.bam bowtieOutput.sam"
# -S  input is SAM
# -b  output is BAM
}}}

{{{
bsub "bamToBed -i bowtieOutput.bam > bowtieOutput.bed"
}}}
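Because SISSRs relies on read strand, it can be worth confirming that the converted BED file carries strand information for both strands before running it. This is an optional check, sketched under the assumption that the BED file has the usual six tab-separated columns with strand in column 6; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: count reads per strand in a 6-column BED file.
# bed_strand_counts is an illustrative name, not part of SISSRs or BEDTools.
bed_strand_counts() {
    awk -F '\t' '
        { n[$6]++ }                                   # column 6 holds + or -
        END { printf "+\t%d\n-\t%d\n", n["+"], n["-"] }
    ' "$@"
}

# Example usage:
#   bed_strand_counts bowtieOutput.bed
```

A count of zero for either strand suggests the strand column was lost during conversion.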

Run SISSRs with a command such as:
{{{
sissrs.pl -i bowtieOutput.bed -o outputFile -s 2716965481 -b Background.bed -L 200
}}}

The parameters used in the sample command:
 * -i is the input BED file of mapped reads
 * -o is the output file
 * -s is the size of the genome
 * -b is the background (control) BED file
 * -L is the maximum length of the fragment
 * -m (not used above) is the fraction of mappable bps. Default is 0.8 for Eland in human.
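If the value for -s is not known for your assembly, a total genome length can be obtained by summing chromosome lengths. The sketch below assumes a whitespace-separated file with the sequence name in column 1 and its length in column 2 (for example, a samtools faidx .fai index); the function name is hypothetical, and whether the total or a mappable size is more appropriate should be checked against the SISSRs manual.

```shell
#!/bin/sh
# Sketch only: sum sequence lengths (column 2) to get a total genome size.
# genome_size is an illustrative name; input is "name<whitespace>length" per line.
genome_size() {
    # %.0f avoids scientific notation for totals above 2^31 in some awk versions.
    awk '{ total += $2 } END { printf "%.0f\n", total }' "$@"
}

# Example usage (hypothetical file name):
#   sissrs.pl -i bowtieOutput.bed -o outputFile -s "$(genome_size genome.fa.fai)" -b Background.bed -L 200
```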

For a more detailed description of the parameters, see our [[chipSeqBakeOff|ChIP-Seq bake off]].