Changes between Version 17 and Version 18 of SOPs/atac_Seq

Timestamp:: 08/25/20 13:09:47 (5 years ago)
Author:: byuan
Comment:: --

SOPs/atac_Seq

-              v17
+              v18
 }}}
+=== Motif analysis ===
+Search for motifs with [[http://homer.ucsd.edu/homer/ngs/peakMotifs.html | homer findMotifsGenome.pl]].
+By default, it performs de novo motif discovery and also checks the enrichment of known motifs, using the known.motifs file in the downloaded homer folder unless another motif file is given.
+Note: findMotifsGenome.pl calls tab2fasta.pl, which shares its name with one of our BaRC scripts. Make sure that the homer version of the script is the one that gets called.
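One way to check which copy of tab2fasta.pl would be picked up is to list every match on PATH in lookup order. This is a minimal sketch that assumes homer finds its helper scripts through PATH; `list_on_path` is a hypothetical helper name, not part of homer or the BaRC scripts.

```shell
# Print every executable named "$1" found on PATH, in lookup order.
# The first line printed is the copy that a plain "tab2fasta.pl" call runs.
# (Assumption: homer resolves its helper scripts through PATH.)
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

list_on_path tab2fasta.pl
```

If the first path listed is not inside the homer installation, putting the homer bin directory earlier in PATH for the session should make the homer copy win.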
+{{{
+findMotifsGenome.pl peak.bed hg38 out_dir -size 300 -S 2 -p 5 -cache 100 -fdr 5 -mask -mknown Jaspar_hs_core_homer.motifs -mcheck Jaspar_hs_core_homer.motifs
+input parameters:
+-mask: use the repeat-masked sequence
+-size: (default 200). Explanation from homer website: "If analyzing ChIP-Seq peaks from a transcription factor, Chuck would recommend 50 bp for establishing the primary motif bound by a given transcription factor and 200 bp for finding both primary and "co-enriched" motifs for a transcription factor.  When looking at histone marked regions, 500-1000 bp is probably a good idea (i.e. H3K4me or H3/H4 acetylated regions)."
+-mknown <motif file>: known motifs to check for enrichment
+-mcheck <motif file>: known motifs to check against de novo motifs
+-S: number of motifs to find (default 25)
+-p: number of processors to use (default 1)
+}}}
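To make the effect of -size concrete, the sketch below resizes BED intervals to a fixed width around their midpoint. It is only an illustration: it assumes the region is centered on the middle of each peak, homer's own rounding may differ, and `resize_bed` is a hypothetical helper that homer does not need (findMotifsGenome.pl handles the resizing itself).

```shell
# Resize each BED interval (stdin) to a fixed width centered on its midpoint.
# Illustration only; homer does this internally when -size is given.
resize_bed() {
    awk -v size="$1" 'BEGIN { FS = OFS = "\t" }
    {
        mid = int(($2 + $3) / 2)
        start = mid - int(size / 2)
        if (start < 0) start = 0      # do not run off the chromosome start
        $2 = start
        $3 = start + size
        print
    }'
}

# A 500 bp peak becomes a 300 bp region around its center:
printf 'chr1\t1000\t1500\tpeak1\n' | resize_bed 300
# prints: chr1  1100  1400  peak1   (tab-separated)
```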
+To use species-specific JASPAR motifs, download them, convert them to homer motif format, and save them to a motif file.
+In this example, human core motifs are downloaded from JASPAR2016 and saved to Jaspar_hs_core_homer.motifs.
+{{{
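+library(TFBSTools)
+# query options: CORE collection, human (NCBI taxonomy ID 9606)
+# (the JASPAR2016 data package must be installed)
+opts <- list()
+opts[["collection"]] <- "CORE"
+opts[["species"]] <- 9606
+Jaspar_hs_core <- getMatrixSet(JASPAR2016::JASPAR2016, opts)
+# convert to homer motif format:
+library(universalmotif)
+write_homer(Jaspar_hs_core, file="Jaspar_hs_core_homer.motifs")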
+}}}
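Before passing the exported file to -mknown or -mcheck, a quick sanity check can confirm that it contains motifs. This sketch relies on the homer motif format, where each motif starts with a ">" header line whose second tab-separated field is the motif name; `count_motifs` and `motif_names` are hypothetical helper names.

```shell
# Count motifs in a homer-format motif file (one ">" header line per motif).
count_motifs() {
    grep -c '^>' "$1"
}

# List motif names (second tab-separated field of each header line).
motif_names() {
    awk -F '\t' '/^>/ { print $2 }' "$1"
}

# Usage, once the R export has been run:
#   count_motifs Jaspar_hs_core_homer.motifs
#   motif_names Jaspar_hs_core_homer.motifs | head
```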
+findMotifsGenome.pl creates two html files: one for de novo motifs and one for known motifs.
+Annotate motifs with homer [[http://homer.ucsd.edu/homer/ngs/quantification.html | annotatePeaks.pl]]
+{{{
+annotatePeaks.pl peak.bed hg38 -m input_motifs -mbed motif.bed > annotated_motifs.txt
+Where:
+-m <motif file>: motifs to look for; several motifs can first be combined and saved as a single file. In the output file, this links the motifs associated with a peak together.
+-mbed <filename>: output motif positions to a BED file that can be loaded in a genome browser
+}}}
+For each peak, it gives the distance to the nearest feature and categorizes the peak as promoter, intergenic, intron#, exon#, or TSS.
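The annotation categories can be tallied from the output table with a short filter. This is a sketch under assumptions: the output is tab-delimited with a header row, the category column is named "Annotation", and its values look like "intron (NM_..., intron 1 of 5)", so the text before the first "(" is taken as the category. `count_annotations` is a hypothetical helper name.

```shell
# Count peaks per annotation category in annotatePeaks.pl output (stdin).
# Assumes a tab-delimited table whose header row has a column named "Annotation".
count_annotations() {
    awk 'BEGIN { FS = "\t" }
    NR == 1 {
        # Locate the column by its header instead of hard-coding a position
        for (i = 1; i <= NF; i++) if ($i == "Annotation") col = i
        if (!col) { print "no Annotation column found" > "/dev/stderr"; exit 1 }
        next
    }
    {
        label = $col
        sub(/ *\(.*/, "", label)      # drop the "(gene, intron 1 of 5)" detail
        if (label == "") label = "NA"
        count[label]++
    }
    END { for (label in count) print count[label] "\t" label }' | sort -k1,1nr -k2,2
}

# Usage:
#   count_annotations < annotated_motifs.txt
```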
 === Other recommendations for ATAC-Seq ===