Changes between Version 17 and Version 18 of SOPs/atac_Seq


Timestamp:
08/25/20 13:09:47 (5 years ago)
Author:
byuan
Comment:

--

  • SOPs/atac_Seq

    v17 v18  
    118118}}}
    119119
     120=== Motif analysis ===
     121
     122Search for motifs with [[http://homer.ucsd.edu/homer/ngs/peakMotifs.html | homer findMotifsGenome.pl]]
     123
     124By default, it performs de novo motif discovery and also checks the enrichment of known motifs (by default, the known.motifs file in the downloaded homer folder is used).
     125
     126Note: findMotifsGenome.pl calls tab2fasta.pl, which shares its name with one of our BaRC scripts. Make sure that the tab2fasta.pl being called is the one from homer.
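
One way to check for the name clash described above is to confirm that tab2fasta.pl and findMotifsGenome.pl resolve to the same directory on PATH. The sketch below is only an illustration: `first_on_path` and `same_install_dir` are hypothetical helper names, and it assumes both homer scripts live in the same bin directory.

```shell
# Hypothetical helper: print the directory of the first PATH match for a command.
first_on_path() {
    local hit
    hit=$(command -v "$1") || return 1
    dirname "$hit"
}

# Hypothetical helper: succeed only if both commands are found and
# resolve to the same directory.
same_install_dir() {
    local a b
    a=$(first_on_path "$1") || return 1
    b=$(first_on_path "$2") || return 1
    [ "$a" = "$b" ]
}

# Usage (not run here):
#   same_install_dir findMotifsGenome.pl tab2fasta.pl \
#       || echo "tab2fasta.pl on PATH is not the homer copy" >&2
```

If the check fails, putting the homer bin directory ahead of the BaRC scripts on PATH for that session is one possible fix.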
     127
     128{{{
     129findMotifsGenome.pl peak.bed hg38 out_dir -size 300 -S 2 -p 5 -cache 100 -fdr 5 -mask -mknown Jaspar_hs_core_homer.motifs -mcheck Jaspar_hs_core_homer.motifs
     130
     131Input parameters:
     132-mask: use the repeat-masked sequence
     133-size: size of the region used for motif finding (default 200). Explanation from the homer website: "If analyzing ChIP-Seq peaks from a transcription factor, Chuck would recommend 50 bp for establishing the primary motif bound by a given transcription factor and 200 bp for finding both primary and "co-enriched" motifs for a transcription factor. When looking at histone marked regions, 500-1000 bp is probably a good idea (i.e. H3K4me or H3/H4 acetylated regions)."
     134-mknown <motif file>: known motifs to check for enrichment
     135-mcheck <motif file>: known motifs to check against de novo motifs
     136-S: number of motifs to find (default 25)
     137-p: number of processors to use (default 1)
     138
     139}}}
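
To make the -size option more concrete, the sketch below approximates what a fixed window centred on each peak looks like, using awk on a BED file. It is not homer's own code: `resize_peaks` is a hypothetical helper, and clipping negative starts to 0 is an assumption made for this illustration.

```shell
# resize_peaks BED SIZE  (hypothetical helper, not part of homer)
# Rewrites columns 2 and 3 so that each region spans SIZE bp centred on
# the midpoint of the original peak; all other columns pass through.
resize_peaks() {
    awk -v size="$2" 'BEGIN { FS = OFS = "\t" }
    {
        mid   = int(($2 + $3) / 2)
        half  = int(size / 2)
        start = mid - half
        if (start < 0) start = 0   # assumption: clip at the chromosome start
        $2 = start
        $3 = mid + half
        print
    }' "$1"
}

# Usage (not run here):
#   resize_peaks peak.bed 300 > peak_300bp.bed
```

With -size 300, as in the command above, each region would extend 150 bp on either side of the peak midpoint.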
     140
     141
     142To use species-specific JASPAR motifs, download them, convert them to homer motif format, and save them to a motif file.
     143
     144In this example, the human core motifs are downloaded from JASPAR2016 and saved to Jaspar_hs_core_homer.motifs
     145
     146{{{
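     147# Requires the Bioconductor packages TFBSTools, JASPAR2016 and universalmotif.
     148library(TFBSTools)
     149opts <- list()  # query options; the list must exist before elements are assigned
     150opts[["collection"]] <- "CORE"
     151opts[["species"]] <- 9606  # NCBI taxonomy ID for human
     152
     153Jaspar_hs_core <- getMatrixSet(JASPAR2016::JASPAR2016, opts)
     154
     155# convert to homer motif format:
     156
     157library(universalmotif)
     158
     159write_homer(Jaspar_hs_core, file = "Jaspar_hs_core_homer.motifs")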
     160
     161}}}
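
Before passing the converted file to -mknown or -mcheck, it can be worth a quick sanity check. The sketch below relies on the homer motif layout, in which each motif starts with a header line beginning with ">" followed by matrix rows of four columns (A, C, G, T). `count_motifs` and `bad_matrix_rows` are hypothetical helper names, not homer tools.

```shell
# Count motif records: each record begins with a header line starting with ">".
count_motifs() {
    grep -c '^>' "$1"
}

# Print the line numbers of non-empty matrix rows that do not have exactly
# four whitespace-separated columns; prints nothing if all rows look fine.
bad_matrix_rows() {
    awk '!/^>/ && NF > 0 && NF != 4 { print NR }' "$1"
}

# Usage (not run here):
#   count_motifs Jaspar_hs_core_homer.motifs
#   bad_matrix_rows Jaspar_hs_core_homer.motifs
```

A motif count of zero, or any reported row, would suggest that the conversion did not produce the expected format.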
     162
     163
     164findMotifsGenome.pl creates two HTML files in the output directory: one for the de novo motifs (homerResults.html) and one for the known motifs (knownResults.html).
     165
     166
     167Annotate motifs with homer [[http://homer.ucsd.edu/homer/ngs/quantification.html | annotatePeaks.pl]]
     168
     169
     170{{{
     171annotatePeaks.pl peak.bed hg38 -m input_motifs -mbed motif.bed > annotated_motifs.txt
     172
     173Where:
     174-m: motif file(s) to search for in each peak. Motifs can first be combined and saved as a single file. In the output file, the motifs associated with a peak are then reported together.
     175-mbed <filename>: output motif positions to a BED file that can be loaded in a genome browser
     176
     177}}}
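
The combining step mentioned for -m can be sketched as plain concatenation, since a homer motif file is simply a series of ">" records. `combine_motifs` is a hypothetical helper and the file names in the usage comment are illustrative only.

```shell
# combine_motifs OUT IN1 [IN2 ...]  (hypothetical helper, not part of homer)
# Concatenates motif files into OUT. "awk 1" is used instead of cat so that
# an input lacking a trailing newline cannot run into the next ">" header.
combine_motifs() {
    local out=$1
    shift
    awk 1 "$@" > "$out"
}

# Usage (not run here):
#   combine_motifs input_motifs motif1.motif motif2.motif
#   annotatePeaks.pl peak.bed hg38 -m input_motifs -mbed motif.bed > annotated_motifs.txt
```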
     178
     179For each peak, it reports the distance to the nearest feature and assigns the peak to a category (promoter, intergenic, intron#, exon#, TSS).
     180
    120181
    121182=== Other recommendations for ATAC-Seq ===