= Searching for patterns, motifs, or profiles in a DNA or protein sequence =

This is a traditional bioinformatics task, and many tools do this in a variety of ways. One main determinant of which tool to use is how you represent what you're looking for.

== Search for a pattern (text, with optional choices at some positions) ==

[http://emboss.sourceforge.net/apps/cvs/emboss/apps/dreg.html dreg] (EMBOSS suite) - for nucleic acids (where "pattern" is a regular expression)
{{{
dreg -pattern "GGCC[ACGT]" -sequence My_promoters.fa -outfile My_promoters.GGCCN.dreg_out.txt
}}}

[http://emboss.sourceforge.net/apps/cvs/emboss/apps/preg.html preg] (EMBOSS suite) - for proteins (where "pattern" is a regular expression)
{{{
preg -pattern "LPE[ACS]G" -sequence My_proteins.fa -outfile My_proteins.fa.LPEMG.preg_out.txt
}}}

[http://emboss.sourceforge.net/apps/cvs/emboss/apps/fuzznuc.html fuzznuc] (EMBOSS suite) - for nucleic acids (where "pmismatch" is the number of mismatches allowed in the pattern)
{{{
fuzznuc -pattern "nnnGGCCTnnn" -sequence My_promoters.fa -pmismatch 1 -outfile My_promoters.GGCCT.1mis.fuzznuc_out.txt
}}}

[http://emboss.sourceforge.net/apps/cvs/emboss/apps/fuzzpro.html fuzzpro] (EMBOSS suite) - for proteins (where "pmismatch" is the number of mismatches allowed in the pattern)
{{{
fuzzpro -pattern "xxxxLPEAGxxxx" -sequence My_proteins.fa -pmismatch 1 -outfile My_proteins.LPEAG.1mis.fuzzpro_out.txt
}}}

== Search for a profile (a probability matrix, with choices at all positions) ==

These searches are generally a two-step process: one step to create the profile and one step to search with it. Each tool has several detailed options, so check its documentation.
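To make the two-step idea concrete, here is a minimal Python sketch of a profile search. It is not how prophecy, profit, or HMMER work internally; it only illustrates the general technique of building a position-specific scoring matrix from aligned sites and then sliding it along a sequence. The function names (`build_profile`, `scan`), the uniform background, the pseudocount of 1, the score threshold, and the example sites are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def build_profile(aligned_sites, alphabet="ACGT", pseudocount=1.0):
    """Step 1: turn equal-length aligned sites into per-position log-odds scores."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    background = 1.0 / len(alphabet)  # assumption: uniform background frequencies
    total = len(aligned_sites) + pseudocount * len(alphabet)
    profile = []
    for pos in range(length):
        counts = Counter(site[pos].upper() for site in aligned_sites)
        # log2 of (smoothed observed frequency / background frequency)
        profile.append({
            letter: math.log2((counts[letter] + pseudocount) / total / background)
            for letter in alphabet
        })
    return profile


def scan(profile, sequence, threshold=0.0):
    """Step 2: score every window; return (start, window, score) at or above threshold."""
    width = len(profile)
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(letter not in profile[0] for letter in window):
            continue  # skip windows with letters outside the alphabet (e.g. N)
        score = sum(column[letter] for column, letter in zip(profile, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical aligned sites and target sequence, for illustration only
sites = ["GGCCA", "GGCCT", "GGCCG", "GGACT"]
profile = build_profile(sites)
hits = scan(profile, "TTTGGCCTAAAGGACTCC", threshold=3.0)
best = max(hits, key=lambda hit: hit[2])
print(best)
```

Unlike a text pattern, every position contributes a graded score, so a window such as GGACT can still be reported even though it differs from the most common site. The tools below use more careful weighting and statistics than this sketch.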
[http://emboss.sourceforge.net/apps/cvs/emboss/apps/prophecy.html prophecy] + [http://emboss.sourceforge.net/apps/cvs/emboss/apps/profit.html profit] (EMBOSS suite) - for proteins
{{{
# Create a frequency-matrix profile (-type F) from an aligned set of protein sites
prophecy -sequence Aligned_protein_sites.fa -type F -name MyProfile -outfile MyProfile.txt -filter
# Use the profile to search a fasta file of proteins
profit -infile MyProfile.txt -sequence My_proteins.fa -outfile My_proteins.MyProfile.profit_out.txt
}}}

[http://hmmer.org/ HMMER] - for proteins or nucleic acids
{{{
# Create an HMM from an aligned set of proteins or nucleic acids (fasta or other common format)
hmmbuild MyProfile.hmm Aligned_protein_sites.fa
# Use the HMM to search a fasta file of proteins
hmmsearch MyProfile.hmm Protein_set.fa > Protein_set.MyProfile.hmmsearch_out.txt
}}}