Changes between Initial Version and Version 1 of SOPs/multipleSequenceAlignment

Timestamp:: 01/23/13 16:49:43 (12 years ago)
Author:: trac
Comment:: --

+== Review Articles ==
+Review article by [http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0030123 Cedric Notredame][[br]]
+[http://www.genomeweb.com/alignment-algorithms-0 Genome Technology article] by Fran and George
+== Start with sequences from a database ==
+Useful alignment programs, available as web servers and as downloads for desktop computers and Linux systems:[[BR]]
+[http://www.drive5.com/muscle/ muscle][[BR]]
+[http://www.tcoffee.org/ t_coffee][[BR]]
+[http://mafft.cbrc.jp/alignment/software/ mafft][[BR]]
+The output from these programs can then be visualized in [http://www.clustal.org/ ClustalX] or [http://www.jalview.org/download Jalview].
+Our favorite method is to use the T-Coffee suite (more specifically, M-Coffee) to run several alignment methods and then combine their results into a consensus alignment, a kind of meta-alignment. This can be done with a single command such as
+{{{
+t_coffee my_proteins.fa -method=t_coffee_msa,mafft_msa,probcons_msa,muscle_msa -output=fasta_aln
+}}}
+The final consensus alignment will appear in the file my_proteins.fasta_aln, which can then be viewed in ClustalX.
+== Start with genome coordinates ==
+This section describes how to extract a slice of a pre-computed genome-to-genome alignment from the UCSC Genome Browser.
+* Select a region in the UCSC Genome Browser (e.g., chr19:100,000-100,100 in the human hg19 assembly)
+* Click on Tools > Table Browser
+* In the Table Browser, make these selections:
+  * group => Comparative Genomics
+  * table => Multiz Align
+  * region => position (which should show the region you began with in the genome browser)
+  * output format => maf
+* Click "get output" button
+* Copy/paste or save the MAF file (e.g., MyRegion.maf)
+* Few programs understand MAF format, so you may need to convert the alignment to a format like FASTA, for example with the maf_alignment_to_fasta.pl script whose usage is shown below.
+    * Selecting the "-" strand makes the script reverse-complement the alignment.
+    * 'refGenome' should be the UCSC assembly name (such as hg19 or mm9).
+{{{
+# USAGE: maf_alignment_to_fasta.pl mafFile refGenome regionName strand[+-] > regionName.fa
+# Sample command:
+  /nfs/BaRC_Public/BaRC_code/Perl/maf_alignment_to_fasta/maf_alignment_to_fasta.pl MyRegion.maf hg19 myRegion + > myRegion.fa
+}}}
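For readers without access to the Perl script above, the following Python sketch illustrates what a MAF-to-FASTA conversion might involve. It is not the BaRC script and makes no claim to reproduce its output. It assumes that the MAF blocks are contiguous on the reference, that the assembly name is the part of the source field before the first ".", and that blocks lacking the reference can be skipped. The function names, the FASTA header layout, and the toy MAF data are all hypothetical.

```python
# Complement table for reverse-complementing; gaps and N are left unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_maf_blocks(lines):
    """Yield one dict per MAF block, mapping assembly name -> aligned text."""
    block = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new "a" line closes the block in progress.
            if block:
                yield block
            block = {}
        elif fields[0] == "s":
            # s <src> <start> <size> <strand> <srcSize> <text>
            assembly = fields[1].split(".")[0]
            # Keep only the first row per assembly within a block.
            block.setdefault(assembly, fields[6])
    if block:
        yield block


def maf_to_fasta(lines, ref_genome, strand="+"):
    """Concatenate MAF blocks into one gapped sequence per assembly."""
    seqs = {}
    total = 0  # number of alignment columns collected so far
    for block in parse_maf_blocks(lines):
        if ref_genome not in block:
            continue  # assumption: blocks without the reference are skipped
        for assembly, text in block.items():
            # An assembly first seen in a later block gets leading gaps.
            seqs.setdefault(assembly, "-" * total)
            seqs[assembly] += text
        total += len(block[ref_genome])
        for assembly in seqs:
            # An assembly missing from this block is padded with gaps.
            seqs[assembly] = seqs[assembly].ljust(total, "-")
    if strand == "-":
        # Reverse-complement every row so the columns stay aligned.
        seqs = {name: seq.translate(COMPLEMENT)[::-1] for name, seq in seqs.items()}
    return seqs


def format_fasta(seqs, region_name):
    """Render the sequences as FASTA text (hypothetical header layout)."""
    return "".join(
        ">{}_{}\n{}\n".format(region_name, name, seq) for name, seq in seqs.items()
    )


# Toy MAF input, made up for illustration only.
sample = """##maf version=1
a score=100
s hg19.chr19 100000 7 + 59128983 ACGT-ACG
s mm9.chr10 500000 7 + 129993255 ACGTTA-G

a score=50
s hg19.chr19 100007 4 + 59128983 TTGA
"""

print(format_fasta(maf_to_fasta(sample.splitlines(), "hg19"), "myRegion"), end="")
```

Padding absent species with gaps keeps every row the same length, which alignment viewers such as ClustalX and Jalview generally expect. A real MAF file downloaded from the Table Browser could be passed in by opening it and handing the file object to `maf_to_fasta` in place of `sample.splitlines()`.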