Changes between Initial Version and Version 1 of SOPs/multipleSequenceAlignment


Timestamp:
01/23/13 16:49:43 (12 years ago)
Author:
trac
Comment:

--

  • SOPs/multipleSequenceAlignment

== Review Articles ==

Review article by [http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0030123 Cedric Notredame][[BR]]
[http://www.genomeweb.com/alignment-algorithms-0 Genome Technology article] by Fran and George

== Start with sequences from a database ==

Useful alignment programs, available on the web, for desktop computers, and for Linux systems:

[http://www.drive5.com/muscle/ muscle][[BR]]

[http://www.tcoffee.org/ t_coffee][[BR]]

[http://mafft.cbrc.jp/alignment/software/ mafft]

The output from these programs can then be visualized in [http://www.clustal.org/ ClustalX] or [http://www.jalview.org/download Jalview].

Our favorite method is to use the T-COFFEE suite (more specifically, M-Coffee) to run multiple alignment methods and then create a consensus alignment, a sort of meta-alignment. This can be done with a single command such as

{{{
t_coffee my_proteins.fa -method=t_coffee_msa,mafft_msa,probcons_msa,muscle_msa -output=fasta_aln
}}}

The final consensus alignment will appear in the file my_proteins.fasta_aln, which can then be viewed in ClustalX.

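Before opening the result in a viewer, it can help to confirm that the alignment file is well formed, since every row of an aligned FASTA file should have the same number of columns (gaps included). The following is a minimal sketch of such a check, not part of T-COFFEE; the function name check_alignment is hypothetical and the code assumes plain FASTA with ">" header lines.

```shell
# check_alignment FILE
# Report whether every record in an aligned FASTA file has the same length
# (gap characters count as columns). Exit status: 0 = consistent,
# 1 = at least one row differs from the first, 2 = no records found.
check_alignment() {
    awk '
        # A header line starts a new record.
        /^>/ { count++; id[count] = substr($0, 2); len[count] = 0; next }
        # Sequence lines may be wrapped, so lengths are accumulated.
        count > 0 { gsub(/[ \t\r]/, ""); len[count] += length($0) }
        END {
            if (count == 0) { print "no sequences found"; exit 2 }
            for (i = 2; i <= count; i++)
                if (len[i] != len[1]) {
                    printf "length mismatch: %s has %d columns, expected %d\n", id[i], len[i], len[1]
                    bad = 1
                }
            if (bad) exit 1
            printf "%d sequences, %d columns\n", count, len[1]
        }
    ' "$1"
}
```

For the example above, this would be called as check_alignment my_proteins.fasta_aln after t_coffee has finished.
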
== Start with genome coordinates ==

This section describes how to extract a slice of a pre-computed genome-genome alignment from the UCSC Genome Browser.

* Select a region in the UCSC Genome Browser (e.g., chr19:100,000-100,100 in the human hg19 assembly)
* Click on Tools > Table Browser
* In the Table Browser, make these selections:
  * group => Comparative Genomics
  * table => Multiz Align
  * region => position (which should show the region you began with in the genome browser)
  * output format => maf
* Click the "get output" button
* Copy/paste or save the MAF file (e.g., MyRegion.maf)
* Few programs understand MAF format, so you may need to convert the alignment to a format like FASTA, for example with the maf_alignment_to_fasta.pl script shown below.
    * Selecting the "-" strand makes the script reverse-complement the alignment.
    * 'refGenome' should be the UCSC assembly name (like hg19 or mm9).

{{{
# USAGE: maf_alignment_to_fasta.pl mafFile refGenome regionName strand[+-] > regionName.fa
# Sample command:
/nfs/BaRC_Public/BaRC_code/Perl/maf_alignment_to_fasta/maf_alignment_to_fasta.pl MyRegion.maf hg19 myRegion + > Gata4.NEW.fa
}}}
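
To see which assembly names a downloaded MAF file actually contains (for instance, to pick the right refGenome value), the "s" lines can be summarized. This is a rough sketch that is independent of maf_alignment_to_fasta.pl; the function name maf_summary is hypothetical, and it assumes the usual MAF layout in which each "s" line reads "s source start size strand srcSize text" and the source is written as assembly.chromosome (e.g. hg19.chr19).

```shell
# maf_summary FILE
# For each assembly found in a MAF file, print three tab-separated columns:
# assembly name, number of aligned rows ("s" lines), total ungapped bases.
maf_summary() {
    awk '
        $1 == "s" {
            split($2, src, "[.]")       # src[1] = assembly, e.g. hg19
            if (!(src[1] in rows)) order[++n] = src[1]
            rows[src[1]]++
            bases[src[1]] += $4         # field 4 = ungapped size of the row
        }
        END {
            # Print in order of first appearance in the file.
            for (i = 1; i <= n; i++)
                printf "%s\t%d\t%d\n", order[i], rows[order[i]], bases[order[i]]
        }
    ' "$1"
}
```

Calling maf_summary MyRegion.maf lists one line per assembly; the reference assembly should normally appear in every block, so it will tend to have the highest row count.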