== Review Articles ==

Review article by [http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0030123 Cedric Notredame][[BR]]
[http://www.genomeweb.com/alignment-algorithms-0 Genome Technology article] by Fran and George

== Start with sequences from a database ==

Useful alignment programs, available on the web, for desktop computers, and for Linux systems:

[http://www.drive5.com/muscle/ MUSCLE][[BR]]
[http://www.tcoffee.org/ T-Coffee][[BR]]
[http://mafft.cbrc.jp/alignment/software/ MAFFT]

The output from these programs can then be visualized in [http://www.clustal.org/ ClustalX] or [http://www.jalview.org/download Jalview].

Our favorite method is to use the T-Coffee suite (more specifically, M-Coffee) to run several alignment methods and then combine their results into a consensus alignment, a sort of meta-alignment. This can be done with a single command like

{{{
t_coffee my_proteins.fa -method=t_coffee_msa,mafft_msa,probcons_msa,muscle_msa -output=fasta_aln
}}}

The final consensus alignment will appear in the file my_proteins.fasta_aln, which can then be viewed in ClustalX.

== Start with genome coordinates ==

This method describes how to extract a slice of a pre-computed genome-to-genome alignment from the UCSC Genome Browser.

 * Select a region in the UCSC Genome Browser (e.g., chr19:100,000-100,100 in the human hg19 assembly)
 * Click on Tools > Table Browser
 * In the Table Browser, make these selections:
   * group => Comparative Genomics
   * table => Multiz Align
   * region => position (which should show the region you began with in the genome browser)
   * output format => maf
 * Click the "get output" button
 * Copy/paste or save the MAF file (e.g., MyRegion.maf)
 * Few programs understand MAF format, so you may need to convert the alignment to a format like FASTA, for example with the maf_alignment_to_fasta.pl script shown below.
   * If you select the "-" strand, the script reverse-complements the alignment.
   * 'refGenome' should be the UCSC assembly name (like hg19 or mm9).

{{{
# USAGE: maf_alignment_to_fasta.pl mafFile refGenome regionName strand[+-] > regionName.fa
# Sample command:
/nfs/BaRC_Public/BaRC_code/Perl/maf_alignment_to_fasta/maf_alignment_to_fasta.pl MyRegion.maf hg19 myRegion + > myRegion.fa
}}}
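
For readers without access to the BaRC script, the idea behind a MAF-to-FASTA conversion can be sketched in Python. This is only a minimal sketch and not the script above: the function names `parse_maf_blocks` and `maf_to_fasta` are hypothetical, and it assumes standard MAF "s" lines (`s src start size strand srcSize text`), source names of the form assembly.chromosome (like hg19.chr19), at most one row per assembly in each block, and blocks that are contiguous in the reference. Assemblies missing from a block are padded with gaps.

```python
# Complement table for reverse-complementing aligned DNA (gaps are left as-is).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_maf_blocks(lines):
    """Yield one dict per alignment block, mapping source name -> aligned text."""
    block = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # A blank line or a new "a" line closes any open block.
            if block:
                yield block
                block = {}
        elif fields[0] == "s":
            # s src start size strand srcSize text
            block[fields[1]] = fields[6]
    if block:
        yield block


def maf_to_fasta(lines, strand="+"):
    """Concatenate MAF blocks per assembly and return the result as FASTA text."""
    seqs = {}   # assembly name -> concatenated aligned sequence
    total = 0   # number of alignment columns collected so far
    for block in parse_maf_blocks(lines):
        width = len(next(iter(block.values())))
        for src, text in block.items():
            assembly = src.split(".")[0]
            # An assembly first seen in a later block gets leading gaps.
            seqs.setdefault(assembly, "-" * total)
            seqs[assembly] += text
        total += width
        # Assemblies absent from this block get gaps for its columns.
        for assembly in seqs:
            missing = total - len(seqs[assembly])
            if missing > 0:
                seqs[assembly] += "-" * missing
    if strand == "-":
        # Reverse-complement every row so the columns stay aligned.
        seqs = {name: s.translate(COMPLEMENT)[::-1] for name, s in seqs.items()}
    return "".join(f">{name}\n{s}\n" for name, s in seqs.items())
```

To use it, pass an open file handle (or any iterable of lines) for the saved MAF file, for example `maf_to_fasta(open("MyRegion.maf"), strand="+")`, and write the returned text to a .fa file. Unlike the script above, this sketch labels each sequence by assembly name only and does not take a region name.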