C. Sequence Comparison and Alignment:
1. The computational problem of aligning a pair of sequences
-
Intuitively, two
similar sequences can be lined up so that identical bases (or amino
acids) are matched.
-
-
However, from a
computer's point of view, the alignment process is far from trivial.
-
-
If gaps are allowed,
the number of different alignments possible for any two sequences is
enormous and grows rapidly with sequence length. For example:
-
Seq 1: TT - ACTTGCC
-
Seq 2: ATGAC - - GAC
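To get a feel for how quickly the number of gapped alignments grows, here is a minimal counting sketch. It assumes that every column of an alignment is one of three kinds (residue over residue, residue over gap, gap over residue) and that two alignments differing only in the order of adjacent gaps count as different. The function name is illustrative only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(n, m):
    """Count gapped alignments of two sequences of lengths n and m.

    Assumption: each column is residue/residue, residue/gap or gap/residue,
    and the order of adjacent gaps matters.
    """
    if n == 0 or m == 0:
        # Only one way left: pair every remaining residue with a gap.
        return 1
    return (
        count_alignments(n - 1, m - 1)  # last column pairs two residues
        + count_alignments(n - 1, m)    # last column: residue over gap
        + count_alignments(n, m - 1)    # last column: gap over residue
    )


if __name__ == "__main__":
    for length in (2, 3, 5, 10, 20):
        print(length, count_alignments(length, length))
```

Under these assumptions two sequences of length 2 already have 13 alignments, and two of length 5 have 1683, so listing every alignment quickly becomes impractical.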
-
-
A "dynamic programming"
method finds the best-scoring alignment under a given scoring scheme
without having to test every possible alignment one by one.
-
-
By adjusting the penalties
for gaps versus mismatches, this technique can give very good alignments
without requiring tremendous computing power.
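As an illustration, here is a minimal sketch of a dynamic programming alignment of two sequences (the Needleman-Wunsch global alignment scheme). It assumes a simple scoring scheme with one score for a match, one for a mismatch and a fixed penalty per gap position. The function name and the default values are illustrative and are not taken from any particular program.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Try a mild and a harsh gap penalty on the two example sequences.
    for gap_penalty in (-1, -5):
        s, x, y = needleman_wunsch("TTACTTGCC", "ATGACGAC", gap=gap_penalty)
        print(gap_penalty, s)
        print(x)
        print(y)
```

Running it with different `gap` and `mismatch` values shows how the choice of penalties steers which alignment is reported as the best one.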
-
-
Also, remember
that the "optimal" alignment according to the computer is often not the
"correct" biological alignment.
2. Multiple Sequence Alignments
-
When the alignment
problem is expanded to multiple sequences, we are once again confronted
with a computationally huge problem.
-
Seq 1: TT - ACTTGCC
-
Seq 2: ATGAC - - GAC
-
Seq 3: CT - AGCCTGA
-
-
Rather than trying
to align all of the sequences optimally at once (which means dealing with
an astronomically large number of possible alignments), a simplified
heuristic algorithm known as "progressive alignment" is used:
-
rank all sequences
according to similarity to each other
-
align the most
similar two sequences using the pairwise algorithm
-
create a consensus
sequence from this pairwise alignment
-
take the next
most similar sequence and align it to the consensus, inserting any new
gaps into the individual sequences that make up the consensus as well
-
continue to add
sequences until all are aligned
-
-
One result of
this type of algorithm is that the order in which the sequences are added
to the alignment can affect the final outcome.
-
-
Also, since the
problem is so complex, it is quite difficult to mathematically define a
truly optimal alignment of multiple sequences.
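The progressive alignment steps listed above can be sketched in a few lines. This is only an illustration under several assumptions: the pairwise step is a simple global alignment with fixed match, mismatch and gap scores; the consensus is the most common residue in each column; and the "next most similar" sequence is taken to be the one that scores best against the current consensus. Real programs are considerably more elaborate, and all function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise(a, b, match=1, mismatch=-1, gap=-2):
    """Simple global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    sub = lambda i, j: match if a[i - 1] == b[j - 1] else mismatch
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s[i][0] = i * gap
    for j in range(1, m + 1):
        s[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(s[i - 1][j - 1] + sub(i, j),
                          s[i - 1][j] + gap, s[i][j - 1] + gap)
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:  # trace back one best path
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and s[i][j] == s[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return s[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def consensus(rows):
    """Most common non-gap character in each alignment column."""
    return "".join(
        Counter(c for c in col if c != "-").most_common(1)[0][0]
        for col in zip(*rows)
    )


def progressive_align(seqs):
    """Align two or more sequences progressively; rows follow input order."""
    # Rank all pairs by similarity and start with the most similar pair.
    scores = {(i, j): pairwise(seqs[i], seqs[j])[0]
              for i, j in combinations(range(len(seqs)), 2)}
    i, j = max(scores, key=scores.get)
    _, row_i, row_j = pairwise(seqs[i], seqs[j])
    order, rows = [i, j], [row_i, row_j]
    remaining = [k for k in range(len(seqs)) if k not in (i, j)]
    while remaining:
        cons = consensus(rows)
        # Assumption: "next most similar" means best score against the consensus.
        k = max(remaining, key=lambda r: pairwise(cons, seqs[r])[0])
        _, al_cons, al_new = pairwise(cons, seqs[k])
        # Copy every gap opened in the consensus into each aligned sequence.
        widened = []
        for row in rows:
            chars = iter(row)
            widened.append("".join("-" if c == "-" else next(chars)
                                   for c in al_cons))
        rows = widened + [al_new]
        order.append(k)
        remaining.remove(k)
    result = [None] * len(seqs)
    for idx, row in zip(order, rows):
        result[idx] = row
    return result


if __name__ == "__main__":
    for row in progressive_align(["TTACTTGCC", "ATGACGAC", "CTAGCCTGA"]):
        print(row)
```

Because each new sequence is aligned only against the consensus built so far, changing the starting pair or the order of addition can change the final alignment.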
Computers in Molecular Microbiology
Bharat Patel, Biomolecular & Biomedical Sciences, Griffith University
Comments to: bharat@trishul.sci.gu.edu.au