Aligning rRNA Sequences
Today, generating DNA sequence data is routine, as the genome sequencing projects show. The data is usually generated in reads of approximately 300 to 1000 base pairs (bp). The real challenge lies not in the sequencing itself but in the analysis of the resulting data.
Genes are often far longer than a single read (up to tens of thousands of bp), so the reads must be overlapped and assembled into much longer segments known as "contigs".
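As a rough illustration of what "overlapping" two reads means, the sketch below joins two reads when the end of one exactly matches the start of the other. The function names, the `min_overlap` parameter and the two example reads are made up for this illustration; real assembly programs also have to tolerate sequencing errors, which this sketch does not.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3):
    """Join two reads on their exact overlap; return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the part of `right` beyond the overlap.
    return left + right[size:]


# Hypothetical reads that share the six bases "GTTAGC".
read_a = "ATGGCGTACGTTAGC"
read_b = "GTTAGCCATTGACCA"
contig = merge_reads(read_a, read_b)
print(contig)
```

Requiring a minimum overlap guards against joining reads that share only one or two bases by chance.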
DNA sequence reads usually contain errors: incorrectly determined bases as well as
insertions and deletions. The error rate is highest at the beginning and the end
of a read, precisely where the reads must be overlapped. Another complication is
that sequence from the cloning vector is often present at the ends of the reads
(NOTE: You will be given examples of such problematic sequences in a different
exercise).
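To make the vector problem concrete, here is a minimal sketch of trimming vector sequence from the 3' end of a read. It assumes the read runs from the insert into the vector, so that the tail of the read is an exact copy of the start of the vector. The vector fragment, the example read and the `min_match` threshold are placeholders chosen for illustration; because real read ends also contain base-calling errors, practical trimming tools use approximate rather than exact matching.

```python
def trim_vector_suffix(read: str, vector: str, min_match: int = 6) -> str:
    """Remove a trailing stretch of `read` that exactly matches the start of `vector`."""
    # Scan from the left so that the longest matching tail is found first.
    for start in range(len(read) - min_match + 1):
        tail = read[start:]
        if vector.startswith(tail):
            return read[:start]
    # No tail of at least `min_match` bases matched: leave the read unchanged.
    return read


# Placeholder vector fragment and a read that runs 11 bases into it.
vector = "GGATCCTCTAGAGTCGACCTGCAGG"
insert = "ACGTTGCAAGCTTACG"
read = insert + vector[:11]

trimmed = trim_vector_suffix(read, vector)
print(trimmed)
```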
Since contig assembly is such a critical problem, a lot of effort has been put
into developing software to aid DNA sequencing projects.
Based on their faith in the sequence assembly software, researchers have taken one of three
different approaches to planning sequencing projects.
People who don't trust the software use a "directed cloning" strategy, carefully
preparing ordered overlapping fragments.
A second strategy, known as "primer walking", requires very fast and accurate analysis
of sequence reads, since each sequencing reaction uses information from the previous read.
A third strategy, known as "shotgun sequencing", takes maximum advantage of the speed
and low cost of automated sequencing, but relies totally on software to assemble a jumble of
sequence reads into a coherent and accurate contig.
You have been given a set of
sequences generated by primer walking a 16S rRNA gene (in text format). These sequences have been trimmed (cleaned up) and contain neither vector sequence nor errors. They can now be assembled using the "Contig Assembly Program" (CAP), which forms part of the BioEdit suite of programs.
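Because the reads in this exercise are described as free of errors and vector sequence, the idea behind assembly can be sketched with exact matching alone. The example below is a simple greedy approach (repeatedly merge the pair of sequences with the longest exact overlap). It is not the algorithm CAP uses, and the reference sequence, read boundaries and `min_overlap` value are invented for illustration.

```python
from itertools import permutations


def overlap_length(left: str, right: str, min_overlap: int = 5) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap: int = 5):
    """Merge the best-overlapping pair repeatedly until no overlaps remain."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        # Compare every ordered pair and remember the longest overlap.
        for i, j in permutations(range(len(contigs)), 2):
            size = overlap_length(contigs[i], contigs[j], min_overlap)
            if size > best_size:
                best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Made-up 40 bp "gene" cut into three overlapping, error-free reads,
# supplied in scrambled order.
reference = "AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGC"
reads = [reference[22:40], reference[0:18], reference[10:30]]

contigs = greedy_assemble(reads)
print(len(contigs), contigs[0] == reference)
```

A greedy strategy like this can be misled by repeated regions and by sequencing errors, which is one reason dedicated assembly programs are considerably more elaborate.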
Comments and suggestions to: Dr. Bharat Patel <bharat@trishul.sci.gu.edu.au>
HTML'd by Bharat Patel
[Created: 1 Aug 1997]
[Updated: 17 July 2000]