Microbial Genomes - Overview

  1. Introduction
    1. Genomics-study of the molecular organization of genomes, their information content, and the gene products they encode
      1. Structural genomics-study of the physical nature of genomes (i.e., determine and analyze the DNA sequence of the genome)
      2. Functional genomics-study of the way the genome functions
      3. Comparative genomics-compares genomes from different organisms; helps discern patterns in function and regulation and provides information about microbial evolution
    2. Genomics will decrease the amount of time microbiologists spend cloning genes; instead they will generate new questions and hypotheses by computer analyses of genome data


  1. Determining DNA Sequences
    1. Sanger Method
      1. Uses dideoxynucleoside triphosphates (ddNTPs) in DNA synthesis; these lack a 3'-hydroxyl group and therefore terminate DNA synthesis
      2. Single strands of DNA are mixed with a primer, DNA polymerase I, four deoxynucleoside triphosphates (one of which is labeled), and a small amount of one of the ddNTPs; DNA synthesis begins at the primer but terminates each time a ddNTP is added to the chain
      3. Four reactions are run, each with a different ddNTP; these reactions generate DNA fragments of different lengths, because the site at which the ddNTP is inserted is random
      4. Newly synthesized DNA fragments are separated electrophoretically; the gel is autoradiographed and the sequence is read from it
    2. Automated systems now exist that make rapid sequencing possible
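
The logic of the Sanger method can be illustrated with a small simulation. The following Python sketch is a toy model, not a laboratory protocol: it assumes an idealized experiment in which every possible termination point is represented in each of the four ddNTP reactions, and the function names (`terminated_lengths`, `read_gel`) are invented for this illustration.

```python
# Toy model of Sanger chain termination (idealized; names are illustrative).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_lengths(template, primer_len):
    """Return, for each ddNTP reaction, the lengths of the terminated fragments.

    `template` is the single-stranded template written 5'->3'. The newly
    synthesized strand is its reverse complement, and its first `primer_len`
    bases are assumed to be the primer.
    """
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))
    reactions = {base: [] for base in "ACGT"}
    for pos in range(primer_len, len(new_strand)):
        # In the reaction containing the matching ddNTP, some chains stop here,
        # producing a fragment that is pos + 1 nucleotides long.
        reactions[new_strand[pos]].append(pos + 1)
    return reactions


def read_gel(reactions):
    """Read the sequence from the four lanes, shortest fragment first."""
    bands = sorted(
        (length, base)
        for base, lengths in reactions.items()
        for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    lanes = terminated_lengths("ATGCCGTAAGCT", primer_len=4)
    for base in "ACGT":
        print(f"dd{base}TP lane: fragment lengths {lanes[base]}")
    print("Sequence read from gel:", read_gel(lanes))
```

Because each fragment length occurs in exactly one lane, sorting the bands by length recovers the order of bases added after the primer, which is what reading the autoradiograph from bottom to top accomplishes.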


  1. Whole-Genome Shotgun Sequencing
    1. Sequencing a genome by the whole-genome shotgun approach is a multi-step process
      1. Library construction-chromosomes are broken into gene-sized fragments, inserted into plasmids, and transformed into special E. coli strains
      2. Random sequencing-the cloned fragments are sequenced
      3. Fragment alignment and gap closure-DNA fragments are clustered and assembled into longer stretches of sequence by comparing nucleotide sequence overlaps between fragments; this produces larger contiguous sequences called contigs; the contigs are aligned in the proper order to form the completed genome sequence; gaps in the sequence are filled
      4. Editing-sequence is proofread to resolve any ambiguities in the sequence
    2. Annotation is done once the sequence is obtained; annotation involves identifying open reading frames (ORFs), determining their potential amino acid sequences, and comparing these to known proteins; such comparisons allow tentative assignment of gene function as well as identification of transposable elements, operons, and repeat sequences, and the detection of various metabolic pathways
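
The fragment alignment step described above can be sketched as a greedy overlap merge. This is a minimal illustration under strong assumptions (error-free reads, all from the same strand, no repeats); real assemblers are far more elaborate. The function names and the `min_len` parameter are invented for this example.

```python
# Greedy overlap assembly sketch (assumes error-free, same-strand reads).


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Returns the list of contigs left when no overlap of at least
    `min_len` bases remains; more than one contig means gaps are left.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]


if __name__ == "__main__":
    reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
    print(greedy_assemble(reads))                 # one contig
    print(greedy_assemble(reads + ["CCCCAAAA"]))  # a gap remains: two contigs
```

When a read shares no sufficient overlap with the others, it stays as a separate contig, which corresponds to a gap that must be closed by further sequencing.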


  1. Bioinformatics
    1. The field concerned with the management and analysis of biological data using computers
    2. DNA sequence data are stored in large public databases such as GenBank, which is part of the International Nucleotide Sequence Database Collaboration
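
One routine computer analysis of stored sequence data is the scan for open reading frames used in annotation. The sketch below is a simplified illustration: it assumes ATG as the only start codon and the three standard stop codons, searches all six reading frames, and reports minus-strand coordinates relative to the reverse complement. The function names and the `min_codons` cutoff are invented for this example; real gene finders use additional signals.

```python
# Simplified six-frame ORF scan (ATG start, standard stop codons assumed).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (frame, start, end) for each ATG...stop run on one strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the coding codons before the stop.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    yield frame, start, i + 3
                start = None


def find_orfs(seq, min_codons=2):
    """Return ORFs on both strands as (strand, frame, start, end) tuples."""
    seq = seq.upper()
    found = [("+", f, s, e) for f, s, e in orfs_in_strand(seq, min_codons)]
    rc = reverse_complement(seq)
    found += [("-", f, s, e) for f, s, e in orfs_in_strand(rc, min_codons)]
    return found


if __name__ == "__main__":
    dna = "CCATGAAATTTTAGGG"
    for strand, frame, start, end in find_orfs(dna):
        print(strand, frame, start, end)
```

Each reported ORF would then be translated and compared with known proteins to assign a tentative function.
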
  2. General Characteristics of Microbial Genomes
    1. Analysis of a number of microbial genomes has enabled scientists to develop hypotheses about minimal genome size, identify large numbers of genes for which function is unknown, and formulate hypotheses regarding the evolutionary relationships among the three domains of life
    2. Findings of particular interest
      1. Deinococci-bacteria with remarkable resistance to radiation-have the same array of DNA repair genes as other bacteria; they just have more copies of the repair genes
      2. Rickettsia prowazekii-a bacterium thought to be related to the ancient bacterium that gave rise to eucaryotic mitochondria-has sequences consistent with this hypothesis
      3. Chlamydiae-bacteria that have a unique life cycle, are often referred to as energy parasites, and lack peptidoglycan-have a genome that is similar to those of many other bacteria and even contains some genes for ATP synthesis and peptidoglycan synthesis; however, these bacteria lack a gene long held to be required for septum formation during cell division
      4. Treponema pallidum-the causative agent of syphilis, which has not been cultured outside the human body-has a genome that shows it is metabolically crippled, both anabolically and catabolically; it also has a family of genes that encode surface proteins, suggesting that this bacterium may be able to change its surface proteins and thereby avoid attack by the host immune system
      5. Mycobacterium tuberculosis-the causative agent of tuberculosis-has a very large genome with more than 250 genes devoted to lipid metabolism; it also has a large number of regulatory genes, suggesting that the infection process is very complex; other genes may enable the bacterium to change its antigens and thus elude the host immune system
      6. Mycobacterium leprae-the causative agent of leprosy-has a much smaller genome than M. tuberculosis, and about half of that genome is devoid of functional genes
    3. General Patterns
      1. Despite the conservation of protein sequences, genome organization is quite variable in the Bacteria and Archaea
      2. Considerable horizontal gene transfer, especially of housekeeping genes, has occurred in the evolution of these microbes


  1. Functional Genomics
    1. Genome annotation-used to tentatively identify genes; allows analysis of the kinds of genes and functions present in the microorganism
    2. Evaluation of RNA-level gene expression
      1. DNA microarrays (DNA chips)-solid supports (e.g., glass) that have DNA attached in highly organized arrays; in commercial chips, the array consists of many expressed sequence tags (partial gene sequences, each unique to its gene, that can be used to identify and position the gene during genomic analysis)
      2. The mRNA or cDNA to be analyzed (target mixture) is isolated, labeled with fluorescent reporter groups, and incubated with the DNA chip; fluorescence at an address on the chip indicates that the DNA probe at that address is bound to an mRNA or cDNA in the target mixture; analysis of the hybridization pattern shows which genes are being transcribed
      3. Using this procedure, the characteristic expression of whole sets of genes during differentiation or in response to environmental changes can be observed; patterns of gene expression can be detected and functions can be tentatively assigned based on expression
    3. Evaluation of protein-level gene expression
      1. Proteome-entire collection of proteins that an organism produces; proteomics is the study of the proteome
      2. The traditional method for studying the proteome is two-dimensional electrophoresis, which can resolve thousands of proteins in a mixture
      3. Differences in the proteins produced under various conditions can be detected
    4. Although functional genomics already has provided valuable information, there are still problems that must be solved
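
The comparison of microarray hybridization patterns between two conditions can be sketched as follows. This is an illustrative simplification: the gene names and intensity values are hypothetical, the two-fold cutoff (`threshold=1.0` on a log2 scale) is an arbitrary assumption, and the normalization and statistical testing that real analyses require are omitted.

```python
# Sketch: compare fluorescence intensities from two conditions
# (hypothetical data; cutoff chosen only for illustration).
import math


def log2_ratios(control, treated, floor=1.0):
    """Return log2(treated / control) for every gene present in both sets.

    Intensities below `floor` are raised to `floor` so that a spot with
    no signal does not cause a division by zero.
    """
    ratios = {}
    for gene in control.keys() & treated.keys():
        c = max(control[gene], floor)
        t = max(treated[gene], floor)
        ratios[gene] = math.log2(t / c)
    return ratios


def classify(ratios, threshold=1.0):
    """Label each gene as induced, repressed, or unchanged."""
    calls = {}
    for gene, ratio in ratios.items():
        if ratio >= threshold:
            calls[gene] = "induced"
        elif ratio <= -threshold:
            calls[gene] = "repressed"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    control = {"geneA": 100, "geneB": 800, "geneC": 400}
    treated = {"geneA": 400, "geneB": 100, "geneC": 420}
    calls = classify(log2_ratios(control, treated))
    for gene in sorted(calls):
        print(gene, calls[gene])
```

Genes that rise or fall together across conditions are candidates for a shared function or shared regulation, which is the basis for the tentative assignments mentioned above.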


  1. The Future of Genomics
    1. New methods are needed for the large-scale analysis of genes and proteins so that more organisms can be studied
    2. All new information about DNA and protein sequences, variations in mRNA and protein levels, and protein interactions must be integrated in order to understand genome organization and the workings of a cell
    3. Genomics can be used to provide insights into pathogenicity and suggest treatments for infectious disease
    4. Pharmacogenomics should produce many new drugs to treat disease
    5. The nature of horizontal gene transfer and the process of microbial evolution can be studied by comparing a wide variety of genomes
    6. There are numerous industrial applications (e.g., identification of novel enzymes)
    7. Genomics will probably impact agriculture (e.g., identification of new biopesticides)