
MB 451 Microbial Diversity

Department of Microbiology - NC State University




Molecular Ecology: Population-based Approaches

Molecular phylogenetic surveys give a much larger picture of the diversity of an environment than the older cultivation-dependent methods, but a real drawback is the time, expense, and energy required. The result is that these molecular surveys are almost always performed one at a time: a single snapshot of a point in the environment. But microbial ecologists know that a lot of the interesting stuff is the variation in the microbial populations from one place to the next (measured in micrometers to thousands of kilometers) or from one time to another (measured in minutes to thousands of years). How, for example, do you survey the dynamics of the microbial population in each layer of a Yellowstone microbial mat, from its source pool to its drainage into a cold stream, during the diurnal and seasonal cycles? Or follow the changes in microbial population in the subsurface as a gasoline plume forms from a leaky storage tank? The old-fashioned common answer is DGGE: denaturing gradient gel electrophoresis. The emerging alternative is terminal-RFLP. We'll talk today about these technologies and examples of how they can be used.


DGGE analysis of bacteria in a Tibetan hot spring.

Denaturing Gradient Gel Electrophoresis

DGGE starts out like almost every molecular phylogenetic analysis does these days: by the isolation of DNA from environmental samples, followed by PCR of ssu-rRNA genes. Rather than cloning and sequencing from this pool of genes, however, they are first separated into unique sequences based on their denaturation properties.

DGGE is carried out in polyacrylamide gels in which the concentration of urea and formamide increases from top to bottom in the gel; i.e. the gel contains a gradient of denaturants. (Remember that denaturation of DNA means separation of the two strands.) The PCR-amplified ssu-rDNA is loaded in wells at the top of the gel, where the concentration of urea/formamide is too low to denature the DNA. As the ssu-rDNA migrates down the gel during electrophoresis, the concentration of urea/formamide increases until, at some point, it is high enough to denature the DNA. At this point, the ssu-rDNA band essentially stops moving (it slows way down). Because every ssu-rDNA sequence will have a different denaturation point, they will denature at different levels of the gel and separate into distinct bands despite the fact that the ssu-rDNAs in all of the bands are all the same size.

DGGE

A technical improvement that has become standard in this method is the addition of a long tail of G=C basepairs to the end of one of the PCR primers. This "GC clamp" keeps the denatured strands of the ssu-rDNA from separating completely. The melted molecule stays joined at the clamp, which effectively doubles the length of the single-stranded DNA (slowing it to a near stop) and keeps the two strands from running as two distinct bands in the gel, which would confuse the issue.

Another version of DGGE is TGGE; temperature gradient gel electrophoresis, in which the denaturant is temperature instead of urea/formamide. The electrophoresis unit is designed so that the gel is heated in a controllable fashion, usually to a higher temperature at the bottom than at the top.

The gels are stained once run and visualized as usual. Each visible band represents an abundant organism in that environment. The pattern of bands is a "fingerprint" of the environment, as well. The intensity of each band represents the abundance of the organism, to some extent (assuming no bias in the PCR reaction), and can be followed from place to place or time to time. In order to actually identify the organism represented by each band, you can cut the band from a gel, re-PCR amplify it, and sequence it.
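To make the "fingerprint" idea concrete, here is a minimal sketch of how two lanes might be compared. The lane names, band positions, and intensities are made up for illustration, and the Jaccard index is just one simple similarity measure, not something taken from the papers discussed here.

```python
def relative_abundance(intensities):
    """Scale raw band intensities so they sum to 1 within a lane."""
    total = sum(intensities.values())
    return {band: value / total for band, value in intensities.items()}


def jaccard_similarity(lane_a, lane_b):
    """Fraction of band positions shared by two lanes (presence/absence only)."""
    bands_a, bands_b = set(lane_a), set(lane_b)
    return len(bands_a & bands_b) / len(bands_a | bands_b)


# Hypothetical lanes: band position (arbitrary units) -> stained intensity.
lane_morning = {12: 40.0, 27: 10.0, 33: 50.0}
lane_evening = {12: 20.0, 33: 60.0, 41: 20.0}

morning = relative_abundance(lane_morning)
similarity = jaccard_similarity(lane_morning, lane_evening)
print(morning, similarity)
```

Keep in mind that the relative intensities are only as trustworthy as the PCR is unbiased, so the presence/absence comparison is the safer of the two numbers.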

Different bacteria in different temperature zones of a Tibetan hot spring.

Yim LC, Hongmei J, Aitchison JC & Pointing SB 2006 Highly diverse community structure in a remote central Tibetan geothermal spring does not display monotonic variation to thermal stress. FEMS Microbiol Ecol. 57:80-91

Question: How do the bacterial communities in various hot springs and temperature zones differ?

The authors try in this paper to examine two aspects of diversity in this Tibetan hot spring: how the diversity varies with temperature, and how the organisms differ from those of similar hotsprings elsewhere in the world. For the first, they hypothesize that the populations do not vary directly ('monotonically') according to thermal stress; in other words, that the populations do not get simpler with each increase in temperature. Secondly, they compare the sequences they get with GenBank, including sequences from hotsprings all over the world, to see if they can detect any distinctness of their sequences; in other words, do they see 'phylogeographic' groups.

This last issue is an important one that is usually ignored in microbiology - the notion that certain bacterial species or strains might have a geographic 'range', in the same way that macroscopic plants and animals do. It's usually assumed that the local environment controls the microbial species present, and that there is enough contamination of every environment with every organism that any place the same local environment exists will have the same species and strains of microbes. In the introduction of the paper, the authors describe several cases where this is and is not true for hotspring organisms.

mat

If this looks familiar, it should - this is a lot like the typical neutral pH low sulfide hotsprings in Yellowstone, such as Octopus Spring.

So, in this paper, they took samples in transects across thermal gradients. They used PCR primers specific for Bacteria, Archaea, cyanobacteria, and Chloroflexi (green non-sulfur Bacteria) to amplify ssu-rRNA sequences, then separated them by DGGE:

fig1

Each distinct band from each sample was cut out of the gel, reamplified, and sequenced so that the 'phylotype' of the organism it represents could be determined. They then got a rough idea about what the organisms were by BLASTing the sequences to identify the closest relative in GenBank, and then by generating phylogenetic trees. Here are BLAST results, given in the online data supplement:

GenBank Accession Number | Sequence code* | Closest BLAST match: Identity | Location# | GenBank Accession number | % Similarity
--- | --- | --- | --- | --- | ---
DQ001343 | TS4-a5-70 | Pyrobaculum aerophilum IM2 | Boiling marine water, Italy | AE009843 | 93
DQ001344 | TM3-a5-63 | Thermoproteus sp. IC-061 | Hot spring, Japan | AB081846 | 73
DQ001345 | TM3-a6-63 | Uncultured archaeon GA55 | Japan | AB046184 | 94
DQ001346 | TS1-a4-65 | Uncultured archaeon SAGMA-2 | Deep South African gold mines | AB050233 | 84
DQ001347 | TS4-a2-70 | Uncultured crenarchaeote GoM_5202R-15 | Marine sediment, Mexico | AY324539 | 80
DQ001348 | TS1-a1-65 | Uncultured archaeon pCIRA-X | Hydrothermal field, Japan | AB095133 | 77
DQ001349 | TM3-a2-63 | Uncultured archaeon pCIRA-X | Hydrothermal field, Japan | AB095133 | 75
DQ001350 | TS5-a5-83 | Unidentified archaeon pMC2A17 | Deep sea hydrothermal vent, Japan | AB019747 | 79
DQ001351 | TM3-a1-63 | Uncultured euryarchaeote VAL31-1 | Freshwater lake water, Valkekotinen Lake, Finland | AJ131276 | 81
DQ001352 | TM5-a3-69 | Archaeoglobus profundus DSM5631 | Deep sea hydrothermal vent, Mexico | AF322392 | 91
DQ001353 | TM4-a4-65 | Uncultured archaeon AK49 | Hot spring, Thailand | AY555828 | 93
DQ001354 | TM3-b4-63 | Uncultured Aeromonas sp. | Lake Waiau, Hawaii | AY345531 | 98
DQ001355 | TM5-b11-69 | Uncultured Aeromonas sp. | Lake Waiau, Hawaii | AY345531 | 94
DQ001356 | TM5-b1-69 | Uncultured Aeromonas sp. | Lake Waiau, Hawaii | AY345531 | 99
DQ001357 | TM5-b12-69 | Uncultured Aeromonas sp. | Lake Waiau, Hawaii | AY345531 | 99
DQ001358 | TS4-b11-70 | Corbulabacter subterraneus | Thermal aquifer, Australia | AY078053 | 92
DQ001359 | TS2-b18-70 | Sphingomonas sp. M3C203B-B | Ice core, Taylor Dome, Antarctica | AF395031 | 99
DQ001360 | TS4-b8-70 | Acidobacteriaceae sp. | Peaty grassland soil | Y12597 | 79
DQ001361 | TM1-g1-52 | Uncultured Acidobacteria sp. GFP1 | GFP hot spring (67°C), YNP, USA | AF130858 | 95
DQ001362 | TS1-b1-65 | Cytophaga sp. | Deep-sea sediment, Japan | AB015265 | 85
DQ001363 | TS3-b3-70 | Flexibacter ruber | Hot spring, Japan | AB078065 | 96
DQ001364 | TS2-b4-70 | Arcocella aquatica | Freshwater lake, Moscow, Russia | AJ535729 | 78
DQ001365 | TS2-b13-70 | Methylothermus sp. | Underground hot springs, Hungary | U89299 | 79
DQ001366 | TS1-b2-65 | Unidentified sp. | Obsidian Pool, YNP, USA | AF018187 | 86
DQ001367 | TM4-b10-65 | Desulfotomaculum sp. | Underground oil-storage cavity, Japan | AB074934 | 79
DQ001368 | TS3-b5-70 | Cytophaga sp. | Deep-sea sediment, Japan | AB015265 | 78
DQ001369 | TM3-b3-63 | Unidentified Cytophagales/GSB sp. OPB56 | Obsidian Pool, YNP, USA | AF027009 | 86
DQ001370 | TM4-b16-65 | Unidentified GSB sp. OPS77 | Obsidian Pool, YNP, USA | AF027011 | 97
DQ001371 | TM3-b8-63 | Chlorobium tepidum TLS | High-sulfide hot spring, New Zealand | AE012963 | 86
DQ001372 | TS2-b6-70 | Prosthecochloris aestuarii | Hypersaline Chiprana Lake, Spain | AJ291826 | 86
DQ001373 | TM5-b15-69 | Thermosipho japonicus | Deep-sea hydrothermal vent, Okinawa, Japan | AB024932 | 90
DQ001374 | TM4-b9-65 | Eubacterium sp. OS type L | Octopus Spring, YNP, USA | L04707 | 95
DQ001375 | TS2-b14-70 | Eubacterium sp. OS type L | Octopus Spring, YNP, USA | L04707 | 95
DQ001376 | TS5-g5-83 | Uncultured Eubacterium sp. clone SM2G08 | Mammoth Springs, YNP, USA | AF445740 | 95
DQ001377 | TM4-b14-65 | Eubacterium sp. OS type L | Oregon Spring, YNP, USA | L04707 | 95
DQ001378 | TM4-b13-65 | Candidate Division OP9 clone OPB72 | Obsidian Pool, YNP, USA | AF027086 | 98
DQ001379 | TM5-g4-69 | Unidentified Thermodesulfobacterium sp. OPB45 | Hot spring, YNP, USA | AF027096 | 92
DQ001380 | TS2-b15-70 | Thermus scotoductus | Hot spring, Iceland | Y18410 | 98
DQ001381 | TS5-b17-83 | Thermus scotoductus | Hot spring, Iceland | Y18410 | 98
DQ001382 | TS5-b16-83 | Thermus scotoductus | Hot spring, Iceland | Y18410 | 98
DQ001383 | TM5-b5-69 | Roseiflexus castenholzii | Hot spring, Japan | AB041226 | 96
DQ001384 | TM1-b6-52 | Roseiflexus castenholzii | Hot spring, Japan | AB041226 | 81
DQ001385 | TM4-b17-65 | Roseiflexus castenholzii | Hot spring, Japan | AB041226 | 96
DQ001386 | TM3-b18-63 | Chloroflexus aggregans | Hot spring, Japan | AJ308499 | 94
DQ001387 | TS1-b10-65 | Uncultured GNS sp. | CASM vent field, Axial Volcano, Juan de Fuca Ridge | AJ441227 | 82
DQ001388 | TM2-c10-62 | Uncultured cyanobacterium | Mammoth hot springs, YNP, USA | AF445722 | 95
DQ001388 | TS3-c3-70 | Synechococcus sp. clone OH 32 | Hot spring, Oregon, USA | AF285243 | 95
DQ001389 | TS3-c4-70 | Synechococcus sp. clone OH 32 | Hot spring, Oregon, USA | AF285243 | 95
DQ001390 | TS2-c1-65 | Synechococcus sp. clone OH 32 | Hot spring, Oregon, USA | AF285243 | 95
DQ001391 | TM1-c9-52 | Synechococcus sp. C9 | Octopus Spring, YNP, USA | AF132773 | 99
DQ001392 | TM3-c11-63 | Synechococcus sp. C9 | Octopus Spring, YNP, USA | AF132773 | 96
DQ001393 | TM5-c1-69 | Synechococcus sp. clone OH 32 | Hot spring, Oregon, USA | AF285243 | 89
DQ001394 | TS3-b7-70 | Leptolyngbya sp. | Semi-arid deserts, western USA | AY239602 | 88
DQ001395 | TM2-b7-62 | Leptolyngbya sp. | Semi-arid deserts, western USA | AY239602 | 89
DQ001396 | TM2-c3-62 | Thermosynechococcus elongatus | Hot spring, Japan | AP005376 | 95
DQ001397 | TM3-c4-63 | Thermosynechococcus elongatus | Hot spring, Japan | AP005376 | 95
DQ001398 | TM3-c2-63 | Thermosynechococcus elongatus | Hot spring, Japan | AP005376 | 98
DQ001400 | TM2-c5-62 | Thermosynechococcus elongatus | Hot spring, Japan | AP005376 | 95

* TM prefix denotes sequences derived from mats, whilst TS prefix denotes those derived from streamers. Centre letter/number is an identifier. Numeric suffixes denote temperature from which sequence was recovered.
# YNP denotes Yellowstone National Park.

Most of the rest of the paper, then, is a series of phylogenetic trees:

fig2

BTW, the authors make a point of saying that 19% of their sequences were archaeal - but keep in mind that they used distinct archaeal and bacterial specific primers for the PCRs they used in the DGGE gels. In fact, the authors use the percentage of their total numbers of sequences all the time as an implicit measure of abundance; but it is the relative intensities of bands, not the frequencies of sequences, that count. Even so, it's unclear that their PCRs are designed to be quantitative; real quantitation would require either FISH or realtime PCR, which they have not done.

fig3

 

fig4

fig5

The last Figure is an assessment of diversity along the temperature gradient. They use several measures of 'diversity', and attempt to show that the most diverse samples are from the 63-70C temperature samples: in other words, that diversity does not decrease evenly with increased thermal stress. This seems to me like a "strawman" argument (a weak 'hypothesis' created specifically to be overthrown) - I don't know anyone who argues that diversity has to decrease quantitatively with environmental stresses, only that diversity tends to decrease in harsher environments. This decrease in diversity is expected to be most obvious near the limits (well above 70C), and nobody I know of expects it to be a smooth gradient. We see this in Yellowstone, for example: the color changes take place stepwise, as bands (not smoothly), as the temperature changes.

fig6
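For a feel of what a diversity measure does with a DGGE lane, here is a minimal sketch of the Shannon index, H = -sum(p_i ln p_i), computed from band intensities. This is one widely used index; I'm not claiming it is one of the specific measures used in the paper, and the intensity values in the example are invented.

```python
import math


def shannon_index(intensities):
    """Shannon diversity H from a list of band intensities (zeros ignored)."""
    total = sum(intensities)
    proportions = [x / total for x in intensities if x > 0]
    return -sum(p * math.log(p) for p in proportions)


def evenness(intensities):
    """Pielou's evenness: H divided by its maximum, ln(number of bands)."""
    richness = sum(1 for x in intensities if x > 0)
    if richness < 2:
        return 0.0
    return shannon_index(intensities) / math.log(richness)


# Hypothetical lanes: four equally intense bands vs. one dominant band.
even_lane = [25.0, 25.0, 25.0, 25.0]
skewed_lane = [85.0, 5.0, 5.0, 5.0]

print(shannon_index(even_lane), evenness(even_lane))
print(shannon_index(skewed_lane), evenness(skewed_lane))
```

Both lanes have the same number of bands, but the lane dominated by one band scores lower, which is why "diversity" can mean different things depending on the measure chosen.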

I think their other point is weak - the notion that the sequences they see are distinct from anything in GenBank and therefore represent novel species found only in this area. Where's the control experiment for this? This conclusion would require similar samplings from other similar hotsprings worldwide, some close to one another and some remote, and then seeing if different sequences are specific to specific geographic ranges.


T-RFLP analysis of periodontal disease

Terminal Restriction Fragment Length Polymorphism (t-RFLP) analysis

t-RFLP is a method similar to DGGE in that it generates fingerprints of a population, but unlike DGGE, the bands can (ideally) be assigned to specific organisms directly, without the need for sequencing.

Imagine the simplest case of a pure culture "unknown". You amplify the ssu-rRNA with some set of primers, e.g. 515F and 1492R (as in lab), and one of the primers (515F in this example) is fluorescently labeled. You digest this ssu-rDNA with several different restriction enzymes and separate the products out on a sequencing gel:

[Figure]

The sizes of the labeled fragments are compared to a database of potential fragments of ssu-rRNA sequences that would be generated from PCR products from those primers digested with those enzymes. If the restriction enzymes were carefully chosen, a computer program should be able to sift through the database and identify your organism based on the observed (from the gel) sizes of the labeled fragments. For example, there might be 100 organisms whose ssu-rDNA, if amplified with labeled 515F and unlabeled 1492R and digested with HaeIII, should give a 201bp fragment. There might also be another 100 organisms that would have a 571bp MspI fragment, but only one name on both lists - that's your organism. This identification might be verified by the presence of a predictable 823bp Sau3AI fragment.
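The "one name on both lists" logic is easy to sketch. The code below is only an illustration: the three sequences and organism names are short made-up toys (not real ssu-rDNA), and the function names are my own. The cut sites are the standard ones for HaeIII (GG^CC) and MspI (C^CGG).

```python
ENZYMES = {
    # name: (recognition site, cut offset within the site on the labeled strand)
    "HaeIII": ("GGCC", 2),  # GG^CC
    "MspI": ("CCGG", 1),    # C^CGG
}


def terminal_fragment_length(amplicon, enzyme):
    """Length of the fragment that carries the 5' label on the forward primer."""
    site, offset = ENZYMES[enzyme]
    position = amplicon.find(site)  # first site downstream of the label
    if position == -1:
        return len(amplicon)  # uncut: the whole amplicon is the labeled fragment
    return position + offset


def candidates(database, enzyme, observed, tolerance=1):
    """Organisms whose predicted terminal fragment matches the observed size."""
    return {
        name
        for name, amplicon in database.items()
        if abs(terminal_fragment_length(amplicon, enzyme) - observed) <= tolerance
    }


# Toy "database" of amplicons, written 5'->3' starting at the labeled primer.
DATABASE = {
    "org_a": "ATATGGCCTTTTTTCCGGAAAA",
    "org_b": "ATATGGCCTTCCGGAAAAAAAA",
    "org_c": "ATATATATGGCCTTCCGGAAAA",
}

# Observed labeled fragments: 6 nt with HaeIII, 15 nt with MspI.
haeiii_list = candidates(DATABASE, "HaeIII", 6)
mspi_list = candidates(DATABASE, "MspI", 15)
identified = haeiii_list & mspi_list
print(identified)
```

Each digest alone leaves two candidates here; only the intersection narrows it to one. The `tolerance` parameter stands in for the sizing error of a real gel or capillary run.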

This should be pretty easy, but now imagine doing the same thing with a population of organisms from a natural environment. Now you have several abundant organisms, some more common than others, creating a pattern of bands in each digest. However, the computer can, if the experiment is properly set up, sift through the peaks and determine what mixture of organisms would create that pattern of bands.

[Figure]
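For a mixed population, the simplest version of this sifting is to ask which database organisms have their predicted fragment present in every digest. The sketch below does just that; the organism names and most fragment sizes are hypothetical, and real analysis programs have to deal with peak heights and ambiguous matches, which this ignores.

```python
def consistent_organisms(predicted, observed_peaks, tolerance=1):
    """Organisms whose predicted fragment shows up as a peak in every digest."""
    hits = []
    for organism, fragments in predicted.items():
        if all(
            any(abs(size - peak) <= tolerance for peak in observed_peaks[enzyme])
            for enzyme, size in fragments.items()
        ):
            hits.append(organism)
    return sorted(hits)


# Hypothetical predicted terminal fragment sizes (bp) per enzyme.
predicted = {
    "organism_1": {"HaeIII": 201, "MspI": 571},
    "organism_2": {"HaeIII": 201, "MspI": 140},
    "organism_3": {"HaeIII": 350, "MspI": 90},
}

# Hypothetical peaks observed in the two digests of a community sample.
observed_peaks = {"HaeIII": [200, 350], "MspI": [90, 572]}

present = consistent_organisms(predicted, observed_peaks)
print(present)
```

Note that a match only says an organism is consistent with the pattern; two different organisms can share the same set of fragments.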

Your ability to sift through the microbial population using t-RFLP is basically limited only by your choice of primers (what kind of organisms they'll amplify ssu-rDNA from), your ability to choose the best restriction enzymes to use, and the database you're fitting your data to. t-RFLP is an emerging technology, so there is plenty of room for improvement in all of these aspects, but this approach is already very useful, and incredibly promising. Probably the ultimate limitation will be PCR primers; this is a limitation shared by all of the molecular phylogenetic approaches we've talked about.

Changes in oral microflora after treatment for periodontal disease

Sakamoto M, Huang Y, Ohnishi M, Umeda M, Ishikawa I & Benno Y. 2004 Changes in oral microbial profiles after periodontal treatment as determined by molecular analysis of 16S rRNA genes. J. Med. Microbiol. 53:563-571.

Question: How does the periodontal microflora change after treatment?

In this paper, the authors use t-RFLP to study subgingival plaque of 3 periodontal disease patients before and after treatment (lessons in oral hygiene and very complete teeth cleaning ("scaling and planing") both above and below the gums). Samples were taken from the subgingiva of 3 or 4 teeth before treatment and 3 months after treatment.

Although the authors use realtime PCR and ssu-rDNA clone libraries as well as t-RFLP, let's go through the t-RFLP data first.

t-RFLP was started by PCR amplification from the samples using 6-FAM-labeled 27F (bacterial-specific) and 1492R (universal). Samples of the PCR products were digested with HhaI (GCG^C) and MspI (C^CGG), and run on an ABI PRISM sequencing machine (these are the same machines MWG uses for the sequencing of our lab PCR products).

Fig 1 is an example of their data from the HhaI digests:

[Figure]

The code is that the first letter represents the patient (A, B or C), the second represents whether the sample was plaque (P) or saliva (S), and the number is before (1) or after (2) treatment. At this point, these are viewed as fingerprints; the identities of the organisms represented by the peaks are not important. What shows up is that there are differences before & after treatment. These are pretty subtle for patient A, but clear in the cases of patients B and C.

Fig 3 is a better example of how the data are examined. The top panel (a) shows the HhaI digests from one site on one patient, before (top) and after (bottom) treatment. The bottom panel (b) is the same thing with the MspI digest. Notice that after treatment (this is patient B), the Peptostreptococcus, Porphyromonas gingivalis, and Prevotella intermedia (known problem organisms) all disappear or diminish. The other organisms, which the authors showed using these same methods to be common in healthy individuals, remain abundant.

Given that t-RFLP is new technology, they used realtime PCR and more traditional ssu-rDNA clone libraries to confirm the results of their t-RFLP. They discuss these at some length in the paper, but we'll keep it brief. First they use realtime PCR with species-specific primers to determine the numbers of cells of several species in each sample:

[Figure]

Realtime PCR is a method by which the progress of a PCR reaction can be monitored in each round of cycling; because the amount of PCR product in each round is a reflection of the amount of starting template DNA, realtime PCR can be used to quantitate the amount of the target gene in the original DNA, and by extension the number of cells from which the DNA came.
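One common way to turn this into numbers is a standard curve: run a dilution series of known template amounts, record the cycle at which each crosses a fluorescence threshold (usually called Ct), and fit a line of Ct against log10(copies). The sketch below assumes that approach, which is not necessarily exactly what the authors did, and the dilution series and Ct values are invented for illustration.

```python
import math


def fit_standard_curve(standards):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting template copies."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series: (starting copies, threshold cycle).
standards = [(1e3, 30.0), (1e4, 26.7), (1e5, 23.4), (1e6, 20.1)]

slope, intercept = fit_standard_curve(standards)
estimate = copies_from_ct(25.05, slope, intercept)
print(slope, intercept, estimate)
```

A slope near -3.3 cycles per tenfold dilution corresponds to the product roughly doubling each cycle; going from gene copies to cell numbers also requires knowing how many rRNA gene copies each cell carries.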

Notice that they're looking specifically for some common oral spirochaetes. Notice also that although patient A didn't have much change in these problem organisms, patients B and C had dramatic decreases in them all. However, notice as well that these particular organisms (the spirochaetes) did not show up in their t-RFLP experiments; this is probably why they did these realtime PCRs.

They also cloned sequences from their PCRs, and counted species found before & after treatment:

[Figure]

So this is the traditional ssu-rRNA microbial survey method. They look at a relatively small number of clones, 90 before and 88 after treatment in a single tooth of a single patient, but it looks even here like you can see pathogens decreasing or disappearing, being replaced by commensals. However, given the small numbers (rarely more than a handful of any one organism), this table really should have included some statistics to show whether or not these are significant.
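As an example of the kind of statistic that would help, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of clone counts. The library sizes (90 before, 88 after) are from the paper as described above, but the count of 6 clones of one pathogen before and 0 after is a hypothetical number chosen for illustration, not a value from their table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    return sum(
        p
        for p in (probability(x) for x in range(low, high + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: 6 of 90 clones before treatment, 0 of 88 after.
p_value = fisher_exact_two_sided(6, 90 - 6, 0, 88 - 0)
print(p_value)
```

With counts this small, a change of only one or two clones moves the p-value a lot, which is exactly why a table like this needs the test.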

The take-home message is that these confirm the observations of the t-RFLP; known harmful organisms are reduced and commensal ones remain constant or increase.


Questions for thought

  • How well do you think the computer would be able to identify members of a population from t-RFLP patterns if some of the organisms' rRNA sequences aren't in the database? Would this just result in unidentified peaks, or could it foul up its ability to identify organisms it does have in the database?

  • What might be the advantages or disadvantages of labeling one primer versus the other in a t-RFLP experiment? What about labeling both primers? How would you do this experiment?

  • What might you do if the restriction enzymes you use don't identify the organisms definitively? In other words, what if a set of bands could be one or more of several organisms?

  • How might you deal with the fact that many of the identifications in a t-RFLP experiment are likely to be from uncultivated organisms?

  • Why do you think the realtime PCR was able to measure changes in the spirochaete populations, but these populations weren't detected by t-RFLP?

  • What would a DGGE (or TGGE) gel look like if the denaturing gradient were horizontal instead of vertical?

  • Given DGGE technology, can you think of an environment that might be interesting to examine across space and/or time?

  • What limitation(s) of molecular phylogenetic analysis does DGGE not improve upon?


Last updated April 03, 2009 by James W Brown