
Reliability for detecting composition and changes of microbial communities by T-RFLP genetic profiling

Martin Hartmann , Franco Widmer
DOI: http://dx.doi.org/10.1111/j.1574-6941.2007.00427.x; pp. 249-260; first published online: 1 February 2008

Abstract

Terminal restriction fragment length polymorphism (T-RFLP) analysis is commonly used for profiling microbial communities in various environments. However, it may suffer from biases during the analytic process. This study addressed the potential of T-RFLP profiles (1) to reflect real community structures and diversities, as well as (2) to reliably detect changing components of microbial community structures. For this purpose, defined artificial communities of 30 SSU rRNA gene clones, derived from nine bacterial phyla, were used. PCR amplification efficiency was one primary bias with a maximum variability factor of 3.5 among clones. PCR downstream analyses such as enzymatic restriction and capillary electrophoresis introduced a maximum bias factor of 4 to terminal restriction fragment (T-RF) signal intensities, resulting in a total maximum bias factor of 14 in the final T-RFLP profiles. In addition, the quotient between amplification efficiency and T-RF size allowed predicting T-RF abundances in the profiles with high accuracy. Although these biases impaired detection of real community structures, the relative changes in structures and diversities were reliably reflected in the T-RFLP profiles. These data support the suitability of T-RFLP profiling for monitoring effects on microbial communities.

Keywords
  • genetic profiling
  • T-RFLP
  • bias
  • artificial community
  • diversity
  • community structure

Introduction

Terminal restriction fragment length polymorphism (T-RFLP) analysis (Liu et al., 1997) has been shown to be a consistent, high-resolution, and high-throughput cultivation-independent technique to monitor environmental and anthropogenic effects on microbial community structures (Moeseneder et al., 1999; Tiedje et al., 1999; Dunbar et al., 2000; Osborn et al., 2000; Braker et al., 2001; Buckley & Schmidt, 2001; Casamayor et al., 2002; Pesaro et al., 2004; Graff & Conrad, 2005; Hartmann et al., 2005, 2006; Widmer et al., 2006). However, the extent to which T-RFLP profiles are subject to biases is still debated in the scientific literature. PCR amplification and downstream analyses such as enzymatic restriction appear to introduce prominent biases into T-RFLP profiles (Egert & Friedrich, 2003; Kanagawa, 2003; Lueders & Friedrich, 2003; Frey et al., 2006; Hartmann et al., 2007). These biases may originate from preferential, unspecific, or inhibited amplification (Reysenbach et al., 1992; Wagner et al., 1994; Polz & Cavanaugh, 1998; Schmalenberger et al., 2001), template reannealing (Mathieu-Daudé, 1996; Suzuki & Giovannoni, 1996), generation of chimeric sequences (Kopczynski et al., 1994; Wang & Wang, 1997; Qiu et al., 2001; Hugenholtz & Huber, 2003), formation of heteroduplexes (Judo et al., 1998; Thompson et al., 2002), nucleotide misincorporation (Cariello et al., 1991; Eckert & Kunkel, 1991), partially single-stranded DNA (Egert & Friedrich, 2003, 2005), and residual polymerase activity during restriction (Hartmann et al., 2007). Careful optimization of PCR and digestion protocols may reduce but not completely exclude the impact of these biases, which will probably make the detection of real community structures, i.e. as they occur in the environment, very difficult.

The reliability of T-RFLP profiles to represent real community compositions and adequately reflect relative changes in the community structures has rarely been investigated. Artificially designed model communities represent one way to assess the extent of biases induced during T-RFLP analysis. Analysis of defined pairwise mixtures of five ruminal bacterial cultures by Frey et al. (2006) has demonstrated that DNA extraction and PCR amplification may induce major biases. However, the high similarity between sequences, i.e. four Firmicutes and one Fibrobacteres, and an unusually high analytical variation of up to 94% among replicates did not allow detailed conclusions to be drawn about the potential and limitations of T-RFLP profiling. Lueders & Friedrich (2003) have reported high quantitative reflection of the real community composition in the T-RFLP profiles of a four-member archaeal community when targeting rRNA genes, but not when targeting methyl-coenzyme M reductase-encoding genes. This has mainly been explained by the differences in degeneracy of the primers used for detecting the two markers. From these studies, it remained unclear, however, how more complex and diverse communities and changes in their structures would be reflected in a T-RFLP profile.

T-RFLP analysis consists of sequentially linked analytical steps, each of which may influence the detection of real structures of complex and phylogenetically diverse communities. It is therefore important to know the extent to which T-RFLP profiles provide representative images of microbial communities. In this study, the potential of T-RFLP genetic profiling was investigated (1) to represent real community structures and diversities of complex communities and (2) to reliably reflect relative changes induced to individual components of these communities. For this purpose, artificial communities were designed consisting of 30 defined bacterial 16S rRNA gene clones derived from nine different phyla. Real-time PCR and T-RFLP analyses were used to resolve analytical biases on apparent community structures.

Materials and methods

Characterization of the clones used

Artificial communities (ACs) were designed by mixing cloned 16S rRNA gene sequences that were derived from a gene library constructed in a previous study (Hartmann & Widmer, 2006). Briefly, nucleic acids were extracted from an agricultural soil managed according to biodynamic guidelines. Bacterial 16S rRNA genes were amplified from soil DNA extracts using primers 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1378R (5′-CGGTGTGTACAAGGCCCGGGAACG-3′), and PCR products were cloned using the pGEM®-T Easy Vector System cloning kit according to the manufacturer's recommendation (Promega, Madison, WI). Vector inserts were partially analyzed by sequencing both strands using primer UNI-516-rev (5′-TACCGCGGC[G/T]GCTGGCA-3′: modified from Giovannoni et al., 1988) and corresponding vector primers T7 (5′-TAATACGACTCACTATAGGG-3′) or SP6 (5′-ATTTAGGTGACACTATAG-3′), and by T-RFLP analysis using primers 27F (FAM-labeled) and 1378R as described (Hartmann & Widmer, 2006).

Thirty of the 600 sequences in the gene library (Hartmann & Widmer, 2006) were selected according to two criteria. First, the sizes of terminal restriction fragments (T-RFs) had to be evenly distributed between 50 and 500 basepairs (bp). Second, sequences had to originate from several different phyla as determined by sequence affiliation using the RDP-II database (Cole et al., 2005). The 30 selected clones (Table 1) derived from nine different phylogenetic groups, i.e. Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Delta-/Epsilonproteobacteria, Acidobacteria, Actinobacteria, Firmicutes, Verrucomicrobia, and Genera_incertae_sedis_WS3. They yielded 30 different in silico T-RFs, with sizes ranging from 66 to 494 bp. The sequence identity of the fragments defined by primers 27F and UNI-516-rev ranged between 63% and 95%. The experimental T-RF sizes of the individual sequences differed from the theoretical in silico T-RF sizes by an average of 3.5±1.9 relative migration units (rmu), with a maximum SD of 0.1 rmu among replicates (Table 1). In silico T-RF size (T-RFsize), GC content (T-RFGC), molecular weight of the T-RFs (T-RFweight), GC content of the forward primer site (PrimerGC), and GC content of the sequenced insert between primers 27F and UNI-516-rev (InsertGC) were determined using bioedit 7.0 (Hall, 1999). InsertGC was considered to be representative of the GC content of the total SSU rRNA gene amplicon.

Table 1. Description of the 30 sequences used to assemble the artificial communities

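Columns AC1 to AC5 give the template fraction (ng plasmid DNA) of each sequence in the respective artificial community; all other columns are sequence specifications.

Sequence | GenBank accession | Phylogenetic group | T-RFsize* (bp) | T-RFsize* (rmu) | T-RFweight (Da) | PrimerGC (%) | T-RFGC§ (%) | InsertGC (%) | Tm (°C) | AC1 | AC2 | AC3 | AC4 | AC5
Clone 01 | DQ827724 | Acidobacteria | 150 | 148.4 | 45 425 | 45 | 51.3 | 58.1 | 90.7 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 02 | DQ827727 | Delta-/Epsilonproteobacteria | 132 | 127.6 | 40 068 | 45 | 56.8 | 58.3 | 90.3 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 03 | DQ827728 | Actinobacteria | 80 | 74.5 | 24 356 | 45 | 58.8 | 62.3 | 91.5 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 04 | DQ827729 | Betaproteobacteria | 488 | 482.6 | 147 902 | 45 | 54.3 | 54.9 | 89.3 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 05 | DQ827745 | Alphaproteobacteria | 439 | 434.1 | 133 281 | 45 | 55.1 | 55.8 | 90.5 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 06 | DQ827748 | Actinobacteria | 141 | 137.3 | 42 939 | 50 | 57.5 | 61.5 | 91.3 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 07 | DQ827775 | Actinobacteria | 280 | 281.5 | 84 501 | 47†† | 57.9 | 59.6 | 90.7 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 08 | DQ827785 | Acidobacteria | 93 | 89.0 | 28 301 | 50 | 57.0 | 54.7 | 90.2 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 09 | DQ827795 | Actinobacteria | 66 | 60.7 | 20 100 | 45 | 56.1 | 60.2 | 91.0 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 10 | DQ827816 | Actinobacteria | 170 | 168.1 | 51 792 | 45 | 55.9 | 59.4 | 90.8 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 11 | DQ827830 | Gammaproteobacteria | 494 | 487.1 | 149 925 | 50 | 55.7 | 56.2 | 89.5 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 12 | DQ827831 | Verrucomicrobia | 208 | 205.6 | 63 026 | 50 | 50.5 | 54.5 | 90.2 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 13 | DQ827842 | Alphaproteobacteria | 437 | 430.4 | 132 872 | 50 | 57.9 | 58.4 | 90.3 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 14 | DQ827869 | Alphaproteobacteria | 401 | 395.3 | 121 629 | 50 | 55.9 | 56.2 | 89.7 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 15 | DQ827873 | Alphaproteobacteria | 121 | 116.2 | 36 873 | 45 | 61.2 | 58.9 | 90.3 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 16 | DQ827894 | Acidobacteria | 297 | 293.3 | 90 199 | 45 | 58.3 | 59.6 | 90.8 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 17 | DQ827896 | Delta-/Epsilonproteobacteria | 216 | 211.9 | 65 607 | 50 | 57.4 | 59.4 | 90.5 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 18 | DQ827997 | Actinobacteria | 165 | 164.3 | 49 670 | 45 | 52.4 | 57.0 | 90.5 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 19 | DQ828000 | Alphaproteobacteria | 128 | 123.2 | 39 047 | 45 | 56.3 | 54.7 | 90.0 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 20 | DQ828010 | Delta-/Epsilonproteobacteria | 187 | 184.9 | 56 762 | 50 | 55.1 | 57.7 | 90.7 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 21 | DQ828028 | Firmicutes | 457 | 452.8 | 138 684 | 50 | 53.8 | 54.5 | 89.0 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 22 | DQ828033 | Actinobacteria | 221 | 219.5 | 67 277 | 50 | 55.7 | 59.1 | 90.8 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 23 | DQ828053 | Alphaproteobacteria | 447 | 442.5 | 135 610 | 50 | 57.9 | 58.6 | 91.0 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 24 | DQ828069 | WS3‡‡ | 225 | 224.7 | 68 331 | 50 | 52.4 | 55.7 | 90.0 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 25 | DQ828072 | Alphaproteobacteria | 160 | 158.7 | 48 525 | 45 | 56.3 | 58.0 | 90.5 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 26 | DQ828093 | Verrucomicrobia | 479 | 474.1 | 145 210 | 50 | 56.0 | 56.6 | 90.2 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 27 | DQ828202 | Gammaproteobacteria | 178 | 177.0 | 53 958 | 45 | 55.1 | 57.1 | 89.8 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 28 | DQ828214 | Acidobacteria | 265 | 264.5 | 80 489 | 45 | 52.1 | 56.2 | 90.5 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50
Clone 29 | DQ828304 | Betaproteobacteria | 430 | 425.9 | 130 373 | 50 | 55.6 | 55.9 | 89.7 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10
Clone 30 | DQ828309 | Actinobacteria | 276 | 277.3 | 83 177 | 45 | 56.9 | 58.8 | 91.3 | 0.30 | 0.25 | 0.20 | 0.15 | 0.10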
  • * T-RF sizes were determined in silico (basepairs; bp) and experimentally (rmu, relative migration units, SD ≤0.1 among triplicates).

  • Molecular weight (Da) of the T-RFs as calculated from the in silico sequence information.

  • Different GC contents (%) of the forward primer site. The reverse primer contained no degenerate position.

  • § GC content (%) of the T-RF.

  • GC content (%) of the sequence between primers 27F and UNI-516-rev, which was considered to be representative for the full-length amplicon.

  • Melting temperature of the amplicon determined experimentally (SD ≤0.3 among triplicates).

  • †† At the 27F primer site of clone 07, the degenerate position was missing.

  • ‡‡ Genera incertae sedis.

Plasmids were isolated using the Wizard Plus SV Miniprep Kit (Promega) according to the manufacturer's recommendation. Isolated plasmid DNA was quantified with PicoGreen® (Molecular Probes, Eugene, OR) on a luminescence spectrometer (Perkin Elmer LS 30, Rotkreuz, Switzerland) as previously described (Hartmann et al., 2005). Samples were adjusted to 3 ng μL−1 plasmid DNA and quality was controlled by electrophoresis in agarose gels (1% w/v) and ethidium bromide staining. For all subsequent steps, equal volumes of products were handled.

Real-time PCR quantification

Amplification efficiency of each of the 30 target sequences was determined with real-time PCR using the iQ SYBR Green Supermix Kit (Bio-Rad Laboratories, Hercules, CA). For this purpose, each target sequence was amplified in triplicate using 0.03 ng template DNA with primers 27F and 1378R in a total volume of 15 μL containing 1 × iQ SYBR Green Supermix (Bio-Rad Laboratories) and 0.2 μM of each primer. PCR was performed and monitored using an iCycler iQ Multicolor Real Time PCR Detection System (Bio-Rad Laboratories) and the icycler iq 3.1 software (Bio-Rad Laboratories). PCR amplification was performed using initial denaturation for 3 min at 95 °C, followed by 25 cycles with denaturation for 30 s at 94 °C, annealing for 30 s at 48 °C, and extension for 2 min at 72 °C, with a final extension for 5 min at 72 °C. Melting temperatures (Tm) of amplification products were determined by a step-wise increase of temperature from 55 °C to 95 °C with increments of 0.5 °C for 10 s each.

To compensate for small differences after the concentration adjustment of each sample, a specific vector sequence on the pGEM®-T Easy vector of each plasmid, i.e. from 1603 to 2992 bp, was amplified in triplicate using real-time PCR with primers pGEM1620F (5′-GAGTAAGTAGTTCGCCAG-3′; position 1603–1620 on pGEM T-Easy) and pGEM2975R (5′-ACTGGCCGTCGTTTTACA-3′; position 2975–2992 on pGEM T-Easy). The same template quantities, PCR ingredients, and cycling conditions were used as described for insert amplification, except for the annealing temperature, which was at 54 °C.

Amplification efficiency was evaluated by determining the cycle threshold (Ct) value of each SSU rRNA insert sequence (Ct Insert), followed by normalization with the Ct value of the corresponding vector sequence (Ct Vector). Ct values were determined at 40 relative fluorescence units (rfu) in the ‘PCR baseline subtracted curve fitting’ mode. In addition, amplification product quantity of each insert sequence (ProductInsert) was determined and normalized by the product quantity of the corresponding vector sequence (ProductVector). ProductInsert and ProductVector were measured as fluorescence signal intensities of the corresponding insert or vector sequence after 20 cycles, the cycle number that was used for subsequent T-RFLP analysis.
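The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that 1/Ct (Insert/Vector) denotes the reciprocal of the ratio Ct Insert / Ct Vector, that insert and vector replicates are paired, and the Ct values shown are hypothetical.

```python
from statistics import mean, stdev


def normalized_reciprocal_ct(ct_insert, ct_vector):
    """Return 1 / (Ct_insert / Ct_vector) for one replicate.

    Assumption: this is how '1/Ct (Insert/Vector)' is formed; values above 1
    then indicate that the insert crossed the threshold earlier than the
    vector control.
    """
    if ct_insert <= 0 or ct_vector <= 0:
        raise ValueError("Ct values must be positive")
    return ct_vector / ct_insert


def summarize(replicates):
    """Average and sample SD over (Ct_insert, Ct_vector) replicate pairs."""
    values = [normalized_reciprocal_ct(i, v) for i, v in replicates]
    return mean(values), stdev(values)


# Hypothetical triplicate (Ct_insert, Ct_vector) pairs for a single clone.
example = [(14.8, 15.0), (15.1, 15.0), (14.9, 15.1)]
avg, sd = summarize(example)
print(f"1/Ct(Insert/Vector) = {avg:.3f} +/- {sd:.3f}")
```

The same pattern would apply to ProductInsert and ProductVector, with fluorescence after 20 cycles in place of Ct and a plain insert-to-vector ratio.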

Generation of artificial communities

Five artificial communities (AC1–AC5) were generated by mixing plasmid preparations in defined concentrations (Table 1). For each of the five ACs, a total of 9 ng template plasmid DNA was added to the PCR reaction, so that all five ACs displayed the same richness and total abundance but a different evenness. AC1 was composed of equal amounts of all 30 sequences, i.e. 0.3 ng plasmid DNA each (30 × 0.3 ng=9 ng). In the other four community types, AC2 to AC5, 15 randomly selected (http://www.randomizer.org) AC members were increased in concentration and the remaining 15 were decreased, which maintained the total template quantity at 9 ng (Table 1).
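The design of the five communities can be reproduced with a short Python sketch. The set of up-regulated clone numbers is read from the AC columns of Table 1; the function and constant names are illustrative assumptions.

```python
# Clones whose template amount increases from AC1 to AC5 (read from Table 1).
UP_REGULATED = {5, 7, 8, 9, 10, 13, 15, 17, 18, 19, 20, 21, 24, 27, 28}
N_CLONES = 30
BASE_NG = 0.30   # template per clone in AC1
STEP_NG = 0.05   # change per clone between consecutive communities


def build_community(level):
    """Return {clone number: ng plasmid DNA}; level 0 is AC1, level 4 is AC5."""
    composition = {}
    for clone in range(1, N_CLONES + 1):
        sign = 1 if clone in UP_REGULATED else -1
        composition[clone] = round(BASE_NG + sign * level * STEP_NG, 2)
    return composition


communities = {f"AC{level + 1}": build_community(level) for level in range(5)}

for name, composition in communities.items():
    # Each community should contain 9 ng of template in total.
    print(name, f"total = {sum(composition.values()):.2f} ng")
```

Because 15 clones gain and 15 clones lose the same amount per step, richness and total template stay constant while evenness decreases from AC1 to AC5.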

T-RFLP analysis

For T-RFLP analysis, each of the 30 SSU rRNA gene sequences was analyzed either individually, defined as T-RFInd, or in the mixtures AC1 to AC5, defined as T-RFAC1 to T-RFAC5. T-RFInd of each sequence was generated using 0.3 ng template DNA in order to ensure the same individual template quantity as that used for AC1. Plasmid inserts were amplified with primers 27F (FAM-labeled) and 1378R in a total volume of 20 μL containing 1 × PCR buffer (Qiagen GmbH, Hilden, Germany), 2 mM MgCl2, 0.4 μM of each primer, 0.4 mM dNTP, and 2 U HotStar Taq DNA polymerase (Qiagen). PCR amplification was performed using initial denaturation for 15 min at 95 °C, followed by 20 cycles with denaturation for 30 s at 94 °C, annealing for 30 s at 48 °C, and extension for 2 min at 72 °C, followed by a final extension for 10 min at 72 °C and cooling to 10 °C. The quality and quantity of amplification products were examined by electrophoresis in agarose gels (1% w/v) and ethidium bromide staining. Before digestion, PCR products were purified with Montage PCRμ96 plates (Millipore, Billerica, MA) according to the manufacturer's recommendation. Ten microlitres of purified PCR product were digested overnight at 37 °C with 6 U restriction endonuclease MspI (Promega) in 20 μL supplied 1 × restriction enzyme buffer (Promega). Before electrophoresis, equal volumes of the 30 T-RFInd MspI digestion products were mixed in order to provide electrophoretic conditions comparable to those used for the ACs. This step also allowed application of the same statistical normalization steps on all data. T-RFs were analyzed on an ABI Prism 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA) equipped with 36 cm capillaries filled with POP-7 along with a GeneScan-500 ROX size standard (Applied Biosystems) as previously described (Hartmann et al., 2005). T-RF sizes (rmu) and T-RF quantities (peak heights as rfu) were determined with the genescan analysis software v 3.7 (Applied Biosystems) using a signal detection threshold of 50 rfu. Manual peak calling was performed using genotyper v 3.7 NT (Applied Biosystems).
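How an in silico T-RF size follows from a sequence can be illustrated with a minimal Python sketch. It relies on the known MspI recognition site (CCGG, cut after the first C) and assumes that the fragment is counted from the 5′ end of the labeled forward primer. The study itself used bioedit for this step; the toy sequence and function name below are hypothetical.

```python
MSPI_SITE = "CCGG"  # MspI recognition sequence
CUT_OFFSET = 1      # MspI cuts after the first C (C^CGG)


def in_silico_trf_size(amplicon):
    """Length (bp) of the 5'-terminal fragment, or None if no site is present.

    The amplicon is expected to start with the labeled forward primer.
    """
    position = amplicon.upper().find(MSPI_SITE)
    if position == -1:
        return None
    return position + CUT_OFFSET


# Toy amplicon: primer 27F (with C at the degenerate position), 45 filler
# bases, the first MspI site, and some trailing sequence.
toy_amplicon = "AGAGTTTGATCCTGGCTCAG" + "A" * 45 + "CCGG" + "T" * 30
print(in_silico_trf_size(toy_amplicon))  # 66
```

Only the first site from the labeled end matters for the T-RF, which is why sequences with very different overall composition can still yield similar fragment sizes.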

Data normalization and statistics

T-RF peak height values in a profile were divided by the sum of all peak height values in the corresponding profile. This normalization corrected for small differences in sample load, which would otherwise result in differences in the overall profile intensity among samples (Hartmann et al., 2005). For comparison of different data sets, e.g. obtained from real-time PCR and T-RFLP analyses, data sets were z-transformed where necessary (standardized normal distribution: Sokal & Rohlf, 1987; Siegel & Castellan, 1988). All discriminative statistics were performed using statistica version 6.1 (StatSoft, Tulsa, OK). Correlations were calculated by Pearson's Product–Moment Correlations. The effects of categorical factors, i.e. sequence type and phylogenetic affiliation, were tested for significance using one-way anova. The effects of continuous factors, i.e. Tm, T-RFsize, T-RFweight, T-RFGC, PrimerGC, and InsertGC, were tested using multiple regression models. Pairwise sequence-specific differences in T-RF intensities between T-RFInd and T-RFAC1 were determined using two-sided t-test statistics. The diversity and distribution of the five AC profiles were assessed using the Shannon Diversity (H) index (Shannon & Weaver, 1963). All illustrations were generated using sigmaplot 8.02 (SYSTAT Software, Chicago, IL).
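The two normalization steps, relative peak height and z-transformation, can be sketched as follows. This is an illustration under stated assumptions: the peak heights are hypothetical, and the z-transformation uses the sample standard deviation (n − 1), which the text does not specify.

```python
from statistics import mean, stdev


def relative_abundance(peak_heights):
    """Divide each peak height by the sum of all peak heights in the profile."""
    total = sum(peak_heights)
    if total <= 0:
        raise ValueError("profile must contain a positive total signal")
    return [height / total for height in peak_heights]


def z_transform(values):
    """Center on the mean and scale to unit SD (sample SD, an assumption)."""
    mu = mean(values)
    sd = stdev(values)
    return [(value - mu) / sd for value in values]


# Hypothetical four-peak profile (peak heights in rfu).
profile_rfu = [1200.0, 800.0, 400.0, 1600.0]
relative = relative_abundance(profile_rfu)
standardized = z_transform(relative)

print([round(value, 3) for value in relative])
print([round(value, 3) for value in standardized])
```

Dividing by the profile sum removes differences in sample load, while the z-transformation puts data sets with different units, such as real-time PCR fluorescence and T-RF abundance, on a common scale.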

Results

Amplification efficiency and product quantity

The 30 plasmid preparations that were adjusted to a concentration of 3 ng μL−1 revealed equal band intensities in agarose gel electrophoresis (data not shown). The averaged reciprocal cycle threshold (1/Ct (Insert/Vector)) of the 30 target sequences ranged from 0.96 to 1.06 rfu, showing a mean SD of 0.014±0.007 among triplicates (Fig. 1, black circles). The different clones significantly (P<0.001) influenced 1/Ct (Insert/Vector) and explained 81% (R2) of the variance. The phylogenetic affiliation had no significant (P>0.05) influence on 1/Ct (Insert/Vector).

Fig. 1. Amplification efficiency of the 30 SSU rRNA gene sequences (described in Table 1) assessed by real-time PCR. Normalized reciprocal cycle threshold values of the 30 SSU rRNA gene sequences (1/Ct Insert/Vector; ●) and the corresponding normalized amplification product quantities after 20 PCR cycles (ProductInsert/Vector; open bars) are displayed as averages of triplicates with corresponding SDs.

Averaged quantities of amplification products after 20 amplification cycles (Product(Insert/Vector)) varied between 0.20 and 0.35 rfu, showing a mean SD of 0.02±0.02 among triplicates (Fig. 1, bars). The different clones significantly (P<0.001) influenced the Product(Insert/Vector) and explained 72% (R2) of the variance. The phylogenetic affiliation also revealed a significant influence (P<0.05) on the Product(Insert/Vector) and explained 18% (R2) of the variance.

1/Ct (Insert/Vector) and Product(Insert/Vector) revealed a correlation of r=0.90 (P<0.001), showing that sequences with a lower cycle threshold revealed higher quantities of amplification product after 20 cycles, a point in PCR when amplification curves had not reached their plateaus yet (Fig. 1). The PCR products obtained after 20 cycles were subsequently used for T-RFLP analyses. The estimated GC content (InsertGC) and the experimentally determined melting temperature (Tm) of the insert sequences (Table 1) were highly correlated (r=0.93; P<0.001) and revealed no significant (P>0.05) influence on 1/Ct (Insert/Vector) or Product(Insert/Vector). The different sequences of forward primer sites 27F (expressed as different GC contents; PrimerGC) due to one A/C degenerate position, and the orientation of the insert within the cloning vector revealed no significant (P>0.05) influence on 1/Ct (Insert/Vector) or Product(Insert/Vector).

Differences in T-RF abundance

If all 30 T-RFs gave identical signal intensities, each T-RF would contribute 3.3% to the total T-RFLP profile signal. The averaged relative abundance of T-RFInd varied between 0.7 and 9.7%, with a mean SD of 0.36±0.27%. The relative T-RFInd abundances were significantly correlated (r=0.57, P<0.001) with the amplicon quantity (ProductInsert) as determined after 20 cycles of PCR amplification (Fig. 2a). Sixteen (53%) clones revealed significant (P<0.05) differences between T-RFInd abundances and ProductInsert (asterisks, Fig. 2a). The size of the T-RF (T-RFsize) was perfectly correlated (r=1.00, P<0.001) with the molecular weight of the T-RF (T-RFweight) and both parameters were negatively correlated (r=−0.65, P<0.001) with the relative abundance of the corresponding T-RFInd (Fig. 2b). The GC content of the T-RF (T-RFGC) revealed no significant correlation (r=0.03, P>0.05) with the T-RFInd abundance. Because ProductInsert and T-RFsize (equivalent to T-RFweight) showed the strongest correlations with the relative T-RFInd abundances, and because these correlations had opposite signs (positive for ProductInsert, negative for T-RFsize), the quotient between the two factors was calculated to test whether it would predict the shape of the T-RFLP profile more precisely (Fig. 2c). The quotient ‘ProductInsert/T-RFsize’ revealed a highly significant correlation with the T-RFInd abundance (r=0.86, P<0.001), and only 11 (37%) clones showed statistically significant differences between the quotient and T-RFInd abundances (asterisks, Fig. 2c).
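The quotient predictor and its comparison with the measured abundances can be sketched in Python. The T-RF sizes below are taken from Table 1 (clones 09, 01, 07, 05, and 11); the product quantities and abundances are hypothetical placeholders, so the printed correlations illustrate the calculation only and do not reproduce the reported values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def quotient_predictor(product, trf_size):
    """ProductInsert divided by T-RFsize, one value per sequence."""
    return [p / s for p, s in zip(product, trf_size)]


# Share of each T-RF in a perfectly even profile of 30 fragments.
even_share_percent = 100 / 30  # about 3.3%

trf_size = [66, 150, 280, 439, 494]          # bp, from Table 1
product = [0.30, 0.25, 0.35, 0.22, 0.28]     # hypothetical ProductInsert (rfu)
abundance = [8.0, 4.5, 3.5, 1.5, 1.8]        # hypothetical T-RFInd (%)

quotient = quotient_predictor(product, trf_size)
print(f"r(product, abundance)  = {pearson_r(product, abundance):.2f}")
print(f"r(size, abundance)     = {pearson_r(trf_size, abundance):.2f}")
print(f"r(quotient, abundance) = {pearson_r(quotient, abundance):.2f}")
```

Combining the two factors as a quotient lets a positively and a negatively correlated predictor act in the same direction, which is the rationale given in the text.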

Fig. 2. T-RF abundances (T-RFInd; open bars) of the 30 SSU rRNA gene sequences (described in Table 1) in relation to (a) product quantities after 20 cycles of PCR amplification (ProductInsert; ●), (b) T-RF size (T-RFsize; ▪) and (c) quotient between ProductInsert and T-RFsize (♦). All values are displayed as z-transformed averages of triplicates, sorted according to T-RFInd abundance, and assigned with corresponding SDs. Statistically significant (P<0.05) differences are labeled with an asterisk.

Competition during T-RFLP analysis

Competition during PCR amplification was assessed by comparing T-RF profiles of the individually amplified sequences (T-RFInd) with T-RF profiles of the sequences in the equally mixed and amplified AC1 (T-RFAC1) (Fig. 3a). Although the 30 target T-RFs were unambiguously detectable in the profiles of AC1, these profiles were moderately biased by nontarget peaks. Five T-RFs with signal intensities above the threshold of 50 rfu were detected in the AC profiles, which could not be affiliated to clone sequences (asterisks, Fig. 3a). These signals were below the threshold in the profiles of T-RFInd. Overall, abundances of T-RFInd were similar to those obtained from T-RFAC1 (Fig. 3b), revealing a correlation of r=0.88 (P<0.001). Nevertheless, 20 (67%) sequences revealed significant (P<0.05) differences in the relative T-RF abundances between T-RFInd and T-RFAC1 (Fig. 3b, asterisks).

Fig. 3. Comparison of T-RFLP profiles of 30 SSU rRNA gene sequences (described in Table 1) either after individual amplification and digestion followed by combined electrophoresis (T-RFInd) or after amplification of artificial community 1 (AC1) with equal amounts of each clone. (a) Comparison of original profiles of T-RFInd (upper lane) and T-RFAC1 (lower lane) where each T-RF peak is labeled with the corresponding clone number. Artifact peaks that could not be assigned to the 30 target peaks and that exceeded the detection threshold of 50 rfu are marked with an asterisk. (b) Relative z-transformed abundances of T-RFInd (●) and T-RFAC1 (○) displayed as averages of triplicates with corresponding SDs and sorted according to T-RFInd abundance. T-RFs with statistically significant (P<0.05) different abundances in the two analyses are labeled with asterisks.

Changes in community structures and diversity

Each of the 30 target T-RFs was unambiguously detected in all ACs, and each of them reflected the direction of the concentration change, i.e. increase or decrease, that was set in the templates of AC2–AC5 (Fig. 4). The 30 T-RFs displayed a total average fluorescence signal of 40 002±7334 rfu, revealing only small differences in sample loads in the range of ±18%. Differences in intensities of individual T-RFs were mostly significant (P<0.05) for all ACs. The only nonsignificant (P>0.05) changes were observed for clone 18 between AC3 and AC4 as well as between AC4 and AC5, which also displayed relatively large SDs (Fig. 4). The adjusted change in the template quantity of each community member was 0.05 ng plasmid DNA, which corresponded to a 16.7% difference per step. Differences in T-RF abundances experimentally detected between the different ACs revealed an averaged increase of 16.5±0.5% and a decrease of 16.8±0.6% per step.
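The nominal step size follows directly from the template fractions in Table 1 when the AC1 amount is taken as the reference. The symbols Δm (change per step) and m_AC1 (template per clone in AC1) are notation introduced here for illustration:

```latex
\frac{\Delta m}{m_{\mathrm{AC1}}}
  = \frac{0.05\ \mathrm{ng}}{0.30\ \mathrm{ng}}
  \approx 16.7\%,
\qquad
\frac{4\,\Delta m}{m_{\mathrm{AC1}}}
  = \frac{0.20\ \mathrm{ng}}{0.30\ \mathrm{ng}}
  \approx 66.7\%
```

The second expression gives the cumulative nominal change after the four steps from AC1 to AC5, i.e. from 0.30 ng to 0.50 ng or 0.10 ng per clone.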

Fig. 4. Relative changes of the 30 T-RF abundances in the five different artificial communities AC1 (●), AC2 (○), AC3 (▪), AC4 (◻), and AC5 (▲) with compositions as described in Table 1. Changes are displayed as percent difference to the T-RF signals in rfu of AC1. Values are displayed as averages of triplicates with corresponding SDs.

The experimentally detected Shannon diversity indices (Hdetected) strongly correlated (r=0.99, P<0.001) with the assembled diversities (Hassembled; Fig. 5). Differences in Hdetected values were significant (P<0.01) between all ACs, except between AC1 and AC2, which also had the smallest difference in Hassembled. Hdetected consistently underestimated Hassembled, which allowed determination of a correction factor of 1.040±0.004 for this series of ACs.

Fig. 5. Shannon diversity index (H) values of the five different artificial communities AC1–AC5 described in Table 1. H values were calculated for the artificially assembled communities (Hassembled; hatched bars) and for the detected T-RFLP profile diversity (Hdetected; blank bars). Values for Hdetected are displayed as averages of triplicates with corresponding SDs. Different letters indicate artificial communities with significantly (P<0.05) different Hdetected values.
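The assembled diversities can be recomputed from the template fractions in Table 1. The sketch below assumes the natural logarithm for the Shannon index, which the text does not state, and uses the fact that each community contains 15 up-regulated and 15 down-regulated clones; the function names are illustrative.

```python
from math import log


def template_amounts(level):
    """Template (ng) of all 30 clones; level 0 is AC1, level 4 is AC5."""
    up = 0.30 + 0.05 * level
    down = 0.30 - 0.05 * level
    return [up] * 15 + [down] * 15


def shannon_index(amounts):
    """H = -sum(p_i * ln p_i) over the relative abundances p_i."""
    total = sum(amounts)
    return -sum((a / total) * log(a / total) for a in amounts if a > 0)


h_assembled = [shannon_index(template_amounts(level)) for level in range(5)]

for level, h in enumerate(h_assembled, start=1):
    print(f"AC{level}: H_assembled = {h:.3f}")
```

For the perfectly even AC1 the index equals ln(30), the maximum for 30 members, and it declines as evenness decreases toward AC5. The same function could be applied to normalized T-RF peak heights to obtain a detected diversity.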

Discussion

Differences in amplification efficiencies

Cycle threshold values and PCR product quantities revealed substantial differences among clones (Fig. 1). The differences observed may have been introduced by preferential amplification, a previously described PCR bias that may be caused by stochastic and systematic factors (Reysenbach et al., 1992; Wagner et al., 1994; Suzuki & Giovannoni, 1996; Polz & Cavanaugh, 1998).

Low template concentrations or limiting quantities of PCR reagents may lead to stochastic fluctuations in PCR (Chandler et al., 1997). Long amplicons may also introduce variability, and amplicon lengths of only a few hundred basepairs are recommended for robust PCR amplification (Bustin, 2000; Fortin et al., 2001; Giulietti et al., 2001). However, stochastic variations would result in low reproducibility among replicates. The small variability of Ct values and PCR product quantities among triplicates demonstrated the reliability of PCR and allowed strong stochastic variability to be excluded. The homogeneous amplicon length of c. 1400 bp as amplified by primers 27F and 1378R did not appear to introduce strong stochastic variation. Therefore, variability in amplification efficiency among clones may rather be introduced by systematic factors.

Degenerate primer positions cause differences in binding energies and therefore may lead to preferential amplification of sequences containing G/C-rich primer sites (Polz & Cavanaugh, 1998). The A/C degenerate position of the 27F primer (Table 1; PrimerGC) indicated different binding energies at this site that would result in a c. 3 °C difference in melting temperature. However, the influence of the GC content of the forward primer on the amplification efficiency was not significant. The reverse primer contained no degenerate positions. Lueders & Friedrich (2003) have demonstrated that primer binding efficiencies affect the relative amplicon frequency of different target genes if highly degenerate primers are used, e.g. nine positions per primer pair (equivalent to Tm difference of c. 16 °C). When moderately degenerate primers were applied, e.g. one position per primer pair (Tm difference of 4 °C), no effects on amplification efficiencies were detected. In order to minimize this bias, highly degenerate primers should be avoided for community profiling.

Preferential amplification may also be induced by different flanking sequences of the target (Hansen et al., 1998) or different accessibility of the target within the genome (Farrelly et al., 1995; Trotha et al., 2002). By using cloned homologous 16S rRNA gene fragments, the influence of these factors was intentionally excluded. This allowed for quantification of differences in amplification efficiencies in relation to the different but homologous 16S rRNA gene fragments. Although the orientation of the insert within the vector might influence amplification efficiency, no correlation was found between insert orientation and amplification efficiency.

Discrepancies in template to product ratios may originate from different PCR plateau levels reached by each sequence at larger cycle numbers, e.g. induced by substrate limitation, polymerase inactivation or inhibition, and different template denaturation and reannealing kinetics (Mathieu-Daudé, 1996; Suzuki & Giovannoni, 1996; Kainz, 2000). It has been demonstrated that different educt quantities may result in similar product quantities, if amplification reaches the plateau (Suzuki & Giovannoni, 1996; Kurata et al., 2004). Therefore, the plateau effect may strongly limit the comparability of product quantities at larger cycle numbers. Other studies, however, reported a small influence of increasing cycle number on the reproducibility of bacterial (Osborn et al., 2000) and archaeal (Lueders & Friedrich, 2003) T-RFLP profiles. Small PCR cycle numbers as applied in the current study, i.e. 20 cycles, allowed for the comparison of product quantities in the linear phase of amplification before approaching the PCR plateau (data not shown) and therefore should mainly exclude this effect as suggested previously (Trotha et al., 2002).

Differences in the GC contents and consequently in the melting temperatures of the target sequence may lead to variations in amplification efficiency and product quantity (Reysenbach et al., 1992; Dutton et al., 1993). This bias may become more dominant with larger cycle numbers. The highly significant correlation (r=0.93, P<0.001) of GC contents and melting temperatures of the sequences (Table 1) indicated the representativeness of the GC content of the first 500 bp of the amplicons for the melting temperature of the total amplicon. However, the influence of GC content or melting temperature on the amplification efficiency was not significant. Therefore, other possible explanations for sequence-specific variability of amplification efficiencies such as the formation of different secondary structures of target sequences or amplicons that may affect primer accessibility or elongation efficiency during PCR may apply (Hansen et al., 1998).

These considerations indicated that differences in amplification efficiencies of the 30 sequences were based neither on sequence-specific factors such as primer binding efficiencies, melting temperatures, or flanking regions, nor on stochastic events, but rather appeared to depend on unknown clone-related differences such as the formation of secondary structures. Adjustment of PCR conditions by adding sufficient template and PCR components, applying low cycle numbers, and using primers with high binding temperatures will help to reduce PCR-related biases, although it will not eliminate them. Based on the results presented, preferential amplification will induce considerable shifts in the T-RFLP community profiles and limit the quantitative assessment of real community compositions using PCR-based approaches. The maximum variability of a factor of 3.5±0.1 between the sequence with the lowest (clone 16) and the highest (clone 18) amplification efficiency (Fig. 2a) may give an estimate of the extent of the bias induced by PCR amplification, while the small SD indicated the reproducibility among replicates. Furthermore, it is difficult to predict the extent of this bias for more complex environmental communities that may be differently influenced by co- and counteractive effects of the factors listed above.

Non-PCR-based differences in T-RF abundances

Strong differences of a factor of 14.1±2.7 in T-RFInd abundances (between clone 03 and clone 21) or 14.2±0.3 in T-RFAC1 abundances (between clone 03 and clone 16) were observed although PCR templates were adjusted to the same quantities (Fig. 3b). Additional steps required for the preparation of T-RFLP profiles of individual sequences resulted in increased SDs as compared with AC1. The factor 14 between T-RF abundances could not be solely explained by the different amplification efficiencies of the individual sequences, i.e. the factor 3.5 mentioned above. Other influences of PCR downstream analyses appeared to affect the T-RFLP profiles to a comparable extent (Fig. 2). Artifact peaks observed in the T-RFLP profiles (Fig. 3a, asterisks) may be attributable to nondigestible single-stranded DNA (Egert & Friedrich, 2003) or to the formation of heteroduplexes and chimeras (Wang & Wang, 1996; Qiu et al., 2001; Kanagawa, 2003). The detection of artifact T-RFs predominantly in the artificial community samples and not in the individually amplified profiles suggested that some heteroduplexes and/or chimeras were generated, although only to a rather small extent (Fig. 3a). In contrast to communities with low complexity of four different species evaluated by Lueders & Friedrich (2003), more complex communities may be affected more strongly by the formation of chimeras and heteroduplexes. In addition, incomplete or unspecific enzymatic restrictions that generate artifact T-RFs may also bias T-RFLP profiles (Osborn et al., 2000), but can mostly be avoided by adding sufficient restriction enzymes and optimizing the digestion conditions.

Electrophoresis of the fragmented PCR product may influence the detection of T-RF abundances. Reduced efficiency in capillary electrophoretic injection (Irwin et al., 2003) and reduced peak heights due to broader peaks (Kitts, 2001), both occurring at larger T-RF sizes, appeared to bias the detection of T-RF abundances (Fig. 2b). It is still being critically discussed in the scientific literature whether analyses of peak height or peak area are preferable (Grant & Ogilvie, 2003; Lueders & Friedrich, 2003; Kurata et al., 2004). Combining amplification efficiency (Fig. 2a) and T-RF size (Fig. 2b) into one quotient allowed the prediction of the T-RF abundance of each sequence with high accuracy (r=0.86, P<0.001) and few exceptions (Fig. 2c). This result indicated that, independent of the restriction endonuclease used, it is the T-RF size that influences T-RF signal intensities. Therefore, sequences that reveal higher amplification efficiency and produce smaller T-RFs will show higher T-RF abundance. For this study, the maximum overall T-RFLP bias was a factor of about 14. PCR contributed a factor of 3.5, and therefore post-PCR analysis, including digestion and electrophoresis, accounted for a bias factor of about 4 (14/3.5). This indicated similar extents of biases related to PCR and post-PCR analyses.
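The arithmetic behind the bias factors and the quotient of amplification efficiency and T-RF size can be sketched as follows. This is an illustrative sketch only: the clone values in `clones` are hypothetical and are not the measurements shown in Fig. 2, the helper names `size_adjusted_quotient` and `pearson_r` are introduced here for illustration, and the decomposition assumes that the PCR and post-PCR biases combine multiplicatively, as the division of 14 by 3.5 implies.

```python
import math


def size_adjusted_quotient(amplification_efficiency, trf_size_bp):
    """Quotient of amplification efficiency and T-RF size, used as a predictor."""
    return amplification_efficiency / trf_size_bp


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Bias factors reported in the text; multiplicative combination is assumed.
total_bias = 14.0
pcr_bias = 3.5
post_pcr_bias = total_bias / pcr_bias

# Hypothetical clones: (relative amplification efficiency, T-RF size in bp,
# observed relative T-RF abundance in %). Invented for illustration only.
clones = [
    (1.0, 400, 2.0),
    (2.0, 200, 6.0),
    (3.5, 100, 20.0),
    (1.5, 300, 4.0),
    (2.5, 150, 9.0),
]

quotients = [size_adjusted_quotient(eff, size) for eff, size, _ in clones]
abundances = [abundance for _, _, abundance in clones]
r = pearson_r(quotients, abundances)

print(f"post-PCR bias factor: {post_pcr_bias:.1f}")
print(f"correlation of quotient with abundance: r = {r:.2f}")
```

With measured data, the quotients would be computed from the amplification efficiencies and T-RF sizes of the individual clones and compared with the observed T-RF abundances in the same way.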

Competition in multi-template samples

Although T-RF abundances varied significantly between individually amplified and combined amplified sequences in 67% of all cases, the profiles were highly correlated (r=0.85, P<0.001) and showed only a few T-RFs with larger discrepancies (Fig. 3b). These differences can mainly be attributed to amplification competition during PCR in the mixed samples. Amplification competition in complex mixtures of sequences may be caused by variations in reaction kinetics introduced by different (1) primer site affinities due to degenerate positions and mismatches, (2) melting temperatures of templates due to different GC contents, and (3) secondary structures and accessibilities (Reysenbach et al., 1992; Hansen et al., 1998; Polz & Cavanaugh, 1998). The interaction of these factors may alter the amplification efficiency of individual sequences in multi-template samples and appeared to have influenced the T-RFLP profiles in this study. These differences between individual and multi-template profiles were highly reproducible among triplicates.

Reliability for detecting quantitative changes in community structures

Although absolute quantities of the individual members of the artificial communities were not reflected in the T-RFLP profiles, relative changes in the structures (Fig. 4) and diversities (Fig. 5) were reliably detected. The 120 quantitative changes introduced into the community were significantly reflected in the profiles with only two (1.7%) exceptions (Fig. 4). Furthermore, the relative differences in the community composition (16.7% per step) were quantitatively and precisely reflected by an overall mean of 16.6±0.5% per step (Fig. 4). The constant ratio between adjusted and experimentally detected diversities allowed for determination of a correction factor. It is important to note, however, that in this study all community members were detectable in the T-RFLP profiles, which is certainly not the case when analyzing highly complex soil microbial communities (Hartmann & Widmer, 2006). Therefore, application of correction factors to diversity values determined for environmental communities may not be feasible. This is supported by results from applying diversity indices to T-RFLP data, which have indicated that such data may be strongly biased and that analyses of community composition may represent a preferable approach (Hartmann & Widmer, 2006; Blackwood et al., 2007).
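The idea of deriving a correction factor from a constant ratio between adjusted and detected diversities can be sketched as below. This is a hedged illustration rather than the calculation performed in this study: it assumes the Shannon index as the diversity measure, which this section does not specify, and the function names and all abundance values are hypothetical.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p * ln p) over non-zero relative abundances."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


def correction_factor(adjusted_diversities, detected_diversities):
    """Mean ratio of adjusted (true) to detected diversity across samples.

    Only meaningful if the ratio is roughly constant, as assumed here.
    """
    ratios = [a / d for a, d in zip(adjusted_diversities, detected_diversities)]
    return sum(ratios) / len(ratios)


# Hypothetical artificial communities: template quantities as adjusted, and
# the corresponding T-RF abundances as they might appear in a biased profile.
adjusted_communities = [[10, 10, 10, 10], [25, 10, 3, 2]]
detected_profiles = [[30, 8, 1, 1], [33, 5, 1, 1]]

adjusted = [shannon_index(c) for c in adjusted_communities]
detected = [shannon_index(p) for p in detected_profiles]
factor = correction_factor(adjusted, detected)

print(f"correction factor: {factor:.2f}")
```

Such a factor presupposes that every community member yields a detectable T-RF, which is the limitation for environmental samples noted above.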

It has been suggested that profiling biases may remain constant between different samples of the same analysis (Casamayor et al., 2002; Zhou et al., 2002) and that profiles of bacterial or archaeal SSU rRNA genes may be relatively stable to variations in PCR conditions such as cycle numbers (Osborn et al., 2000; Ramakrishnan et al., 2000; Lueders & Friedrich, 2003). Low variability of T-RF peak height or peak area among replicates was reported in several studies analyzing SSU rRNA genes of bacterial or archaeal communities (Lukow et al., 2000; Osborn et al., 2000; Dollhopf et al., 2001; Dunbar et al., 2001; Lueders & Friedrich, 2003). It is important to note that the magnitude of T-RFLP biases may strongly depend on the target gene or PCR primer sets used (Schmalenberger et al., 2001; Casamayor et al., 2002; Lueders & Friedrich, 2003).

This study demonstrated that differences in amplification efficiency and capillary electrophoresis are primary factors along the T-RFLP analysis chain that may alter the relative abundance of T-RFs in the profiles by factors up to 14. Owing to these biases, representation of the real microbial community structure and diversity in a given sample will probably not be possible. However, the T-RFLP approach appeared to reliably detect the relative changes in microbial community structures with a high quantitative precision, which is a prerequisite for monitoring effects on microbial communities.

Acknowledgements

The authors wish to acknowledge Roland Kölliker for support in statistical analyses and Jürg Enkerli for helpful comments on this manuscript. The project was supported by funding from the Swiss National Science Foundation (SNF).

Footnotes

  • Editor: Jim Prosser

References
