Neutral assembly of bacterial communities

Stephen Woodcock, Christopher J. van der Gast, Thomas Bell, Mary Lunn, Thomas P. Curtis, Ian M. Head, William T. Sloan
DOI: http://dx.doi.org/10.1111/j.1574-6941.2007.00379.x 171-180 First published online: 1 November 2007


Two recent, independent advances in ecology have generated interest and controversy: the development of neutral community models (NCMs) and the extension of biogeographical relationships into the microbial world. Here these two advances are linked by predicting an observed microbial taxa–volume relationship using an NCM, which provides the strongest evidence so far for neutral community assembly in any group of organisms, macro or micro. Previously, NCMs have only ever been fitted using species-abundance distributions of macroorganisms at a single site or at one scale, and parameter values have been calibrated on a case-by-case basis. Because NCMs predict a malleable two-parameter taxa-abundance distribution, this is a weak test of neutral community assembly and, hence, of the predictive power of NCMs. Here the two parameters of an NCM are calibrated using the taxa-abundance distribution observed in a small waterborne bacterial community housed in a bark-lined tree-hole in a beech tree. Using these parameters, unchanged, the taxa-abundance distributions and taxa–volume relationship observed in 26 other beech tree communities whose sizes span three orders of magnitude could be predicted. In doing so, a simple quantitative ecological mechanism to explain observations in microbial ecology is offered and, simultaneously, the predictive power of NCMs is demonstrated.

  • community assembly
  • dispersal
  • insular communities
  • mathematical model
  • neutral model


Scale is a problem in microbial ecology. Even using the most up-to-date molecular methods, one is limited to observing and characterizing very small samples from what are ostensibly very large naturally occurring microbial communities. The considerable technical sophistication and skill required to collect, analyse and enumerate microbial populations correctly in environmental samples can sometimes obscure just how small, in relative terms, samples are. Take, for example, a large clone library of, say, 500 clones derived from a 1 g soil sample; the soil sample itself may contain as many as 10^9 individual organisms, and applied microbial ecologists will generally be interested in the services provided by the communities at a scale somewhat larger than a single 1 g sample. By analogy, when there are currently 6 × 10^9 humans in the world, a single sample of a few hundred individuals is unlikely to be sufficient to characterize the global distribution of any human trait unless it is extremely homogeneous. Molecular methods are advancing so quickly that in the near future it may be possible to get close to a complete census in a sample (Sogin et al., 2006) but, even then, a 1 g sample is small if one's aspirations are to characterize an entire field of soil. This disparity between sample size and community size is enormous and far greater than any comparable sampling issues in mainstream ecology. Consequently, patterns are perceived through a sparse, often distorted (Sloan et al., 2007) map of the microbial world.

Thus, the modus operandi in microbial ecology is extrapolation from very small samples, and the fact that microbial systems are always observed at a scale much smaller than the scale at which they are ultimately to be characterized amplifies some of the challenges faced in classical ecology. In scaling from a leaf to the ecosystem to the landscape and beyond (Jarvis & McNaughton, 1986), there must be an understanding of how information is transferred from fine scales to broad scales and vice versa (Levin, 1992). This problem of cascading information and ecological process understanding through a hierarchy of different scales is being tackled using mathematical models by ‘landscape ecologists’ (Wu & Hobbs, 2002). The models integrate mathematical descriptions of plausible plot-scale ecological processes to form patterns at the landscape scale. In going from rRNA genes in a sample to the sample itself and beyond, microbial ecologists face similar technical and conceptual challenges (Sloan et al., 2007) but with the added challenge of having an uncertain picture of the broad-scale patterns (Woodcock et al., 2006). This has consequences for the complexity of the models that can reasonably be used. Simon Levin, in his MacArthur Award Lecture (Levin, 1992) on ‘The problem of pattern and scale in ecology’, described the essence of modelling thus: ‘to facilitate the acquisition of this understanding (scaling), by abstracting and incorporating just enough detail to produce observed patterns. A good model does not attempt to reproduce every detail of the biological system; the system itself suffices for that. Rather, the objective of a model should be to ask how much detail can be ignored without producing results that contradict specific sets of observations …’ This judicious paradigm has been embraced by theoretical ecologists, and a wide variety of conceptual models of potential ecological pattern-forming mechanisms have been encoded into mathematical models and then shown to produce observed patterns in the spatial distribution and relative abundance of taxa. All are simplifications of the system being modelled. The majority serve, in some way, to whittle down the set of plausible mechanisms that can lead to a particular pattern; theoretical ecologists are all too aware of the ‘same behaviour implies same mechanism’ fallacy and, since their representations of the ecological processes are rarely calibrated against observations, few say emphatically that their models are correct.

While validating the plausibility of a model through mathematics is vital, the paucity of attempts to go beyond this and validate the models themselves is frustrating (Belovsky et al., 2004); indeed Schoener (1972) cautioned against the ‘constipating accumulation of untested models’ more than 30 years ago. In microbial ecology, one is less constrained by the broad-scale patterns because they are so difficult to identify, and thus theoretical microbial ecology, if pursued in the same fashion, unconstrained by biological reality, could significantly add to this uncomfortable blockage of untested and perhaps untestable models. Thus one further proviso is required: for a model ultimately to be of some practical use, it should be predictive. This means that a model calibrated at one site, one scale or on the basis of one set of ecological processes should be capable of predicting phenomena at different sites, at different scales or that pertain to seemingly unrelated mechanisms. Harte (2004) implicitly combines the paradigm cited by Levin (1992), which calls for parsimony, with the requirement for prediction by suggesting that theories are of most interest when the ratio of the number of predictions that they make to the number of assumptions and adjustable parameters is large. It is for this reason that, when the roles of chance and dispersal limitation in shaping patterns of microbial community composition were investigated, the simplest possible conceptual model of community assembly that incorporated these factors was selected (Curtis et al., 2006; Sloan et al., 2006, 2007): a simple neutral community assembly model (Hubbell, 2001) in which the composition at a local scale is shaped only by random immigration, birth and death events.

Neutral community assembly models (NCMs) (Bell, 2000; Hubbell, 2001) have been shown to reproduce the distribution of taxa abundances in a wide range of different biological communities. However, in most previous applications of neutral theory the model parameters are selected to minimize the difference between observed and predicted taxa-abundance distributions. The merit of neutral theory, over and above other hypotheses on the formation of biological communities, is then argued on the basis of (often small) differences in a goodness-of-fit statistic for calibrated taxa-abundance distributions (McGill, 2003; Volkov et al., 2003, 2006; Chave et al., 2006). These arguments can seem rather arcane when there has been little attempt to validate the models (Harte, 2003). In addition, microbial ecologists were until recently precluded from the debate because, for most environments, only a small fraction of the diversity can be experimentally defined (Curtis et al., 2002). Despite the advances in molecular methods for characterizing naturally occurring microbial communities in situ, the disparity in scale between sample and community size and some inherent limitations of the methods conspire to make a purely empirical definition of a taxa-abundance distribution at a single site very difficult. Sloan et al. (2006) circumvented this problem for microbial communities by deriving a method for calibrating Hubbell's neutral theory based on a theoretical relationship between the mean relative abundance of common taxa and the frequency with which they are expected to appear in multiple similarly sized samples. Thus, in Sloan et al. (2006, 2007), the criterion of a parsimonious calibrated model capable of reproducing patterns is met. However, neutral theory is controversial and its parsimony still grates with many who seek to provide evidence that it cannot explain all the variance in real communities of macroorganisms (e.g. McGill et al., 2006).
Calibrating an NCM at one site or one scale is not a convincing endorsement of the model's underlying assumptions, and many alternative models could potentially reproduce either the taxa-abundance distributions (McGill, 2003) or the abundance-frequency relationships (He & Gaston, 2003) observed. A strong test of neutral theory has to demonstrate its predictive power and this has never previously been achieved (Condit et al., 2002; Gotelli & McGill, 2006; McGill et al., 2006). The authors provide the first demonstration of an NCM calibrated using microbial taxa abundances at one site and one scale accurately predicting very different taxa-abundance distributions and the observed taxa–volume relationship across a range of scales and sites.

Hubbell's neutral model makes predictions about how the richness and abundance distribution of taxa on island-like communities will be affected by community size and immigration. Indeed, the genesis of his NCM came about through an attempt to present a unified theory that combined the Theory of Island Biogeography (MacArthur & Wilson, 1967), which makes predictions on species richness within insular communities, with predictions on relative abundance of taxa. Ascertaining whether these predictions are borne out in reality requires taxa abundance data from a set of insular communities for which community size or immigration varies significantly but which are very similar in all other respects. If the NCM presents an adequate representation of the ecological processes that shape the community structure, then it should be possible to explain a significant proportion of the variance in the taxa-abundance distributions for all the communities by employing a single set of parameters calibrated at one site. However, datasets for insular communities of significantly different sizes or with different degrees of isolation that are housed in very similar ecosystems are rare. Bell et al. (2005) published just such a dataset for water-borne bacteria living in tree holes in beech trees in the same woodland. Samples were taken from 29 rainwater-filled, bark-lined holes, each of which housed a small ecosystem. The range of volumes of these habitats spanned three orders of magnitude; the smallest was a mere 50 mL, the largest 18 000 mL. Bell et al. (2005) reported that bacterial species richness increased with tree hole volume in a manner that could be modelled using a single power law relationship, which hints at some consistent process of community assembly. All the fluid was removed from the tree holes and was homogenized by stirring. Bacterial richness was determined from denaturing gradient gel electrophoresis (DGGE) analysis (Bell et al., 2005).
Epifluorescence microscopy was performed (Porter & Feig, 1980) and the density of organisms in the tree holes was revealed to be around 10^5 mL^-1. The sample size analyzed for all the tree holes was 5 mL (c. 5 × 10^5 individuals). Physically and chemically, the bacterial communities were similar; they were all supported by similar nutrients (decaying leaf litter), relatively stagnant, but subject to invasion events from either airborne or rainwater-borne microorganisms. The greatest geographic distance between any two trees in the study was around two miles. The distributions of the relative abundance of taxa in the samples were not reported in Bell et al. (2005) but are used here (e.g. Figs 1 and 2). These were determined by the relative intensity of bands on the DGGE gels. Because of detection limitations inherent to the DGGE analysis (e.g. Cocolin et al., 2000; Leclerc et al., 2004; Woodcock et al., 2006) used in the initial study, only the top few ranked taxa were observed at each site; hence the abundances in the dataset were normalized relative only to the total abundances of these most common taxa. Quantification of the absolute abundances of taxa in a sample using DGGE is open to criticism (Heuer et al., 2001); however, exactly the same technique was used for each tree hole and, therefore, exactly the same biases applied to each sample. Thus, for the comparative analysis of relative abundances presented here, it is fair to say that significant shifts in DGGE patterns reflect real shifts in the bacterial community composition. What is immediately striking from these data is just how dramatically and systematically the shape of the taxa-abundance distributions changes between tree holes, with large communities exhibiting a much more even distribution than small ones. This is all the more remarkable because the sample size at each site was exactly the same.
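The arithmetic behind these sizes can be made explicit with a short sketch. It assumes the order-of-magnitude density of 10^5 cells per mL quoted above and the fixed 5 mL sample; the function names are illustrative and not taken from the study.

```python
# Rough community sizes implied by the figures quoted in the text.
DENSITY_PER_ML = 1e5     # cells per mL (order-of-magnitude estimate)
SAMPLE_VOLUME_ML = 5.0   # volume analysed for every tree hole


def community_size(volume_ml, density=DENSITY_PER_ML):
    """Estimated number of individuals in a tree hole of the given volume."""
    return volume_ml * density


def sampled_fraction(volume_ml, sample_ml=SAMPLE_VOLUME_ML):
    """Fraction of the whole community captured by the fixed-size sample."""
    return sample_ml / volume_ml


# Smallest hole, the 11 000 mL hole shown in the figures, and the largest hole.
for volume in (50, 11_000, 18_000):
    print(volume, community_size(volume), sampled_fraction(volume))
```

Under these assumptions the same 5 mL sample covers a tenth of the smallest community but well under a thousandth of the largest, which is why identical sample sizes can still reveal very different distributions.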


Fig. 1. The ranked taxa-abundance distribution (squares) observed in a 5 mL sample from the smallest tree hole, which had a volume of 50 mL. The line shows the expected rank-abundance distribution obtained by averaging 1000 realizations of ranked abundances simulated by Hubbell's neutral model with NT=5 × 10^6, θ=15 and m=1.0 × 10^-6.


Fig. 2. Ranked taxa-abundance distributions for a selection of seven of the 29 tree holes, ranging in volume from 11 000 to 50 mL. The lines represent taxa-abundance distributions predicted by the neutral model with m=10^-6 and θ=15, calibrated using data from the 50 mL tree hole.

Given the proximity of the tree holes and the similarity of their environments, what causes the differences in the taxa-abundance distributions? If one were to assume, initially, that the tree hole environments were identical in everything except their volume and that the same forces act to shape the community composition, then can volume alone explain the differences? It is shown that, accounting for tree hole volume, the distributions are significantly different and this hypothesis is rejected. Thus the distributions of taxa abundances in the tree hole samples do not derive from the same underlying distribution. Therefore, there is no benefit to be gained in testing what many commentators believe should be the null hypothesis in any study of taxa abundances (McGill, 2003; Gotelli & McGill, 2006): that a particular arbitrary parametric distribution, such as the lognormal, fits the data. Then the hypothesis that the tree holes house distinct, homogeneous, island-like communities that are neutrally assembled from a single metacommunity with a consistent rate of random immigration into each tree hole is tested and could not be rejected. This is achieved by calibrating an NCM using the taxa-abundance distribution from the smallest tree hole, predicting the taxa-abundance distributions from all other tree holes and then testing at the 5% significance level whether the simulated and observed taxa-abundance distributions are the same.

Materials and methods

Do the samples derive from the same distribution?

Firstly, the hypothesis that the same structuring forces shaped the bacterial tree hole communities and that, consequently, the sample taxa-abundance distributions derive from the same underlying distribution was tested. Synthetic populations were generated for each environmental sample by selecting 5 × 10^5 individuals with replacement from the observed taxa-abundance distributions. Species were indexed by their ranked abundance, with the most abundant being ranked 1, the second most abundant 2, etc. The abundances reported for the observed data were relative to the total abundance of the taxa that appeared on the DGGE gels. Therefore the abundances of synthetic populations were also normalized by the same number of top-ranked taxa. A Kolmogorov–Smirnov test was then applied to every combination of two synthetic samples to determine how likely they were to have come from the same underlying distribution.
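The resampling and comparison steps can be sketched as follows. This is a minimal illustration rather than the authors' code: the two abundance vectors are hypothetical, the synthetic sample is smaller than the 5 × 10^5 individuals used in the study so that it runs quickly, and only the two-sample Kolmogorov–Smirnov statistic is computed (converting it to a P-value is left out).

```python
import random
from collections import Counter


def synthetic_ranked_counts(rel_abund, n_individuals, rng):
    """Resample individuals with replacement; return counts ordered by rank."""
    taxa = range(len(rel_abund))
    draws = rng.choices(taxa, weights=rel_abund, k=n_individuals)
    counts = Counter(draws)
    return sorted((counts.get(i, 0) for i in taxa), reverse=True)


def ks_statistic(counts_a, counts_b):
    """Largest gap between the two cumulative distributions over rank."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    n_ranks = max(len(counts_a), len(counts_b))
    cum_a = cum_b = 0.0
    largest_gap = 0.0
    for i in range(n_ranks):
        cum_a += (counts_a[i] if i < len(counts_a) else 0) / total_a
        cum_b += (counts_b[i] if i < len(counts_b) else 0) / total_b
        largest_gap = max(largest_gap, abs(cum_a - cum_b))
    return largest_gap


rng = random.Random(0)
# Hypothetical normalized band intensities for two tree holes (not study data).
hole_a = [0.60, 0.25, 0.10, 0.05]
hole_b = [0.30, 0.25, 0.20, 0.15, 0.10]
sample_a = synthetic_ranked_counts(hole_a, 5_000, rng)
sample_b = synthetic_ranked_counts(hole_b, 5_000, rng)
print(ks_statistic(sample_a, sample_b))
```

With synthetic samples as large as those in the study, even modest gaps between cumulative distributions correspond to very small P-values.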

Is the community neutrally assembled?

Secondly, the hypothesis that the samples are from neutrally assembled communities fed by immigrants from a single source community with a constant immigration rate was tested. To do this, Hubbell's NCM was calibrated using the taxa-abundance distribution in the sample drawn from the smallest tree hole. In Hubbell's model it is assumed that the distribution of taxa abundances in the source metacommunity is described by a log-series distribution with a single parameter θ, which Hubbell calls the fundamental biodiversity number because it indexes the overall biodiversity. In local communities, which are assumed to be saturated with individuals, when an individual organism dies it is replaced either, with probability m, by an immigrant drawn at random from the source community or, with probability 1−m, by reproduction from within the local community. Given local reproduction, the probability that any particular taxon reproduces depends on its relative abundance, which requires knowledge of the number of individuals in the local community, NT. Thus the shape of the taxa-abundance distribution for a neutrally assembled community depends on the values of three parameters: NT, θ and m. NT was estimated using the tree hole volume and the density of organisms [O(10^5) mL^-1]. θ and m were considered free parameters that were adjusted to give the best least-squares fit between the observed and simulated expected taxa-abundance distribution for the smallest tree hole. Least-squares fitting was adopted because it is liable to be biased towards fitting the model to the higher observed relative taxa abundances; the authors have more confidence in these than in the lower abundances estimated from DGGE band intensities. To simulate the sample distribution, a realization of the relative abundance of taxa in the metacommunity [symbol not recovered], where SM is the number of different taxa in the metacommunity, was first generated. To do this, it is noted (after Volkov et al., 2003) that, provided θ/SM is small, the log-series distribution can be approximated by a gamma distribution. It is shown in the appendix that this leads to a simple method for generating realizations of the relative abundance of taxa in the metacommunity [symbol not recovered] by sampling at random from gamma distributions. For any given realization of [symbol not recovered], Sloan et al. (2007) show that the distribution of taxa abundances in the local neutrally assembled community [symbol not recovered] is Dirichlet [parameters not recovered] and give a simple algorithm for generating a realization of [symbol not recovered]. Sloan et al. (2007) also show that the distribution of taxa abundances in a sample of size NS (i.e. what is observed on the DGGE gel) from that distribution is Dirichlet [parameters not recovered]. Therefore, given any pair of parameters m and θ, it was straightforward to simulate 1000 realizations of the taxa-abundance distribution in a sample of NS individuals from a tree hole comprising NT individuals. These were then averaged to give the expected taxa-abundance distribution. For the purposes of a comparative analysis, since the abundances reported for the observed data were relative to the total abundance of the taxa that appeared on the DGGE gels, the abundances of synthetic populations were also normalized by the same number of top-ranked taxa.
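The death-and-replacement rule described above can be illustrated with a toy individual-based simulation. This is a sketch under stated assumptions, not the Dirichlet-based algorithm of Sloan et al. (2007): the metacommunity is drawn by normalizing independent gamma variates with shape θ/SM (one common way of using the gamma approximation, which may differ from the appendix), the parent may coincide with the individual that just died, and the community size, immigration probability and number of deaths are deliberately tiny so that the code runs in moments.

```python
import random
from collections import Counter
from itertools import accumulate


def metacommunity(theta, n_taxa, rng):
    """Relative abundances from normalized gamma draws (assumed scheme)."""
    weights = [rng.gammavariate(theta / n_taxa, 1.0) for _ in range(n_taxa)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_local(p_meta, n_local, m, n_deaths, rng):
    """Apply the death/replacement rule; return the taxon of each individual."""
    taxa = range(len(p_meta))
    cum = list(accumulate(p_meta))
    community = rng.choices(taxa, cum_weights=cum, k=n_local)
    for _ in range(n_deaths):
        dead = rng.randrange(n_local)
        if rng.random() < m:
            # Replaced by an immigrant drawn from the metacommunity.
            community[dead] = rng.choices(taxa, cum_weights=cum, k=1)[0]
        else:
            # Replaced by the offspring of a randomly chosen local individual.
            community[dead] = community[rng.randrange(n_local)]
    return community


def ranked_relative_abundance(community):
    """Relative abundances sorted from most to least abundant taxon."""
    counts = sorted(Counter(community).values(), reverse=True)
    return [c / len(community) for c in counts]


def expected_ranked(realizations):
    """Average abundances rank by rank, treating missing ranks as zero."""
    longest = max(len(r) for r in realizations)
    sums = [0.0] * longest
    for r in realizations:
        for i, a in enumerate(r):
            sums[i] += a
    return [s / len(realizations) for s in sums]


rng = random.Random(1)
p_meta = metacommunity(theta=15, n_taxa=200, rng=rng)  # toy values throughout
runs = [
    ranked_relative_abundance(simulate_local(p_meta, 500, 0.01, 20_000, rng))
    for _ in range(5)
]
print(expected_ranked(runs)[:5])
```

Direct simulation of this kind is impractical at the community sizes in the study (millions of individuals with m of order 10^-6), which is presumably why analytical results for the stationary distribution are used instead.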


Two hypotheses were tested. Firstly, that the taxa-abundance distributions observed in all the tree holes derive from random sampling of the same distribution. The P-values were so low that the hypothesis that the samples are all from the same underlying distribution can be confidently rejected at the 0.05% level (i.e. all P-values <0.0005). Secondly, the hypothesis that the communities are neutrally assembled from the same source community and that the taxa-abundance distributions could be reproduced by an NCM was tested. A stringent test of this was adopted in that, rather than seeking parameter values on the basis of all the data from all the tree holes, it was decided to calibrate the two free parameters in Hubbell's NCM, the immigration probability, m, and the index of biodiversity in the source community, θ, using data from only one tree hole: the smallest. The least-squares best fit to the relative abundance of the observable taxa in a sample (5 mL) from the smallest (50 mL) tree hole was obtained with θ=15 and m=1.0 × 10^-6 (Fig. 1). These parameter values were then used to predict the expected abundance distributions in all other tree holes; the only parameter that changed between tree holes was NT, the total number of individuals.

Figure 2 gives seven examples of the remarkably good match between the observed and predicted taxa abundances that was obtained in the majority of tree holes. A quantitative measure of goodness of fit was obtained by simulating an additional 500 realizations of the tree hole taxa-abundance distributions. Pearson's statistic for goodness of fit was calculated for these and for the observed distribution

$$\chi^2 = \sum_{i} \frac{\left(x(i) - E(i)\right)^2}{E(i)}$$

where E(i) is the expected abundance of the ith ranked taxon and x(i) is its abundance in the simulation or observed dataset. A P-value was then estimated from the proportion of these 500 trials that produced a goodness of fit statistic greater than that calculated for the observed data (Table 1). Hypothesis testing at the 5% significance level suggested (Table 1) that for 27 of the 29 tree hole communities, there was no evidence to reject the neutral model. There was no reason to assume any anomalies in either environmental conditions or sampling procedure for the two tree holes where the neutral model was rejected.
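The Monte Carlo test can be sketched in a few lines. The sketch assumes Pearson's statistic takes its usual form, the sum over ranks of (x(i) − E(i))² / E(i); the function names and the numbers in the usage lines are illustrative only.

```python
def pearson_statistic(x, expected):
    """Sum over ranks of (x - E)^2 / E, skipping ranks with E == 0."""
    return sum((xi - ei) ** 2 / ei for xi, ei in zip(x, expected) if ei > 0)


def monte_carlo_p_value(observed_stat, simulated_stats):
    """Proportion of simulated statistics that exceed the observed one."""
    exceed = sum(1 for s in simulated_stats if s > observed_stat)
    return exceed / len(simulated_stats)


# Illustrative numbers only (not values from the study).
expected = [0.5, 0.3, 0.2]
observed = [0.55, 0.28, 0.17]
simulated = [[0.52, 0.30, 0.18], [0.45, 0.33, 0.22], [0.60, 0.25, 0.15]]

obs_stat = pearson_statistic(observed, expected)
sim_stats = [pearson_statistic(s, expected) for s in simulated]
print(monte_carlo_p_value(obs_stat, sim_stats))
```

A large P-value means the observed distribution deviates from the expectation no more than simulated neutral communities typically do, so the neutral model is not rejected.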

Table 1. Estimated P-values for the goodness of fit using an NCM calibrated against the smallest site, tree hole 21

Tree-hole number | Volume (mL) | P-value
19 | 11 000 | 0.158
  • * Not statistically significant.

  • The parameter pair used was (θ, m) = (15, 10^-6).

The success of the neutral model in predicting taxa-abundance distributions over a range of different scales (Fig. 2, Table 1) demonstrates its potential as a predictive tool. Much of the excitement about Hubbell's neutral model stems from its ability to link predictions about different ecological phenomena. Indeed, Hubbell refers to his theory as the ‘unified theory of biodiversity and biogeography’ because of its potential to link predictions on the shape of taxa-abundance distributions to a relationship between taxa richness and area sampled (taxa–area relationship). This link has never previously been explicitly demonstrated. In Fig. 3 it is shown that the predicted richness of taxa in each of the tree hole samples closely matches the observed richness. These predictions have again been produced using the parameter values calibrated using data from the smallest tree hole (θ=15, m=10^-6). A detection threshold of 0.005 on the relative abundance of taxa was assumed; this was the minimum relative abundance to appear on any of the DGGE gels. Bell et al. (2005) fitted the phenomenological model of a power-law relationship to their observed taxa–area relationship, which is reproduced in Fig. 3a. When a similar model is fitted to the predicted taxa richnesses (Fig. 3b), an almost identical relationship is obtained. The greatest deviation between observed and predicted is, perhaps unsurprisingly, in the largest tree hole, where the neutral model was rejected (Fig. 3c). For the most part, however, the link between the taxa–area relationship and taxa-abundance distributions suggested by Hubbell is borne out in the tree hole data set. The smallest tree hole was deliberately selected to calibrate the model parameters because it offers the greatest information on the rate of immigration, m. According to Hubbell's model, as community sizes increase, the systems increasingly resemble the source community and the effects of immigration become obscured.
Thus, the smallest site offers the greatest opportunities to quantify immigration into the systems. The value of θ=15 calibrated on the smallest tree hole is consistent with calibrating on any other tree hole. Independently, calibrating the model using all of the other tree holes suggests that θ lies in the range 15≤θ≤25, and it transpires that the predictions on both the taxa-abundance distributions and taxa–volume relationship across all the tree holes were insensitive to changes in this range. This insensitivity is not a property of the neutral model itself. Rather, it is an artefact of the experimental methods available to microbial ecologists. As discussed in the introduction, microbial ecologists are limited to viewing a small percentage of the overall diversity in an environmental sample using rapid community profiling techniques such as DGGE. This is a generic problem in microbial ecology that is discussed in more detail elsewhere (Curtis et al., 2006; Woodcock et al., 2006; Sloan et al., 2007). However, in the context of this application the reason for the insensitivity to θ is demonstrated in Fig. 4. This shows the taxa-abundance distributions that one would expect from random samples from two log-series distributed source communities: one with θ=15 and the other with θ=25. If there were no dispersal limitation, then these are the distributions one might expect in all the tree holes. In the entire sample of 5 × 10^5 individuals, there are significant differences in the overall richness of taxa and in the distribution of taxa abundances as a function of θ. However, using DGGE it is impossible to detect all the taxa; only those with abundance greater than some threshold can be detected. A detection limit of 0.5% relative abundance is displayed in Fig. 4a, and in Fig. 4b the taxa-abundance distributions for taxa whose abundances are greater than this limit are displayed.
The distributions are quite similar and thus the abundance distribution of detectable diversity is quite insensitive to θ. Sloan et al. (2007) point out that it is difficult to determine the underlying taxa-abundance distribution from such small samples and, therefore, it may be that the source community abundances are not in fact log-series distributed; other source distributions might produce similar results. Thus the success of the neutral model in predicting the taxa-abundance distributions in the tree holes should not be seen as a validation of Hubbell's model in its entirety. Sufficient information is not available about rare taxa to conclude that the log-series is the source community's taxa-abundance distribution, let alone verify Hubbell's conceptual model for the maintenance of biodiversity in the source community. However, given that there is some underlying source community distribution, the first test showed that the tree hole communities are not merely random samples from that source community; the abundance distributions of detectable diversity all differ significantly from one another. Some ecological process must be causing these differences. This could be a function of the environment, but part of the attraction in examining the tree-hole communities is that their environments are all so similar. Besides, the greatest perceptible difference between the tree holes is their volume and hence the size of the communities they house. In the predictions presented in this paper, the source community abundance distribution and the immigration rate are held constant and the only parameter that changes from tree hole to tree hole is the community size, NT, which is estimated as the product of the measured bacterial density and volume. Thus the ecological mechanism that effects the difference in tree hole abundance distributions is the changing relative importance of random immigration in tree holes of different sizes.
The inability to reject the neutral model predictions on the basis of the data in the majority of tree holes suggests that this simple explanation cannot be ruled out.
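Two small calculations underlie the richness comparison: counting the taxa that exceed the detection threshold, and fitting a power law S = cV^z. The sketch below assumes the power law is fitted by ordinary least squares on log-transformed values, which is a common reading of "fitted using linear regression" but is not confirmed by the text; the volumes in the usage lines are hypothetical.

```python
import math


def detectable_richness(rel_abund, threshold=0.005):
    """Number of taxa whose relative abundance reaches the detection limit."""
    return sum(1 for a in rel_abund if a >= threshold)


def fit_power_law(volumes, richness):
    """Least-squares fit of log S = log c + z log V; returns (c, z)."""
    xs = [math.log(v) for v in volumes]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    c = math.exp(mean_y - z * mean_x)
    return c, z


# Hypothetical volumes (mL) with richness generated from an exact power law,
# used only to check that the fit recovers the coefficients.
volumes = [50, 500, 5_000, 18_000]
richness = [2.11 * v ** 0.26 for v in volumes]
print(fit_power_law(volumes, richness))
print(detectable_richness([0.6, 0.3, 0.096, 0.004]))
```

Applying `detectable_richness` to simulated sample distributions and then `fit_power_law` to the resulting richness values gives a predicted taxa–volume curve that can be set beside the observed one.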


Fig. 3. (a) The observed bacterial richness in all 29 tree holes. The solid line represents the power-law relationship, S=2.11 V^0.26, fitted using linear regression. (b) The bacterial richness predicted by the neutral model with θ=15 and m=10^-6 calibrated using the taxa-abundance distribution of the smallest tree hole. The solid line represents the power-law relationship, S=2.19 V^0.25, again fitted using linear regression. (c) Observed vs. predicted richness in each tree hole. The line represents perfect agreement and the two squares indicate where the neutral model was rejected.


Fig. 4. (a) The ranked relative abundance of taxa in a random sample of 5 × 10^5 individuals from two different log-series distributed source communities: one with parameter θ=15, the other, more diverse, with θ=25. The dashed line is a threshold in relative abundance below which taxa will not be detected using DGGE. (b) The abundance distribution of the taxa in the random samples that can be detected by DGGE. The distribution of common taxa is less sensitive to the value of θ than that of rare taxa.


The success of the neutral model in explaining the different taxa-abundance distributions in the detectable diversity of tree holes, whose sizes vary over three orders of magnitude, without the need to change any parameters, constitutes the strongest evidence, so far, that NCMs can usefully describe community composition. The evidence creates a compelling case to study carefully the role that random reproduction, death and immigration play in shaping bacterial community structure. It suggests that at least some bacterial communities are dispersal-limited and, therefore, challenges the perspectives held by some commentators that global dispersal of microorganisms prevents them from having biogeography (Fenchel, 2003) and that microbial population sizes are sufficiently large to preclude local stochastic extinctions (Fenchel & Finlay, 2005). This suggestion could be tested by directly measuring immigration rates into similar but different sized bacterial communities.

Naturally occurring communities of microorganisms are vital to life on earth and are of profound practical significance in agriculture, medicine and engineering. Describing patterns in microbial communities is, therefore, important but not as important as explaining why the patterns form. Thus the model presented here is of particular importance because rather than fitting an arbitrary mathematical function to an observed pattern in microbial taxa abundances, the model calibrated at one scale successfully makes predictions at others. Prediction is rare in both microbial and macrobial ecology (Harte, 2004) and is a far more convincing test of an ecological theory than fitting predicted to observed taxa-abundance distributions at a single site. However, in presenting strong evidence in favor of Hubbell's neutral theory, one can court controversy. Indeed, when these results were presented at the joint Society for General Microbiology Meeting/British Ecological Society on which this thematic issue is based, several of the audience strongly objected. The grounds for this were that very many other models could have produced similar patterns and that processes such as niche differentiation could possibly explain the same patterns and better represented the biological complexity that is believed to exist. This reflects debate that has run in the ecological literature for the past five years (Hubbell, 2006; McGill et al., 2006) where there was an initial polarization between the niche and neutral perspectives on community assembly. As the debate has matured, the hostilities have diminished and there is now a degree of conciliation and a recognition that the two are not mutually exclusive (Hubbell, 2006) and that chance, dispersal limitation and niche differentiation, or species sorting, all have a role to play. 
However, the parsimony of neutral theory still remains controversial and many seek to provide evidence that it cannot explain all the variance in real communities of macroorganisms (e.g. McGill et al., 2006). There is no doubt that the neutral theory will fail to explain all of the variance, but we would maintain that the success, so far, of the neutral theory in explaining and, indeed, predicting the majority of the variance in microbial community composition is such that the burden of proof lies firmly with those who believe niche differentiation dominates the community assembly process. Their route to providing quantitative evidence of this will require a predictive model based on niche or species sorting concepts, and none currently exists. There are highly cited examples of models that successfully meld deterministic, niche-based concepts with dispersal in a spatially distributed environment to demonstrate that a combination of these factors can promote biodiversity (e.g. Tilman, 1994; Mouquet & Loreau, 2003). However, these demonstrations tend to rely on a large number of (often invented) taxon-specific parameters. This degree of specificity is currently impossible in microbial ecology. Furthermore, there are no examples of such models being calibrated at one site or scale and then subsequently predicting phenomena at another. The rationale behind these theoretical demonstrations is commendable, in that ultimately to transfer information successfully through all scales in the landscapes of microorganisms and macroorganisms will require a theory that incorporates both demographic stochasticity and deterministic factors. However, for the foreseeable future, the representation of deterministic factors cannot rely on the experimental definition of a suite of parameters for each species, many of which cannot even be seen in microbial communities.
Thus some alternative model that encapsulates the deterministic factors is required, perhaps based on energy concepts (Brown et al., 2004) or maximizing disorder (Shipley et al., 2006), and this remains an exciting challenge that transcends all the subdiscipline boundaries that appear to exist in ecology. For the moment though, it would appear that neutral dynamics are the best quantitative description of bacterial community assembly in beech tree holes in Wytham Woods, Oxfordshire, UK.


Sampling from a log-series taxa abundance distribution

Let μ be relative abundance and S(μ) be the expected number of taxa with abundance μ; then, according to Hubbell's model of metacommunity dynamics, S is described by Fisher's log-series distribution

Embedded Image


Embedded Image

and SM is the total number of taxa in the source community. It is not immediately obvious how to sample at random from this distribution to generate realisations of the taxa abundance distribution in the metacommunity. However, a straightforward sampling algorithm becomes apparent if we use an approximation suggested by Volkov et al. (2003). They noted that as Embedded Image then the log-series distribution can be approximated by

Embedded Image

since Embedded Image and Embedded Image.

The advantage of this formulation is that the species abundance distribution can be obtained by generating SM independent realisations of Gamma variables Embedded Image for i=1, …, SM for finite θ as θ/SM→0.

As the variables are independent, their joint density function is simply the product of their individual density functions

Embedded Image

However, rather than using absolute abundances, which requires explicit knowledge of the number of individuals in the metacommunity, we consider the relative abundance, pi, of each species. Setting Embedded Image and Embedded Image, we note that only SM−1 of these pi variables are now independent.

Therefore, set

Embedded Image


Embedded Image

The joint density function of Embedded Image is therefore

Embedded Image

where J is the Jacobian, given by

Embedded Image (A6)

It can be seen that Embedded Image and therefore,

Embedded Image (A7)

The first term of this implies that Embedded Image which gives that Embedded Image as expected for the log-series distribution. The second bracket states that Embedded Image Embedded Image. Additionally, Embedded Image are independent of NM. Therefore, the distribution of relative abundances can be obtained by generating SM independent realisations of Gamma variables Embedded Image for i=1, …, SM and then normalizing by Embedded Image.
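The sampling recipe described above (draw SM independent Gamma variables, then divide each by their sum) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the Gamma shape parameter is taken to be θ/SM, inferred from the limit θ/SM→0 mentioned in the text; the scale is set to 1 because any common scale cancels on normalization; and the function name and the parameter values in the usage lines are hypothetical.

```python
import random


def sample_relative_abundances(theta, s_m, seed=None):
    """Draw one realisation of relative abundances for s_m taxa.

    Assumption (not confirmed by the source equations): each taxon gets an
    independent Gamma variable with shape theta / s_m and scale 1.
    """
    rng = random.Random(seed)
    shape = theta / s_m
    # Independent Gamma draws, one per taxon in the source community.
    draws = [rng.gammavariate(shape, 1.0) for _ in range(s_m)]
    total = sum(draws)
    if total <= 0.0:
        # Possible only if every draw underflowed to zero (extremely small shape).
        raise ArithmeticError("all Gamma draws underflowed to zero")
    # Normalizing by the sum gives relative abundances that add up to 1.
    return [g / total for g in draws]


# Hypothetical parameter values, chosen only for illustration.
p = sample_relative_abundances(theta=10.0, s_m=1000, seed=1)
print(len(p), max(p))
```

Because the shape parameter is small, most draws are close to zero and a handful of taxa hold most of the relative abundance, which is the qualitative behaviour expected of a log-series-like distribution. Normalized independent Gamma variables with a common scale follow a Dirichlet distribution, so a library Dirichlet sampler would be an equivalent route.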


  • Editor: Jim Prosser

