## 1. Introduction

Ecology is intrinsically quantitative, with ecologists spending much of the last 100 years enumerating the patterns and processes underpinning the dynamics of, and interactions within, natural populations and communities. Owing to its quantitative nature, ecology has long recognized the need to work in parallel with mathematical disciplines. This interdisciplinary approach is elegantly highlighted in some of ecology's seminal and most highly cited papers by early pioneers, such as Fisher, Preston and MacArthur, who used mathematical approaches to explain and analyse ecological observations [1–3]. However, in addition to serving as a tool for ecologists, mathematical and theoretical ecology also developed independently of studies examining ecological data into a fascinating and insightful discipline in its own right, emerging from the foundations laid by mathematicians such as Volterra and Lotka, on which much of current theoretical ecology is built (see Levin [4] for a discussion of these early research leaders). Consequently, a minor schism has arisen, in which researchers tend to see themselves as either theoretical ecologists or empirical ecologists. Currently, much of the integration between the empirical and theoretical disciplines of ecology is made informally; empirical ecologists apply statistical models and numerical approaches to the analysis of their data, often with little consideration for the underlying theory, while theoretical ecologists often make biologically unrealistic simplifying assumptions in the pursuit of beautiful and analytically tractable mathematical models. Yet, arguably, it is now impossible to conduct robust ecological research without considering both the data and the theory associated with the ecological question being addressed.

## 2. Linking models with ecological processes

Without ecological theory, collecting data is a futile and meaningless endeavour. Likewise, producing elegant and beautiful mathematical models of ecological systems without validation against real data is an empty achievement. These two aspects of ecology (empirical and theoretical) are as intricately linked as the coupled ODEs used to describe the interactions between the predatory fox and its prey, the rabbit. Theoretical ecology provides a potential mechanistic explanation of observed ecological patterns. By following an iterative process of evaluating models with data and discovering which assumptions are plausible, which null hypotheses can be rejected, and what additional theoretical developments are required to explain the data, researchers can disentangle the underlying ecological mechanisms regulating the natural world. A recent example of this process is highlighted in the renewed interest in neutral theories of biodiversity [5]. Although neutral theories of biodiversity are based on the potentially unrealistic biological assumption of ecological equivalence across species, via the iterative process of continued testing and re-evaluation of the underlying theory, ecologists have provided major insights into the dynamics of natural communities, largely in an attempt to identify why unrealistic assumptions do so well in explaining data. However, this idealized view of how ecological research should be conducted is logistically challenging, and increased integration, collaboration and communication between empirical and theoretical ecologists are clearly needed to overcome these challenges.
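For readers less familiar with the coupled ODEs alluded to in the paragraph above, the classic Lotka–Volterra predator–prey system can be written as follows. The symbols are assumptions introduced here purely for illustration: $R$ and $F$ denote rabbit and fox densities, and $\alpha$, $\beta$, $\delta$ and $\gamma$ are positive rate constants.

$$
\frac{dR}{dt} = \alpha R - \beta R F, \qquad \frac{dF}{dt} = \delta R F - \gamma F
$$

Each equation contains the other variable through the product $RF$, so neither population can be described without the other.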

In September 2011, the University of Essex hosted an interdisciplinary conference, ‘Mathematical and Theoretical Ecology 2011: linking models with ecological processes’ (MATE 2011), with the aim of exploring these issues and creating discussion and debate between theorists and empiricists working in ecology. In this issue of *Interface Focus*, we present the best contributed papers from the Essex meeting. These papers cover a wide range of topics but all are linked by the theme of how advances in theoretical ecology can be integrated with ecological data.

## 3. Advances in population and community ecology theory

Three of the papers within this issue particularly focus on some of the fundamental aspects of population and community level ecology, and along with the other contributions, highlight some of the issues of integrating theoretical ecology and mathematical models with ecological data [6–8].

A central issue in ecology is that natural environments are patchy, and the resources used by organisms within the local community are heterogeneously distributed [9]. In stark contrast, many theoretical models in ecology adopt a mean-field approach, averaging out spatial variability and largely ignoring the spatial structure observed in almost all ecological data. This mean-field treatment of ecological models is a simplifying assumption, which allows analytical and numerical tractability and has been used to produce elegant models against which ecological data can be compared. However, whether or not a mean-field approach, or one that accounts for spatial variability, is adopted in ecological studies may largely predetermine results [10]. In this special issue, Grünbaum [6] addresses questions of whether or not to follow a mean-field approach in ecological models of interacting consumers and resources. Yet, in contrast to the often assumed additional complexity associated with spatially explicit modelling of patchy environments, Grünbaum [6] demonstrates a simple heuristic approach that incorporates consumer–resource patchiness with little additional complexity. In addition, Grünbaum [6] expands this approach, demonstrating the applicability of three non-dimensional indices (focused on an organism's movement, reproduction and resource consumption) in quantifying consumer–resource interactions in patchy environments, but which can also be used to demonstrate deviation from mean-field assumptions; offering a potential correction factor to large-scale ecological models that are heavily reliant on the simplicity mean-field assumptions offer.
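One way to see why averaging out spatial variability can matter is a second-order Taylor expansion. This is a generic textbook argument rather than the heuristic developed by Grünbaum [6]; the uptake function $f$, the local resource density $R$, its spatial mean $\bar{R}$ and its spatial variance $\sigma_R^2$ are symbols assumed here for illustration.

$$
\mathbb{E}[f(R)] \approx f(\bar{R}) + \tfrac{1}{2} f''(\bar{R})\,\sigma_R^2
$$

A mean-field model keeps only the first term. When $f$ is nonlinear and the resource is patchy (large $\sigma_R^2$), the second term acts as a correction; for a saturating (concave) uptake function it is negative, so the mean-field model would overestimate average consumption.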

Often mean-field approaches provide relatively realistic biological assumptions over the temporal and spatial scales relevant to the model and ecological question being addressed. For example, the human population within a densely populated city can be considered a well-mixed homogeneous environment for highly fecund and infectious disease-spreading microbes. However, even mean-field models with obvious analytical solutions have problems when considering their fit to ecological data and ability to accurately recreate observed ecological patterns. In the simplest case, one of the major obstacles to overcome is in parametrization; deterministic population models by definition will ultimately produce results dependent on initial parameter values, and even stochastic models will reflect to some extent their starting conditions. Yet, often ecological data are noisy or incomplete and parameter estimates are not obvious, leading to a number of issues in model parametrization [11]. By focusing on population-based epidemiology models, Stollenwerk *et al*. [7] address issues of parameter estimation. Stollenwerk *et al*. [7] provide a comprehensive review of this subject by first examining basic epidemiological models and then expanding their examples to cover more realistic cases focusing on models of dengue fever. Building on this approach, Stollenwerk *et al*. [7] then produce a master equation from which the transition probabilities can be used to compute a likelihood function, linking their theoretical models with observed data by providing the probability of observing all points from empirical time-series data describing infection dynamics, as a function of model parameters. Once a likelihood function is computed it can be incorporated into a Bayesian framework, and along with the maximum-likelihood estimates, Stollenwerk *et al*. [7] then discuss the application of these methods in parameter estimation.
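The idea of building a likelihood from transition probabilities can be sketched with a deliberately simplified model. The example below is not the master-equation approach of Stollenwerk *et al*. [7]; it swaps in a discrete-time chain-binomial SI model (no recovery), in which each susceptible is assumed to become infected in one time step with probability 1 − exp(−βI/N). The function names, the infection counts and the grid search are all hypothetical choices made for illustration.

```python
import math


def log_binomial_pmf(k, n, p):
    """Log-probability of k successes in n independent trials."""
    if k < 0 or k > n:
        return float("-inf")
    if p <= 0.0:
        return 0.0 if k == 0 else float("-inf")
    if p >= 1.0:
        return 0.0 if k == n else float("-inf")
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_likelihood(beta, infected, population):
    """Sum of log transition probabilities along the observed time series."""
    total = 0.0
    for i_now, i_next in zip(infected, infected[1:]):
        susceptible = population - i_now
        # Assumed per-step infection probability for each susceptible.
        p_infect = 1.0 - math.exp(-beta * i_now / population)
        total += log_binomial_pmf(i_next - i_now, susceptible, p_infect)
    return total


# Hypothetical counts of infected individuals in a population of 50.
INFECTED = [1, 3, 7, 15, 28, 41, 48, 50]
POPULATION = 50

# Crude maximum-likelihood estimate by grid search over beta.
GRID = [b / 100 for b in range(1, 301)]
beta_hat = max(GRID, key=lambda b: log_likelihood(b, INFECTED, POPULATION))
print(f"maximum-likelihood beta ~ {beta_hat:.2f}")
```

The same `log_likelihood` function could, in principle, be combined with a prior on β to obtain a posterior in a Bayesian analysis; the grid search is used here only to keep the sketch short.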

Maximum-likelihood and Bayesian approaches are increasingly being used in ecology to estimate parameters from data and select or compare between theoretical models describing observed ecological patterns. This use of likelihood-based statistical approaches reflects the underlying need for ecologists to accurately assess the performance of statistical models when applied to their data; likelihood-based approaches often provide a more parsimonious comparison of model performance (e.g. based on AIC scores) than selecting models based on maximizing adjusted *r*²-values [12]. This may lead to likelihood-based approaches providing quantitatively different results from model-fitting analyses compared with other approaches such as least squares, often inspiring researchers to re-examine classic questions in ecology, but with novel statistical techniques. In this issue, Etienne *et al*. [8] tackle this challenge by applying novel statistical methods to a classic question: why are small-bodied taxa more diverse than large taxa, and can this observation be explained by phylogenetic clade age alone? Previous studies have suggested that small-bodied taxa are more diverse simply because they come from phylogenetic clades that originated longer ago and thus have had longer to diversify. Etienne *et al*. [8] revisit this finding, by developing a robust statistical approach using a constant-rate birth–death model with allometric scaling of speciation and extinction rates. This model is then used in a maximum-likelihood approach that maximizes the likelihood of the model, given data on extant diversity, clade age and average body size from a compiled dataset covering a range of Metazoan taxa. Etienne *et al*. [8] conclude that clade age alone does not explain higher diversity of small-bodied taxa, but that additional processes are at work and that small-bodied taxa may simply have faster ‘evolutionary clocks’. By robustly merging empirical data and theoretical models, Etienne *et al*. [8] have illuminated interesting new perspectives on an old ecological question.
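The AIC-based model comparison mentioned above can be sketched in a few lines. This is a generic illustration, not the analysis of Etienne *et al*. [8]; the log-likelihood values and parameter counts are hypothetical.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical maximized log-likelihoods for two candidate models.
aic_simple = aic(log_likelihood=-120.4, n_parameters=2)
aic_complex = aic(log_likelihood=-119.9, n_parameters=3)

preferred = "simple" if aic_simple < aic_complex else "complex"
print(f"AIC simple = {aic_simple:.1f}, AIC complex = {aic_complex:.1f}")
print(f"preferred model: {preferred}")
```

In this sketch the extra parameter improves the log-likelihood by only 0.5, which is less than the penalty of 1 log-likelihood unit per parameter implied by the AIC, so the simpler model is preferred.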

## 4. Advances in behavioural ecology theory

The use of statistical models to infer properties of behaviour from observed data in individual animals and groups of animals is a topic explored by both Schliehe-Diecks *et al*. [13] and King [14]. Schliehe-Diecks *et al*. [13] give a detailed illustration of the application of mixed hidden Markov models (HMMs) in a case study that looks at mouse lemur behaviour. HMMs are powerful and flexible statistical tools that allow the user to infer and quantify sequences of behaviour. The basic rationale behind an HMM is that the underlying behavioural or motivational states are ‘hidden’ and cannot be observed directly, but consequences or outputs of the hidden state can be observed. In the case study of Schliehe-Diecks *et al*. [13], the ‘hidden’ state is the motivational state relating to hunger: animals are either hungry or satiated, but we cannot observe which state an animal is in directly. However, we can observe the ‘output’ state, that is, whether the animal is currently feeding or not. Schliehe-Diecks *et al*. [13] develop a mixed HMM by including factors that account for differences in the behaviour of individual lemurs within the observed group. These factors could be covariates linked to behaviour, such as sex, body mass or time of feeding, but the mixed HMM also allows for ‘random effects’ to be included, which can account for different ‘personalities’ across the study group. Schliehe-Diecks *et al*. [13] use their mixed HMM to analyse behavioural time-series data from 54 lemurs and demonstrate how feeding behaviour is linked to both body mass (with larger lemurs staying satiated longer) and sex (females having longer feeding and non-feeding periods). They finish with a discussion of how this type of mixed HMM is easily generalized to other ecological contexts and could be used by ecologists to identify how behavioural flexibility and personality differences interact and lead to differences in observed behavioural sequences.
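The hungry/satiated example above maps onto a basic two-state HMM, whose likelihood can be computed with the standard forward algorithm. The sketch below is a plain (not mixed) HMM without covariates or random effects, so it is simpler than the model of Schliehe-Diecks *et al*. [13]; all probabilities and the observation sequence are hypothetical values chosen for illustration.

```python
import math

STATES = ("hungry", "satiated")

# Hypothetical model parameters.
INITIAL = {"hungry": 0.5, "satiated": 0.5}
TRANSITION = {
    "hungry": {"hungry": 0.7, "satiated": 0.3},
    "satiated": {"hungry": 0.2, "satiated": 0.8},
}
EMISSION = {
    "hungry": {"feeding": 0.9, "not_feeding": 0.1},
    "satiated": {"feeding": 0.1, "not_feeding": 0.9},
}


def log_likelihood(observations):
    """Scaled forward algorithm: log P(observations), summed over hidden paths."""
    forward = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    log_lik = 0.0
    for obs in observations[1:]:
        # Rescale to avoid numerical underflow on long sequences.
        scale = sum(forward.values())
        log_lik += math.log(scale)
        forward = {s: v / scale for s, v in forward.items()}
        # Propagate one step through the hidden chain, then weight by emission.
        forward = {
            s: EMISSION[s][obs] * sum(forward[r] * TRANSITION[r][s] for r in STATES)
            for s in STATES
        }
    return log_lik + math.log(sum(forward.values()))


# Hypothetical record of whether a lemur was seen feeding at each time step.
observed = ["feeding", "feeding", "not_feeding", "not_feeding", "feeding"]
print(log_likelihood(observed))
```

Maximizing this log-likelihood over the transition and emission probabilities would give parameter estimates; covariates such as sex or body mass would enter by letting those probabilities depend on individual attributes.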

The state-space approach reviewed by King [14] in the context of capture–recapture–recovery data has a very similar model structure to the HMM discussed by Schliehe-Diecks *et al*. [13]. The main difference between the HMM and state-space approach is that in a state-space model the ‘states’ are not discrete (as in the HMM) but are continuous. As with HMMs, state-space models are a flexible tool that can be used on noisy datasets where there may be measurement error or missing data, making them ideal to use with ecological datasets such as those generated from tracking studies of animal movement behaviour [15]. In her review, King [14] explains much of the theoretical background to the state-space approach, while highlighting some of the common practical issues that may be faced. For example, because a state-space model is a continuous system process, it is generally much harder to find the likelihood function of the observed data (in contrast, this is relatively easy in an HMM). King [14] goes on to demonstrate how the state-space approach is an ideal method to analyse capture–recapture–recovery data. Capture–recapture studies typically involve initial capture events where individual animals are marked, released and then recaptured at some point in the future (although individuals do not always need to be physically captured—sightings and photo observations can be included). In addition to live captures or sightings, it is also possible to ‘recover’ dead individuals. Once the capture–recapture–recovery data are recorded, they can be used to estimate demographic parameters such as survival probabilities and/or abundance. King [14] demonstrates how the state-space approach can be used to analyse such data, including extensions to the basic model to include multi-states or mixed models such as random effects. King [14] finishes with an informative illustration of how the solution of a state-space model can be found in various ways using Bayesian techniques. 
The key point highlighted is that the performance and efficiency of different Bayesian model-fitting techniques will be problem- and model-dependent. The state-space approach is clearly a useful tool for ecological data analysis as it can combine data from different sources into a single integrated analysis.
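Although the likelihood of a general state-space model is hard to obtain, the linear Gaussian special case has an exact likelihood via the Kalman filter, which makes it a convenient illustration of how measurement error and missing data are handled. The sketch below is not the capture–recapture–recovery model reviewed by King [14]; it assumes a one-dimensional random-walk state observed with Gaussian noise, and the function name, variances and track values are hypothetical.

```python
import math


def local_level_log_likelihood(observations, process_var, obs_var,
                               prior_mean=0.0, prior_var=1.0):
    """Kalman-filter log-likelihood for a random-walk state seen with noise.

    Entries equal to None are treated as missing observations.
    """
    mean, var = prior_mean, prior_var
    log_lik = 0.0
    for y in observations:
        if y is not None:
            # Update step: compare the observation with the prediction.
            innovation = y - mean
            innovation_var = var + obs_var
            log_lik -= 0.5 * (math.log(2.0 * math.pi * innovation_var)
                              + innovation ** 2 / innovation_var)
            gain = var / innovation_var
            mean += gain * innovation
            var *= 1.0 - gain
        # Prediction step: the hidden state takes one random-walk step.
        var += process_var
    return log_lik


# Hypothetical noisy positions along a transect, with one missing fix.
TRACK = [0.1, 0.4, None, 1.1, 0.9]
print(local_level_log_likelihood(TRACK, process_var=0.1, obs_var=0.2))
```

A missing observation simply skips the update step, so uncertainty about the hidden state grows until the next fix arrives.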

The recent advances in statistical modelling described by Schliehe-Diecks *et al*. [13] and King [14] are likely to prove invaluable for many ecological researchers trying to link theoretical models to data. However, it is also important that theoretical modellers continue to develop realistic and informative mechanistic models of behaviour to explore ecologically relevant processes such as animal movement and predator–prey interactions. This is the theme of the paper by Mckenzie *et al*. [16] who analyse data from a detailed tracking study of four wolves in the Rocky Mountains (Alberta, Canada). One of the key characteristics of this type of environment is the highly heterogeneous landscape (see also Grünbaum [6]). Mckenzie *et al*. [16] explore the effect of landscape features on wolf movement behaviour and the consequent effect on prey interactions. In particular, they set up and analyse a mechanistic movement model where wolves react to ‘seismic lines’ (narrow linear stretches of forest, cleared for energy exploration) in the landscape. Their movement model is based on a simple diffusive random walk process that is extended to include increasing levels of complexity such as anisotropic diffusion along the seismic line or biased/directed movement towards the seismic line (see Codling *et al*. [17], for a review of these random walk and diffusion models). After analysing their tracking data, Mckenzie *et al*. [16] demonstrate that the wolves in their study do alter their movement behaviour when on seismic lines (moving faster and with more directional persistence leading to anisotropic diffusion along the line), but only one wolf moved towards the seismic line when close to it (a directed or biased movement). They then consider how the density of seismic lines and the different movement models interact and result in different functional responses (the *per capita* kill rate as a function of prey density). 
They show that the presence of seismic lines can lead to an increase in prey encounters and hence kill rate, and that the relative effect of seismic lines on the functional response is greatest at the lowest densities of prey. The paper raises a number of interesting questions from both a behavioural and theoretical viewpoint. For example, the authors finish by discussing how their mechanistic model could be made more realistic by including correlation in movement directions (as seen in their wolf movement data) using a transport equation approach [17] but this would consequently make the mathematical analysis more difficult.
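The statement that the relative effect of seismic lines is greatest at low prey densities can be illustrated with a standard saturating functional response. The sketch below assumes a Holling type II form and represents seismic lines simply as a higher encounter rate; this is a stand-in for, not a reproduction of, the functional responses that Mckenzie *et al*. [16] derive from their movement models, and all parameter values are hypothetical.

```python
def kill_rate(prey_density, encounter_rate, handling_time):
    """Holling type II functional response: per capita kill rate."""
    return (encounter_rate * prey_density
            / (1.0 + encounter_rate * handling_time * prey_density))


# Hypothetical parameters: seismic lines are assumed to raise the encounter rate.
BASE_ENCOUNTER = 1.0
LINES_ENCOUNTER = 1.5
HANDLING_TIME = 0.5


def relative_effect(prey_density):
    """Ratio of kill rates with and without seismic lines."""
    with_lines = kill_rate(prey_density, LINES_ENCOUNTER, HANDLING_TIME)
    without_lines = kill_rate(prey_density, BASE_ENCOUNTER, HANDLING_TIME)
    return with_lines / without_lines


for density in (0.1, 1.0, 10.0):
    print(f"prey density {density}: relative effect {relative_effect(density):.3f}")
```

At low prey density the kill rate is limited by encounters, so a higher encounter rate translates almost directly into more kills; at high density handling time dominates and the ratio approaches one.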

One of the key advantages of a theoretical modelling approach is that it is possible to explore a wide range of scenarios that simply would not be possible experimentally or through field observations. Rands [18] presents an individual-based simulation model of a host–brood parasite interaction (e.g. a cuckoo trying to lay in the nest of a reed warbler). In contrast to other papers in this special issue, Rands [18] does not try to parameterize the model with real data, but instead explores the full parameter space (within reasonable upper and lower bounds) to determine general emergent properties of the system. The rules of behaviour at the individual level are very simple and are based on parameters that govern the probability of taking various actions (e.g. move away from nest or stay on nest in a given time step). In his model, Rands [18] assumes that individual hosts have two possible strategies that are of interest when a brood parasite is spotted (against a null model of no response): either ‘mobbing’, where the whole social group of hosts tries to drive off the parasite (at the risk of one's own nest being parasitized), or ‘sitting tight’, where individuals simply return to their nest as quickly as possible when alarmed. The simulation includes a learning element to the host behaviour: all hosts start off naive but learn to recognize the parasite through social interactions with neighbours when mobbing (when sitting tight there is a risk of not learning to recognize the parasite). After running simulations across a wide set of parameters, Rands [18] demonstrates that both mobbing and sitting tight are successful strategies to avoid parasites (compared with the null model of no response). Interestingly, there was no significant difference between the two strategies (in terms of parasite avoidance), except for the case where the probability of a parasite appearing was high, in which case mobbing became a more successful strategy. 
The simulation model is simple and elegant but nevertheless gives some clear predictions about how social behaviour such as mobbing may have emerged in a host–parasite system. Rands [18] finishes by highlighting the model assumptions that could be modified in future studies to explore different scenarios or problems.

In the context of animal social behaviour, the question of how animal groups collectively make decisions has received increasing attention recently, with most studies using a theoretical modelling approach to address the problem. In a thoroughly comprehensive overview of the topic, Conradt [19] considers how collective group decisions are made in the context of two key scenarios: (i) where information available to individuals in the group is uncertain and (ii) where individuals in the group have conflicting preferences. In the case of information uncertainty, Conradt [19] first details quorum responses, where the decision at the individual level is influenced by how many others in the group have made a certain decision. She then discusses the effective leadership model, where a small sub-set of the group that are ‘informed’ about the correct movement direction can lead a group of uninformed individuals effectively. Finally, the independence–interdependence model is described. In essence, all these information uncertainty problems are examples of the ‘many-wrongs’ principle where several decision-makers are able to pool their personal information and eliminate individual errors [20]. The case of conflicting preference between individuals in the group decision-making problem is perhaps more complicated. In this context, Conradt [19] first describes a ‘group-level’ model where behaviour can range from shared decisions involving all group members to ‘dictatorial’ where decisions are made by one individual only. For most biologically relevant assumptions, the best strategy in this model is then for the group to make shared decisions rather than have a dictator. The ‘leader–follower’ model is arguably the simplest group conflict model and in this case, it is the individual who has most need (e.g. for food) that makes the decision, an example of ‘leading according to need’. 
A similar result is obtained with the ‘pair-synchronization’ model, where leadership is effectively shared and decisions made according to need. Conradt [19] discusses synchronization models in small groups of three individuals and explains how, in the context of movement synchronization, both shared and dictatorial decision-making are evolutionarily stable strategies. The ‘leading according to need’ model is then described in larger groups. Conradt [19] finishes with a description of two clear open problems in this field of research. Firstly, the problem of collective decision-making where both information uncertainty and conflicting preferences are present has not yet been properly addressed. Secondly, a recurring issue with many of these models is the lack of empirical data to verify the model predictions, and this is where technological developments that allow new methods of data collection for large groups of animals may allow us to make progress [21].
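The ‘many-wrongs’ principle mentioned above, in which pooling personal information reduces individual error, can be illustrated with a simple majority-vote calculation in the spirit of Condorcet's jury theorem. This is one elementary instance of information pooling, not one of the models reviewed by Conradt [19]; it assumes independent individuals who are each correct with the same probability, and the function name and values are hypothetical.

```python
import math


def majority_correct(p_individual, group_size):
    """Probability that a strict majority of an odd-sized group is correct."""
    if group_size % 2 == 0:
        raise ValueError("use an odd group size to avoid ties")
    needed = group_size // 2 + 1
    return sum(
        math.comb(group_size, k)
        * p_individual ** k
        * (1.0 - p_individual) ** (group_size - k)
        for k in range(needed, group_size + 1)
    )


# Assumed individual accuracy of 0.6 for every group member.
for size in (1, 3, 11, 51):
    print(f"group of {size}: P(majority correct) = {majority_correct(0.6, size):.3f}")
```

When individuals are better than chance and their errors are independent, group accuracy rises with group size; correlated errors or conflicting preferences would weaken this effect.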

## 5. Future directions in theoretical ecology

Despite many recent advances in theoretical ecology and the associated integration with empirical data, a number of fundamental challenges still remain to be tackled, and as with most scientific disciplines, as new insights are gleaned, additional challenges emerge. Broadly speaking, ecological models can be split into two separate categories: simplistic mathematical models, which offer analytically tractable solutions and the examination of the underlying model properties, and simulation-based models (either rule-based or numerical integrations of formal mathematical models), which allow the examination of more complex models that cannot be solved analytically. Both of these modelling approaches have advantages and disadvantages associated with them, but a common theme is that as the models become more complex, working with them becomes more computationally demanding. In particular, numerical simulation of complex models is greatly limited by available computer power and the questions theoretical ecologists are beginning to ask are challenging the capacity of current computer hardware. This is particularly evident when researchers begin to address some of the big remaining questions in ecology, such as those outlined by Levin [4]; for example, ‘how does one scale from the microscopic to the macroscopic?’ In the final paper in this special issue, Petrovskii & Petrovskaya [22] address the current computational challenges and provide recommendations on how to deal with them. By focusing on a number of timely and important ecological examples, including issues regarding estimating crop pest abundances, incorporating space and scale in models, and examining population dynamics of Pacific salmon, Petrovskii & Petrovskaya [22] identify some key computational challenges and how they originated. 
A central issue identified by the authors is that most simulation-based studies in ecology use modelling ideas and approaches developed for application in other sciences and not specifically to address ecological systems. Yet, ecological systems are very different from physical and chemical systems, notably containing more uncertainty and chaotic dynamics, and are often influenced by stochastic processes, and ecological models must reflect this [22]. This problem is compounded by the often poor accuracy of ecological data, which results not from poor sampling design and data collection but from the complexity of ecological interactions, making model validation difficult. Thus, computational methods in ecology need to be developed alongside theory to account for these issues, and new computational tool kits need to be tested as rigorously as the original theory [22].

It is likely that these computational challenges in ecology will continue to grow. In recent years, data describing ecological systems have increased, reflecting advances in automated data loggers, satellite-based remote sensing technology and the current revolution in ‘omics’ technologies, among other data recording methods. A common theme of many of the papers in this special issue is that ecologists now face the challenge of dealing with more data than has ever been available historically, but despite this increase in quantity the data still contain all the uncertainties, noise and stochasticity found in all ecological systems. This raises new challenges for theoretical ecologists as well as empirical ecologists. For example, it is now possible, using the latest generation of DNA sequencing technologies, to quantify communities containing thousands of microbial species represented by millions of individuals. Yet, developing a model and simulating a theoretical community representing thousands of microbial species occupying a number of trophic levels with biologically realistic dynamics is at the very limit of computational ecology methods. With increasing amounts of data available, ecologists need to remain focused and continue to follow the examples of the early pioneers of ecology and rise to the challenge, producing theory able to match the demands of the data. After all, without theory providing testable hypotheses, ecology could become nothing more than data collection for its own sake. We finish with a quote from perhaps the most famous interdisciplinary scientist of them all:
He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may cast.

Leonardo da Vinci

## Acknowledgements

We would like to thank the authors, reviewers and editors who helped to create this special issue under a very tight deadline. We are grateful to the MATE 2011 scientific committee and in particular Sergei Petrovskii for his help and guidance. We thank all the MATE 2011 participants and speakers for contributing to the meeting and creating a friendly atmosphere that allowed plenty of stimulating and exciting discussions to take place. We also thank Hannah Lewis for helpful comments on this introductory article. Funding for the MATE 2011 conference was provided by the London Mathematical Society, the British Ecological Society and the Department of Mathematical Sciences at the University of Essex.

## Footnotes

One contribution to a Theme Issue ‘Mathematical and theoretical ecology’.

- Received January 12, 2012.
- Accepted January 12, 2012.

- This journal is © 2012 The Royal Society