Abstract
It has long been recognized that numerical modelling and computer simulations can be used as a powerful research tool to understand, and sometimes to predict, the tendencies and peculiarities in the dynamics of populations and ecosystems. It has been, however, much less appreciated that the context of modelling and simulations in ecology is essentially different from those that normally exist in other natural sciences. In our paper, we review the computational challenges arising in modern ecology in the spirit of computational mathematics, i.e. with our main focus on the choice and use of adequate numerical methods. Somewhat paradoxically, the complexity of ecological problems does not always require the use of complex computational methods. This paradox, however, can be easily resolved if we recall that application of sophisticated computational methods usually requires clear and unambiguous mathematical problem statement as well as clearly defined benchmark information for model validation. At the same time, many ecological problems still do not have mathematically accurate and unambiguous description, and available field data are often very noisy, and hence it can be hard to understand how the results of computations should be interpreted from the ecological viewpoint. In this scientific context, computational ecology has to deal with a new paradigm: conventional issues of numerical modelling such as convergence and stability become less important than the qualitative analysis that can be provided with the help of computational techniques. We discuss this paradigm by considering computational challenges arising in several specific ecological applications.
1. Introduction
Ecology is a science with a long and fruitful history. First attempts to understand the relations between the living nature and the nonliving environment were made as early as in ancient Greece [1], i.e. at about the same time when the cornerstones of natural history were laid. However, through the centuries, ecology had long remained in a dormant stage being a rather descriptive field of knowledge. While, for instance, physics became an analytical science already in the Renaissance (e.g. recall the work by Galileo Galilei), ecology had not become really quantitative and theorybased until the studies by Lotka [2], Volterra [3] and Gause [4], who were the first to use mathematical tools for ecological problems. General principles of ecosystem organization were later refined and systematized by Odum & Odum [5], while the mathematical theory was further developed by Skellam [6] and Turing [7], who emphasized the importance of the spatial aspect.
The bright mathematical ideas of those seminal works sparked a huge fire. The last quarter of the twentieth century saw an outbreak of interest in mathematical ecology and ecological modelling [8–13]. Especially over the last decade, more and more complicated models have been developed with a generic target to take into account the ecological interactions in much detail and hence to provide an accurate description of ecosystems dynamics. Owing to their increased complexity, many of the models had to be solved numerically, the development that was inspired by the simultaneous advances in computer science and technology.
There are several reasons for this extensive growth in the number of simulationbased studies. In ecology, mathematical modelling and computer simulations play a rather unique role compared with other sciences. The problem is that, although a field experiment is a common research approach in ecology, replicated experiments under controlled conditions (which is a cornerstone of all natural sciences providing information for theory development and validation) are rarely possible because of the transient nature of the environment: indeed, how, for instance, can we reproduce the same weather pattern again and again? We also mention it here that largescale experiments are costly and, in the situation when consequences are poorly understood, can have adverse effects on some species and on the biodiversity, and may even pose a threat to human wellbeing. Capturing the complexity of real systems through tractable experiments is therefore logistically not feasible. Mathematical modelling and computer simulations create a convenient ‘virtual environment’ and hence can provide a valuable supplement, or sometimes even an alternative, to the field experiment.
Application of computational methods for studying an ecological system's properties—which is usually referred to as computational ecology—has therefore become common across the whole range of ecological applications, e.g. in fisheries [14–16], forestry [17–19], agriculture [20], climate change [16,18,21] and in evolutionary ecology [22,23]. There have been examples of success and failure, and there are huge challenges ahead (in particular, rooted in ecosystems complexity) as well as possibly some hidden pitfalls. But has computational ecology really emerged yet as a new discipline? With thousands of simulationbased studies published, this question may sound outdated or even irrelevant. However, there is much more in computational ecology than an ecological problem at one end and the researchers sitting in front of their computers at the other end, the way it is often seen. A simple yet important observation halfforgotten these days is that there is a mathematical model in between. Moreover, before simulations can be performed, the model needs to be turned into a computer code, and that can only be done by using numerical methods. In its turn, application of numerical methods is not at all straightforward, especially for complicated mathematical models. The choice of an adequate and effective numerical method (along with the tools for its verification and validation) therefore lies at the very heart of computational ecology. Surprisingly, this aspect of computational ecology has received very little attention (but see [24,25]). A computational method is often taken for granted and the discussion of important issues such as the method applicability, accuracy and efficiency is lacking. Meanwhile, an inadequate choice of a numerical method may have a detrimental effect on the study by making simulations costly and even producing wrong results.
The modelling approaches can be very different in terms of the mathematics used and depending on the goals of the study, and there are several ways to classify them. For instance, there is an apparent difference between statistical models [26] and ‘mechanistic’ models [27,28], although simulationbased studies may sometimes include both. Taken from another angle, two qualitatively different modelling streams are rulebased approaches (such as individualbased models and cellular automata) [29] and equationbased ones. In this paper, we will mostly focus on issues arising in mechanistic equationbased models, although we believe that our conclusions remain valid for other modelling approaches as well.
Another way to sort out the models used in ecology is to consider the level of complexity involved. Depending on the purposes of the ecological study, there have been two different streams in model building [30]. In case the purpose is to predict the system's state (with a certain reasonable accuracy), the model is expected to include as many details as possible. This approach is often called predictive modelling. The mathematical models arising in this way can be very complicated. Alternatively, the purpose of the study can be to understand the current features of the ecosystem, e.g. to identify the factors responsible for the a population decline or a population outbreak, but not necessarily to predict their development quantitatively. In this case, the corresponding models can be pretty simple. We will call this approach a ‘conceptual modelling’.
These two streams of theoreticalecological research can be clearly seen in the literature, even though it is not always straightforward to distinguish between them as sometimes simple models may show a certain predictive power and, on the contrary, complicated ones are used for making a qualitative insight into some subtle issues. Moreover, the whole attitude to what is simple and what is complicated has significantly evolved over time, and these days the conceptual models may be analytically intractable and require numerical solution.
The focus of this paper is therefore on the numerical aspects of computational ecology considered in the spirit of computational mathematics. Note that it is not our goal here to give a comprehensive review of all existing approaches and applications. Computational ecology has become so diverse that such a review would not be possible in one paper, nor even in one book. Instead, below we revisit a few carefully selected cases that, we believe, are representative of the state of the art in this field. We summarize interesting and promising findings and endeavour to identify the factors that may hinder future advances. Based on these cases, we conclude that computational ecology often uses the methods and approaches borrowed from its more mature counterparts (such as computational engineering or computational physics), but those methods may be not always be effective for ecological applications.
2. Predictive modelling
We begin with an overview of those recent studies that can be classified into predictive modelling, i.e. the studies that make an attempt to take into account a sufficient number of details or factors controlling the ecosystem state. The main challenge for predictive modelling is, therefore, to properly address the complexity of ecosystem structure and dynamics. Two sources of ecosystem complexity are the complexity of the web of the interspecific interactions and the spatial structure. We first revisit the nonspatial approaches.
2.1. Networks and foodweb modelling
The biological motive behind network modelling is to provide a very detailed description of trophic interactions in a real ecological community [31,32]. That is often done by quantifying each species with its own specific dynamical variable, e.g. the population size. This approach usually results in a very complicated trophic web or network [33,34], e.g. see the sketch in figure 1, where different nodes correspond to different species and the edges connecting the nodes show trophic interactions. Sometimes a reasonable compromise can be reached by combining a few similar species together—the approach known as ‘compartment modelling’, where a few species occupy the same compartment and are described by the same variable. However, this is not at all straightforward as some important information is then lost on the way (e.g. how important the interspecific competition can be inside a given compartment). Similarly, simplification of the food web into just a few trophic levels was shown to significantly distort the community properties because of omitting many subtle yet important details of interspecific interactions [35].
Neglecting population discreteness, the population densities may be considered as smooth functions of time, and hence an appropriate food web model [34] is given by a system of ODEs:^{1} 2.1where u_{i}(t) is the population density of the ith species at time t, N is the total number of species (or compartments), α = (α_{1}, … ,α_{M}) are biological and environmental parameters, where α_{i} is the ith species growth rate, and functions f_{i} take into account details of the densitydependence in intra and interspecific interactions. For a complex trophic web, the total number N of equations in the system can be as large as several dozen or even several hundreds. A solution of system (2.1) is expected, for appropriately chosen parameter values and the initial conditions, to predict the ecosystem state over a certain period of time. Model (2.1) has recently been a focus of several simulation studies [36] that resulted in developing a computer software potentially applicable to a wide class of ecological problems.
There are, however, a few problems with system (2.1) that make further development in this direction a considerable challenge. Firstly, because intra and interspecific interactions are normally nonlinear (e.g. recall the logistic growth and the Hollingtype responses), system (2.1) can have a very complicated bifurcation structure, which means that the solution properties can be sensitive to parameter values. That may be a reflection of the ecosystems complexity; however, the actual problem is that, in a real ecosystem, the parameter values are usually known with a low accuracy. Model (2.1) can therefore predict a range of different system behaviours inside the same biologically realistic range of the parameter values, and then the question is which one to choose.
Estimation of parameter values is therefore an important issue and it has received a lot of attention recently [37,38]. Stateoftheart development of predictive models incorporates the estimation of the relative probabilities of models and parameters in light of empirical evidence. This aspect is a key tool available to modern computational ecology. The computation of model and parameter probabilities, usually done with Bayesian inference methods [39,40], allows for the assessment of how well the model explains the empirical data and how uncertainty about various aspects propagates through to influence predictive accuracy.
Secondly, since the detailed model (2.1) may include species from different taxa, sometimes ranging from bacteria to mammals, the growth rates can be very different, i.e. α_{l}/α_{m} ≪ 1 for some l and m. This results in a computational difficulty known in numerical mathematics as ‘stiffness’ [41], when the timestep Δt of the numerical method has to be chosen according to the largest α, e.g. Δt ≪ α_{m}^{−1} in order to ensure the method's stability and accuracy, but the time interval T relevant to the system's dynamics is determined by the smallest α, T ≫ α_{l}^{−1}. As a result, in running the simulations, the number of required time steps estimated as T/Δt can be huge, so that it can be far beyond available computer power. Stiffness is a wellknown computational problem and there are some welldeveloped approaches as to how to deal with it [42,43]. However, a fundamental problem rooted in the very nature of an ecosystem's structure is that the more details of the network are taken into account (by including more different taxa) the stiffer the ODE system is going to be, and therefore the larger is the effort required for its numerical solution.
Thirdly, existing approaches to solve system (2.1) are lacking universality. A successful application of software based on model (2.1) to a given ecosystem does not at all guarantee its success in the case of another ecosystem. Different ecosystems have different species assemblages; the software and the mathematical model therefore need to be tuned to every particular case and validated accordingly. A serious problem of this way is that the basic properties of the system (2.1), in particular its bifurcation structure, can change significantly with a change in the properties of the functions f_{i}, even when the change is relatively minor [44,45]. Population community is a strongly nonlinear system and adding or omitting some apparently minor details may change the system properties dramatically [35]. Things are exacerbated by the fact that the low accuracy of the data and the effect of noise, especially at small population densities, often do not make it possible to choose the parametrization unambiguously [46].
The aspiration to develop a comprehensive mathematical model of the trophic web is therefore hardly consistent with the desire to make it universal and transferable. Indeed, there is a growing understanding [47] that there should be a proper balance between the level of complexity built in to the model,^{2} the expected predictive power of the model, and the model's usefulness for understanding community dynamics. The argument hence appears to be similar to the one behind the conceptual models.
To conclude this section, we want to mention that, in spite of its mathematical complexity, the approach based on network building and analysis can be used for conceptual modelling as well, e.g. when the purpose of the study is to understand the relative importance of different species and/or different flows through the network and to identify possible bottlenecks and control scenarios [48,49]. Also, we mention it here that the networkbased approach is by no means restricted to nonspatial systems and can be readily extended into spatially structured communities [50].
2.2. Space, geometry and multiple scales
The impact of space tends to make the population dynamics significantly more complicated compared with its nonspatial counterpart and to bring new and bigger challenges to simulations. In this paper, we mostly focus on the type of models resulting from the assumption of continuity (hence neglecting the population discreteness), which usually applies to spatial scales larger than the size of a typical individual and/or to sufficiently large population density. Complementing the assumption of continuity in space with continuity in time, a generic mathematical model is given by the following system: 2.2
Here u = (u_{1}, … ,u_{N}), where u_{i}(r,t) is the density of the ith component at position r and time t, N is the number of the ecosystem components included into the model, D = {D_{ij}} is the matrix of the (cross)diffusion coefficients and the vector α is the set of parameters. The vector field w(r,t) takes into account advection, e.g. with the wind for airborne species or with the water current for aquatic species. In many cases, for instance, in modelling flying insects or aquatic species, the problem is threedimensional so that r ∈ Ω ⊂ R^{3}, where Ω is the spatial domain where population dynamics is considered. Here t > 0 and the system (2.2) must be complemented with the initial conditions for each dynamical variable, i.e. u_{i}(r,0) = u_{i0}(r) (i = 1, … ,N), where u_{i0} are given functions of space.
The form of the spatial operator 𝒦 can be different depending on the type of the individual movement, i.e. Brownian or nonBrownian, which can be a biological trait of the given species developed in the course of evolution [51] but can also be a property of the environment, e.g. as in the case of turbulent mixing [52]. In the simplest yet biologically sensible case, (2.2) turns into a reaction–diffusion–advection system: 2.3where D_{i} is the diffusivity of the ith species.
Let us note that the mathematical framework based on equations (2.2)–(2.3) corresponds to a ‘meanfield’ description of the spatiotemporal population dynamics where the fluctuations owing to population discreteness and the effects of individual traits (e.g. behaviour) are neglected. Alternatively, the population dynamics can be modelled by using individualbased models, i.e. by considering all individuals and their traits explicitly [10,13,29]. Under certain conditions, the two approaches are known to be equivalent [11,53]. A more detailed consideration of the individualbased approach is beyond the scope of this paper.
System (2.2)–(2.3) generally requires a more sophisticated numerical algorithm compared with the corresponding nonspatial system (2.1). For this reason, even ‘realistic’ models rarely contain more than several components [54]. Also, the solution properties appear to be much more complex [55–57]. For numerical simulations, one of the main challenges is the existence of multiple spatial scales where the ratio of the largest to the smallest can be several orders of magnitude. Indeed, numerical solution of system (2.2)–(2.3) requires a discretization of the equations for instance by approximating the derivatives by finite differences [58]. For that purpose, a numerical grid must be generated with the spatial grid step h being sufficiently small in order to resolve spatial heterogeneities. Existence of multiple scales means that the ratio L/h (where L is the diameter of Ω, i.e. its largest linear size) can be very large. Correspondingly, it may result in a numerical grid with so many grid nodes that the simulations would either take a lot of computer time or may even appear to be beyond the available computer power. In order to take a closer look at the problem, as well as to outline possible approaches to numerical modelling of multiscale ecological systems, below we discuss two examples where multiple spatial scales must be taken into account.
2.2.1. The dynamics of the Pacific salmon
Salmon are known to have a complicated life cycle (figure 2a) where different stages differ not only in the typical body mass^{3} and body size but also in the type of environment. While mature salmon live in the ocean, fish eggs are spawned in streams and little rivers that can be situated a few hundred kilometres from the nearest shore. The first stage in the life cycle of salmon (from several months to 1 or 2 years, depending on a particular genus) is spent in the natal stream. With body size changing over time from a few millimetres to several centimetres, the larvae live in the laminar environment^{4} but eventually grow to a size when they become affected by smallscale turbulence (e.g. by increasing their encounter rate with food items [61]), by the river flow and, correspondingly, by the geometry of the river banks and bed [62].
When the young fish become bigger, they migrate down the river and their next stage is spent in the river estuary. It is during this stage that salmon undergo physiological changes allowing them to live in salt water. Also, individual behaviour changes, which results, in particular, in the formation of fish schools. Owing to a larger body size and weight, the response to environment forcing is changing as well. It is less affected by smallscale turbulence but still significantly affected by the river flow which, in the estuary, can have a complicated structure.
The next stage is spent in the ocean where salmon become fully mature. Over this period of maturation, salmon, who now live in large fish schools and migrate over hundreds or even thousands of kilometres, are affected by largescale ocean hydrodynamics, geographical variations in the climate, and also by fishing. After spending a few years in the ocean, salmon return to their spawning sites in small rivers, thus closing the life cycle.
The spatial dynamics of salmon is therefore essentially multiscale, with the characteristic scale of the environment ranging from a few centimetres for larva to hundreds of kilometres for adult fish. A straightforward attempt to simulate the population dynamics of salmon over its life cycle would result in generation of huge numerical grids that would be far beyond the power of modern computers, and hence would not be doable. A closer look at the problem suggests a more sensible approach by splitting the entire life cycle model into several blocks; figure 2b. Each life stage can then be simulated separately, to take into account its specifics by choosing the modelling approach. For instance, the larva/stream stage can be modelled by simulating the movement of passive tracers in an advective–turbulent velocity field where the latter may be obtained by solving the Navier–Stokes equations [61]. The juvenile/estuary stage can be simulated by using the box model [63], where the whole spatial domain is considered as the agglomeration of ‘boxes’; that may conveniently account for the details of the habitat structure. The dynamics of salmon fish schools in the ocean can be simulated as the movement of active tracer in a heterogeneous environment subject to turbulent diffusion [55,64].
Note that here we describe the salmon life cycle very schematically. It is wellknown that sizedependent lifehistory traits can have a variety of implications for the population structure and dynamics [65]. A more detailed description should at least take into account the changes in the metabolism rate which can be done, for instance, by using the Dynamic Energy Budget model [66]. The metabolism rate tends to decrease with the age of individual and hence simulation of different lifestages may require a significantly different time step. This can be another reason to apply the blocksplitting approach.
Splitting the whole modelling problem is therefore a reasonable way forward, because the modelling technique used at each stage (e.g. as mentioned above) is understood relatively well and relevant computer codes can be designed. However, the blocksplitting approach also brings its own issues such as the problem of data transfer between different stages. Indeed, the results obtained at one stage should create the initial conditions for the next stage, but the description of the system at different stages may be done using dynamical variables of different origin. Thus, accurate and reliable modelling of the salmon life cycle remains one of the great computational challenges faced by scientists today.
2.2.2. Monitoring of pest insects
A multiscale ecological problem of a completely different origin is given by ecological monitoring, in particular, monitoring of pest insects. Its main goal, as required by the needs of agriculture [67,68], is to provide an estimate of the pest population size or density based on the trap counts. The simplest setting consists of a single trap installed in a field. It is checked regularly after a certain interval (e.g. daily or weekly), the pest species are identified and the number of pest insects caught is counted. If the trap count contains any of the target species, it proves that these species are present in the vicinity of the trap. However, relating the trap count to the population density is a considerable and largely open problem, and its essence is readily seen from the following example. Given a trap of radius r caught n insects after having been exposed for time T, how can we restore the population density from this information? If this information is not sufficient, what is missing? Although some attempts to solve this problem have been made [20,53,69], a consistent theory and robust computational algorithms are lacking.
One difficulty with numerical simulations is that the problem is essentially multiscale, with the smallest scale being given by the trap size r ∼ 10^{−1} m and the largest scale being given by the size of the field R ∼ 10^{2} − 10^{3} m; their ratio can therefore be as large as 10^{4}. Another small scale can be given by the initial conditions, e.g. in the case of a pointsource release. Consider either the diffusion equation or the Fokker–Planck equation as a reasonable model to describe the dynamics of the population density owing to the individual movement (and, subsequently, to simulate the trap counts by calculating the density flow through the trap boundary [53]). A straightforward discretization of the system results in a huge numerical grid with the number of nodes of the order of 10^{12}, which is by a few orders of magnitude beyond the currently available computer power.
Note that, generally speaking, we cannot reduce the grid by arbitrarily reducing the computational domain to the vicinity around the trap. By catching insects, the trap introduces a perturbation to the distribution of the population density and the radius of this perturbation grows with time. An attempt to decrease the perturbation radius by decreasing the time T of trap exposure does not help either as smaller T results in smaller trap counts, which can make them statistically unreliable.
An effective computational approach to this problem can again be based on splitting it into different ‘blocks’, i.e. one block describing the insect movement in the vicinity of the trap and the second block describing the population density in the far field. This approach is supported by the biological observation that different factors can be of different importance at small and large distances from the trap. While in the vicinity of trap, the pattern of individual movement and the effects of smallscale heterogeneity are important (for instance, as given by the effect of wind for flying insects [20]), in the far field, the population density is more affected by the conditions at the field boundaries that should take into account the effects of pest migration through the boundary. Hence, a multiblock grid generation technique [70] can be applied to the problem to facilitate its numerical solution.
3. Conceptual modelling
Conceptual modelling is another powerful stream of research in theoretical and mathematical ecology. The focus is on the qualitative aspects of ecosystem dynamics rather than on quantitative ones; correspondingly, the goal is to predict the tendency or the type of response rather than a specific number. A classical study of this kind is given by the seminal paper by Rosenzweig [71], where he discovered the instability in nonspatial predator–prey systems arising as a response to an increase in the prey carrying capacity—the famous paradox of enrichment.
Occasionally, conceptual models can also explain some quantitative features (an example will be considered below) but that should be regarded as a ‘piggyback opportunity’ and could hardly be made an objective of the study.
Interestingly, owing to the developments in relevant areas of applied mathematics [27,28,56], in the university mathematics curriculum, and, especially, in computer power, that have been seen over the last two or three decades, the division between the predictive and conceptual models has evolved. The spatially explicit models of population dynamics (e.g. formulated in terms of PDEs) were originally regarded as complicated and hence were used as predictive models [8,9], but more recently they have been considered as conceptual ones [12,72].
Increased complexity of the conceptual models means that application of analytical tools is rare (but see [73]), and the results are obtained through computer simulations. Since conceptual models tend to be simpler than predictive ones, application of numerical methods and the subsequent computer code development may require somewhat less effort than in the case of predictive modelling. However, a closer look at simulation studies reveals challenges of a different kind.
3.1. Patchiness, chaos and grid refinement
We reveal the challenges of conceptual modelling by considering an instructive example, namely the problem of plankton patchiness. The horizontal distribution of phytoplankton in the ocean is very rarely homogeneous even when the marine environment is relatively uniform. Instead, patches of high plankton density alternate in space with patches of low density (figure 3a). The spatial pattern usually consists of patches of very different size, which may range from a few centimetres to hundreds of kilometres [74]. The attempts to relate this phenomenon to the specifics of the marine environment have not been entirely successful. It was shown that while small patches (less than 100 m) are indeed controlled by turbulence [75] and patches on the scale of dozens and hundreds of kilometres are controlled by environmental heterogeneity owing to the geographical variations, patches in the intermediate range between 0.1 and 30 km behave differently. This resulted in the concept of physical and biological scales [76] (figure 3b).
A conceptual model of plankton pattern formation should therefore take into account some basic features of the marine environment and also some basic biological interactions. We assume that turbulent mixing can be described as turbulent diffusion but neglecting its generic dependence on scale. For biological interactions, we only consider the main phytoplankton consumer, i.e. zooplankton. The corresponding system of equations then looks as follows: 3.1and 3.2where u(r,t) and v(r,t) are, respectively, the phyto and zooplankton density at horizontal position r = (x,y) and time t, D_{T} is the coefficient of turbulent diffusion, G(u) describes phytoplankton growth, P(u,v) describes grazing and M(v) describes zooplankton mortality that may take into account the effect of the top predators.
System (3.1)–(3.2) is apparently very simple in the sense that it completely leaves out many properties of a marine ecosystem. Surprisingly, its solution appears to have properties similar to what is seen in field data, exhibiting a clear patchy structure [55]. Typical results are shown in figure 4. Note that it is not only a visual similarity with the pattern shown in figure 3a; a quantitative analysis shows that the typical patch size obtained in model (3.1)–(3.2) agrees well with the upper boundary of the biological range [77].
A closer look at the patterns obtained in the conceptual plankton model (3.1)–(3.2) reveals that the corresponding system's dynamics is chaotic [55]. A generic property of chaotic dynamics is that the state of the system is very sensitive to the initial conditions.^{5} From the point of applications, an immediate consequence is that, using the conceptual model (3.1)–(3.2), we can explain the existence of the plankton patches in the ocean, and possibly even predict their characteristic scale, but we cannot predict their position. This simple notion has huge consequences for the modelling philosophy behind the computations. In fact, it changes the whole paradigm of the simulationbased approach. Indeed, when a system of PDEs is solved numerically, the central point is the convergence of a numerical solution to the analytical solution,^{6} so that the exact solution is approximated with a prescribed accuracy. Below we will show that it is not necessary in the case of conceptual modelling.
Let us define the numerical error e_{k} at time t_{k} as 3.3where u and ũ are the exact solution and the approximate (numerical) solution, respectively, {(x_{i},y_{j})} is the numerical grid where the approximate solution is computed. Convergence means that the numerical error e_{k} decreases for any fixed k when the spatial grid steps h_{x}, h_{y} and the time step Δt decrease. In the limiting case, when the spatial and temporal grid steps tend to zero, the error tends to zero as well; thus, the numerical solution approaches the exact solution at any position in space.
It immediately follows from the above that, when the grid steps or time step are not small enough, the numerical solution can be considerably different from the exact solution. An example given in figure 4 shows the numerical solutions obtained for h_{x} = h_{y} = 1 when two different choices of the time step Δt were considered. In the first case, the time step (figure 4a) is much bigger than the time step taken to compute the solution in the second case (shown in figure 4b). The initial conditions and all parameters are the same in both cases. Both snapshots are obtained at the same moment t (with the required number of time steps being 128 times larger in the second case). Obviously, the two patterns do not coincide. In a comprehensive study, Garvie [24] demonstrated that is sufficiently small to ensure very good accuracy of the simulations, so that the snapshot shown in figure 4b would not undergo any significant changes for smaller values of Δt. On the contrary, the pattern shown in figure 4a is going to change if the time step takes a smaller value.
Remarkably, however, while the actual ‘closeness’ between the numerical and exact solutions (defined as e_{k} < ε, where ε is the prescribed tolerance) can only be reached for sufficiently small spatial and timegrid steps, the type of the system's dynamics along with some of its typical properties may be reproduced correctly already when the steps are still relatively large. Recall that the purpose of the study based on system (3.1)–(3.2) is to understand the mechanism behind the plankton pattern, but not to predict their exact position and shape. Now, the patterns shown in figure 4a,b are qualitatively similar. Moreover, a closer inspection shows not only qualitative but also quantitative similarity. Namely, the typical patch size, the interpatch distance and the magnitude of the oscillations in the plankton density are approximately the same, even though the numerical solution has not converged yet. We refer to this situation as an ‘outputbased convergence’ [80,81]. Consider a certain characteristic quantity ℒ or an ‘output’ of the system (in the model (3.1)–(3.2), the principal output is given by the typical patch size). Convergence with respect to the output ℒ means that 3.4where ε_{ℒ} is the tolerance of output estimation that can be different from the solution tolerance ε.
The benefits of using the outputbased convergence instead of the conventional definition of the solution convergence are obvious. Once we can use coarser spatial and temporal grids in twodimensional and threedimensional simulations, we can study the ecological scenarios that would otherwise be far beyond the available computer power.
3.2. Evaluating pest abundance from sparse ecological data
In §3.1, we showed that grid refinement—the standard procedure to decrease the grid steps by introducing more grid nodes—may be not required when we are interested in some particular properties of the system but not necessarily in its complete description. Interestingly, there is another class of problems in computational ecology where the grid refinement would be beneficial but is impossible because of ecological reasons. One such problem is the ecological problem of collecting information about pest abundance, usually over large areas or regions or even nationwide. Local information about the pest density is obtained by sampling or trapping (cf. §2.2.2), and this information is then used in order to estimate the pest abundance in a given field.
Let u_{1}, … , u_{N} be the values of the population density of a given species obtained at the location of the traps r_{1}, … , r_{N}, respectively, where N is the number of traps installed over the agricultural field of area A. A commonly used statistical approach to estimate the population size I is based on the arithmetic average [82]: 3.5This approach works well when N is sufficiently large because the theory predicts that converges to the true mean density when N tends to infinity. However, if N is not large, the application of equation (3.5) becomes questionable, especially when the density distribution is not spatially homogeneous but exhibits some form of aggregation.
An alternative approach to population size estimation is based on the ideas of numerical integration [83]. Indeed, it is readily seen that estimation of the population size from discrete data on population density coincides with a conventional problem of numerical integration. Moreover, it has been shown [84] that the accuracy of population size estimation by means of numerical integration can be higher than the statistical approach (3.5).
It is wellknown that the accuracy of numerical integration depends on the number N of grid nodes, and grid refinement can therefore essentially improve the accuracy of integral evaluation [85]. However, a crucial observation is that, in pest monitoring programmes, the number N of traps cannot be made large. An excessively large number of traps in a given area may have a disruptive effect on the behaviour of the monitored animals, thus resulting in biased counts. Also, trapping is costly and labourconsuming and it introduces disturbance to agricultural procedures. Pest monitoring specialists would not be allowed to make this disturbance large as it can damage the agricultural product (e.g. crops) significantly, hence making the protective measures rather senseless.
The problem of numerical integration then has to be solved on a very coarse grid, and that makes most of the standard ideas irrelevant [84]. Indeed, the usual approach to assess the accuracy of different integration rules is based on their convergence rate when the number of grid nodes per unit area can be increased arbitrarily. This is clearly not the case of pest monitoring where the conventional error estimates cannot be applied and one has to come up with a new approach to conclude about the efficiency of a chosen method of numerical integration [86].
4. Discussion and concluding remarks
In this paper, we briefly revisited the state of the art in simulationbased, computational approaches to ecological problems. Following a long tradition [30], modelling in ecology can be loosely divided into two streams such as predictive modelling and conceptual modelling. Predictive modelling uses complicated models that are usually analytically intractable and hence the modelling is essentially based on computer simulations [17,23,47,63]. Conceptual modelling tends to use simpler models but their exact solutions are still not always possible; therefore, they often have to be solved by simulations as well [9,10,12]. In past decades, both of these streams have been enhanced greatly by recent developments in numerical mathematics and computer science and technology. Especially in predictive modelling, the increase in available computer power has been hugely stimulating. For instance, it has become possible to consider food webs in much detail [31–35] (and there are examples of a corresponding software development [36]) and/or to address population dynamics with multiple spatial and temporal scales occurring in various environments [13,16,21,25,50,62]. Although the full complexity of ecosystems dynamics is still beyond the available computer power, a blocksplitting approach has been identified as an appropriate modelling strategy to cope with multicomponent multiscale systems as it is used increasingly often, for instance in fisheries [14] and in dynamic global vegetation models [87].
In spite of tremendous progress that has been made over the last two decades and numerous papers published, there are some important yet rarely addressed issues remaining. The term ‘computational ecology’ has been coined already [47], but has it appeared yet as a clearly shaped, consistent field of knowledge with its own principles and methodology. One problem that we identify is that most simulationbased studies in ecology use ideas and approaches originally developed for application in other sciences, while the specifics of ecological dynamics has rarely been appreciated in full. Meanwhile, ecological dynamics is in many aspects very different from the dynamics of other natural systems, such as, for instance, in physics or chemistry. One essential difference is the adaptive, fitnessmaximizing behaviour of the individuals [22,23]. Another and probably even more important difference is that the impact of stochastic factors and the corresponding level of a system's uncertainty are much higher in ecology than in other natural sciences. In §3.1, we considered an example where the modelled system was chaotic. However, chaos is not the only source of uncertainty in ecological dynamics. Other and possibly more common sources are poor accuracy of ecological data and their transient nature. Noise that is inevitably present in ecosystems can significantly change the properties of an ecological model, both in the wellmixed ‘nonspatial’ case [88,89] and in the spatially explicit case [90,91], and this fundamental uncertainty affects the accuracy of ecological data.
We want to emphasize here that what is usually referred to as low data accuracy is not a result of underdeveloped equipment or poorly trained technicians. This is an inherent property of ecological dynamics rooted in the immense complexity of ecological interactions. Even in a system that is not chaotic, the system's uncertainty is high and the message then remains essentially the same: for simulations used in ecological modelling, we do not always need advanced numerical methods improving the convergence rate, because we do not always know where exactly this convergence should take us to.
In its turn, the uncertainty in ecological data has immediate implications for the usual routine of code development and simulations. A standard procedure of code development in wellestablished, ‘mature’ computational sciences (e.g. in computational engineering) usually consists of the following stages [92,93]:

— Preparation: specification of objectives, geometry, initial and boundary conditions, and available benchmark information; selection of the numerical method.

— Verification: a process for assessing simulation numerical uncertainty. Robustness of the simulation results should be proved by comparing them with the known analytical properties of the model, e.g. with exact solutions.

— Validation: a process for assessing simulation modelling uncertainty by using benchmark experimental data.
The above protocol, shown as a flowchart in figure 5, is widely accepted by the engineering community where every piece of computer code should undergo meticulous testing before being released to the users. For instance, the American Institute of Aeronautics and Astronautics (AIAA) Journal has strict editorial policy on numerical accuracy [94]. This policy includes justification of minimum formal accuracy for numerical results submitted to the journal as well as the statement of code verification activities and a rigorous check of the convergence for any computer code used or designed by the authors.
Apparently, stateoftheart ecological software is still a long way from those requirements, and the protocol of figure 5 does not always apply to code development for ecological applications. Indeed, as analytical solutions are relatively rare, a comprehensive verification and convergence study is hardly possible. As for code validation, a lack of experimental data and their typical poor quality can sometimes make it very difficult to distinguish real phenomena (i.e. inherent solution features) from artificial ones [25]. On the other hand, it is clear that the issues of verification and validation cannot be dismissed as ecological science progresses and more sophisticated computer codes are required in ecological modelling and simulation. Hence the question arises if the solid testing of a simulation model can be achieved in a reasonable time span and within reasonable allocation of human and computer resources.
Based on the above, the main problem of modern computational ecology is, in our opinion, the problem of methodology. Contrary to other computational sciences, computational ecology still does not have clearly defined standards and hence is not sufficiently useroriented. Indeed, for instance, in engineering applications, the difference between research codes and the socalled industrial (or commercial) codes can always be clearly seen. While the former are used for the study of new computational methods, the latter are heavily exploited for ‘predictive’ purposes in the design and technology processes. This is not the same in computational ecology, where only a small number of computational programmes (most of them are statistical programs) are being used by practical ecologists and environment policy decisionmakers. Despite successful development of a large number of research computational programmes, only a few of them have the potential to be transformed into a commercial code that can be steadily and efficiently used by a large number of users.
Note that another specific feature of computational ecology is ambiguity in the choice of the mathematical model. While in physical and engineering sciences the model to be used is usually well established and well known (consider the equations of fluid dynamics just as an example), in mathematical ecology the choice of model is often controversial. It means that validation of ecological software will often involve not only modification of the numerical method, but also a heavy revision of the mathematical model itself; see arrow 4 in figure 5.
We believe that the key issue in the development of the methodological toolkit for computational ecology should be a careful revision of the numerical techniques and methods used in mature computational sciences rather than blind copying of them. For example, we have discussed in this paper that the verification of a computer code can be done better in terms of the outputbased error estimation rather than the solution error estimation. Outputbased computations (i.e. computations that focus on the accurate numerical evaluation of a given solution functional) have been used in other computational sciences [80,81] but they do not play a leading role there, and nowadays most error estimates are being designed with a numerical solution rather than the solution functional in mind. The situation can become different in computational ecology where the outputbased error estimates may appear to be in the mainstream of research because of the nature of the problems solved there. Similarly, other popular computational techniques such as the use of higher order discretization methods, adaptive grid generation, hp adaptation, etc. should be revisited in the light of the complexity of ecological problems and restrictions that such complexity may impose on the choice of a numerical method. This may result in a new protocol, still clear and reliable but more suitable for ecological applications.
Acknowledgements
The authors are thankful to Roger Nisbet (Santa Barbara) and Otso Ovaskainen (Helsinki) for useful and stimulating discussions. Helpful remarks from three anonymous reviewers are appreciated. This study was partially supported by The Leverhulme Trust through grant F/00568/X (to S.P.).
Footnotes
One contribution to a Theme Issue ‘Mathematical and theoretical ecology’.
↵1 Alternatively, for the species with nonoverlapping generations, it could be a system of discrete equations.
↵2 At the very least making sure that the corresponding simulations would not require a computational power far beyond that of the modern computers.
↵3 Which spans over a few orders of magnitude, from less than 1 g of a newly hatched larva to approximately 10^{4} g of a fully grown adult.
↵4 Recall that the smallest turbulence scale is given by the Kolmogorov length of approximately 1 cm [59,60].
↵5 This sensitivity being quantified by the dominant Lyapunov exponent λ_{D} which, in system (3.1)–(3.2), is of the order of 0.03.
↵6 Even though we do not know the solution itself, its existence in system (3.1)–(3.2) is guaranteed by the theory of PDEs [78,79].
 Received September 29, 2011.
 Accepted December 6, 2011.
 This journal is © 2012 The Royal Society