Abstract
We discuss how statistical inference techniques can be applied in the context of designing novel biological systems. Bayesian techniques have found widespread application and acceptance in the systems biology community, where they are used for both parameter estimation and model selection. Here we show that the same approaches can also be used in order to engineer synthetic biological systems by inferring the structure and parameters that are most likely to give rise to the dynamics that we require a system to exhibit. Problems that are shared between applications in systems and synthetic biology include the vast potential spaces that need to be searched for suitable models and model parameters; the complex forms of likelihood functions; and the interplay between noise at the molecular level and nonlinearity in the dynamics owing to often complex feedback structures. In order to meet these challenges, we have to develop suitable inferential tools and here, in particular, we illustrate the use of approximate Bayesian computation and unscented Kalman filteringbased approaches. These partly complementary methods allow us to tackle a number of recurring problems in the design of biological systems. After a brief exposition of these two methodologies, we focus on their application to oscillatory systems.
1. Introduction
Mathematical models are widely used to describe and analyse complex systems and processes across the natural, engineering and social sciences [1]. In biology, we consider mathematical models as mechanistic representations of dynamical processes in living systems. While mathematical modelling in the life sciences has a surprisingly long history—the population dynamic work of Leonardo de Pisa, better known as Fibonacci, even predates the work of Galileo Galilei, which set the foundations for theoretical physics, by some 200 years—it has only recently become pervasive in the modern biomedical and life sciences. Mathematical models force us to explicitly state our assumptions about the constituent parts of biological systems and their interactions, and through either mathematical analysis or simulation follow them to their logical conclusions. These may be evaluated against experimental or field study data, and careful comparison may result in an increased understanding of the biological processes.
While in systems biology the main aim is to understand how naturally evolved systems operate [2], synthetic biology is developing around the vision that we can rationally engineer biological systems for specific biomedical or biotechnological purposes [3–5]. Here, after specification of the design objectives—e.g. production of the hormone insulin—a solution is sought by combining standardized parts into a working system; the whole approach is inspired by engineering approaches and relies heavily on in silico simulation and testing prior to implementation in ‘wet ware’. This is of course fraught with a host of problems: biological macromolecules are hardly as simple as, for example, electronic components; they tend to be involved in many processes in a manner that is highly contingent on a variety of regulatory inputs; they are often subject to stochastic effects (owing to thermal effects and the small number of many molecules); and finally they interact promiscuously, i.e. there is no insulation between different pathways but instead probably ubiquitous crosstalk (interference) between different pathways.
In systems biology, the challenges are equally formidable: systems' behaviours change as environmental or physiological conditions are altered, and there appears to be considerable celltocell variability in many important phenotypes. These in turn underlie, for example, cell fate decisions and can have profound clinical implications in, for example, stemcell biology [6], cancer [7] or antibiotic resistance in microbial organisms [8]. Crucially, we also have to carefully choose the appropriate modelling framework; and consider quality and quantity of data and prior knowledge about the process under investigation. Most reverseengineering approaches are targeted at specific types of data, or at inferring certain types of models. Relevance and Bayesian network approaches [9–11] aim to infer regulatory interactions from geneexpression data, generally under the assumption that molecular interaction networks do not change over time or in response to the environment, although it is increasingly becoming possible to relax such restrictions [12]. Dynamical systems on the other hand use stochastic processes and/or differential equations in order to model the relationships between molecules inside a cell, tissue or other biological systems [13]. Such dynamical systems are the focus of our work here.
In the area of systems biology, a generic modelling work flow illustrated in figure 1 has emerged [14]. Thus, for a given model, one typically obtains parameter estimates, models the system and perhaps conducts a sensitivity analysis, which provides insight into how simulation outcomes depend on uncertainty in the parameter values and initial conditions of the system. Within this simple framework, we would consider it best practice to provide parameter estimates with some measure of confidence [15], and to analyse the sensitivity or robustness of model outputs [16,17]. This is in order to ensure consistent propagation of uncertainty and therefore the ability to perform probabilistic inference when undertaking modelbased reasoning regarding system properties.
2. Inference versus design
In statistical problems, we are typically faced with data, 𝒟 = {d_{1}, d_{2}, … ,d_{n}}, where d_{i} can be a vector of observations. Note that, for many systems, the d_{i} do not have to be identically and independently distributed, and in the cases discussed below, this is generally not the case. In addition to the data, we also have a set of candidate models, ℳ = {M_{1}, M_{2}, … ,M_{q}}, which may have given rise to the data (figure 2). Each model has a parameter vector, θ_{i}, 1 ≤ i ≤ q.
Thus our problem is to identify the model—together with the model parameters—which best explains the data [18]. In a Bayesian framework, we thus seek to determine 2.1where π(m) and π(θ_{m}) denote the priors for model m and its associated parameter set θ_{m}, respectively, and f(Dm, θ_{m}) is the datagenerating process for model m, 1 ≤ m ≤ q; Pr(𝒟) is a normalization constant that is also frequently referred to as the evidence [19]. The posterior Pr(m, θ_{m}𝒟) is here taken jointly over the parameter spaces of all the models and the (discrete) model indicator variable. The conceptual equivalence between model selection and parameter inference is one of the most appealing features of the Bayesian approach. Unfortunately, however, analytical expressions of the posterior are generally not available, but a host of computational approaches have been developed that allow the numerical evaluation of posteriors and derived quantities.
Marginalizing over model parameters yields the marginal posterior for model m, 2.2which assigns a probability to model m to be the model that generated the data 𝒟, conditional on the available data, the set of candidate models, ℳ, and the specified model and parameter priors [19,20]. If for a particular model, we have a marginal posterior close to one, then in this framework we would choose this model. If, however, the posteriors of several models are roughly equally large and no clear best model emerges, then we can (or should) use Bayesian model averaging for any further analysis.
The principal difference between statistical inference and design as discussed here is that, in the former, we have data and seek to reconstruct the model that is most likely to have generated the data [21]; in the latter case, we have an idea of the model/system outputs that we would like to see (such as adapation as in Ma et al. [22] or Barnes et al. [21]). The task then is to learn the structure of the system that is most likely to generate these desired data. As long as we can encode our design objectives in an appropriate way, we can use inferential techniques in the design of synthetic or engineered systems.
We thus replace the observed data 𝒟 by surrogate data, 𝒟_{S}, that embody our design objectives appropriately, and then determine the marginal posterior probability of each candidate model to deliver these specified outcomes. And just as in the case of the statistical inference, or reverseengineering, problem, the more detailed 𝒟_{S} becomes, the more intricate the search for a suitable model gets. This gives rise to the second difference between design and inference: while the choice of a suitable costfunction which rewards models for fit to the data is typically obvious, choosing a costfunction for design may also take into account other factors. In addition to satisfying the design objectives, other factors may also play an important part. In the context of synthetic biological systems, for example, cost and safety considerations may override the ‘fit’ of the model to 𝒟_{S}. We will return to this point in the penultimate section.
3. Sequential algorithms for inferencebased modelling
For simplicity, we will refer to data and design objectives collectively as data and denote these by 𝒟. As mentioned before, analytical expressions for the posterior, equation (2.1), are generally not available. Markov chain Monte Carlo (MCMC) methods have risen to great prominence among the computational tools capable of representing posterior distributions numerically. There is a substantial literature in the area of computational statistics. Here, however, we focus instead on sequential algorithms [23]. While sequential approaches were initially conceived to cope with problems where data are produced in a sequential fashion and inferences were improved as new data became available, sequential Monte Carlo (SMC) and Kalmanfilteringtype approaches can also be used on the whole data. Then the sequential algorithms yield increasingly better estimates for the posterior distribution.
Below we will consider highly nonlinear dynamical systems. For these evaluation of the likelihood, is generally not straightforward [24, 25], and in many cases computationally prohibitive. Here, we develop two complementary frameworks that pursue different strategies in order to study complex dynamical systems.
3.1 Unscented Kalman filter for parameter inference
The Kalman filter [26] has been one of the main workhorses for state estimation, but can also be applied for parameter inference, and has been used widely in timeseries analysis [27]. We consider the case of estimating the parameters, θ, of a nonlinear mapping g(·) from a sequence of inputs x_{ti} (which may also be unknown), and noisy observed outputs, 𝒟 = {d_{1}, … ,d_{n}}, obtained at times t_{i} with 1 ≤ i ≤ n. The filter is an example of recursive Bayesian estimation in which at step k, the current estimated parameter distribution conditioned upon data d_{j} for j < k, is used to make a prediction, , for the next data point. This estimate is then compared with the actual observation, d_{k}, in order to arrive at updated estimates for θ (and x_{k} if it is unknown).
The Kalman filter is based on the assumption that dynamics are linear and any noise entering the dynamics and/or the observations is Gaussiandistributed. We are, however, dealing with highly nonlinear problems in both systems and synthetic biology, and nonGaussian noise. Several extensions to the standard Kalman filter have been proposed, the two most prominent of which follow very different strategies: the extended Kalman filter copes with nonlinearities by Taylor expanding the dynamics to first order [28]. The unscented Kalman filter (UKF) allows for arbitrarily complex dynamics but captures probability distributions to finite order, typically by focusing on mean and variance (i.e. approximating noise by Gaussian noise models) [29–31].
Here, we follow the second approach, and propagate our prior parameter distribution—assumed to be sufficiently well approximated by a Gaussian—through the exact system dynamics. This is achieved by carefully choosing a set of parameters Ξ_{k} = (χ_{0}, … ,χ_{2L})—where each χ_{i} is known as a sigma point, and L is the total number of parameters (and states if desired) to be inferred—which completely capture the statistics of the prior distribution. These may be propagated individually through the system and reassembled to approximate the distribution of the output. For the moment, we will consider parameter estimation rather than model selection. We begin by defining the state space model, 3.1where g(·) is the nonlinear observation function with parameter vector θ_{k}, u_{k} ∼ 𝒩(0,R_{uk}) and v_{k} ∼ 𝒩(0,R_{vk}) are the measurement noise and artificial process noise driving the system with covariance matrices R_{uk} and R_{vk}, respectively. x_{0} are the initial conditions that can also be included into the parameter vector if they are unknown.
We then have for the UKF

U1 Initialization: Set k = 1.

U2 Prediction: while k ≤ n and

U3 Update:
Here, we have with the sigma point set Ξ_{k} and weights ω_{i} given by where and parameters κ, α and β may be chosen to control the positive definiteness of covariance matrices, spread of the sigma points and error in the kurtosis, respectively.
The 2L + 1 sigma points, Ξ_{k} (in U3), together with the corresponding weights ω_{i}^{c/m}, describe the distribution over the parameter state up to the second moment of the probability distribution. That is, we approximate the probability distribution up to and including the second moment of the distribution; we can go beyond this at increased computational cost. For more complex likelihood surfaces, the UKF therefore gives us a local (Gaussian) look at the posterior. In order to characterize parameter space more globally, multiple runs of the UKF may be used with priors chosen to cover the regions of interest.
Crucially for our purposes, the observation function, g(·), can be arbitrarily complex (subject to the usual regularity conditions) and along with the data, 𝒟—which other than an observed dataset, may also be any function of the data, or even expected or desired qualitative/quantitative characteristics of the system's behaviour [32]—may be chosen to explore various features of interest of the dynamical system. This great, and thus far underexplored, flexibility allows us to use the UKF to determine a set of parameters commensurate with the objectives set out prior to the analysis and encoded in 𝒟. In addition to data and conventional surrogate data (e.g. to condition on design objectives), this may include

— stability properties of the stationary state,

— existence of bifurcations, and

— boundedness of behaviour.
Below we will illustrate this approach in the context of the attractor structure.
The UKF is a natural complement to the approximate Bayesian computation (ABC) approaches discussed below. We are able to tackle larger problems because of the simplified representation of the probability distributions.
3.2. Approximate Bayesian computation
Circumventing explicit evaluation of the likelihood by directly simulating from the datagenerating model(s), ABC methods [33–36] are increasingly being applied, especially in inferentially demanding contexts such as evolutionary and systems biology. They work by comparing simulated and observed/desired data and rejecting all parameter/model combinations that fail to provide a satisfactory description of the data. In their simplest form, they work as a simple rejection algorithm.

(i) Draw m* from the prior π(m).

(ii) Sample θ* from the prior π(θ_{m*}).

(iii) Simulate a candidate dataset D* ∼ f(D θ*, m*).

(iv) Compute the distance. If Δ(𝒟, D*) ≤ ε, accept (m*, θ*), otherwise reject it.

(v) Return to (i).
Here, Δ(x,y) can be any suitable distance measure, and ε > 0 is the tolerance level which is set by the user. Once a sample of N particles (parameter vectors) has been accepted, the approximate marginal posterior distribution is given by Here, we compare the complete data; in principle, we may want to choose to compare summary statistics, which can have advantages for very complicated data structures. But great care has to be invested into the choice of the best summary statistics if we want to perform model selection. This is because choice of insufficient statistics can lead to spurious results for the marginal posterior distribution as discussed by Robert et al. [37].
This simplicity of ABC algorithms is deceptive, as for practical applications, the simple rejection scheme is computationally too inefficient. A range of computational schemes based on MCMC, and especially SMC methods have been proposed and here we follow the algorithm introduced by Toni et al. [35] and Toni & Stumpf [38]:

A1 Initialize ε_{1}, … ,ε_{T}.
Set the population indicator t = 1.

A2.0 Set the particle indicator i = 1.

A2.1 If t = 1 sample (m**, θ**) from the prior distribution Pr_{0} (m, θ) = π (m, θ).
If t > 1 sample m* with probability Pr_{t−1}(m*) and draw m** ∼ KM_{t}(mm*).
Sample θ* from previous population {θ(m**)_{t−1}} with weights w_{t−1} and draw θ** ∼ KP_{t,m**}(θθ*).
If π(m**, θ**) = 0, return to A2.1.
Simulate a candidate dataset D* ∼ f(Dθ**, m**).
If Δ(𝒟, D*) > ε_{t}, return to A2.1.

A2.2 Set (m_{t}^{(i)}, θ_{t}^{(i)}) = (m**, θ**) and calculate the weight of the particle as where
1(Δ(𝒟, D_{b}^{*}) ≤ ε_{t}) denotes the indicator function (=1 if Δ(𝒟, D_{b}^{*}) ≤ ε_{t} and =0 otherwise), and
If i < N set i = i + 1, go to A2.1.

A3 Normalize the weights w_{t}.
Sum the particle weights to obtain marginal model probabilities,
If t < T, set t = t + 1, go to A2.0.
Here, KM and KP are model and parameter perturbation kernels, respectively; B_{t} ≥ 1 is the number of replicate simulations for a fixed particle.
In practical applications, the choices of ε_{t}, 1 ≤ t ≤ T, and the perturbation kernels, KM and KP, B_{t}, can matter a great deal. Obviously, choice of priors will also affect inferences; the distance metric Δ(x,y) should have negligible impact as ε_{T} → 0. Much of this is still an area of ongoing research, but the package ABCSysBio [39] offers a convenient implementation of the ABC SMC algorithm and can be used for model selection. The computational power provided by graphics processing units has enabled the application of these methods to problems of increasing complexity [40]. In the following section, we will illustrate the use of this framework in both inference and design settings.
4. Learning the structure of natural systems and successful designs
Here, we illustrate the uses of the ABC SMC and UKF approaches outlined above. We will start with parameter estimation and model selection in the ABC framework. Here, we will stress the conceptual similarities between inference and design.
4.1. Parameter estimation for dynamical systems
The repressilator [41] has become one of the canonical model systems for systems and synthetic biology. It is a simple circuit of three genes and their respective protein products, where protein 1, y_{1}, inhibits expression of gene 2, x_{2}; protein 2, y_{2}, inhibits the expression of gene 3, x_{3}; and protein 3, y_{3}, finally, inhibits x_{1}. In mathematical form, we write for the simplest case of shared kinetic parameters, 4.1where x_{i} and y_{i} denote the mRNA and protein concentrations corresponding to gene i. The index i cycles over the set {1,2,3}. The structure and mathematical details of this model are discussed elsewhere [35].
This model and its generalizations [42] have been studied extensively and it is also included among the examples of the ABCSysBio package [39], which implements ABC SMC for dynamical systems. In figure 3, we show the resulting posterior probability distributions obtained from simulated data, assuming flat priors for all four parameters. This figure shows the complexity of posterior parameter distributions, even for very small dynamical systems: posterior distributions show considerable levels of correlation and in some directions are still extending over large regions (marginals may even cover the whole prior range).
This is a typical finding for dynamical systems: Sethna and coworkers [43,44], and subsequently other authors [45,46], have found that the likelihood or posterior for the parameters of dynamical systems tend to be flat, even when unrealistically large data were used. Furthermore, correlation between parameters appears to be ubiquitous, which may put practical limits on the use of statistical approaches that assume independence between parameters. This is mainly owing to the fact that dynamical systems tend to be mechanistic and attempt to represent the underlying physical processes. This often means that they contain more parameters and only certain combinations are inferable.
The problem of the complexity of likelihood surfaces or posterior distributions of dynamical systems is further exacerbated by the fact that these also depend subtly on a host of dynamical features of such systems, such as the existence of bifurcations and initial conditions. While posteriors that cover large areas may seem disappointing, more recent studies, albeit confirming such findings, also clearly indicate that parameters which are hard to infer have only negligible impact on the system outputs. On the other hand, all parameters that, when varied slightly, yield very different system outputs, tend to be inferred easily. Following [44] such parameters (or parameter combinations) are referred to as ‘sloppy’ and ‘stiff’, respectively. Many of these problems are most naturally studied using the observed or expected Fisher information [47], also for stochastic dynamical systems [48].
4.2. Qualitative design with the unscented Kalman filter
As we have seen in the previous example, the parameter posterior, Pr(θ𝒟), for a dynamical system can extend over large areas in parameter space. Reporting such highdimensional and decidedly nonnormal probability distributions generates new challenges but is important given that point estimates carry only limited information. Here, we return to the parameter estimation problem. But rather than looking for quantitative agreement between data and a mathematical model, we seek to determine a set of parameters that gives rise to certain type of qualitative behaviour [32].
One way of specifying or encoding qualitative behaviour of a dynamical system is via its Lyapunov spectrum [49]. For a dynamical system which is the solution of an ordinary differential equation system we define the Lyapunov exponents as the rate at which initially arbitrarily close trajectories diverge or converge exponentially with time under the influence of the dynamical system's vector field. More formally, if δy(t) denotes the difference between two trajectories that at time t = 0 were separated by δ_{0} > 0, the Lyapunov exponents are the eigenvalues of the matrix Λ defined by 4.2
The leading eigenvalue, λ_{1}, determines to a large extent the dynamical behaviour of the solution. For λ_{1} < 0 all initially nearby solutions will converge and consequently the system will have a single stable equilibrium solution. For λ_{1} > 0 trajectories will diverge and the attracting state of the system will depend sensitively on the initial conditions; such dynamical behaviour is commonly referred to as ‘chaotic’. Finally, for λ_{1} = 0, trajectories will neither converge nor diverge and the system's attractor will have an oscillatory structure. While Lyapunov exponents are costly to calculate in practice, they do capture the qualitative dynamical behaviour comprehensively and are therefore a useful feature to consider when designing dynamical systems, or specifying their qualitative behaviour.
The UKF now allows us to identify parameter sets that give rise to certain types of dynamical behaviour, if we define the state of the system in terms of the leading Lyapunov exponent (or the entire spectrum) that we desire the system to exhibit. That is, we specify the set of Lyapunov exponents, λ_{i}, and, starting from an arbitrary starting point, θ_{0}, we iterate through the UKF procedure until the procedure delivers a set of parameters that are in concordance with the desired system output (characterized by the specified values of λ_{i}).
In figure 4, we illustrate this in two contexts. First in figure 4a, we infer parameters for the Hénon attractor, one of the canonical models used to study and illustrate the effects of chaotic dynamics for discrete timedynamical systems.
H Hénon map: 4.3where a and b are free parameters.
Here, we start from a parameter choice for which the system exhibits oscillations and drives it towards values for which we recover the classical attractor structure. In figure 4b, we revisit the repressilator again, given by equations (4.1). Again we start from a random vector which lies in the region where the system's attractor occupies a stable equilibrium point and drives it into an oscillatory regime by specifying λ_{1} = 0. For the repressilator, this tends to happen very quickly.
Both examples serve only as illustrations of the use of the UKF and qualitative considerations in the design of dynamical systems. We find that this approach is particularly well suited if the correct qualitative behaviour is paramount. This may, however, also happen in inference: simple optimization approaches can get stuck in regions with the qualitatively wrong behaviour (for example, when a straight line is chosen to approximate an oscillatory solution). By specifying the correct qualitative target behaviour via the (leading) Lyapunov exponent(s), we can guide the solution towards the correct behaviour. This may, for example, aid in eliciting suitable priors for further quantitative inference tasks (e.g. using ABC or exact Bayesian approaches). Overall qualitative inference has received very little attention thus far [50–52], but the combination of suitable statistical tools coupled to appropriate encoding of qualitative features holds a lot of promise both for the inference and the design setting. The types of qualitative behaviour we may want to consider are summarized together with appropriate criteria in table 1.
4.3. Model choice and design using approximate Bayesian computation
In a design setting, we often have a choice of available designs with a wide range of potential parameters which would generate system dynamics that fulfil our specifications. In either case, we can marginalize over the parameter space in order to obtain posterior probabilities, 4.4
In design applications, 𝒟 can be any set of specified system behaviours. The approach outlined above allows us to compare the probability of different models to achieve the design objectives. Here, a probabilistic approach is beneficial if the system dynamics exhibits elements of intrinsic or extrinsic stochastic behaviour.
4.3.1. Classic oscillators
As a simple illustration, in figure 5 we compare the ability of three dynamical systems to exhibit certain types of oscillatory behaviour. We compare three mathematical models describing biological oscillators, which had been considered previously [53].
M_{1} Meyer–Stryer oscillator: 4.5where c_{4} is free and the other parameters are fixed at c_{1} = 6.64, c_{2} = 5, c_{3} = 0.0000313, c_{5} = 2, c_{6} = 0.5, c_{7} = 0.6, K_{1} = 0.1, K_{2} = 0.15, K_{3} = 1 and the initial conditions are x_{0} = 0.1, y_{0} = 0.01, z_{0} = 25.
M_{2} Van der Pol oscillator: 4.6where μ is free, x_{0} = 0.1 and y_{0} = 0.
M_{3} FitzHugh–Nagumo oscillator: 4.7where ϕ is free, θ = 0.2, ω = 0.112, γ = 2.5, ν_{0} = 0 and w_{0} = 0.
All of these systems can exhibit oscillatory behaviour and have been proposed in different contexts as models of realworld biologically oscillating systems.
We illustrate the design approach by imposing two different sets of objectives. First, we demand that the oscillations occur at a frequency of 0.02 Hz (within the region 0.0188–0.0298) and that the amplitude of x(t) exceeds 0.1 (figure 5a). This is achieved by simulating systems for t = 200 s and imposing the following design constraints: 4.8where n is an integer, f_{t} is the target frequency (=0.02), f is the frequency determined from the largest component of the Fourier spectrum, A_{t} is the target amplitude (=0.1), T is the target period and t_{0} is a cut to remove initial transients (=50 s). These constraints are applied to the first species in all three models although the target species can change the posterior distribution significantly [21]. Second, we require that only the system is able to undergo a Hopf bifurcation [54] (figure 5b); a distance function which allows us to specify this type of behaviour is given by 4.9where the λ_{i} are the eigenvalues of the linearized systems around the steady state [55]. This distance function decreases as the real part of a pair of complex conjugate eigenvalues moves from − inf to 0 giving rise to purely imaginary eigenvalues and limitcycle oscillations. Here, we used ε = 0.001 as the final threshold.
As we can see in figure 5, the three models show some differences in their respective abilities to produce the desired behaviour. The Van der Pol oscillator, for example, appears less able to deliver the first design objectives than the other two models, which are roughly equally proficient in the first case but start to differ considerably under the second, lessrestrictive design objective. Here, we find that the simple FitzHugh–Nagumo oscillator exhibits Hopf bifurcation more reliably than the other two models. In addition to the marginal model probabilities, we also, of course, obtain estimates for the parameter regions that are commensurate with the desired behaviour, which allows us to assess, for example, biophysical constraints that need to be fulfilled by real systems.
Depending on the desired system behaviour, the balance may tip in favour of any of these oscillators. But as we see in figure 5a, sometimes a tie between different models may mean that we have to defer to other factors to choose the best model. In the case of a choice between the Meyer–Stryer and FitzHugh–Nagumo oscillators, for example, we may prefer the simpler model because of the potentially higher ‘cost’ of implementing the threespecies model. Including additional factors into the cost function is straightforward in Bayesian decision problems, and is also possible (and desirable) in the design problem as outlined here.
4.3.2. Synthetic in vitro transcriptional oscillators
An example of how this approach can be applied to synthetic biological design is demonstrated by considering three in vitro synthetic transcriptional oscillators previously constructed using DNA–RNA circuits [56].
D_{I} a twoswitch negative feedback oscillator: 4.10where α, β, γ, n, m, x_{0}, y_{0}, u_{0} and v_{0} are free parameters.
D_{II} an amplified negative feedback oscillator: 4.11where α, β, γ, δ, n, m_{1}, m_{2}, x_{0}, y_{0}, u_{0}, v_{0} and w_{0} are free parameters.
D_{III} a threeswitch ring oscillator: 4.12where α_{i}, β_{i}, n_{i} and x_{i}(0), i ∈ {1,2,3}, are free parameters.
All three systems were constructed with the twoswitch negative feedback oscillator demonstrating the most robust oscillations. While the authors eventually used more complex models to achieve agreement with experimental data, these simple models were used to understand the principles of operation.
Our approach in this example was to first use the UKF to determine regions of parameter space giving rise to oscillatory behaviour (λ_{1} = 0). The UKF was run 200 times with parameters sampled over the region [0,100] for α, β, γ, [0,25] for m, n and initial conditions sampled over the region [0,5]^{4} using latin hypercube sampling. Examples of the resultant distributions for the twoswitch negative feedback oscillator (D_{I}) are given in figure 6. The ranges from the UKF parameter distributions were then used as priors for an ABC model selection analysis with the Hopf bifurcation criterion (equation (4.9)) as our desired design objective which gives rise to limitcycle oscillatory solutions (though the upper bound on Hill coefficients was set to 10). Figure 7 illustrates the results of this analysis, which show that the twoswitch negative feedback oscillator clearly outperforms the other two for these design objectives. This demonstrates the balance between model complexity (number of parameters) and the achievement of the design objectives which will be discussed below. Figure 8 shows the ABC posterior for the twoswitch negative feedback oscillator. The desired behaviour is most sensitive to the Hill coefficients m,n and least sensitive to α.
5. Inference and robust behaviour
In the earliermentioned examples, the focus has been on evaluating the posterior distributions, Pr(θ𝒟) or Pr(m,θ_{m}𝒟). This incurs considerable computational costs and, given that we have to make some approximations in both approaches presented here, we may want to consider the underlying rationale.
Optimization methods locate the relevant extrema in some objective or cost function. Thus, for a single model they provide us with the parameter value that, for example, maximizes the likelihood of observing the given data. Comparing different models in an optimization framework is not always straightforward as it requires a number of additional heuristics and assumptions; information criteria [57] such as the Akaike information criterion, Bayesian information criterion or deviance information criterion can be used to compare the explanatory power of different models in a way that penalizes against overly complicated models. But such post hoc measures can be problematic, and perhaps more worryingly, optimizationbased approaches typically lack appropriate model selection mechanisms entirely.
The advantages of the computationally much more costly inferential frameworks advocated here are easily illustrated in figure 9. We assume that we have data (circles in the inset to the figure) and two competing models for which we have obtained posteriors for the single parameter describing both models. The maximal posterior estimate of the ‘grey’ model is higher than that of the ‘black’ model; simulating from this model for the maximum parameter, , yields a curve (solid grey line) which perfectly fits the data for the grey model. However, perturbing the parameter slightly by a small amount, δ, rapidly deteriorates the fit between model and data (dashed grey line). For the black model, we observe that the maximum a posteriori estimate results in a good but not perfect fit (solid black line); adding a small perturbation to a value results now, however, in a less severely compromised fit (dashed black curve). In evolutionary biology, robustness has sometimes been referred to as ‘survival of the fattest’. Thus, height of the posterior probability curve, or optimality of a cost function in other words, might suggest that the grey model is to be preferred, while robustness to variation in the parameter would tip the balance in favour of the black model. In fact, in a Bayesian framework, the marginal likelihood gives an objective criterion for model selection. This naturally strikes a balance between model performance and robustness (and, of course, model complexity is implicitly dealt with in this framework).
6. Discussion
In both inference and design, we seek to balance model fit/performance with robustness. Many engineering approaches have been devised in settings where stochastic effects are ignored, whereas randomness is inherent to the statistical perspective. Hence, some elegant approaches that are applicable in deterministic modelling contexts [58] are not readily applicable to processes where intrinsic or extrinsic noise affects the dynamics and behaviour of a system. In such contexts, we may prefer the computational burden incurred by the approaches described above (linear in the number of parameters for the UKF and approximately exponential for ABC SMC) over the potential pitfalls that optimization methods (e.g. [59,60]) can encounter when confronted with noisy and stochastic dynamics.
From our point of view, the principal difference between inference and design is whether we seek to elucidate the structure that has given rise to some observed data, or the structure that will give rise to the type of output which we would like to observe. This change in perspective allows us to apply the whole apparatus (and, in our view, the inherent advantages) of Bayesian inference to design problems. If we are furthermore, as envisaged in synthetic biology, dealing with libraries of standardized and wellcharacterized basic parts or components, then we can begin to construct integrated design platforms that explore the available model space (in terms of the available components) and search for solutions that meet the design objectives. Table 1 shows some possible design objectives that can be achieved using our design framework.
Here, a number of challenges need to be faced: unlike electronic systems, biological systems are subject to (specific and unspecific) crosstalk that is mediated by molecules involved in many different processes (e.g. metabolites such as ATP or NADPH); the search space can be vast and efficient heuristics will almost certainly be required to traverse such spaces efficiently; and the correct encoding of design objectives is not always clear a priori. Some of these potential difficulties, of course, mirror many of the problems also encountered in statistical inference and systems biology, or the inverse problem field more generally. The computational approaches developed and illustrated here make best use of the advantages of the Bayesian inferential framework: while we have not focused on this here, they can incorporate prior knowledge and biophysical constraints in their respective prior specifications; they naturally strike a balance between model robustness and performance (or fit to data); and they ensure that parsimony is obeyed as much as possible. In summary, we feel that the Bayesian perspective on design offers a considerate and disciplined way of approaching exciting problems in synthetic biology.
Footnotes
One contribution of 9 to a Theme Issue ‘Inference in complex systems’.
 Received June 15, 2011.
 Accepted September 12, 2011.
 This journal is © 2011 The Royal Society