Mathematical models are increasingly used in life sciences. However, contrary to other disciplines, biological models are typically over-parametrized and loosely constrained by scarce experimental data and prior knowledge. Recent efforts on analysis of complex models have focused on isolated aspects without considering an integrated approach—ranging from model building to derivation of predictive experiments and refutation or validation of robust model behaviours. Here, we develop such an integrative workflow, a sequence of actions expanding upon current efforts with the purpose of setting the stage for a methodology facilitating an extraction of core behaviours and competing mechanistic hypothesis residing within underdetermined models. To this end, we make use of optimization search algorithms, statistical (machine-learning) classification techniques and cluster-based analysis of the state variables' dynamics and their corresponding parameter sets. We apply the workflow to a mathematical model of fat accumulation in the arterial wall (atherogenesis), a complex phenomena with limited quantitative understanding, thus leading to a model plagued with inherent uncertainty. We find that the mathematical atherogenesis model can still be understood in terms of a few key behaviours despite the large number of parameters. This result enabled us to derive distinct mechanistic predictions from the model despite the lack of confidence in the model parameters. We conclude that building integrative workflows enable investigators to embrace modelling of complex biological processes despite uncertainty in parameters.
The notion that to understand a phenomenon in nature requires a mathematical model originates from ancient thinkers, including Aristotle and Plato, but not realized until the work of Isaac Newton in the seventeenth century. This mode of thinking has been instrumental for the success of physics whereas the corresponding mathematical formalization of the life sciences only began during the second half of the twentieth century. This divide is in part due to the major difference between physics and biology in our limited understanding of biological processes from first principles. Consequently, models capturing events within and between cells are therefore by necessity plagued with uncertainty with respect to model structure and parameter values. However, the rise of mathematical modelling within the life sciences during the last two decades has largely focused on formulating models and their analytical or numerical solution. The bearing of mathematical modelling for medicine and biology has not reached its full potential chiefly because of the difficulty to have confidence in models due to their inherent uncertainty. Such model ambiguity is as a rule reflected in a large number of parameters not constrained by experimental data. Hence, there is an urgent need to develop efficient workflows on how to analyse such over-parametrized models to develop predictive mechanistic hypotheses. This is a central challenge to address in order to increase the confidence in mathematical modelling within the life sciences.
In recent years, several approaches have been developed in order to standardize the analysis of such models and the generation of useful predictions. A few groups have developed an ensemble approach towards the analysis of over-parametrized models under (comparably) scarce experimental data. Battogtokh et al. , in the context of chemical reaction networks, proposed a methodology based on a statistical ensemble of ‘all’ parameter sets consistent with existing profiling data (feasible parameter sets, FPSs). Their methodology enabled a classification of parameters either as ‘well’ or ‘poorly’ specified and thereby allowed the investigator to identify the experiments which maximally reduce the parameter uncertainty. In parallel, Brown & Sethna  developed a statistical ensemble approach to analyse the collective properties of sloppy models, i.e. models with (i) poorly known parameters, (ii) simplified dynamics, and (iii) uncertain connectivity. Soft and stiff directions were identified by examining the sensitivity of model behaviour to parameter changes. Recently, Gutenkunst et al.  discovered that most biological models are characterized by a ‘sloppy’ spectrum of parameter sensitivities. In these ensemble approaches, the parameter space (PS) was explored to find FPSs by combination of random walks and gradient optimization methods. However, only Battogtokh et al.  went beyond an optimization analysis and derived specific predictions about the behaviour. Nevertheless, a parameter uncertainty analysis needs to be linked with hypotheses generation by a supervised analysis or by unsupervised exploration. Model checking (Mc) is an example of a supervised approach where the model is interrogated to formally verify a system's behaviour with respect to a set of properties . An inherent limitation of an Mc analysis is that the questions need to be predefined by the researcher [5,6]. A supervised approach considers the analysis of the observed behaviours of the model. A recent framework surpasses this limitation , and Cedersund et al.  consider an automated, non-supervised approach that searches for ‘core behaviours’, defined as hypotheses that cannot be rejected by any FPS. A clever combination of experiments, model-based data analyses and theoretical predictions allows the formulation of those hypotheses. From our point of view, this approach presents the first prediction-oriented model-based research for over-parametrized models.
Nevertheless, parameter uncertainty is only part of the problem and model uncertainty must also be considered. Kuepfer et al.  extend the ‘ensemble framework’ from parameters to models by developing a library of mechanistically alternative dynamic models to unravel key operating principles in the target of the rapamycin pathway of Saccharomyces cerevisiae. Their analysis considers a core model and several ‘local’ expansions around the core model. Then experimental data are used in each model to (i) solve the parameter fitting problem and (ii) to rank the different models based on their prediction power. Interestingly, this approach considers unique parameter sets for each model considered; thus omitting the ‘parameter ensemble’ idea. A study of possible parameter values can provide information on which mechanisms may, must and cannot be significantly active during a specific time-scale [8,10]. This approach can be complemented by using formal methods for reducing the complexity of models such as those shown in [9,11,12]. However, the application of such approaches has so far been limited to small models and not to models with hundreds of state variables (SVs) and parameters .
Taken together all these efforts bring us to the idea and necessity of developing a workflow which enables the generation of competing biological hypotheses through a modelling approach which captures a complex biological process. We propose a workflow applying to a case study that shows how a large and complex model can be reduced and simplified to extract the most biologically relevant information given the scarcity of biological data available. We make use of the parameter ensemble approach but we extend it by combining it together with a behaviour ensemble approach. Our workflow therefore integrates and extends the methods which are available for model analysis and behaviour extraction. We make use of optimization procedures, statistical (machine-learning) techniques and cluster-based analysis of the SVs' dynamics and the parameter sets. As a case study, we apply the methodology to a time-dependent systems biology mark-up language (SBML) encoded disease model that captures the initial plaque growth over time in the arterial wall (atherogenesis), hence referred to as the A model. Our A model is encoded in an SBML format ; it will be publicly available and its identifier is MODEL1002160000 at http://www.ebi.ac.uk/biomodels-main. This workflow has also been applied in computational neuroscience as a tool (i) to answer a specific question in selective attention by exploring the feasible PS ; and (ii) to find different qualitative behaviours  in a working memory model .
Section 1 of the paper presents an overview of the work-flow. The subsequent sections develop the model analysis in detail where each step of the workflow is illustrated by the analysis of the A model. The final section presents the conclusions regarding the utility and possible extensions of the workflow.
2. Workflow: a tool configuring model analysis and hypotheses generation
The notion of workflow is used here as an abstracted sequence of operations aiming to extract hypotheses from over-parametrized models having large number of under-determined parameters relative to the available prior knowledge and experimental data.
Generically, we consider a model M, which is defined by a set of parameters p; each parameter p ∈ p is bounded inside a range. Within PS, we consider a binary classification of the parameter values as feasible and not-feasible, where ps ∈ PS is feasible if its associated behaviour agrees with the experimental data. FPS denotes the set of feasible parameter sets. However, different facets of ps ∈ PS can be studied such as ‘parameter values’; secondly, ‘behaviour-of-state-variables’ i.e. the trajectory of the model; and finally the ‘sensitivity characteristics’. The study of parameter values can be focused on elucidating the mathematical characteristics of FPS (i.e. is it convex? see [18,19]), whereas the sensitivity analysis can identify the sloppiness and stiffness directions in FPS (see ). The study of model behaviours has only recently been addressed by introducing the concept of ‘core prediction’ defined as ‘uniquely identified properties in unidentifiable models’ . Here, our workflow focus on the identification and collection of competing distinct behaviours of the model, thereby enumerating different but yet possible mechanistic hypotheses relative to a model and data. In contrast, ‘behaviour core predictions’ represent the intersection of all competing hypotheses. Our setting is ordinary differential equation (ODE) models, but we hypothesize that our workflow includes necessary parts for other model classes as well. The workflow consists of a set of sequential steps to be satisfied, hence to initiate step n, all previous ought to be accomplished, while still allowing for backtracking, i.e. an iterative workflow as illustrated in the flowchart (figure 1). For each step within the workflow we discuss heuristic insights including useful tools.
Once a model has been formulated the second step in the workflow targets the parameter estimation problem using optimization techniques. The outcome of this analysis is a filtered sample of FPS (fsFPS) and their related behaviours. Both these sets are used in step III to analyse the different aspects of the PS (such as parameter values and behaviours) and to generate hypotheses regarding the dynamics of the system. The fourth step, identifies efficient experimental manipulations to distinguish and validate the competing hypotheses.
There are several backtracking steps in the workflow such as revisiting step II whenever there is evidence that the cardinality of fsFPS is too small. In addition, the investigator needs to return to step 1 in order to validate the hypotheses by experimental designs and to use them in the considerations of putative modification/enlargement/correction of the model.
3. Step i. formulation of a computational model
To develop a model, it is necessary to consider (i) the biological system under consideration, (ii) the biological question to be addressed, (iii) experimental data available, and (iv) the modelling formalism (ODE, PDE, or Bayesian modelling). A detailed discussion of this mixture between science and art is out of our scope . For our purposes it suffices to employ the middle-way out modelling approach as initially suggested by Sydney Brenner et al.  and subscribing to the interpretation of Dennis Noble  ‘My interpretation of the “middle-out” approach is that you start calculating at the level at which you have the relevant data’.
However, we observe that (i) the number of parameters to be incorporated and biological processes to be selected need to take into account the trade-off between the model complexity and the experimental data available. It is helpful to enumerate all biological processes relevant in the biological system of interest and prioritize them by their relevance to the biological question under consideration. This allows the modeller to consider which elements are more relevant. (ii) The representation of the biological processes is a key part; many researchers consider the use of simplified dynamics , but this can decrease the ability of a model to explain experimental data. Even if in this paper we focus on parameter uncertainty, model uncertainty should also ideally be taken into account in the process of model formulation .
For the remainder of the paper we consider an ODE system as follows: 3.1and 3.2where ps denotes a parameter set, x0 denotes the initial conditions and u denotes the external input to the system.
3.1. Case study: a model of the atherogenesis
The term ‘atherosclerosis’ was introduced by Marchand to describe the association of fatty degeneration and vessel stiffening . Atherosclerosis is characterized by the accumulation of lipids and fibrous elements in the large arteries. During the initial phase—atherogenesis—there is a sub-endothelial accumulation of cholesterol-engorged macrophages, called ‘foam cells’. Atherosclerosis is considered a progressive disease where heart attack or stroke can be induced by thrombosis . Biological experiments have revealed a critical period where the disease accelerates rapidly . Moreover, these experiments suggested that a reduction of lipid concentration in the blood stream is effective in reducing the disease progression, provided that such lipid reduction is delivered prior to the onset of the disease.
The current understanding of atherogenesis highlights two related key processes. On one hand, the oxidation of low-density lipoprotein (LDL) at the intima followed by the formation of foam cells, their necrosis and posterior plaque formation. On the other hand, the immune system plays a role through the action of T-cells and B-cells. We designed a mechanistic model, A model, that describes the dynamics of atherogenesis, i.e. the initial development of the plaque, before any flow dynamics or plaque instability start posing any risk for the patient. Yet, our precise mechanistic understanding how these processes are interconnected is still in its infancy thus providing us with a realistic case of uncertainty while we have a solid description of the model outcome, i.e. the plaque progression over time.
The A model is defined by eight ODEs including eight SVs and 53 free parameters to characterize their inputs and interactions. SVs describe the activation of macrophages (MP), B-cells (Bcells), T-inflammatory cells (TINF) and T-inhibitory (TINH) cells; the size of the plaque and the concentrations of oxidated LDL (oxLDL), LDL and high-density lipoprotein (HDL). The model is available in C ++ and in SBML (code MODEL1002160000). A complete study of this model is out of the scope of this paper and will therefore be analysed in detail in a separate paper. However, a more detailed description of the A model (including parameter ranges) can be found in the electronic supplementary material part 1 and the equations are displayed in part 2. Figure 2 shows SVs and biological processes and summarizes the dynamics of the SVs. Note that x0 = 0, the initial point of the behaviour in all cases and for all SVs. The unique external input to the system is the level of LDL in the blood set to 0.
4. Step ii. collecting feasible sets of parameters
Once a mathematical model is formulated or given, it is necessary to perform an unbiased discovery of those sets of parameters that are ‘feasible’ relative to the experimental data, prior knowledge and research question. To this end, we need to consider the following three components in a workflow: (§4.1) how to formalize the evaluation of a model, (§4.2) how to search through ‘relevant’ parts of the PS and (§4.3) rejection of certain ‘feasible’ sets of parameters, denoted here as introducing a filtering procedure.
4.1. Evaluation of a parameter set
To discover FPS, it is necessary to introduce mathematical fitness functions formalizing the notion of ‘how feasible a given set of parameters is’. If parameter values are within a pre-defined range, the feasibility of a parameter set p is defined by the behaviours (beh(p)) of the SVs. On one hand, beh(p) are expected to agree with available experimental data; on the other hand, beh(p) are expected to fulfil some expert knowledge criteria (see non-least-squares terms from ‘fuzzy data’ in ).
The evaluation of a parameter set can therefore be expressed mathematically as 4.1where each fi denotes a function to be evaluated and αi denotes its relative weight. The mathematical instantiation of the functions depends on the available experimental data and expert knowledge. The ensemble approach proposed by Battogtokh et al.  used time-dependent profiling experiments in which concentrations were measured. At each experiment controlled and quantitatively known experimental conditions were altered. The eval function was constructed as a probability function that considered ‘the probability of observing beh(p) given the experimental data’. In contrast, Brown & Sethna  considered two terms, the first term considered an L2 distance between the observed behaviour and the average experimental behaviour. The second term penalized the violation of inequalities or other general nonlinear terms that the model must fulfil. A second penalty term can also be used as a regularization method adding an extra term to the cost function, which penalizes deviations of the parameters from some given nominal values . Regularization adjusts the flexibility of the model . In the above formulations, the evaluation function is supposed to be minimized. However it is possible that a parameter set p can be considered qualitatively accurate if the evaluation function reaches certain threshold. This has been used in Brannmakr et al. , where the parameter is explored to obtain sets with a cost below a threshold. Brannmark et al.  claim that they are ‘looking for model properties that are shared among all these parameter values’. In summary, there are a number of choices such as the relative weights, nature of fitness functions, and the interpretation of the sum. These options determine the theoretical upper limit of ‘feasible’ sets of parameters. The jury is still out on how these choices should be integrated in a specific application. In our hands, we find a promiscuous approach useful in the sense that we use equal weights and not too tight fitness criteria. The tools for the analysis of the solutions (§5) will take care of the fact that the number of solutions identified in practice will be larger using a promiscuous approach.
All methods previously mentioned are based on the definition of a priori feasible ranges for each parameter. However, in many cases that information is not available. In these cases, it is possible to focus the analysis on the study of the dynamics (such as in the case study presented) or to perform preliminary experimental/computational experiments with the objective to define an initial PS.
4.2. Search algorithms
The PS must be explored numerically by making use of the evaluation functions. If the evaluation function is based on an L2 norm, distance gradient methods (such as Levenberg–Marquardt) combined with a Monte-Carlo (MC) search are efficient for the exploration of PS [1,2]. Gradient methods will find local minima that will depend on the initial value, thus arguing for combining them with an MC search and thereby we can expect a reasonable exploration of the system. However, this combination is not efficient enough when the evaluation function deviates significantly from L2, or when the computational effort needed to evaluate a given parameter set is prohibitively large and the number of simulations must therefore be reduced.
In either case black-box meta-heuristics  are useful, since this class of methods search through the PS in an unbiased manner for those parameter sets that optimize the evaluation functions. Some of the most widely used and robust methods include particle swarm optimization (PSO), Genetic Algorithm, Simulated Annealing and Scatter Search. Yet, the no-free lunch theorem states , that there is no best optimization method for all problems. Nevertheless, there are more flexible methods and less flexible methods. One possible disadvantage of these methods is that an incorrect fine tuning for a specific problem will clearly affect the quality (error) and number of identified FPS. This is an area of active research targeting the development of automatic fine-tuning methods .
The problem of parameter estimation of nonlinear dynamic biochemical pathways by heuristic algorithms is not new . There are recent studies on the shape of the set of the FPS thus providing new insights into the development of search algorithms using the geometry of the problem . Interestingly, note that the problem under consideration is multi-objective as it is defined by a weighted sum of several evaluation functions such as the weights αi in equation (4.1) which will affect the search behaviour in the PS.
Regardless of the selected searching procedure the output of this step will be a sampling of the feasible set of parameters (sFPS). In order to assume independence between sets in sFPS only the best solution found in each optimization run is considered for further analysis. If there is a tie between two solutions one solution is selected randomly. A different option than sampling the FPS is to mathematically define the regions of feasible solutions, that is define the boundaries of FPS. An interval analysis and constraint propagation approach was proposed by Tucker et al.  to identify FPS in systems of ODE. Kuepfer et al.  formulated the identification of the FPS as a feasibility problem and computationally formulated the task as a relaxation to a semi-definite program. Recently, Hasenauer et al.  extended this idea to identify FPS in discrete time models of biochemical reaction networks from time-series data.
The combination of both policies such as reducing the PS by set-based approaches and optimization methods to find feasible sets within the reduced PS, speeds up the process of generating sFPS.
The initial sFPS contains parameter sets that, despite a low eval, should be rejected. The rational is the following: first, since eval is the weighted combination of several functions (equation (4.1)), a parameter set will finally be confirmed as feasible only if all its individual evaluations fi are below a threshold level. Secondly, we can reject a parameter set based on a residual analysis . The residual is the difference between the observed average behaviour (obs) and the predicted simulated behaviour (beh), i.e. for each data point i for a given parameter set ps we have 4.2
Here, two sufficient criteria for rejection have been identified. Firstly, provided that the sum of the residuals is large then eval will reject ps. Secondly when most of the residuals mimics their neighbours (e.g. if most simulations lie on the same side of the experimental data) then ps ought to be rejected as it does not explain the data correctly .
In some studies experimental data are split into two sets (see ). The first set is used to find the parameter values and the second set is used to validate them. Those parameter values that fail to predict the second experimental dataset are rejected. This procedure avoids parameter over-fitting. However in practice the number of data points is typically too small for dividing the dataset.
We denote the filtered sFPS as fsFPS. We will consider from now on the case when the cardinality of fsFPS is sufficiently large to allow statistical analysis.
4.4. Case study: feasible sets of parameters for the atherogenesis model
Here we define an evaluation function, we search the PS to collect all sFPS followed by filtering and thereby obtaining fsFPS. The experimental data on plaque development  were transformed into a parametrized sigmoid function (see the electronic supplementary material, part 3). We sampled 60 time-points and a regularization was considered by defining the final plaque size as 1.
The eval was a weighted linear combination of seven fitness functions using the behaviours during two different conditions: the high LDL (where plaque formation is expected), and a low LDL regime (where no plaque is expected) [35,36]. For additional details on the fitness functions (see the electronic supplementary material, part 3).
To sample FPS, we used the PSO algorithm. PSO  is an adaptive algorithm based on a social environment where a population of particles visit different ‘positions’ of a given volume in a space. Particles move stochastically towards the population's best fitness position (social knowledge) and the particle's own previous best fitness position (cognitive knowledge or self-knowledge). In this manner the particles of the population share information of the best volumes to search. We selected PSO because it is specifically tailored for real-value optimization problems where a generic ‘black-box’ evaluation function can be defined. We ran the PSO algorithm several thousands of times and selected the best evaluated parameter set of each run. Afterwards we filter this set by setting a quality threshold for each one of the fitness functions. Over 300 parameter sets were finally selected (fsFPS), representing less than 21 per cent of the total number of identified solutions (sFPS).
5. Step iii. extraction of robust mechanistic hypotheses from the parameter sets
Here, we aim to identify distinct mechanistic hypotheses residing within an over-parametrized model constrained by available experimental data. To this end, the set of fsFPS will be inspected. In particular we will consider the analysis of fsFPS with respect to parameter values (§5.1), behaviours in state-space (§5.2), identification of model behaviours under perturbations of biological interest (§5.3), and §5.4 deals with the extraction of robust competing hypotheses. The first three of these topics can be discussed for different subsets of fsFPS (a single parameter set or an interesting subset of fsFPS), i.e. a region within the PS, or for the complete volume of parameter sets, i.e. all fsFPS. In all analyses we consider the model reduction problem either based on parameter reduction of behaviour reduction.
5.1. Analysis of parameter sets
A large body of work on local sensitivity analysis [38–40] has demonstrated the sensitivity of biological model to individual parameters. However, even if recent efforts have been devoted to generate a general theory of non-equilibrium networks , still most of the analyses concern models described by a unique parameter set. Below we consider the inclusion of perturbation experiments (both robustness and sensitivity based) in the workflow in order to highlight common properties among FPS relevant in the biological processes. We and others [41,42] and Brannmark et al.  suggest instead a study of the ensemble of possible solutions which therefore leads us to ask different questions such as whether there is any structure or statistical dependencies within the fsFPS. Gutenkunst et al.  developed an analysis of the subset of FPS defined as the best fitted parameter set (bfps) and then, they explored the surrounding PS by random walks. Hence, the analysis was limited to a localized region in the PS. Here the analysis is constrained to the neighbourhood of a single FPS (bfps). Repeating the analysis for all bfps generates a distribution centred around bfps. The standard deviation of the distribution provides an estimate of the sensitivity of a parameter. Additional information can be obtained by analysing sloppy and stiff directions in the Hessian matrix around the localized region .
To study the FPS by a sole statistical exploration of all points in fsFPS for a given parameter p will return a distribution that spreads over most of the original allowed range, thus limiting its practical use. Correlation and two-dimensional projections of pairs of parameters can be informative  but limited to the non-linearities inherent to ODE models. However, it is relevant to observe by simple measures how informative a parameter can be. To this end, we aim to classify parameters as high spread (HS) or low spread (LS) by calculating spread (p) 5.1where max and min denote, respectively, the maximum and minimum value of parameter p in fsFPS and ub and lb denote, respectively, the upper bound and the lower bound in the original allowed range. Thus, spread(p) is the ratio between the difference of the maximum and minimum value of p in fsFPS and the difference of the maximum and minimum value of p allowed. An LS parameter can be considered, after certain validations  as a ‘core prediction’. We hypothesize that (i) parameters in HS parameters could either be non-relevant (NR) to the dynamics of the model, or could be correlated with other variables that would lead to a compensatory mechanism (CM) similar to those described in Hengl et al. . (ii) Some LS parameters can be set to a constant value (CV) and they will be ‘core predictions’ as defined in Brannmark et al. . Running new simulations with modified parameter values and comparing against the original beh(fsFPS) allows the classification of parameters as NR, CM or CV, where NR parameters may be omitted, CV parameters may be replaced with a CV and CM parameters may be merged (lumped) without affecting model behaviour. The final outcome of this process is a reduced model that strengthens the interactions between those elements that are governing the dynamics.
The progress on the problem of analysing the structure of FPS is still limited. The reason is that asking topological questions such as the nature of the connectivity or convexity within the PS are hard . Simple tools such as clustering have proved to be useful for elucidating the structure of FPS, such as clustering fsFPS into different sets by k-means, or hierarchical clustering using a Euclidian distance or using clustering measures (such as Silhouette  and Davies–Bouldin ) to select a final number of clusters. Principal component analysis  can be deployed to evaluate which parameters are responsible for the characterization of FPS and to observe the spatial ordering of the clusters in two-dimensional and three-dimensional plots. An interesting extension would be to analyse the sloppy and stiff directions  in different regions of FPS to evaluate if they share common characteristics. Note that defining the boundaries of FPS by set-based approaches such as  will provide detailed analysis of FPS, however those tools are not always applicable for any model.
5.2. Analysis of the behaviours in state-space
Although parameter analysis is useful, the lack of sufficient experimental data limits the quality of information that can be obtained. This is particularly valid for those studies that focus on a single parameter solution to a model. In such cases, a separate study of the dynamics of the SVs can provide additional information. Moreover, it turns out that a systematic analysis of the behaviours in state-space is very informative also when a number of parameter solutions have been collected using an ensemble approach. Here we hypothesize that even if the cardinality of fsFPS is large, the cardinality of the set of their associated behaviours (denoted by beh(fsFPS)), will be considerably smaller because several behaviours will be ‘similar’. This observation underlies the fact that it is possible to find a reduced number of mechanistic hypotheses despite the uncertainty in parameter values. To this end, we therefore consider the reduction of the model complexity in order to identify the ‘key dynamics’ of the model.
5.2.1. Reduction of model complexity
A large body of work has focused on how to reduce the number of SVs and parameters in a dynamical model . For example, lumping of an ODE model reduces the system to a set of differential equations with a smaller set of new SV. Some of the new SVs correspond to a combination of several original ones . However, even if new methods are being developed for automatic lumping , this is still an active research field. Models with large number of parameter sets and low number of SVs can use analytical techniques to explore patterns in the dynamics which arise due to timescale separation in the system. In particular, a method described in Härdin et al.  surveys the behaviour of the system along the so-called slow invariant manifolds, i.e. curves or surfaces in the phase space towards which the behaviours in the system are attracted. This analysis provides the investigator with a simplified picture of the global behaviour of the system dynamics, i.e. behaviour common for a wide range of initial conditions.
5.2.2. Reduction of model complexity by key behaviours
With an analysis of the PS and the behaviours of the SVs as described above we can by reduction begin to dissect the essential dynamics of the model. This set the stage for identifying the key different behaviours of the system. Battogtokh et al.  proposed how to analyse the behaviours for SVs. Cedersund et al.  and Brannmark et al.  developed techniques enabling the identification of core predictions in a given model. However, to the best of our knowledge no previous work has proposed an algorithm for the automatic identification of all qualitatively different behaviours in FPS.
To this end, we have developed the following pipeline:
— Phase 1. As we are interested in the behaviours and not in the actual values, behaviours of every SV are normalized for every parameter set by dividing by their maximum value. Then all behaviours are 0-1 normalized.
— Phase 2. Afterwards, beh(fsFPS) is clustered for each SV.
— Phase 3. For each SV, a final number of clusters is selected.
— Phase 4. Finally, as each cluster defines a maximum number of qualitative different behaviours for each SV, then each parameter set will be clustered in different groups for each SV. It is possible to compute all observed combinations and compare it against all possible combinations. The set of observed combinations is the set of qualitative different behaviours that we denote by ‘key behaviours’ (KBs).
To clarify this pipeline we show an example where KBs are identified in a 2-SV ODE model (figure 3). Figure 3a shows the behaviours for two SVs, named SV1 and SV2, in a 2-SV model. For each SV, behaviours are clustered in 3 and 2 clusters, respectively. Figure 3b displays the PS and fsFPS where each ps is classified depending on the clustering for SV2. Next a support vector is computed to reveal the combination of parameters that separates both sets. Figure 3c shows all possible combinations of clusters, however it is shown that despite all possible combinations only few of them have representatives. Hence, here we can observe a combinatorial reduction of the inherent complexity in a model: dependencies between different SV in a model prohibits arbitrary combinations of key SV behaviours. Hence, in the end, only a subset of combinations is allowed and these combinations constitute distinct and robust mechanistic hypotheses for the biological system of interest.
5.3. Perturbations of interest
Using the analysis of KBs within a model we can address how to identify competing mechanistic hypotheses (see above). Here, we can further classify the parameter sets in fsFPS by how they behave under certain perturbations of interest. This is different to sensitivity analysis as we are not considering small perturbations or robustness analysis but a ‘perturbation of interest’. A perturbation of interest is defined as a question that is biologically relevant and that can be studied through the model. Parameter sets in fsFPS can be classified according to its behaviour under these perturbations.
5.4. Identification of hypotheses
A hypothesis can be defined as a proposed explanation for an observable phenomenon. Therefore any explanatory argument regarding the biological system of interest would be defined as a hypotheses. However, a hypothesis would be of interest if (i) it allows validation and (ii) it has some predictive power.
In the models that we consider there are typically two main types of hypotheses. The first type concerns the identification and characterization of parameters of interest. It is possible to generate testable predictions about how perturbation on those parameters would affect the behaviour of the model or to specify the value of a given parameter (such as in core predictions ). Simultaneously, to classify a parameter as relevant allows one to highlight the relevance of its related biological processes and the elements ‘ruling’ them.
The second type of hypotheses identifies KBs, which explicitly provide competitive hypotheses about the dynamics of the different SV. By using KB a researcher would be able to provide a (reduced) number of explanations for a given experimental observation. It is also possible to predict properties of each KB by studying the sloppy and stiff directions associated to each one.
5.5. Case study: extracting hypotheses within the atherogenesis model
Following the workflow we first analysed parameter values in the A model. We evaluated spread(p) over fsFPS. We observed the distribution of the measure for all parameters (see the electronic supplementary material, part 4); it returns a single low spread parameter and HS values for the rest of the parameters. The identified outlier is related to LDL oxidation. Next we analysed fsFPS by clustering the parameter sets by k-means algorithm . Notably by this procedure, no relevant Euclidian clustering was found in fsFPS (see the electronic supplementary material, part 4). Our conclusion is that it is indeed hard to extract information from the A model based uniquely on the PS. We thus turned to analysing the set of behaviours of the fsFPS for each SV, denoted by beh (fsFPS>, SV), in order to identify KBs and mechanistic hypotheses. As behaviours are continuous, we sampled and normalized 60 equally distant time points for each SV and used this sample to cluster parameter sets. Figure S6a (in the electronic supplementary material, part 4) shows beh(fsFPS, oxLDL), where Figure S6b,c show the 2-mean and 3-mean clusterings. Figure S6d show clustering evaluation by using the measures developed by Silhouette  and Davies–Bouldin  for k = 1,…, 5. We repeated the process and used clustering measures to select the final number of clusters for each SV. In the A model we identified one cluster in LDL, HDL, MP and PLAQUE, two clusters in oxLDL and Bcell and three clusters in TINF and TINH. The total number of combinations is 36. Interestingly, despite the combinatorial explosion, two combinations captured more than 33 per cent of parameter sets in FPS (see the electronic supplementary material, part 4). Those two combinations can be considered as ‘KBs’. These two are therefore two relevant hypotheses in plaque development dynamics (see figures 4 and 5).
Next we performed a perturbation experiment. From a biological point of view it is relevant to analyse if the rapid switch in the disease development is time-dependent with respect to lipid lowering. We aim to answer this question by lowering LDL-blood at different time points and with different intensities, and compare the plaque size at the end of the simulation for each combination (see figure 6). An interesting result was the observation of a new grouping of fsFPS in three clusters: (i) plaque size development depends on the time the perturbation takes place, (ii) it depends also on the intensity, and (iii) it is totally insensitive to LDL-blood perturbation (see the electronic supplementary material, part 4). We integrated all aspects by using a support vector machine (SVM) algorithm  to identify the vector that separates clusters in the PS (see the electronic supplementary material, part 4). Here we identified that biological processes involved in LDL clearance (such as B-cell digestion and the inhibition of TINF cells) are relevant in the differentiation of the two main KBs.
6. Step iv: validation
Here we need to develop methods that enable the identification of efficient experimental manipulations that are sufficient to distinguish and validate the competing hypotheses. The previous analyses on how to use machine-learning and clustering methods for pin-pointing essential SVs provide us with a machinery which can separate between competing mechanistic hypotheses. Using this line of reasoning we suggest that the various mechanistic hypotheses put forward by the A model could be validated experimentally by repeating the experiments of Skogsberg et al.  and measuring T-cell activation and macrophage activation over time. Finally, a new experiment could be designed where LDL is perturbed during plaque progression at different time points. In contrast, Brannmark et al.  and Cedersund et al.  develop an interesting framework where experiments are included in an iterative process that continuously adds information that can validate or refute a core prediction.
7. Concluding remarks
We have developed a workflow for extracting competing mechanistic hypotheses from large over-parametrized models that are designed to capture the known or possible interactions between processes, but which suffer from limited available prior knowledge and experimental data. This method addresses this uncertainty by considering a sample of the FPS and by characterizing and clustering them. The main idea is, on one hand, to merge and organize available model analysis frameworks in the literature and, on the other hand, to automatize the classification and generation of model-based hypotheses.
We verified the proposed workflow using the A model of atherogenesis. This allowed us to identify KBs of the SVs governing the development of the atherosclerotic plaque. We could do this despite challenges such as the high dimensionality of the PS, the strong parameter uncertainty and the potential combinatorial complexity in the number of solutions. Interestingly, we observed a surprisingly small number of classes of competing hypotheses that were consistent with the experimental data. The reduced number of model behaviours provides us with a few testable hypotheses on the mechanisms for the rapid increase in atherosclerotic plaque at a given time in disease development .
Yet, our outline of a workflow is a first step where there are several parts that can be further improved in future studies of computational models of increasing complexity. Areas for additional research include the optimization procedure, selection of KBs and, to generalize the workflow to enable its use for a broader set of models.
J.T. was funded Swedish Research Council (VR) and Stockholm County. J.T. and D.G. were supported by Karolinska Institutet and Virtual Physiological Human Network of Excellence (VPH-NoE, Grant agreement no.: 223 920). A.C. was funded by Ministry for Science and Innovation of Spain, Volkswagen Foundation and European Regional Development Funds. D.G. was funded by European Science Foundation LESC/EMRC FFG Reference 2669.
One contribution of 17 to a Theme Issue ‘The virtual physiological human’.
- Received February 14, 2011.
- Accepted March 7, 2011.
- This Journal is © 2011 The Royal Society