## Abstract

Multi-site enzymes, defined as where multiple substrate molecules can bind simultaneously to the same enzyme molecule, play a key role in a number of biological networks, with the *Escherichia coli* protease ClpXP a well-studied example. These enzymes can form a low latency ‘waiting line’ of substrate to the enzyme's catalytic core, such that the enzyme molecule can continue to collect substrate even when the catalytic core is occupied. To understand multi-site enzyme kinetics, we study a discrete stochastic model that includes a single catalytic core fed by a fixed number of substrate binding sites. A natural queueing systems analogy is found to provide substantial insight into the dynamics of the model. From this, we derive exact results for the probability distribution of the enzyme configuration and for the distribution of substrate departure times in the case of identical but distinguishable classes of substrate molecules. Comments are also provided for the case when different classes of substrate molecules are not processed identically.

## 1. Introduction

The study of rate laws describing enzyme kinetics has a long history, with early representative works being that of Michaelis and Menten [1] and Monod. Analysis of these enzymatic bottlenecks is often assisted by coarse-graining the enzyme into a discrete set of states, e.g. a traditional two-state Michaelis–Menten approximation with distinct states corresponding to enzyme bound or unbound to substrate. However, even simple deterministic Michaelis–Menten models can be difficult to properly analyse without some form of approximation [2–4], and this difficulty is compounded when stochastic dynamics are also considered [5–12]. Given that the majority of cellular processes are governed by enzymatic reactions, progress in the development of intuitive but powerful mathematical approaches to enzymatic kinetics immediately impacts our understanding of many biological networks.

One recent approach to enzyme–substrate kinetics has been queueing theory. Queueing theory, which treats the routing of individual ‘customers’ between different ‘stations’ of ‘servers’, was historically developed for telecommunication networks and service centres [13]. A cursory glance over the queueing theory literature suggests that there are several notable strengths of a potential queueing theory approach to enzyme kinetics. First, queueing theory's fundamental assumption of an underlying discrete and stochastic model leads to a focus on results that are stochastic in origin, which meshes nicely with the realization that natural cellular networks are often noisy [14]. Second, queueing theory's assumption of finite bandwidth for the processing of customers by servers is entirely analogous to the finite bandwidth of reactions catalysed by enzyme. Finally, the heavy influence of queueing theory in a variety of disciplines, e.g. operations research, suggests that some biologically motivated questions tied to queueing may already have been answered in a different context. Pursuit of the queueing approach to biological problems has indeed been fruitful thus far, leading to progress in gene networks [15–19], metabolic networks [20,21], degradation pathways [22], signalling [23,24] and translational cross talk [25].

In this article, we extend a prior investigation [22] of ClpXP proteolytic kinetics. ClpXP is a well-studied protease in *Escherichia coli* that is responsible for degrading mistranslated proteins and (in healthy conditions) certain stress response factors [26–28]. ClpXP's function is greatly enhanced by the presence of multiple binding sites for substrate via interaction with a molecular chaperone, SspB [29–35]. Absence or dysfunction of the SspB molecule is strongly associated with reduced substrate affinity, since SspB bound to ClpX binding sites forms a ‘waiting line’ of substrate for the ClpXP catalytic core, thus preventing the catalytic core of ClpXP from being unoccupied by substrate for any appreciable duration of time. An appropriate queueing analogy for a single ClpXP molecule is then a server (catalytic core) that selects customers (protein substrate) at random from a finite capacity waiting line (SspB–substrate complex associated with ClpX binding sites) and processes (degrades) the customers (substrate). In the following, we analyse this model in some detail, and we argue that the multi-site nature of the enzyme drives the model into a regime well approximated by traditional queueing models. It is also worth mentioning that our investigation of multi-site enzyme kinetics is also multi-class, i.e. exactly treating the dynamics of multiple types of substrate competing for the same enzyme. This analysis is assisted by a new result, independent departure symmetry, which generalizes results originally derived for quasi-reversible queues. These multi-class results rely on the assumption that different classes of substrate are distinguishable but otherwise identical, but we will comment on why we anticipate that our results are not particularly sensitive to weakly breaking this assumption.

The coarse-grained models used to derive these results are sufficiently general that a number of other multi-site enzymes, e.g. AAA+ proteases other than ClpXP [36], may have similar waiting line behaviour. For example, the protease ClpAP uses the chaperone ClpS in much the same way ClpXP uses the chaperone SspB. Truly, given the many advantages of a multi-site motif, we anticipate our results will generalize to a wide array of enzymes in native and synthetic biological networks. Despite this, we will typically refer only to ClpXP when discussing results from our model, since it is one of the best understood multi-site enzymes.

In line with queueing theory [13], many of our results pertain to statistics of the ‘departure’ times of substrate from the enzyme, i.e. the times when substrate has been processed and expelled from the catalytic core. Understanding the statistics of departures is a natural way to characterize the behaviour of such discrete stochastic processes, e.g. the departures of a transcription process can be used to investigate burstiness of mRNA production [37]. We find that our multi-site enzyme models are far from bursty, with the departure process often being well represented by a Poisson process (independent exponential times between departures), which incidentally is the same result found in a variety of classical queueing networks [38].

The article is organized in the following way. In §2, a discrete stochastic model for multi-site enzyme kinetics is presented. Steady-state analysis in the case of a single class of substrate appears in §3, while the multi-class case is treated in §4. Section 5 investigates the influence of time-dependent substrate levels on single-class and multi-class settings. Concluding comments appear in §6.

## 2. Stochastic model definition

We wish to study a coarse-grained model for ClpXP (and similar enzymes), with a specific emphasis on multi-site dynamics, where multiple substrate molecules can bind and wait while another substrate molecule is being processed by the catalytic core. Illustrated in figure 1*a*, our queueing-inspired model for proteolytic degradation by ClpXP is chosen to be a system with *N* binding sites for substrate (the waiting line), a single catalytic core (the server) and *R* total species (classes) of substrate. When discussing ClpXP specifically, a value *N* = 6 would be reasonable based on the sixfold symmetry of ClpX, though the situation is made slightly more complex owing to SspB–substrate complexes typically binding to ClpXP as dimers [34] (we do not include this feature of ClpXP dynamics in the model). Unoccupied binding sites are occupied by substrate of class *i* with rate *η _{i}*(1 ≤

*i*≤

*R*). An occupied binding site can deliver substrate of class

*i*to be processed by an unoccupied catalytic core with rate

*ξ*, and then the occupied catalytic core processes substrate of class

_{i}*i*with rate

*μ*. For reference, an analogous Michaelis–Menten system would only model the catalytic core.

_{i}A set of reactions for the model are then
2.1
2.2
2.3where *i* is a label for different kinds of substrate, *X _{n}* is the unoccupied

*n*th binding site (1 ≤

*n*≤

*N*),

*X*is the

_{n}S_{i}*n*th binding site occupied by substrate type

*i*,

*Y*is an unoccupied catalytic core and

*YS*is the catalytic core occupied by substrate type

_{i}*i*. The rate constants

*ξ*, and

_{i}*μ*for passing and degrading substrate, respectively, can depend on substrate type

_{i}*i*, but we set them to constants (

*μ*=

_{i}*μ*,

*ξ*=

_{i}*ξ*). This simplification in effect assumes that substrates are treated identically once bound, which will be important in our multi-class analysis. Furthermore, we suppose that

*ξ*→ ∞, i.e. that substrate is rapidly passed from occupied binding sites to an unoccupied catalytic site.

Free substrate is not treated explicitly in the analysis, but, instead, the rate constants *η _{i}* subsume the influence of free substrate concentration on the binding rate. Thus, we make a quasi-steady-state (Michaelis–Menten-like) approximation for the enzyme [10]. In the limit of large substrate count, where the concentration of substrate does not change appreciably on the time scale of enzyme kinetics, we predict that this approximation will be valid. It is worth noting that simpler models for multi-class enzyme kinetics have already been analysed in detail outside of the quasi-steady-state assumption [22], in which case free substrate can exhibit strongly correlated behaviour.

Finally, we neglect premature unbinding of substrate from the enzyme. This approximation is motivated by a large effective affinity of substrate to enzyme, e.g. owing to cooperative binding of the chaperone SspB. In the queueing interpretation, this is the same as supposing that customers do not renege, i.e. walk away from the service station owing to periods of inactivity. Refinement of the model to include the effects of substrate unbinding is straightforward, though it complicates analysis of the model.

Subsequent sections make use of the fact that, when analytic investigation is not easily achievable, the system represented by equations (2.1)–(2.3) maps precisely to a corresponding chemical master equation, which can be analysed numerically to find the probability for any system configuration [39]. For *R* substrate types, there are (*R* + 1)^{N}^{+1} potential system configurations. Though the system size grows exponentially with *N*, the system with two classes of substrate and six binding sites has only 2187 probabilities to keep track of at any point in time, and if we also keep track of the last two departures (two additional virtual sites in the system occupied by the two substrates previously processed by the catalytic core), the number of probabilities increases to 8748. A modern computer can readily handle systems of these sizes, especially given the fact that the matrix governing evolution of this system is sparse. We use custom Python programs importing the SciPy package to perform these calculations [40].

## 3. Steady-state analysis for a single class of substrate

We first consider the dynamics of a multi-site enzyme in the case of a single class of substrate. This treatment reveals some of the most interesting properties of multi-site enzymes, including the ability for the binding sites of the enzyme to dramatically (cooperatively) increase the apparent affinity of the enzyme. Our analysis of a single class of substrate is applicable to several situations, including (i) when the enzyme is specific to a single class of substrate, (ii) when several very similar substrate classes are labelled as a single substrate class, and (iii) when only a single substrate class is sufficiently abundant to be relevant. The last two situations are relevant to ClpXP dynamics, since ClpXP generally degrades a variety of different substrates.

### 3.1. Steady-state distribution

At steady state, a multi-site enzyme is almost never fully depleted of substrate except at very low enzyme count, which implies that the enzyme is almost always busy processing bound substrate. We derive this property below by calculating the steady-state distribution of the enzyme configuration.

Define the occupation number *Z*, with 0 ≤ *Z* ≤ *N* + 1, such that the enzyme is devoid of substrate when *Z* = 0, while the enzyme possesses an occupied catalytic core and *M* = *Z* − 1 occupied binding sites for larger *Z*. Thus, the formula *M* = max(0, *Z* − 1) is a uniformly valid expression for the number of occupied binding sites. Also define the total rate of substrate binding *η* ≡ ∑_{i}*η _{i}*. The dynamics for

*Z*can then be specified by the reactions 3.1and 3.2where we specify the propensities (i.e. excluding mass action terms), and where

*Θ*(

*Z*) is a Heaviside function (

*Θ*(

*Z*) = 1 for

*Z*> 0, and

*Θ*(

*Z*) = 0 otherwise).

If *P*_{s}(*z*) is the steady-state distribution Pr(*Z* = *z*, *t* = *∞*), then an elementary calculation for the one-dimensional master equation shows that the steady state is achieved by the zero net current (detailed balance) condition:
3.3This recursion relationship is simpler when expressed in terms of the complementary variable , i.e. the number of unoccupied sites. The recursion relationship in the new variable corresponds to the steady-state distribution for a complementary system with constant production rate *μ* and degradation rate (per substrate) *η*, excepting for a correction to the dynamics when . It is well known that this complementary system should have a distribution similar to a Poisson distribution, and, with this inspiration, it is easy to find the steady-state distribution in the original variable
3.4and
3.5with *λ* ≡ *μ*/*η*, and a normalization factor such that . can be calculated numerically, but it also can be expressed in terms of incomplete gamma functions.

One limiting case is when , i.e. . This limit corresponds to the saturated regime, where the probability for the enzyme to be totally unoccupied (*P*_{s}(0)) is small. Since the distribution is then approximately a Poisson distribution (in the variable ), the normalization constant can be accurately approximated by the usual expression . The important quantity *P*_{s}(0) is then estimated to be
3.6This value picks up a cooperativity of *N* + 1 with respect to the smallness parameter *λ*. Thus, multi-site enzymes can drastically increase their effective substrate affinity by leveraging a ‘waiting line’ for substrate. Figure 2*a* (large *ηN*) illustrates this effect. In its native context, ClpXP's multi-site nature allows it to efficiently degrade even small concentrations of potentially harmful prematurely terminated proteins.

Multi-site enzymes lose their effectiveness when , i.e. . In this case, the enzyme is mostly unoccupied, and the system tends to have at most one substrate bound, i.e. . The situation is then similar to the Michaelis–Menten case (figure 1*d*). We can then estimate the probability
3.7Thus, enzymes with different *N* parameter values but common *μ* parameter values act similarly for equal effective substrate binding rates *ηN*. Figure 2*a* (small *ηN*) illustrates this effect.

### 3.2. Enzymatic velocity

The steady-state velocity *V*_{cat} of the enzyme, i.e. the number of substrates processed per unit time, is *μ* times the probability that a substrate is occupying the catalytic core
3.8or approximately
3.9and
3.10leading as before to highly cooperative behaviour for , and non-cooperative (hyperbolic) behaviour for . It is worth reiterating that the strongly unsaturated case leads to kinetics of the usual Michaelis–Menten form.

### 3.3. First departure time and next departure time

Knowledge of the functional form for the steady-state velocity *V*_{cat} would be largely sufficient for a deterministic approximation, but the stochastic nature of our model begs for an analysis that quantifies the probabilistic nature of enzymatic activity (this is a key feature of queueing models). Specifically, we desire to understand the stochastic process for the duration of time between successful enzymatic reactions, or, in queueing parlance, the departure process. A convenient measurement in this regard is that of the first departure time (FDT) *T*_{1}, which is the duration of time to the next departure at steady state. A higher order statistic is the next departure time (NDT) Δ*T* = *T*_{2} − *T*_{1}, which is the interval of time between the first and the second observed departures at steady state. When the steady-state departure process is a Poisson process, as occurs for a wide variety of queueing networks at steady state [41–43], then knowledge of the mean FDT or NDT is sufficient to define the departure process. For a renewal process, the NDT distribution is sufficient to define the dynamics of the system [44–46].

Consider first the cumulative probability function *P*_{FDT}(*τ*_{1}) ≡ Pr(*T*_{1} > *τ*_{1}) for the FDT. Though this distribution can be difficult to obtain in general, the structure of our multi-site enzyme allows for a closed form solution in the limit *ξ* → *∞*. If the enzyme is initially in a state *Z* ≥ 1, such that the catalytic core is occupied by substrate, then a single degradation reaction (rate *μ*) is sufficient to produce a departure. For initial state *Z* = 0, the FDT is instead the convolution of binding at least one substrate (rate *ηN* since *Z* = 0) followed by degradation (rate *μ*). A straightforward calculation reveals the conditional probabilities
3.11and
3.12

The full FDT distribution is then the weighted sum
3.13with *P*_{s}(0) obtained from equation (3.5).

As already mentioned, the departure process for a wide variety of queueing networks is Poisson, which has an exponential distribution, , for some *r* > 0. Our multi-site enzyme model does not generally satisfy the conditions for this simple departure distribution; however, the cumulative distribution equation (3.13) can approximate an exponential function arbitrarily well depending on the value of *λ* = *μ*/*η*. This can be shown by calculating the integral squared deviation between the FDT cumulative distribution and a cumulative exponential distribution with the same mean departure time (*r*^{−1} = (1/*μ*) + (1/(*ηN*))*P*_{s}(0)), which after a straightforward calculation is
3.14
3.15

Equation (3.15) approaches zero for both and (figure 2*b*), suggesting these limits recover aspects of a Poisson departure process. The saturated regime, where *P*_{s}(0) is very small owing to cooperative effects, can lead to a particularly good approximation of the departure process by a Poisson process. An intuitive explanation for this is that if the catalytic core is always occupied, then departures occur with a constant rate *μ*, thus generating a Poisson process.

It is an interesting fact for our multi-site enzyme model that the NDT distribution Pr(Δ*T* > Δ*τ*), describing the duration of time between the first and the second departures from a steady-state condition, is precisely the same functional form (swapping *τ*_{1} for Δ*τ*) as the FDT distribution. A sketch for this calculation is relegated to appendix A. One can show in an entirely analogous manner that a queue with constant arrival rate and constant service rate has this same property [13], though this queue also satisfies this identity for higher order departures, while our multi-site model generally does not. This deviation from the queueing prediction is related to arrivals (substrate binding events) in our system not being Poisson, unlike the queueing case.

## 4. Steady-state analysis for multiple identical but distinguishable classes of substrate

Suppose that multiple classes of substrate are handled identically by the enzyme once bound. Then, as we will show, the analysis of the model at steady state in the case of multiple distinguishable but otherwise identical substrates follows almost immediately from the single-class case. In the biological context, this analysis is an idealization of the case where several similar classes of substrate compete for common processing, e.g. when fluorescent proteins YFP and CFP compete for degradation by ClpXP. We do not anticipate that our results are fragile to weakly breaking our underlying assumptions, as we briefly discuss in this section.

First note that since different classes of substrate are treated identically by the enzyme in our model, owing to *ξ _{i}* =

*ξ*and

*μ*=

_{i}*μ*being class-independent constants (see equations (2.1)–(2.3)), the dynamics without regard to substrate class are the same as in §3. Recovery of the substrate class information from the single-class results is possible at steady state owing to a symmetry preserved by the system, where the class of each substrate within the system remains identically but independently distributed with respect to all other substrate molecules in the system. Because the statistics for a set of substrate molecules can be derived from a list of identical but independent random variables, we term this an ‘independent list symmetry’. Proof of this result can be done in a manner entirely analogous to the proof in Mather

*et al.*[24]. We have included an intuitive explanation in figure 3, with further comments in appendix B.

The probabilities for substrate class are entirely determined by the rate at which each class arrives into the system. Define the probabilities for arrivals of substrate class *i*
4.1These probabilities are normalized, . Given that the enzyme treats different classes identically, it can be shown that independent list symmetry follows (perhaps after a short transient) if each *ρ _{i}* is constant in time. In principle,

*η*values can then be time-dependent while keeping each

_{i}*ρ*constant in time, but we only pursue steady-state results in this section.

_{i}The steady-state distribution can be written in the following way. Define the random variable ** Q** to be the state descriptor of the system. The component

*Q*(with 0 ≤

_{n}*n*≤

*N*) describes the state of the catalytic site when

*n*= 0 and the state of a particular binding site when

*n*> 0. The value of

*Q*is an integer in the range 0 …

_{n}*R*, with

*Q*= 0 indicating the site does not contain substrate, and with

_{n}*Q*> 0 indicating the class of substrate bound. Define a dependent random variable

_{n}**describing the occupancy, with**

*M**M*= 0 if

_{n}*Q*= 0, and

_{n}*M*= 1 if

_{n}*Q*> 0. Thus,

_{n}*M*=

_{n}*Θ*(

*Q*). For simplicity, define

_{n}*ρ*

_{0}= 1, which can be interpreted as that there is only one way to have an empty site. A system with independent list symmetry then has the probability distribution 4.2The term Pr(

*M*=

_{n}*Θ*(

*q*),

_{n}*n*= 0 …

*N*) is precisely the probability distribution for substrate to be bound to certain sites without regard to substrate class, which was explored previously in §3. The product

*Π*

*then provides all the information pertaining to substrate class. In short, equation (4.2) corresponds to the class of each substrate for occupied sites being independently but identically distributed conditional on knowing which sites on the enzyme are occupied.*

_{n}*ρ*_{qn}Another way to interpret equation (4.2) is that the independent list distribution maximizes entropy of the distribution given minimal constraints (appendix C). In short, the mean probabilities *ρ _{i}* provide all of the information relating to substrate class, and there exists no additional information on the substrate class for any particular site or sites in the queueing system. We anticipate but do not prove that this optimization principle leads to robustness, such that systems weakly breaking our independence assumption should have probability distributions close to maximal entropy distributions. Investigation of this hypothesis will be left for future work.

Given the distribution in equation (4.2), many results follow [22,24]. For example, it is easy to show that, for a definite number of substrate molecules, the frequency of substrate classes is multi-nomial. Now, we find that another significant consequence of independent list symmetry is independent departure symmetry. Intuitively, since the class of substrate located on the catalytic core is always identically but independently distributed relative to other substrate molecules, the probability distribution describing the class of each departure should also be independently distributed for each departure (appendix B). The full departure process can then be described by the probability distribution for the departures
where *T _{n}* and

*Q*are the departure time and substrate class, respectively, at the

_{n}*n*th departure. The function Pr(

*T*

_{1}>

*t*

_{1}; … ;

*T*>

_{m}*t*) encodes all of the information for the single substrate class system (e.g. from §3.3), and

_{m}*Π*

*encodes the multi-class information.*

_{n}*ρ*_{qn}We can compare equation (4.3) to classical results in queueing theory. For example, in quasi-reversible multi-class queueing networks [38,43,47,48], the departure process is equivalent to the superposition of independent Poisson processes for each substrate class. Quasi-reversibility then implies exponentially distributed times between each subsequent departure of a given class (or for all classes). The result in equation (4.3) includes the quasi-reversible result as a special case, but (4.3) can be much more general, allowing potentially complex statistical dependence between the times of different departures, but preserving independence for the class of different departures.

Given that we previously found that departures in our model are in a certain fashion well approximated by a Poisson process with appropriate mean rate (recall the discussion in §3.3), especially when , and given that we now find independence of classes at each departure, then the departure process for multiple substrates strongly resembles independent Poisson processes for each class, as is found for many quasi-reversible queueing systems. Apparently, the multi-site nature of our model makes its departure process very much like a classical queueing process, which greatly simplifies our understanding of the model's behaviour. This tantalizing evidence begs for further enquiry but is left for future studies.

## 5. Time-dependent substrate binding rates

We can expect that the rates *η _{i}* vary as a function of time in a real system, e.g. owing to depletion of substrate through enzymatic activity. The probability distribution for the enzyme configuration in this case will not necessarily be well approximated by quasi-steady-state results. A reasonable question to ask therefore is how sensitive our multi-site enzymatic system is to fluctuations in

*η*. We find numerically that multi-site enzymes tend to buffer fluctuations and thus remain close to certain quasi-steady-state assumptions, thus providing hope that the behaviour of multi-site enzymes in biological networks can be simplified using quasi-steady-state approaches.

_{i}Variation of the time-dependent rates *η _{i}*(

*t*) can occur in at least two different ways. The simplest is in-phase variation, such that the substrate probabilities

*ρ*from equation (4.1) are constant in time. In this case, the independent list symmetry can be maintained, and analysis of the single-class system with time-dependent

_{i}*η*(

*t*) = ∑

*(*

_{i}η_{i}*t*) is sufficient to analyse the full system. A central variable when studying the departure process is the probability

*P*(0,

*t*) ≡ Pr(

*Z*= 0,

*t*) for the enzyme to be unoccupied at a given time

*t*(see §3). Analytical treatment of this situation is beyond the scope of this article, and so we numerically investigate how variation of the input rate leads to subsequent variation in the departure rate by measuring the variation in

*P*(0,

*t*) owing to periodic variation of

*η*(

*t*). We find that the multi-site structure leads to improved buffering of fluctuations in the saturated regime (figure 4).

The alternative variation in the rates *η _{i}*(

*t*) is balanced variation, where

*η*= ∑

*(*

_{i}η_{i}*t*) is constant, but each

*η*(

_{i}*t*) can be time dependent. The corresponding single-class version of the multi-class system has constant binding rate

*η*and simply approaches steady state. Hence, all interesting non-stationary phenomena after a short transient are multi-class in origin. We attempt to distil such phenomena by measuring statistical dependence between the substrate classes of the first and the second most recent departures from the enzyme. Correlation between these departures is presented in figure 5. We find that, even in the worst case, deviations from the zero correlation limits are small. Similar results were found when we investigated the mutual information between the substrate classes for subsequent departures.

The present investigation suggests that the multi-site structure of the enzyme may effectively buffer both single-class and multi-class variation. Buffering in the single-class case is indeed reasonable, since an enzyme with many binding sites more rapidly becomes saturated (figure 2*a*). Buffering of fluctuations in the multi-class case is more interesting, since perturbation of the probability distribution can occur (in principle) even for saturated systems. A full analysis of this situation is beyond the scope of this article, but we attempt to provide an explanation in the limit where *η* → *∞* and *N* is very large. The system then requires a time greater than the natural time scale *N*/*μ* to significantly affect the abundance of each substrate class on the enzyme. Since this time scale is much larger than the time scale 2/*μ* to sample two departures, and since the class of departures is directly determined by the relative class abundance on the binding sites, then the system may remain in a quasi-steady state with respect to departures. Note that this result does not necessarily imply that the system is in quasi-steady state with respect to the instantaneous rates *η _{i}*(

*t*).

## 6. Discussion

Development of a coherent biological queueing theory is ongoing, and many questions relating to its ultimate usefulness remain unanswered. Among these questions is just how often can a queueing formalism bring quantitative insight to a particular real biological system? In this article, we studied a model that was motivated by the known properties of the ‘waiting line’ structure of the *E. coli* protease ClpXP and related proteases.

We found that not only does a waiting line structure lead to a strong cooperative increase in the effective affinity of substrate to the enzyme, but the waiting line can also push the model deeply into a queueing regime, where the stochastic process describing the completion of enzymatic reactions at steady state is very similar to the departure process for traditional queueing models, i.e. Poisson processes. In the case of multiple classes of substrate, an apparently new queueing-inspired result, independent departure symmetry, was determined to hold for the departure processes of a wide range of enzyme models, even when traditional queueing theory does not apply. Finally, we numerically investigated the response of our enzyme to time-dependent perturbation, and we found preliminary evidence that the waiting line structure can effectively buffer the system from fluctuations.

Further investigation of multi-site enzyme kinetics is certainly warranted for a number of reasons. Especially, a fundamental assumption in our analysis was that different classes of substrate are treated identically by enzyme, i.e. *ξ _{i}* and

*μ*did not depend on substrate class

_{i}*i*. Breaking this assumption can be expected to break the independent list symmetry and independent departure symmetry discussed in this article, which greatly complicates the analysis of the model. The situation is not hopeless, however, as the theory for priority queues (preemptive and non-preemptive) itself has a long history and may assist in the analysis of enzymes with strong asymmetry in their processing rates. Preliminary investigation suggests that for these more general multi-site enzyme models in certain parameter regimes, such as saturated regimes (large

*η*), the resulting departure process may be sufficiently simple to analyse.

_{i}A proper stochastic analysis of the accuracy of the Michaelis–Menten (quasi-steady-state) approximation for multi-site enzymes would also be welcomed. For substrate count sufficiently large, it is sensible that the quasi-steady-state approximation is valid. However, for small free substrate count, a number of details may have to be treated more carefully, e.g. competition for free substrate between enzymes, and the diffusion-limited dynamics of substrate binding to enzyme. Such details are outside the scope of this article, but we anticipate that queueing theory will also offer insight into this biophysical problem.

## Funding statement

This work was supported by the National Science Foundation (grant no. 1330180).

## Acknowledgements

We would like to thank Ruth Williams and Nicholas Butzin for stimulating exchanges during the formation of this manuscript.

## Appendix A. Formalism for next departure time calculation

Calculation of the NDT cumulative distribution Pr(Δ*T* > Δ*τ*) can be done in the following way (see §3.3 for context). Define the joint probabilities *W _{n}*(τ

_{1}) that the FDT (labelled

*T*

_{1}) exceeds

*τ*

_{1}, and that

*Z*has a certain value or range of values at

*τ*

_{1}A 1 A 2 A 3

It is straightforward to show that these quantities satisfy the ODEs A 4 A 5 A 6

Equations (A 4)–(A 6) can be solved iteratively, starting from *W*_{0}(*τ*_{1}). With these solutions, define the infinitesimal joint probability d*F*_{n}(*τ*_{1}) that the FDT is precisely in a time d*τ*_{1} about *τ*_{1} and the state *Z* = 0 (*n* = 0) or *Z* ≥ 1 (*n* = 1) immediately after a departure (see, for example, [49] for a discussion of revising probabilities after a particular reaction). Then
A 7
A 8Supposing these infinitesimal probabilities, the density (in *τ*_{1}) for the cumulative probability for the next departure to be a time Δ*T* > Δ*τ* can be derived in the same way as equation (3.13), leading to the equation
A 9which can be integrated over *τ*_{1} to provide the full cumulative distribution
A 10Once this calculation has been done starting from the steady-state distribution for the multi-site model in equations (2.1)–(2.3) (with *ξ* → *∞*), it is straightforward to show that the single-class FDT and NDT cumulative distributions have precisely the same functional form, i.e.
A 11though we do not provide the details of this calculation here.

## Appendix B. Independent list symmetry and independent departure symmetry

We here provide a few additional details regarding the independent list symmetry and independent departure symmetry discussed in §4.

First, the preservation of independent list symmetry by a variety of queueing-associated operations can be demonstrated with minimal difficulty. Once this is done, it then is a trivial matter to extend the argument to time-continuous evolution [24]. We begin with an independent list probability distribution (see equation (4.2))
B 1One of the most fundamental operations in queueing theory is the act of moving customers from location to location. Such operations can rather generally be represented by a permutation of the current state of the queueing network. Define an operator and associated matrix that perform a permutation on the state vector,
B 2
B 3
B 4where is a permutation of ** q**. Equation (B 4) consists of two pieces, with only relating to information on substrate occupancy in the permuted state, and containing all information for the multi-class distribution. Equation (B 4) is precisely the same form as the independent list distribution in equation (B 1), and, thus, permutations of enzyme sites preserve the independent list symmetry.

Another extremely important operation in queueing theory is the addition of a new customer to the network. Define an operation that adds a customer to site *h*, with an independently distributed random class determined according to the probabilities *ρ _{i}* in equation (4.1). Define an associated matrix that converts the value at index

*h*to value

*j*. Then, operation of the arrival operator on an independent list distribution leads to the following equations (the notation Pr(

*M*=

_{h}*z*) is shorthand for the probability that

**=**

*M***(**

*Θ***) for all indices except**

*q**h*, where

*M*=

_{h}*z*): B 5 B 6 B 7 B 8 B 9which is of the independent list form. Similarly for a deletion operator that replaces site

*h*with an unoccupied site (

*δ*(

*Z*) = 1 if

*Z*= 0, and

*δ*(

*Z*) = 0 otherwise) B 10 B 11 B 12 B 13 B 14which is again of the independent list form. Note that the step between equations (B 12) and (B 13) used our convention that

*ρ*

_{0}= 1.

In this manner, it can be demonstrated that any composition of these shuffling, addition and removal operations leads to another independent list. An approach similar to that in Mather *et al.* [24] then extends the above results to the time-continuous case.

Once independent list symmetry has been shown to exist for a system, independent departures follow when the rule for selecting departures is independent of the class of customers in the system. To show this, we introduce an auxiliary queue that keeps track of the last several departures (figure 6). The method to prove independent list symmetry can then be applied to the joint system (departure queue plus the original system), since departures are simply customers moving from the original system to the departure queue. Thus, the departure queue maintains independent list symmetry, implying that adjacent departures are statistically independent from one another. We do not provide a full derivation here.

## Appendix C. Independent list symmetry as a consequence of maximum entropy

The independent list distributions discussed in appendix B can be understood also through an entropy maximization principle. Intuitively, independent list distributions are the distributions of the greatest uncertainty conditional on known mean abundance for each substrate class in the queueing system. The fact that independent list distributions are in this way optimal suggests that small perturbations of the queueing system away from identical treatment of substrate will generate steady-state distributions near maximum entropy (deviation from maximum entropy is anticipated to be the second order in the perturbation strength, by optimality). We anticipate but do not prove then that maximum entropy distributions thus robustly predict steady-state distributions for a neighbourhood of systems approximated by identical but distinguishable substrate systems.

To demonstrate that independent list distributions satisfy maximum entropy, assume a queue (length *N* + 1) of a similar type as the above, but only allow non-empty queue sites. The more general case with both filled and empty sites is a straightforward generalization. Label for shorthand *P*(** q**) = Pr(

**=**

*Q***), and define**

*q**m*(

_{k}**) to be the count of substrate class**

*q**k*in

**, where**

*q**k*≥ 1 for this discussion. We can define the entropy as the sum over all possible states of the queue C 1We maximize the entropy subjected to several constraints. First, we constrain the distribution to be normalized C 2We also constrain the mean count of each class

*k*to be

*ρ*, C 3where

_{k}*m*(

_{k}**)/**

*q**N*can be interpreted as the probability of finding a given substrate of class

*k*in a particular queue

**. It can be shown as a consequence of the above two constraints that C 4but this is not an independent constraint. Using the method of Lagrange multipliers, we maximize a function C 5with respect to**

*q**P*(

**) (for all possible**

*q***),**

*q**λ*

_{0}, and each

*λ*. Maximization of

_{k}*Λ*with respect to

*P*(

**) provides C 6which implies C 7or C 8We can adjust the parameters**

*q**λ*

_{0}and each

*λ*to fix this probability distribution in the form C 9which can be shown to satisfy the two constraints of our maximization. Equation (C 9) can be rewritten as C 10which is precisely the independent list distribution from before.

_{k}The more general case of queueing systems with unoccupied sites can be done by also constraining known probabilities Pr(** M** =

**(**

*Θ***)) for occupancy (equation (B 1)). We do not perform these steps here.**

*q*## Footnotes

One contribution of 10 to a Theme Issue ‘Computational cell biology: from the past to the future’.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.