We assume that we are given a constraint-based metabolic model in standard form, with its stoichiometric matrix S and flux vector v, together with the steady-state mass balances

$$ S\,v = 0 \tag{1} $$

and upper and lower bounds for the fluxes

$$ \alpha_i \le v_i \le \beta_i. \tag{2} $$
We further assume that, in a preprocessing step, reversible enzyme-catalyzed reactions in the metabolic network model have been split into two irreversible reactions (forward and backward), each with αi ≥ 0.
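The splitting step can be illustrated with a minimal sketch. It assumes that S is stored as a list of rows and that a reaction counts as reversible whenever its lower bound is negative; the function name `split_reversible` and the `origin` bookkeeping are hypothetical and not taken from any toolbox.

```python
def split_reversible(S, lower, upper):
    """Split every reaction with a negative lower bound into forward and backward parts.

    S is a list of rows (metabolites x reactions); lower/upper are per-reaction bounds.
    Returns (S_new, lower_new, upper_new, origin), where origin[k] = (i, direction)
    records which original reaction i each new column came from.
    """
    columns, lower_new, upper_new, origin = [], [], [], []
    for i in range(len(lower)):
        column = [row[i] for row in S]
        # Forward part: keep only the non-negative portion of the original flux range.
        columns.append(column)
        lower_new.append(max(lower[i], 0.0))
        upper_new.append(max(upper[i], 0.0))
        origin.append((i, "forward"))
        if lower[i] < 0:
            # Backward part: negated stoichiometry, bounds mirrored around zero.
            columns.append([-c for c in column])
            lower_new.append(max(-upper[i], 0.0))
            upper_new.append(-lower[i])
            origin.append((i, "backward"))
    # Transpose the column list back into row-major form.
    S_new = [[col[j] for col in columns] for j in range(len(S))]
    return S_new, lower_new, upper_new, origin
```

After this step every flux has a non-negative lower bound, so forward and backward directions can be assigned different kcat values.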
In order to incorporate adequate enzyme (mass) constraints in a given metabolic model, MOMENT [13] first introduces, for each enzyme-catalyzed reaction i, an enzyme concentration variable gi (mmol/gDW). We initially assume that a reaction is catalyzed by a unique enzyme. The flux vi (mmol/gDW/h) through reaction i is then limited by the product of the enzyme concentration and the (maximal) turnover number, kcat,i (1/h), of this enzyme:

$$ v_i \le k_{cat,i} \cdot g_i \tag{3} $$

which can alternatively be written as

$$ \frac{v_i}{k_{cat,i}} \le g_i. \tag{4} $$

(Note that the kcat,i values may differ for the forward and backward direction of (split) reversible reactions.) In order to reflect the limited amount of metabolic enzymes in the cell, another constraint is introduced stating that the total mass of all enzymes in the model may not exceed a threshold P (g/gDW):

$$ \sum_i g_i \cdot MW_i \le P. \tag{5} $$
MWi is the molecular weight (g/mmol) of the enzyme catalyzing reaction i. It should be noted that P only refers to metabolic enzymes (covered by the metabolic model) and is thus smaller than the total protein content of the cell.
When applying MOMENT to a genome-scale model, a great number of additional variables gi and their associated constraints (4) must be introduced, which may negatively affect the performance of complex analyses of the resulting model. Furthermore, the constraints (4) and (5) cannot be directly integrated into the standard form of a metabolic model represented by (1) and (2). For this reason, MOMENT models cannot be directly treated with standard tools for constraint-based modeling (such as [22–24]). In order to tackle these issues, we developed the sMOMENT (short MOMENT) method, which leads to the same results as MOMENT but uses a more compact representation of the model. Using (4) we first substitute gi in (5) and obtain:

$$ \sum_i v_i \cdot \frac{MW_i}{k_{cat,i}} \le \sum_i g_i \cdot MW_i \le P. \tag{6} $$

We can thus safely use the following alternative for (5):

$$ \sum_i v_i \cdot \frac{MW_i}{k_{cat,i}} \le P. \tag{7} $$

This inequality can be reformulated as follows:

$$ -\sum_i v_i \cdot \frac{MW_i}{k_{cat,i}} + v_{Pool} = 0; \quad v_{Pool} \le P. \tag{8} $$
The auxiliary variable vPool (g/gDW) quantifies the mass of all metabolic enzymes per gram of cell dry weight needed to catalyze the reaction fluxes vi, and this value must not exceed the given maximum P. The advantage of (8) is that it can be directly integrated into the standard system defined by (1) and (2) (Fig. 1). First, a pseudo-metabolite (enzyme pool) is added as a new row in the stoichiometric matrix, where the stoichiometric coefficient for each reaction i is −MWi/kcat,i. Afterwards, a pseudo-reaction Rpool (“enzyme delivery”) is added whose coefficients in S are all zero except for unity for the added enzyme pool pseudo-metabolite; the associated “enzyme delivery flux” vPool has an upper bound of P.
Fig. 1. Augmentation of the stoichiometric matrix with the sMOMENT approach. Mpool is the enzyme pool pseudo-metabolite and Rpool the enzyme-pool-delivering pseudo-reaction. Ri stands for reaction i, Mj for metabolite j; r is the number of reactions, m the number of metabolites.
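The augmentation of the stoichiometric matrix can be sketched in a few lines. The following is only an illustration under stated assumptions: S is a plain list of rows, reactions without an enzyme are marked with None and receive zero cost, and the function names `augment_with_enzyme_pool` and `pool_usage` are hypothetical rather than part of any published toolbox.

```python
def augment_with_enzyme_pool(S, lower, upper, mw, kcat, P):
    """Append the enzyme pool row and the delivery column R_pool to S.

    mw[i] (g/mmol) and kcat[i] (1/h) describe the enzyme of reaction i;
    use None for reactions without an associated enzyme (cost 0).
    """
    costs = [0.0 if m is None or k is None else m / k for m, k in zip(mw, kcat)]
    # Existing metabolite rows get a zero entry for the new delivery reaction.
    S_aug = [list(row) + [0.0] for row in S]
    # Pool row: reaction i consumes MW_i/kcat_i, the delivery reaction supplies 1.
    S_aug.append([-c for c in costs] + [1.0])
    # The delivery flux v_pool is bounded by 0 <= v_pool <= P.
    lower_aug = list(lower) + [0.0]
    upper_aug = list(upper) + [P]
    return S_aug, lower_aug, upper_aug


def pool_usage(v, mw, kcat):
    """Enzyme mass (g/gDW) required by the flux vector v, i.e. v_pool in eq. (8)."""
    return sum(
        vi * m / k for vi, m, k in zip(v, mw, kcat) if m is not None and k is not None
    )
```

With fluxes in mmol/gDW/h, molecular weights in g/mmol and kcat values in 1/h, `pool_usage` returns g/gDW, so a flux vector satisfies (7) exactly when this value does not exceed P.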
The integration of the enzyme mass constraints in the stoichiometric matrix as shown in Fig. 1 is similar to the one used by GECKO [11] but differs markedly in that it avoids the explicit introduction of enzyme species and their delivery reactions, which largely increases the dimension of GECKO models. To achieve that, special treatment is needed for reactions catalyzed by multiple enzymes as well as for multifunctional (promiscuous) enzymes. The handling of these cases in sMOMENT is similar to MOMENT but again simplified compared to MOMENT’s usage of recursive rules. Herein we consider an enzyme as an entity that can catalyze one or, in the case of multifunctional enzymes, several reactions. An enzyme can be either a single protein or an enzyme complex consisting of multiple proteins. Genome-scale metabolic models often provide gene-enzyme-reaction relationships, which are essential for building enzyme-constrained metabolic models because they enable one to associate reactions with their catalyzing enzymes as well as enzymes with the respective genes and gene products needed to build that enzyme (or enzyme complex). We denote by E the set of all q enzymes of a metabolic model:

$$ E = \{E_1, \dots, E_q\}. \tag{9} $$
Every enzyme Ej has its own molecular weight $MW_{E_j}$ (g/mmol), which can be directly derived from the masses of its amino acids (in the case of enzyme complexes, its molecular weight is the sum of the single protein masses, each multiplied by the stoichiometry of the single protein in the complex). This information is readily available in databases such as UniProt [25]. Additionally, each enzyme Ej has an associated kcat value $k_{cat,E_j}$. With E(i) we denote the enzyme(s) that catalyze reaction i:

$$ E(i) = \{E_{i1}, E_{i2}, \dots\}. \tag{10} $$

For setting the enzyme costs ci = MWi/kcat,i of reaction i in eqs. (5)–(8), sMOMENT selects the minimal enzyme cost of all enzymes catalyzing reaction i:

$$ c_i = \frac{MW_i}{k_{cat,i}} = \min\left(\left\{ \frac{MW_{E_{i1}}}{k_{cat,E_{i1}}}, \frac{MW_{E_{i2}}}{k_{cat,E_{i2}}}, \dots \right\}\right). \tag{11} $$
This rule used by sMOMENT simplifies the treatment of reactions with multiple enzymes but does not change the feasible flux space, because the solution with minimal protein costs used by sMOMENT is contained in the corresponding MOMENT or GECKO model as well (and will in fact be selected in these models by the solver in optimizations where the protein pool becomes limiting). While the flux space of sMOMENT and the predictions made therein are thus identical to MOMENT and GECKO, the latter two hold explicit variables for the involvement of each enzyme and can thus account for all possible enzyme combinations that can generate a given flux in the case where a reaction can be catalyzed by multiple enzymes (whereas sMOMENT always assumes that the enzyme with the minimal cost is used). However, this additional information is rarely relevant, and in cases where the solution of the optimization is limited by the protein pool, the enzyme with the minimal enzyme cost (as favored by sMOMENT) will be selected. If a reaction has no associated enzyme, we set the term MWi/kcat,i (and thus the enzyme costs) in eq. (8) to 0.
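The selection rule for the enzyme costs, including the zero-cost convention for reactions without an enzyme, can be written as a short helper. This is a minimal sketch; the function name `reaction_enzyme_cost` and the dictionary-based data layout are assumptions made for illustration.

```python
def reaction_enzyme_cost(enzymes, mw, kcat):
    """Return c_i as the minimum of MW/kcat over E(i); 0.0 if E(i) is empty.

    enzymes: identifiers of the enzymes catalyzing the reaction, i.e. E(i).
    mw, kcat: dicts mapping an enzyme identifier to its molecular weight (g/mmol)
              and its kcat value (1/h).
    """
    costs = [mw[e] / kcat[e] for e in enzymes]
    # Reactions without an associated enzyme do not draw from the enzyme pool.
    return min(costs) if costs else 0.0
```

For example, with two hypothetical enzymes where E1 has MW = 100 g/mmol and kcat = 1000 1/h and E2 has MW = 40 g/mmol and kcat = 100 1/h, the costs are 0.1 and 0.4 g·h/mmol, so E1 determines the cost of the reaction even though it is the heavier enzyme.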
As already stated above, GECKO [11] was introduced as an extension of MOMENT. It uses the same type of enzyme mass constraints but introduces additional reactions and metabolites to explicitly reflect enzyme usage. The disadvantage is that the model size increases significantly, which hampers its use in computationally expensive analyses. On the other hand, this representation allows the direct incorporation of measured in vivo proteomic concentrations as upper limits for enzyme usage. As in GECKO, although not further used herein, it is possible to include proteomic concentration data in the sMOMENT method as well. Assuming we are given the concentration [Ek] of an enzyme Ek (mmol/gDW) and that this enzyme is the only catalyst in the reaction(s) it catalyzes, this immediately sets an upper bound for the sum of all reaction fluxes catalyzed by enzyme Ek:

$$ \sum_{i \in R(E_k)} \frac{v_i}{k_{cat,i}} \le [E_k] \tag{12} $$
where R(Ek) denotes the set of reactions catalyzed by enzyme Ek. Similar to what we did for the overall protein pool (cf. eqs. (7) and (8)), we may include this constraint by adding an additional pseudo-metabolite and pseudo-reaction to the stoichiometric matrix.
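A concentration constraint for a single measured enzyme can be added in the same way as the overall pool. The sketch below assumes that Ek is the only catalyst of every reaction in R(Ek), that each of these reactions consumes 1/kcat,i of the new pseudo-metabolite per unit flux, and that the delivery flux is bounded by the measured concentration; the function name `add_measured_enzyme` and its argument layout are hypothetical.

```python
def add_measured_enzyme(S, lower, upper, reactions, kcat, concentration):
    """Add a pseudo-metabolite row and a delivery column for one measured enzyme.

    reactions: column indices of the reactions in R(E_k), each catalyzed only by E_k.
    kcat: dict mapping a reaction index to the kcat (1/h) of E_k in that reaction.
    concentration: measured [E_k] in mmol/gDW, used as upper bound of the delivery flux.
    """
    n = len(lower)
    # Existing rows get a zero entry for the new enzyme delivery reaction.
    S_new = [list(row) + [0.0] for row in S]
    # New row: the delivery reaction supplies 1, reaction i consumes 1/kcat_i.
    enzyme_row = [0.0] * n + [1.0]
    for i in reactions:
        enzyme_row[i] = -1.0 / kcat[i]
    S_new.append(enzyme_row)
    return S_new, list(lower) + [0.0], list(upper) + [concentration]
```

At steady state the new row forces the delivery flux to equal the sum of vi/kcat,i over R(Ek), so its upper bound directly limits the amount of enzyme Ek that the fluxes may require.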
If Ek is not the only catalyzing enzyme of a reaction i it catalyzes, we split this reaction into two reactions with the same stoichiometry: one reaction is now (exclusively) catalyzed by enzyme Ek, while the other reaction is catalyzed by all other enzymes of the former reaction i (i.e., E(i)\Ek). Thereby, the rule (11) has to be applied again for both of the new reactions, and the respective (possibly adapted) enzyme cost values have to be used in eq. (8) and in the augmented stoichiometric matrix. If the split reaction i had a limited flux bound (vi < ∞), additional constraints must be introduced (e.g. “arm” reactions as used in the GECKO approach) to ensure that this constraint is met by the sum of all the reactions obtained by splitting reaction i.
The procedure outlined above has to be repeated for all enzymes with measured concentrations. With a growing set of concentration measurements, this will add several new rows (pseudo-metabolites) and columns (pseudo-reactions) to the stoichiometric matrix. However, concentration measurements are often available only for a small fraction of all enzymes. In these cases, the size of the augmented sMOMENT model as described above will still be significantly smaller than a fully expanded GECKO model. If concentrations are specified for all enzymes, the resulting model will, in fact, be an analog of a GECKO model with the same number of reactions and metabolites. In principle, when using the AutoPACMEN toolbox (see below), very high (non-limiting) concentrations can be defined during model generation to enforce explicit inclusion of some or all enzymes (in the latter case, a GECKO-analogous model will be generated); these concentration values can later be adapted for a given set of measurements.