Models
Some common flux calculation models are provided with PhysioFit, and users can also build their own models.
Note
We would be happy to broaden the types of models shipped with PhysioFit. If you have developed a new model, it might be usefull and valuable to the fluxomics community! Please, keep in touch with us to discuss and see if we can include your model in the built-in models shipped with PhysioFit! :)
Models shipped with PhysioFit
Steady-state models
As detailed in Peiro et al. (2019), PhysioFit includes a steady-state model that accounts for i) non enzymatic degradation of some metabolites and ii) growth lag. This model is described by the following system of ordinary differential equations:
where \(\mu\) is growth rate, \(X\) is the biomass concentration, and \(t_{lag}\) is the lag time.
where \(k_{i}\) is the first-order degradation constant of the metabolite \(M_{i}\) and \(qM_{i}\) is its exchange (uptake or production) flux.
The flux \(qM_{i}\) is positive (negative) when \(M_{i}\) is produced (consumed). The sign of \(qM_{i}\) can thus be used to automatically identify products and substrates.
Integrating equations 1-2 provides the following analytical functions:
Three additional models are derived from this general model (without degradation, without lag phase, and without degradation nor lag phase).
Indeed, without a lag phase (i.e. \(t_{lag}=0\)), equations 3-4 simplifies to:
In the absence of degradation (i.e. \(k=0\)), equation 4 simplifies to:
In the absence of both degradation and lag (i.e. \(t_{lag}=0\) and \(k=0\)), equations 3-4 simplifies to:
Dynamic model
We also implemented a dynamic model where fluxes and growth are represented by Monod kinetics, for one substrate and one product.
Time course concentrations of biomass (\(X\)), substrate (\(S\)) and product (\(P\)) are described by the following system of ordinary differential equations (ODEs):
where \(q_{S}\) is is the substrate uptake flux, \(yield_{biomass}\) is the biomass yield and \(yield_{product}\) is the product yield.
The dependence of the substrate uptake flux on the substrate concentration is expressed in this model by the Monod rate law:
where \(q^{max}_{S}\) is the maximal substrate uptake rate and \(K_{M}\) is the “half-velocity constant” (the value of \(S\) at which \(\frac{\mu}{\mu_{max}}=0.5\)).
User-made models
Overview
Since PhysioFit 3.0.0, users can create and implement their own models to calculate fluxes and other growth parameters for any biological system. This section explains how to write your first model, how to test the model and how to implement it on your PhysioFit instance.
Build a template
To implement user-made models, PhysioFit leverages Python’s object model to create classes that inherit from an Abstract Base Class and that handles all the heavy-lifting for implementation. A simple set of rules enables users to use their model in PhysioFit.
The model must be a class located in a dedicated module. Start by opening a text file using your IDE (Integrated Development Environment), and enter the following structure in the file:
from physiofit.models.base_model import Model
class ChildModel(Model):
def __init__(self, data):
pass
def get_params(self):
pass
@staticmethod
def simulate():
pass
if __name__ == "__main__":
pass
This is the base template to build your model. Methods get_params
(to initialize and return model parameters) and simulate
(to simulate metabolite dynamics for a given set of parameters) are mandatory. Additional methods are allowed if needed (e.g. to carry out intermediary steps for the simulation).
Populate the template
The first attribute to add in your model’s __init__
method is the model name. We strongly advise
to choose a name that helps the user understand what the model is destined to simulate. You must also add two other
attributes: the free parameters that PhysioFit will estimate & the fixed parameters provided by users. Finally, you must also call the super().init(data)
method to inherit the logic from the base class:
from physiofit.models.base_model import Model
class ChildModel(Model):
def __init__(self, data):
super().__init__(data)
self.model_name = "Tutorial model"
self.parameters_to_estimate = None
self.fixed_parameters = None
def get_params(self):
pass
@staticmethod
def simulate():
pass
if __name__ == "__main__":
pass
Note
If your model does not contain fixed parameters, you must still initialize the attribute as None
. This is
considered good practice.
We can now check that the model can be initialized properly. Use the block at the end of the file for testing purposes. Here is an example of how you can test the model:
if __name__ == "__main__":
test_data = pd.DataFrame(
{
"time": [0, 1, 2, 3],
"experiments": ["A", "A", "A", "A"],
"X": [0.5, 0.8, 1.2, 1.8],
"Glucose": [12, 11.6, 11, 10.2]
}
)
model = ChildModel(data=test_data)
print(model)
If you now run the file, you should have a standard output in your console that looks like:
Selected model: Tutorial model
Model data:
time experiments X Glucose
0 0 A 0.5 12.0
1 1 A 0.8 11.6
2 2 A 1.2 11.0
3 3 A 1.8 10.2
Experimental matrix:
[['A' 0.5 12.0]
['A' 0.8 11.6]
['A' 1.2 11.0]
['A' 1.8 10.2]]
Time vector: [0 1 2 3]
Name vector: ['X', 'Glucose']
Biomass & Metabolites: ['Glucose']
Parameters to estimate: None
Fixed parameters: None
Bounds: None
The next step is to define the parameters (used for simulationsand optimization). PhysioFit supports two types of parameters (parameters to estimate and fixed parameters) which are detailed below.
Free parameters
The free parameters are the parameters that will be estimated by PhysioFit, and thus that require defining bounds and initial values
to be initialized. The list of parameters and their initial (default) values must be returned by the get_params
method:
from physiofit.models.base_model import Model
class ChildModel(Model):
def __init__(self, data):
super().__init__(data)
self.model_name = "Tutorial model"
self.parameters_to_estimate = None
self.fixed_parameters = None
def get_params(self):
# Parameters are given in a dictionnary, where the key is
# the parameter name and the value is a number that will
# be the initial value for the optimization process
self.parameters_to_estimate = {
"BM_0": 1,
"growth_rate": 1
}
# Do the same for all metabolite parameters to estimate
# using a for loop:
for metabolite in self.metabolites:
self.parameters_to_estimate.update(
{
f"{metabolite}_flux" : 1,
f"{metabolite}_init_value" : 1
}
)
@staticmethod
def simulate():
pass
Note
For a given model, the number of metabolites may vary depending on the experiment, hence the metabolite-dependent parameters can be automatically defined in this function (as illustrated here using a for loop).
The next step is to define the default bounds used for the optimization process (these bounds can be changed in the GUI). The bounds are a
class of objects that handle the logic and checks. They are derived from the python dict
base class, and as such
implement the same methods (e.g. update
). Here is an example of how to implement the bounds:
from physiofit.models.base_model import Model
class ChildModel(Model):
def __init__(self, data):
super().__init__(data)
self.model_name = "Tutorial model"
self.parameters_to_estimate = None
self.fixed_parameters = None
def get_params(self):
# Parameters are given in a dictionnary, where the key is
# the parameter name and the value is a number that will
# be the initial value for the optimization process
self.parameters_to_estimate = {
"BM_0": 1,
"growth_rate": 1
}
# Instantiate the bounds object
self.bounds = Bounds(
{
"BM_0": (1e-3, 10),
"growth_rate": (1e-3, 3)
}
)
# Do the same for all metabolite parameters to estimate
# using a for loop:
for metabolite in self.metabolites:
self.parameters_to_estimate.update(
{
f"{metabolite}_flux" : 1,
f"{metabolite}_init_value" : 1
}
)
# Append the default bounds to the bounds attribute
self.bounds.update(
{
f"{metabolite}_flux": (-50, 50),
f"{metabolite}_init_value": (1e-6, 50)
}
)
@staticmethod
def simulate():
pass
Warning
The keys in the bounds and in the parameters to estimate dictionary must be the same!
Fixed parameters
The fixed parameters are parameters that are known and are not estimated by PhysioFit. For example, in the case of steady-state models that account for non enymatic degradation (see Steady-state models.), we need to provide the degradation constant of all unstable metabolites (these constants must be measured in an independent experiment, e.g. see Peiro et al., 2019):
self.fixed_parameters = {"Degradation": {
metabolite: 2 for metabolite in self.metabolites
}
}
The fixed parameters must be provided as a dictionary of dictionaries, where the first level is the name of the parameter itself (here degradation) and the second level contains the mapping of metabolite-value pairs that will be the default values initialized (here we give a default value of 2 for every metabolite for example). Each key of the first level is used to initialize a widget in the GUI, thus allowing users to change the corresponding values for the metabolites given in the second level.
Simulation function
Once the get_params
method has been implemented, the next step is to implement the simulation function that
will be called at each iteration of the optimization process to simulate the metabolite dynamics that correspond to a
given set of parameters (see Flux calculation for more details).
To do this, first write out the function definition:
@staticmethod
def simulate(
params_opti: list,
data_matrix: np.ndarray,
time_vector: np.ndarray,
params_non_opti: dict
):
pass
- As shown above, this function takes four arguments:
params_opti
: list containing the values of each parameter to estimate in the same order as defined in the :samp:`parameters_to_estimate` dictionary (see Free parameters)data_matrix
: numpy array containing the experimental data (or data with the same shape)time_vector
: numpy array containing the time pointsparams_non_opti
: dictionary containing the fixed parameters (see Fixed parameters)
Now you can start writing the body of the function. For sake of clarity, we recommend unpacking parameters values from the
list of parameters to estimate into internal variables. Th function simulate must return a matrix containing the simulation results, with the same shape as
the matrix containing the experimental data. To initialize the simulated matrix, you can
use the empty_like
function from the numpy library:
@staticmethod
def simulate(
params_opti: list,
data_matrix: np.ndarray,
time_vector: np.ndarray,
params_non_opti: dict
):
# Get end shape
simulated_matrix = np.empty_like(data_matrix)
# Get initial params
x_0 = params_opti[0]
mu = params_opti[1]
# Get X_0 values
exp_mu_t = np.exp(mu * time_vector)
simulated_matrix[:, 0] = x_0 * exp_mu_t
fixed_params = [value for value in params_non_opti["Degradation"].values()]
# Get parameter names and run the calculations column by column
for i in range(1, int(len(params_opti) / 2)):
q = params_opti[i * 2]
m_0 = params_opti[i * 2 + 1]
k = fixed_params[i - 1]
exp_k_t = np.exp(-k * time_vector)
simulated_matrix[:, i] = q * (x_0 / (mu + k)) \
* (exp_mu_t - exp_k_t) \
+ m_0 * exp_k_t
return simulated_matrix
The math corresponding to the simulation function provided above as example can be found here (equations 5 and 6). See here for information on how to test the completed model.
This example showcases the use of analytical functions to simulate the flux dynamics. It is also possible to use numerical derivation to solve a system of ordinary differential equations (ODEs), which can be usefull when algebric derivation is not straightforward. This require the implementation of additional functions into the simulate function. The system of ODEs can be provided directly within the body of the simulate function:
from scipy.integrate import solve_ivp
@staticmethod
def simulate(
params_opti: list,
data_matrix: np.ndarray,
time_vector: np.ndarray,
params_non_opti: dict
):
# Get parameters
x_0 = params_opti[0]
y_BM = params_opti[1]
km = params_opti[2]
qsmax = params_opti[3]
s_0 = params_opti[4]
y_P = params_opti[5]
p_0 = params_opti[6]
params = (y_BM, y_P, km, qsmax)
# initialize variables at t=0
state = [x_0, s_0, p_0]
def calculate_derivative(t, state, y_BM, y_P, km, qsmax):
# get substrate and biomass concentrations
s_t = state[0]
x_t = state[1]
# calculate fluxes
qs_t = qsmax * (s_t / (km + s_t))
mu_t = y_BM * qs_t
qp_t = y_P * qs_t
# calculate derivatives
dx = mu_t * x_t
ds = -qs_t * x_t
dp = qp_t * x_t
return dx, ds, dp
# simulate time-course concentrations
sol = solve_ivp(
fun=calculate_derivative,
t_span=(np.min(time_vector), np.max(time_vector)),
y0 = state,
args=params,
method="LSODA",
t_eval = list(time_vector)
)
return sol.y.T
As we can see, the function calculate_derivative
returns the derivatives of each metabolite concentration and is used by an ODEs solver that performs the simulations. This function is thus
created within the body of the simulate function, before being called by the solver. More information on the mathematics
behind this implementation can be found here.
Note
The simulation function will be called a high number of times by the optimizer for parameter estimation, so optimize this function as much as possible. When possible, implement the model using analytical solution as calculations will be faster than solving numerically the corresponding ODEs.
Test the model
Once you have completely populated your model file, you can now launch a round of simulations and optimizations in a programmatic way:
if __name__ == "__main__":
from physiofit.base.io import IoHandler
from physiofit.models.base_model import StandardDevs
test_data = pd.DataFrame(
{
"time": [0, 1, 2, 3],
"experiments": ["A", "A", "A", "A"],
"X": [0.5, 0.8, 1.2, 1.8],
"Glucose": [12, 11.6, 11, 10.2]
}
)
io = IoHandler()
model = ChildModel(data=test_data)
model.get_params()
fitter = io.initialize_fitter(
model.data,
model=model,
mc=True,
iterations=100,
sd=StandardDevs({"X": 1, "Glucose": 1}),
debug_mode=True
)
fitter.optimize()
fitter.monte_carlo_analysis()
fitter.khi2_test()
print(fitter.parameter_stats)
This will return the calculated flux values and associated statistics.
Note
The test data and calculation parameters (e.g. standard deviations) defined in the test function must correspond to those expected for the new model.
To test the integration of the model into the GUI, copy the .py
file
in the folder models
of PhysioFit directory. You can get the path towards this folder by opening a python
kernel in your dedicated environment and initializing an IoHandler
from physiofit.base.io import IoHandler
io_handler = IoHandler()
print(io_handler.get_local_model_folder())
Note
The model file name must follow the naming convention model_[model number].py
. If the last model in the list
is the model_5.py
, the next one should be named model_6.py
.
You can now launch PhysioFit’s GUI, load a data file corresponding to the new model, select the model, and run flux calculation. In case of errors, have a look to the error message and correct the code.
Note
We would be happy to broaden the types of models shipped with PhysioFit. If you have developed a new model, it might be usefull and valuable to the fluxomics community! Please, keep in touch with us to discuss on the model and see if we can include your model in the built-in models shipped with PhysioFit! :)