---
title: "Metabolic Flux Dynamics in Upper Glycolysis with `dynafluxr`"
output: rmarkdown::html_vignette
author:
  - "Serguei Sokol^[TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse, France] ^[MetaboHUB, National Infrastructure of Metabolomics and Fluxomics, Toulouse 31077, France]"
date: 2025-06-27
vignette: >
  %\VignetteIndexEntry{glyco}
  %\VignetteEngine{knitr::knitr}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=7,
  fig.height=5
)
```

```{r setup}
library(dynafluxr)
#pkgload::load_all("~/projs/dynafluxr/dev/dynafluxr")
```
## Introduction

This vignette shows how `dynafluxr` R package can be used to unravel metabolic flux dynamics from metabolite concentration time courses and a stoichiometric model. No regulation model (like Michaelis-Menten) is required for this task. Our method is heavily relying on B-splines as implemented in `bspline` package. We show how to prepare measurement data, edit stoichiometric reactions, use various package options and interpret/exploit the results.
Even if graphical user interface (GUI) cans be called and used with function `gui()`, this vignette is focused on command line interface (CLI) which is more suitable for reproducible research and automation. The CLI can be used from R or from a shell environment.

The data used for this vignette were obtained by NMR measurements on MetaToul-FluxoMet platform, Toulouse Biotechnology Institute (TBI), Toulouse, France. Credits for making these data available go to Svetlana Dubiley, Pauline Rouane, Cyril Charlier, Guy Lippens and Pierre Millard. Pierre Millard has also shared stoichiometric model and supervised the whole project. All cited persons are (or were at the moment of the project) from TBI.
The case presented in this vignette is also presented in an article entitled "Estimating flux dynamics from metabolite concentration time courses with dynafluxr" by Serguei Sokol, Svetlana Dubiley, Pauline Rouane, Cyril Charlier, Guy Lippens, and Pierre Millard.

## Data preparations

Data must be stored in a `.tsv` (Tabulation Separated Values) file (`.tsv` extension is not mandatory, it can be `.txt`, `.csv`, whatever, but the field separator must be the tabulation character), one column per chemical specie. The first row describes column names. Time points, at which measurements are done, must be stored in a column whose name starts with 'Time'. No specie name can start with 'Time'. Decimal separator must be a point `.` character. A head of data file used as an example is looking like

```{r}
fmeas=system.file("dataglyco/data.tsv", package="dynafluxr")
cat(head(readLines(fmeas)), sep="\n")
```

The raw data may be difficult to read because of tab misalignment and row wrapping. Here are parsed data:
```{r}
mf=read.delim(fmeas, comment.char="#")
print(head(mf))
```
## Stoichiometric model preparation

The reactions are written as in following example:
```{r}
fsto=system.file("dataglyco/network.txt", package="dynafluxr")
cat(readLines(fsto), sep="\n")
```
The first field separated by tabulation is a reaction name then reaction itself where species are separated by "&nbsp;+&nbsp;" sign and possibly preceded by a stoichiometric coefficient with "&nbsp;\*&nbsp;" symbol, e.g. `2*NAD`. The symbol "\*" is manadtory as a metabolite name can start with a number, e.g. `3PG` is a molecule of 3-phosphoglycerate and not `3 * PG`. The reaction can also be a degradation or washing out one, e.g. `A ->` which means that A is degraded to nothing or transported outside of the considered system. The reaction can also be a synthesis one, e.g. `-> A` which means that A is synthesized from nothing or brought from external environment.

We start by gathering example files in a fresh temporary created working directory (it will no longer exist after ending the R session):
```{r}
ddir=tempfile(pattern="glyco")
dir.create(ddir)
file.copy(c(fmeas, fsto), ddir)
```
## Data exploring

Let fit available data with B-splines of order 4 and with 5 internal knots (default in `dynafluxr`), then plot them to have an idea how data looks like:
```{r}
fmeas=file.path(ddir, "data.tsv")
fsto=file.path(ddir, "network.txt")
mf=read.delim(fmeas, comment.char="#")
nm=colnames(mf)[-1L]
msp=fitsmbsp(mf$Time, mf[, nm], n=4, nki=5)
matplot(mf$Time, msp(mf$Time), t="l", ylab="Concentration [mM]", xlab="Time [min]", lwd=2)
coltr=apply(col2rgb(1:6)/255, 2, function(v) do.call(rgb, c(as.list(v), alpha=0.25)))
matpoints(mf$Time, mf[, nm], pch="o", cex=0.75, col=coltr)
legend("topright", legend=nm, lty=1:5, col=1:6, cex=0.75, lwd=2)
```

Data treatment can make that measured concentration are getting negative.

As concentrations cannot get negative by definition, `dynafluxr` constraints B-splines to non negative values.
Another non desirable effect can be that consumed metabolite is sometimes increasing which has no physical meaning. In our case, Glucose is only consumed so in `dynafluxr`, it can be signaled in options `--decresing` while `--increasing` is reserved for accumulated products like FBP. All options can be abbreviated till unambiguous prefix, e.g. `--decr` for `--decreasing`.

We can see also in the previous plot that FBP has non zero values at the beginning of the experiment. To skip these nonphysical data, we can use an option `--skip` which indicates the number of first time points to skip. In our case, we will skip 24 first time points.

## First estimate of fluxes with `dynafluxr`

We try `dynafluxr::cli()` (Command Line Interface) function with minimal options, just indicating measurement data, stoichiometric model and, as seen before, that GLC/FBP must be monotonously decreasing/increasing respectively and that 24 first time points must be skipped. The `cli()` function is the main entry point to `dynafluxr` package and it can be used from R or from a shell environment. It is a wrapper around `dynafluxr::fdyn()` function which does the real work of flux estimation.

```{r}
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24"))
```

The same command could be run from a shell (not R) environment if variables \$fmeas and \$fsto are defined. In this case it would look like:

```
$ Rscript -e "dynafluxr::cli()" -m $fmeas -s $fsto --decr GLC --incr FBP --skip 24
```
The `cli()` function accepts a vector of command line arguments. The first argument is `-m` or `--meas` which indicates the measurement data file, then `-s` or `--stoich` for stoichiometric model file, then options for decreasing and increasing species, skipping time points and so on. The result of the function is a list with several elements.
The full list of available options can be viewed with `dynafluxr::cli("-h")` command. The result of `cli()` function (in case of no error in run-time), is a series of data and pdf files written in a directory `glyco/data` which is the base name of measurement data file:

```{r}
list.files("glyco/data/")
```

The file `Readme.md` describes the content of the result directory. These files can be explored by system tools like pdf-viewers and spreadsheet software but for needs of this vignette we will reproduce some results by R means. As we won't need result files in this demo page, we'll cancel their writing with empty option `-o ""` which normally indicates result directory/archive name if it is different from the default one:
```{r}
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24", "-o", ""))
```

Let now glance on fit quality, i.e. results of $\chi^2$-test:
```{r}
print(res$chi2tab)
```

We can see that $p$-values are almost 0 for G6P and FBP which usually means a poor fit quality. Let see what is going on by plotting integrated species with measurements:

```{r}
nm=colnames(res$mf)[-1]
matplot(res$tpp, res$isp(res$tpp, nm), t="l", xlab="Time [min]", ylab="Concentration [mM]", lwd=2)
matpoints(res$tp, res$mf[,nm], pch="o", cex=0.5, col=coltr)
legend("topright", legend=nm, lty=1:5, col=1:6, cex=0.75, lwd=2)
```

For a human eye is does not look bad but for a computer, it is not good enough. The problem is that the uses underestimated reference variance visible in the column `RefVar` of the $\chi^2$-test table.

## Getting fit better

The reference variance is used to estimate the noise in measurements and it is used to weight the residuals in least squares. If it is underestimated, the residuals are too large and the fit is considered as poor. The reference variance can be estimated automatically like it was done till now. But It can also be set to a some  more realistic value manually. In our case, we can set it to from 0.05 mM$^2$ to 0.1 mM$^2$ which are reasonable values for NMR measurements. This can be done with `--sderr` option: `--sderr=GLC=0.05,G6P=0.05,FBP=0.1,F6P=0.05`

From experiment setup, we know that FBP and F6P are not measured in absolute concentration as there is a doubt about proton attribution in the measured signal. Let apply scaling factors to FBP and F6P:
```{r}
res=cli(c("-m", fmeas, "-s", fsto, "--decr", "GLC", "--incr", "FBP", "--skip", "24", "--sf=FBP,F6P", "-o", "", "--sderr=GLC=0.05,G6P=0.05,FBP=0.1,F6P=0.05"))
print(res$chi2tab)
```

Now, all $p$-values are well above 0.05 usual threshold.

## Exploring results

We are ready to see the main result: estimated reaction rates.
```{r}
matplot(res$tpp, res$vsp(res$tpp), t="l", ylab="Rate [1/min]", xlab="Time [min]", lwd=2)
legend("topright", colnames(bsppar(res$vsp)$qw), lty=1:5, col=1:6, cex=0.75, lwd=2)
```

Having estimated rates, we can now explore details of metabolic fluxes. Not only we have the total flux $\frac{dM_i}{dt}$ but also all its components coming from involved reactions $S_{ij}v_j$. For G6P, it gives:
```{r}
nm="G6P"
jnz=names(which(res$stofull[nm,] != 0)) # non-zero coeffs, i.e. reactions involved in G6P mass balance
fl=t(res$stofull[nm,jnz]*t(res$vsp(res$tpp, jnz))) # each rate v_j is multiplied by S_ij
fl=cbind(res$fsp(res$tpp, nm), fl) # add total flux
matplot(res$tpp, fl, t="l", ylab="Flux [mM/min]", xlab="Time [min]", main="G6P flux components", lwd=2)
abline(h=0)
legend("topright", legend=c("Total", jnz), lty=1:5, col=1:6, lwd=2)
```

Examples of plotting B-spline curves with smooth error intervals are given in a notebook available at [](https://github.com/MetaSys-LISBP/dynafluxr_notebook/)

## Conclusion

In this vignette, we have shown a step-by-step procedure for reaction rate estimation from metabolite concentration time courses and a stoichiometric model. We have seen how package parameters can be chosen, how fit quality can be evaluated and how data can be presented and exploited. Based on this vignette, user should be ready to start working with his own data set and stoichiometric model.