MetaboDynamics 1.1.2
The MetaboDynamics vignette explains the package workflow and interpretation of results with a SummarizedExperiment object as input. In this vignette I will show the usage of MetaboDynamics with a data frame as input.
library(MetaboDynamics)
library(SummarizedExperiment) # storing and manipulating simulated metabolomics data
library(ggplot2) # visualization
library(dplyr) # data handling
library(tidyr) # data handling
The MetaboDynamics package also includes a simulated example data set in a data frame format. Please note, that although both datasets were simulated with the same code different seeds were used so that the results can deviate from each other. To load the data set in data frame format, execute the following code:
data("longitudinalMetabolomics_df", package = "MetaboDynamics")
To keep run time of this vignette short we will execute the workflow on a subset of five metabolites across all conditions.
# take a sample of five metabolites and subset data
metabolites <- sample(x = unique(longitudinalMetabolomics_df$metabolite),size = 5)
data <- longitudinalMetabolomics_df%>%filter(metabolite%in%metabolites)
head(data)
## # A tibble: 6 × 8
## # Groups: metabolite, condition [1]
## metabolite condition time replicate measurement log_m m_scaled KEGG
## <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
## 1 L-Aspartic acid A 1 1 761. 2.88 0.0128 C00049
## 2 L-Aspartic acid A 1 2 3147. 3.50 0.975 C00049
## 3 L-Aspartic acid A 1 3 628. 2.80 -0.117 C00049
## 4 L-Aspartic acid A 2 1 224. 2.35 -0.816 C00049
## 5 L-Aspartic acid A 2 2 696. 2.84 -0.0470 C00049
## 6 L-Aspartic acid A 2 3 655. 2.82 -0.0882 C00049
The measured metabolites are stored in column “metabolite”, time points are specified in column “time”, and column “condition” specifies the experimental condition.
The required log-transformed and scaled (per metabolite and condition to a mean of 0 and sd of 1) metabolite abundances are stored in column “m_scaled”. The function Fit_dynamics_model() allows to specify other column names for metabolite, time points, condition and scaled measurement.
First we will fit the dynamics model and extract the diagnostics:
# fit dynamics model
fits <- # when using a data frame as input fits have to stored in a separate object
fit_dynamics_model(
data = data,
metabolite = "metabolite", # in which column are metabolite names stored?
time = "time", # in which column are categorical time points stored?
condition = "condition", # in which are categorical experimental conditions stored?,
scaled_measurement = "m_scaled", # in which column are scaled measurments stored?
max_treedepth = 10,
adapt_delta = 0.95, # default 0.95
iter = 2000,
warmup = 2000/4, # default is 1/4 of iterations
chains = 1, # only set to 1 in vignette, recommended default is 4!
cores = 1 # only set to 1 in vignette, can be same as chains if machine allows for parallelization
)
## Is your data normalized and standardized?
## We recommend normalization by log-transformation.
## Scaling and centering (mean=0, sd=1) should be metabolite and condition specific.
##
## SAMPLING FOR MODEL 'm_ANOVA_partial_pooling' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 1: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 1: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.376 seconds (Warm-up)
## Chain 1: 0.707 seconds (Sampling)
## Chain 1: 1.083 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'm_ANOVA_partial_pooling' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.26 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 1: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 1: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.38 seconds (Warm-up)
## Chain 1: 0.681 seconds (Sampling)
## Chain 1: 1.061 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'm_ANOVA_partial_pooling' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 1: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 1: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.404 seconds (Warm-up)
## Chain 1: 0.699 seconds (Sampling)
## Chain 1: 1.103 seconds (Total)
## Chain 1:
This returns a list of model fits that are named by the experimental condition (in the case of the simulated data set “A”,“B”,“C”).
# Extract diagnostics
diagnostics <- # when using a data frame as input diagnostics have to be stored in a separate object
diagnostics_dynamics(
data = data, # data frame that was used to fit the dynamics model,
fits = fits, # list of fits from dynamics model, result of fit_dynamics_mode function
iter = 2000, # how many iterations were used to fit the dynamics model
chains = 1, # how many chains were used to fit the dynamics model
)
That returns a list with elements [[“model_diagnostics”]] which holds a summary of the model diagnostics and the extracted posterior estimates for all conditions.
To visualize the diagnostics the following code is used:
plot_diagnostics(
data = data, # data frame used to fit the dynamics model
diagnostics = diagnostics[["model_diagnostics"]] # summary of diagnostics
)
## $divergences
##
## $max_treedepth
##
## $Rhat
##
## $n_eff
In this case the diagnostics are all fine: The model fitting did not result in
divergent transitions, the maximum treedepth was not exceeded, Rhat values were
below 1.01 and number of effective samples exceeded 100.
The extract the estimates one has to specify again the data, list of fits and number of iterations and chains used to fit the model. Additionally is is possible to specify the columns in which metabolite names, experimental conditions and time points are stored. This is equivalent to the fit_dynamics_model function.
Additionally one can specify how many samples should be drawn from the posterior to for example test clustering performance.
estimates <- # estimates have to be stored in a separate object when using data frames
estimates_dynamics(
data = data,
fits = fits,
iter = 2000,
warmup = 500,
chains = 1,
condition = "condition",
metabolite = "metabolite",
time = "time",
samples = 1 # 1 sample from posterior taken
)
This returns a list containing the estimates for every experimental condition. The estimates can be visulized in two ways by the following code:
# only visualize differences between time points
plot_estimates(data = data,
estimates = estimates,
delta_t = TRUE, # choose to visualize differences between time points
dynamics = FALSE)
## $plot_timepoint_differences
# only visualize dynamics
plot_estimates(data = data,
estimates = estimates,
delta_t = FALSE,
dynamics = TRUE) # choose to visualize the dynamics
## $plot_dynamics
# Clustering dynamics
To cluster the dynamics we need either the estimates from the dynamics model (stored in object “estimates”) or a data frame in which the means of less then triplicate measurements are stored in a column named “mu_mean”, categorical time points are stored in ascending order in column “time.ID” and experimental conditions in a column named “condition”.
The following shows the needed data frame format for the clustering function:
head(estimates)
## $A
## metabolite.ID metabolite KEGG time.ID condition
## 1 1 D-Glyceraldehyde 3-phosphate C00118 1 A
## 2 1 D-Glyceraldehyde 3-phosphate C00118 2 A
## 3 1 D-Glyceraldehyde 3-phosphate C00118 3 A
## 4 1 D-Glyceraldehyde 3-phosphate C00118 4 A
## 5 2 L-Aspartic acid C00049 1 A
## 6 2 L-Aspartic acid C00049 2 A
## 7 2 L-Aspartic acid C00049 3 A
## 8 2 L-Aspartic acid C00049 4 A
## 9 3 L-Cystine C00491 1 A
## 10 3 L-Cystine C00491 2 A
## 11 3 L-Cystine C00491 3 A
## 12 3 L-Cystine C00491 4 A
## 13 4 L-Ornithine C00077 1 A
## 14 4 L-Ornithine C00077 2 A
## 15 4 L-Ornithine C00077 3 A
## 16 4 L-Ornithine C00077 4 A
## 17 5 sn-Glycero-3-phosphocholine C00670 1 A
## 18 5 sn-Glycero-3-phosphocholine C00670 2 A
## 19 5 sn-Glycero-3-phosphocholine C00670 3 A
## 20 5 sn-Glycero-3-phosphocholine C00670 4 A
## mu_mean mu_lower mu_higher sigma_mean sigma_lower sigma_higher
## 1 1.09285775 -0.6651102 2.37120079 1.2995548 0.52303449 3.4515320
## 2 0.25262647 -0.9098330 1.29096759 0.9344726 0.32670535 2.8022779
## 3 -0.30794235 -1.1435804 0.60089473 0.6844344 0.22873122 2.0217489
## 4 0.23741487 -0.7085807 1.16276545 0.7075502 0.23345408 2.1475683
## 5 -0.10849439 -0.4562598 0.20458498 0.2596538 0.06871240 0.9426487
## 6 -0.67846624 -1.9420386 0.86149383 1.1602260 0.46628218 2.8782881
## 7 -0.32124112 -1.2451269 0.62835011 0.7651068 0.24355400 2.2485230
## 8 0.41303454 -1.8215400 2.37452873 2.1448787 0.99187343 4.9218041
## 9 -0.17215897 -1.8890118 1.63082794 1.7373680 0.80475064 4.0260973
## 10 -0.07577907 -1.1603609 1.19217924 0.9207439 0.32126330 2.5294930
## 11 -0.32002159 -0.5952277 0.05705463 0.1699861 0.02986270 0.8598720
## 12 0.21903397 -1.8436371 2.49877810 2.1940316 1.04196868 4.8477571
## 13 -0.08035207 -0.4215635 0.47835354 0.2291958 0.04227224 1.0215999
## 14 -0.73628230 -1.8012535 0.73739873 0.9955538 0.34524323 2.6602277
## 15 -0.52117187 -2.2198343 1.33746016 1.8270146 0.84552936 4.0299136
## 16 -0.15504015 -0.9418413 0.54666164 0.5851592 0.20112022 1.6111356
## 17 -0.15322977 -0.5108269 0.23503813 0.2290648 0.04467000 0.9753879
## 18 -0.13844720 -0.7081620 0.48003648 0.4315896 0.12833284 1.5093647
## 19 0.84890743 0.3291578 1.30581423 0.3156521 0.07631987 1.1521176
## 20 0.86555789 0.1315786 1.44065480 0.4815281 0.13903101 1.5134033
## lambda_mean lambda_lower lambda_higher delta_mu_mean delta_mu_lower
## 1 1.0119954 0.3224607 2.161149 -1.77132399 -3.48890091
## 2 1.0119954 0.3224607 2.161149 0.35844465 -1.26064894
## 3 1.0119954 0.3224607 2.161149 0.16498144 -0.78640366
## 4 1.0119954 0.3224607 2.161149 NA NA
## 5 0.8496381 0.2450912 1.921787 -0.57386759 -2.03863408
## 6 0.8496381 0.2450912 1.921787 0.54027509 -1.62669964
## 7 0.8496381 0.2450912 1.921787 -0.37226374 -2.71009629
## 8 0.8496381 0.2450912 1.921787 NA NA
## 9 0.9655158 0.2761364 2.107832 0.72097689 -1.65591945
## 10 0.9655158 0.2761364 2.107832 -0.49338662 -2.44383906
## 11 0.9655158 0.2761364 2.107832 -0.05809512 -0.84438401
## 12 0.9655158 0.2761364 2.107832 NA NA
## 13 0.9002908 0.2600295 1.917869 -0.40957383 -2.23018300
## 14 0.9002908 0.2600295 1.917869 -0.56412334 -2.59359352
## 15 0.9002908 0.2600295 1.917869 1.58518974 0.01422167
## 16 0.9002908 0.2600295 1.917869 NA NA
## 17 0.9598713 0.3137463 2.072663 0.03271531 -1.04888160
## 18 0.9598713 0.3137463 2.072663 -0.44539280 -2.55761170
## 19 0.9598713 0.3137463 2.072663 1.38672976 -0.61534241
## 20 0.9598713 0.3137463 2.072663 NA NA
## delta_mu_higher mu_sample_1
## 1 0.8538723 2.21676697
## 2 1.6642612 0.39225717
## 3 0.9269958 1.48437269
## 4 NA 0.40349436
## 5 0.8889232 -0.14014834
## 6 3.0985311 0.14249756
## 7 1.7658805 0.13809973
## 8 NA 0.28360175
## 9 2.8005916 0.09143915
## 10 1.7819946 0.25087111
## 11 0.6617985 -0.31479305
## 12 NA -1.28407143
## 13 1.5618297 -0.11072647
## 14 1.8658387 -1.19636200
## 15 2.7335349 -0.93604309
## 16 NA 0.12836604
## 17 1.3252517 -0.13168035
## 18 1.7230051 0.42697069
## 19 3.2675492 0.95906152
## 20 NA 0.66776524
##
## $B
## metabolite.ID metabolite KEGG time.ID condition
## 1 1 D-Glyceraldehyde 3-phosphate C00118 1 B
## 2 1 D-Glyceraldehyde 3-phosphate C00118 2 B
## 3 1 D-Glyceraldehyde 3-phosphate C00118 3 B
## 4 1 D-Glyceraldehyde 3-phosphate C00118 4 B
## 5 2 L-Aspartic acid C00049 1 B
## 6 2 L-Aspartic acid C00049 2 B
## 7 2 L-Aspartic acid C00049 3 B
## 8 2 L-Aspartic acid C00049 4 B
## 9 3 L-Cystine C00491 1 B
## 10 3 L-Cystine C00491 2 B
## 11 3 L-Cystine C00491 3 B
## 12 3 L-Cystine C00491 4 B
## 13 4 L-Ornithine C00077 1 B
## 14 4 L-Ornithine C00077 2 B
## 15 4 L-Ornithine C00077 3 B
## 16 4 L-Ornithine C00077 4 B
## 17 5 sn-Glycero-3-phosphocholine C00670 1 B
## 18 5 sn-Glycero-3-phosphocholine C00670 2 B
## 19 5 sn-Glycero-3-phosphocholine C00670 3 B
## 20 5 sn-Glycero-3-phosphocholine C00670 4 B
## mu_mean mu_lower mu_higher sigma_mean sigma_lower sigma_higher
## 1 0.680518545 -0.6683945 1.9473186 1.1885220 0.44558290 3.2371923
## 2 -0.612642819 -1.8182371 0.6171568 0.9443886 0.36109227 2.4301684
## 3 -0.222327353 -1.5220002 1.0538391 1.0132145 0.38410625 2.7158316
## 4 -1.494228313 -1.7757938 -1.0744890 0.2036843 0.04923181 0.7263239
## 5 0.050956084 -0.4727332 0.9003809 0.3864943 0.09230017 1.4849209
## 6 -0.062025898 -1.7673388 1.6882449 1.5704535 0.62572873 4.1281126
## 7 0.088384778 -1.6901636 1.8909290 1.7357989 0.77738642 3.8714469
## 8 1.078307185 -0.7182393 2.5886813 1.4283866 0.60378297 3.3618967
## 9 -0.008230427 -0.6488444 0.7873298 0.5124710 0.18206683 1.4225681
## 10 0.264102643 -1.7701855 2.2137816 1.9119613 0.83342830 4.5009558
## 11 -0.101576209 -0.4625136 0.2225920 0.2477641 0.06288445 0.8853796
## 12 0.917895883 -0.0243056 1.7802360 0.6220361 0.19685580 1.9487355
## 13 -0.482550541 -1.4637474 0.6717681 0.8596810 0.27894313 2.4089042
## 14 0.967952945 0.4893213 1.4348131 0.3465173 0.11607979 1.0075564
## 15 -0.354974350 -2.1830129 1.6469037 1.6198762 0.67866082 3.8525584
## 16 -0.477788035 -2.1992533 1.2664147 1.6274025 0.71505420 3.7102699
## 17 -0.425064853 -0.6815931 -0.1728949 0.1582022 0.03622450 0.7089352
## 18 -0.515724823 -0.9419665 -0.1465657 0.2699924 0.07182277 0.9473766
## 19 0.586294447 0.2997497 0.9167643 0.1669603 0.03623882 0.6841196
## 20 0.064070178 -1.2021024 1.3202963 1.0870999 0.38741534 3.0613136
## lambda_mean lambda_lower lambda_higher delta_mu_mean delta_mu_lower
## 1 0.7945484 0.2192600 1.733390 -0.74254444 -2.83105514
## 2 0.7945484 0.2192600 1.733390 -0.03955031 -1.83956450
## 3 0.7945484 0.2192600 1.733390 -0.37621183 -2.14954340
## 4 0.7945484 0.2192600 1.733390 NA NA
## 5 0.9434170 0.2695416 2.072497 0.70102760 -1.38944907
## 6 0.9434170 0.2695416 2.072497 0.82951110 -1.22415982
## 7 0.9434170 0.2695416 2.072497 -1.34296074 -2.22080432
## 8 0.9434170 0.2695416 2.072497 NA NA
## 9 0.9286296 0.2836068 2.100301 1.30063454 -0.75867414
## 10 0.9286296 0.2836068 2.100301 -1.56085773 -3.33981247
## 11 0.9286296 0.2836068 2.100301 -0.03317428 -1.19722426
## 12 0.9286296 0.2836068 2.100301 NA NA
## 13 1.5946526 0.4868129 3.406007 1.48599789 0.71864409
## 14 1.5946526 0.4868129 3.406007 0.97618337 0.06889415
## 15 1.5946526 0.4868129 3.406007 -0.38165850 -0.94581596
## 16 1.5946526 0.4868129 3.406007 NA NA
## 17 0.7433722 0.2158311 1.647455 0.21314656 -2.06228935
## 18 0.7433722 0.2158311 1.647455 -0.61907699 -3.22558925
## 19 0.7433722 0.2158311 1.647455 0.41904453 -1.87426982
## 20 0.7433722 0.2158311 1.647455 NA NA
## delta_mu_higher mu_sample_1
## 1 1.4031269 1.19546880
## 2 1.7200697 -0.44512454
## 3 1.3467566 0.09602693
## 4 NA -1.51330575
## 5 2.8300996 0.65195686
## 6 2.8702968 -0.33817004
## 7 -0.3550539 0.55379305
## 8 NA 0.81453032
## 9 3.2215829 -0.05306674
## 10 0.5900914 -1.05535346
## 11 1.0039149 -0.12200799
## 12 NA 0.98830136
## 13 2.3262310 -0.49947209
## 14 1.8197588 0.84316614
## 15 0.1987320 -0.35238429
## 16 NA 0.04841430
## 17 2.2383621 -0.40558598
## 18 2.0935159 -0.61161284
## 19 2.7658248 0.61388345
## 20 NA -0.95641449
##
## $C
## metabolite.ID metabolite KEGG time.ID condition
## 1 1 D-Glyceraldehyde 3-phosphate C00118 1 C
## 2 1 D-Glyceraldehyde 3-phosphate C00118 2 C
## 3 1 D-Glyceraldehyde 3-phosphate C00118 3 C
## 4 1 D-Glyceraldehyde 3-phosphate C00118 4 C
## 5 2 L-Aspartic acid C00049 1 C
## 6 2 L-Aspartic acid C00049 2 C
## 7 2 L-Aspartic acid C00049 3 C
## 8 2 L-Aspartic acid C00049 4 C
## 9 3 L-Cystine C00491 1 C
## 10 3 L-Cystine C00491 2 C
## 11 3 L-Cystine C00491 3 C
## 12 3 L-Cystine C00491 4 C
## 13 4 L-Ornithine C00077 1 C
## 14 4 L-Ornithine C00077 2 C
## 15 4 L-Ornithine C00077 3 C
## 16 4 L-Ornithine C00077 4 C
## 17 5 sn-Glycero-3-phosphocholine C00670 1 C
## 18 5 sn-Glycero-3-phosphocholine C00670 2 C
## 19 5 sn-Glycero-3-phosphocholine C00670 3 C
## 20 5 sn-Glycero-3-phosphocholine C00670 4 C
## mu_mean mu_lower mu_higher sigma_mean sigma_lower sigma_higher
## 1 -0.064276892 -1.40217622 1.3848032009 1.2105146 0.43608513 3.2387780
## 2 0.032083342 -1.56192182 1.4593243385 1.2993705 0.48266958 3.5416920
## 3 0.170725767 -0.41321996 0.7176219980 0.4050488 0.11000769 1.4313810
## 4 0.579038565 -0.09599894 1.1862044006 0.5027497 0.14675493 1.6589945
## 5 -0.279207266 -0.69448726 0.0897290802 0.3061713 0.07442345 1.1995745
## 6 0.408262511 -1.77489080 2.4236555700 2.0946117 0.90781027 4.7320348
## 7 -0.272792814 -1.84246927 1.3811018603 1.3711082 0.53300195 3.7231221
## 8 -0.172837206 -2.53751013 1.9342711978 2.1420151 1.03525812 4.5967800
## 9 -0.413342461 -2.40249012 1.8373410878 2.0413866 0.94320573 4.3751741
## 10 -0.194221834 -2.18422926 1.8419586673 2.1063278 0.95011270 4.9050232
## 11 -0.002704189 -0.94145925 0.9893370333 0.7467817 0.26064876 2.0711889
## 12 0.271635565 -1.74317626 2.0867469730 1.9845848 0.89190546 4.3462537
## 13 0.297453605 -0.56257565 1.0310860608 0.6050895 0.18638032 1.9006589
## 14 -0.054741803 -1.29153376 1.3100191712 1.1499665 0.43590723 3.0371308
## 15 0.222394853 -0.95528678 1.5573844523 1.0282203 0.35711448 2.8746179
## 16 -0.439862243 -1.25546285 0.6077990507 0.6113345 0.16489347 2.1331977
## 17 -0.106135406 -0.33386664 0.0921285274 0.1608721 0.03584342 0.6670460
## 18 -0.288866289 -0.43005218 -0.0003489687 0.1213769 0.02247084 0.5840422
## 19 0.044784379 -0.86428671 0.9959822976 0.7265719 0.22599222 2.2267314
## 20 0.254203751 -1.20084355 1.6740986545 1.2238226 0.49251146 2.8165528
## lambda_mean lambda_lower lambda_higher delta_mu_mean delta_mu_lower
## 1 0.7961459 0.2264445 1.783069 0.47253940 -2.263884
## 2 0.7961459 0.2264445 1.783069 -0.41096670 -2.687232
## 3 0.7961459 0.2264445 1.783069 -0.43715805 -1.665824
## 4 0.7961459 0.2264445 1.783069 NA NA
## 5 0.7627807 0.2048076 1.664875 -0.30487616 -2.506041
## 6 0.7627807 0.2048076 1.664875 0.54442838 -2.022419
## 7 0.7627807 0.2048076 1.664875 -0.37777097 -2.199659
## 8 0.7627807 0.2048076 1.664875 NA NA
## 9 0.9874880 0.2682886 2.145868 -0.34356297 -2.741970
## 10 0.9874880 0.2682886 2.145868 0.47029081 -1.858476
## 11 0.9874880 0.2682886 2.145868 -0.58631989 -1.343416
## 12 0.9874880 0.2682886 2.145868 NA NA
## 13 0.8012866 0.2363499 1.765453 -0.99238103 -3.062444
## 14 0.8012866 0.2363499 1.765453 0.35860066 -2.180114
## 15 0.8012866 0.2363499 1.765453 0.09952618 -1.527789
## 16 0.8012866 0.2363499 1.765453 NA NA
## 17 0.7903309 0.2328690 1.707296 0.08498543 -1.992324
## 18 0.7903309 0.2328690 1.707296 0.41661669 -1.993618
## 19 0.7903309 0.2328690 1.707296 0.03180890 -1.776007
## 20 0.7903309 0.2328690 1.707296 NA NA
## delta_mu_higher mu_sample_1
## 1 2.8553899 1.12081373
## 2 1.9534251 -0.66709147
## 3 0.9048894 0.39631588
## 4 NA 0.78676178
## 5 1.8828278 -0.30455647
## 6 2.9608471 1.89317764
## 7 1.6466283 -0.85497478
## 8 NA 0.73463907
## 9 1.8323460 1.83233580
## 10 2.9655165 0.64477948
## 11 0.3955908 -0.04424522
## 12 NA -0.35649096
## 13 1.4977957 0.34143415
## 14 2.6848815 -0.80992498
## 15 1.6867788 -0.89426526
## 16 NA -0.22670254
## 17 2.0957328 -0.10259067
## 18 2.9648922 -0.28119329
## 19 1.8340024 -1.06972306
## 20 NA -0.11341971
With the estimates obtained with the estimates_dynamics function we can cluster metabolite dynamics per experimental condition.
cluster <- # clustering results have to be stored in separate object when using data frame as input
cluster_dynamics(
data = estimates, # data is now the estimates or a data frame ob similar structure
distance = "euclidean", # which distance method should be used
agglomeration = "ward.D2", # which agglomeration method for hierarchical clustering should be used
deepSplit = 2, # sensitivity of cluster analysis,
minClusterSize = 1 # minimum number of metabolites in one cluster
)
## Is your data normalized and standardized?
## We recommend normalization by log-transformation.
## Scaling and centering (mean=0, sd=1) should be metabolite and condition specific.
## ..cutHeight not given, setting it to 1.82 ===> 99% of the (truncated) height range in dendro.
## ..done.
## ..cutHeight not given, setting it to 2.82 ===> 99% of the (truncated) height range in dendro.
## ..done.
## ..cutHeight not given, setting it to 1.2 ===> 99% of the (truncated) height range in dendro.
## ..done.
To visualize the clustering results we can use the function plot_cluster which returns a list. One element “lineplot” which visualizes the metabolite dynamics in each cluster for every condition:
plots <- plot_cluster(cluster)
plots[["lineplot"]]
Additionally we can visualize for every experimental condition the dendrogram and a Principal component analysis of the clustering solution.
# For condition A
plots[["A"]]
## $dendrogram
##
## $PCA_plot
## Warning: Computation failed in `stat_ellipse()`.
## Caused by error in `if (all(abs(w - w0) < tol)) ...`:
## ! missing value where TRUE/FALSE needed
To conduct the over-representation analysis KEGG IDs of the experimental metabolites are needed. Additionally information on all metabolites that are stored in the KEGG database is needed. This information is stored in the dataset “modules_compounds”. To apply ORA to a specific dataset a second dataset “metabolite_modules” is needed that can be obtained by filtering “modules_compounds” for the experimental metabolites.
The clustering result also needs to be converted from a list to a single data frame.
# load background dataset
data("modules_compounds")
# get KEGGs from data
KEGGs <- unique(data$KEGG)
# construct "metabolite_modules" data set
metabolite_modules <- modules_compounds%>%filter(KEGG%in%KEGGs)
# convert clustering results from list to dataframe
cluster_data <- rbind(cluster[["A"]][["data"]],
cluster[["B"]][["data"]],
cluster[["C"]][["data"]])
The over-representation analysis and visualization can be achieved with the following code
ora <- # store ORA result to separate object when using data frames as input
ORA_hypergeometric(
background = modules_compounds,
annotations = metabolite_modules,
data = cluster_data, # dataframe format of clustering output
tested_column = "middle_hierarchy"
)
# Visualization
plot_ORA(
data = ora,
tested_column = "middle_hierarchy"
)
For the comparison of dynamics clusters between experimental conditions one can employ two apporaches: comparing dynamics between clusters and comparing metabolite composition between clusters.
To compare dynamics between clusters awhile using a data frame as input use the following code:
comparison_dynamics <- # result needs to be stored in a separate object when using data frames
compare_dynamics(
data = cluster_data, # clustering result
dynamics = c("1","2","3","4"), # which columns specify the time point specific mean metabolite abundances?
cores = 1 # only set to 1 for vignette, can be increased to 4 for parallelization
)
##
## SAMPLING FOR MODEL 'm_cluster_distances_padded' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 1: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 1: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 1: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 1: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 1: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 1: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 1: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.144 seconds (Warm-up)
## Chain 1: 0.342 seconds (Sampling)
## Chain 1: 0.486 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'm_cluster_distances_padded' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 1.3e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 2: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 2: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 2: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 2: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 2: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 2: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 2: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.149 seconds (Warm-up)
## Chain 2: 0.305 seconds (Sampling)
## Chain 2: 0.454 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'm_cluster_distances_padded' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 1.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 3: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 3: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 3: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 3: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 3: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 3: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 3: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.144 seconds (Warm-up)
## Chain 3: 0.352 seconds (Sampling)
## Chain 3: 0.496 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'm_cluster_distances_padded' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 1.3e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 501 / 2000 [ 25%] (Sampling)
## Chain 4: Iteration: 700 / 2000 [ 35%] (Sampling)
## Chain 4: Iteration: 900 / 2000 [ 45%] (Sampling)
## Chain 4: Iteration: 1100 / 2000 [ 55%] (Sampling)
## Chain 4: Iteration: 1300 / 2000 [ 65%] (Sampling)
## Chain 4: Iteration: 1500 / 2000 [ 75%] (Sampling)
## Chain 4: Iteration: 1700 / 2000 [ 85%] (Sampling)
## Chain 4: Iteration: 1900 / 2000 [ 95%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.227 seconds (Warm-up)
## Chain 4: 0.376 seconds (Sampling)
## Chain 4: 0.603 seconds (Total)
## Chain 4:
## Warning: There were 280 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
The data frame needed for visualization of the results is the list element [[“estimates”]] of the function results. To visualize the results run the following code:
# Visualize comparison results
heatmap_dynamics(estimates = comparison_dynamics[["estimates"]],
data = cluster_data)
The comparison of metabolite composition follows the same principle as the comparison of dynamics between clusters:
# compare metabolite composition
compare_metabolites <-
compare_metabolites(
data = cluster_data
)
# Visualization
heatmap_metabolites(
distances = compare_metabolites,
data = cluster_data
)
The combination of both comparisons may facilitate detection of differences of longitudinal metabolomes between experimental conditions.
# combine comparison results
temp <- left_join(
comparison_dynamics[["estimates"]], # dynamics comparison
compare_metabolites,
join_by("cluster_a","cluster_b") # join by cluster comparisons
)
# get unique clusters
x <- unique(c(temp[,"cluster_a"], temp[,"cluster_b"]))
# draw plot
ggplot(temp, aes(x = cluster_b, y = cluster_a)) +
geom_point(aes(size = Jaccard, col = mu_mean)) +
theme_bw() +
scale_color_viridis_c(option = "magma") +
scale_x_discrete(limits = x) +
xlab("") +
ylab("") +
scale_y_discrete(limits = x) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
labs(col = "dynamics distance", size = "metabolite similarity") +
ggtitle("comparison of clusters", "label = condition + cluster ID")
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] tidyr_1.3.1 dplyr_1.1.4
## [3] ggplot2_3.5.2 SummarizedExperiment_1.39.1
## [5] Biobase_2.69.0 GenomicRanges_1.61.1
## [7] Seqinfo_0.99.2 IRanges_2.43.0
## [9] S4Vectors_0.47.0 BiocGenerics_0.55.0
## [11] generics_0.1.4 MatrixGenerics_1.21.0
## [13] matrixStats_1.5.0 MetaboDynamics_1.1.2
## [15] BiocStyle_2.37.0
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.52 bslib_0.9.0
## [4] QuickJSR_1.8.0 inline_0.3.21 lattice_0.22-7
## [7] vctrs_0.6.5 tools_4.5.1 curl_6.4.0
## [10] parallel_4.5.1 tibble_3.3.0 pkgconfig_2.0.3
## [13] Matrix_1.7-3 RColorBrewer_1.1-3 RcppParallel_5.1.10
## [16] lifecycle_1.0.4 compiler_4.5.1 farver_2.1.2
## [19] tinytex_0.57 codetools_0.2-20 htmltools_0.5.8.1
## [22] sass_0.4.10 yaml_2.3.10 pillar_1.11.0
## [25] crayon_1.5.3 jquerylib_0.1.4 MASS_7.3-65
## [28] DelayedArray_0.35.2 cachem_1.1.0 magick_2.8.7
## [31] StanHeaders_2.32.10 viridis_0.6.5 abind_1.4-8
## [34] rstan_2.32.7 tidyselect_1.2.1 digest_0.6.37
## [37] purrr_1.1.0 bookdown_0.43 labeling_0.4.3
## [40] fastmap_1.2.0 grid_4.5.1 cli_3.6.5
## [43] SparseArray_1.9.1 magrittr_2.0.3 loo_2.8.0
## [46] S4Arrays_1.9.1 utf8_1.2.6 dichromat_2.0-0.1
## [49] pkgbuild_1.4.8 dynamicTreeCut_1.63-1 withr_3.0.2
## [52] scales_1.4.0 rmarkdown_2.29 XVector_0.49.0
## [55] gridExtra_2.3 evaluate_1.0.4 knitr_1.50
## [58] V8_6.0.4 viridisLite_0.4.2 rstantools_2.4.0
## [61] rlang_1.1.6 Rcpp_1.1.0 dendextend_1.19.1
## [64] glue_1.8.0 BiocManager_1.30.26 jsonlite_2.0.0
## [67] R6_2.6.1