--- title: "MetaboDynamics: analyzing longitudinal metabolomics data with probabilistic models" package: MetaboDynamics author: "Katja Danielzik (katja.danielzik@uni-due.de)" output: BiocStyle::html_document: toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{1. MetaboDynamics} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r style, echo = FALSE, results = 'asis'} BiocStyle::markdown() ``` # Data requirements - MetaboDynamics requires at least triplicates of metabolite abundances for all analyzed time points and conditions. No minimum number of metabolites is required. Time points are treated as categorical variables and have to be the same for all conditions. - Data can be a data frame (see Vignette "using MetaboDynamics with data frames) or a SummarizedExperiment object (demonstrated in this vignette). - Metabolite abundances have to be stored in column named "metabolite", time points in "time" and the experimental condition in "condition" - For clustering of metabolite abundance dynamics are at least two time points required - To conduct over-representation analysis on KEGG functional modules KEGG IDs of metabolites are required - To conduct comparison of metabolite abundance dynamics between clusters at least two experimental conditions are required - If an in vitro experiment is analyzed andc ell counts should be incorporated in model, cell counts have to be estimated (i.e. cell counter not counting every cell) and independent (i.e. separate wells for cell counts and metabolite abundances) as a poisson distribution is assumed. At least one replicate of cell counts per time point and condition is required. Metabolite abundances have to be adequately normalized and scaled. The expected scaling is explained in the following workflow. In the case of an in vitro experiment with available cell counts for all time points and conditions no scaling or normalization is necessary as this will be handled by the fit_dynamics_model() function (model="raw_plus_counts") # Analysis time: For two conditions of 100 metabolites each the fitting of the metabolite abundance dynamics model will take 10-15 minutes (one core per chain). Total workflow time is approximately 15-20 minutes. # Background Metabolomics data from longitudinal or time-course high-throughput experiments, such as untargeted liquid chromatography-mass spectrometry (LC-MS), allows for insights into changes of metabolite abundances over time. However, most existing tools are limited to comparing only two time points or experimental conditions, and they often rely on frequentist statistical methods. Additionally, most tool s are only able to analyze groups of metabolites on a pathway level, potentially missing more general patterns of changes in the metabolism in sparse data. Here, we introduce MetaboDynamics, a modular workflow of probabilistic models for the analysis of longitudinal metabolomics data. This package provides a flexible framework for analyzing longitudinal data, allowing users to model changes in metabolite abundances over time and identify patterns of interest. Metabolomics data is often noisy and limited by few replicates due to high costs. To address these challenges, we employ a Bayesian hierarchical model that balances between under-fitting and over-fitting. This model provides a robust method for estimating mean abundances at every time point as well as abundance differences between subsequent time points, even with limited replicates. 
# Analysis time

For two conditions of 100 metabolites each, fitting the metabolite abundance dynamics model takes 10-15 minutes (one core per chain). The total workflow time is approximately 15-20 minutes.

# Background

Metabolomics data from longitudinal or time-course high-throughput experiments, such as untargeted liquid chromatography-mass spectrometry (LC-MS), allows for insights into changes of metabolite abundances over time. However, most existing tools are limited to comparing only two time points or experimental conditions, and they often rely on frequentist statistical methods. Additionally, most tools can only analyze groups of metabolites on a pathway level, potentially missing more general patterns of changes in the metabolism in sparse data.

Here, we introduce MetaboDynamics, a modular workflow of probabilistic models for the analysis of longitudinal metabolomics data. This package provides a flexible framework for analyzing longitudinal data, allowing users to model changes in metabolite abundances over time and identify patterns of interest. Metabolomics data is often noisy and limited to few replicates due to high costs. To address these challenges, we employ a Bayesian hierarchical model that balances between under-fitting and over-fitting. This model provides a robust method for estimating mean abundances at every time point as well as abundance differences between subsequent time points, even with limited replicates. An additional output of the model is the difference between the dynamics of different experimental conditions.

The MetaboDynamics workflow requires abundance tables as input, which can include metabolites and their observed abundances, LC-MS peak intensities or areas. The workflow can handle any number of time points, and the dynamics model requires at least three replicates. However, functions that compare dynamics with each other do not require a certain number of replicates and can be applied to calculated mean abundances of e.g. two replicates.

In this vignette, I will demonstrate how to use the MetaboDynamics package to analyze longitudinal metabolomics data, including data preprocessing, model fitting, and interpretation of results. We will show how to apply the package to a simulated data set and how to visualize and interpret the results.

# Setup: load required packages

```{r setup}
library(MetaboDynamics)
library(SummarizedExperiment) # storing and manipulating simulated metabolomics data
library(ggplot2) # visualization
library(dplyr) # data handling
library(tidyr) # data handling
```

# Load data and plot data overview

The MetaboDynamics package includes a simulated data set called "longitudinalMetabolomics". This data set consists of 98 metabolites with three replicates at four time points (1-4) across two experimental conditions (A-B). The data set is represented as a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) object, where the abundance tables of each experimental condition are stored in assays (raw abundances in "concentrations", log-transformed abundances in "log_con" and scaled (mean = 0, sd = 1) log-transformed abundances in "scaled_log"), and the metabolite names, KEGG IDs, experimental conditions, and clustering solutions per experimental condition are stored in colData and rowData. If you have a data frame as input and do not want to convert it to a SummarizedExperiment object, please see the vignette "using MetaboDynamics with data frames" for guidance.

To load the data, execute the following code:

```{r}
data("longitudinalMetabolomics", package = "MetaboDynamics")
```

The next plot shows the raw metabolite abundances.

```{r,fig.wide=TRUE}
# convert to dataframe
abundances <- as.data.frame(cbind(
  as.data.frame(t(assays(longitudinalMetabolomics)$concentrations)),
  as.data.frame(colData(longitudinalMetabolomics))
))
abundances <- abundances %>% pivot_longer(
  cols = seq_len(4),
  names_to = "time", values_to = "abundance"
)
ggplot(abundances, aes(x = abundance)) +
  geom_freqpoly() +
  theme_bw() +
  facet_grid(cols = vars(time), rows = vars(condition)) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  ggtitle("raw data", "raw measurements")
```

In this plot I visualize the distribution of metabolite abundances. We can see many observations with low abundances and few observations with higher abundances. The raw data is not normally distributed, so let's log-transform the values. In the integrated simulated data set this is already done in the column "log_m".

```{r,fig.wide=TRUE}
ggplot(abundances, aes(x = abundance)) +
  geom_density() +
  scale_x_log10() +
  theme_bw() +
  facet_grid(cols = vars(time), rows = vars(condition)) +
  ggtitle("data", "log-transformed values")
```

We can see that the metabolite abundances vary over orders of magnitude and are normally distributed after log-transformation. The next plot shows the log-transformed dynamics of single metabolites.
```{r,fig.wide=TRUE}
ggplot(abundances) +
  geom_line(aes(
    x = time, y = abundance, col = metabolite,
    group = interaction(metabolite, replicate)
  )) +
  scale_y_log10() +
  theme_bw() +
  xlab("timepoint") +
  scale_color_viridis_d() +
  theme(legend.position = "none") +
  facet_grid(rows = vars(condition)) +
  ggtitle("raw metabolite dynamics", "color=metabolite")
```

The dynamics profiles of single metabolites have intercepts (i.e. mean metabolite abundances) that differ from each other by orders of magnitude. To be able to compare metabolites with each other, we define dynamics as deviations at the observed time points from the metabolite's mean abundance. For this we standardize each metabolite at each experimental condition to a mean of zero and a standard deviation of one. In the simulated data set the scaled measurements are in column "m_scaled".

```{r,fig.wide=TRUE}
data("longitudinalMetabolomics")
# convert to dataframe
abundances <- as.data.frame(cbind(
  as.data.frame(t(assays(longitudinalMetabolomics)$scaled_log)),
  as.data.frame(colData(longitudinalMetabolomics))
))
abundances <- abundances %>% pivot_longer(
  cols = seq_len(4),
  names_to = "time", values_to = "abundance"
)
ggplot(abundances) +
  geom_line(aes(
    x = time, y = abundance, col = metabolite,
    group = interaction(metabolite, replicate)
  )) +
  theme_bw() +
  scale_color_viridis_d() +
  xlab("timepoint") +
  ylab("deviation from mean abundance") +
  facet_grid(rows = vars(condition)) +
  theme(legend.position = "none") +
  ggtitle("standardized dynamics", "color=metabolite")
```

# Model dynamics

To analyze the dynamics data, we use a Bayesian hierarchical model. This model accounts for the complexity and variability of the data and provides a flexible and robust framework for analysis. The model is defined as follows, with con = metabolite abundances, m = metabolite, c = experimental condition and t = time point ID:

\begin{align*}
\log(con_{m,c,t})&\sim {\sf normal}(\mu_{m,c,t},\sigma_{m,c,t}) \\
\mu_{m,c,t}&\sim {\sf normal}(0,2) \\
\sigma_{m,c,t}&\sim {\sf exponential}(\lambda_{m,c}) \\
\lambda_{m,c}&\sim {\sf exponential}(2)
\end{align*}

The model assumes a normal distribution of log-transformed and scaled metabolite abundances and estimates a mean $\mu$ per metabolite and time point. The spread $\sigma$ of the replicate measurements is also estimated per metabolite and time point, but the spreads of different time points inform each other through the hierarchical structure of the hyper-parameter $\lambda$. This partial pooling allows for a balance between over- and under-fitting.

If the untargeted LC-MS experiment is conducted in vitro and independent cell counts (i.e. cell counts were measured in a different well than metabolite abundances) are available, an adapted model can be run that takes the *raw* (i.e. not normalized and not scaled) metabolite abundances and the cell counts as input. The "raw_plus_counts" model is slower but has the advantage of additionally incorporating the uncertainty of the cell counts into the estimates of the Bayesian model.

The code below shows how to fit the Bayesian model with the function fit_dynamics_model(). For this vignette only, the number of chains is set to 1; to get reliable estimates at least four chains should be used. If more than one CPU core is available, chains can be run in parallel (parameter "cores"). Here we fit the dynamics model for a small subset of the simulated metabolites.
```{r,fig.wide=TRUE}
# we can hand a SummarizedExperiment object to the function
data("longitudinalMetabolomics")
# we only use a subsample of the simulated data (four metabolites in both
# conditions) for demonstration purposes
samples <- c("UMP", "PEP", "2-Aminomuconate", "ATP")
data <- longitudinalMetabolomics[, longitudinalMetabolomics$condition %in% c("A", "B") &
  longitudinalMetabolomics$metabolite %in% samples]
# fit model
data <- fit_dynamics_model(
  model = "scaled_log",
  data = data, scaled_measurement = "m_scaled",
  assay = "scaled_log", adapt_delta = 0.95, # default 0.95
  iter = 2000,
  cores = 1,
  chains = 1 # only set to 1 for vignette, use 4 chains
)
```

This returns a fit. If data is a SummarizedExperiment object, the fits are stored in metadata(data)[["dynamic_fit"]]. MetaboDynamics fits Bayesian models with MCMC (Markov chain Monte Carlo), which returns a list of diagnostic criteria:

- rhat = R-hat convergence diagnostic (should be 1)
- neff = number of effective samples (should be above 100)
- divergences = number of divergent transitions (should be 0)
- max_treedepth = number of iterations that exceed the maximum treedepth (should be 0)

Hints on how to adjust the parameters of fit_dynamics_model() can be found in the function documentation (type "?fit_dynamics_model" in the R console when MetaboDynamics is loaded).

With diagnostics_dynamics() we can extract all diagnostic criteria of the model fit and visualize them. If data is a SummarizedExperiment object, the diagnostics are stored in metadata(data)[["diagnostics_dynamics"]]. The diagnostic criteria can be visualized with the function plot_diagnostics().

```{r}
# extract diagnostics
data <- diagnostics_dynamics(
  data = data,
  iter = 2000, # number of iterations used for fitting the dynamic model
  chains = 1 # number of chains used for model fitting
)
plot_diagnostics(
  data = data
)
```

In this example we have zero divergent transitions, zero iterations exceeding the maximum tree depth, Rhat values consistently below 1.01, and numbers of effective samples comfortably above 100. All of this indicates that the fitting of the model went well.

The result of a Bayesian model (the posterior) should predict the experimental data well. This can be checked with a posterior predictive check (PPC), in which the experimental data (points) should lie within the predictions (violins).

```{r}
# PPCs can be accessed with
plot_PPC(
  data = data
)
```

The PPC plot shows that the experimental data points are consistent with the predicted distribution, suggesting that the model is a good fit to the data.

## Model outputs

After checking the diagnostic criteria and the PPC we can extract the estimates from the model fit with the estimates_dynamics() function. If data is a SummarizedExperiment object, the estimates are stored in metadata(data)[["estimates_dynamics"]].

```{r,fig.wide=TRUE}
# extract estimates
data <- estimates_dynamics(
  data = data
)
```

To visualize the estimates we can use the plot_estimates() function.

## Differences of metabolite abundances between time points

First we visualize the differences of metabolite abundances between subsequent time points (parameter "delta_t"). In the following plots the column label (e.g. "A") indicates the experimental condition and the row label indicates the compared time points (e.g. "timepoint 2 - timepoint 1"). The metabolites are ordered on the x-axis according to the estimated mean differences between time points, and the estimated abundance differences ("delta") are shown on the y-axis.
```{r,fig.wide=TRUE}
# 1) the differences between two timepoints
plots <- plot_estimates(
  data = data,
  delta_t = TRUE,
  dynamics = FALSE,
  distance_conditions = FALSE
)
plots$delta_t$`A_time_point_2-time_point_1`
plots$delta_t$`A_time_point_3-time_point_2`
```

The results of Bayesian models can be interpreted intuitively: the 95% credible interval (CrI) (95% highest density interval) indicates the range in which the parameter (in this case the difference in abundance between two time points) lies with 95% probability. That means that with 95% probability the true value lies within the range of the error bars. If the 95% CrI lies below zero, we can conclude that there is a decrease in abundance between the two subsequent time points (e.g. PEP between time point 1 and 2, red error bar). If the 95% CrI lies above zero, we can conclude that there is an increase in abundance between the two subsequent time points. If the 95% CrI contains zero, we cannot conclude that there is a difference in abundance between the two time points.

### Differences of dynamics between experimental conditions

If the parameter "distance_conditions" is set to TRUE, a ranked list of differences between the metabolite dynamics of different conditions is visualized.

```{r,fig.wide=TRUE}
# 2) the differences between experimental conditions
plot_estimates(
  data = data,
  delta_t = FALSE,
  dynamics = FALSE,
  distance_conditions = TRUE
)
```

In the case of our simulated data set the uncertainty (i.e. the CrI width) of the estimated differences between dynamics is rather high. Larger Euclidean distances indicate larger differences between dynamics, smaller Euclidean distances smaller differences.

### Dynamic profiles

The dynamic profile of each metabolite can be visualized using the estimated mean abundances at each time point (parameter "dynamics"). The plot below shows the estimated mean abundance at each time point for each metabolite.

```{r,fig.wide=TRUE}
# 3) dynamic profiles
plot_estimates(
  data = data,
  dynamics = TRUE,
  distance_conditions = FALSE,
  delta_t = FALSE
)
```

The dynamic profiles can be used to identify patterns and trends in the data, such as shifts in abundance over time, and to visualize the changes in abundance for each metabolite over the observed time interval.

The estimates of the dynamics model are returned by the function estimates_dynamics() and can be accessed as shown below. The estimates are ordered by condition.

```{r}
# get estimates
estimates <- metadata(data)[["estimates_dynamics"]]
# euclidean distances between the dynamics of the two conditions
estimates$euclidean_distances
```

To identify groups of metabolites that have similar dynamics we can now cluster the metabolite-specific dynamics vectors (estimates$mu) to determine which groups of metabolites have similar dynamics.

# Cluster dynamics

MetaboDynamics has a built-in wrapper function for clustering with the "hybrid" method of the package [dynamicTreeCut](https://cran.r-project.org/web/packages/dynamicTreeCut/index.html). Bootstrapping of the clustering solution is implemented with draws from the posterior distribution of mean metabolite abundances. Bootstrapping allows for the estimation of clustering reliability. From here on I will demonstrate the workflow on the whole simulated data set.

The code below shows how to use the cluster wrapper function cluster_dynamics() on a SummarizedExperiment object where the dynamics estimates are stored in metadata(data)[["estimates_dynamics"]].
```{r,fig.wide=TRUE}
data <- cluster_dynamics(data,
  deepSplit = 2, # low clustering sensitivity
  B = 10 # number of bootstraps
)
```

For the visualization of the clustering results we can use the function plot_cluster().

```{r}
plots <- plot_cluster(data)
```

The function returns three outputs:

1) phylogenetic trees ([ggtree](https://www.bioconductor.org/packages/release/bioc/html/ggtree.html)) (dendrograms) with numbers on the nodes indicating the number of bootstraps with the same clustering solution. Similar visualizations have been implemented in [scbubbletree](https://bioconductor.org/packages/release/bioc/html/scBubbletree.html) and [cellmig](https://github.com/snaketron/cellmig). To plot the phylogenetic tree of condition B run:

```{r}
plots$trees$B
```

In half of the 10 bootstraps 2-Aminomuconate and UMP are clustered together, but ATP and PEP are clustered together in only 1 out of 10 bootstraps.

2) lineplots of the clustered metabolite dynamics

```{r}
plots$lineplots$B
```

and 3) a [patchwork](https://cran.r-project.org/web/packages/patchwork/index.html) plot of the phylogenetic tree, the cluster affiliations and the metabolite dynamics.

```{r}
plots$patchwork$B
```

# Over-representation analysis of functional modules in dynamics clusters

To assess the possible biological function of these dynamics clusters, we retrieved the following information from the KEGG database: 1. to which functional modules our experimental metabolites are annotated and 2. which metabolites are annotated to functional modules in general. The functional modules of the KEGG database are organised in three hierarchies: upper, middle and lower. The middle hierarchy holds modules such as "Amino acid metabolism", the lower hierarchy for example "branched-chain amino acid metabolism".

To facilitate the analysis, two data frames were prepared: "metabolite_modules", which holds the information about the metabolites of the simulated data set, and "modules_compounds", which holds the information about which metabolites are in general annotated to functional modules. To construct the needed annotation data frame for your own experimental data set, simply filter the "modules_compounds" data set for your experimental metabolites' KEGG IDs.

We load both data sets and can inspect the documentation.

```{r}
data("metabolite_modules")
help("metabolite_modules")
head(metabolite_modules)
data("modules_compounds")
help("modules_compounds")
head(modules_compounds)
```

It is important to note that not all KEGG modules are suitable for testing in every observed organism and experimental condition. For example, the modules "Xenobiotics biodegradation", "Biosynthesis of other secondary metabolites", and "Biosynthesis of terpenoids and polyketides" should not be analyzed in a human lung cancer cell line.

For the ORA we additionally need a data frame annotating metabolites to KEGG IDs. For the simulated example data set this annotation is provided in data("IDs").

```{r}
data("IDs")
head(IDs)
```

For the functional analysis we employ a hypergeometric model. We consider a functional module as over-represented in a cluster if the 95% inter-quantile range (IQR) of the log-transformed probabilities of OvE (observed vs expected) lies above zero. OvE refers to the ratio of the number of observed metabolites in a cluster that map to a functional module over the number of metabolites in a cluster expected to map to that module under the assumption of a hypergeometric distribution (i.e. drawing without replacement).
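As a toy illustration of the OvE idea only (this is not the internal implementation of ORA_hypergeometric(), and all counts are made up rather than taken from the simulated data), the expected overlap and a simple spread of log(OvE) under a hypergeometric draw could be sketched like this:

```{r}
# toy numbers, purely illustrative
bg_total <- 100 # annotated metabolites in the background
bg_in_module <- 20 # background metabolites annotated to the module of interest
cluster_size <- 10 # metabolites in the dynamics cluster
observed <- 5 # cluster metabolites annotated to the module

# expected overlap when drawing cluster_size metabolites without replacement
expected <- cluster_size * bg_in_module / bg_total
log(observed / expected) # log(OvE): values above zero suggest over-representation

# spread of the overlap under the hypergeometric assumption
draws <- rhyper(nn = 1000, m = bg_in_module, n = bg_total - bg_in_module, k = cluster_size)
quantile(log((draws + 1) / (expected + 1)), probs = c(0.025, 0.975)) # +1 avoids log(0)
```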
If data is a SummarizedExperiment object where the clustering solution is stored in metadata(data)[["cluster"]], the ORA results are stored in metadata(data)[["ORA_tested_column"]]. We apply the functional analysis to the middle hierarchy of functional modules.

```{r,fig.wide=TRUE}
data <- ORA_hypergeometric(
  background = modules_compounds,
  annotations = metabolite_modules,
  IDs = IDs,
  data = data,
  tested_column = "middle_hierarchy"
)
plot_ORA(data = data)
```

Great, we can now see which functional modules are over- (green points and error bars) or under-represented (none in this example) in which dynamics cluster! For instance, in cluster 1 of condition A the module "Nucleotide metabolism" is over-represented.

If we want to attach the over-representation analysis to the previously generated patchwork plot of the cluster dynamics, we need to store the clustering plots and set the argument "patchwork" of plot_ORA() to TRUE.

```{r}
# get cluster plots
cluster <- plot_cluster(data = data)
# set patchwork to TRUE and provide cluster plots
ora <- plot_ORA(
  data = data, tested_column = "middle_hierarchy",
  patchwork = TRUE, plot_cluster = cluster
)
```

To visualize the dynamics and ORA results run:

```{r,fig.width=12,fig.height=6}
# patch plots together
library(patchwork)
p <- cluster$patchwork$A | ora$ora_patchwork$A
p + plot_layout(ncol = 4, widths = c(2, 1, 2, 2))
```

# Comparison of clusters of different experimental conditions

## Dynamics

We can not only conduct over-representation analysis of KEGG functional modules but also compare dynamics clusters across different experimental conditions. For this we employ a Bayesian model that estimates the mean difference as well as the standard deviation of differences between dynamics clusters. If data is a SummarizedExperiment object, the results of compare_dynamics() are stored in metadata(data)[["comparison_dynamics"]]. The model is defined as follows, with dist = vector of pairwise Euclidean distances between every dynamics vector of cluster a and every dynamics vector of cluster b, and ID = cluster pair ID:

\begin{align*}
dist_{ID}&\sim {\sf normal}(\mu_{ID},\sigma_{ID}) \\
\mu_{ID}&\sim {\sf normal^+}(0,2) \\
\sigma_{ID}&\sim {\sf exponential}(1)
\end{align*}

```{r,fig.align='center',fig.dpi=150}
data("longitudinalMetabolomics")
longitudinalMetabolomics <- compare_dynamics(
  data = longitudinalMetabolomics,
  cores = 1 # just set to 1 for vignette, set to 4 if possible
)
```

We can visualize the posterior estimates of the model with:

```{r,fig.align='center',fig.width=8,fig.height=6}
heatmap_dynamics(data = longitudinalMetabolomics)
```

This results in a bubble heat map in which each cluster of every condition (e.g. A_2 = cluster 2 of condition A) is compared with all clusters of every other condition. The comparison of clusters is reciprocal and therefore each comparison pair is only visualized once. The bigger and brighter a point, the smaller the mean distance between the dynamics clusters and the smaller the standard deviation. That means that big bright points indicate high similarity of dynamics with small spread. Here A_7 and B_8 have highly similar dynamics.

## Metabolites

We can also compare the metabolite composition of clusters. For this we employ the Jaccard index, a metric for the similarity of two sets. Values near 0 indicate low similarity, values near 1 high similarity. If data is a SummarizedExperiment object, the results of compare_metabolites() are stored in metadata(data)[["comparison_metabolites"]].
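As a quick illustration of the Jaccard index itself (toy metabolite sets chosen for demonstration, not output of compare_metabolites()):

```{r}
# toy metabolite sets, purely illustrative
cluster_a <- c("ATP", "UMP", "PEP")
cluster_b <- c("ATP", "UMP", "2-Aminomuconate", "L-Alanine")

# Jaccard index = size of the intersection / size of the union
length(intersect(cluster_a, cluster_b)) / length(union(cluster_a, cluster_b))
```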
```{r,fig.align='center',fig.width=8,fig.height=6}
longitudinalMetabolomics <- compare_metabolites(
  data = longitudinalMetabolomics
)
heatmap_metabolites(data = longitudinalMetabolomics)
```

The brighter a tile, the higher the similarity of two clusters in regard to their metabolite composition. Comparisons of clusters within one condition always have a Jaccard index of zero here, as every metabolite is assigned to only one cluster per condition; hence there is no similarity between clusters of the same experimental condition. We have two clusters that are similar in their metabolite composition: A_12 and B_8. If we compare that to the dynamics profiles and the ORA analysis, we see that similar functional modules are over-represented, as expected, BUT the dynamics differ between the two experimental conditions. Can we facilitate visualization of the combination of both measures?

## Combine both

```{r,fig.align='center',fig.height=6,fig.width=8}
dynamics <- metadata(longitudinalMetabolomics)[["comparison_dynamics"]][["estimates"]]
metabolites <- metadata(longitudinalMetabolomics)[["comparison_metabolites"]]
temp <- left_join(dynamics, metabolites, join_by("cluster_a", "cluster_b"))
x <- unique(c(temp[, "cluster_a"], temp[, "cluster_b"]))
temp <- temp %>% mutate(scale_Jaccard = scale(Jaccard))
ggplot(temp, aes(x = cluster_b, y = cluster_a)) +
  geom_point(aes(size = Jaccard, col = mu_mean)) +
  theme_bw() +
  scale_color_viridis_c(option = "magma") +
  scale_x_discrete(limits = x) +
  xlab("") +
  ylab("") +
  scale_y_discrete(limits = x) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  labs(col = "dynamics distance", size = "metabolite similarity") +
  ggtitle("comparison of clusters", "label = condition + cluster ID")
```

With this form of visualization we can easily spot clusters in which the experimental condition (e.g. kind or dose of treatment) has the biggest effect: these clusters are similar in regard to their composing metabolites (i.e. big points) but dissimilar (i.e. bright color) in regard to their dynamics. Here this is for example the cluster pair A_11 and B_12 (not a big point, but bright color). Clusters of metabolites that are not much affected by the experimental condition can be spotted as big (i.e. high metabolite similarity) dark (i.e. low dissimilarity in dynamics) points, for example the cluster pair A_6 and B_9.

```{r}
sessionInfo()
```