To paraphrase its website, tidymodels provides a collection of packages for modeling and machine learning that follow tidyverse principles. {equatiomatic} is now partly compatible with it, meaning that it can extract the equation from certain fitted tidymodels objects.
Here is an example (adapted from the {workflows} main page):
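library(equatiomatic) # extract_eq()
library(parsnip)      # linear_reg(), fit()
library(recipes)      # recipe(), step_ns(), prep(), juice()
# Preparation of the dataset using {recipes}
spline_cars <- recipe(mpg ~ ., data = mtcars) |>
  step_ns(disp, deg_free = 10)
spline_cars_prepped <- prep(spline_cars, mtcars)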
Here is a simple (tidy)model:
# Fitting of a least-squares linear model
lm_fit <- linear_reg() |>
  fit(mpg ~ ., data = juice(spline_cars_prepped))
We can extract the equation of this model with extract_eq():
\[ \begin{aligned} \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{hp}) + \beta_{3}(\operatorname{drat})\ + \\ &\quad \beta_{4}(\operatorname{wt}) + \beta_{5}(\operatorname{qsec}) + \beta_{6}(\operatorname{vs}) + \beta_{7}(\operatorname{am})\ + \\ &\quad \beta_{8}(\operatorname{gear}) + \beta_{9}(\operatorname{carb}) + \beta_{10}(\operatorname{disp\_ns\_01}) + \beta_{11}(\operatorname{disp\_ns\_02})\ + \\ &\quad \beta_{12}(\operatorname{disp\_ns\_03}) + \beta_{13}(\operatorname{disp\_ns\_04}) + \beta_{14}(\operatorname{disp\_ns\_05}) + \beta_{15}(\operatorname{disp\_ns\_06})\ + \\ &\quad \beta_{16}(\operatorname{disp\_ns\_07}) + \beta_{17}(\operatorname{disp\_ns\_08}) + \beta_{18}(\operatorname{disp\_ns\_09}) + \beta_{19}(\operatorname{disp\_ns\_10})\ + \\ &\quad \epsilon \end{aligned} \]
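The call that yields the equation above is not shown, so here is a minimal sketch of it, assuming {equatiomatic}, {parsnip} and {recipes} are installed. The `wrap = TRUE` argument is an assumption inferred from the multi-line layout of the output. The block rebuilds the objects defined earlier so that it can be run on its own.

```r
library(equatiomatic)
library(parsnip)
library(recipes)

# Rebuild the prepped recipe and the fitted model from the text
spline_cars <- recipe(mpg ~ ., data = mtcars) |>
  step_ns(disp, deg_free = 10)
spline_cars_prepped <- prep(spline_cars, mtcars)

lm_fit <- linear_reg() |>
  fit(mpg ~ ., data = juice(spline_cars_prepped))

# wrap = TRUE (assumed here) breaks the long equation over several lines
eq_lm <- extract_eq(lm_fit, wrap = TRUE)
eq_lm
```

In an R Markdown or Quarto document, printing the result in a chunk should be enough for the equation to be rendered.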
The {equatiomatic} extract_eq() function also works with models fitted using the {workflows} package.
library(workflows)
# A model compatible with {equatiomatic}
linear_lm <- linear_reg()
# A workflow object
car_wflow <- workflow() |>
  add_recipe(spline_cars) |>
  add_model(linear_lm)
Now you can prepare the recipe and estimate the model via a single call to fit(), with the result stored here as wflow_fit.
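As a sketch, the fit() call could look like the following. It mirrors the example on the {workflows} main page and assumes that the original data frame (mtcars) is passed as `data`. The recipe and workflow are rebuilt so that the block is self-contained.

```r
library(parsnip)
library(recipes)
library(workflows)

# Unprepped recipe and model specification, as defined in the text
spline_cars <- recipe(mpg ~ ., data = mtcars) |>
  step_ns(disp, deg_free = 10)
linear_lm <- linear_reg()

car_wflow <- workflow() |>
  add_recipe(spline_cars) |>
  add_model(linear_lm)

# fit() preps the recipe and estimates the model in one step
wflow_fit <- fit(car_wflow, data = mtcars)
wflow_fit
```

Because the workflow carries the unprepped recipe, there is no need to call prep() or juice() yourself here.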
You can also extract the equation from wflow_fit:
\[ \begin{aligned} \operatorname{..y} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{hp}) + \beta_{3}(\operatorname{drat})\ + \\ &\quad \beta_{4}(\operatorname{wt}) + \beta_{5}(\operatorname{qsec}) + \beta_{6}(\operatorname{vs}) + \beta_{7}(\operatorname{am})\ + \\ &\quad \beta_{8}(\operatorname{gear}) + \beta_{9}(\operatorname{carb}) + \beta_{10}(\operatorname{disp\_ns\_01}) + \beta_{11}(\operatorname{disp\_ns\_02})\ + \\ &\quad \beta_{12}(\operatorname{disp\_ns\_03}) + \beta_{13}(\operatorname{disp\_ns\_04}) + \beta_{14}(\operatorname{disp\_ns\_05}) + \beta_{15}(\operatorname{disp\_ns\_06})\ + \\ &\quad \beta_{16}(\operatorname{disp\_ns\_07}) + \beta_{17}(\operatorname{disp\_ns\_08}) + \beta_{18}(\operatorname{disp\_ns\_09}) + \beta_{19}(\operatorname{disp\_ns\_10})\ + \\ &\quad \epsilon \end{aligned} \]
Notice that the original name of the dependent variable is lost (it appears as ..y), but you can restore it manually using the swap_var_names= argument:
\[ \begin{aligned} \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{hp}) + \beta_{3}(\operatorname{drat})\ + \\ &\quad \beta_{4}(\operatorname{wt}) + \beta_{5}(\operatorname{qsec}) + \beta_{6}(\operatorname{vs}) + \beta_{7}(\operatorname{am})\ + \\ &\quad \beta_{8}(\operatorname{gear}) + \beta_{9}(\operatorname{carb}) + \beta_{10}(\operatorname{disp\_ns\_01}) + \beta_{11}(\operatorname{disp\_ns\_02})\ + \\ &\quad \beta_{12}(\operatorname{disp\_ns\_03}) + \beta_{13}(\operatorname{disp\_ns\_04}) + \beta_{14}(\operatorname{disp\_ns\_05}) + \beta_{15}(\operatorname{disp\_ns\_06})\ + \\ &\quad \beta_{16}(\operatorname{disp\_ns\_07}) + \beta_{17}(\operatorname{disp\_ns\_08}) + \beta_{18}(\operatorname{disp\_ns\_09}) + \beta_{19}(\operatorname{disp\_ns\_10})\ + \\ &\quad \epsilon \end{aligned} \]
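The two workflow equations above differ only in the `swap_var_names=` argument. Here is a hedged sketch of both calls, assuming a fitted workflow built as described in the text. `wrap = TRUE` is again an assumption inferred from the output layout, and the named vector maps the placeholder name `..y` back to `mpg`.

```r
library(equatiomatic)
library(parsnip)
library(recipes)
library(workflows)

# Rebuild and fit the workflow described in the text
spline_cars <- recipe(mpg ~ ., data = mtcars) |>
  step_ns(disp, deg_free = 10)
car_wflow <- workflow() |>
  add_recipe(spline_cars) |>
  add_model(linear_reg())
wflow_fit <- fit(car_wflow, data = mtcars)

# Equation with the placeholder name used for the outcome
eq_raw <- extract_eq(wflow_fit, wrap = TRUE)

# Same equation, with the outcome renamed to its original name
eq_named <- extract_eq(wflow_fit, wrap = TRUE,
  swap_var_names = c("..y" = "mpg"))
eq_named
```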
For some models, {broom} is not enough: you also need to load {broom.mixed} with library(broom.mixed) before you can extract the equation. This is the case for a Bayesian linear model fitted with the "stan" engine.
Note: this code is not run in the vignette to avoid heavy extra dependencies, but you can run it in your own R session.
library(broom.mixed) # Required for some models, or extract_eq() will choke!
bayes_fit <- linear_reg() |>
  set_engine("stan") |>
  fit(mpg ~ hp + drat, data = mtcars)
And the equation would be obtained with extract_eq(bayes_fit):
\[ E( \operatorname{mpg} ) = \alpha + \beta_{1}(\operatorname{hp}) + \beta_{2}(\operatorname{drat}) \]