---
title: "ANCOVA with Combined Treatment Groups"
author: "C&SP Methodology Team"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ANCOVA with Combined Treatment Groups}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

# Overview

This vignette provides a high-level illustration of **ANCOVA with combined treatment groups** in `junco`. Its main purpose is to show how to construct tables with combined groups using the `junco` function `a_summarize_ancova_j()`. The available options, such as weighting strategies, interaction models, and alternative estimands, are described.

# Clinical Context

ANCOVA is commonly used for covariate-adjusted analyses of change from baseline in clinical trials. Regulatory tables often require pooling of treatment arms while preserving the original estimand.

# Example Dataset

The example dataset is small and contains simulated values for `BASE` and `CHG`.

```{r data, message = FALSE}
library(dplyr)
library(rtables)
library(junco)

set.seed(101)

# Simulated data: 300 subjects cycled through three arms
adchg <- tibble(
  USUBJID = sprintf("SUBJ-%03d", 1:300),
  TRT01A = factor(rep(c("Placebo", "Low Dose", "High Dose"), length.out = 300)),
  REGION = factor(sample(c("EU", "US"), 300, replace = TRUE, prob = c(0.6, 0.4))),
  BASE = rnorm(300, 50, 10),
  CHG = rnorm(300, 0, 8)
)
```

# Combined Column Strategy

A combined treatment column can be added to an `rtables` layout by using a `combodf` data frame. As shown later in this document, adding `split_fun = add_combo_levels(combodf)` to the `split_cols_by()` call results in a combined treatment column in the final table.

```{r combodf}
# One combined column, "Active", pooling the two active dose arms
combodf <- tibble::tribble(
  ~valname, ~label, ~levelcombo, ~exargs,
  "ACTIVE", "Active", c("Low Dose", "High Dose"), list()
)
```

# ANCOVA Model

The ANCOVA model used here is:

```
CHG ~ TRT01A + BASE + REGION
```

The model is fitted once, using the original randomized treatment variable, for all columns, including the combined column.

# Least-squares means approach

The recommended approach for a combined treatment group column in an ANCOVA table is to derive the estimates for that column as contrasts (least-squares means) from the linear ANCOVA model. This is achieved by specifying the argument `method_combo = "contrasts"`.

```{r lyt_table}
lyt <- basic_table() |>
  split_cols_by(
    "TRT01A",
    ref_group = "Placebo",
    split_fun = add_combo_levels(combodf)
  ) |>
  add_colcounts() |>
  analyze(
    vars = "CHG",
    afun = a_summarize_ancova_j,
    var_labels = "Change from Baseline",
    extra_args = list(
      variables = list(arm = "TRT01A", covariates = c("BASE", "REGION")),
      ref_path = c("TRT01A", "Placebo"),
      method_combo = "contrasts",
      weights_combo = "equal",
      weights_emmeans = "equal",
      conf_level = 0.95
    )
  )

build_table(lyt, adchg)
```

## Weighting Strategies

Note the argument `weights_combo` in the layout definition above. When defining contrasts in a linear model, weights can be specified for averaging predictions. See `?emmeans::emmeans` for more details.

The following three weighting scenarios are available for `weights_combo`:

- `equal`: equal contribution of arms
- `proportional`: weighted by observed sample size
- `proportional_marginal`: marginal weights across interaction strata

If no interaction term is included in the model, `proportional` and `proportional_marginal` lead to the same weights (by observed sample size), so only two distinct options remain.
If instead an interaction term such as `TRT01A * REGION` is included, `proportional` and `proportional_marginal` differ slightly: `proportional` uses the proportions observed within the interaction strata.

# Collapsing Treatment Arms

The option `method_combo = "collapse"` is available as an alternative approach, mainly as a quick verification method and for consistency with another internal system.

In this scenario, the ANCOVA model is refitted on the pooled arms, which changes the estimand. Although the option is available, we recommend the `contrasts` approach and suggest using `method_combo = "collapse"` only when it is explicitly specified in the SAP.

# Summary

- The `method_combo = "contrasts"` approach is aligned with `emmeans` theory
- Combined LS means and differences are implemented as linear contrasts
- Full covariance propagation is preserved
- Interaction models require explicit decisions on within- and across-stratum weighting

See also `junco::a_summarize_ancova_j()`.

# Statistical Appendix: emmeans-based Estimation

## Notation

Let:

- \( \hat{\mu}_k \) denote the LS mean for treatment arm \( k \)
- \( \mathbf{\hat{\mu}} = (\hat{\mu}_1, \ldots, \hat{\mu}_K)^T \)
- \( \mathbf{V} = \mathrm{Var}(\mathbf{\hat{\mu}}) \)

All quantities are obtained from the fitted ANCOVA model.

---

## A. Combined LS Means

A combined treatment LS mean for a set of arms \( C \) is defined as:

\[
\hat{\mu}_{C} = \sum_{k \in C} w_k \, \hat{\mu}_k ,
\qquad \sum_{k \in C} w_k = 1
\]

where \( w_k \) are user-specified weights (e.g. equal or proportional).

### Variance

The variance of the combined LS mean is:

\[
\mathrm{Var}(\hat{\mu}_{C}) = \mathbf{w}^T \, \mathbf{V} \, \mathbf{w}
\]

where \( \mathbf{w} \) is the length-\( K \) weight vector with entries \( w_k \) for \( k \in C \) and 0 otherwise.

This corresponds exactly to a linear contrast implemented via `emmeans::contrast()`.

---

## B. Contrasts Between Treatment Groups

### Simple Treatment Differences

The estimated difference between two individual treatment arms \( k \) and \( r \) (e.g.
Low Dose vs Placebo) is defined as the contrast:

\[
\hat{\delta}_{k,r} = \hat{\mu}_k - \hat{\mu}_r
\]

with variance:

\[
\mathrm{Var}(\hat{\delta}_{k,r}) =
\mathrm{Var}(\hat{\mu}_k) + \mathrm{Var}(\hat{\mu}_r)
- 2 \, \mathrm{Cov}(\hat{\mu}_k, \hat{\mu}_r)
\]

This covariance term is automatically accounted for when contrasts are derived from the joint covariance matrix \( \mathbf{V} \).

---

### Differences Involving Combined Arms

For a combined treatment \( C \) compared to a reference arm \( r \notin C \):

\[
\hat{\delta}_{C,r} = \hat{\mu}_{C} - \hat{\mu}_r
\]

This can be written as a single linear contrast whose length-\( K \) weight vector \( \mathbf{w}_{C,r} \) has entries:

\[
(\mathbf{w}_{C,r})_k =
\begin{cases}
w_k & \text{if } k \in C \\
-1 & \text{if } k = r \\
0 & \text{otherwise}
\end{cases}
\]

and variance:

\[
\mathrm{Var}(\hat{\delta}_{C,r}) =
\mathbf{w}_{C,r}^T \, \mathbf{V} \, \mathbf{w}_{C,r}
\]

This ensures internally consistent standard errors, confidence intervals, and p-values for combined treatment comparisons.

---

## C. Contrasts in Interaction Models

When the ANCOVA model includes interaction terms (e.g. `TRT01A * REGION`), LS means are estimated **within each interaction stratum**.

In this setting:

- Treatment-specific LS means \( \hat{\mu}_{k,s} \) are defined per stratum \( s \)
- Combined means are first formed *within stratum*
- Pooling across strata is then applied via user-defined weights

Formally, the combined LS mean marginal over strata is:

\[
\hat{\mu}_{C} = \sum_{s} \pi_s \, \Big( \sum_{k \in C} w_{k|s} \, \hat{\mu}_{k,s} \Big)
\]

where:

- \( w_{k|s} \) are within-stratum treatment weights
- \( \pi_s \) are stratum-level weights

The distinction between `weights_combo = "proportional"` and `"proportional_marginal"` determines whether \( w_{k|s} \) or \( \pi_s \) reflect observed sample sizes within or across strata.

This two-stage weighting makes explicit how estimands change in the presence of interactions and avoids implicit, undocumented averaging.
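
## D. Worked Sketch in Base R

The formulas in sections A and B can be checked by hand for the additive model without interaction. The sketch below is illustrative only and is not how `a_summarize_ancova_j()` is implemented: it uses base R, re-simulates data of the same shape as the example dataset (with `Placebo` set explicitly as the first factor level), and all object names (`dat`, `fit`, `L`, `mu_hat`, `w_active`, `w_diff`) are hypothetical. It assumes equal weights for the two active arms, `BASE` held at its mean, and `REGION` levels weighted equally.

```r
# Illustrative base-R sketch; object names are hypothetical, not junco internals.
set.seed(101)
n <- 300
arms <- c("Placebo", "Low Dose", "High Dose")
dat <- data.frame(
  TRT01A = factor(rep(arms, length.out = n), levels = arms),
  REGION = factor(sample(c("EU", "US"), n, replace = TRUE, prob = c(0.6, 0.4))),
  BASE = rnorm(n, 50, 10),
  CHG = rnorm(n, 0, 8)
)

# Fit the ANCOVA model once on the original arms
fit <- lm(CHG ~ TRT01A + BASE + REGION, data = dat)

# Each row of L maps the model coefficients to the LS mean of one arm:
# BASE at its mean, REGION levels averaged with equal weights.
L <- t(sapply(arms, function(a) {
  grid <- data.frame(
    TRT01A = factor(a, levels = arms),
    REGION = factor(levels(dat$REGION), levels = levels(dat$REGION)),
    BASE = mean(dat$BASE)
  )
  colMeans(model.matrix(delete.response(terms(fit)), grid))
}))

mu_hat <- drop(L %*% coef(fit))   # LS means, one per arm
V <- L %*% vcov(fit) %*% t(L)     # joint covariance matrix of the LS means

# Section A: combined LS mean with equal weights on the active arms
w_active <- c(0, 0.5, 0.5)        # order: Placebo, Low Dose, High Dose
mu_active <- sum(w_active * mu_hat)
var_active <- drop(t(w_active) %*% V %*% w_active)

# Section B: combined arm versus Placebo as a single contrast
w_diff <- w_active - c(1, 0, 0)   # entries: -1 for Placebo, w_k for active arms
delta <- sum(w_diff * mu_hat)
se_delta <- sqrt(drop(t(w_diff) %*% V %*% w_diff))
ci <- delta + c(-1, 1) * qt(0.975, df.residual(fit)) * se_delta

print(c(estimate = delta, se = se_delta, lower = ci[1], upper = ci[2]))
```

Because the model has no interaction term, the contrast reduces to the average of the two treatment coefficients, which gives a simple cross-check on `delta` and `se_delta`.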