--- title: "A Causal Framework for Hierarchical Outcome Analysis" output: rmarkdown::html_vignette bibliography: biblio.bib vignette: > %\VignetteIndexEntry{A Causal Framework for Hierarchical Outcome Analysis} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(causalWins) ``` ## Description Quantifying causal effects with complex, multivariate outcomes remains a challenge in treatment evaluation. For hierarchical outcomes, regulatory agencies such as the U.S. Food and Drug Administration recommend methods based on the Win Ratio (@pocock2012win) and Generalized Pairwise Comparisons (@buyse2010generalized). These approaches typically estimate (in this vignette using the WINS package by @WINS) a population-level quantity — the probability that a randomly selected treated patient fares better than a randomly selected control patient. In @even2025, the authors studied another quantity that can be very interesting in this individual context: the probability that a given individual would have a better outcome under treatment than under control. Unfortunately, this quantity is not identifiable, because for each individual we do not observe both the treated and control outcomes. @even2025 thus proposed a proxy for this quantity that is identifiable, and three ways to estimate it. This package implements that methodology and the three complementary estimators: (i) a nearest-neighbor matching estimator (nn_win_ratio), (ii) a distributional regression estimator (drf_win_ratio), and (iii) a semiparametric efficient estimator (eif_win_ratio). ## How to install The latest release of the package can be installed through CRAN (soon): ```{r, eval=FALSE} install.packages("causalWins") ``` ## Example 1 The goal of this example is to illustrate, in a simple setting, how to use the functions `nn_win_ratio`, `drf_win_ratio`, and `eif_win_ratio`. The estimators produced by these functions target the causal (non-identifiable) quantity $$ \tau_{\text{indiv}} := \mathbb{E}\Big[w\big(Y_i(1) \big| Y_i(0)\big)\Big], $$ where $Y_i(0)$ and $Y_i(1)$ denote the potential outcomes for individual $i$ under control and treatment, respectively. The function $w(\cdot| \cdot)$ compares these outcomes to determine whether a patient benefits more from treatment or from control. This measure is not identifiable. The authors of @buyse2010generalized introduce the following population quantity estimated via nearest-neighbor matching: \[ \tau_{\text{pop}} := \mathbb{E}\Big[w\big(Y_i(1)\,\big|\, Y_j(0)\big)\Big], \] where $i$ and $j$ are two different and independent individuals. In @even2025 the authors studied a proxy of the individual quantity, $$ \tau^\star := \mathbb{E}\Big[ w\big(Y^{(X_i)}(1), Y_i(0)\big) \Big], \quad \text{ where } \tau^\star(x) := \mathbb{E}\Big[ w\big(Y^{(x)}(1), Y_i(0)\big) \;\big|\; X_i = x \Big], $$ and where for $x \in X$, $\{Y^{(x)}(0), Y^{(x)}(1)\}$ denotes an independent copy of $\{Y_i(0), Y_i(1)\} \mid X_i = x$. For more details on how to use them, please refer to the function documentation `?nn_win_ratio`, `?drf_win_ratio` and `eif_win_ratio`. ```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE} # --------------------------------------------------------- # Generate data with mixed covariates # --------------------------------------------------------- set.seed(123) n <- 100 base <- data.frame( X1 = rnorm(n, mean = 5, sd = 2), # numeric covariate 1 X2 = rnorm(n, mean = 10, sd = 3), # numeric covariate 2 X3 = factor(sample(c("A", "B", "C"), n, replace = TRUE)), # factor covariate Y1 = rnorm(n, mean = 50, sd = 10), # numeric outcome 1 Y2 = rnorm(n, mean = 100, sd = 20), # numeric outcome 2 arm = sample(c(0, 1), n, replace = TRUE) # binary treatment arm ) # --------------------------------------------------------- # Compute win ratio with nearest neighbor pairing # --------------------------------------------------------- res_nn <- nn_win_ratio( base = base, treatment_name = "arm", outcome_names = c("Y1", "Y2") ) # --------------------------------------------------------- # Compute win ratio with Distributional Random Forests # --------------------------------------------------------- res_drf <- drf_win_ratio( base = base, treatment_name = "arm", outcome_names = c("Y1", "Y2") ) # --------------------------------------------------------- # Compute win ratio with Efficient Influence Functions # --------------------------------------------------------- res_eif <- eif_win_ratio( base = base, treatment_name = "arm", outcome_names = c("Y1", "Y2") ) ``` ## Example 2 Drawing on Example 2 from @even2025, the following simulation illustrates the potential discrepancy between the historical Win Ratio and the individual-level estimator proposed in that work. The historical Win Ratio estimated here using the `WINS` package (@WINS), which is based on complete pairing, differs from the estimates obtained via the individual-level approach and the `causalWins` package, which employs nearest-neighbor (NN) matching. ```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE} library(WINS) # In this example, there are two subpopulations: A and B. The potential outcomes are y_0a <- 4 # potential outcome of individuals of subpopulation A without treatment y_1b <- 3 # potential outcome of individuals of subpopulation B under treatment y_0b <- 2 # potential outcome of individuals of subpopulation B without treatment y_1a <- 1 # potential outcome of individuals of subpopulation A under treatment alpha <- 1 / 3 # proportion of women n_rounds <- 100 n <- 300*1:5 # number of individuals 300, 600, 900, 1200, 1500 rct_prob <- .5 # probability of assignment to treatment/control group results_nn <- data.frame(row_id = 1:n_rounds) results_comp <- data.frame(row_id = 1:n_rounds) for (n_individuals in n) { current_result_nn <- c() current_result_comp <- c() for (round in 1:n_rounds) { set.seed(123 + n_individuals + round) # --------------------------------------------------- # Generate Data # --------------------------------------------------- X <- rbinom(n_individuals, 1, alpha) # Sample element in A with prob `alpha` treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob` Y <- X * treatment * y_1a + X * (1 - treatment) * y_0a + (1 - X) * treatment * y_1b + (1 - X) * (1 - treatment) * y_0b # Outcomes base <- data.frame(X = X, Y = Y, arm = treatment) # --------------------------------------------------- # Compute Win Proportion for Nearest Neighbor Pairing # --------------------------------------------------- res_nn <- nn_win_ratio( base = base, treatment_name = "arm", outcome_names = "Y" ) res_nn <- res_nn$win_proportion$value current_result_nn <- c(current_result_nn, res_nn) # --------------------------------------------------- # Compute Win Proportion for Complete Pairing # --------------------------------------------------- data <- base colnames(data)[colnames(data) == "Y"] <- "Y_1" # Format for use with the WINS Package data$id <- 1:n_individuals # Format for use with the WINS Package res_comp <- WINS::win.stat( data = data, ep_type = "continuous", arm.name = c(1, 0), priority = c(1), summary.print = FALSE ) res_comp <- res_comp$Win_prop[[2]] current_result_comp <- c(current_result_comp, res_comp) } results_nn[[as.character(n_individuals)]] <- current_result_nn results_comp[[as.character(n_individuals)]] <- current_result_comp } ``` ### Boxplots We now present boxplots of the simulated win proportions calculated using the `causalWins` and `WINS` packages. The results indicate that, under substantial heterogeneity, the two methods are not equivalent and can result in different treatment recommendations. ```{r myplot, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE} results_nn$row_id <- NULL results_comp$row_id <- NULL # # Interleave columns for side-by-side plotting combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE) # Set names for x-axis names(combined) <- rep(names(results_nn), each = 2) # Plot boxplot(combined, col = rep(c("skyblue", "orange"), ncol(results_nn)), las = 2, main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings", ylab = "Win proportion", xlab = "Number of patients (n)" ) legend("bottomleft", legend = c("NN", "Complete"), fill = c("skyblue", "orange") ) abline(h = 0.5, lty = 2, col = "red") ``` ```{r, echo=FALSE, out.width="70%"} knitr::include_graphics("figures/Example2.png") ``` ## Example 3 In contrast with the previous example, the following simulation demonstrates that in an RCT with a more homogeneous population, the complete pairing and nearest neighbor approaches yield broadly similar results. ```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE} library(MASS) library(WINS) n_rounds <- 100 n <- 300* 1:5 # number of individuals 300, 600, 900, 1200, 1500 mu <- c(0, 0) # Covariates mean sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2) # Covariates covariance tau_cont <- 2 # ATE of continuous outcome tau_bin <- 0.8 # ATE on logit scale of binary outcome rct_prob <- .5 # probability of assignment to treatment/control group results_nn <- data.frame(row_id = 1:n_rounds) results_comp <- data.frame(row_id = 1:n_rounds) for (n_individuals in n) { current_result_nn <- c() current_result_comp <- c() for (round in 1:n_rounds) { set.seed(123 + n_individuals + round) # --------------------------------------------------- # Generate Covariates and Treatment # --------------------------------------------------- covariates <- as.data.frame(mvrnorm(n_individuals, mu = mu, Sigma = sigma)) # covariates colnames(covariates) <- c("X1", "X2") treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob` # --------------------------------------------------- # Generate Binary Outcomes # --------------------------------------------------- # Potential Outcomes lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2 lin_pred1 <- lin_pred0 + tau_bin # Sample binary potential outcomes prob0 <- 1 / (1 + exp(-lin_pred0)) prob1 <- 1 / (1 + exp(-lin_pred1)) Y0_bin <- rbinom(n_individuals, size = 1, prob = prob0) Y1_bin <- rbinom(n_individuals, size = 1, prob = prob1) # Observed binary outcome Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment # --------------------------------------------------- # Generate Continuous Outcomes # --------------------------------------------------- # Potential Outcomes Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n_individuals, 0, 1) Y1_cont <- Y0_cont + tau_cont # Observed continuous outcome Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment # --------------------------------------------------- # Compute Win Proportion for Nearest Neighbor Pairing # --------------------------------------------------- base <- data.frame(covariates, arm = treatment, Y_1, Y_2) res_nn <- nn_win_ratio( base = base, treatment_name = "arm", outcome_names = c("Y_1", "Y_2") ) res_nn <- res_nn$win_proportion$value current_result_nn <- c(current_result_nn, res_nn) # --------------------------------------------------- # Compute Win Proportion for Complete Pairing # --------------------------------------------------- data <- base data$id <- 1:n_individuals # Format for use with the WINS Package res_comp <- WINS::win.stat( data = data, ep_type = c("binary", "continuous"), arm.name = c(1, 0), priority = c(1, 2), summary.print = FALSE ) res_comp <- res_comp$Win_prop[[2]] current_result_comp <- c(current_result_comp, res_comp) } results_nn[[as.character(n_individuals)]] <- current_result_nn results_comp[[as.character(n_individuals)]] <- current_result_comp } ``` ### Boxplots Unlike the previous plot, we now observe that both approaches yield similar treatment recommendations. ```{r myplot2, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE} results_nn$row_id <- NULL results_comp$row_id <- NULL # # Interleave columns for side-by-side plotting combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE) # Set names for x-axis names(combined) <- rep(names(results_nn), each = 2) # Plot boxplot(combined, col = rep(c("skyblue", "orange"), ncol(results_nn)), las = 2, main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings", ylab = "Win proportion", xlab = "Number of patients (n)" ) legend("bottomleft", legend = c("NN", "Complete"), fill = c("skyblue", "orange") ) abline(h = 0.5, lty = 2, col = "red") ``` ```{r, echo=FALSE, out.width="70%"} knitr::include_graphics("figures/Example3.png") ``` ## Example 4 This simulation illustrates the application of the Efficient Influence Function (EIF) method. The data-generating process is similar to the previous example: covariates are produced in the same way. However, the treatment assignment is no longer randomized. Instead, it depends on the covariates, meaning the setting corresponds to an observational study. ```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE} library(doParallel) library(foreach) n_rounds <- 100 n <- 10000 mu <- c(0, 0) sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2) tau_cont <- 2 tau_bin <- 0.8 # Set up cluster n_cores <- parallel::detectCores() - 1 # leave 1 core free cl <- makeCluster(n_cores) registerDoParallel(cl) results_eif <- foreach( round = 1:n_rounds, .combine = c, .packages = c("MASS", "causalWins") ) %dopar% { set.seed(123 + n + round) # ----------------------------------------------------------- # Generate Covariates and Treatment (observational) # ----------------------------------------------------------- covariates <- mvrnorm(n, mu = mu, Sigma = sigma) # covariates colnames(covariates) <- c("X1", "X2") covariates <- as.data.frame(covariates) p_treatment <- 1 / (1 + exp(-(0.5 * covariates$X1 - 0.3 * covariates$X2))) # non-rct treatment <- rbinom(n, 1, p_treatment) # ----------------------------------------------------------- # Generate Binary Outcomes # ----------------------------------------------------------- # Potential Outcomes lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2 lin_pred1 <- lin_pred0 + tau_bin # Sample binary potential outcomes prob0 <- 1 / (1 + exp(-lin_pred0)) prob1 <- 1 / (1 + exp(-lin_pred1)) Y0_bin <- rbinom(n, size = 1, prob = prob0) Y1_bin <- rbinom(n, size = 1, prob = prob1) # Observed binary outcome Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment # ----------------------------------------------------------- # Generate Continuous Outcomes # ----------------------------------------------------------- # Potential Outcomes Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n, 0, 1) Y1_cont <- Y0_cont + tau_cont # Observed continuous outcome Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment # ----------------------------------------------------------- # Compute Win Proportion for drf and eif # ----------------------------------------------------------- base <- data.frame(covariates, arm = treatment, Y_1, Y_2) res_eif <- eif_win_ratio( base = base, treatment_name = "arm", outcome_names = c("Y_1", "Y_2"), propensity_model = "glm" ) res_eif <- res_eif$win_proportion$value res_eif } stopCluster(cl) ``` ### Approximation of the true causal measure In what follows, we construct an approximation of the causal measure $\tau^{\star}$ defined in @even2025 and which is revisited in Example 1 for completeness. ```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE} library(MASS) mu <- c(0, 0) # Covariates mean sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2) tau_cont <- 2 # ATE of continuous outcome tau_bin <- 0.8 # ATE on logit scale of binary outcome # Default win function win_function <- function(y, z) { if (all(y == z)) { return(0) # Tie } result <- (y > z)[which.max(y != z)] * 1 return(result) } mean_given <- function(x) { n_samples <- 10000 # Binary potential outcomes lin_pred0 <- -1 + 0.3 * x[1] - 0.5 * x[2] lin_pred1 <- lin_pred0 + tau_bin prob0 <- 1 / (1 + exp(-lin_pred0)) prob1 <- 1 / (1 + exp(-lin_pred1)) Y0_bin <- rbinom(n_samples, size = 1, prob = prob0) Y1_bin <- rbinom(n_samples, size = 1, prob = prob1) # Continuous Potential Outcomes Y0_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1) Y1_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1) + tau_cont wins_given_x <- mapply(function(yb, yc, zb, zc) { win_function(c(yb, yc), c(zb, zc)) }, Y1_bin, Y1_cont, Y0_bin, Y0_cont) return(mean(wins_given_x)) } set.seed(123) covariates <- as.data.frame(mvrnorm(10000, mu = mu, Sigma = sigma)) tau_star <- mean(apply(covariates, 1, function(row) mean_given(row))) ``` ### Plot The plot displays the simulation boxplots alongside the target causal measure. ```{r myplot3, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE} boxplot(results_eif, main = "Win proportion with EIF", xlab = "Number of patients 10000", ylab = "Win proportion" ) abline(h = tau_star, lty = 2, col = "red") ``` ```{r, echo=FALSE, out.width="70%"} knitr::include_graphics("figures/EIF_resPlot.png") ``` ## Computation of the Confidence Interval for Win Ratio Let $w$ and $l$ denote the win and loss vectors, respectively, as computed by any of the three methods. The win ratio is defined as $\bar{w} / \bar{l}$, where \[ \bar{w} = \frac{1}{n} \sum_{i=1}^{n} w_i \quad \text{and} \quad \bar{l} = \frac{1}{n} \sum_{i=1}^{n} l_i, \] with $n$ denoting the number of individuals. Let $s_w^2$, $s_l^2$, and $s^2_{wl}$ denote the sample variances of $w$, $l$ and the sample covariance of $(w,l)$, i.e., \begin{equation} \begin{aligned} s^2_w &= \frac{1}{n-1}\sum_i (w_i-\bar w)^2, \quad s^2_l = \frac{1}{n-1}\sum_i (l_i-\bar l)^2\quad \text{ and} \\ s^2_{wl} &= \frac{1}{n-1}\sum_i (w_i-\bar w)(l_i-\bar l) \end{aligned} \end{equation} To obtain the confidence interval for the win ratio we first compute \begin{equation} s_\text{log} = \sqrt{\frac{s^2_w}{n\bar w^2}+\frac{s^2_l}{n\bar l^2}-2\frac{s^2_{wl}}{n \bar w \bar l}} \end{equation} which can be seen to be an approximation of the variance of $\log(\bar w/ \bar l)=\log(\bar w) -\log(\bar l)$ using the delta method. The confidence interval is then computed by \begin{equation} \Big[\exp\big(\log(\bar w/ \bar l)-1.96 s_{\text{log}})\big), \exp\big(\log(\bar w \bar l)+1.96 s_{\text{log}})\big)\Big]; \end{equation} which can be seen as the exponential of the confidence interval for the $\log(\bar w/\bar l)$. ## References