---
title: "A Causal
Framework for Hierarchical Outcome Analysis"
output: rmarkdown::html_vignette
bibliography: biblio.bib
vignette: >
  %\VignetteIndexEntry{A Causal Framework for Hierarchical Outcome Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(causalWins)
```

## Description

Quantifying causal effects with complex, multivariate outcomes remains a challenge in treatment evaluation. For hierarchical outcomes, regulatory agencies such as the U.S. Food and Drug Administration recommend methods based on the Win Ratio (@pocock2012win) and Generalized Pairwise Comparisons (@buyse2010generalized). These approaches typically estimate (in this vignette using the WINS package by @WINS) a population-level quantity — the probability that a randomly selected treated patient fares better than a randomly selected control patient. In @even2025, the authors studied another quantity that can be very interesting in this individual context: the probability that a given individual would have a better outcome under treatment than under control. Unfortunately, this quantity is not identifiable, because for each individual we do not observe both the treated and control outcomes. @even2025 thus proposed a proxy for this quantity that is identifiable, and three ways to estimate it. This package implements that methodology and the three complementary estimators: (i) a nearest-neighbor matching estimator (nn_win_ratio), (ii) a distributional regression estimator (drf_win_ratio), and (iii) a semiparametric efficient estimator (eif_win_ratio).

## How to install

The latest release of the package can be installed through CRAN (soon):

```{r, eval=FALSE}
install.packages("causalWins")
```

## Example 1

The goal of this example is to illustrate, in a simple setting, how to use the functions `nn_win_ratio`, `drf_win_ratio`, and `eif_win_ratio`. The estimators produced by these functions target the causal (non-identifiable) quantity

$$
\tau_{\text{indiv}} := \mathbb{E}\Big[w\big(Y_i(1) \big| Y_i(0)\big)\Big],
$$

where $Y_i(0)$ and $Y_i(1)$ denote the potential outcomes for individual $i$ under control and treatment, respectively. The function $w(\cdot| \cdot)$ compares these outcomes to determine whether a patient benefits more from treatment or from control. This measure is not identifiable.  The authors of @buyse2010generalized introduce the following population quantity estimated via nearest-neighbor matching:
\[
\tau_{\text{pop}} := \mathbb{E}\Big[w\big(Y_i(1)\,\big|\, Y_j(0)\big)\Big],
\]
where $i$ and $j$ are two different and independent individuals. In @even2025 the authors studied a proxy of the individual quantity,

$$
\tau^\star := \mathbb{E}\Big[ w\big(Y^{(X_i)}(1), Y_i(0)\big) \Big], \quad \text{ where } \tau^\star(x) := \mathbb{E}\Big[ w\big(Y^{(x)}(1), Y_i(0)\big) \;\big|\; X_i = x \Big],
$$
and where for $x \in X$, $\{Y^{(x)}(0), Y^{(x)}(1)\}$ denotes an independent copy of
$\{Y_i(0), Y_i(1)\} \mid X_i = x$.


For more details on how to use them, please refer to the function documentation `?nn_win_ratio`, `?drf_win_ratio` and `eif_win_ratio`.

```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE}
# ---------------------------------------------------------
    # Generate data with mixed covariates
# ---------------------------------------------------------
set.seed(123)
n <- 100
base <- data.frame(
  X1 = rnorm(n, mean = 5, sd = 2), # numeric covariate 1
  X2 = rnorm(n, mean = 10, sd = 3), # numeric covariate 2
  X3 = factor(sample(c("A", "B", "C"), n, replace = TRUE)), # factor covariate
  Y1 = rnorm(n, mean = 50, sd = 10), # numeric outcome 1
  Y2 = rnorm(n, mean = 100, sd = 20), # numeric outcome 2
  arm = sample(c(0, 1), n, replace = TRUE) # binary treatment arm
)
# ---------------------------------------------------------
    # Compute win ratio with nearest neighbor pairing
# ---------------------------------------------------------
res_nn <- nn_win_ratio(
  base = base,
  treatment_name = "arm",
  outcome_names = c("Y1", "Y2")
)
# ---------------------------------------------------------
    # Compute win ratio with Distributional Random Forests
# ---------------------------------------------------------
res_drf <- drf_win_ratio(
  base = base,
  treatment_name = "arm",
  outcome_names = c("Y1", "Y2")
)
# ---------------------------------------------------------
    # Compute win ratio with Efficient Influence Functions
# ---------------------------------------------------------
res_eif <- eif_win_ratio(
  base = base,
  treatment_name = "arm",
  outcome_names = c("Y1", "Y2")
)
```

## Example 2  

Drawing on Example 2 from @even2025, the following simulation illustrates the potential discrepancy between the historical Win Ratio and the individual-level estimator proposed in that work. The historical Win Ratio estimated here using the `WINS` package (@WINS), which is based on complete pairing, differs from the estimates obtained via the individual-level approach and the `causalWins` package, which employs nearest-neighbor (NN) matching.

```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE}
library(WINS)

# In this example, there are two subpopulations: A and B. The potential outcomes are

y_0a <- 4 # potential outcome of individuals of subpopulation A without treatment
y_1b <- 3 # potential outcome of individuals of subpopulation B under treatment
y_0b <- 2 # potential outcome of individuals of subpopulation B without treatment
y_1a <- 1 # potential outcome of individuals of subpopulation A under treatment

alpha <- 1 / 3 # proportion of women

n_rounds <- 100
n <- 300*1:5  # number of individuals 300, 600, 900, 1200, 1500

rct_prob <- .5 # probability of assignment to treatment/control group

results_nn <- data.frame(row_id = 1:n_rounds)
results_comp <- data.frame(row_id = 1:n_rounds)

for (n_individuals in n) {
  current_result_nn <- c()
  current_result_comp <- c()

  for (round in 1:n_rounds) {
    set.seed(123 + n_individuals + round)

    # ---------------------------------------------------
    # Generate Data
    # ---------------------------------------------------

    X <- rbinom(n_individuals, 1, alpha) # Sample element in A with prob `alpha`
    treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob`
    Y <- X * treatment * y_1a + X * (1 - treatment) * y_0a +
      (1 - X) * treatment * y_1b + (1 - X) * (1 - treatment) * y_0b # Outcomes
    base <- data.frame(X = X, Y = Y, arm = treatment)

    # ---------------------------------------------------
    # Compute Win Proportion for Nearest Neighbor Pairing
    # ---------------------------------------------------
    res_nn <- nn_win_ratio(
      base = base,
      treatment_name = "arm",
      outcome_names = "Y"
    )
    res_nn <- res_nn$win_proportion$value
    current_result_nn <- c(current_result_nn, res_nn)

    # ---------------------------------------------------
    # Compute Win Proportion for Complete Pairing
    # ---------------------------------------------------
    data <- base

    colnames(data)[colnames(data) == "Y"] <- "Y_1" # Format for use with the WINS Package
    data$id <- 1:n_individuals # Format for use with the WINS Package
    res_comp <- WINS::win.stat(
      data = data,
      ep_type = "continuous",
      arm.name = c(1, 0),
      priority = c(1),
      summary.print = FALSE
    )
    res_comp <- res_comp$Win_prop[[2]]
    current_result_comp <- c(current_result_comp, res_comp)
  }
  results_nn[[as.character(n_individuals)]] <- current_result_nn
  results_comp[[as.character(n_individuals)]] <- current_result_comp
}
```

### Boxplots

We now present boxplots of the simulated win proportions calculated using the `causalWins` and `WINS` packages. The results indicate that, under substantial heterogeneity, the two methods are not equivalent and can result in different treatment recommendations.

```{r myplot, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE}
results_nn$row_id <- NULL
results_comp$row_id <- NULL

# # Interleave columns for side-by-side plotting
combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE)

# Set names for x-axis
names(combined) <- rep(names(results_nn), each = 2)

# Plot
boxplot(combined,
  col = rep(c("skyblue", "orange"), ncol(results_nn)),
  las = 2,
  main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings",
  ylab = "Win proportion",
  xlab = "Number of patients (n)"
)

legend("bottomleft",
  legend = c("NN", "Complete"),
  fill = c("skyblue", "orange")
)

abline(h = 0.5, lty = 2, col = "red")
```

```{r, echo=FALSE, out.width="70%"}
knitr::include_graphics("figures/Example2.png")
```

## Example 3
In contrast with the previous example, the following simulation demonstrates that in an RCT with a more homogeneous population, the complete pairing and nearest neighbor approaches yield broadly similar results.

```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE}
library(MASS)
library(WINS)

n_rounds <- 100
n <- 300* 1:5 # number of individuals 300, 600, 900, 1200, 1500

mu <- c(0, 0) # Covariates mean
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2) # Covariates covariance

tau_cont <- 2 # ATE of continuous outcome
tau_bin <- 0.8 # ATE on logit scale of binary outcome

rct_prob <- .5 # probability of assignment to treatment/control group

results_nn <- data.frame(row_id = 1:n_rounds)
results_comp <- data.frame(row_id = 1:n_rounds)

for (n_individuals in n) {
  current_result_nn <- c()
  current_result_comp <- c()

  for (round in 1:n_rounds) {
    set.seed(123 + n_individuals + round)

    # ---------------------------------------------------
    # Generate Covariates and Treatment
    # ---------------------------------------------------

    covariates <- as.data.frame(mvrnorm(n_individuals, mu = mu, Sigma = sigma)) # covariates
    colnames(covariates) <- c("X1", "X2")
    treatment <- rbinom(n_individuals, 1, rct_prob) # Assign to treatment with prob `rct_prob`

    # ---------------------------------------------------
    # Generate Binary Outcomes
    # ---------------------------------------------------
    # Potential Outcomes
    lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2
    lin_pred1 <- lin_pred0 + tau_bin

    # Sample binary potential outcomes
    prob0 <- 1 / (1 + exp(-lin_pred0))
    prob1 <- 1 / (1 + exp(-lin_pred1))
    Y0_bin <- rbinom(n_individuals, size = 1, prob = prob0)
    Y1_bin <- rbinom(n_individuals, size = 1, prob = prob1)

    # Observed binary outcome
    Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment

    # ---------------------------------------------------
    # Generate Continuous Outcomes
    # ---------------------------------------------------
    # Potential Outcomes
    Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n_individuals, 0, 1)
    Y1_cont <- Y0_cont + tau_cont

    # Observed continuous outcome
    Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment

    # ---------------------------------------------------
    # Compute Win Proportion for Nearest Neighbor Pairing
    # ---------------------------------------------------
    base <- data.frame(covariates, arm = treatment, Y_1, Y_2)

    res_nn <- nn_win_ratio(
      base = base,
      treatment_name = "arm",
      outcome_names = c("Y_1", "Y_2")
    )
    res_nn <- res_nn$win_proportion$value
    current_result_nn <- c(current_result_nn, res_nn)
    # ---------------------------------------------------
    # Compute Win Proportion for Complete Pairing
    # ---------------------------------------------------
    data <- base

    data$id <- 1:n_individuals # Format for use with the WINS Package
    res_comp <- WINS::win.stat(
      data = data,
      ep_type = c("binary", "continuous"),
      arm.name = c(1, 0),
      priority = c(1, 2),
      summary.print = FALSE
    )
    res_comp <- res_comp$Win_prop[[2]]
    current_result_comp <- c(current_result_comp, res_comp)
  }
  results_nn[[as.character(n_individuals)]] <- current_result_nn
  results_comp[[as.character(n_individuals)]] <- current_result_comp
}
```

### Boxplots

Unlike the previous plot, we now observe that both approaches yield similar treatment recommendations.

```{r myplot2, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE}
results_nn$row_id <- NULL
results_comp$row_id <- NULL

# # Interleave columns for side-by-side plotting
combined <- unlist(Map(list, results_nn, results_comp), recursive = FALSE)

# Set names for x-axis
names(combined) <- rep(names(results_nn), each = 2)

# Plot
boxplot(combined,
  col = rep(c("skyblue", "orange"), ncol(results_nn)),
  las = 2,
  main = "Comparison of the win proportions computed \nwith Nearest Neighbors and Complete Pairings",
  ylab = "Win proportion",
  xlab = "Number of patients (n)"
)

legend("bottomleft",
  legend = c("NN", "Complete"),
  fill = c("skyblue", "orange")
)

abline(h = 0.5, lty = 2, col = "red")
```

```{r, echo=FALSE, out.width="70%"}
knitr::include_graphics("figures/Example3.png")
```

## Example 4
This simulation illustrates the application of the Efficient Influence Function (EIF) method. The data-generating process is similar to the previous example: covariates are produced in the same way. However, the treatment assignment is no longer randomized. Instead, it depends on the covariates, meaning the setting corresponds to an observational study. 

```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE}
library(doParallel)
library(foreach)

n_rounds <- 100
n <- 10000
mu <- c(0, 0)
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
tau_cont <- 2
tau_bin <- 0.8

# Set up cluster
n_cores <- parallel::detectCores() - 1  # leave 1 core free
cl <- makeCluster(n_cores)
registerDoParallel(cl)

results_eif <- foreach(
  round = 1:n_rounds,
  .combine = c,
  .packages = c("MASS", "causalWins")
) %dopar% {
  set.seed(123 + n + round)

  # -----------------------------------------------------------
  # Generate Covariates and Treatment (observational)
  # -----------------------------------------------------------

  covariates <- mvrnorm(n, mu = mu, Sigma = sigma) # covariates
  colnames(covariates) <- c("X1", "X2")
  covariates <- as.data.frame(covariates)

  p_treatment <- 1 / (1 + exp(-(0.5 * covariates$X1 - 0.3 * covariates$X2))) # non-rct
  treatment <- rbinom(n, 1, p_treatment)

  # -----------------------------------------------------------
  # Generate Binary Outcomes
  # -----------------------------------------------------------
  # Potential Outcomes
  lin_pred0 <- -1 + 0.3 * covariates$X1 - 0.5 * covariates$X2
  lin_pred1 <- lin_pred0 + tau_bin

  # Sample binary potential outcomes
  prob0 <- 1 / (1 + exp(-lin_pred0))
  prob1 <- 1 / (1 + exp(-lin_pred1))
  Y0_bin <- rbinom(n, size = 1, prob = prob0)
  Y1_bin <- rbinom(n, size = 1, prob = prob1)

  # Observed binary outcome
  Y_1 <- Y0_bin * (1 - treatment) + Y1_bin * treatment

  # -----------------------------------------------------------
  # Generate Continuous Outcomes
  # -----------------------------------------------------------
  # Potential Outcomes
  Y0_cont <- 1 + 0.5 * covariates$X1 - 0.7 * covariates$X2 + rnorm(n, 0, 1)
  Y1_cont <- Y0_cont + tau_cont

  # Observed continuous outcome
  Y_2 <- Y0_cont * (1 - treatment) + Y1_cont * treatment

  # -----------------------------------------------------------
  # Compute Win Proportion for drf and eif
  # -----------------------------------------------------------
  base <- data.frame(covariates, arm = treatment, Y_1, Y_2)

  res_eif <- eif_win_ratio(
    base = base,
    treatment_name = "arm",
    outcome_names = c("Y_1", "Y_2"),
    propensity_model = "glm"
  )

  res_eif <- res_eif$win_proportion$value

  res_eif
}
stopCluster(cl)
```


### Approximation of the true causal measure

In what follows, we construct an approximation of the causal measure $\tau^{\star}$ defined in @even2025 and which is revisited in Example 1 for completeness.

```{r, message=FALSE, warning=FALSE, results='hide', cache=TRUE, eval = FALSE}
library(MASS)
mu <- c(0, 0) # Covariates mean
sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
tau_cont <- 2 # ATE of continuous outcome
tau_bin <- 0.8 # ATE on logit scale of binary outcome

# Default win function
win_function <- function(y, z) {
  if (all(y == z)) {
    return(0) # Tie
  }
  result <- (y > z)[which.max(y != z)] * 1
  return(result)
}

mean_given <- function(x) {
  n_samples <- 10000
  # Binary potential outcomes
  lin_pred0 <- -1 + 0.3 * x[1] - 0.5 * x[2]
  lin_pred1 <- lin_pred0 + tau_bin
  prob0 <- 1 / (1 + exp(-lin_pred0))
  prob1 <- 1 / (1 + exp(-lin_pred1))
  Y0_bin <- rbinom(n_samples, size = 1, prob = prob0)
  Y1_bin <- rbinom(n_samples, size = 1, prob = prob1)

  # Continuous Potential Outcomes
  Y0_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1)
  Y1_cont <- 1 + 0.5 * x[1] - 0.7 * x[2] + rnorm(n_samples, 0, 1) + tau_cont

  wins_given_x <- mapply(function(yb, yc, zb, zc) {
    win_function(c(yb, yc), c(zb, zc))
  }, Y1_bin, Y1_cont, Y0_bin, Y0_cont)

  return(mean(wins_given_x))
}

set.seed(123)

covariates <- as.data.frame(mvrnorm(10000, mu = mu, Sigma = sigma))

tau_star <- mean(apply(covariates, 1, function(row) mean_given(row)))

```

### Plot

The plot displays the simulation boxplots alongside the target causal measure.

```{r myplot3, fig.width=6, fig.height=4, dev='png', dpi=96, eval = FALSE}
boxplot(results_eif,
  main = "Win proportion with EIF",
  xlab = "Number of patients 10000",
  ylab = "Win proportion"
)
abline(h = tau_star, lty = 2, col = "red")
```

```{r, echo=FALSE, out.width="70%"}
knitr::include_graphics("figures/EIF_resPlot.png")
```

## Computation of the Confidence Interval for Win Ratio
Let $w$ and $l$ denote the win and loss vectors, respectively, as computed by any of the three methods. The win ratio is defined as $\bar{w} / \bar{l}$, where
\[
\bar{w} = \frac{1}{n} \sum_{i=1}^{n} w_i
\quad \text{and} \quad
\bar{l} = \frac{1}{n} \sum_{i=1}^{n} l_i,
\]
with $n$ denoting the number of individuals. Let $s_w^2$, $s_l^2$, and $s^2_{wl}$ denote the sample variances of $w$, $l$ and the sample covariance of $(w,l)$, i.e.,
\begin{equation}
\begin{aligned}
s^2_w &= \frac{1}{n-1}\sum_i (w_i-\bar w)^2, \quad 
s^2_l = \frac{1}{n-1}\sum_i (l_i-\bar l)^2\quad \text{ and} \\
s^2_{wl} &= \frac{1}{n-1}\sum_i (w_i-\bar w)(l_i-\bar l)
\end{aligned}
\end{equation}
To obtain the confidence interval for the win ratio we first compute 
\begin{equation}
s_\text{log} = \sqrt{\frac{s^2_w}{n\bar w^2}+\frac{s^2_l}{n\bar l^2}-2\frac{s^2_{wl}}{n \bar w \bar l}}
\end{equation}
which can be seen to be an approximation of the variance of $\log(\bar w/ \bar l)=\log(\bar w) -\log(\bar l)$ using the delta method. The confidence interval is then computed by
\begin{equation}
\Big[\exp\big(\log(\bar w/ \bar l)-1.96 s_{\text{log}})\big), \exp\big(\log(\bar w \bar l)+1.96 s_{\text{log}})\big)\Big];
\end{equation}
which can be seen as the exponential of the confidence interval for the $\log(\bar w/\bar l)$.

## References