---
title: "STC and Naive Benchmarks"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{STC and Naive Benchmarks}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE, purl = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE,
  message = FALSE,
  warning = FALSE
)
```

## Overview

In addition to ML-UMR, mlumr provides two frequentist methods:

- **STC** (Simulated Treatment Comparison): adjusts for cross-trial
  covariate differences using parametric G-computation
- **Naive**: unadjusted comparison of crude outcome summaries

These serve as important benchmarks alongside the Bayesian ML-UMR approach.

## Naive estimate

For binary outcomes, the naive method compares crude event rates without any
covariate adjustment:

$$
\text{LOR}_{\text{naive}} = \text{logit}(\hat{p}_A) - \text{logit}(\hat{p}_B)
$$

where $\hat{p}_A = \bar{Y}_{\text{IPD}}$ and $\hat{p}_B = r_B / n_B$ from
the AgD. The standard error is computed via the delta method:

$$
\text{SE} = \sqrt{\frac{1}{n_A \hat{p}_A(1-\hat{p}_A)} + \frac{1}{n_B \hat{p}_B(1-\hat{p}_B)}}
$$

### Usage

The chunks below operate on a small toy `dat`. Run this setup once first
so `naive()`, `stc()`, and the comparison block all have data to work
with.

```{r, eval = TRUE, purl = FALSE}
library(mlumr)
set.seed(2026)

trial_a_data <- data.frame(
  trt      = "Drug_A",
  response = rbinom(300, 1, 0.55),
  age_cat  = rbinom(300, 1, 0.40),
  sex      = rbinom(300, 1, 0.55)
)
trial_b_data <- data.frame(
  trt           = "Drug_B",
  n_total       = 400,
  n_events      = 160,
  age_cat_mean  = 0.35,
  sex_mean      = 0.50
)

ipd <- set_ipd(trial_a_data, treatment = "trt", outcome = "response",
               covariates = c("age_cat", "sex"))
agd <- set_agd(trial_b_data, treatment = "trt",
               outcome_n = "n_total", outcome_r = "n_events",
               cov_means = c("age_cat_mean", "sex_mean"),
               cov_types = c("binary", "binary"))
```

```{r, eval = TRUE, purl = FALSE}
# Prepare data (no integration points needed)
dat <- combine_data(ipd, agd)

result <- naive(dat)
print(result)
```

### Interpreting the naive estimate

The naive LOR is **biased** when covariate distributions differ between
the IPD and AgD populations. It provides a useful reference point:

- If the naive and adjusted estimates agree, covariate imbalance has little
  impact.
- If they disagree substantially, adjustment is important and the naive
  estimate should not be used for inference.

### Extreme rates

The function includes continuity-correction-style boundary guards for
extreme proportions (0% or 100% event rates), ensuring finite log odds
ratios even at the boundaries. Event-probability confidence intervals use
Wald standard errors and are bounded to `[0, 1]`.

## Simulated Treatment Comparison (STC)

STC uses parametric G-computation to adjust for covariate differences:

1. **Fit an outcome model** on IPD:
   $\text{logit}(P(Y_i = 1)) = \alpha + \mathbf{x}_i^T \boldsymbol{\beta}$
2. **Predict** counterfactual outcomes for the comparator population
3. **Marginalize**: $\hat{p}_{A|B} = \frac{1}{N} \sum_j \hat{P}(Y = 1 | \mathbf{x}_j)$
4. **Compute LOR**: $\text{LOR} = \text{logit}(\hat{p}_{A|B}) - \text{logit}(\hat{p}_B)$

The standard error uses the delta method, propagating parameter
uncertainty through the logit-of-mean transformation.

### Usage

```{r, eval = TRUE, purl = FALSE}
# Without integration points (uses AgD means)
result_stc <- stc(dat)

# With integration points (better marginalization)
dat <- add_integration(dat, n_int = 64,
                       age_cat = distr(qbern, prob = age_cat_mean),
                       sex = distr(qbern, prob = sex_mean))
result_stc <- stc(dat)

print(result_stc)
```

### With vs without integration points

- **Without integration points**: STC predicts only at the AgD covariate
  **means** (a single point). This can introduce bias for non-linear
  models (Jensen's inequality).
- **With integration points**: STC marginalizes over the full covariate
  distribution, giving a more accurate estimate. We recommend always using
  integration points.

For binary outcomes, event-probability confidence intervals are bounded to
`[0, 1]`. For Poisson outcomes, STC uses a 0.5 continuity correction for the
comparator log rate when the AgD event count is zero; the reported comparator
rate remains the observed rate.

### Accessing the GLM fit

The underlying logistic regression model is stored in the result:

```{r, eval = TRUE, purl = FALSE}
# Coefficients
coef(result_stc$glm_fit)

# Full GLM summary
summary(result_stc$glm_fit)

# Predicted probabilities
fitted(result_stc$glm_fit)
```

## Delta method variance

The delta method SE for STC accounts for uncertainty in:

1. **Outcome model parameters** (via the GLM's variance-covariance matrix)
2. **AgD event rate** (binomial sampling variance)

The gradient of $\text{logit}(\text{mean}(\sigma(\mathbf{X}\boldsymbol{\beta})))$
with respect to $\boldsymbol{\beta}$ is computed analytically:

$$
\frac{\partial}{\partial \boldsymbol{\beta}} \text{logit}\left(\frac{1}{N}\sum_j \sigma(\mathbf{x}_j^T\boldsymbol{\beta})\right) = \frac{1}{\bar{p}(1-\bar{p})} \cdot \frac{1}{N} \sum_j \sigma'(\eta_j) \mathbf{x}_j
$$

where $\sigma(\eta) = 1/(1 + e^{-\eta})$ and $\sigma'(\eta) = \sigma(\eta)(1 - \sigma(\eta))$.

## Continuous and count outcomes

The same `naive()` and `stc()` functions support normal and Poisson outcomes.
For normal outcomes, `naive()` compares the IPD mean against the inverse-
variance weighted AgD mean, while `stc()` fits a Gaussian GLM and
G-computes the index-treatment mean in the comparator population. For Poisson
outcomes, both methods report log rate ratios; zero observed event counts use
a 0.5 continuity correction on the log-rate scale so estimates remain finite.

```{r, eval = TRUE, purl = FALSE}
# Normal-family benchmark
ipd_normal <- set_ipd(
  data.frame(
    trt = "Drug_A",
    score = rnorm(120, mean = 3.0, sd = 1.0),
    age_cat = rbinom(120, 1, 0.40)
  ),
  treatment = "trt",
  outcome = "score",
  covariates = "age_cat",
  family = "normal"
)
agd_normal <- set_agd(
  data.frame(trt = "Drug_B", y_mean = 2.7, se = 0.12, age_cat_mean = 0.35),
  treatment = "trt",
  family = "normal",
  outcome_mean = "y_mean",
  outcome_se = "se",
  cov_means = "age_cat_mean",
  cov_types = "binary"
)
dat_normal <- combine_data(ipd_normal, agd_normal)
naive(dat_normal)
stc(dat_normal)
```

```{r, eval = TRUE, purl = FALSE}
# Poisson-family benchmark
exposure <- runif(120, 0.5, 2.0)
ipd_poisson <- set_ipd(
  data.frame(
    trt = "Drug_A",
    events = rpois(120, exp(0.2) * exposure),
    person_years = exposure,
    age_cat = rbinom(120, 1, 0.40)
  ),
  treatment = "trt",
  outcome = "events",
  covariates = "age_cat",
  family = "poisson",
  exposure = "person_years"
)
agd_poisson <- set_agd(
  data.frame(trt = "Drug_B", n_events = 40, person_years = 180,
             age_cat_mean = 0.35),
  treatment = "trt",
  family = "poisson",
  outcome_r = "n_events",
  outcome_E = "person_years",
  cov_means = "age_cat_mean",
  cov_types = "binary"
)
dat_poisson <- combine_data(ipd_poisson, agd_poisson)
naive(dat_poisson)
stc(dat_poisson)
```

## Comparing all methods

```{r, eval = TRUE, purl = FALSE}
# Fit all three methods
naive_result <- naive(dat)
stc_result <- stc(dat)
mlumr_result <- mlumr(
  dat, model = "spfa",
  chains = 2, iter = 500, warmup = 250,
  seed = 42, refresh = 0, verbose = FALSE
)

# Extract LORs for comparison
le_naive <- naive_result$link_effect
le_stc <- stc_result$link_effect
lor_mlumr <- marginal_effects(mlumr_result, effect = "lor",
                               population = "comparator")

cat("Method comparison (LOR in comparator population):\n")
cat(sprintf("  Naive:  %.3f [%.3f, %.3f]\n",
            naive_result$link_effect, naive_result$ci_lower, naive_result$ci_upper))
cat(sprintf("  STC:    %.3f [%.3f, %.3f]\n",
            stc_result$link_effect, stc_result$ci_lower, stc_result$ci_upper))
cat(sprintf("  ML-UMR: %.3f [%.3f, %.3f]\n",
            lor_mlumr$mean, lor_mlumr$q2.5, lor_mlumr$q97.5))
```

## Confidence level

Both `naive()` and `stc()` accept a `conf_level` parameter:

```{r, eval = TRUE, purl = FALSE}
# 90% confidence intervals
naive_90 <- naive(dat, conf_level = 0.90)
stc_90 <- stc(dat, conf_level = 0.90)
print(naive_90)
print(stc_90)
```

## Key differences from ML-UMR

| Feature | STC | Naive | ML-UMR |
|---------|-----|-------|--------|
| Covariate adjustment | Outcome model | None | Joint model |
| Population weighting | G-computation | None | QMC integration |
| Uncertainty | Delta method | Delta method | Posterior |
| Effect modification | Not captured | N/A | Relaxed model |
| Speed | Instant | Instant | Minutes |
| No. parameters | p+1 | 0 | 2+p (SPFA) or 2+2p (Relaxed) |

STC is faster but makes stronger modeling assumptions. ML-UMR jointly
models both data sources and naturally propagates all sources of
uncertainty through the posterior.