---
title: "Heterogeneity and person-specific effects"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Heterogeneity and person-specific effects}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Estimands: population, partial pooling, and no pooling

Mixed models estimate **population** fixed effects and **variance components** for random effects.
Person-specific quantities are usually summarized by **conditional modes** (empirical Bayes / BLUPs in `lme4`):

- **`ranef(fit)`**: person-specific deviations from the population (fixed) effect.
- **`coef(fit)`**: person-specific coefficients (population + deviation), i.e. partial pooling.

These are **not** the same as fitting a separate regression in each person (**no pooling**), which
`ild_person_model()` implements for teaching and idiographic workflows. No-pooling estimates do not
shrink and can be very unstable with few observations per person.
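
The distinction can be checked directly in `lme4`. The sketch below is illustrative only: it uses the
`sleepstudy` data shipped with `lme4` as a stand-in for an ILD data set, and plain `lm()` per person in
place of `ild_person_model()`, whose arguments are not shown here.

```{r eval = requireNamespace("lme4", quietly = TRUE)}
library(lme4)

# Stand-in data: 18 subjects, 10 daily observations each (ships with lme4).
fit_pp <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Partial pooling: conditional modes (deviations) and person-specific totals.
dev_pp  <- ranef(fit_pp)$Subject
coef_pp <- coef(fit_pp)$Subject

# coef() equals the fixed effect plus the person's deviation.
all.equal(coef_pp$Days, unname(fixef(fit_pp)["Days"]) + dev_pp$Days)

# No pooling: an independent regression per person.
coef_np <- t(sapply(
  split(sleepstudy, sleepstudy$Subject),
  function(d) coef(lm(Reaction ~ Days, data = d))
))

# Compare the spread of slopes under the two approaches.
c(partial_pooling = sd(coef_pp$Days), no_pooling = sd(coef_np[, "Days"]))
```

Partial-pooling slopes are pulled toward the population slope, so their spread is typically smaller than
that of the per-person fits.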

Bayesian fits from `ild_brms()` report person-specific summaries via posterior distributions;
`ild_heterogeneity()` reads `coef(fit, summary = TRUE)` for posterior means and intervals.

## `ild_heterogeneity()`

After fitting a model with random effects via `ild_lme()` or `ild_brms()`, call:

```{r eval = requireNamespace("lme4", quietly = TRUE)}
library(tidyILD)
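# Simulate 20 persons with 10 observations each.
d <- ild_simulate(n_id = 20, n_obs_per = 10, seed = 7)
# Declare the person and time columns.
x <- ild_prepare(d, id = "id", time = "time")
# Split y into within-person (y_wp) and between-person (y_bp) parts.
x <- ild_center(x, y)
# Random-intercept model, then extract person-specific heterogeneity.
fit <- ild_lme(y ~ y_wp + y_bp + (1 | id), data = x)
h <- ild_heterogeneity(fit)
print(h$summary)
head(ild_tidy(h))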
```

The `$summary` table reports the proportion of person-specific **total** coefficients (population effect
plus deviation) that are greater than zero, together with quantiles of those coefficients. For `lmer` fits
it also joins the `VarCorr` standard deviations when the term names align.

Optional `threshold` and `scale = c("raw", "sd_x", "sd_y")` define substantively motivated cutoffs for
the proportion exceeding a threshold (e.g. a fraction of the SD of $x$ or $y$).
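
The quantities described above can be approximated by hand from person-specific coefficients. The sketch
below uses `lme4` and its `sleepstudy` data as a stand-in; the raw-scale cutoff of 5 is an arbitrary value
chosen for illustration, and the columns returned by `ild_heterogeneity()` may differ. Scaling the cutoff
by the SD of $x$ or $y$ (`scale = "sd_x"` or `"sd_y"`) is not reproduced here.

```{r eval = requireNamespace("lme4", quietly = TRUE)}
library(lme4)

fit_demo <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Person-specific total slopes (fixed effect + conditional mode).
slopes <- coef(fit_demo)$Subject$Days

# Hypothetical raw-scale cutoff, for illustration only.
cutoff <- 5

summary_by_hand <- data.frame(
  term           = "Days",
  prop_gt_zero   = mean(slopes > 0),
  prop_gt_cutoff = mean(slopes > cutoff),
  q25            = unname(quantile(slopes, 0.25)),
  median         = unname(quantile(slopes, 0.50)),
  q75            = unname(quantile(slopes, 0.75))
)
summary_by_hand
```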

## Diagnostics bundle and plots

`ild_diagnose()` stores the result in the bundle's `fit$heterogeneity` slot when extraction succeeds. Plot with:

```{r eval = FALSE}
# `bundle` is the diagnostics bundle returned by ild_diagnose().
ild_autoplot(bundle, section = "fit", type = "heterogeneity", term = "y_wp")
```

To draw a histogram instead, additionally pass `heterogeneity_type = "histogram"` (forwarded through `...`).

Guardrails **`GR_RE_SLOPE_VARIANCE_VERSUS_RESIDUAL_LOW`** and **`GR_PERSON_SPECIFIC_SLOPES_EMPIRICALLY_TIGHT`**
flag cases where the estimated slope heterogeneity is small relative to residual noise. Both are heuristic
aids to interpretation.
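
To get a feel for the kind of comparison involved, the random-slope SD can be set against the residual SD
by hand. This is only a sketch with `lme4` and the `sleepstudy` stand-in data; the cutoffs the guardrails
actually apply are not documented in this vignette and are not reproduced.

```{r eval = requireNamespace("lme4", quietly = TRUE)}
library(lme4)

fit_demo <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Variance components as a data frame: one row per SD or correlation.
vc <- as.data.frame(VarCorr(fit_demo))

# Row holding the SD of the random slope for Days (var2 is NA on SD rows).
is_slope <- which(vc$grp == "Subject" & vc$var1 == "Days" & is.na(vc$var2))
slope_sd <- vc$sdcor[is_slope]

# Residual SD of the model.
resid_sd <- sigma(fit_demo)

# Small values mean slope differences between persons are small relative to residual noise.
slope_sd / resid_sd
```

Because the slope SD is expressed per unit of the predictor, the ratio depends on how the predictor is scaled.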

## Stratified descriptive comparison

`ild_heterogeneity_stratified()` refits the same formula within each level of a grouping column and binds
the per-subgroup summaries. This is a **descriptive** tool, not a formal test of differences in variance
components; use it only with adequate $N$ per subgroup (`min_n_id`).

```{r eval = FALSE}
ild_heterogeneity_stratified(
  y ~ y_wp + (y_wp | id),
  data = x,
  subgroup = "cohort",
  min_n_id = 8L
)
```
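
Conceptually this amounts to a split, refit, and bind loop. The sketch below imitates that loop with `lme4`
on the `sleepstudy` stand-in data. The `cohort` column is made up for the example, and treating `min_n_id`
as the minimum number of persons required per subgroup is an assumption based on the argument name, not a
description of the package's implementation.

```{r eval = requireNamespace("lme4", quietly = TRUE)}
library(lme4)

# Made-up grouping column: first 9 subjects vs. the remaining 9.
sleep <- sleepstudy
sleep$cohort <- ifelse(as.integer(sleep$Subject) <= 9, "a", "b")

min_n_id <- 8L  # assumed meaning: minimum number of persons per subgroup

per_group <- lapply(split(sleep, sleep$cohort), function(d) {
  d <- droplevels(d)  # drop Subject levels that belong to other subgroups
  n_id <- length(unique(d$Subject))
  if (n_id < min_n_id) return(NULL)  # skip subgroups that are too small
  # A random-intercept model keeps the small-sample refits simple.
  fit_g <- lmer(Reaction ~ Days + (1 | Subject), data = d)
  ints <- coef(fit_g)$Subject[["(Intercept)"]]
  data.frame(
    cohort           = d$cohort[1],
    n_id             = n_id,
    sd_intercept     = sd(ints),
    median_intercept = median(ints)
  )
})

# Bind the per-subgroup summaries into one descriptive table.
stratified <- do.call(rbind, per_group)
stratified
```

Differences between rows of such a table describe the subgroups at hand; they do not by themselves
establish that the variance components differ.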

## See also

`vignette("developer-contracts", package = "tidyILD")` documents the optional `fit$heterogeneity` slot
on `ild_diagnostics_bundle`.