--- title: "Frequently Asked Questions" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Frequently Asked Questions} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## 1. What's the difference between a multiple-membership model and a conventional multilevel model? In a conventional hierarchical multilevel model, each lower-level unit belongs to exactly one higher-level unit (e.g., students nested in schools). In a *multiple-membership model (MMMM)*, lower-level units can belong to multiple higher-level units simultaneously, or alternatively, higher-level outcomes can be influenced by multiple lower-level units. For example: - **Conventional MLM:** Students (level 1) nested in schools (level 2) — each student attends exactly one school. - **MMMM:** Coalition governments (level 2) composed of multiple political parties (level 1) — each party can participate in multiple governments over time, and each government comprises multiple parties. The MMMM accounts for this complex membership structure by weighting the contributions of each lower-level unit, allowing researchers to model how multiple units jointly shape higher-level outcomes. ## 2. What's the difference between a conventional MMMM and the extended MMMM implemented in `bml`? The *conventional MMMM* (as implemented in MLwiN, brms, or other software) uses *fixed, pre-specified weights* to aggregate lower-level effects. For instance, you might use equal weights (1/n) or weights based on time spent in each context. The *extended MMMM* in `bml` allows you to: 1. **Parameterize the weight function:** Instead of fixing weights, you can specify a functional form for weights (e.g., `w ~ 1/n^exp(b*X)`) and estimate the parameters that determine how lower-level units are aggregated. 2. **Test alternative aggregation mechanisms:** Compare different weighting schemes (equal weights, proportional weights, functions of covariates) to determine which best fits the data. 3. **Endogenize weight matrices:** Rather than imposing spatial or network weights externally, let the data determine connection strengths as functions of covariates. This flexibility enables researchers to explicitly model the *micro-to-macro link* --- how lower-level characteristics aggregate to produce higher-level outcomes. ## 3. When should I use `bml` instead of other multilevel modeling packages? Use `bml` when: - **You have multiple-membership structures:** Higher-level outcomes depend on multiple lower-level units (e.g., coalitions composed of parties, teams composed of individuals, neighborhoods influenced by surrounding areas). - **You need flexible weight functions:** Your theory suggests weights should depend on covariates, group size, or other features, not be fixed in advance. - **You're studying aggregation processes / micro-to-macro relationships:** Your research question focuses on how lower-level units jointly shape higher-level outcomes, rather than how higher-level contexts shape lower-level outcomes. Rather than assuming a fixed aggregation (e.g., simple average), you want to test and estimate how lower-level effects combine. For standard hierarchical models without multiple membership, packages like `lme4`, `brms`, or `MCMCglmm` will be more efficient. For conventional MMMMs with fixed weights, `brms` or MLwiN are excellent alternatives. ## 4. What outcome types and distributions does `bml` support? 
`bml` supports a variety of outcome distributions commonly used in social science research:

- **Continuous outcomes:** Gaussian (normal) regression
- **Binary outcomes:** Logit (logistic) regression
- **Survival/duration outcomes:**
  - Cox proportional hazards model
  - Weibull accelerated failure time (AFT) model

You can also specify hierarchical random effects (`hm()` blocks) in addition to multiple-membership structures, allowing for hierarchically nested and cross-classified designs.

## 5. How do I specify the weight function, and what are the `c` and `ar` parameters?

The weight function is specified in the `fn()` container within an `mm()` block:

```r
mm(
  id = id(member_id, group_id),
  vars = vars(X),
  fn = fn(w ~ 1/n, c = TRUE),
  RE = TRUE,
  ar = FALSE
)
```

**Weight function components:**

- **`w ~ ...`**: Specifies the functional form for the weights. You can use:
  - Group size: `n` (the number of members in each group)
  - Covariates: any variable in your data
  - Mathematical functions: `exp()`, `log()`, `sqrt()`, etc.
  - Examples: `w ~ 1/n` (equal weights), `w ~ tenure` (proportional to tenure), `w ~ 1/n^exp(b*similarity)` (similarity-weighted)
- **`c` parameter:** Controls weight normalization.
  - `c = TRUE` (default): Weights are normalized to sum to 1 within each group (row-standardization)
  - `c = FALSE`: Weights are not normalized (useful when aggregating sums rather than averages)
- **`ar` parameter** (set in `mm()`, alongside `RE`): Controls autoregressive random effects.
  - `ar = FALSE` (default): Standard independent random effects for each member
  - `ar = TRUE`: Member-level random effects evolve as a random walk across repeated group participations, capturing dynamics where a member's unobserved heterogeneity changes over time

**Important:** Only one `mm()` block can have `RE = TRUE` in a given model.

## 6. How do I fix parameters to known values?

There are two ways to fix parameters in `bml`, depending on where in the model they appear.

### Main equation, `hm()`, and `mm()` blocks: Using `fix()`

Use the `fix()` helper to hold a coefficient at a specified value rather than estimating it. It can appear directly in the main equation or inside `vars()` in `hm()` and `mm()` blocks.

**Main equation:**

```r
# Fix the coefficient of 'exposure' to 1 (i.e., use it as an offset)
m <- bml(
  y ~ 1 + fix(exposure, 1) + x1 + x2,
  family = "Gaussian",
  data = dat
)
```

This is equivalent to adding `exposure` as a predictor but constraining its coefficient to 1 instead of estimating it, which is useful for offsets or when prior theory dictates a specific coefficient value.

**In `hm()` blocks:**

```r
m <- bml(
  y ~ 1 + x1 +
    mm(id = id(pid, gid), fn = fn(w ~ 1/n), RE = TRUE) +
    hm(id = id(cid), vars = vars(fix(investiture, 0.5) + gdp), type = "FE"),
  family = "Gaussian",
  data = coalgov
)
```

**In `mm()` blocks:**

```r
m <- bml(
  Surv(dur_wkb, event_wkb) ~ 1 + majority +
    mm(
      id = id(pid, gid),
      vars = vars(fix(cohesion, 1) + finance),
      fn = fn(w ~ 1/n, c = TRUE),
      RE = TRUE
    ),
  family = "Weibull",
  data = coalgov
)
```

Here, the coefficient of `cohesion` is fixed at 1 while `finance` is freely estimated.
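Beyond offsets, fixing a coefficient also enables a quick sensitivity check: refit the model with the coefficient pinned at a few theoretically motivated values and see how the remaining estimates change. A hypothetical sketch, reusing the Weibull model above with the `cohesion` coefficient pinned at 2 instead of 1:

```r
# Hypothetical sensitivity check: same model as above, but with the
# cohesion coefficient fixed at 2; comparing the two fits shows how
# sensitive the other estimates are to this choice
m_alt <- bml(
  Surv(dur_wkb, event_wkb) ~ 1 + majority +
    mm(
      id = id(pid, gid),
      vars = vars(fix(cohesion, 2) + finance),
      fn = fn(w ~ 1/n, c = TRUE),
      RE = TRUE
    ),
  family = "Weibull",
  data = coalgov
)
```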
### Weight function `fn()`: Omitting parameters

In the weight function specified via `fn()`, you fix the weights by simply not including any free parameters (`b0`, `b1`, ...) in the formula. When no parameters appear, the weight function is treated as a known, fixed transformation:

```r
# Fixed equal weights (no parameters to estimate)
fn(w ~ 1/n, c = TRUE)

# Fixed weights proportional to seat share (no parameters to estimate)
fn(w ~ pseat, c = TRUE)
```

Compare this with weight functions that include free parameters:

```r
# Estimable: b is estimated from the data
fn(w ~ 1/n^exp(b * x), c = TRUE)

# Estimable: b0 is estimated from the data
fn(w ~ 1 / (1 + (n - 1) * exp(-b0)), c = FALSE)
```

The key distinction: if the `fn()` formula contains `b`, `b0`, `b1`, etc., these parameters are estimated from the data. If the formula contains only data variables and constants (like `n`, `pseat`, `1`), the weights are fixed.

## 7. I get "Error in node w.1[...]: Invalid parent values" — what does this mean?

This error occurs when JAGS cannot evaluate the weight function for a particular observation because the computed weight is numerically invalid (e.g., `NaN`, `Inf`, or a value outside the domain of a downstream function). It most commonly arises with **parameterized weight functions** during the initialization phase.

**Why it happens:** Weight function parameters (`b0`, `b1`, ...) are given vague priors by default (`dnorm(0, 0.0001)` in JAGS precision, which corresponds to SD = 100). When JAGS initializes the MCMC chains, it may draw extreme starting values (e.g., `b0 = 80`). For weight functions involving nonlinear transformations like `ilogit()` or `exp()`, extreme parameter values can cause numerical issues downstream — even if the weight function itself is mathematically well-defined, the resulting weights may produce overflow in the likelihood (e.g., `exp(-mu * shape)` in the Weibull model).

**Built-in safeguard:** `bml` initializes all weight parameters at 0 by default. This ensures numerically stable starting values (e.g., `ilogit(0) = 0.5`). However, if the error persists or occurs during sampling (not just initialization), consider the following steps.

**How to fix it:**

1. **Use more informative priors.** Narrow the prior spread for weight parameters so that the sampler stays in a numerically stable region:

   ```r
   m <- bml(
     ...,
     priors = list(
       "b.w.1[1] ~ dnorm(0, 1)",  # SD = 1 instead of 100
       "b.w.1[2] ~ dnorm(0, 1)"
     )
   )
   ```

   This is especially important for parameters inside `ilogit()`, `exp()`, or other functions that saturate or explode at extreme inputs.

2. **Ensure the weight function is bounded.** Unbounded weight functions can produce extreme values that destabilize the likelihood. Common strategies:
   - Use `ilogit()` to bound weights between 0 and 1: `fn(w ~ ilogit(b0 + b1 * x), c = TRUE)`
   - Use `exp()` carefully — it grows rapidly, so pair it with constraints like `c = TRUE` (normalization) or keep its argument small: `fn(w ~ exp(b1 * x), c = TRUE)` where `x` is standardized

3. **Standardize covariates in the weight function.** If a weight variable has a large range (e.g., income in thousands), the product `b1 * x` can easily overflow. Standardize such variables before including them in `fn()`.

4. **Supply explicit initial values.** If the default initialization at 0 doesn't work for your model, provide custom starting values:

   ```r
   m <- bml(
     ...,
     inits = list("b.w.1" = c(0.5, -0.1))
   )
   ```

5. **Re-run the model.** Since the error can be seed-dependent (different chains draw different initial values), simply re-running may resolve it. However, a seed-dependent failure indicates a fragile parameterization — consider steps 1--3 for a robust solution.
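Putting several of these remedies together, a defensive specification might look like the sketch below. This is hypothetical: `similarity` stands in for whatever covariate drives your weights, and the prior and initial values are placeholders to adapt to your own model.

```r
# Step 3: standardize the weight covariate before it enters fn()
dat$similarity_z <- as.numeric(scale(dat$similarity))

m <- bml(
  y ~ 1 + x1 +
    mm(
      id   = id(member_id, group_id),
      vars = vars(X),
      # Step 2: a bounded weight function on a standardized covariate
      fn   = fn(w ~ ilogit(b0 + b1 * similarity_z), c = TRUE),
      RE   = TRUE
    ),
  family = "Gaussian",
  data   = dat,
  # Step 1: tighter priors keep the sampler in a stable region
  priors = list(
    "b.w.1[1] ~ dnorm(0, 1)",
    "b.w.1[2] ~ dnorm(0, 1)"
  ),
  # Step 4: explicit, conservative starting values
  inits = list("b.w.1" = c(0, 0))
)
```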