6. Repeated-Measures Workflows

Scope

Repeated-measures analysis requires a long data layout, with one row per observation, and usually a different estimand. The wide-data matrix workflow is not enough because the analysis must account for the repeated observations within each subject.

This vignette covers:

rmcorr()
ba_rm()
ccc_rm_ustat()
ccc_rm_reml()
icc_rm_reml()

All five functions take a subject argument that names the subject-identifier column.

A common long-format dataset

library(matrixCorr)

set.seed(50)
n_id <- 14
n_time <- 4

dat <- expand.grid(
  id = factor(seq_len(n_id)),
  time = factor(seq_len(n_time)),
  method = factor(c("A", "B"))
)

dat$time_index <- as.integer(dat$time)

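# Random effects: subject intercepts, then subject-by-method and
# subject-by-time deviations, each expanded to one value per row
subj <- rnorm(n_id, sd = 1.0)[dat$id]
subject_method <- rnorm(n_id * 2, sd = 0.25)
sm <- subject_method[(as.integer(dat$id) - 1L) * 2L + as.integer(dat$method)]
subject_time <- rnorm(n_id * n_time, sd = 0.75)
st <- subject_time[(as.integer(dat$id) - 1L) * n_time + as.integer(dat$time)]

# Method B carries a fixed offset of 0.35; residual noise has sd 0.35
dat$y <- subj + sm + st + 0.35 * (dat$method == "B") +
  rnorm(nrow(dat), sd = 0.35)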
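
As a rough illustration of the layout difference, the sketch below builds a toy long-format table (one row per subject, time, and method) and reshapes it to a paired wide layout with base R's reshape(). The toy values and the names long_toy and wide_toy are assumptions made for illustration; they are not part of matrixCorr.

```r
# Toy long-format data: one row per subject x time x method (hypothetical values).
long_toy <- expand.grid(
  id = factor(1:3),
  time = factor(1:2),
  method = factor(c("A", "B"))
)
long_toy$y <- seq_len(nrow(long_toy))

# Long layout: 3 subjects * 2 times * 2 methods = 12 rows.
nrow(long_toy)

# Wide layout: one row per subject-time pair, one response column per method.
wide_toy <- reshape(
  long_toy,
  idvar = c("id", "time"),
  timevar = "method",
  direction = "wide"
)
names(wide_toy)
nrow(wide_toy)
```

The repeated-measures functions in this vignette take the long layout directly, so no reshaping is needed before calling them.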

Repeated-measures correlation

rmcorr() targets within-subject association, not agreement. It is the right function when the question is whether two responses vary together within subjects after removing subject-level offsets.

set.seed(51)
dat_rmcorr <- data.frame(
  id = rep(seq_len(n_id), each = n_time),
  x = rnorm(n_id * n_time),
  y = rnorm(n_id * n_time),  # placeholder draw; y is redefined below
  z = rnorm(n_id * n_time)
)

# y tracks x within subject, plus a subject-level offset and noise
dat_rmcorr$y <- 0.7 * dat_rmcorr$x +
  rnorm(n_id, sd = 1)[dat_rmcorr$id] +
  rnorm(nrow(dat_rmcorr), sd = 0.3)

fit_rmcorr <- rmcorr(dat_rmcorr, response = c("x", "y", "z"), subject = "id")
summary(fit_rmcorr)
#> Correlation summary
#>   output      : matrix
#>   dimensions  : 3 x 3
#>   retained_pairs: 6
#>   threshold   : 0.0000
#>   diag        : included
#>   estimate    : -0.3995 to 1.0000
#> 
#>  item1 item2 estimate n_complete fisher_z
#>  x     y     0.9397   56         1.7356  
#>  y     z     -0.3995  56         -0.4230 
#>  x     z     -0.3843  56         -0.4051 
#> 
#> Strongest pairs by |estimate|
#> 
#>  item1 item2 estimate   n_complete fisher_z  
#>  x     y      0.9397160 56          1.7356154
#>  y     z     -0.3994915 56         -0.4230437
#>  x     z     -0.3843429 56         -0.4051453
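
One way to see what removing subject-level offsets means is to center each variable within subject and correlate the centered values. The base-R sketch below does this for data simulated along the same lines as above. It illustrates the idea only and is not necessarily how rmcorr() computes its estimate; the names toy, x_c, y_c, r_within, and r_pooled are introduced here for illustration.

```r
set.seed(51)
n_id <- 14
n_time <- 4

toy <- data.frame(
  id = rep(seq_len(n_id), each = n_time),
  x = rnorm(n_id * n_time)
)

# y follows x within subject, plus a subject-level offset and noise.
toy$y <- 0.7 * toy$x +
  rnorm(n_id, sd = 1)[toy$id] +
  rnorm(nrow(toy), sd = 0.3)

# Subtract each subject's own mean to remove subject-level offsets.
x_c <- toy$x - ave(toy$x, toy$id)
y_c <- toy$y - ave(toy$y, toy$id)

# Correlation of the centered values: a within-subject association.
r_within <- cor(x_c, y_c)

# The pooled correlation ignores subjects, so it mixes within-subject
# and between-subject variation.
r_pooled <- cor(toy$x, toy$y)

c(within = r_within, pooled = r_pooled)
```

The centered correlation is the same quantity as the correlation between the residuals of x and y after regressing each on a subject factor, which is why subject-level offsets do not affect it.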

Repeated-measures agreement

Agreement methods require a column of method labels and, for paired repeated-measures analysis, a time or replicate key that matches observations across methods.

ba_rm() is the repeated-measures Bland-Altman route. It models paired differences matched on subject and time, and returns the bias and the limits of agreement.

fit_ba_rm <- ba_rm(
  dat,
  response = "y",
  subject = "id",
  method = "method",
  time = "time_index"
)

summary(fit_ba_rm)
#> 
#> Bland-Altman (two methods) (95% CI)
#> 
#> Agreement estimates
#> 
#>  item1    item2    n_obs bias  sd_loa loa_low loa_up width
#>  method 1 method 2 56    0.654 0.651  -0.622  1.93   2.552
#> 
#> Confidence intervals
#> 
#>  bias_lwr bias_upr lo_lwr lo_upr up_lwr up_upr
#>  0.483    0.824    -0.793 -0.452 1.759  2.101 
#> 
#> Model details
#> 
#>  sigma2_subject sigma2_resid residual_model
#>  0              0.424        iid
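
The printed limits of agreement are consistent with the usual construction: bias plus or minus a normal quantile times the standard deviation used for the limits. The sketch below redoes that arithmetic from the rounded values in the summary. Treating the multiplier as qnorm(0.975) is an assumption inferred from the output, not a statement about ba_rm() internals.

```r
# Rounded values copied from summary(fit_ba_rm) above.
bias <- 0.654
sd_loa <- 0.651

# Two-sided 95% normal multiplier (assumed; about 1.96).
z <- qnorm(0.975)

# Lower and upper limits of agreement, and the width between them.
loa <- c(low = bias - z * sd_loa, up = bias + z * sd_loa)
loa_width <- unname(loa["up"] - loa["low"])

round(loa, 3)
round(loa_width, 3)
```

Up to rounding, this reproduces the loa_low, loa_up, and width columns shown above (-0.622, 1.93, and 2.552).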

Repeated CCC

The package provides two repeated-measures CCC routes.

ccc_rm_ustat() is a nonparametric U-statistic approach.
ccc_rm_reml() uses the REML mixed-model backend.

fit_ccc_ustat <- ccc_rm_ustat(
  dat,
  response = "y",
  subject = "id",
  method = "method",
  time = "time_index"
)

fit_ccc_reml <- ccc_rm_reml(
  dat,
  response = "y",
  subject = "id",
  method = "method",
  time = "time_index",
  ci = FALSE
)

summary(fit_ccc_ustat)
#> Lin's concordance summary
#>   method      : Repeated-measures Lin's concordance
#>   dimensions  : 2 x 2
#>   pairs       : 1
#>   estimate    : 0.7180
#>   most_negative: A-B (0.7180)
#>   most_positive: A-B (0.7180)
#> 
#> Strongest pairs by |estimate|
#> 
#>  item1 item2 estimate
#>  A     B     0.718
summary(fit_ccc_reml)
#> 
#> Repeated-measures concordance (REML)
#> 
#> Concordance estimates
#> 
#>  item1 item2 estimate n_subjects n_obs SB     se_ccc
#>  A     B     0.6542   14         112   0.2047 0.0607
#> 
#> Variance components
#> 
#>  sigma2_subject sigma2_subject_method sigma2_subject_time sigma2_error
#>  0.4785         0.0997                0.5943              0.1085      
#> 
#> AR(1) diagnostics
#> 
#>  ar1_rho ar1_rho_lag1 ar1_rho_mom ar1_pairs ar1_pval use_ar1 ar1_recommend
#>  -0.3941 -0.3941      -0.3941     84        3e-04    FALSE   FALSE

The two functions are related, but they are not interchangeable. The U-statistic route is useful when its design assumptions are appropriate. The REML route is the more flexible model-based path when variance components and residual structure need to be handled explicitly.

Repeated ICC

icc_rm_reml() uses the same REML and Woodbury backend as repeated CCC, but it targets a different reliability quantity.

fit_icc_cons <- icc_rm_reml(
  dat,
  response = "y",
  subject = "id",
  method = "method",
  time = "time_index",
  type = "consistency",
  ci = TRUE
)

fit_icc_agr <- icc_rm_reml(
  dat,
  response = "y",
  subject = "id",
  method = "method",
  time = "time_index",
  type = "agreement",
  ci = FALSE
)

summary(fit_icc_cons)
#> 
#> Repeated-measures intraclass correlation (REML) (95% CI)
#> 
#> ICC estimates
#> 
#>  item1 item2 estimate lwr  upr  n_subjects n_obs se_icc residual_model
#>  A     B     0.6347   0.63 0.64 14         112   0.0022 iid           
#> 
#> Variance components
#> 
#>  sigma2_subject sigma2_subject_method sigma2_subject_time sigma2_error SB    
#>  0.4785         0.0997                0.5943              0.1085       0.2047
#> 
#> AR(1) diagnostics
#> 
#>  ar1_rho_lag1 ar1_rho_mom ar1_pairs ar1_pval use_ar1 ar1_recommend
#>  -0.3941      -0.3941     84        3e-04    FALSE   FALSE
summary(fit_icc_agr)
#> 
#> Repeated-measures intraclass correlation (REML)
#> 
#> ICC estimates
#> 
#>  item1 item2 estimate n_subjects n_obs se_icc residual_model
#>  A     B     0.4992   14         112   0.0463 iid           
#> 
#> Variance components
#> 
#>  sigma2_subject sigma2_subject_method sigma2_subject_time sigma2_error SB    
#>  0.4785         0.0997                0.5943              0.1085       0.2047
#> 
#> AR(1) diagnostics
#> 
#>  ar1_rho_lag1 ar1_rho_mom ar1_pairs ar1_pval use_ar1 ar1_recommend
#>  -0.3941      -0.3941     84        3e-04    FALSE   FALSE

The simulation above gives the subject-time component a substantial share of the variance: its estimate (sigma2_subject_time = 0.5943) is the largest of the four variance components. That makes the CCC-versus-ICC distinction easier to see:

data.frame(
  method = c("Repeated CCC (REML)", "Repeated ICC (agreement, REML)"),
  estimate = c(
    fit_ccc_reml[1, 2],
    fit_icc_agr[1, 2]
  )
)
#>                           method  estimate
#> 1            Repeated CCC (REML) 0.6541643
#> 2 Repeated ICC (agreement, REML) 0.4991620

The most important distinction between repeated CCC and repeated ICC is the numerator. Repeated ICC uses only the stable between-subject variance in the numerator. Repeated CCC also credits the time-averaged subject-time component. As a result, the two summaries can differ materially even when they are fitted through the same backend: here the repeated CCC is 0.654 and the agreement ICC is 0.499.
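
The printed variance components make the numerator difference concrete. In the sketch below, the time-varying components (subject-time and error) are divided by the number of time points, and SB is added to the denominator for the agreement-type summaries. These formulas are inferred from the output above, which they reproduce to rounding; they are an assumption, not the package's documented definitions.

```r
# Rounded variance components copied from the summaries above.
s2_subject <- 0.4785
s2_subject_method <- 0.0997
s2_subject_time <- 0.5943
s2_error <- 0.1085
sb <- 0.2047
n_time <- 4

# Time-averaged contributions of the time-varying components (assumed form).
st_avg <- s2_subject_time / n_time
err_avg <- s2_error / n_time

# Consistency ignores the systematic method difference; agreement adds SB.
den_consistency <- s2_subject + s2_subject_method + st_avg + err_avg
den_agreement <- den_consistency + sb

est <- c(
  icc_consistency = s2_subject / den_consistency,
  icc_agreement = s2_subject / den_agreement,
  ccc = (s2_subject + st_avg) / den_agreement
)
round(est, 3)
```

Under these assumed formulas the three values round to 0.635, 0.499, and 0.654, matching the summaries. The agreement ICC and the CCC share a denominator, so the gap between them comes entirely from the st_avg term in the CCC numerator.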

Choosing among repeated-measures methods

The method choice should follow the scientific question.

Use rmcorr() for within-subject association.
Use ba_rm() for repeated-measures bias and limits of agreement.
Use ccc_rm_ustat() or ccc_rm_reml() for repeated concordance.
Use icc_rm_reml() for repeated reliability under a variance-components interpretation.

The shared long-format interface is intentional. The statistical targets are different, but the package keeps the surrounding workflow stable.