Multi-task regression and network estimation with missing responses — no imputation required!
missoNet jointly estimates regression
coefficients and the response network (precision
matrix) from multi-response data where some responses are
missing (MCAR/MAR/MNAR). Estimation is based on unbiased estimating
equations with separate L1 regularization for
coefficients and the precision matrix, enabling robust multi-trait
analysis under incomplete outcomes.
missoNet is intended for cases where you want both the regression effects (Beta) and the
conditional dependency structure (Theta).

If you only have a single response, classical lasso/elastic net (e.g.,
glmnet) is simpler and likely faster.
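To make the link between the precision matrix Theta and the response network more concrete, here is a minimal base-R sketch on a toy matrix. It relies only on the standard identity that the partial correlation between responses i and j is -Theta[i, j] / sqrt(Theta[i, i] * Theta[j, j]). The helper `precision_to_partial_cor` and the matrix `Theta_toy` are illustrative names, not part of missoNet.

```r
# Hypothetical helper (not part of missoNet): convert a precision matrix
# into a matrix of partial correlations.
precision_to_partial_cor <- function(Theta) {
  d <- sqrt(diag(Theta))
  P <- -Theta / outer(d, d)  # -Theta[i, j] / sqrt(Theta[i, i] * Theta[j, j])
  diag(P) <- 1
  P
}

# Toy 3 x 3 precision matrix: responses 1 and 2 are conditionally dependent,
# response 3 is conditionally independent of both.
Theta_toy <- matrix(c( 2, -1, 0,
                      -1,  2, 0,
                       0,  0, 1), nrow = 3, byrow = TRUE)

pcor <- precision_to_partial_cor(Theta_toy)

# Edges of the response network: nonzero off-diagonal entries (upper triangle).
edges <- which(Theta_toy != 0 & upper.tri(Theta_toy), arr.ind = TRUE)
```

A zero off-diagonal entry of Theta means the two responses are conditionally independent given the others (under a Gaussian model), which is why a sparse Theta can be read as a network. The same helper could be applied to an estimated Theta from a fitted model.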
CRAN (stable)
install.packages("missoNet")

GitHub (development)
# install.packages("devtools")
devtools::install_github("yixiao-zeng/missoNet", build_vignettes = TRUE)

library(missoNet)
# Example data with ~15% missing responses (MCAR)
sim <- generateData(n = 300, p = 50, q = 10, rho = 0.15, missing.type = "MCAR")
# Fit along two lambda paths; choose via BIC (no CV)
fit <- missoNet(X = sim$X, Y = sim$Z, GoF = "BIC")
# Extract estimates at the selected solution
Beta <- fit$est.min$Beta # p x q regression coefficients
Theta <- fit$est.min$Theta # q x q precision (conditional network)
# Visualize selection path
plot(fit, type = "scatter")

# 5-fold CV over (lambda.beta, lambda.theta)
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5)
# Inspect CV heatmap and selected models (min and 1-SE variants)
plot(cvfit, type = "heatmap")
# Predict responses on new data
Y_hat <- predict(cvfit, newx = sim$X, s = "lambda.min")

Tip: Try s = "lambda.1se.beta" or "lambda.1se.theta" for more conservative sparsity when
available.
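When the response matrix has missing entries, prediction error is usually computed over the observed cells only. The following base-R sketch shows one way to do that on toy data. `masked_mse` is a hypothetical helper, not a missoNet function, and it assumes missing responses are coded as NA.

```r
# Hypothetical helper (not part of missoNet): mean squared error computed
# only over the cells of Y_obs that are observed (non-NA).
masked_mse <- function(Y_obs, Y_pred) {
  stopifnot(all(dim(Y_obs) == dim(Y_pred)))
  observed <- !is.na(Y_obs)
  mean((Y_obs[observed] - Y_pred[observed])^2)
}

# Toy 2 x 3 response matrix with two missing entries.
Y_obs_toy <- matrix(c(1, NA, 3,
                      4,  5, NA), nrow = 2, byrow = TRUE)
Y_pred_toy <- matrix(c(1.5, 2, 3,
                       4,   4, 6), nrow = 2, byrow = TRUE)

mse <- masked_mse(Y_obs_toy, Y_pred_toy)
```

With the objects from the example above, a call such as `masked_mse(sim$Z, Y_hat)` would give an in-sample error, assuming `predict` returns an n x q matrix aligned with `sim$Z`.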
library(parallel)
cl <- makeCluster(max(1, detectCores() - 1))
cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5,
                     parallel = TRUE, cl = cl)
stopCluster(cl)

# Lessen the penalty for prior-important predictors
p <- ncol(sim$X); q <- ncol(sim$Z)
beta.pen.factor <- matrix(1, p, q)
beta.pen.factor[c(1, 2), ] <- 0.1
fit <- missoNet(X = sim$X, Y = sim$Z,
                beta.pen.factor = beta.pen.factor)

fit <- missoNet(X = sim$X, Y = sim$Z,
                adaptive.search = TRUE,
                n.lambda.beta = 50,
                n.lambda.theta = 50)

vignette("missoNet-introduction")
vignette("missoNet-cross-validation")
vignette("missoNet-case-study")

If vignettes are not available from CRAN binaries on your platform,
install from source using the GitHub command above with
build_vignettes = TRUE.
Actual performance will depend on sparsity, signal-to-noise, and missingness mechanisms.
Great for
Not ideal for
- Single-response regression (use glmnet or similar)
- Extremely sparse information (e.g., >50% missing responses across most traits)
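A quick way to check whether your data fall into the "extremely sparse" category is to summarize missingness per response before fitting. The sketch below uses base R only. `missing_summary` is a hypothetical helper, not part of missoNet, and it assumes missing responses are coded as NA.

```r
# Hypothetical helper (not part of missoNet): fraction of missing values
# per response column and over the whole response matrix.
missing_summary <- function(Y) {
  list(per_response = colMeans(is.na(Y)),
       overall      = mean(is.na(Y)))
}

# Toy 4 x 4 response matrix with missing entries.
Y_toy <- matrix(c(1, NA,  3, NA,
                  2, NA, NA,  4,
                  5,  6,  7, NA,
                  8, NA,  9,  1), nrow = 4, byrow = TRUE)

ms <- missing_summary(Y_toy)

# Responses with more than 50% missing values.
heavy <- which(ms$per_response > 0.5)
```

On real data, the same call could be applied to the response matrix passed as `Y`. Responses flagged in `heavy` are the ones most likely to contribute little information.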
If you use missoNet in your research, please cite:
@article{zeng2025missonet,
title = {Multivariate regression with missing response data for modelling regional DNA methylation QTLs},
author = {Zeng, Yixiao and Alam, Shomoita and Bernatsky, Sasha and Hudson, Marie and Colmegna, In{\'e}s and Stephens, David A and Greenwood, Celia MT and Yang, Archer Y},
journal = {arXiv preprint arXiv:2507.05990},
year = {2025},
url = {https://arxiv.org/abs/2507.05990}
}

Contributions and issues are welcome! Please open a discussion or pull request on the GitHub repository.
GPL-2. See the LICENSE file.