Getting started with the aftPenCDA package
Overview
aftPenCDA is an R package for fitting penalized
accelerated failure time (AFT) models using induced smoothing. The
package supports variable selection for both right-censored and
clustered partly interval-censored survival data.
Several penalty functions are implemented, including broken adaptive
ridge (BAR), LASSO, adaptive LASSO (ALASSO), and SCAD. For variance
estimation, the package provides both a closed-form estimator and a
perturbation-based estimator.
Core computational routines are implemented in C++ via Rcpp (with the
RcppArmadillo backend) so that fitting remains practical in
high-dimensional settings.
Methodological background
Rank-based estimation for the accelerated failure time (AFT) model
relies on estimating functions that are step functions of the
regression coefficients. They are neither continuous nor
differentiable, which makes them hard to solve numerically.
Induced smoothing replaces the nonsmooth estimating equations with
smooth approximations, allowing the use of gradient-based methods.
Because the nonsmooth equations never have to be solved directly,
computation is considerably more efficient.
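To make the smoothing step concrete, the display below sketches it for the Gehan-weighted rank estimating function. The Gehan weight is an assumption made for illustration, since the text above does not say which weight the package uses, and the notation (residuals e_i, censoring indicators \delta_i, smoothing matrix \Gamma) is likewise introduced here rather than taken from the package.

```latex
% Residuals and the nonsmooth Gehan-type estimating function
e_i(\beta) = \log T_i - x_i^\top \beta, \qquad
U_n(\beta) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n}
  \delta_i \, (x_i - x_j) \, I\{ e_j(\beta) \ge e_i(\beta) \}

% Induced smoothing: average over a small Gaussian perturbation Z \sim N(0, I_p)
\tilde U_n(\beta) = E_Z \, U_n\!\left( \beta + \Gamma^{1/2} Z \right)
  = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \delta_i \, (x_i - x_j) \,
    \Phi\!\left( \frac{e_j(\beta) - e_i(\beta)}{r_{ij}} \right),
\qquad r_{ij}^2 = (x_i - x_j)^\top \Gamma \, (x_i - x_j)
```

Here \Phi is the standard normal distribution function. The indicator is replaced by a smooth function of \beta, so \tilde U_n is differentiable; a common choice is \Gamma = n^{-1} I_p, under which the smoothing vanishes as the sample size grows.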
The smoothed problem is then approximated by a quadratic function of
the coefficients. Applying a Cholesky decomposition to the matrix of
this quadratic form turns the problem into a least-squares-type
formulation, which enables efficient coordinate descent updates for
penalized estimation, even when the number of covariates is large
relative to the sample size.
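The transformation described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's C++ routine: the quadratic approximation is taken to be 0.5 * (b - beta_hat)' A (b - beta_hat) for a positive definite matrix A, the penalty is LASSO, and the names lsa_lasso, beta_hat and A are hypothetical.

```r
# Soft-thresholding operator used in the LASSO coordinate update.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Minimise 0.5 * (b - beta_hat)' A (b - beta_hat) + lambda * sum(abs(b))
# by rewriting the quadratic form as a least-squares problem.
# beta_hat, A and lambda are hypothetical inputs, not package objects.
lsa_lasso <- function(beta_hat, A, lambda, max_iter = 1000, tol = 1e-10) {
  R <- chol(A)                  # upper triangular factor, t(R) %*% R equals A
  X <- R                        # pseudo design matrix
  y <- drop(R %*% beta_hat)     # pseudo response
  b <- beta_hat                 # start from the unpenalized estimate
  col_ss <- colSums(X^2)        # x_j' x_j for every coordinate
  for (iter in seq_len(max_iter)) {
    b_old <- b
    for (j in seq_along(b)) {
      # Partial residual with coordinate j removed from the fit.
      r_j <- y - X[, -j, drop = FALSE] %*% b[-j]
      b[j] <- soft_threshold(sum(X[, j] * r_j), lambda) / col_ss[j]
    }
    if (max(abs(b - b_old)) < tol) break   # stop once a full sweep changes nothing
  }
  b
}

# Toy inputs, chosen only to exercise the function.
A <- matrix(c(2, 0.5, 0,
              0.5, 2, 0.5,
              0, 0.5, 2), nrow = 3, byrow = TRUE)
beta_hat <- c(1.0, 0.05, -0.8)
lsa_lasso(beta_hat, A, lambda = 0.3)
```

Because the pseudo design matrix is only p by p, each sweep costs the same regardless of the sample size, which is one reason this kind of formulation scales well. With lambda = 0 the function returns beta_hat unchanged.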
Installation
You can install the development version of aftPenCDA
from GitHub:
# install.packages("devtools")  # run first if devtools is not installed
devtools::install_github("seonsy/aftPenCDA")
Main functions
The main functions in aftPenCDA are:
aftpen(): penalized AFT model for right-censored data
aftpen_pic(): penalized AFT model for clustered partly interval-censored data
Both functions support the following penalty types through the type
argument:
"BAR": broken adaptive ridge penalty
"LASSO": LASSO penalty
"ALASSO": adaptive LASSO penalty
"SCAD": smoothly clipped absolute deviation penalty
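For readers who want to see what the four penalties look like, the sketch below writes them as plain R functions of a single coefficient. These follow the textbook definitions in the references; how aftPenCDA scales lambda, which initial estimate it uses for the adaptive weights, and the constant a = 3.7 in SCAD are assumptions here, and all function names are hypothetical.

```r
# LASSO: lambda * |b|
pen_lasso <- function(b, lambda) lambda * abs(b)

# Adaptive LASSO: LASSO with weights 1 / |b_init|^gamma taken from an
# initial estimate b_init (gamma = 1 is a common default).
pen_alasso <- function(b, lambda, b_init, gamma = 1) {
  lambda * abs(b) / abs(b_init)^gamma
}

# SCAD (Fan and Li, 2001): linear near zero, quadratic in between,
# constant for large |b| so that large coefficients are not shrunk.
pen_scad <- function(b, lambda, a = 3.7) {
  t <- abs(b)
  ifelse(t <= lambda,
         lambda * t,
         ifelse(t <= a * lambda,
                (2 * a * lambda * t - t^2 - lambda^2) / (2 * (a - 1)),
                lambda^2 * (a + 1) / 2))
}

# BAR: one step of an iteratively reweighted ridge penalty, where b_prev
# is the estimate from the previous iteration.
pen_bar <- function(b, lambda, b_prev) lambda * b^2 / b_prev^2

# Evaluate the SCAD penalty at a small, a moderate and a large coefficient.
pen_scad(c(0.05, 0.2, 1), lambda = 0.1)
```

Unlike the other three, BAR is not a fixed penalty: the ridge weights are updated at every iteration, and coefficients whose previous estimate is close to zero are penalized increasingly heavily.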
Example 1: Right-censored data
We use the example right-censored dataset simdat_rc included in the
package and fit the penalized estimator with the BAR penalty.
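library(aftPenCDA)  # attach the package
data("simdat_rc")   # example right-censored data shipped with the package
fit_bar <- aftpen(simdat_rc, lambda = 0.3, se = "CF", type = "BAR")
fit_bar$beta        # estimated regression coefficients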
Other penalties are also available.
fit_lasso <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "LASSO")
fit_alasso <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "ALASSO")
fit_scad <- aftpen(simdat_rc, lambda = 0.1, se = "CF", type = "SCAD")
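Once several fits are available, it is often convenient to view their coefficients side by side. The helper below is a small sketch, not part of aftPenCDA: compare_beta is a hypothetical name, it assumes only that each fit stores its coefficients in a beta component (as used above), and the demo objects hold placeholder numbers rather than package output so that the block runs on its own.

```r
# Collect the coefficient vectors of several fits into one matrix,
# one column per fit. Each element of `fits` only needs a `beta` component.
compare_beta <- function(fits, digits = 3) {
  round(sapply(fits, function(f) f$beta), digits)
}

# Stand-in objects with placeholder values (NOT results from aftpen());
# they only mimic the `beta` component of a fitted object.
demo_fits <- list(
  BAR   = list(beta = c(1.02, 0, -0.48)),
  LASSO = list(beta = c(0.91, 0.03, -0.40))
)
compare_beta(demo_fits)

# With the package loaded, the same helper would be applied as:
# compare_beta(list(BAR = fit_bar, LASSO = fit_lasso,
#                   ALASSO = fit_alasso, SCAD = fit_scad))
```

Rows correspond to covariates and columns to penalties, so coefficients set exactly to zero by one penalty but not another are easy to spot.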
Example 2: Clustered partly interval-censored data
We use the example clustered partly interval-censored dataset
simdat_pic included in the package and fit the model with the BAR
penalty.
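data("simdat_pic")  # example clustered partly interval-censored data
fit_pic <- aftpen_pic(simdat_pic, lambda = 0.0005, se = "CF", type = "BAR")
fit_pic$beta        # estimated regression coefficients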
Other penalties are also available for partly interval-censored
data.
fit_pic_lasso <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "LASSO")
fit_pic_alasso <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "ALASSO")
fit_pic_scad <- aftpen_pic(simdat_pic, lambda = 0.001, se = "CF", type = "SCAD")
Variance estimation
The argument se specifies the variance estimation method:
"CF": closed-form estimator
"ZL": perturbation-based (resampling) estimator (cf. Zeng and Lin, 2008)
For example:
fit_zl <- aftpen(simdat_rc, lambda = 0.1, se = "ZL", type = "BAR")
References
Wang, Y.-G., and Y. Zhao (2008). “Weighted Rank Regression for
Clustered Data Analysis.” Biometrics 64(1), 39–45.
Dai, L., K. Chen, Z. Sun, Z. Liu, and G. Li (2018). “Broken Adaptive
Ridge Regression and Its Asymptotic Properties.” Journal of
Multivariate Analysis 168, 334–351.
Zeng, D., and D. Y. Lin (2008). “Efficient Resampling Methods for
Nonsmooth Estimating Functions.” Biostatistics 9(2), 355–363.
Tibshirani, R. (1996). “Regression Shrinkage and Selection via the
Lasso.” Journal of the Royal Statistical Society: Series B 58(1),
267–288.
Fan, J., and R. Li (2001). “Variable Selection via Nonconcave
Penalized Likelihood and Its Oracle Properties.” Journal of the
American Statistical Association 96(456), 1348–1360.
Zou, H. (2006). “The Adaptive Lasso and Its Oracle Properties.”
Journal of the American Statistical Association 101(476), 1418–1429.