--- title: "Multiscale GLasso" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Multiscale GLasso} %\VignetteEngine{knitr::knitr} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE ) ``` ## Method This repository provides an implementation of the `MGLasso` (Multiscale Graphical Lasso) algorithm: an approach for estimating sparse Gaussian Graphical Models with the addition of a group-fused Lasso penalty. `MGLasso` is described in the paper [Inference of Multiscale Gaussian Graphical Model](https://desanou.github.io/multiscale_glasso/). `MGLasso` has these contributions: - We simultaneously infer a network and estimate a clustering structure by combining the [neighborhood selection](https://arxiv.org/abs/math/0608017) approach (Meinshausen and Bühlman, 2006) and [convex clustering](https://www.di.ens.fr/~fbach/419_icmlpaper.pdf) (Hocking et al. 2011). - We use a continuation with Nesterov smoothing in a shrinkage-thresholding algorithm ([`CONESTA`](https://arxiv.org/abs/1605.09658), Hadj-Selem et al. 2018) to solve the optimization problem. To solve the `MGLasso` problem, we seek the regression vectors $\beta^i$ that minimize $J_{\lambda_1, \lambda_2}(\boldsymbol{\beta}; \mathbf{X} ) = \frac{1}{2} \sum_{i=1}^p \left \lVert \mathbf{X}^i - \mathbf{X}^{\setminus i} \boldsymbol{\beta}^i \right \rVert_2 ^2 + \lambda_1 \sum_{i = 1}^p \left \lVert \boldsymbol{\beta}^i \right \rVert_1 + \lambda_2 \sum_{i < j} \left \lVert \boldsymbol{\beta}^i - \tau_{ij}(\boldsymbol{\beta}^j) \right \rVert_2.$ `MGLasso` package is based on the python implementation of the solver `CONESTA` available in [pylearn-parsimony](https://github.com/neurospin/pylearn-parsimony) library. ## Package requirements and installation - Install the `reticulate` package and Miniconda if no conda distribution available on the OS. ```{r eval = FALSE} install.packages('reticulate') reticulate::install_miniconda() ``` - Install `MGLasso`, its python dependencies and configure the conda environment `rmglasso`. ```{r eval = FALSE} # install.packages('mglasso') remotes::install_github("desanou/mglasso") ``` ```{r} library(mglasso) install_pylearn_parsimony(envname = "rmglasso", method = "conda") reticulate::use_condaenv("rmglasso", required = TRUE) reticulate::py_config() ``` The `conesta_solver` is delay loaded. See `reticulate::import_from_path` for details. An example of use is given below. ## Illustration on a simple model ### Simulate a block diagonal model We simulate a $3$-block diagonal model where each block contains $3$ variables. The intra-block correlation level is set to $0.85$ while the correlations outside the blocks are kept to $0$. ```{r } library(Matrix) n = 50 K = 3 p = 9 rho = 0.85 blocs <- list() for (j in 1:K) { bloc <- matrix(rho, nrow = p/K, ncol = p/K) for(i in 1:(p/K)) { bloc[i,i] <- 1 } blocs[[j]] <- bloc } mat.correlation <- Matrix::bdiag(blocs) corrplot::corrplot(as.matrix(mat.correlation), method = "color", tl.col="black") ``` #### Simulate gaussian data from the covariance matrix ```{r } set.seed(11) X <- mvtnorm::rmvnorm(n, mean = rep(0,p), sigma = as.matrix(mat.correlation)) colnames(X) <- LETTERS[1:9] ``` ### Run `mglasso()` We set the sparsity level $\lambda_1$ to $0.2$ and rescaled it with the size of the sample. ```{r } X <- scale(X) res <- mglasso(X, lambda1 = 0.2*n, lambda2_start = 0.1, fuse_thresh = 1e-3, verbose = FALSE) ``` To launch a unique run of the objective function call the `conesta` function. ```{r} temp <- mglasso::conesta(X, lam1 = 0.2*n, lam2 = 0.1) ``` #### Estimated clustering path We plot the clustering path of `mglasso` method on the 2 principal components axis of $X$. The path is drawn on the predicted $X$'s. ```{r} library(ggplot2) library(ggrepel) mglasso:::plot_clusterpath(as.matrix(X), res) ``` #### Estimated adjacency matrices along the clustering path As the the fusion penalty increases from `level9` to `level1` we observe a progressive fusion of adjacent edges. ```{r} plot_mglasso(res) ``` ## Reference Edmond, Sanou; Christophe, Ambroise; Geneviève, Robin; (2022): Inference of Multiscale Gaussian Graphical Model. ArXiv. Preprint.