Type: | Package |
Title: | NPLS Regression with L1 Penalization |
Version: | 1.0.27 |
Author: | David Hervas |
Maintainer: | David Hervas <ddhervas@yahoo.es> |
Depends: | R (≥ 2.10) |
Imports: | clickR, future, future.apply, ggplot2, ggrepel, ks, MASS, Matrix, pbapply |
Description: | Tools for performing variable selection in three-way data using N-PLS in combination with L1 penalization, Selectivity Ratio and VIP scores. The N-PLS model (Rasmus Bro, 1996 <doi:10.1002/(SICI)1099-128X(199601)10:1%3C47::AID-CEM400%3E3.0.CO;2-C>) is the natural extension of PLS (Partial Least Squares) to N-way structures, and tries to maximize the covariance between X and Y data arrays. The package also adds variable selection through L1 penalization, Selectivity Ratio and VIP scores. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.1.1 |
NeedsCompilation: | no |
Packaged: | 2020-12-16 10:18:46 UTC; aghil |
Repository: | CRAN |
Date/Publication: | 2020-12-16 12:50:02 UTC |
R-matrix from a sNPLS model fit
Description
Builds the R-matrix from a sNPLS model fit
Usage
Rmatrix(x)
Arguments
x |
A sNPLS model obtained from |
Value
Returns the R-matrix of the model, needed to compute the coefficients
Compute Selectivity Ratio for a sNPLS model
Description
Estimates Selectivity Ratio for the different components of a sNPLS model fit
Usage
SR(model)
Arguments
model |
A sNPLS model |
Value
A list of data.frames, each of them including the computed Selectivity Ratios for each variable
Bread data
Description
Evaluation of ten breads with respect to eleven attributes by eight judges (Xbread). The outcome is the salt content of each bread (Ybread).
Usage
data(bread)
Format
An object of class list of length 2.
References
Bro, R. Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications. 1998. PhD thesis, University of Amsterdam (NL) & Royal Veterinary and Agricultural University (DK).
Coefficients from a sNPLS model
Description
Extract coefficients from a sNPLS model
Usage
## S3 method for class 'sNPLS'
coef(object, as.matrix = FALSE, ...)
Arguments
object |
A sNPLS model fit |
as.matrix |
Should the coefficients be presented as matrix or vector? |
... |
Further arguments passed to |
Value
A matrix (or vector) of coefficients
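The relationship between the vector and matrix forms of the coefficients can be illustrated with a small base R sketch. It assumes that the vector form follows a column-major unfolding of the second and third modes (variables varying fastest); this ordering is an assumption made for illustration, not something confirmed here for the package. The names b_vec and B_mat are hypothetical.

```r
set.seed(1)
I <- 4; J <- 3; K <- 2
X <- array(rnorm(I * J * K), dim = c(I, J, K))

# Hypothetical coefficients: one value per (variable, time) pair
b_vec <- rnorm(J * K)                       # vector form, length J*K
B_mat <- matrix(b_vec, nrow = J, ncol = K)  # same values as a J x K matrix

# Unfold X to I x (J*K) by resetting its dimensions (column-major order)
X_unf <- X
dim(X_unf) <- c(I, J * K)

# Both forms give the same linear predictor
pred_vec <- as.numeric(X_unf %*% b_vec)
pred_mat <- sapply(seq_len(I), function(i) sum(X[i, , ] * B_mat))
```

Under this assumed ordering, the matrix form is easier to read (rows are variables, columns are 'times'), while the vector form multiplies directly with the unfolded array.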
Internal function for cv_snpls
Description
Internal function for cv_snpls
Usage
cv_fit(
xtrain,
ytrain,
xval,
yval,
ncomp,
threshold_j = NULL,
threshold_k = NULL,
keepJ = NULL,
keepK = NULL,
method,
...
)
Arguments
xtrain |
A three-way training array |
ytrain |
A response training matrix |
xval |
A three-way test array |
yval |
A response test matrix |
ncomp |
Number of components for the sNPLS model |
threshold_j |
Threshold value on Wj. Scaled between [0, 1) |
threshold_k |
Threshold value on Wk. Scaled between [0, 1) |
keepJ |
Number of variables to keep for each component, ignored if threshold_j is provided |
keepK |
Number of 'times' to keep for each component, ignored if threshold_k is provided |
method |
Select between sNPLS, sNPLS-SR or sNPLS-VIP |
... |
Further arguments passed to sNPLS |
Value
Returns the CV mean squared error
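As a rough illustration of how a cross-validated mean squared error can be obtained for three-way data, the base R sketch below splits the first mode (samples) into folds and averages the per-fold errors. It is not the package's internal code: the fold assignment and the stand-in "model" (predicting the training mean) are assumptions chosen only to keep the example self-contained.

```r
set.seed(42)
n <- 20
X <- array(rnorm(n * 5 * 3), dim = c(n, 5, 3))
Y <- matrix(rnorm(n), ncol = 1)

nfold <- 5
# Random, balanced assignment of samples to folds
fold_id <- sample(rep(seq_len(nfold), length.out = n))

mse_per_fold <- vapply(seq_len(nfold), function(f) {
  val <- fold_id == f
  # Subset along the first mode, keeping the three-way structure
  xtrain <- X[!val, , , drop = FALSE]
  ytrain <- Y[!val, , drop = FALSE]
  xval   <- X[val, , , drop = FALSE]
  yval   <- Y[val, , drop = FALSE]
  stopifnot(dim(xval)[1] == nrow(yval), dim(xtrain)[1] == nrow(ytrain))
  # Stand-in model (hypothetical): predict the training mean of Y
  pred <- matrix(colMeans(ytrain), nrow = nrow(yval),
                 ncol = ncol(yval), byrow = TRUE)
  mean((yval - pred)^2)
}, numeric(1))

cv_mse <- mean(mse_per_fold)
```

In practice the stand-in model would be replaced by a fit on xtrain and ytrain followed by a prediction on xval.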
Cross-validation for a sNPLS model
Description
Performs cross-validation for a sNPLS model
Usage
cv_snpls(
X_npls,
Y_npls,
ncomp = 1:3,
samples = 20,
keepJ = NULL,
keepK = NULL,
nfold = 10,
parallel = TRUE,
method = "sNPLS",
...
)
Arguments
X_npls |
A three-way array containing the predictors. |
Y_npls |
A matrix containing the response. |
ncomp |
A vector with the different number of components to test |
samples |
Number of samples for performing random search in continuous thresholding |
keepJ |
A vector with the different number of selected variables to test for discrete thresholding |
keepK |
A vector with the different number of selected 'times' to test for discrete thresholding |
nfold |
Number of folds for the cross-validation |
parallel |
Should the computations be performed in parallel? Set up strategy first with |
method |
Select between sNPLS, sNPLS-SR or sNPLS-VIP |
... |
Further arguments passed to sNPLS |
Value
A list with the best parameters for the model and the CV error
Examples
## Not run:
X_npls<-array(rpois(7500, 10), dim=c(50, 50, 3))
Y_npls<-matrix(2+0.4*X_npls[,5,1]+0.7*X_npls[,10,1]-0.9*X_npls[,15,1]+
0.6*X_npls[,20,1]- 0.5*X_npls[,25,1]+rnorm(50), ncol=1)
#Grid search for discrete thresholding
cv1<- cv_snpls(X_npls, Y_npls, ncomp=1:2, keepJ = 1:3, keepK = 1:2, parallel = FALSE)
#Random search for continuous thresholding
cv2<- cv_snpls(X_npls, Y_npls, ncomp=1:2, samples=20, parallel = FALSE)
## End(Not run)
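The two search modes differ in how candidate parameter sets are built. The base R sketch below shows one plausible construction: a full grid over ncomp, keepJ and keepK for discrete thresholding, and random draws of thresholds in [0, 1) for continuous thresholding. The object names grid and random, and the use of uniform draws, are assumptions for illustration and may not match the package's internals.

```r
set.seed(123)
ncomp <- 1:2
keepJ <- 1:3
keepK <- 1:2

# Discrete thresholding: every combination is evaluated
grid <- expand.grid(ncomp = ncomp, keepJ = keepJ, keepK = keepK)

# Continuous thresholding: 'samples' random candidates
samples <- 20
random <- data.frame(
  ncomp       = ncomp[sample.int(length(ncomp), samples, replace = TRUE)],
  threshold_j = runif(samples),  # uniform draws between 0 and 1
  threshold_k = runif(samples)
)
```

The grid grows multiplicatively with the lengths of ncomp, keepJ and keepK, whereas the cost of the random search is fixed by samples.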
Fitted method for sNPLS models
Description
Fitted method for sNPLS models
Usage
## S3 method for class 'sNPLS'
fitted(object, ...)
Arguments
object |
A sNPLS model fit |
... |
Further arguments passed to |
Value
Fitted values for the sNPLS model
Plot cross validation results for sNPLS objects
Description
Plot function for visualization of cross validation results for sNPLS models
Usage
## S3 method for class 'cvsNPLS'
plot(x, ...)
Arguments
x |
A cvsNPLS object |
... |
Not used |
Value
A facet plot with the results of the cross validation
Density plot for repeat_cv results
Description
Plots a grid of slices from the 3-D kernel density estimates of the repeat_cv function
Usage
## S3 method for class 'repeatcv'
plot(x, ...)
Arguments
x |
A repeatcv object |
... |
Further arguments passed to plot |
Value
A grid of slices from a 3-D density plot of the results of the repeated cross-validation
Plots for sNPLS model fits
Description
Different plots for sNPLS model fits
Usage
## S3 method for class 'sNPLS'
plot(x, type = "T", comps = c(1, 2), labels = TRUE, group = NULL, ...)
Arguments
x |
A sNPLS model fit |
type |
The type of plot. One of those: "T", "U", "Wj", "Wk", "time" or "variables" |
comps |
Vector with the components to plot. It can be of length |
labels |
Should rownames be added as labels to the plot? |
group |
Vector with categorical variable defining groups (optional) |
... |
Not used |
Value
A plot of the type specified in the type parameter
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_T(x, comps, labels, group = NULL)
Arguments
x |
A sNPLS model fit |
comps |
A vector of length two with the components to plot |
labels |
Should rownames be added as labels to the plot? |
group |
Vector with categorical variable defining groups |
Value
A plot of the T matrix of a sNPLS model fit
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_U(x, comps, labels, group = NULL)
Arguments
x |
A sNPLS model fit |
comps |
A vector of length two with the components to plot |
labels |
Should rownames be added as labels to the plot? |
group |
Vector with categorical variable defining groups |
Value
A plot of the U matrix of a sNPLS model fit
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_Wj(x, comps, labels)
Arguments
x |
A sNPLS model fit |
comps |
A vector of length two with the components to plot |
labels |
Should rownames be added as labels to the plot? |
Value
A plot of Wj coefficients
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_Wk(x, comps, labels)
Arguments
x |
A sNPLS model fit |
comps |
A vector of length two with the components to plot |
labels |
Should rownames be added as labels to the plot? |
Value
A plot of the Wk coefficients
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_time(x, comps)
Arguments
x |
A sNPLS model fit |
comps |
A vector with the components to plot |
Value
A plot of Wk coefficients for each component
Internal function for plot.sNPLS
Description
Internal function for plot.sNPLS
Usage
plot_variables(x, comps)
Arguments
x |
A sNPLS model fit |
comps |
A vector with the components to plot |
Value
A plot of Wj coefficients for each component
Predict for sNPLS models
Description
Predict function for sNPLS models
Usage
## S3 method for class 'sNPLS'
predict(object, newX, rescale = TRUE, ...)
Arguments
object |
A sNPLS model fit |
newX |
A three-way array containing the new data |
rescale |
Should the prediction be rescaled to the original scale? |
... |
Further arguments passed to |
Value
A matrix with the predictions
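Rescaling a prediction to the original scale amounts to undoing the centering and unit variance scaling applied to the response. The base R sketch below shows the arithmetic, assuming the centers and scales were stored when the response was standardized; the variable names are hypothetical and the "predictions" are simply the scaled response itself.

```r
set.seed(7)
Y <- matrix(rnorm(10, mean = 50, sd = 5), ncol = 1)

# Standardize the response and keep the parameters used
Y_scaled <- scale(Y, center = TRUE, scale = TRUE)
y_center <- attr(Y_scaled, "scaled:center")
y_scale  <- attr(Y_scaled, "scaled:scale")

# Pretend these are predictions on the standardized scale (hypothetical)
pred_scaled <- matrix(as.numeric(Y_scaled), ncol = 1)

# Undo scaling, then centering, column by column
pred_original <- sweep(sweep(pred_scaled, 2, y_scale, "*"), 2, y_center, "+")
```

With rescale = FALSE the predictions would instead stay on the standardized scale, which matters when comparing them with an untransformed response.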
Repeated cross-validation for sNPLS models
Description
Performs repeated cross-validation and represents results in a plot
Usage
repeat_cv(
X_npls,
Y_npls,
ncomp = 1:3,
samples = 20,
keepJ = NULL,
keepK = NULL,
nfold = 10,
times = 30,
parallel = TRUE,
method = "sNPLS",
...
)
Arguments
X_npls |
A three-way array containing the predictors. |
Y_npls |
A matrix containing the response. |
ncomp |
A vector with the different number of components to test |
samples |
Number of samples for performing random search in continuous thresholding |
keepJ |
A vector with the different number of selected variables to test in discrete thresholding |
keepK |
A vector with the different number of selected 'times' to test in discrete thresholding |
nfold |
Number of folds for the cross-validation |
times |
Number of repetitions of the cross-validation |
parallel |
Should the computations be performed in parallel? Set up strategy first with |
method |
Select between sNPLS, sNPLS-SR or sNPLS-VIP |
... |
Further arguments passed to cv_snpls |
Value
A density plot with the results of the cross-validation and an (invisible) data.frame with these results
Fit a sNPLS model
Description
Fits an N-PLS regression model, imposing sparsity on the wj and wk matrices
Usage
sNPLS(
XN,
Y,
ncomp = 2,
threshold_j = 0.5,
threshold_k = 0.5,
keepJ = NULL,
keepK = NULL,
scale.X = TRUE,
center.X = TRUE,
scale.Y = TRUE,
center.Y = TRUE,
conver = 1e-16,
max.iteration = 10000,
silent = F,
method = "sNPLS"
)
Arguments
XN |
A three-way array containing the predictors. |
Y |
A matrix containing the response. |
ncomp |
Number of components in the projection |
threshold_j |
Threshold value on Wj. Scaled between [0, 1) |
threshold_k |
Threshold value on Wk. Scaled between [0, 1) |
keepJ |
Number of variables to keep for each component, ignored if threshold_j is provided |
keepK |
Number of 'times' to keep for each component, ignored if threshold_k is provided |
scale.X |
Perform unit variance scaling on X? |
center.X |
Perform mean centering on X? |
scale.Y |
Perform unit variance scaling on Y? |
center.Y |
Perform mean centering on Y? |
conver |
Convergence criterion |
max.iteration |
Maximum number of iterations |
silent |
Show output? |
method |
Select between L1 penalization (sNPLS), variable selection with Selectivity Ratio (sNPLS-SR) or variable selection with VIP (sNPLS-VIP) |
Value
A fitted sNPLS model
References
Andersson, C. A., & Bro, R. (2000). The N-way Toolbox for MATLAB. Chemometrics and Intelligent Laboratory Systems, 52(1), 1-4.
Hervas, D., Prats-Montalban, J. M., Garcia-Cañaveras, J. C., Lahoz, A., & Ferrer, A. (2019). Sparse N-way partial least squares by L1-penalization. Chemometrics and Intelligent Laboratory Systems, 185, 85-91.
Examples
X_npls<-array(rpois(7500, 10), dim=c(50, 50, 3))
Y_npls <- matrix(2+0.4*X_npls[,5,1]+0.7*X_npls[,10,1]-0.9*X_npls[,15,1]+
0.6*X_npls[,20,1]- 0.5*X_npls[,25,1]+rnorm(50), ncol=1)
#Discrete thresholding
fit <- sNPLS(X_npls, Y_npls, ncomp=3, keepJ = rep(2,3) , keepK = rep(1,3))
#Continuous thresholding
fit2 <- sNPLS(X_npls, Y_npls, ncomp=3, threshold_j=0.5, threshold_k=0.5)
#Use sNPLS-SR method
fit3 <- sNPLS(X_npls, Y_npls, ncomp=3, threshold_j=0.5, threshold_k=0.5, method="sNPLS-SR")
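To give some intuition for the two thresholding modes, the base R sketch below applies them to a single weight vector. It assumes that the continuous threshold is expressed as a fraction of the largest absolute weight and applied by soft-thresholding, and that discrete thresholding keeps the entries with the largest absolute value. Both helper functions are hypothetical illustrations, not functions exported by the package.

```r
# Continuous thresholding (assumed form): soft-threshold with a penalty
# equal to 'threshold' times the largest absolute weight
soft_threshold <- function(w, threshold) {
  lambda <- threshold * max(abs(w))
  sign(w) * pmax(abs(w) - lambda, 0)
}

# Discrete thresholding (assumed form): keep the 'keep' largest
# weights in absolute value and set the rest to zero
keep_top <- function(w, keep) {
  idx <- order(abs(w), decreasing = TRUE)[seq_len(keep)]
  out <- numeric(length(w))
  out[idx] <- w[idx]
  out
}

w <- c(-2, 0.5, 1, 4)
w_soft <- soft_threshold(w, 0.5)  # penalty is 0.5 * 4 = 2
w_keep <- keep_top(w, 2)          # keeps -2 and 4
```

A threshold of 0 leaves the weights untouched, and values approaching 1 shrink all but the largest weight to zero, which is consistent with the [0, 1) range given for threshold_j and threshold_k.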
Summary for sNPLS models
Description
Summary of a sNPLS model fit
Usage
## S3 method for class 'sNPLS'
summary(object, ...)
Arguments
object |
A sNPLS object |
... |
Further arguments passed to summary.default |
Value
A summary including number of components, squared error and coefficients of the fitted model
Unfolding of three-way arrays
Description
Unfolds a three-way array into a matrix
Usage
unfold3w(x)
Arguments
x |
A three-way array |
Value
Returns a matrix with dimensions dim(x)[1] x dim(x)[2]*dim(x)[3]
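A minimal base R sketch of such an unfolding is shown below. It assumes the frontal slices x[, , k] are placed side by side, which is what resetting the dimensions of a column-major array produces; the helper name unfold_sketch is hypothetical and the column ordering used by the package is not confirmed here.

```r
# Unfold an I x J x K array into an I x (J*K) matrix by resetting its
# dimensions; slices x[, , 1], x[, , 2], ... end up side by side
unfold_sketch <- function(x) {
  stopifnot(length(dim(x)) == 3)
  d <- dim(x)
  dim(x) <- c(d[1], d[2] * d[3])
  x
}

x <- array(1:24, dim = c(2, 3, 4))
m <- unfold_sketch(x)
```

Here m has 2 rows and 12 columns, and columns 4 to 6 of m hold the second slice x[, , 2].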