Title: | Expectile Regression in Reproducing Kernel Hilbert Space |
Version: | 1.0.0 |
Date: | 2015-8-26 |
Author: | Yi Yang, Teng Zhang, Hui Zou |
Maintainer: | Yi Yang <yiyang@umn.edu> |
Depends: | methods |
Description: | An efficient algorithm inspired by majorization-minimization principle for solving the entire solution path of a flexible nonparametric expectile regression estimator constructed in a reproducing kernel Hilbert space. |
License: | GPL-2 |
Packaged: | 2015-08-27 13:38:41 UTC; emeryyi |
Repository: | CRAN |
Date/Publication: | 2015-08-28 00:35:09 |
NeedsCompilation: | yes |
Fits the regularization paths for the kernel expectile regression.
Description
Fits a regularization path for the kernel expectile regression at a sequence of regularization parameters lambda.
Usage
KERE(x, y, kern, lambda = NULL, eps = 1e-08, maxit = 1e4,
omega = 0.5, gamma = 1e-06, option = c("fast", "normal"))
Arguments
x |
matrix of predictors, of dimension |
y |
response variable. |
kern |
the built-in kernel classes in KERE.
The
Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
lambda |
a user supplied |
eps |
convergence threshold for majorization minimization algorithm. Each majorization descent loop continues until the relative change in any coefficient |
maxit |
maximum number of loop iterations allowed at fixed lambda value. Default is 1e4. If models do not converge, consider increasing |
omega |
the parameter |
gamma |
a scalar number. If it is specified, the number will be added to each diagonal element of the kernel matrix as perturbation. The default is |
option |
users can choose which method to use to update the inverse matrix in the MM algorithm. |
Details
Note that the objective function in KERE
is
Loss(y- \alpha_0 - K * \alpha )) + \lambda * \alpha^T * K * \alpha,
where the \alpha_0
is the intercept, \alpha
is the solution vector, and K
is the kernel matrix with K_{ij}=K(x_i,x_j)
. Users can specify the kernel function to use, options include Radial Basis kernel, Polynomial kernel, Linear kernel, Hyperbolic tangent kernel, Laplacian kernel, Bessel kernel, ANOVA RBF kernel, the Spline kernel. Users can also tweak the penalty by choosing different lambda
.
For computing speed reason, if models are not converging or running slow, consider increasing eps
before increasing maxit
.
Value
An object with S3 class KERE
.
call |
the call that produced this object. |
alpha |
a |
lambda |
the actual sequence of |
npass |
total number of loop iterations corresponding to each lambda value. |
jerr |
error flag, for warnings and errors, 0 if no error. |
Author(s)
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <yiyang@umn.edu>
References
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
Examples
# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)
# set gaussian kernel
kern <- rbfdot(sigma=0.1)
# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))
# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5)
# plot the solution paths
plot(m1)
Assing kernelMatrix class to matrix objects
Description
as.kernelMatrix
in package KERE can be used
to coerce the kernelMatrix class to matrix objects representing a
kernel matrix. These matrices can then be used with the kernelMatrix
interfaces which most of the functions in KERE support.
Usage
## S4 method for signature 'matrix'
as.kernelMatrix(x, center = FALSE)
Arguments
x |
matrix to be assigned the |
center |
center the kernel matrix in feature space (default: FALSE) |
Author(s)
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
See Also
Examples
## Create toy data
x <- rbind(matrix(rnorm(10),,2),matrix(rnorm(10,mean=3),,2))
y <- matrix(c(rep(1,5),rep(-1,5)))
### Use as.kernelMatrix to label the cov. matrix as a kernel matrix
### which is eq. to using a linear kernel
K <- as.kernelMatrix(crossprod(t(x)))
K
Cross-validation for KERE
Description
Does k-fold cross-validation for KERE
, produces a plot, and returns a value for lambda
.
Usage
## S3 method for class 'KERE'
cv(x, y, kern, lambda = NULL, nfolds = 5, foldid, omega = 0.5, ...)
Arguments
x |
matrix of predictors, of dimension |
y |
response variable. |
kern |
the built-in kernel classes in KERE.
The
Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
lambda |
a user supplied |
nfolds |
number of folds - default is 5. Although |
foldid |
an optional vector of values between 1 and |
omega |
the parameter |
... |
other arguments that can be passed to |
Details
The function runs KERE
nfolds
+1 times; the
first to get the lambda
sequence, and then the remainder to
compute the fit with each of the folds omitted. The average error and standard deviation over the
folds are computed.
Value
an object of class cv.KERE
is returned, which is a
list with the ingredients of the cross-validation fit.
lambda |
the values of |
cvm |
the mean cross-validated error - a vector of length
|
cvsd |
estimate of standard error of |
cvupper |
upper curve = |
cvlo |
lower curve = |
name |
a character string "Expectile Loss" |
lambda.min |
the optimal value of |
cvm.min |
the minimum
cross validation error |
Author(s)
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <yiyang@umn.edu>
References
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
Examples
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)
# set gaussian kernel
kern <- rbfdot(sigma=0.1)
# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))
cv.KERE(x=X, y=Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)
Kernel Functions
Description
The kernel generating functions provided in KERE.
The Gaussian RBF kernel k(x,x') = \exp(-\sigma \|x - x'\|^2)
The Polynomial kernel k(x,x') = (scale <x, x'> + offset)^{degree}
The Linear kernel k(x,x') = <x, x'>
The Hyperbolic tangent kernel k(x, x') = \tanh(scale <x, x'> + offset)
The Laplacian kernel k(x,x') = \exp(-\sigma \|x - x'\|)
The Bessel kernel k(x,x') = (- Bessel_{(\nu+1)}^n \sigma \|x - x'\|^2)
The ANOVA RBF kernel k(x,x') = \sum_{1\leq i_1 \ldots < i_D \leq
N} \prod_{d=1}^D k(x_{id}, {x'}_{id})
where k(x,x) is a Gaussian
RBF kernel.
The Spline kernel \prod_{d=1}^D 1 + x_i x_j + x_i x_j min(x_i,
x_j) - \frac{x_i + x_j}{2} min(x_i,x_j)^2 +
\frac{min(x_i,x_j)^3}{3}
\
Usage
rbfdot(sigma = 1)
polydot(degree = 1, scale = 1, offset = 1)
tanhdot(scale = 1, offset = 1)
vanilladot()
laplacedot(sigma = 1)
besseldot(sigma = 1, order = 1, degree = 1)
anovadot(sigma = 1, degree = 1)
splinedot()
Arguments
sigma |
The inverse kernel width used by the Gaussian the Laplacian, the Bessel and the ANOVA kernel |
degree |
The degree of the polynomial, bessel or ANOVA kernel function. This has to be an positive integer. |
scale |
The scaling parameter of the polynomial and tangent kernel is a convenient way of normalizing patterns without the need to modify the data itself |
offset |
The offset used in a polynomial or hyperbolic tangent kernel |
order |
The order of the Bessel function to be used as a kernel |
Details
The kernel generating functions are used to initialize a kernel
function
which calculates the dot (inner) product between two feature vectors in a
Hilbert Space. These functions can be passed as a kernel
argument on almost all
functions in KERE(e.g., ksvm
, kpca
etc).
Although using one of the existing kernel functions as a
kernel
argument in various functions in KERE has the
advantage that optimized code is used to calculate various kernel expressions,
any other function implementing a dot product of class kernel
can also be used as a kernel
argument. This allows the user to use, test and develop special kernels
for a given data set or algorithm.
Value
Return an S4 object of class kernel
which extents the
function
class. The resulting function implements the given
kernel calculating the inner (dot) product between two vectors.
kpar |
a list containing the kernel parameters (hyperparameters) used. |
The kernel parameters can be accessed by the kpar
function.
Note
If the offset in the Polynomial kernel is set to 0, we obtain homogeneous polynomial
kernels, for positive values, we have inhomogeneous
kernels. Note that for negative values the kernel does not satisfy Mercer's
condition and thus the optimizers may fail.
In the Hyperbolic tangent kernel if the offset is negative the likelihood of obtaining a kernel matrix that is not positive definite is much higher (since then even some diagonal elements may be negative), hence if this kernel has to be used, the offset should always be positive. Note, however, that this is no guarantee that the kernel will be positive.
Author(s)
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
See Also
kernelMatrix
, kernelMult
, kernelPol
Examples
rbfkernel <- rbfdot(sigma = 0.1)
rbfkernel
kpar(rbfkernel)
## create two vectors
x <- rnorm(10)
y <- rnorm(10)
## calculate dot product
rbfkernel(x,y)
Class "kernel" "rbfkernel" "polykernel", "tanhkernel", "vanillakernel"
Description
The built-in kernel classes in KERE
Objects from the Class
Objects can be created by calls of the form new("rbfkernel")
,
new{"polykernel"}
, new{"tanhkernel"}
,
new{"vanillakernel"}
, new{"anovakernel"}
,
new{"besselkernel"}
, new{"laplacekernel"}
,
new{"splinekernel"}
or by calling the rbfdot
, polydot
, tanhdot
,
vanilladot
, anovadot
, besseldot
, laplacedot
,
splinedot
functions etc..
Slots
.Data
:Object of class
"function"
containing the kernel functionkpar
:Object of class
"list"
containing the kernel parameters
Extends
Class "kernel"
, directly.
Class "function"
, by class "kernel"
.
Methods
- kernelMatrix
signature(kernel = "rbfkernel", x = "matrix")
: computes the kernel matrix- kernelMult
signature(kernel = "rbfkernel", x = "matrix")
: computes the quadratic kernel expression- kernelPol
signature(kernel = "rbfkernel", x = "matrix")
: computes the kernel expansion- kernelFast
signature(kernel = "rbfkernel", x = "matrix"),,a
: computes parts or the full kernel matrix, mainly used in kernel algorithms where columns of the kernel matrix are computed per invocation
Author(s)
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
See Also
Examples
rbfkernel <- rbfdot(sigma = 0.1)
rbfkernel
is(rbfkernel)
kpar(rbfkernel)
Kernel Matrix functions
Description
kernelMatrix
calculates the kernel matrix K_{ij} = k(x_i,x_j)
or K_{ij} =
k(x_i,y_j)
.
kernelPol
computes the quadratic kernel expression H = z_i z_j
k(x_i,x_j)
, H = z_i k_j k(x_i,y_j)
.
kernelMult
calculates the kernel expansion f(x_i) =
\sum_{i=1}^m z_i k(x_i,x_j)
kernelFast
computes the kernel matrix, identical
to kernelMatrix
, except that it also requires the squared
norm of the first argument as additional input, useful in iterative
kernel matrix calculations.
Usage
## S4 method for signature 'kernel'
kernelMatrix(kernel, x, y = NULL)
## S4 method for signature 'kernel'
kernelPol(kernel, x, y = NULL, z, k = NULL)
## S4 method for signature 'kernel'
kernelMult(kernel, x, y = NULL, z, blocksize = 256)
## S4 method for signature 'kernel'
kernelFast(kernel, x, y, a)
Arguments
kernel |
the kernel function to be used to calculate the kernel
matrix.
This has to be a function of class |
x |
a data matrix to be used to calculate the kernel matrix. |
y |
second data matrix to calculate the kernel matrix. |
z |
a suitable vector or matrix |
k |
a suitable vector or matrix |
a |
the squared norm of |
blocksize |
the kernel expansion computations are done block wise
to avoid storing the kernel matrix into memory. |
Details
Common functions used during kernel based computations.
The kernel
parameter can be set to any function, of class
kernel, which computes the inner product in feature space between two
vector arguments. KERE provides the most popular kernel functions
which can be initialized by using the following
functions:
-
rbfdot
Radial Basis kernel function -
polydot
Polynomial kernel function -
vanilladot
Linear kernel function -
tanhdot
Hyperbolic tangent kernel function -
laplacedot
Laplacian kernel function -
besseldot
Bessel kernel function -
anovadot
ANOVA RBF kernel function -
splinedot
the Spline kernel
(see example.)
kernelFast
is mainly used in situations where columns of the
kernel matrix are computed per invocation. In these cases,
evaluating the norm of each row-entry over and over again would
cause significant computational overhead.
Value
kernelMatrix
returns a symmetric diagonal semi-definite matrix.
kernelPol
returns a matrix.
kernelMult
usually returns a one-column matrix.
Author(s)
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
See Also
rbfdot
, polydot
,
tanhdot
, vanilladot
Examples
## use the spam data
x <- matrix(rnorm(10*10),10,10)
## initialize kernel function
rbf <- rbfdot(sigma = 0.05)
rbf
## calculate kernel matrix
kernelMatrix(rbf, x)
y <- matrix(rnorm(10*1),10,1)
## calculate the quadratic kernel expression
kernelPol(rbf, x, ,y)
## calculate the kernel expansion
kernelMult(rbf, x, ,y)
Plot coefficients from a "KERE" object
Description
Produces a coefficient profile plot of the coefficient paths for a
fitted KERE
object.
Usage
## S3 method for class 'KERE'
plot(x, ...)
Arguments
x |
fitted |
... |
other graphical parameters to plot. |
Details
A coefficient profile plot is produced. The x-axis is log(\lambda)
. The y-axis is the value of fitted \alpha
's.
Author(s)
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <yiyang@umn.edu>
References
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
Examples
# create data
N <- 200
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)
# set gaussian kernel
kern <- rbfdot(sigma=0.1)
# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))
# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5)
# plot the solution paths
plot(m1)
make predictions from a "KERE" object.
Description
Similar to other predict methods, this functions predicts fitted values and class labels from a fitted KERE
object.
Usage
## S3 method for class 'KERE'
predict(object, kern, x, newx,...)
Arguments
object |
fitted |
kern |
the built-in kernel classes in KERE. Objects can be created by calling the rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot functions etc. (see example.) |
x |
the original design matrix for training |
newx |
matrix of new values for |
... |
other parameters to |
Details
The fitted \alpha_0 + K * \alpha
at newx is returned as a size nrow(newx)*length(lambda)
matrix for various lambda values where the KERE
model was fitted.
Value
The fitted \alpha_0 + K * \alpha
is returned as a size nrow(newx)*length(lambda)
matrix. The row represents the index for observations of newx. The column represents the index for the lambda sequence.
Author(s)
Yi Yang, Teng Zhang and Hui Zou
Maintainer: Yi Yang <yiyang@umn.edu>
References
Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.
Examples
# create data
N <- 100
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- 3*runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + X1*X3
sigma <- sqrt(var(Y)/SNR)
Y <- Y + X2*rnorm(N,0,sigma)
X <- cbind(X1,X2,X3)
# set gaussian kernel
kern <- rbfdot(sigma=0.1)
# define lambda sequence
lambda <- exp(seq(log(0.5),log(0.01),len=10))
# run KERE
m1 <- KERE(x=X, y=Y, kern=kern, lambda = lambda, omega = 0.5)
# create newx for prediction
N1 <- 5
X1 <- runif(N1)
X2 <- 2*runif(N1)
X3 <- 3*runif(N1)
newx <- cbind(X1,X2,X3)
# make prediction
p1 <- predict.KERE(m1, kern, X, newx)
p1