| Type: | Package |
| Title: | Factor-Adjusted Robust Multiple Testing |
| Version: | 2.2.0 |
| Date: | 2020-09-06 |
| Description: | Performs robust multiple testing for means in the presence of known and unknown latent factors presented in Fan et al. (2019) "FarmTest: Factor-Adjusted Robust Multiple Testing With Approximate False Discovery Control" <doi:10.1080/01621459.2018.1527700>. Implements a series of adaptive Huber methods combined with fast data-driven tuning schemes proposed in Ke et al. (2019) "User-Friendly Covariance Estimation for Heavy-Tailed Distributions" <doi:10.1214/19-STS711> to estimate model parameters and construct test statistics that are robust against heavy-tailed and/or asymmetric error distributions. Extensions to two-sample simultaneous mean comparison problems are also included. As by-products, this package contains functions that compute adaptive Huber mean, covariance and regression estimators that are of independent interest. |
| Depends: | R (≥ 3.6.0) |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| URL: | https://github.com/XiaoouPan/FarmTest |
| SystemRequirements: | C++11 |
| Imports: | Rcpp, graphics |
| LinkingTo: | Rcpp, RcppArmadillo |
| RoxygenNote: | 7.1.1 |
| NeedsCompilation: | yes |
| Packaged: | 2020-09-07 01:29:51 UTC; xopan |
| Author: | Xiaoou Pan [aut, cre], Yuan Ke [aut], Wen-Xin Zhou [aut] |
| Maintainer: | Xiaoou Pan <xip024@ucsd.edu> |
| Repository: | CRAN |
| Date/Publication: | 2020-09-07 05:00:03 UTC |
FarmTest: Factor-Adjusted Robust Multiple Testing
Description
The FarmTest package performs robust multiple testing for means in the presence of known and unknown latent factors (Fan et al., 2019). It implements a series of adaptive Huber methods combined with fast data-driven tuning schemes (Wang et al., 2020; Ke et al., 2019) to estimate model parameters and construct test statistics that are robust against heavy-tailed and/or asymmetric error distributions. Extensions to two-sample simultaneous mean comparison problems are also included. As by-products, this package also contains functions that compute adaptive Huber mean, covariance and regression estimators that are of independent interest.
Details
See its GitHub page https://github.com/XiaoouPan/FarmTest for details.
References
Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3) 1203–1227.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol., 57 289–300.
Bose, K., Fan, J., Ke, Y., Pan, X. and Zhou, W.-X. (2019). FarmTest: An R package for factor-adjusted robust multiple testing, Preprint.
Fan, J., Ke, Y., Sun, Q. and Zhou, W-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc., 114, 1880-1893.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statist. Sci., 34, 454-471.
Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol., 64 479–498.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Stat. Sin., to appear.
Zhou, W-X., Bose, K., Fan, J. and Liu, H. (2018). A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann. Statist., 46 1904-1931.
Factor-adjusted robust multiple testing
Description
This function conducts factor-adjusted robust multiple testing (FarmTest) for the means of multivariate data, as proposed in Fan et al. (2019), via a tuning-free procedure.
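For orientation, the setting can be written out as a short formula. This is only a reading of the simulation in the Examples section of this help page (where the data are built from muX, BX, fX and epsilonX) and of the \mu, \mu_0 notation used in Details; the symbols B, f_i and \varepsilon_i are notation assumed here, and the two-sided alternative is shown as one case.

```latex
X_i = \mu + B f_i + \varepsilon_i, \qquad i = 1, \ldots, n,
\qquad
H_{0j}: \mu_j = \mu_{0j} \quad \text{versus} \quad H_{1j}: \mu_j \neq \mu_{0j}, \qquad j = 1, \ldots, p.
```

Here X_i is the i-th row of the data matrix, B is a p by K loading matrix, f_i is the vector of K factors for sample i, and \mu_0 corresponds to the argument h0.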
Usage
farm.test(
X,
fX = NULL,
KX = -1,
Y = NULL,
fY = NULL,
KY = -1,
h0 = NULL,
alternative = c("two.sided", "less", "greater"),
alpha = 0.05,
p.method = c("bootstrap", "normal"),
nBoot = 500
)
Arguments
X |
An |
fX |
An optional factor matrix with each column being a factor for |
KX |
An optional positive number of factors to be estimated for |
Y |
An optional data matrix used for two-sample FarmTest. The number of columns of |
fY |
An optional factor matrix for two-sample FarmTest with each column being a factor for |
KY |
An optional positive number of factors to be estimated for |
h0 |
An optional |
alternative |
An optional character string specifying the alternative hypothesis; must be one of "two.sided" (default), "less" or "greater". |
alpha |
An optional level for controlling the false discovery rate. The value of |
p.method |
An optional character string specifying the method to calculate p-values when |
nBoot |
An optional positive integer specifying the size of bootstrap sample, only available when |
Details
For two-sample FarmTest, means, stdDev, loadings, eigenVal, eigenRatio, nFactors and n will be lists of items for samples X and Y separately.
alternative = "greater" is the alternative that \mu > \mu_0 for one-sample test or \mu_X > \mu_Y for two-sample test.
For the factor-known model, setting p.method = "bootstrap" will slow down the program, but it will achieve lower empirical FDP than setting p.method = "normal".
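To make the roles of alpha and the empirical false discovery proportion (FDP) concrete, the following base R sketch applies the Benjamini-Hochberg step-up rule to simulated p-values and computes the FDP against the known truth. It is an illustration under stated assumptions, not the FarmTest procedure: the p-values come from plain z-tests, the adjustment is p.adjust with method "BH" (the package's own adjustment may differ, for example by using the Storey (2002) approach it cites), and all variable names are chosen here for illustration.

```r
set.seed(42)

# Simulated setting (assumption): 200 tests, the first 20 carry a true signal
p <- 200
is_signal <- c(rep(TRUE, 20), rep(FALSE, p - 20))
z <- rnorm(p, mean = ifelse(is_signal, 4, 0))
p_values <- 2 * pnorm(-abs(z))          # two-sided z-test p-values

alpha <- 0.05

# Benjamini-Hochberg adjusted p-values and the resulting decisions
p_adjust <- p.adjust(p_values, method = "BH")
significant <- p_adjust <= alpha
reject <- which(significant)

# The same decision rule written as the classical step-up procedure:
# reject the k smallest p-values, k = max{i : p_(i) <= i * alpha / p}
ok <- which(sort(p_values) <= seq_len(p) * alpha / p)
k <- if (length(ok) > 0) max(ok) else 0L

# Empirical FDP: share of rejections that are true nulls (0 if nothing is rejected)
fdp <- if (length(reject) > 0) sum(!is_signal[reject]) / length(reject) else 0
power <- mean(significant[is_signal])

cat("rejections:", length(reject), " FDP:", fdp, " power:", power, "\n")
```

Averaging fdp over many simulated data sets gives an estimate of the false discovery rate, which is the quantity that alpha is meant to bound.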
Value
An object with S3 class farm.test containing the following items will be returned:
means |
Estimated means, a vector with length p. |
stdDev |
Estimated standard deviations, a vector with length p. It's not available for bootstrap method. |
loadings |
Estimated factor loadings, a matrix with dimension p by K, where K is the number of factors. |
eigenVal |
Eigenvalues of estimated covariance matrix, a vector with length p. It's only available when factors fX and fY are not given. |
eigenRatio |
Ratios of eigenVal to estimate nFactors, a vector with length min(n, p) / 2. It's only available when number of factors KX and KY are not given. |
nFactors |
Estimated or input number of factors, a positive integer. |
tStat |
Values of test statistics, a vector with length p. It's not available for bootstrap method. |
pValues |
P-values of tests, a vector with length p. |
pAdjust |
Adjusted p-values of tests, a vector with length p. |
significant |
Boolean values indicating whether each test is significant, with 1 for significant and 0 for non-significant, a vector with length p. |
reject |
Indices of tests that are rejected. It will show "no hypotheses rejected" if none of the tests are rejected. |
type |
Indicator of whether factor is known or unknown. |
n |
Sample size. |
p |
Data dimension. |
h0 |
Null hypothesis, a vector with length p. |
alpha |
\alpha value. |
alternative |
Alternative hypothesis. |
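The eigenVal and eigenRatio items relate to the eigenvalue ratio idea of Ahn and Horenstein (2013): the number of factors is taken where the ratio of consecutive eigenvalues is largest. The sketch below illustrates that idea in base R. It is an assumption-laden illustration rather than the package's code: it uses the ordinary sample covariance as a stand-in for the robust covariance estimate, and the names eigen_val, eigen_ratio and n_factors are chosen here only to echo the output items.

```r
set.seed(7)

# Simulate from a factor model with K = 3 factors (same recipe as the Examples)
n <- 100
p <- 40
K <- 3
B <- matrix(runif(p * K, -2, 2), nrow = p)   # factor loadings
f <- matrix(rnorm(n * K), nrow = n)          # latent factors
X <- f %*% t(B) + matrix(rnorm(n * p), nrow = n)

# Eigenvalues of a covariance estimate, in decreasing order
# (sample covariance used here as a non-robust stand-in)
eigen_val <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values

# Ratios of consecutive eigenvalues, for candidate numbers of factors 1..k_max
k_max <- floor(min(n, p) / 2)
eigen_ratio <- eigen_val[1:k_max] / eigen_val[2:(k_max + 1)]

# Estimated number of factors: position of the largest ratio
n_factors <- which.max(eigen_ratio)
print(n_factors)
```

With strong factors, as in this simulation, the largest drop in the spectrum usually occurs right after the K-th eigenvalue, which is what the ratio picks up.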
References
Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3) 1203–1227.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol., 57 289–300.
Fan, J., Ke, Y., Sun, Q. and Zhou, W-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc., 114, 1880-1893.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol., 64, 479–498.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.
Zhou, W-X., Bose, K., Fan, J. and Liu, H. (2018). A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann. Statist., 46 1904-1931.
See Also
print.farm.test, summary.farm.test and plot.farm.test.
Examples
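# Simulate n observations of dimension p from a model with K latent factors
n = 20
p = 50
K = 3
# True means: only the first 5 coordinates are non-zero
muX = rep(0, p)
muX[1:5] = 2
# Idiosyncratic errors, factor loadings and latent factors
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
# Data matrix: mean + factor component + error
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX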
# One-sample FarmTest with two sided alternative
output = farm.test(X)
# One-sample FarmTest with one sided alternative
output = farm.test(X, alternative = "less")
# One-sample FarmTest with known factors
output = farm.test(X, fX = fX)
# Two-sample FarmTest
muY = rep(0, p)
muY[1:5] = 4
epsilonY = matrix(rnorm(p * n, 0, 1), nrow = n)
BY = matrix(runif(p * K, -2, 2), nrow = p)
fY = matrix(rnorm(K * n, 0, 1), nrow = n)
Y = rep(1, n) %*% t(muY) + fY %*% t(BY) + epsilonY
output = farm.test(X, Y = Y)
Tuning-free Huber-type covariance estimation
Description
The function calculates an adaptive Huber-type covariance estimator from a data sample, with robustification parameter \tau determined by a tuning-free principle.
For the input matrix X, both low-dimensional (p < n) and high-dimensional (p > n) settings are allowed.
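As a rough illustration of what a Huber-type (truncated) covariance estimate can look like, the sketch below truncates pairwise products of differences, loosely following the pairwise-difference idea discussed in Ke et al. (2019). It is not the huber.cov implementation: the truncation level tau is fixed by hand here rather than chosen by a tuning-free rule, and the helper names psi and trunc_cov are hypothetical.

```r
set.seed(1)

# Heavy-tailed sample (assumption: t-distributed entries with 3 degrees of freedom)
n <- 30
p <- 4
X <- matrix(rt(n * p, df = 3), n, p)

# Truncation function: identity on [-tau, tau], clipped outside
psi <- function(u, tau) sign(u) * pmin(abs(u), tau)

# All pairwise row differences X_i - X_j for i < j (this avoids estimating the mean)
pairs <- combn(n, 2)
D <- X[pairs[1, ], , drop = FALSE] - X[pairs[2, ], , drop = FALSE]

# Entry (k, l): average of truncated half-products of paired differences
trunc_cov <- function(D, tau) {
  p <- ncol(D)
  S <- matrix(0, p, p)
  for (k in 1:p) {
    for (l in k:p) {
      S[k, l] <- S[l, k] <- mean(psi(D[, k] * D[, l] / 2, tau))
    }
  }
  S
}

S_tau <- trunc_cov(D, tau = 5)     # truncated estimate with a hand-picked tau
S_inf <- trunc_cov(D, tau = Inf)   # no truncation: reduces to the sample covariance
```

Without truncation the pairwise form coincides with cov(X), so tau controls how far the estimate departs from the sample covariance in exchange for robustness to extreme observations.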
Usage
huber.cov(X)
Arguments
X |
An |
Value
A p by p Huber-type covariance matrix estimator will be returned.
References
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statist. Sci., 34, 454-471.
See Also
huber.mean for tuning-free Huber mean estimation and huber.reg for tuning-free Huber regression.
Examples
n = 100
d = 50
X = matrix(rt(n * d, df = 3), n, d) / sqrt(3)
Sigma = huber.cov(X)
Tuning-free Huber mean estimation
Description
The function calculates an adaptive Huber mean estimator from a data sample, with robustification parameter \tau determined by a tuning-free principle.
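To show what the robustification parameter \tau does, the following base R sketch computes a Huber mean for a fixed, hand-picked \tau by iteratively reweighted averaging. It is a minimal sketch and not the huber.mean implementation, which selects \tau from the data; the function name huber_mean_fixed and its arguments are hypothetical.

```r
# Huber mean with a fixed tau via iteratively reweighted averaging.
# Observations within tau of the current estimate get weight 1;
# those further away are down-weighted by tau / |residual|.
huber_mean_fixed <- function(x, tau, tol = 1e-10, max_iter = 200) {
  mu <- median(x)                       # robust starting value
  for (iter in seq_len(max_iter)) {
    r <- x - mu
    w <- pmin(1, tau / abs(r))          # Huber weights (weight 1 when r = 0)
    mu_new <- sum(w * x) / sum(w)
    converged <- abs(mu_new - mu) < tol
    mu <- mu_new
    if (converged) break
  }
  mu
}

x <- c(1, 2, 3, 4, 100)                 # one gross outlier
mu_huber <- huber_mean_fixed(x, tau = 1.5)
mu_plain <- huber_mean_fixed(x, tau = Inf)   # no down-weighting: the sample mean

print(c(huber = mu_huber, mean = mu_plain))
```

A small \tau limits the influence of the outlier at 100, while a very large \tau recovers the ordinary sample mean; choosing \tau between these extremes is the tuning problem that the tuning-free principle addresses.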
Usage
huber.mean(X)
Arguments
X |
An |
Value
A Huber mean estimator will be returned.
References
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Stat. Sin., to appear.
See Also
huber.cov for tuning-free Huber-type covariance estimation and huber.reg for tuning-free Huber regression.
Examples
n = 10000
X = rt(n, 2) + 2
mu = huber.mean(X)
Tuning-free Huber regression
Description
The function fits a Huber regression to a data sample, with robustification parameter \tau determined by a tuning-free principle.
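For readers who want to see the mechanics, the sketch below fits a Huber regression with a fixed, hand-picked \tau using iteratively reweighted least squares in base R. It is an illustration under stated assumptions, not the huber.reg implementation: \tau is not calibrated from the data, and the function name huber_reg_fixed and its arguments are hypothetical.

```r
# Huber regression with a fixed tau via iteratively reweighted least squares.
# Returns the intercept followed by the slope coefficients.
huber_reg_fixed <- function(X, y, tau, tol = 1e-8, max_iter = 200) {
  Z <- cbind(1, X)                               # add an intercept column
  y <- as.vector(y)
  beta <- as.vector(qr.solve(Z, y))              # least squares starting value
  for (iter in seq_len(max_iter)) {
    r <- as.vector(y - Z %*% beta)
    w <- pmin(1, tau / abs(r))                   # Huber weights
    beta_new <- as.vector(solve(crossprod(Z, w * Z), crossprod(Z, w * y)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(3)
n <- 200
d <- 10
X <- matrix(rnorm(n * d), n, d)
y <- as.vector(1 + X %*% rep(1, d) + rt(n, df = 3))   # heavy-tailed errors

beta_huber <- huber_reg_fixed(X, y, tau = 1.345)      # hand-picked tau
beta_ols <- huber_reg_fixed(X, y, tau = Inf)          # no down-weighting: least squares
```

As with the mean, letting \tau grow without bound removes the down-weighting and gives back ordinary least squares.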
Usage
huber.reg(X, Y, method = c("standard", "adaptive"))
Arguments
X |
An |
Y |
A continuous response with length |
method |
An optional character string specifying the method to calibrate the robustification parameter |
Value
A coefficient estimator with length p + 1 will be returned.
References
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc., 115, 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Stat. Sin., to appear.
See Also
huber.mean for tuning-free Huber mean estimation and huber.cov for tuning-free Huber-type covariance estimation.
Examples
n = 200
d = 10
beta = rep(1, d)
X = matrix(rnorm(n * d), n, d)
err = rnorm(n)
Y = 1 + X %*% beta + err
beta.hat = huber.reg(X, Y)
Plot function of FarmTest
Description
This is the plot function of S3 objects with class "farm.test". It produces a histogram of the estimated means.
Usage
## S3 method for class 'farm.test'
plot(x, ...)
Arguments
x |
A |
... |
Further arguments passed to or from other methods. |
Details
For two-sample FarmTest, the histogram is based on the difference: estimated means of sample X - estimated means of sample Y.
Value
No variable will be returned, but a histogram of estimated means will be presented.
See Also
farm.test, print.farm.test and summary.farm.test.
Examples
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
plot(output)
Print function of FarmTest
Description
This is the print function of S3 objects with class "farm.test".
Usage
## S3 method for class 'farm.test'
print(x, ...)
Arguments
x |
A |
... |
Further arguments passed to or from other methods. |
Value
No variable will be returned, but a brief summary of FarmTest will be displayed.
See Also
farm.test, summary.farm.test and plot.farm.test.
Examples
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
print(output)
Summary function of FarmTest
Description
This is the summary function of S3 objects with class "farm.test".
Usage
## S3 method for class 'farm.test'
summary(object, ...)
Arguments
object |
A |
... |
Further arguments passed to or from other methods. |
Details
For two-sample FarmTest, the first column is the difference: estimated means of sample X - estimated means of sample Y.
Value
A data frame including the estimated means, p-values, adjusted p-values and significance for all the features will be presented.
See Also
farm.test, print.farm.test and plot.farm.test.
Examples
n = 50
p = 100
K = 3
muX = rep(0, p)
muX[1:5] = 2
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
output = farm.test(X)
summary(output)