Type: | Package |
Title: | Approximate False Positive Rate Control in Selection Frequency for Random Forest |
Version: | 0.2.2 |
Date: | 2022-02-09 |
Description: | Approximate false positive rate control in selection frequency for random forest using the methods described by Ender Konukoglu and Melanie Ganz (2014) <doi:10.48550/arXiv.1410.2838>. Methods for calculating the selection frequency threshold at false positive rates and selection frequency false positive rate feature selection. |
Imports: | Rcpp, purrr, tibble, magrittr, dplyr |
Suggests: | testthat, randomForest, ranger, parsnip, knitr, rmarkdown |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
URL: | https://github.com/aberHRML/forestControl |
BugReports: | https://github.com/aberHRML/forestControl/issues |
RoxygenNote: | 7.1.1 |
LinkingTo: | Rcpp |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2022-02-09 10:36:21 UTC; tom |
Author: | Tom Wilson |
Maintainer: | Tom Wilson <tpw2@aber.ac.uk> |
Repository: | CRAN |
Date/Publication: | 2022-02-09 10:50:02 UTC |
False Positive Rate Control in Selection Frequency for Random Forest
Description
This package implements the methods described in Konukoglu, E. and Ganz, M. (2014). Approximate false positive rate control in selection frequency for random forest. arXiv preprint arXiv:1410.2838. https://arxiv.org/abs/1410.2838
Extract forest parameters
Description
For a randomForest or ranger classification object, extract the parameters needed to calculate an approximate selection frequency threshold.
Usage
extract_params(x)
Arguments
x | a randomForest or ranger object |
Value
a list of four elements:
- Fn: the number of features considered at each internal node (mtry)
- Ft: the total number of features in the data set
- K: the average number of binary tests (internal nodes) per tree across the entire forest
- Tr: the total number of trees in the forest
Author(s)
Tom Wilson tpw2@aber.ac.uk
Examples
library(randomForest)
data(iris)
iris.rf <- randomForest(iris[,-5], iris[,5], forest = TRUE)
iris.params <- extract_params(iris.rf)
print(iris.params)
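To make the four returned quantities concrete, the following is a minimal sketch that derives comparable values by hand from a randomForest classification object. It relies only on fields and functions of the randomForest package (`mtry`, `ntree`, `importance`, `getTree()`); it is an illustration of what the parameters mean, not the package's own implementation, so small differences from `extract_params()` (for example in how K is rounded) are possible.

```r
library(randomForest)

data(iris)
set.seed(42)  # fix the seed so the forest is reproducible
iris.rf <- randomForest(iris[, -5], iris[, 5], keep.forest = TRUE)

# Fn: number of features tried at each split (mtry)
Fn <- iris.rf$mtry

# Ft: total number of features (one row of the importance matrix per predictor)
Ft <- nrow(iris.rf$importance)

# Tr: total number of trees in the forest
Tr <- iris.rf$ntree

# K: mean number of internal (splitting) nodes per tree.
# getTree() returns one row per node; terminal nodes have no left daughter (0).
internal_nodes <- vapply(
  seq_len(Tr),
  function(k) sum(getTree(iris.rf, k = k)[, "left daughter"] > 0),
  numeric(1)
)
K <- mean(internal_nodes)

manual.params <- list(Fn = Fn, Ft = Ft, K = K, Tr = Tr)
str(manual.params)
```

The result can be compared with `extract_params(iris.rf)` for the same forest.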
False Positive Rate Feature Selection
Description
Calculate the False Positive Rate (FPR) for each feature using its selection frequency.
Usage
fpr_fs(x)
Arguments
x | a randomForest or ranger object |
Value
a tibble of selection frequencies and their false positive rates
Author(s)
Jasen Finch jsf9@aber.ac.uk
Examples
library(randomForest)
data(iris)
iris.rf <- randomForest(iris[,-5], iris[,5], forest = TRUE)
iris.features <- fpr_fs(iris.rf)
print(iris.features)
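A typical next step is to keep only the features whose estimated false positive rate falls below a chosen level. The sketch below assumes that the returned tibble has a column named `fpr`; that column name is an assumption, not something stated in this manual, so the code checks for it before filtering. The cut-off of 0.05 is likewise only illustrative.

```r
library(randomForest)
library(forestControl)

data(iris)
set.seed(42)  # fix the seed so the forest is reproducible
iris.rf <- randomForest(iris[, -5], iris[, 5], keep.forest = TRUE)

iris.features <- fpr_fs(iris.rf)

# Assumption: the false positive rate column is called `fpr`.
# Inspect names(iris.features) and adjust if it differs.
stopifnot("fpr" %in% names(iris.features))

# Keep features with an estimated false positive rate below 5%
low_fpr <- iris.features[iris.features$fpr < 0.05, ]
print(low_fpr)
```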
Variable Selection Frequencies
Description
Extract variable selection frequencies from randomForest and ranger model objects.
Usage
selection_freqs(x)
Arguments
x | a randomForest or ranger object |
Value
a tibble of variable selection frequencies
Examples
library(randomForest)
data(iris)
iris.rf <- randomForest(iris[,-5], iris[,5], forest = TRUE)
iris.freqs <- selection_freqs(iris.rf)
print(iris.freqs)
Selection Frequency Threshold
Description
Determine the selection frequency threshold of a model at a specified false positive rate.
Usage
sft(x, alpha)
Arguments
x | a randomForest or ranger object |
alpha | a false positive rate (e.g., 0.01) |
Value
a list of two elements:
- sft: the selection frequency threshold
- probs_atsft: the estimated false positive rate
Author(s)
Tom Wilson tpw2@aber.ac.uk
Examples
library(randomForest)
data(iris)
iris.rf <- randomForest(iris[,-5], iris[,5], forest = TRUE)
# For a false positive rate of 1%
iris.sft <- sft(iris.rf, 0.01)
print(iris.sft)
# To iterate through a range of alpha values
alpha <- c(0.01, 0.05, 0.1, 0.15, 0.2, 0.25)
threshold <- numeric(length(alpha))
for (i in seq_along(alpha)) {
  threshold[i] <- sft(iris.rf, alpha[i])$sft
}
plot(alpha, threshold, type = 'b')
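The threshold can also be combined with `selection_freqs()` to list the features selected at a given false positive rate. This is a minimal sketch under two assumptions that this manual does not confirm: that the frequency column of the returned tibble is named `freq`, and that a feature counts as selected when its frequency is at or above the threshold. The code checks the column name before using it.

```r
library(randomForest)
library(forestControl)

data(iris)
set.seed(42)  # fix the seed so the forest is reproducible
iris.rf <- randomForest(iris[, -5], iris[, 5], keep.forest = TRUE)

# Selection frequency threshold at a false positive rate of 1%
threshold <- sft(iris.rf, 0.01)$sft

freqs <- selection_freqs(iris.rf)

# Assumption: the selection frequency column is called `freq`.
# Inspect names(freqs) and adjust if it differs.
stopifnot("freq" %in% names(freqs))

# Assumption: "selected" means a frequency at or above the threshold
selected <- freqs[freqs$freq >= threshold, ]
print(selected)
```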