| Title: | Calculate Variable Importance with Knock Off Variables | 
| Version: | 1.0 | 
| Description: | Calculates variable importance using knock off variables. The output can be provided in numerical and graphical form. Meredith L Wallace (2023) <doi:10.1186/s12874-023-01965-x>. | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.3 | 
| Imports: | caret, ggplot2, ranger, knockoff, ROCR | 
| Suggests: | knitr, rmarkdown, testthat | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2024-02-20 16:13:09 UTC; wheelerbj2 | 
| Author: | Meredith Wallace | 
| Maintainer: | Meredith Wallace <lotzmj@upmc.edu> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-02-21 20:40:06 UTC | 
calc_vimps
Description
Calculate the variable importance of the domains for a given dataset
Usage
calc_vimps(
  dat,
  dep_var,
  doms,
  calc_ko = TRUE,
  calc_dom = FALSE,
  num_folds = 10,
  num_kos = 100,
  model_all = normal_model,
  model_subset = one_tree_model,
  mtry = NULL,
  min.node.size = NULL,
  iterations = 500,
  ko_path = NULL,
  results_path = NULL,
  output_file_ko = NULL,
  output_file_dom = NULL
)
Arguments
| dat | A dataframe of data | 
| dep_var | The name of the dependent variable in dat | 
| doms | A dataframe of the variables in dat and the domain they belong to | 
| calc_ko | TRUE/FALSE: whether to calculate the knock-off importance | 
| calc_dom | TRUE/FALSE: whether to calculate the domain importance | 
| num_folds | The number of folds to use while calculating the classification threshold for predictions | 
| num_kos | The number of sets of knock off variables to create | 
| model_all | The model to use in full-ensemble mode in calculations | 
| model_subset | The single model from which ensembles are built | 
| mtry | The mtry value to use in the random forests | 
| min.node.size | The min.node.size value to use in the random forests | 
| iterations | Number of trees to build while calculating variable importance | 
| ko_path | Where to store the knock off variable sets | 
| results_path | Where to store the intermediary results for calculating variable importance | 
| output_file_ko | Where to store the results of the knock off variable importance | 
| output_file_dom | Where to store the results of the domain variable importance | 
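The doms argument maps each predictor in dat to a domain. The following is a minimal sketch in base R of how such a mapping might be built and checked before calling calc_vimps. It assumes the two-column layout (domain, variable) used in the Examples section, and it assumes that several variables may share one domain and that each predictor belongs to exactly one domain. The dataset, variable names, and domain names are hypothetical.

```r
# Hypothetical dataset: three predictors and a binary outcome Y.
dat <- data.frame(
  sleep_hours   = c(7, 6, 8, 5, 7, 6),
  sleep_quality = c(3, 2, 4, 1, 3, 2),
  age           = c(34, 51, 29, 62, 45, 38),
  Y             = c(0, 1, 0, 1, 0, 1)
)
dep_var <- "Y"

# One row per predictor; the column names follow the Examples section.
# Assumption: several variables may share one domain.
doms <- data.frame(
  domain   = c("sleep", "sleep", "demographics"),
  variable = c("sleep_hours", "sleep_quality", "age")
)

# Sanity check (an assumption about what calc_vimps expects):
# every predictor in dat appears in doms exactly once.
predictors <- setdiff(names(dat), dep_var)
stopifnot(
  setequal(predictors, doms$variable),
  !anyDuplicated(doms$variable)
)
```

Checking the mapping up front helps catch a misspelled or missing variable name before any long-running computation starts.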
Value
A list with: 1) the threshold for binary class labeling; 2) model metrics using all variables; 3) model metrics using knock-off variables; 4) variable importance with knock-offs
Examples
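calc_vimps(
  # Toy data: two predictors (X1, X2) and a binary outcome (Y)
  data.frame(
    X1 = c(2,8,3,9,1,4,3,8,0,9,2,8,3,9,1,4,3,8,0,9),
    X2 = c(7,2,5,0,9,1,8,8,3,9,7,2,5,0,9,1,8,8,3,9),
    Y = c(0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1)
  ),
  "Y",
  # Map each variable to its domain (here, one domain per variable)
  data.frame(
    domain = c('X1', 'X2'),
    variable = c('X1', 'X2')
  ),
  # Small settings keep the example fast
  num_folds = 2,
  num_kos = 1,
  iterations = 50
)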
graph_results
Description
Graph the variable importance results from calc_vimps
Usage
graph_results(results, object)
Arguments
| results | The results from calc_vimps | 
| object | Which object from results to use for graphing results | 
Value
No return value