Title: Fit Interpretable Machine Learning Models
Version: 0.1.34
Date: 2024-11-28
Description: Package for training interpretable machine learning models. Historically, the most interpretable machine learning models were not very accurate, and the most accurate models were not very interpretable. Microsoft Research has developed an algorithm called the Explainable Boosting Machine (EBM) which has both high accuracy and interpretable characteristics. EBM uses machine learning techniques like bagging and boosting to breathe new life into traditional GAMs (Generalized Additive Models). This makes them as accurate as random forests and gradient boosted trees, and also enhances their intelligibility and editability. Details on the EBM algorithm can be found in the paper by Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad (2015, <doi:10.1145/2783258.2788613>).
URL: https://github.com/interpretml/interpret
BugReports: https://github.com/interpretml/interpret/issues
License: MIT + file LICENSE
Depends: R (≥ 3.0.0)
NeedsCompilation: yes
SystemRequirements: C++17
Packaged: 2024-11-28 10:45:55 UTC; runner
Author: Samuel Jenkins [aut], Harsha Nori [aut], Paul Koch [aut], Rich Caruana [aut, cre], The InterpretML Contributors [cph]
Maintainer: Rich Caruana <interpretml@outlook.com>
Repository: CRAN
Date/Publication: 2024-11-28 11:40:08 UTC
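
The following is a minimal end-to-end sketch of the workflow documented in the topics below: fit an EBM classifier, predict probabilities, and plot the shape function for a single feature. It assumes the package is attached as library(interpret) (an assumed package name; see the URL above), and the data set and train/test split are purely illustrative.

  library(interpret)  # assumed package name

  data(mtcars)
  X <- subset(mtcars, select = -c(vs))
  y <- mtcars$vs

  set.seed(42)
  idx <- sample(length(y), floor(length(y) * 0.8))

  ebm <- ebm_classify(X[idx, ], y[idx])        # fit on the training rows
  proba <- ebm_predict_proba(ebm, X[-idx, ])   # probabilities for the held-out rows
  ebm_show(ebm, "mpg")                         # plot the learned term for one feature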

Build an EBM classification model

Description

Builds an Explainable Boosting Machine (EBM) model for classification

Usage

ebm_classify(
  X, 
  y, 
  max_bins = 255,
  outer_bags = 16, 
  inner_bags = 0,
  learning_rate = 0.01, 
  validation_size = 0.15, 
  early_stopping_rounds = 50, 
  early_stopping_tolerance = 1e-4,
  max_rounds = 5000, 
  min_hessian = 1e-3,
  max_leaves = 3,
  random_state = 42
)

Arguments

X

the features to train on

y

the targets (class labels) to predict

max_bins

maximum number of bins to create for each feature

outer_bags

number of outer bags

inner_bags

number of inner bags

learning_rate

learning rate

validation_size

fraction of the training data to set aside for validation

early_stopping_rounds

number of boosting rounds with no improvement on the validation set before training stops early

early_stopping_tolerance

minimum improvement a boosting round must achieve to be counted as an advancement

max_rounds

maximum number of boosting rounds

min_hessian

minimum hessian required for a split

max_leaves

maximum number of leaves allowed in each tree

random_state

random seed

Value

Returns an EBM model

Examples

  data(mtcars)
  X <- subset(mtcars, select = -c(vs))
  y <- mtcars$vs

  set.seed(42)
  data_sample <- sample(length(y), floor(length(y) * 0.8))

  X_train <- X[data_sample, ]
  y_train <- y[data_sample]
  X_test <- X[-data_sample, ]
  y_test <- y[-data_sample]

  ebm <- ebm_classify(X_train, y_train)
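
  # Hyperparameters can be overridden; the values below are purely
  # illustrative, not tuned recommendations for this data set.
  ebm_slow <- ebm_classify(
    X_train,
    y_train,
    learning_rate = 0.005,
    max_rounds = 10000,
    max_leaves = 5
  )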

Predict probabilities using an EBM model

Description

Predicts probabilities using an EBM model

Usage

ebm_predict_proba(
  model, 
  X
)

Arguments

model

the model

X

features

Value

Returns the predicted probabilities

Examples

  data(mtcars)
  X <- subset(mtcars, select = -c(vs))
  y <- mtcars$vs

  set.seed(42)
  data_sample <- sample(length(y), floor(length(y) * 0.8))

  X_train <- X[data_sample, ]
  y_train <- y[data_sample]
  X_test <- X[-data_sample, ]
  y_test <- y[-data_sample]

  ebm <- ebm_classify(X_train, y_train)
  proba_test <- ebm_predict_proba(ebm, X_test)
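
  # Assuming proba_test is a vector of positive-class probabilities,
  # threshold at 0.5 to get class predictions and check test accuracy.
  pred_test <- ifelse(proba_test >= 0.5, 1, 0)
  mean(pred_test == y_test)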

Show the GAM plot for a single feature

Description

Shows the GAM plot for a single feature

Usage

ebm_show(
  model, 
  name
)

Arguments

model

the model

name

the name of the feature to plot

Value

None

Examples

  data(mtcars)
  X <- subset(mtcars, select = -c(vs))
  y <- mtcars$vs

  set.seed(42)
  data_sample <- sample(length(y), floor(length(y) * 0.8))

  X_train <- X[data_sample, ]
  y_train <- y[data_sample]
  X_test <- X[-data_sample, ]
  y_test <- y[-data_sample]

  ebm <- ebm_classify(X_train, y_train)
  ebm_show(ebm, "mpg")