Title: Performance Measures for 'mlr3'
Version: 1.0.0
Description: Implements multiple performance measures for supervised learning. Includes over 40 measures for regression and classification. Additionally, meta information about the performance measures can be queried, e.g. what the best and worst possible performance scores are.
License: LGPL-3
URL: https://mlr3measures.mlr-org.com, https://github.com/mlr-org/mlr3measures
BugReports: https://github.com/mlr-org/mlr3measures/issues
Depends: R (≥ 3.1.0)
Imports: checkmate, mlr3misc, PRROC
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
Encoding: UTF-8
RoxygenNote: 7.3.2
Collate: 'assertions.R' 'bibentries.R' 'measures.R' 'binary_auc.R' 'binary_bbrier.R' 'binary_dor.R' 'binary_fbeta.R' 'binary_fdr.R' 'binary_fn.R' 'binary_fnr.R' 'binary_fomr.R' 'binary_fp.R' 'binary_fpr.R' 'binary_gmean.R' 'binary_gpr.R' 'binary_npv.R' 'binary_ppv.R' 'binary_prauc.R' 'binary_tn.R' 'binary_tnr.R' 'binary_tp.R' 'binary_tpr.R' 'classif_acc.R' 'classif_auc.R' 'classif_bacc.R' 'classif_ce.R' 'classif_logloss.R' 'classif_mbrier.R' 'classif_mcc.R' 'classif_zero_one.R' 'confusion_matrix.R' 'helper.R' 'regr_ae.R' 'regr_ape.R' 'regr_bias.R' 'regr_ktau.R' 'regr_linex.R' 'regr_mae.R' 'regr_mape.R' 'regr_maxae.R' 'regr_maxse.R' 'regr_medae.R' 'regr_medse.R' 'regr_mse.R' 'regr_msle.R' 'regr_pbias.R' 'regr_pinball.R' 'regr_rae.R' 'regr_rmse.R' 'regr_rmsle.R' 'regr_rrse.R' 'regr_rse.R' 'regr_rsq.R' 'regr_sae.R' 'regr_se.R' 'regr_sle.R' 'regr_smape.R' 'regr_srho.R' 'regr_sse.R' 'roxygen.R' 'similarity_jaccard.R' 'similarity_phi.R' 'zzz.R'
NeedsCompilation: no
Packaged: 2024-09-11 09:58:42 UTC; marc
Author: Michel Lang [aut], Martin Binder [ctb], Marc Becker [cre, aut], Lona Koers [aut]
Maintainer: Marc Becker <marcbecker@posteo.de>
Repository: CRAN
Date/Publication: 2024-09-11 22:52:30 UTC

mlr3measures: Performance Measures for 'mlr3'

Description

Implements multiple performance measures for supervised learning. Includes over 40 measures for regression and classification. Additionally, meta information about the performance measures can be queried, e.g. what the best and worst possible performance scores are.

Author(s)

Maintainer: Marc Becker <marcbecker@posteo.de>

Authors:

Michel Lang [aut]
Marc Becker [cre, aut]
Lona Koers [aut]

Other contributors:

Martin Binder [ctb]

See Also

Useful links:

https://mlr3measures.mlr-org.com
https://github.com/mlr-org/mlr3measures
Report bugs at https://github.com/mlr-org/mlr3measures/issues
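
All measures are exported as plain functions; a minimal sketch using rmse() and the measures registry (both documented below):

library(mlr3measures)
rmse(c(1, 2, 3), c(1, 2, 4))
names(measures)[1:3]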


Classification Accuracy

Description

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Usage

acc(truth, response, sample_weights = NULL, ...)

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Classification Accuracy is defined as

\frac{1}{n} \sum_{i=1}^n w_i \mathbf{1} \left( t_i = r_i \right),

where w_i are normalized weights for each observation x_i.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Classification Measures: bacc(), ce(), logloss(), mauc_aunu(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
acc(truth, response)
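
# sample weights (normalized automatically, see Arguments) shift the
# influence of individual observations; a small illustration:
w = runif(10)
acc(truth, response, sample_weights = w)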

Absolute Error (per observation)

Description

Measure to compare true observed response with predicted response in regression tasks.

Note that this is an unaggregated measure, returning the losses per observation.

Usage

ae(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

Calculates the per-observation absolute error as

\left| t_i - r_i \right|.

Value

Performance value as numeric(length(truth)).

Meta Information

See Also

Other Regression Measures: ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()
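
Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
# per-observation absolute errors (a vector of length 10)
ae(truth, response)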


Absolute Percentage Error (per observation)

Description

Measure to compare true observed response with predicted response in regression tasks.

Note that this is an unaggregated measure, returning the losses per observation.

Usage

ape(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

Calculates the per-observation absolute percentage error as

\left| \frac{ t_i - r_i}{t_i} \right|.

Value

Performance value as numeric(length(truth)).

Meta Information

See Also

Other Regression Measures: ae(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()
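
Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
# per-observation absolute percentage errors (a vector of length 10)
ape(truth, response)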


Area Under the ROC Curve

Description

Measure to compare true observed labels with predicted probabilities in binary classification tasks.

Usage

auc(truth, prob, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as prob.

prob

(numeric())
Predicted probability for the positive class. Must have the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

Computes the area under the Receiver Operating Characteristic (ROC) curve. The AUC can be interpreted as the probability that a randomly chosen positive observation has a higher predicted probability than a randomly chosen negative observation.

This measure is undefined if the true values are either all positive or all negative.

Value

Performance value as numeric(1).

Meta Information

References

Youden WJ (1950). “Index for rating diagnostic tests.” Cancer, 3(1), 32–35. doi:10.1002/1097-0142(1950)3:1<32::aid-cncr2820030106>3.0.co;2-3.

See Also

Other Binary Classification Measures: bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

truth = factor(c("a", "a", "a", "b"))
prob = c(.6, .7, .1, .4)
auc(truth, prob, "a")
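
# degenerate input: with only one observed class the AUC is undefined
# and na_value (default NaN) is returned
auc(factor(c("a", "a"), levels = c("a", "b")), c(0.2, 0.8), "a")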

Balanced Accuracy

Description

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Usage

bacc(truth, response, sample_weights = NULL, ...)

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Balanced Accuracy is a weighted accuracy suitable for imbalanced data sets. It is defined analogously to the definition in sklearn.

First, all sample weights w_i are normalized per class so that each class has the same influence:

\hat{w}_i = \frac{w_i}{\sum_{j=1}^n w_j \cdot \mathbf{1}(t_j = t_i)}.

The Balanced Accuracy is then calculated as

\frac{1}{\sum_{i=1}^n \hat{w}_i} \sum_{i=1}^n \hat{w}_i \cdot \mathbf{1}(r_i = t_i).

This definition is equivalent to acc() with class-balanced sample weights.

Value

Performance value as numeric(1).

Meta Information

References

Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010). “The Balanced Accuracy and Its Posterior Distribution.” In 2010 20th International Conference on Pattern Recognition. doi:10.1109/icpr.2010.764.

Guyon I, Bennett K, Cawley G, Escalante HJ, Escalera S, Ho TK, Macia N, Ray B, Saeed M, Statnikov A, Viegas E (2015). “Design of the 2015 ChaLearn AutoML challenge.” In 2015 International Joint Conference on Neural Networks (IJCNN). doi:10.1109/ijcnn.2015.7280767.

See Also

Other Classification Measures: acc(), ce(), logloss(), mauc_aunu(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
bacc(truth, response)
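
# equivalence stated in Details: bacc() equals acc() with
# class-balanced sample weights
w = as.numeric(1 / table(truth)[truth])
all.equal(bacc(truth, response), acc(truth, response, sample_weights = w))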

Binary Brier Score

Description

Measure to compare true observed labels with predicted probabilities in binary classification tasks.

Usage

bbrier(truth, prob, positive, sample_weights = NULL, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as prob.

prob

(numeric())
Predicted probability for the positive class. Must have the same length as truth.

positive

(character(1))
Name of the positive class.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Binary Brier Score is defined as

\frac{1}{n} \sum_{i=1}^n w_i (I_i - p_i)^2,

where w_i are the sample weights, and I_{i} is 1 if observation x_i belongs to the positive class, and 0 otherwise.

Note that this (more common) definition of the Brier score is equivalent to the original definition of the multi-class Brier score (see mbrier()) divided by 2.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Brier_score

Brier GW (1950). “Verification of forecasts expressed in terms of probability.” Monthly Weather Review, 78(1), 1–3. doi:10.1175/1520-0493(1950)078<0001:vofeit>2.0.co;2.

See Also

Other Binary Classification Measures: auc(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
prob = runif(10)
bbrier(truth, prob, positive = "a")
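
# relationship stated in Details: the binary Brier score equals the
# multi-class Brier score (mbrier()) divided by 2
prob2 = cbind(a = prob, b = 1 - prob)
all.equal(bbrier(truth, prob, positive = "a"), mbrier(truth, prob2) / 2)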

Bias

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

bias(truth, response, sample_weights = NULL, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Bias is defined as

\frac{1}{n} \sum_{i=1}^n w_i \left( t_i - r_i \right),

where w_i are normalized sample weights. Good predictions score close to 0.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
bias(truth, response)

Binary Classification Parameters

Description

Binary Classification Parameters

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

prob

(numeric())
Predicted probability for the positive class. Must have the same length as truth.

positive

(character(1))
Name of the positive class.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.


Classification Error

Description

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Usage

ce(truth, response, sample_weights = NULL, ...)

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Classification Error is defined as

\frac{1}{n} \sum_{i=1}^n w_i \mathbf{1} \left( t_i \neq r_i \right),

where w_i are normalized weights for each observation x_i.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Classification Measures: acc(), bacc(), logloss(), mauc_aunu(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
ce(truth, response)

Classification Parameters

Description

Classification Parameters

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

prob

(matrix())
Matrix of predicted probabilities, each column is a vector of probabilities for a specific class label. Columns must be named with levels of truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.


Calculate Binary Confusion Matrix

Description

Calculates the confusion matrix for a binary classification problem once and then calculates all binary confusion measures of this package.

Usage

confusion_matrix(truth, response, positive, na_value = NaN, relative = FALSE)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

relative

(logical(1))
If TRUE, the returned confusion matrix contains relative frequencies instead of absolute frequencies.

Details

The binary confusion matrix is defined as

\begin{pmatrix} TP & FP \\ FN & TN \end{pmatrix}.

If relative = TRUE, all values are divided by n.

Value

List with two elements: the confusion matrix, and a named numeric vector holding the binary confusion measures computed from it.

Examples

set.seed(123)
lvls = c("a", "b")
truth = factor(sample(lvls, 20, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 20, replace = TRUE), levels = lvls)

confusion_matrix(truth, response, positive = "a")
confusion_matrix(truth, response, positive = "a", relative = TRUE)
confusion_matrix(truth, response, positive = "b")

Diagnostic Odds Ratio

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

dor(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Diagnostic Odds Ratio is defined as

\frac{\mathrm{TP}/\mathrm{FP}}{\mathrm{FN}/\mathrm{TN}}.

This measure is undefined if FP = 0 or FN = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
dor(truth, response, positive = "a")

F-beta Score

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fbeta(truth, response, positive, beta = 1, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

beta

(numeric(1))
Parameter to give either precision or recall more weight. Default is 1, resulting in balanced weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

With P as precision() and R as recall(), the F-beta Score is defined as

(1 + \beta^2) \frac{P \cdot R}{(\beta^2 P) + R}.

It measures the effectiveness of retrieval with respect to a user who attaches \beta times as much importance to recall as precision. For \beta = 1, this measure is called "F1" score.

This measure is undefined if precision or recall is undefined, i.e. TP + FP = 0 or TP + FN = 0.

Value

Performance value as numeric(1).

Meta Information

References

Rijsbergen, Van CJ (1979). Information Retrieval, 2nd edition. Butterworth-Heinemann, Newton, MA, USA. ISBN 0-408-70929-4.

Goutte C, Gaussier E (2005). “A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation.” In Lecture Notes in Computer Science, 345–359. doi:10.1007/978-3-540-31865-1_25.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fbeta(truth, response, positive = "a")

False Discovery Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fdr(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The False Discovery Rate is defined as

\frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}.

This measure is undefined if TP + FP = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fdr(truth, response, positive = "a")

False Negatives

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fn(truth, response, positive, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

...

(any)
Additional arguments. Currently ignored.

Details

This measure counts the false negatives (type 2 error), i.e. the number of observations predicted as negative that are in fact positive. This is sometimes also called a "miss" or an "underestimation".

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fn(truth, response, positive = "a")

False Negative Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fnr(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The False Negative Rate is defined as

\frac{\mathrm{FN}}{\mathrm{TP} + \mathrm{FN}}.

Also known as "miss rate".

This measure is undefined if TP + FN = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fnr(truth, response, positive = "a")

False Omission Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fomr(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The False Omission Rate is defined as

\frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TN}}.

This measure is undefined if FN + TN = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fomr(truth, response, positive = "a")

False Positives

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fp(truth, response, positive, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

...

(any)
Additional arguments. Currently ignored.

Details

This measure counts the false positives (type 1 error), i.e. the number of observations predicted as positive that are in fact negative. This is sometimes also called a "false alarm".

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fp(truth, response, positive = "a")

False Positive Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

fpr(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The False Positive Rate is defined as

\frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}.

Also known as "fall-out" or probability of false alarm.

This measure is undefined if FP + TN = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
fpr(truth, response, positive = "a")

Geometric Mean of Recall and Specificity

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

gmean(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

Calculates the geometric mean of recall() R and specificity() S as

\sqrt{\mathrm{R} \cdot \mathrm{S}}.

This measure is undefined if recall or specificity is undefined, i.e. if TP + FN = 0 or if FP + TN = 0.

Value

Performance value as numeric(1).

Meta Information

References

He H, Garcia EA (2009). “Learning from Imbalanced Data.” IEEE Transactions on knowledge and data engineering, 21(9), 1263–1284. doi:10.1109/TKDE.2008.239.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
gmean(truth, response, positive = "a")

Geometric Mean of Precision and Recall

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

gpr(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

Calculates the geometric mean of precision() P and recall() R as

\sqrt{\mathrm{P} \cdot \mathrm{R}}.

This measure is undefined if precision or recall is undefined, i.e. if TP + FP = 0 or if TP + FN = 0.

Value

Performance value as numeric(1).

Meta Information

References

He H, Garcia EA (2009). “Learning from Imbalanced Data.” IEEE Transactions on knowledge and data engineering, 21(9), 1263–1284. doi:10.1109/TKDE.2008.239.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), npv(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
gpr(truth, response, positive = "a")

Jaccard Similarity Index

Description

Measure to compare two or more sets w.r.t. their similarity.

Usage

jaccard(sets, na_value = NaN, ...)

Arguments

sets

(list())
List of character or integer vectors. sets must have at least 2 elements.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

For two sets A and B, the Jaccard Index is defined as

J(A, B) = \frac{|A \cap B|}{|A \cup B|}.

If more than two sets are provided, the mean of all pairwise scores is calculated.

This measure is undefined if two or more sets are empty.

Value

Performance value as numeric(1).

Meta Information

References

Jaccard, Paul (1901). “Étude comparative de la distribution florale dans une portion des Alpes et du Jura.” Bulletin de la Société Vaudoise des Sciences Naturelles, 37, 547-579. doi:10.5169/SEALS-266450.

Bommert A, Rahnenführer J, Lang M (2017). “A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data.” Computational and Mathematical Methods in Medicine, 2017, 1–18. doi:10.1155/2017/7907163.

Bommert A, Lang M (2021). “stabm: Stability Measures for Feature Selection.” Journal of Open Source Software, 6(59), 3010. doi:10.21105/joss.03010.

See Also

Package stabm which implements many more stability measures with included correction for chance.

Other Similarity Measures: phi()

Examples

set.seed(1)
sets = list(
  sample(letters[1:3], 1),
  sample(letters[1:3], 2)
)
jaccard(sets)
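
# with more than two sets, the mean over all pairwise scores is returned
jaccard(list(c("a", "b"), c("a", "c"), c("a", "b", "c")))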

Kendall's tau

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

ktau(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

Kendall's tau is Kendall's rank correlation coefficient between truth and response, defined as

\tau = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{\text{number of pairs}}.

Calls stats::cor() with method set to "kendall".

Value

Performance value as numeric(1).

Meta Information

References

Rosset S, Perlich C, Zadrozny B (2006). “Ranking-based evaluation of regression models.” Knowledge and Information Systems, 12(3), 331–353. doi:10.1007/s10115-006-0037-3.

See Also

Other Regression Measures: ae(), ape(), bias(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
ktau(truth, response)
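
# as stated in Details, ktau() delegates to stats::cor()
all.equal(ktau(truth, response), cor(truth, response, method = "kendall"))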

Linear-Exponential Loss (per observation)

Description

Measure to compare true observed response with predicted response in regression tasks.

Note that this is an unaggregated measure, returning the losses per observation.

Usage

linex(truth, response, a = -1, b = 1, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

a

(numeric(1))
Shape parameter controlling asymmetry. Negative values penalize overestimation more, positive values penalize underestimation more. As a approaches 0, the loss resembles squared error loss. Default is -1.

b

(numeric(1))
Positive scaling factor for the loss. Larger values increase the loss magnitude. Default is 1.

...

(any)
Additional arguments. Currently ignored.

Details

The Linear-Exponential Loss is defined as

b \left( \exp(a (t_i - r_i)) - a (t_i - r_i) - 1 \right),

where a \neq 0, b > 0.

Value

Performance value as numeric(length(truth)).

Meta Information

References

Varian HR (1975). “A Bayesian Approach to Real Estate Assessment.” In Fienberg SE, Zellner A (eds.), Studies in Bayesian Econometrics and Statistics: In Honor of Leonard J. Savage, 195–208. North-Holland, Amsterdam.

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
linex(truth, response)
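
# asymmetry of the loss: with the default a = -1, overestimation
# (second element) is penalized more than underestimation (first element)
linex(c(0, 0), c(-1, 1))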

Log Loss

Description

Measure to compare true observed labels with predicted probabilities in multiclass classification tasks.

Usage

logloss(truth, prob, sample_weights = NULL, eps = 1e-15, ...)

Arguments

truth

(factor())
True (observed) labels. The length must match the number of rows of prob, and the levels must match the column names of prob.

prob

(matrix())
Matrix of predicted probabilities, each column is a vector of probabilities for a specific class label. Columns must be named with levels of truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

eps

(numeric(1))
Probabilities are clipped to max(eps, min(1 - eps, p)). Otherwise the measure would be undefined for probabilities p = 0 and p = 1.

...

(any)
Additional arguments. Currently ignored.

Details

The Log Loss (a.k.a. Bernoulli Loss, Logistic Loss, Cross-Entropy Loss) is defined as

-\frac{1}{n} \sum_{i=1}^n w_i \log \left( p_i \right )

where p_i is the probability for the true class of observation i and w_i are normalized weights for each observation x_i.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Classification Measures: acc(), bacc(), ce(), mauc_aunu(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
prob = matrix(runif(3 * 10), ncol = 3, dimnames = list(NULL, lvls))
prob = t(apply(prob, 1, function(x) x / sum(x)))
logloss(truth, prob)

Mean Absolute Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

mae(truth, response, sample_weights = NULL, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Mean Absolute Error is defined as

\frac{1}{n} \sum_{i=1}^n w_i \left| t_i - r_i \right|,

where w_i are normalized sample weights.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
mae(truth, response)

Mean Absolute Percent Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

mape(truth, response, sample_weights = NULL, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Mean Absolute Percent Error is defined as

\frac{1}{n} \sum_{i=1}^n w_i \left| \frac{ t_i - r_i}{t_i} \right|,

where w_i are normalized sample weights.

This measure is undefined if any element of t is 0.

Value

Performance value as numeric(1).

Meta Information

References

de Myttenaere, Arnaud, Golden, Boris, Le Grand, Bénédicte, Rossi, Fabrice (2016). “Mean Absolute Percentage Error for regression models.” Neurocomputing, 192, 38-48. ISSN 0925-2312, doi:10.1016/j.neucom.2015.12.114.

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
mape(truth, response)
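
# mape() is undefined if any true value is 0; na_value (default NaN)
# is returned in that case
mape(c(0, 1, 2), c(1, 2, 3))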

Multiclass AUC Scores

Description

Measure to compare true observed labels with predicted probabilities in multiclass classification tasks.

Usage

mauc_aunu(truth, prob, na_value = NaN, ...)

mauc_aunp(truth, prob, na_value = NaN, ...)

mauc_au1u(truth, prob, na_value = NaN, ...)

mauc_au1p(truth, prob, na_value = NaN, ...)

mauc_mu(truth, prob, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. The length must match the number of rows of prob, and the levels must match the column names of prob.

prob

(matrix())
Matrix of predicted probabilities, each column is a vector of probabilities for a specific class label. Columns must be named with levels of truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

Multiclass AUC measures:

mauc_aunu: AUC of each class against the rest, averaged with uniform class weights (Fawcett 2001).

mauc_aunp: AUC of each class against the rest, averaged with weights given by the a-priori class distribution (Fawcett 2001).

mauc_au1u: AUC of each class against each other class, averaged with uniform weights (Hand and Till 2001).

mauc_au1p: AUC of each class against each other class, averaged with weights given by the a-priori class distribution.

mauc_mu: multiclass AUC as defined by Kleiman and Page (2019).

Value

Performance value as numeric(1).

Meta Information

References

Fawcett, Tom (2001). “Using rule sets to maximize ROC performance.” In Proceedings 2001 IEEE international conference on data mining, 131–138. IEEE.

Ferri, César, Hernández-Orallo, José, Modroiu, R (2009). “An experimental comparison of performance measures for classification.” Pattern Recognition Letters, 30(1), 27–38. doi:10.1016/j.patrec.2008.08.010.

Hand DJ, Till RJ (2001). “A simple generalisation of the area under the ROC curve for multiple class classification problems.” Machine Learning, 45(2), 171–186.

Kleiman R, Page D (2019). “AUC mu: A Performance Metric for Multi-Class Machine Learning Models.” In Chaudhuri, Kamalika, Salakhutdinov, Ruslan (eds.), Proceedings of the 36th International Conference on Machine Learning, volume 97 series Proceedings of Machine Learning Research, 3439–3447. PMLR.

See Also

Other Classification Measures: acc(), bacc(), ce(), logloss(), mbrier(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
prob = matrix(runif(3 * 10), ncol = 3)
colnames(prob) = levels(truth)
mauc_aunu(truth, prob)
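
# the other multiclass AUC variants share the same interface
mauc_au1u(truth, prob)
mauc_mu(truth, prob)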

Max Absolute Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

maxae(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Max Absolute Error is defined as

\max \left( \left| t_i - r_i \right| \right).

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
maxae(truth, response)

Max Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

maxse(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Max Squared Error is defined as

\max \left( t_i - r_i \right)^2.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
maxse(truth, response)

Multiclass Brier Score

Description

Measure to compare true observed labels with predicted probabilities in multiclass classification tasks.

Usage

mbrier(truth, prob, ...)

Arguments

truth

(factor())
True (observed) labels. The length must match the number of rows of prob, and the levels must match the column names of prob.

prob

(matrix())
Matrix of predicted probabilities, each column is a vector of probabilities for a specific class label. Columns must be named with levels of truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Brier score for multi-class classification problems with k labels is defined as

\frac{1}{n} \sum_{i=1}^n \sum_{j=1}^k (I_{ij} - p_{ij})^2.

I_{ij} is 1 if observation x_i has true label j, and 0 otherwise. p_{ij} is the probability that observation x_i belongs to class j.

Note that there also is the more common definition of the Brier score for binary classification problems in bbrier().

Value

Performance value as numeric(1).

Meta Information

References

Brier GW (1950). “Verification of forecasts expressed in terms of probability.” Monthly Weather Review, 78(1), 1–3. doi:10.1175/1520-0493(1950)078<0001:vofeit>2.0.co;2.

See Also

Other Classification Measures: acc(), bacc(), ce(), logloss(), mauc_aunu(), mcc(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
prob = matrix(runif(3 * 10), ncol = 3)
colnames(prob) = levels(truth)
mbrier(truth, prob)

Matthews Correlation Coefficient

Description

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Usage

mcc(truth, response, positive = NULL, ...)

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

positive

(character(1))
Name of the positive class in case of binary classification.

...

(any)
Additional arguments. Currently ignored.

Details

In the binary case, the Matthews Correlation Coefficient is defined as

\frac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP}) (\mathrm{TP} + \mathrm{FN}) (\mathrm{TN} + \mathrm{FP}) (\mathrm{TN} + \mathrm{FN})}},

where TP, FP, TN, and FN are the number of true positives, false positives, true negatives, and false negatives, respectively.

In the multi-class case, the Matthews Correlation Coefficient is defined for a multi-class confusion matrix C with K classes:

\frac{c \cdot s - \sum_k^K p_k \cdot t_k}{\sqrt{(s^2 - \sum_k^K p_k^2) \cdot (s^2 - \sum_k^K t_k^2)}},

where

s is the total number of observations,
c is the total number of observations correctly predicted,
t_k is the number of times class k truly occurred, and
p_k is the number of times class k was predicted.

The above formula is undefined if any of the four sums in the denominator is 0 in the binary case, and more generally if either s^2 - \sum_k^K p_k^2 or s^2 - \sum_k^K t_k^2 is equal to 0. The denominator is then set to 1.

When there are more than two classes, the MCC will no longer range between -1 and +1. Instead, the minimum value will be between -1 and 0 depending on the true distribution. The maximum value is always +1.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Phi_coefficient

Matthews BW (1975). “Comparison of the predicted and observed secondary structure of T4 phage lysozyme.” Biochimica et Biophysica Acta (BBA) - Protein Structure, 405(2), 442–451. doi:10.1016/0005-2795(75)90109-9.

See Also

Other Classification Measures: acc(), bacc(), ce(), logloss(), mauc_aunu(), mbrier(), zero_one()

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
mcc(truth, response)

Measure Registry

Description

The environment measures keeps track of all measures in this package. It stores meta information for each measure, such as the minimum and maximum value and whether the measure must be minimized or maximized.

Usage

measures

Format

An object of class environment of length 65.

Examples

names(measures)
measures$tpr
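
# each registry entry is a list of meta information; list its field names
names(measures$tpr)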

Median Absolute Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

medae(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Median Absolute Error is defined as

\mathop{\mathrm{median}} \left| t_i - r_i \right|.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
medae(truth, response)

Median Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

medse(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Median Squared Error is defined as

\mathop{\mathrm{median}} \left[ \left( t_i - r_i \right)^2 \right].

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
medse(truth, response)

Mean Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

mse(truth, response, sample_weights = NULL, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Mean Squared Error is defined as

\frac{1}{n} \sum_{i=1}^n w_i \left( t_i - r_i \right)^2,

where w_i are normalized sample weights.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
mse(truth, response)

Mean Squared Log Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

msle(truth, response, sample_weights = NULL, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Mean Squared Log Error is defined as

\frac{1}{n} \sum_{i=1}^n w_i \left( \ln (1 + t_i) - \ln (1 + r_i) \right)^2,

where w_i are normalized sample weights. This measure is undefined if any element of t or r is less than or equal to -1.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
msle(truth, response)

Negative Predictive Value

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

npv(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Negative Predictive Value is defined as

\frac{\mathrm{TN}}{\mathrm{FN} + \mathrm{TN}}.

This measure is undefined if FN + TN = 0.

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), ppv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
npv(truth, response, positive = "a")

Percent Bias

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

pbias(truth, response, sample_weights = NULL, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Percent Bias is defined as

\frac{1}{n} \sum_{i=1}^n w_i \frac{\left( t_i - r_i \right)}{\left| t_i \right|},

where w_i are normalized sample weights. Good predictions score close to 0.

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
pbias(truth, response)

Phi Coefficient Similarity

Description

Measure to compare two or more sets w.r.t. their similarity.

Usage

phi(sets, p, na_value = NaN, ...)

Arguments

sets

(list())
List of character or integer vectors. sets must have at least 2 elements.

p

(integer(1))
Total number of possible elements.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Phi Coefficient is defined as the Pearson correlation between the binary representation of two sets A and B. The binary representation for A is a logical vector of length p with the i-th element being 1 if the corresponding element is in A, and 0 otherwise.

If more than two sets are provided, the mean of all pairwise scores is calculated.

This measure is undefined if one set contains none or all possible elements.

Value

Performance value as numeric(1).

Meta Information

References

Nogueira S, Brown G (2016). “Measuring the Stability of Feature Selection.” In Machine Learning and Knowledge Discovery in Databases, 442–457. Springer International Publishing. doi:10.1007/978-3-319-46227-1_28.

Bommert A, Rahnenführer J, Lang M (2017). “A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data.” Computational and Mathematical Methods in Medicine, 2017, 1–18. doi:10.1155/2017/7907163.

Bommert A, Lang M (2021). “stabm: Stability Measures for Feature Selection.” Journal of Open Source Software, 6(59), 3010. doi:10.21105/joss.03010.

See Also

Package stabm which implements many more stability measures with included correction for chance.

Other Similarity Measures: jaccard()

Examples

set.seed(1)
sets = list(
  sample(letters[1:3], 1),
  sample(letters[1:3], 2)
)
phi(sets, p = 3)

Average Pinball Loss

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

pinball(truth, response, sample_weights = NULL, alpha = 0.5, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

alpha

(numeric(1))
The quantile for which to compute the pinball loss. Default is 0.5.

...

(any)
Additional arguments. Currently ignored.

Details

The pinball loss for quantile regression is defined as

\text{Average Pinball Loss} = \frac{1}{n} \sum_{i=1}^{n} w_{i} \begin{cases} q \cdot (t_i - r_i) & \text{if } t_i \geq r_i \\ (1 - q) \cdot (r_i - t_i) & \text{if } t_i < r_i \end{cases}

where q is the quantile (the alpha argument) and w_i are normalized sample weights.
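
The case distinction translates into a vectorized expression in base R; a minimal sketch assuming equal weights and the default q = 0.5 (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
q = 0.5
loss = ifelse(truth >= response,
  q * (truth - response),
  (1 - q) * (response - truth))
mean(loss) # should match pinball(truth, response, alpha = 0.5)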

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
pinball(truth, response)

Positive Predictive Value

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

ppv(truth, response, positive, na_value = NaN, ...)

precision(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Positive Predictive Value is defined as

\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}.

Also known as "precision".

This measure is undefined if TP + FP = 0.
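
The definition can be verified against the raw confusion counts; a minimal sketch with toy labels and positive class "a":

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
n_tp = sum(truth == "a" & response == "a")
n_fp = sum(truth == "b" & response == "a")
n_tp / (n_tp + n_fp) # should match ppv(truth, response, positive = "a")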

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

Goutte C, Gaussier E (2005). “A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation.” In Lecture Notes in Computer Science, 345–359. doi:10.1007/978-3-540-31865-1_25.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), prauc(), tn(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
ppv(truth, response, positive = "a")

Area Under the Precision-Recall Curve

Description

Measure to compare true observed labels with predicted probabilities in binary classification tasks.

Usage

prauc(truth, prob, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

prob

(numeric())
Predicted probability for the positive class. Must have the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

Computes the area under the Precision-Recall curve (PRC). The PRC can be interpreted as the relationship between precision and recall (sensitivity), and is considered to be a more appropriate measure for unbalanced datasets than the ROC curve. The area is computed by integrating the piecewise precision-recall curve.

This measure is undefined if the true values are either all positive or all negative.

Value

Performance value as numeric(1).

Meta Information

References

Davis J, Goadrich M (2006). “The relationship between precision-recall and ROC curves.” In Proceedings of the 23rd International Conference on Machine Learning. ISBN 9781595933836.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), tn(), tnr(), tp(), tpr()

Examples

truth = factor(c("a", "a", "a", "b"))
prob = c(.6, .7, .1, .4)
prauc(truth, prob, "a")

Relative Absolute Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rae(truth, response, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Relative Absolute Error is defined as

\frac{\sum_{i=1}^n \left| t_i - r_i \right|}{\sum_{i=1}^n \left| t_i - \bar{t} \right|},

where \bar{t} = \frac{1}{n} \sum_{i=1}^n t_i is the mean of the true values. This measure is undefined for constant t.

Can be interpreted as the absolute error of the predictions relative to a naive model predicting the mean.
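
A minimal base R sketch of the definition (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sum(abs(truth - response)) / sum(abs(truth - mean(truth))) # should match rae(truth, response)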

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rae(truth, response)

Regression Parameters

Description

Regression Parameters

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.


Root Mean Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rmse(truth, response, sample_weights = NULL, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

...

(any)
Additional arguments. Currently ignored.

Details

The Root Mean Squared Error is defined as

\sqrt{\frac{1}{n} \sum_{i=1}^n w_i \left( t_i - r_i \right)^2},

where w_i are normalized sample weights.
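
With equal sample weights the definition reduces to a one-liner; a minimal sketch (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sqrt(mean((truth - response)^2)) # should match rmse(truth, response)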

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rmse(truth, response)

Root Mean Squared Log Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rmsle(truth, response, sample_weights = NULL, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

sample_weights

(numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Root Mean Squared Log Error is defined as

\sqrt{\frac{1}{n} \sum_{i=1}^n w_i \left( \ln (1 + t_i) - \ln (1 + r_i) \right)^2},

where w_i are normalized sample weights.

This measure is undefined if any element of t or r is less than or equal to -1.
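
With equal weights this can be written with base R's log1p(), which computes ln(1 + x); a minimal sketch (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sqrt(mean((log1p(truth) - log1p(response))^2)) # should match rmsle(truth, response)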

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rmsle(truth, response)

Root Relative Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rrse(truth, response, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Root Relative Squared Error is defined as

\sqrt{\frac{\sum_{i=1}^n \left( t_i - r_i \right)^2}{\sum_{i=1}^n \left( t_i - \bar{t} \right)^2}},

where \bar{t} = \frac{1}{n} \sum_{i=1}^n t_i is the mean of the true values.

Can be interpreted as the root of the squared error of the predictions relative to a naive model predicting the mean.

This measure is undefined for constant t.
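
A minimal base R sketch of the definition (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sqrt(sum((truth - response)^2) / sum((truth - mean(truth))^2)) # should match rrse(truth, response)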

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rrse(truth, response)

Relative Squared Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rse(truth, response, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Relative Squared Error is defined as

\frac{\sum_{i=1}^n \left( t_i - r_i \right)^2}{\sum_{i=1}^n \left( t_i - \bar{t} \right)^2},

where \bar{t} = \frac{1}{n} \sum_{i=1}^n t_i is the mean of the true values.

Can be interpreted as the squared error of the predictions relative to a naive model predicting the mean.

This measure is undefined for constant t.
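
A minimal base R sketch of the definition (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sum((truth - response)^2) / sum((truth - mean(truth))^2) # should match rse(truth, response)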

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rsq(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rse(truth, response)

R Squared

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

rsq(truth, response, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

R Squared is defined as

1 - \frac{\sum_{i=1}^n \left( t_i - r_i \right)^2}{\sum_{i=1}^n \left( t_i - \bar{t} \right)^2},

where \bar{t} = \frac{1}{n} \sum_{i=1}^n t_i is the mean of the true values.

Also known as the coefficient of determination or explained variation. It equals 1 minus the rse() and hence compares the squared error of the predictions to that of a naive model predicting the mean.

This measure is undefined for constant t.
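
The relationship to rse() can be checked directly; a minimal sketch (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
1 - sum((truth - response)^2) / sum((truth - mean(truth))^2) # should match rsq(truth, response) and 1 - rse(truth, response)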

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), sae(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
rsq(truth, response)

Sum of Absolute Errors

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

sae(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Sum of Absolute Errors is defined as

\sum_{i=1}^n \left| t_i - r_i \right|.
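
A minimal base R sketch (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sum(abs(truth - response)) # should match sae(truth, response)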

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), se(), sle(), smape(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sae(truth, response)

Squared Error (per observation)

Description

Measure to compare true observed response with predicted response in regression tasks.

Note that this is an unaggregated measure, returning the losses per observation.

Usage

se(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

Calculates the per-observation squared error as

\left( t_i - r_i \right)^2.
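
A minimal base R sketch; note that the result is a vector with one loss per observation (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
(truth - response)^2 # should match se(truth, response)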

Value

Performance value as numeric(length(truth)).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), sle(), smape(), srho(), sse()


Similarity Parameters

Description

Similarity Parameters

Arguments

sets

(list())
List of character or integer vectors. sets must have at least 2 elements.

p

(integer(1))
Total number of possible elements.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.


Squared Log Error (per observation)

Description

Calculates the per-observation squared log error as

\left( \ln (1 + t_i) - \ln (1 + r_i) \right)^2.

Measure to compare true observed response with predicted response in regression tasks.

Note that this is an unaggregated measure, returning the losses per observation.
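
A minimal base R sketch using log1p(), which computes ln(1 + x); the result is one loss per observation (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
(log1p(truth) - log1p(response))^2 # should match sle(truth, response)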

Usage

sle(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Value

Performance value as numeric(length(truth)).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), smape(), srho(), sse()


Symmetric Mean Absolute Percent Error

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

smape(truth, response, na_value = NaN, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The Symmetric Mean Absolute Percent Error is defined as

\frac{2}{n} \sum_{i=1}^n \frac{\left| t_i - r_i \right|}{\left| t_i \right| + \left| r_i \right|}.

This measure is undefined if any |t_i| + |r_i| is equal to 0.
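
A minimal base R sketch of the definition (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
2 * mean(abs(truth - response) / (abs(truth) + abs(response))) # should match smape(truth, response)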

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), srho(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
smape(truth, response)

Spearman's rho

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

srho(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

Spearman's rho is defined as Spearman's rank correlation coefficient between truth and response. Calls stats::cor() with method set to "spearman".
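
Equivalently, via stats::cor() directly (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
cor(truth, response, method = "spearman") # should match srho(truth, response)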

Value

Performance value as numeric(1).

Meta Information

References

Rosset S, Perlich C, Zadrozny B (2006). “Ranking-based evaluation of regression models.” Knowledge and Information Systems, 12(3), 331–353. doi:10.1007/s10115-006-0037-3.

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), sse()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
srho(truth, response)

Sum of Squared Errors

Description

Measure to compare true observed response with predicted response in regression tasks.

Usage

sse(truth, response, ...)

Arguments

truth

(numeric())
True (observed) values. Must have the same length as response.

response

(numeric())
Predicted response values. Must have the same length as truth.

...

(any)
Additional arguments. Currently ignored.

Details

The Sum of Squared Errors is defined as

\sum_{i=1}^n \left( t_i - r_i \right)^2.
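
A minimal base R sketch (toy values):

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sum((truth - response)^2) # should match sse(truth, response)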

Value

Performance value as numeric(1).

Meta Information

See Also

Other Regression Measures: ae(), ape(), bias(), ktau(), linex(), mae(), mape(), maxae(), maxse(), medae(), medse(), mse(), msle(), pbias(), pinball(), rae(), rmse(), rmsle(), rrse(), rse(), rsq(), sae(), se(), sle(), smape(), srho()

Examples

set.seed(1)
truth = 1:10
response = truth + rnorm(10)
sse(truth, response)

True Negatives

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

tn(truth, response, positive, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

...

(any)
Additional arguments. Currently ignored.

Details

This measure counts the true negatives, i.e. the number of predictions correctly indicating a negative class label. This is sometimes also called a "correct rejection".
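
As a base R sketch, the count is a simple logical sum; toy labels with positive class "a" (so "b" is the negative class):

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
sum(truth == "b" & response == "b") # should match tn(truth, response, positive = "a")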

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tnr(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
tn(truth, response, positive = "a")

True Negative Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

tnr(truth, response, positive, na_value = NaN, ...)

specificity(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The True Negative Rate is defined as

\frac{\mathrm{TN}}{\mathrm{FP} + \mathrm{TN}}.

Also known as "specificity" or "selectivity".

This measure is undefined if FP + TN = 0.
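
The definition can be verified against the raw confusion counts; a minimal sketch with toy labels and positive class "a":

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
n_tn = sum(truth == "b" & response == "b")
n_fp = sum(truth == "b" & response == "a")
n_tn / (n_fp + n_tn) # should match tnr(truth, response, positive = "a")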

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tp(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
tnr(truth, response, positive = "a")

True Positives

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

tp(truth, response, positive, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

...

(any)
Additional arguments. Currently ignored.

Details

This measure counts the true positives, i.e. the number of predictions correctly indicating a positive class label. This is sometimes also called a "hit".
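
As a base R sketch, the count is a simple logical sum; toy labels with positive class "a":

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
sum(truth == "a" & response == "a") # should match tp(truth, response, positive = "a")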

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tpr()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
tp(truth, response, positive = "a")

True Positive Rate

Description

Measure to compare true observed labels with predicted labels in binary classification tasks.

Usage

tpr(truth, response, positive, na_value = NaN, ...)

recall(truth, response, positive, na_value = NaN, ...)

sensitivity(truth, response, positive, na_value = NaN, ...)

Arguments

truth

(factor())
True (observed) labels. Must have exactly the same two levels and the same length as response.

response

(factor())
Predicted response labels. Must have exactly the same two levels and the same length as truth.

positive

(character(1))
Name of the positive class.

na_value

(numeric(1))
Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.

...

(any)
Additional arguments. Currently ignored.

Details

The True Positive Rate is defined as

\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}.

This is also known as "recall", "sensitivity", or "probability of detection".

This measure is undefined if TP + FN = 0.
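
The definition can be verified against the raw confusion counts; a minimal sketch with toy labels and positive class "a":

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
n_tp = sum(truth == "a" & response == "a")
n_fn = sum(truth == "a" & response == "b")
n_tp / (n_tp + n_fn) # should match tpr(truth, response, positive = "a")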

Value

Performance value as numeric(1).

Meta Information

References

https://en.wikipedia.org/wiki/Template:DiagnosticTesting_Diagram

Goutte C, Gaussier E (2005). “A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation.” In Lecture Notes in Computer Science, 345–359. doi:10.1007/978-3-540-31865-1_25.

See Also

Other Binary Classification Measures: auc(), bbrier(), dor(), fbeta(), fdr(), fn(), fnr(), fomr(), fp(), fpr(), gmean(), gpr(), npv(), ppv(), prauc(), tn(), tnr(), tp()

Examples

set.seed(1)
lvls = c("a", "b")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
tpr(truth, response, positive = "a")

Zero-One Classification Loss (per observation)

Description

Calculates the per-observation 0/1 (zero-one) loss as

\mathbf{1} (t_i \neq r_i).

The 1/0 (one-zero) loss is equal to 1 minus the zero-one loss and is calculated as

\mathbf{1} (t_i = r_i).

Measure to compare true observed labels with predicted labels in multiclass classification tasks.

Note that this is an unaggregated measure, returning the losses per observation.
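
Both losses are simple vectorized comparisons in base R; a minimal sketch with toy labels:

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
response = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
as.numeric(truth != response) # should match zero_one(truth, response)
as.numeric(truth == response) # should match one_zero(truth, response)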

Usage

zero_one(truth, response, ...)

one_zero(truth, response, ...)

Arguments

truth

(factor())
True (observed) labels. Must have the same levels and length as response.

response

(factor())
Predicted response labels. Must have the same levels and length as truth.

...

(any)
Additional arguments. Currently ignored.

Value

Performance value as numeric(length(truth)).

Meta Information

See Also

Other Classification Measures: acc(), bacc(), ce(), logloss(), mauc_aunu(), mbrier(), mcc()