| Type: | Package | 
| Title: | Leave One Out Kernel Density Estimates for Outlier Detection | 
| Version: | 0.1.4 | 
| Maintainer: | Sevvandi Kandanaarachchi <sevvandik@gmail.com> | 
| Description: | Outlier detection using leave-one-out kernel density estimates and extreme value theory. The bandwidth for kernel density estimates is computed using persistent homology, a technique in topological data analysis. Using the peak-over-threshold method, a generalized Pareto distribution is fitted to the log of leave-one-out kde values to identify outliers. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.1 | 
| Imports: | TDAstats, evd, RANN, ggplot2, tidyr | 
| Suggests: | knitr, rmarkdown | 
| URL: | https://sevvandi.github.io/lookout/ | 
| NeedsCompilation: | no | 
| Packaged: | 2022-10-13 23:23:55 UTC; kan092 | 
| Author: | Sevvandi Kandanaarachchi | 
| Repository: | CRAN | 
| Date/Publication: | 2022-10-14 00:10:02 UTC | 
lookout: Leave One Out Kernel Density Estimates for Outlier Detection
Description
 
Outlier detection using leave-one-out kernel density estimates and extreme value theory. The bandwidth for kernel density estimates is computed using persistent homology, a technique in topological data analysis. Using the peak-over-threshold method, a generalized Pareto distribution is fitted to the log of leave-one-out kde values to identify outliers.
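To illustrate what a leave-one-out kernel density estimate is, the following base R sketch compares the ordinary KDE with the leave-one-out KDE at each observation. It is not the package's implementation: the function name `loo_kde`, the Gaussian kernel, the univariate data and the fixed bandwidth `h = 0.5` are all assumptions made for the example.

```r
# Illustrative leave-one-out KDE (Gaussian kernel and fixed bandwidth are assumptions,
# not the choices made inside the lookout package)
loo_kde <- function(x, h) {
  n <- length(x)
  # Matrix of kernel weights K((x_i - x_j) / h) / h for all pairs (i, j)
  k <- dnorm(outer(x, x, "-") / h) / h
  kde <- rowSums(k) / n                       # ordinary KDE at each observation
  lookde <- (rowSums(k) - diag(k)) / (n - 1)  # drop each point's own contribution
  list(kde = kde, lookde = lookde)
}

set.seed(1)
x <- c(rnorm(200), 10)       # 200 typical points plus one isolated point at 10
dens <- loo_kde(x, h = 0.5)

# For the isolated point, the ordinary KDE is propped up by its own kernel,
# while the leave-one-out value is far smaller
dens$kde[201]
dens$lookde[201]
```

Removing a point's own kernel contribution is what lets an isolated observation stand out: its ordinary density estimate never drops below its own kernel's height, whereas its leave-one-out estimate can be close to zero.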
Author(s)
Maintainer: Sevvandi Kandanaarachchi sevvandik@gmail.com (ORCID)
Authors:
- Rob Hyndman rob.hyndman@monash.edu (ORCID) 
Other contributors:
- Chris Fraley fraley@u.washington.edu [contributor] 
See Also
Useful links:
- https://sevvandi.github.io/lookout/
Plots outliers identified by the lookout algorithm.
Description
Scatterplot of two columns from the data set with outliers highlighted.
Usage
## S3 method for class 'lookoutliers'
autoplot(object, columns = 1:2, ...)
Arguments
| object | The output of the function 'lookout'. | 
| columns | Which columns of the original data to plot (specified as either numbers or strings) | 
| ... | Other arguments currently ignored. | 
Value
A ggplot object.
Examples
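# 500 standard normal points plus a tight cluster of 5 points near (10, 10)
X <- rbind(
  data.frame(x = rnorm(500),
             y = rnorm(500)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2),
             y = rnorm(5, mean = 10, sd = 0.2))
)
# Run lookout with the default settings
lo <- lookout(X)
# Scatterplot of the first two columns with the outliers highlighted
autoplot(lo)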
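Because the method returns a ggplot object, the plot can be adjusted with the usual ggplot2 layers. The sketch below assumes the lookout and ggplot2 packages are installed; the three-column data set and the plot title are made up for illustration, and `columns` is given by position to pick the first and third columns.

```r
library(lookout)
library(ggplot2)

set.seed(1)
# Hypothetical three-column data set with one point far from the rest
X <- data.frame(
  x = c(rnorm(200), 10),
  y = c(rnorm(200), 10),
  z = c(rnorm(200), 10)
)
lo <- lookout(X)

# Plot columns 1 and 3 instead of the default 1:2, then add a title
p <- autoplot(lo, columns = c(1, 3)) +
  labs(title = "lookout outliers: x against z")
p
```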
Plots outlier persistence for a range of significance levels.
Description
This function plots outlier persistence for a range of significance levels using the algorithm lookout, an outlier detection method that uses leave-one-out kernel density estimates and generalized Pareto distributions to find outliers.
Usage
## S3 method for class 'persistingoutliers'
autoplot(object, alpha = object$alpha, ...)
Arguments
| object | The output of the function 'persisting_outliers'. | 
| alpha | The significance levels to plot. | 
| ... | Other arguments currently ignored. | 
Value
A ggplot object.
Examples
X <- rbind(
  data.frame(
    x = rnorm(500),
    y = rnorm(500)
  ),
  data.frame(
    x = rnorm(5, mean = 10, sd = 0.2),
    y = rnorm(5, mean = 10, sd = 0.2)
  )
)
plot(X, pch = 19)
outliers <- persisting_outliers(X, unitize = FALSE)
autoplot(outliers)
Identifies outliers using the algorithm lookout.
Description
This function identifies outliers using the algorithm lookout, an outlier detection method that uses leave-one-out kernel density estimates and generalized Pareto distributions to find outliers.
Usage
lookout(X, alpha = 0.05, unitize = TRUE, bw = NULL, gpd = NULL, fast = TRUE)
Arguments
| X | The input data in a dataframe, matrix or tibble format. | 
| alpha | The level of significance. Default is 0.05. | 
| unitize | An option to normalize the data. Default is 'TRUE'. | 
| bw | Bandwidth parameter. Default is 'NULL', in which case the bandwidth is selected using persistent homology. | 
| gpd | Generalized Pareto distribution parameters. If 'NULL' (the default), these are estimated from the data. | 
| fast | If set to 'TRUE' (the default), a faster computation is used. | 
Value
A list with the following components:
| outliers | The set of outliers. | 
| outlier_probability | The GPD probability of the data. | 
| outlier_scores | The outlier scores of the data. | 
| bandwidth | The bandwidth selected using persistent homology. | 
| kde | The kernel density estimate values. | 
| lookde | The leave-one-out kde values. | 
| gpd | The fitted GPD parameters. | 
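As a rough illustration of how a peak-over-threshold step can turn scores into tail probabilities such as the `outlier_probability` component, the base R sketch below fits a generalized Pareto distribution to exceedances over a threshold by maximum likelihood. It is a sketch under stated assumptions, not the package's code: the helper names `gpd_nll` and `gpd_surv`, the simulated scores, the 0.90 threshold quantile and the use of `optim` are all choices made for the example.

```r
# Negative log-likelihood of a generalized Pareto distribution for exceedances y > 0
gpd_nll <- function(par, y) {
  sigma <- exp(par[1])  # scale, kept positive through a log parameterisation
  xi <- par[2]          # shape
  z <- 1 + xi * y / sigma
  if (any(z <= 0)) return(Inf)  # outside the support of the GPD
  if (abs(xi) < 1e-8) return(length(y) * log(sigma) + sum(y) / sigma)
  length(y) * log(sigma) + (1 / xi + 1) * sum(log(z))
}

# GPD survival function P(Y > y)
gpd_surv <- function(y, sigma, xi) {
  if (abs(xi) < 1e-8) return(exp(-y / sigma))
  pmax(1 + xi * y / sigma, 0)^(-1 / xi)
}

set.seed(1)
scores <- c(rexp(500), 15)    # hypothetical scores; larger means more unusual
u <- quantile(scores, 0.90)   # threshold (the 0.90 level is an assumption)
y <- scores[scores > u] - u   # exceedances over the threshold

fit <- optim(c(log(mean(y)), 0.1), gpd_nll, y = y)
sigma <- exp(fit$par[1])
xi <- fit$par[2]

# Tail probability of each exceedance; values below alpha are flagged
p <- gpd_surv(y, sigma, xi)
alpha <- 0.05
scores[scores > u][p < alpha]
```

In this sketch the probabilities are conditional on exceeding the threshold, so they only rank the points already in the upper tail.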
Examples
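# 500 standard normal points plus a tight cluster of 5 points near (10, 10)
X <- rbind(
  data.frame(x = rnorm(500),
             y = rnorm(500)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2),
             y = rnorm(5, mean = 10, sd = 0.2))
)
# Run lookout with the default settings
lo <- lookout(X)
# Print the result
lo
# Scatterplot of the first two columns with the outliers highlighted
autoplot(lo)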
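The components listed under Value can be read off the returned list by name. The following sketch assumes the lookout package is installed and uses a smaller simulated data set; it inspects a few components and then passes the selected bandwidth back through the `bw` argument so that a second run with a different `alpha` skips the bandwidth selection. The exact structure of `outliers` is not described above, so `str()` is used rather than assuming one.

```r
library(lookout)

set.seed(1)
X <- rbind(
  data.frame(x = rnorm(200), y = rnorm(200)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2),
             y = rnorm(5, mean = 10, sd = 0.2))
)
lo <- lookout(X)

lo$bandwidth                 # bandwidth selected using persistent homology
str(lo$outliers)             # the set of outliers (structure inspected, not assumed)
summary(lo$outlier_scores)   # outlier scores of the data
summary(lo$lookde)           # leave-one-out kde values

# Reuse the selected bandwidth with a stricter significance level
lo_strict <- lookout(X, alpha = 0.01, bw = lo$bandwidth)
str(lo_strict$outliers)
```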
Identifies outliers in univariate time series using the algorithm lookout.
Description
This is the time series implementation of lookout.
Usage
lookout_ts(x, alpha = 0.05)
Arguments
| x | The input univariate time series. | 
| alpha | The level of significance. Default is 0.05. | 
Value
A lookout object.
See Also
lookout
Examples
set.seed(1)
x <- arima.sim(list(order = c(1,1,0), ar = 0.8), n = 200)
x[50] <- x[50] + 10
plot(x)
lo <- lookout_ts(x)
lo
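The effect of `alpha` can be explored by running the function at two significance levels and comparing the flagged points. This is a sketch that assumes the lookout package is installed; the two levels 0.01 and 0.10 are arbitrary choices for illustration.

```r
library(lookout)

set.seed(1)
x <- arima.sim(list(order = c(1, 1, 0), ar = 0.8), n = 200)
x[50] <- x[50] + 10   # inject a spike at time 50

# A smaller alpha is stricter, a larger alpha is more permissive
strict <- lookout_ts(x, alpha = 0.01)
loose <- lookout_ts(x, alpha = 0.10)

strict$outliers
loose$outliers
```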
Computes outlier persistence for a range of significance values.
Description
This function computes outlier persistence for a range of significance values, using the algorithm lookout, an outlier detection method that uses leave-one-out kernel density estimates and generalized Pareto distributions to find outliers.
Usage
persisting_outliers(
  X,
  alpha = seq(0.01, 0.1, by = 0.01),
  st_qq = 0.9,
  unitize = TRUE,
  num_steps = 20
)
Arguments
| X | The input data in a matrix, data.frame, or tibble format. All columns should be numeric. | 
| alpha | Grid of significance levels. | 
| st_qq | The starting quantile for death radii sequence. This will be used to compute the starting bandwidth value. | 
| unitize | An option to normalize the data. Default is 'TRUE'. | 
| num_steps | The length of the bandwidth sequence. | 
Value
A list with the following components:
| out | A 3D array of  | 
| bw | The set of bandwidth values. | 
| gpdparas | The GPD parameters used. | 
| lookoutbw | The bandwidth chosen by the algorithm 'lookout'. | 
Examples
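# 500 standard normal points plus a tight cluster of 5 points near (10, 10)
X <- rbind(
  data.frame(x = rnorm(500),
             y = rnorm(500)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2),
             y = rnorm(5, mean = 10, sd = 0.2))
)
plot(X, pch = 19)
# Compute outlier persistence without normalizing the data
outliers <- persisting_outliers(X, unitize = FALSE)
# Print the result
outliers
# Plot outlier persistence across the significance levels
autoplot(outliers)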
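The returned list can also be inspected component by component. The sketch below assumes the lookout package is installed; it uses a smaller data set, a short grid of significance levels and `num_steps = 10` purely to keep the run quick, and it prints the shape of `out` with `dim()` instead of assuming what each dimension holds.

```r
library(lookout)

set.seed(1)
X <- rbind(
  data.frame(x = rnorm(200), y = rnorm(200)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2),
             y = rnorm(5, mean = 10, sd = 0.2))
)

po <- persisting_outliers(
  X,
  alpha = c(0.01, 0.05, 0.1),  # a shorter grid than the default
  unitize = FALSE,
  num_steps = 10               # a shorter bandwidth sequence than the default
)

dim(po$out)    # shape of the 3D array
po$bw          # the set of bandwidth values
po$lookoutbw   # the bandwidth chosen by lookout
```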
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- ggplot2