Help for package graphicalExtremes

Title:

Statistical Methodology for Graphical Extreme Value Models

Version:

0.3.4

Maintainer:

Sebastian Engelke <sebastian.engelke@unige.ch>

Description:

Statistical methodology for sparse multivariate extreme value models. Methods are provided for exact simulation and statistical inference for multivariate Pareto distributions on graphical structures as described in the paper 'Graphical Models for Extremes' by Engelke and Hitz (2020) <doi:10.1111/rssb.12355>.

Depends:

R (≥ 3.6.0)

License:

GPL-3

Encoding:

UTF-8

LazyData:

true

LazyDataCompression:

bzip2

Suggests:

testthat (≥ 2.1.0), knitr, rmarkdown, dplyr, ggplot2, bookdown, edmcr, maps

RoxygenNote:

7.3.2

Imports:

igraph (≥ 1.2.4.1), mvtnorm (≥ 1.0.10), Rdpack, stats (≥ 3.6.0), utils, corpcor, osqp, glmnet, glassoFast, CVXR

URL:

https://github.com/sebastian-engelke/graphicalExtremes

BugReports:

https://github.com/sebastian-engelke/graphicalExtremes/issues

RdMacros:

Rdpack

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2025-06-13 11:01:33 UTC; manuel

Author:

Sebastian Engelke [aut, cre], Adrien S. Hitz [aut], Nicola Gnecco [aut], Manuel Hentschel [aut]

Repository:

CRAN

Date/Publication:

2025-06-13 12:30:07 UTC

graphicalExtremes: Statistical methodology for graphical extreme value models.

Description

An implementation of the statistical methodology paper Engelke and Hitz (2020) for sparse multivariate extreme value models. Includes exact simulation algorithms and statistical inference methods for multivariate Pareto distributions on graphical structures. Also contains implementations of statistical methods from Engelke and Volgushev (2022), Röttger et al. (2021), and Hentschel et al. (2022).

Details

The following global options are used by functions in the package. Their values can be changed using base::options().

"graphicalExtremes.mc.cores": The (maximal) number of cores to use in parallel tasks. Will always be overwritten by 1 on Windows.
"graphicalExtremes.tol.small": The "small" tolerance is used in internal computations for values that should mathematically be exactly equal to zero, but deviate due to inherent limitations of numerical computations. This value is used e.g. when checking matrices for symmetry and definiteness. In general, this value is used only as a "permissive" tolerance, in the sense that if a value has to be positive, it is compared to actual zero, but if it has to be zero, its absolute value is compared to this tolerance.
"graphicalExtremes.tol.large": The "large" tolerance is used for values that converge to zero, but are mathematically not supposed to be equal to zero. This value is used e.g. when converting a precision matrix \Theta to an adjacency matrix of a graph.
⁠"graphicalExtremes.default.alert⁠: The default alert function to be used in validity checks of Huesler-Reiss parameter matrix transformations. Can be a function that takes an arbitrary number of strings as arguments (e.g. cat(), stop()), FALSE to ignore the alerts, or TRUE/NULL to use the default function warning().

Author(s)

Maintainer: Sebastian Engelke sebastian.engelke@unige.ch

Authors:

Adrien S. Hitz
Nicola Gnecco
Manuel Hentschel

References

Engelke S, Hitz AS (2020). “Graphical models for extremes (with discussion).” J. R. Stat. Soc. Ser. B Stat. Methodol., 82, 871–932.

Engelke S, Volgushev S (2022). “Structure learning for extremal tree models.” J. R. Stat. Soc. Ser. B Stat. Methodol.. doi:10.1111/rssb.12556, Forthcoming, https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/rssb.12556.

Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.

Röttger F, Engelke S, Zwiernik P (2021). “Total positivity in multivariate extremes.” doi:10.48550/ARXIV.2112.14727, https://arxiv.org/abs/2112.14727.

Conversion between Huesler-Reiss parameter matrices

Description

Converts between different matrices that parametrize the same Huesler-Reiss distribution: \Gamma, \Sigma, \Theta, \Sigma^k, \Theta^k. The (d-1) \times (d-1) matrices \Sigma^k and \Theta^k can also be given/returned as d \times d matrices with the kth row and column filled with zeros.

Usage

Gamma2Sigma(Gamma, k = NULL, full = FALSE, check = TRUE)

Gamma2Theta(Gamma, k = NULL, full = FALSE, check = TRUE)

Sigma2Gamma(Sigma, k = NULL, full = FALSE, check = TRUE)

Theta2Gamma(Theta, k = NULL, full = FALSE, check = TRUE)

Sigma2Theta(
  Sigma,
  k1 = NULL,
  k2 = NULL,
  full1 = FALSE,
  full2 = FALSE,
  check = TRUE
)

Theta2Sigma(
  Theta,
  k1 = NULL,
  k2 = NULL,
  full1 = FALSE,
  full2 = FALSE,
  check = TRUE
)

Theta2Theta(
  Theta,
  k1 = NULL,
  k2 = NULL,
  full1 = FALSE,
  full2 = FALSE,
  check = TRUE
)

Sigma2Sigma(
  Sigma,
  k1 = NULL,
  k2 = NULL,
  full1 = FALSE,
  full2 = FALSE,
  check = TRUE
)

Gamma2Gamma(Gamma, check = TRUE)

matrix2matrix(
  M,
  name1 = c("Gamma", "Sigma", "Theta")[1],
  name2 = c("Gamma", "Sigma", "Theta")[1],
  k1 = NULL,
  k2 = NULL,
  full1 = FALSE,
  full2 = FALSE,
  check = TRUE
)

Arguments

Gamma

Numeric d \times d variogram matrix.

k

NULL if the input/output matrix is \Sigma/\Theta. Else, an integer between 1 and d indicating the value of k in \Sigma^k, \Theta^k.

full

Logical. If TRUE and !is.null(k), the input/output matrix is a d \times d matrix with the kth row filled with zeros.

check

Whether to check the inputs and call ensure_matrix_symmetry_and_truncate_zeros on the outputs.

Sigma

Numeric d \times d or (d-1) \times (d-1) covariance matrix.

Theta

Numeric d \times d or (d-1) \times (d-1) precision matrix.

k1

NULL if the input matrix is \Sigma/\Theta. Else, an integer between 1 and d indicating the value of k in \Sigma^k, \Theta^k.

k2

NULL if the output matrix is \Sigma/\Theta. Else, an integer between 1 and d indicating the value of k in \Sigma^k, \Theta^k.

full1

Logical. If TRUE and !is.null(k1), the input is a d \times d matrix with the kth row filled with zeros.

full2

Logical. If TRUE and !is.null(k2), the output is a d \times d matrix with the kth row filled with zeros.

M

Numeric matrix, \Gamma, \Sigma, or \Theta.

name1

Name of the input representation.

name2

Name of the output representation.

Details

If k, k1, or k2 is NULL, the corresponding ⁠full*⁠ argument is ignored.

Gamma2Gamma only checks and returns the input.

matrix2matrix is a wrapper function that calls the corresponding conversion function implied by name1, name2.

Value

The desired parameter matrix corresponding to the specified inputs.

Compute theoretical `\chi` in 3D

Description

Computes the theoretical \chi coefficient in 3 dimensions.

Usage

Gamma2chi_3D(Gamma)

Arguments

Gamma

Numeric 3 \times 3 matrix.

Value

The 3-dimensional \chi coefficient, i.e., the extremal correlation coefficient for the HR distribution. Note that 0 \leq \chi \leq 1.

Convert matrix to graph

Description

Creates a graph object representing the graph structure implied by a parameter matrix.

Usage

Gamma2graph(Gamma, tol = get_large_tol(), check = TRUE)

Sigma2graph(Sigma, tol = get_large_tol(), k = NULL, full = FALSE, check = TRUE)

Theta2graph(Theta, tol = get_large_tol(), k = NULL, full = FALSE, check = TRUE)

partialMatrixToGraph(M)

Arguments

Gamma

Numeric d \times d variogram matrix.

tol

Numeric scalar. Entries in the precision matrix with absolute value smaller than this are considered to be zero.

check

Whether to check the inputs and call ensure_matrix_symmetry_and_truncate_zeros on the outputs.

Sigma

Numeric d \times d or (d-1) \times (d-1) covariance matrix.

k

NULL if the input/output matrix is \Sigma/\Theta. Else, an integer between 1 and d indicating the value of k in \Sigma^k, \Theta^k.

full

Logical. If TRUE and !is.null(k), the input/output matrix is a d \times d matrix with the kth row filled with zeros.

Theta

Numeric d \times d or (d-1) \times (d-1) precision matrix.

M

Partial matrix with NA entries indicating missing edges.

Value

An igraph::graph object.

Compute exponent measure

Description

Computes the exponent measure of HR distribution.

Usage

V_HR(x, Gamma = NULL, Theta = NULL)

Arguments

x

Numeric vector with d positive elements where the exponent measure is to be evaluated.

Gamma

d x d variogram matrix or numeric vector with d(d-1)/2 elements, containing the upper triangular part of a variogram matrix.

Theta

d x d precision matrix or numeric vector with d(d-1)/2 elements, containing the upper triangular part of a precision matrix.

Details

Only Gamma is needed for the computation. Theta is only used to compute Gamma if necessary.

Value

Numeric. The exponent measure of the HR distribution.

Computes the Z-matrix

Description

Copied from the R package "golazo" with kind permission by Piotr Zwiernik piotr.zwiernik@utoronto.ca. This function outputs the Z matrix, that is, the unique ultrametric matrix dominating S. This matrix is used to construct a starting point in the GOLAZO algorithm when L=0 but U has strictly positive (off-diagonal entries).

Usage

Zmatrix(S)

Arguments

S

a covariance matrix

Censor dataset

Description

Censors each row of matrix x with vector p.

Usage

censor(x, p)

Arguments

x

Numeric n \times d matrix.

p

Numeric vector with d elements.

Value

Numeric n \times d matrix.

HR parameter matrix checks

Description

Checks whether the matrix given is a valid Huesler-Reiss parameter matrix in the role of \Gamma, \Theta, or \Sigma, respectively.

Usage

checkGamma(Gamma, alert = NULL, tol = get_small_tol(), returnBoolean = FALSE)

checkSigmaTheta(
  M,
  k,
  full,
  matrixName = "Sigma",
  tol = get_small_tol(),
  alert = NULL,
  returnBoolean = FALSE
)

checkTheta(
  Theta,
  k = NULL,
  full = FALSE,
  tol = get_small_tol(),
  alert = NULL,
  returnBoolean = FALSE
)

checkSigma(
  Sigma,
  k = NULL,
  full = FALSE,
  tol = get_small_tol(),
  alert = NULL,
  returnBoolean = FALSE
)

checkMatrix(
  M,
  name = c("Gamma", "Sigma", "Theta")[1],
  k = NULL,
  full = FALSE,
  tol = get_small_tol(),
  alert = NULL,
  returnBoolean = FALSE
)

is_valid_Gamma(M, tol = get_small_tol())

is_valid_Theta(Theta, k = NULL, full = FALSE, tol = get_small_tol())

is_valid_Sigma(Sigma, k = NULL, full = FALSE, tol = get_small_tol())

Arguments

Gamma

Numeric d \times d variogram matrix.

alert

Passed to get_alert_function: NULL or TRUE to read the option value, FALSE to return a dummy function, or a function that takes an arbitrary number of strings as arguments (e.g. stop()).

tol

Numeric scalar. Values below this are considered as zero, when zeros are required (e.g. row-sums).

returnBoolean

Logical scalar, set to TRUE to return a boolean instead of the (adjusted) input.

M

Numeric matrix, \Gamma, \Sigma, or \Theta.

k

NULL if the input/output matrix is \Sigma/\Theta. Else, an integer between 1 and d indicating the value of k in \Sigma^k, \Theta^k.

full

Logical. If TRUE and !is.null(k), the input/output matrix is a d \times d matrix with the kth row filled with zeros.

matrixName

Name of the matrix to be used in alerts/error messages.

Theta

Numeric d \times d or (d-1) \times (d-1) precision matrix.

Sigma

Numeric d \times d or (d-1) \times (d-1) covariance matrix.

name

Name of the input matrix, indicating which other function to call.

Details

The function ⁠is_valid_*⁠ are a wrapper around ⁠check*⁠, with arguments alert=FALSE and returnBoolean=TRUE.

Value

For ⁠check*⁠, the input matrix, passed through ensure_matrix_symmetry_and_truncate_zeros.

For ⁠is_valid_*⁠, a boolean indicating whether the input is a valid parameter matrix.

Check input graph

Description

Checks that the input graph is a valid graph for an extremal graphical model. If necessary, converts the graph to an undirected graph. Removes vertex labels if present.

Usage

check_graph(
  graph,
  graph_type = c("general", "decomposable", "block", "tree"),
  check_connected = TRUE,
  nVertices = NULL
)

Arguments

graph

An [igraph::graph] object.

graph_type

"general", "decomposable", "block", "tree". The required type of graph.

check_connected

Whether to check if the graph is connected.

nVertices

The number of vertices required in the graph.

Value

The given graph, if necessary converted to undirected. If the graph is not valid an error is thrown.

Check input graph and partial matrix

Description

Checks and converts the partial matrix and graph given for a HR graphical model.

Usage

check_partial_matrix_and_graph(M, graph = NULL, graph_type = "general")

Arguments

M

A partial matrix or a vector of entries corresponding to the edges of graph

graph

A graph object or NULL if the graph structure is implied by the NA structure of M

graph_type

Passed to check_graph().

Value

A list consisting of

matrix

The matrix given as input or implied by the input

graph

The graph given as input or implied by the input

Throws an error if the input is not valid.

Identify pairs of vertices that are split by a separator

Description

Identify pairs of vertices that are split by a separator

Usage

check_split_by_sep(graph, sep, edgeMat)

Arguments

graph

A graph

sep

A set of vertex ids that are used to split the graph

edgeMat

A two-column matrix, containing the vertex-paris to be checked

Value

A logical vector, indicating for each edge whether it is split by sep

Transformation between `\chi` and `\Gamma`

Description

Transforms between the extremal correlation \chi and the variogram \Gamma. Only valid for Huesler-Reiss distributions. Done element-wise, no checks of the entire matrix structure are performed.

Usage

chi2Gamma(chi)

Gamma2chi(Gamma)

Arguments

chi

Numeric vector or matrix with entries between 0 and 1.

Gamma

Numeric vector or matrix with non-negative entries.

Details

The formula for transformation from \chi to \Gamma is element-wise

\Gamma = (2 \Phi^{-1}(1 - 0.5 \chi))^2,

where \Phi^{-1} is the inverse of the standard normal distribution function.

The formula for transformation from \Gamma to \chi is element-wise

\chi = 2 - 2 \Phi(\sqrt{\Gamma} / 2),

where \Phi is the standard normal distribution function.

Value

Numeric vector or matrix containing the implied \Gamma.

Numeric vector or matrix containing the implied \chi.

Completion of Gamma matrices

Description

Given a graph and a (partial) variogram matrix Gamma, returns a full variogram matrix that agrees with Gamma in entries corresponding to edges of graph and whose corresponding precision matrix, obtained by Gamma2Theta(), has zeros in entries corresponding to non-edges of graph. For results on the existence and uniqueness of this completion, see Hentschel et al. (2022).

Usage

complete_Gamma(Gamma, graph = NULL, ...)

Arguments

Gamma

Numeric d \times d variogram matrix.

graph

NULL or igraph::graph object. If NULL, the graph is implied by non-edge entries in Gamma being NA. Must be connected, undirected.

...

Further arguments passed to complete_Gamma_general_split() if graph is not decomposable

Details

If graph is decomposable, Gamma only needs to be specified on the edges of the graph, other entries are ignored. If graph is not decomposable, the graphical completion algorithm requires a fully specified (but non-graphical) variogram matrix Gamma to begin with.

If not initial completion is provided, the function edmcr::npf() can be used to compute one. The package edmcr might need to be installed manually from GitHub.

Value

Completed d \times d variogram matrix.

References

Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.

Examples

## Block graph:
Gamma <- rbind(
  c(0, .5, NA, NA),
  c(.5, 0, 1, 1.5),
  c(NA, 1, 0, .8),
  c(NA, 1.5, .8, 0)
)

complete_Gamma(Gamma)

## Alternative representation of the same completion problem:
my_graph <- igraph::graph_from_adjacency_matrix(rbind(
  c(0, 1, 0, 0),
  c(1, 0, 1, 1),
  c(0, 1, 0, 1),
  c(0, 1, 1, 0)
), mode = "undirected")
Gamma_vec <- c(.5, 1, 1.5, .8)
complete_Gamma(Gamma_vec, my_graph)

## Decomposable graph:
G <- rbind(
c(0, 5, 7, 6, NA),
c(5, 0, 14, 15, NA),
c(7, 14, 0, 5, 5),
c(6, 15, 5, 0, 6),
c(NA, NA, 5, 6, 0)
)

complete_Gamma(G)

## Non-decomposable graph:
G <- rbind(
c(0, 5, 7, 6, 6),
c(5, 0, 14, 15, 13),
c(7, 14, 0, 5, 5),
c(6, 15, 5, 0, 6),
c(6, 13, 5, 6, 0)
)
g <- igraph::make_ring(5)

complete_Gamma(G, g)

Completion of decomposable Gamma matrices

Description

Given a decomposable graph and incomplete variogram matrix Gamma, returns the full Gamma matrix implied by the conditional independencies.

Usage

complete_Gamma_decomposable(Gamma, graph = NULL)

Arguments

Gamma

A variogram matrix that is specified on the edges of graph and the diagonals. All other entries are ignored (if graph is specified), or should be NA to indicate non-edges in graph.

graph

NULL or a decomposable [igraph::graph] object. If NULL, the structure of NA entries in Gamma is used instead.

Value

A complete variogram matrix that agrees with Gamma on the entries corresponding to edges in graph and the diagonals. The corresponding \Theta matrix produced by Gamma2Theta() has zeros in the remaining entries.

Non-decomposable completion of variogram matrices

Description

Usage

complete_Gamma_general(
  Gamma,
  graph,
  N = 10000,
  tol = get_large_tol(),
  check_tol = 100
)

Arguments

Gamma

Numeric d \times d variogram matrix.

graph

igraph::graph() object.

N

Maximum number of iterations.

tol

Numeric scalar. Tolerance to be used when completing submatrices.

check_tol

Numeric/integer scalar. How often to check the tolerance when completing submatrices.

Value

A completed d \times d variogram matrix.

DEMO-VERSION: Completion of non-decomposable Gamma matrices

Description

Given a graph and variogram matrix Gamma, returns the full Gamma matrix implied by the conditional independencies. DEMO VERSION: Returns a lot of details and allows specifying the graph list that is used. Is way slower than other functions.

Usage

complete_Gamma_general_demo(Gamma, graph, N = 1000, tol = 0, gList = NULL)

Arguments

Gamma

A complete variogram matrix (without any graphical structure).

graph

An igraph::graph object.

N

The maximal number of iterations of the algorithm.

tol

The tolerance to use when checking for zero entries in Theta.

gList

A list of graphs to be used instead of the output from make_sep_list().

Value

A nested list, containing the following details. The "error term" is the maximal absolute value of Theta in a non-edge entry.

graph, N, tol

As in the input

gList

As in the input or computed by make_sep_list().

Gamma0, Theta0, err0

Initial Gamma, Theta, and error term.

iterations

A nested list, containing the following infos for each performed iteration:

n: Number of the iteration
t: Corresponding index in gList
g: The graph used
Gamma, Theta, err: The value of Gamma, Theta, and error term after the iteration

Non-decomposable completion of variogram matrices

Description

Given a non-decomposable graph, and (non-graphical) variogram matrix Gamma, modifies Gamma in non-edge entries, such that the resulting matrix is a variogram matrix with graphical structure described by graph. Does so by splitting graph at complete separators into smaller subgraphs, and calling complete_Gamma_general for each subgraph/submatrix, using multiple cores if available.

Usage

complete_Gamma_general_split(
  Gamma,
  graph,
  N = 10000,
  sub_tol = get_large_tol() * 0.001,
  check_tol = 100,
  mc_cores_overwrite = NULL,
  final_tol = get_large_tol()
)

Arguments

Gamma

Numeric d \times d variogram matrix.

graph

igraph::graph() object.

N

Maximum number of iterations.

sub_tol

Numeric scalar. Tolerance to be used when completing submatrices. Should be smaller than final_tol.

check_tol

Numeric/integer scalar. How often to check the tolerance when completing submatrices.

mc_cores_overwrite

NULL or numeric/integer scalar. Maximal number of cores to use.

final_tol

Numeric scalar. Check convergence of the final result with this tolerance. Skipped if this value is < 0.

Value

A completed d \times d variogram matrix.

Completion of two-clique decomposable Gamma matrices

Description

Given a decomposable graph consisting of two cliques and incomplete variogram matrix Gamma, returns the full Gamma matrix implied by the conditional independencies. The rows/columns of Gamma must be ordered such that the clique of size nA (excluding separator) comes first, then the separator of size nC, and then the remaining nB vertices.

Usage

complete_Gamma_one_step(Gamma, nA, nC, nB)

Compute plot limits

Description

Helper function to compute the axis limits of a plot with given x, y data and optionally a fixed x-y-ratio and correcting the latitude/longitude scale at different latitudes

Usage

computeLimits(xData, yData, xyRatio = 1, convertLatLong = TRUE, stretch = 1)

Upper Danube basin dataset

Description

A dataset containing river discharge data for tributaries of the Danube.

Usage

danube

Format

A named list with four entries

data_clustered: A numeric matrix, containing pre-processed discharge data for each gauging station
data_raw: A numeric matrix, containing daily (raw) discharge data for each gauging station
info: A data frame, containing information about each gauging station
flow_edges: A two-column numeric matrix. Each row contains the indices (in info) of a pair of gauging stations that are directly connected by a river.

Details

To obtain the matrix data_clustered, the daily discharge data from the summer months of 1960 to 2010, given in data_raw, was declustered, yielding between seven and ten observations per year. Each row corresponds to one observation from this declustered time series, the non-unique rownames indicate which year an observation is from. Each column corresponds to one of the gauging stations, with column indices in data_raw/data_clustered corresponding to row indices in info. See (Asadi et al. 2015) for details on the preprocessing and declustering.

info is a data frame containing the following information for each of the gauging stations or its corresponding catchment area:

RivNames: Name of the river at the gauging station
Lat, Long: Coordinates of the gauging station
Lat_Center, Long_Center: Coordinates of the center of the catchment corresponding to the gauging station
Alt: Mean altitude of the catchment
Area: Area of the catchment corresponding to the gauging station
Slope: Mean slope of the catchment
PlotCoordX, PlotCoordY: X-Y-coordinates which can be used to arrange the gauging stations when plotting a flow graph.

Source

Bavarian Environmental Agency https://www.gkd.bayern.de.

References

Asadi P, Davison AC, Engelke S (2015). “Extremes on river networks.” Ann. Appl. Stat., 9(4), 2023 – 2050. doi:10.1214/15-AOAS863.

Examples

dim(danube$data_clustered)
colnames(danube$info)

Data standardization to multivariate Pareto scale

Description

Transforms the data matrix empirically to the multivariate Pareto scale.

Usage

data2mpareto(data, p, na.rm = FALSE)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

p

Numeric between 0 and 1. Probability used for the quantile to threshold the data.

na.rm

Logical. If rows containing NAs should be removed.

Details

The columns of the data matrix are first transformed empirically to standard Pareto distributions. Then, only the observations where at least one component exceeds the p-quantile of the standard Pareto distribution are kept. Those observations are finally divided by the p-quantile of the standard Pareto distribution to standardize them to the multivariate Pareto scale.

If na.rm is FALSE, missing entries are left as such during the transformation of univariate marginals. In the thresholding step, missing values are considered as -Inf.

Value

Numeric m \times d matrix, where m is the number of rows in the original data matrix that are above the threshold.

Examples

n <- 20
d <- 4
p <- .8
G <- cbind(
  c(0, 1.5, 1.5, 2),
  c(1.5, 0, 2, 1.5),
  c(1.5, 2, 0, 1.5),
  c(2, 1.5, 1.5, 0)
)

set.seed(123)
my_data <- rmstable(n, "HR", d = d, par = G)
data2mpareto(my_data, p)

Learning extremal graph structure with latent variables

Description

Following the methodology from Engelke and Taeb (2024), fits an extremal graph structure with latent variables.

Usage

eglatent(
  Gamma,
  lam1_list = c(0.1, 0.15, 0.19, 0.205),
  lam2_list = c(2),
  refit = TRUE,
  verbose = FALSE
)

Arguments

Gamma

conditionally negative semidefinite matrix. This will be typically the empirical variogram matrix.

lam1_list

Numeric vector of non-negative regularization parameters for eglatent. Default is lam1_list = c(0.1, 0.15, 0.19, 0.205).

lam2_list

Numeric vector of non-negative regularization parameters for eglatent. Default is lam2_list = c(2).

refit

Logical scalar, if TRUE then the model is refit on the estimated graph to obtain an estimate of the Gamma matrix on that graph. Default is refit = TRUE.

verbose

Logical scalar, indicating whether to print progress updates.

Value

The function fits one model for each combination of values in lam1_list and lam2_list. All returned objects have one entry per model. List consisting of:

graph

A list of igraph::graph objects representing the fitted graphs.

rk

Numeric vector containing the estimated ranks of the latent variables.

G_est

A list of numeric estimated d \times d variogram matrices \Gamma corresponding to the fitted graphs.

G_refit

A list of numeric estimated d \times d variogram matrices \Gamma refitted with fixed graphs corresponding to the fitted graphs.

lambdas

A list containing the values of lam1_list and lam2_list used for the model fit.

References

Engelke S, Taeb A (2024). “Extremal graphical modeling with latent variables.” 2403.09604.

Learning extremal graph structure

Description

Following the methodology from Engelke et al. (2022), fits an extremal graph structure using the neighborhood selection approach (see Meinshausen and Bühlmann (2006)) or graphical lasso (see Friedman et al. (2008)).

Usage

eglearn(
  data,
  p = NULL,
  rholist = c(0.1, 0.15, 0.19, 0.205),
  reg_method = c("ns", "glasso"),
  complete_Gamma = FALSE
)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

p

Numeric between 0 and 1 or NULL. If NULL (default), it is assumed that the data are already on multivariate Pareto scale. Else, p is used as the probability in the function data2mpareto() to standardize the data.

rholist

Numeric vector of non-negative regularization parameters for the lasso. Default is rholist = c(0.1, 0.15, 0.19, 0.205). For details see glasso::glassopath().

reg_method

One of ⁠"ns", "glasso"⁠, for neighborhood selection and graphical lasso, respectively. Default is reg_method = "ns". For details see Meinshausen and Bühlmann (2006), Friedman et al. (2008).

complete_Gamma

Whether you want to try fto complete Gamma matrix. Default is complete_Gamma = FALSE.

Value

List made of:

graph

A list of igraph::graph objects representing the fitted graphs for each rho in rholist.

Gamma

A list of numeric estimated d \times d variogram matrices \Gamma corresponding to the fitted graphs, for each rho in rholist. If complete_Gamma = FALSE or the underlying graph is not connected, it returns NULL.

rholist

The list of penalty coefficients.

graph_ic

A list of igraph::graph objects representing the optimal graph according to the aic, bic, and mbic information criteria. If reg_method = "glasso", it returns a list of NULL.

Gamma_ic

A list of numeric d \times d estimated variogram matrices \Gamma corresponding to the aic, bic, and mbic information criteria. If reg_method = "glasso", complete_Gamma = FALSE, or the underlying graph is not connected, it returns a list of NULL.

References

Engelke S, Lalancette M, Volgushev S (2022). “Learning extremal graphical structures in high dimensions.” doi:10.48550/ARXIV.2111.00840, Available from https://arxiv.org/abs/2111.00840., 2111.00840, https://arxiv.org/abs/2111.00840.

Friedman J, Hastie T, Tibshirani R (2008). “Sparse inverse covariance estimation with the graphical lasso.” Biostatistics, 9(3), 432–441.

Meinshausen N, Bühlmann P (2006). “High-dimensional graphs and variable selection with the Lasso.” Ann. Statist., 34(3), 1436 – 1462. doi:10.1214/009053606000000281.

Examples

set.seed(2)
m <- generate_random_model(d=6)
y <- rmpareto(n=500, par=m$Gamma)
ret <- eglearn(y)

Empirical estimation of extremal correlation matrix `\chi`

Description

Estimates empirically the matrix of bivariate extremal correlation coefficients \chi.

Usage

emp_chi(data, p = NULL)

emp_chi_pairwise(data, p = NULL, verbose = FALSE)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

p

Numeric scalar between 0 and 1 or NULL. If NULL (default), it is assumed that the data are already on multivariate Pareto scale. Else, p is used as the probability in data2mpareto() to standardize the data.

verbose

Print verbose progress information

Details

emp_chi_pairwise calls emp_chi for each pair of observations. This is more robust if the data contains many NAs, but can take rather long.

Value

Numeric matrix d \times d. The matrix contains the bivariate extremal coefficients \chi_{ij}, for i, j = 1, ..., d.

Examples

n <- 100
d <- 4
p <- .8
Gamma <- cbind(
  c(0, 1.5, 1.5, 2),
  c(1.5, 0, 2, 1.5),
  c(1.5, 2, 0, 1.5),
  c(2, 1.5, 1.5, 0)
)

set.seed(123)
my_data <- rmstable(n, "HR", d = d, par = Gamma)
emp_chi(my_data, p)

Empirical estimation of extremal correlation `\chi`

Description

Estimates the d-dimensional extremal correlation coefficient \chi empirically.

Usage

emp_chi_multdim(data, p = NULL)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

p

Value

Numeric scalar. The empirical d-dimensional extremal correlation coefficient \chi for the data.

Examples

n <- 100
d <- 2
p <- .8
G <- cbind(
  c(0, 1.5),
  c(1.5, 0)
)

set.seed(123)
my_data <- rmstable(n, "HR", d = d, par = G)
emp_chi_multdim(my_data, p)

Estimation of the variogram matrix `\Gamma` of a Huesler-Reiss distribution

Description

Estimates the variogram of the Huesler-Reiss distribution empirically.

Usage

emp_vario(data, k = NULL, p = NULL)

emp_vario_pairwise(data, k = NULL, p = NULL, verbose = FALSE)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

k

Integer between 1 and d. Component of the multivariate observations that is conditioned to be larger than the threshold p. If NULL (default), then an average over all k is returned.

p

verbose

Print verbose progress information

Details

emp_vario_pairwise calls emp_vario for each pair of observations. This is more robust if the data contains many NAs, but can take rather long.

Value

Numeric d \times d matrix. The estimated variogram of the Huesler-Reiss distribution.

Examples

G <- generate_random_Gamma(d=5)
y <- rmpareto(n=100, par=G)
Ghat <- emp_vario(y)

Fitting extremal minimum spanning tree

Description

Fits an extremal minimum spanning tree, where the edge weights are:

negative maximized log-likelihoods of the bivariate Huesler-Reiss distributions, if method = "ML". See Engelke and Hitz (2020) for details.
empirical extremal variogram, if method = "vario". See Engelke and Volgushev (2022) for details.
empirical extremal correlation, if method = "chi". See Engelke and Volgushev (2022) for details.

Usage

emst(data, p = NULL, method = c("vario", "ML", "chi"), cens = FALSE)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the dimension.

p

method

One of ⁠"vario", "ML", "chi"⁠. Default is method = "vario".

cens

Logical. This argument is considered only if method = "ML". If TRUE, then censored likelihood contributions are used for components below the threshold. By default, cens = FALSE.

Value

List consisting of:

graph

An igraph::graph object. The fitted minimum spanning tree.

Gamma

Numeric d \times d estimated variogram matrix \Gamma corresponding to the fitted minimum spanning tree.

References

Examples

## Fitting a 4-dimensional HR minimum spanning tree
my_graph <- igraph::graph_from_adjacency_matrix(
  rbind(
    c(0, 1, 0, 0),
    c(1, 0, 1, 1),
    c(0, 1, 0, 0),
    c(0, 1, 0, 0)
  ),
  mode = "undirected"
)
n <- 100
Gamma_vec <- c(.5, 1.4, .8)
complete_Gamma(Gamma = Gamma_vec, graph = my_graph) ## full Gamma matrix

set.seed(123)
my_data <- rmpareto_tree(n, "HR", tree = my_graph, par = Gamma_vec)
my_fit <- emst(my_data, p = NULL, method = "ML", cens = FALSE)

Performs Gaussian likelihood optimization under Laplacian matrix constraints.

Description

This function implements a block descent algorithm to find the maximum of the Gaussian log-likelihood under the constraint that the concentration matrix is a Laplacian matrix. See Röttger et al. (2021) for details.

Usage

emtp2(Gamma, tol = 1e-06, verbose = TRUE, initial_point = TRUE)

Arguments

Gamma

conditionally negative semidefinite matrix. This will be typically the empirical variogram matrix.

tol

The convergence tolerance. The algorithm terminates when the sum of absolute differences between two iterations is below tol.

verbose

if TRUE (default) the output will be printed.

initial_point

if TRUE (default), the algorithm will construct an initial point before the iteration steps.

Value

A list consisting of:

G_emtp2

The optimal value of the variogram matrix

it

The number of iterations

References

Röttger F, Engelke S, Zwiernik P (2021). “Total positivity in multivariate extremes.” doi:10.48550/ARXIV.2112.14727, https://arxiv.org/abs/2112.14727.

Ensure numerical matrix symmetry/zero values

Description

Ensures the symmetry of a square matrix by averaging it with its transpose. Optionally verifies that the matrix was close to symmetric before.

Makes sure zeros are "numerically zero", by truncating all small values.

Usage

ensure_matrix_symmetry(M, checkTol = Inf, alert = NULL)

truncate_zeros(M, tol = get_small_tol())

ensure_matrix_symmetry_and_truncate_zeros(
  M,
  tol = get_small_tol(),
  checkTol = Inf
)

Arguments

M

Numeric square matrix.

checkTol

Positive scalar. If the maximum absolute difference between M and t(M) is larger, show a warning.

alert

Passed to get_alert_function: NULL or TRUE to read the option value, FALSE to return a dummy function, or a function that takes an arbitrary number of strings as arguments (e.g. stop()).

tol

All entries with absolute value below this value are truncated to zero.

Value

The adjusted value of M.

Fast computation of diag(y %% M %% t(y))

Description

Fast computation of diag(y %% M %% t(y))

Usage

fast_diag(y, M)

Arguments

y

Numeric matrix

M

Numeric matrix

Value

Numeric vector

Helper function to combine par with fixed params (in init)

Description

Helper function to combine par with fixed params (in init)

Usage

fillFixedParams(par, init, fixParams)

Arguments

par

Numeric vector. The parameters that are optimized

init

Numeric vector. The initial parameters (including the ones optimized over)

fixParams

Numeric or logical vector. Positions of fixed parameters in the full parameter vector.

Find a separator set for two vertices

Description

Finds a reasonably small set of vertices that separate v0 and v1 in graph.

Usage

findVsep(graph, v0, v1)

Fit value(s) in interval

Description

Fit value(s) in interval, all arguments are recycled where necessary.

Usage

fitInInterval(x, xMin = -Inf, xMax = Inf)

Arguments

x

Numeric vector

xMin

Numeric vector

xMax

Numeric vector

Value

Numeric vector

Experimental: Fit graph using empirical Theta matrix

Description

Fits a graph to an empirical Gamma matrix by computing the corresponding Theta matrix using Gamma2Theta() and greedily chooses m edges that correspond to high absolute values in Theta.

Usage

fit_graph_to_Theta(data, m = NULL, Gamma_emp = NULL)

Arguments

data

The (standardized) data from which to compute Gamma

m

The number of edges to add, defaults to the number of edges in a tree

Gamma_emp

The empirical Gamma matrix (can be NULL if data is given)

Value

A list containing an [igraph::graph] object and a fitted Gamma matrix

Convert flight counts to connection list

Description

Convert a numeric matrix containing flight counts between airports to a data frame containing a list of connections.

Usage

flightCountMatrixToConnectionList(nFlightsPerConnection, directed = TRUE)

Arguments

nFlightsPerConnection

A square, numeric matrix with identical column- and row-names. Each entry represents the number of flights from the airport indexing the row to the airport indexing the column in some arbitrary time period.

directed

Logical scalar. Whether flights A->B and B->A should be considered separately.

Value

A data frame with columns departureAirport, arrivalAirport, nFlights. Each row represents one connection with >=1 flights in the input matrix.

Examples

flightCountMatrixToConnectionList(flights$flightCounts[1:100, 1:100, 1])

Flights delay data

Description

A dataset containing daily total delays of major U.S. airlines. The raw data was obtained from the U.S. Bureau of Transportation Statistics, and pre-processed as described in Hentschel et al. (2022). Note: The CRAN version of this package contains only data from 2010-2013. The full dataset is available in the GitHub version of this package.

Usage

flights

Format

A named list with three entries:

airports: A data.frame, containing information about US airports
delays: A numeric matrix, containing daily aggregated delays at the airports in the dataset
flightCounts: A numeric array, containing yearly flight numbers between airports in the dataset

Details

flightCounts is a three-dimensional array, containing the number of flights in the dataset between each pair of airports, aggregated on a yearly basis. Each entry is the total number of flights between the departure airport (row) and destination airport (column) in a given year (dimension 3). This array does not contain any NAs, even if an airport did not operate at all in a given year, which is simply indicated by zeros.

delays is a three-dimensional array containing daily total positive delays, in minutes, of incoming and outgoing flights respectively. Each column corresponds to an airport in the dataset and each row corresponds to a day. The third dimension has length two, 'arrivals' containing delays of incoming flights and 'departures' containing delays of outgoing flights. Zeros indicate that there were flights arriving/departing at that airport on a given day, but none of them had delays. NAs indicate that there were no flights arriving/departing at that airport on that day at all.

airports is a data frame containing the following information about a number of US airports. Some entries are missing, which is indicated by NAs.

IATA: 3-letter IATA code
Name: name of the airport
City: main city served by the airport
Country: country or territory where the airport is located (mostly "United States")
ICAO: 4-letter ICAO code
Latitude: latitude of the airport, in decimal degrees
Longitude: longitude of the airport, in decimal degrees
Altitude: altitude of the airport, in feet
Timezone: timezone of the airport, in hours offset from UTC
DST: Daylight savings time used at the airport. 'A'=US/Canada, 'N'=None.
Timezone2: name of the timezone of the airport

Source

Raw delays data:

https://www.bts.gov/browse-statistical-products-and-data/bts-publications/airline-service-quality-performance-234-time

Fields/Forms used in the raw data:

Airports (includes license information):

https://openflights.org/data

References

Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.

Examples

# Get total number of flights in the dataset:
totalFlightCounts <- apply(flights$flightCounts, c(1,2), sum)

# Get number of flights for specific years in the dataset:
flightCounts_10_11 <- apply(flights$flightCounts[,,c('2010', '2011')], c(1,2), sum)

# Get list of connections from 2008:
connections_10 <- flightCountMatrixToConnectionList(flights$flightCounts[,,'2010'])

Parameter fitting for multivariate Huesler-Reiss Pareto distribution

Description

Fits the parameters of a multivariate Huesler-Reiss Pareto distribution using (censored) maximum likelihood estimation.

Usage

fmpareto_HR_MLE(
  data,
  p = NULL,
  cens = FALSE,
  init = NULL,
  fixParams = integer(0),
  useTheta = TRUE,
  maxit = 100,
  graph = NULL,
  optMethod = "BFGS",
  nAttemptsFixInit = 3
)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the number of dimensions.

p

Numeric scalar between 0 and 1 or NULL. If NULL (default), it is assumed that the data is already on a multivariate Pareto scale. Else, p is used as the probability in data2mpareto() to standardize the data.

cens

Logical scalar. If true, then censored likelihood contributions are used for components below the threshold. This is computationally expensive and by default cens = FALSE.

init

Numeric vector or numeric matrix. Initial parameter values in the optimization. If NULL, the empirical variogram is used instead. Otherwise should be a numeric vector with one entry per edge in graph, or a complete variogram/precision matrix.

fixParams

Numeric or logical vector. Indices of the parameter vectors that are kept fixed (identical to init) during the optimization. Default is integer(0).

useTheta

Logical. Whether to perform the MLE optimization in terms of Theta or Gamma.

maxit

Positive integer. The maximum number of iterations in the optimization.

graph

Graph object from igraph package or NULL (implying the complete graph).

optMethod

String. A valid optimization method used by the function stats::optim. By default, method = "BFGS".

nAttemptsFixInit

Numeric. If useTheta=TRUE and the initial parameter init is not valid, attempt to fix it first by making sure all off-diagonal entries are negative and then adding some random noise at most this many times.

Details

Only the parameters corresponding to edges in graph are optimized, the remaining entries are implied by the graphical structure. If graph is NULL, the complete graph is used. The optimization is done either in terms of the variogram (Gamma) or precision matrix (Theta), depending on the value of useTheta. If graph is non-decomposable, useTheta=TRUE is significantly faster, otherwise they are similar in performance.

Value

List consisting of:

convergence

Logical. Indicates whether the optimization converged or not.

Gamma

Numeric ⁠d x d⁠ matrix. Fitted variogram matrix.

Theta

Numeric ⁠d x d⁠ matrix. Fitted precision matrix.

par

Numeric vector. Optimal parameters, including fixed parameters.

par_opt

Numeric. Optimal parameters, excluding fixed parameters.

nllik

Numeric. Optimal value of the negative log-likelihood function.

hessian

Numeric matrix. Estimated Hessian matrix of the estimated parameters.

Parameter fitting for Huesler-Reiss graphical models

Description

Fits the parameter matrix (variogram) of a multivariate Huesler-Reiss Pareto distribution with a given graphical structure, using maximum-likelihood estimation or the empirical variogram.

Usage

fmpareto_graph_HR(
  data,
  graph,
  p = NULL,
  method = c("vario", "ML"),
  handleCliques = c("average", "full", "sequential"),
  ...
)

Arguments

data

Numeric n \times d matrix, where n is the number of observations and d is the number of dimensions.

graph

Undirected, connected [igraph::graph] object with d vertices, representing the graphical structure of the fitted Huesler-Reiss model.

p

Numeric between 0 and 1 or NULL. If NULL (default), it is assumed that the data is already on a multivariate Pareto scale. Else, p is used as the probability in the function data2mpareto() to standardize the data.

method

One of c('vario', 'ML'), with 'vario' as default, indicating the method to be used for parameter estimation. See details.

handleCliques

How to handle cliques and separators in the graph. See details.

...

Arguments passed to fmpareto_HR_MLE(). Currently cens, maxit, optMethod, and useTheta are supported.

Details

If handleCliques='average', the marginal parameter matrix is estimated for each maximal clique of the graph and then combined into a partial parameter matrix by taking the average of entries from overlapping cliques. Lastly, the full parameter matrix is computed using complete_Gamma().

If handleCliques='full', first the full parameter matrix is estimated using the specified method and then the non-edge entries are adjusted such that the final parameter matrix has the graphical structure indicated by graph.

If handleCliques='sequential', graph must be decomposable, and method='ML' must be specified. The parameter matrix is first estimated on the (recursive) separators and then on the rest of the cliques, keeping previously estimated entries fixed.

If method='ML', the computational cost is mostly influenced by the total size of the graph (if handleCliques='full') or the size of the cliques, and can already take a significant amount of time for modest dimensions (e.g. d=3).

Value

The estimated parameter matrix.

HR Parameter fitting - Helper functions

Description

Helper functions called by fmpareto_HR_MLE.

Usage

fmpareto_graph_HR_clique_average(data, graph, method = c("ML", "vario"), ...)

fmpareto_graph_HR_clique_sequential(data, graph, ...)

combine_clique_estimates_by_averaging(cliques, subGammas)

Generate a random Gamma matrix

Description

Generates a valid Gamma matrix with a given dimension

Usage

generate_random_Gamma(d, ...)

Arguments

d

Size of the matrix

...

Further arguments passed to generate_random_spd_matrix()

Generate random graphs

Description

Generate random graphs with different structures. These do not follow well-defined distributions and are mostly meant for quickly generating test models.

Usage

generate_random_chordal_graph(
  d,
  cMin = 2,
  cMax = 6,
  sMin = 1,
  sMax = 4,
  block_graph = FALSE,
  ...
)

generate_random_connected_graph(
  d,
  m = NULL,
  p = 2/(d + 1),
  maxTries = 1000,
  ...
)

generate_random_tree(d)

generate_random_cactus(d, cMin = 2, cMax = 6)

Arguments

d

Number of vertices in the graph

cMin

Minimal size of cliques/blocks (last one might be smaller if necessary)

cMax

Maximal size of cliques/blocks

sMin

Minimal size of separators

sMax

Maximal size of separators

block_graph

Force ⁠sMin == sMax == 1⁠ to produce a block graph

...

Ignored, only allowed for compatibility

m

Number of edges in the graph (specify this or p)

p

Probability of each edge being in the graph (specify this or m)

maxTries

Maximum number of tries to produce a connected Erdoes-Renyi graph

Details

generate_random_chordal_graph generates a random chordal graph by starting with a (small) complete graph and then adding new cliques until the specified size is reached. The sizes of cliques and separators can be specified.

generate_random_connected_graph first tries to generate an Erdoes-Renyi graph, if that fails, falls back to producing a tree and adding random edges to that tree.

generate_random_cactus generates a random cactus graph (mostly useful for benchmarking).

Value

An [igraph::graph] object

Generate a random Gamma matrix for a given graph

Description

Generates a valid Gamma matrix with conditional independence structure specified by a graph

Usage

generate_random_graphical_Gamma(graph, ...)

Arguments

graph

An igraph::graph object

...

Further arguments passed to generate_random_spd_matrix()

Generate a random Gamma matrix containing only integers

Description

Generates a random variogram Matrix by producing a (d-1) \times (d-1) matrix B with random integer entries between -b and b, computing S = B %*% t(B), and passing this S to Sigma2Gamma(). This process is repeated with an increasing b until a valid Gamma matrix is produced.

Usage

generate_random_integer_Gamma(d, b = 2, b_step = 1)

Arguments

d

Number of rows/columns in the output matrix

b

Initial b used in the algorithm described above

b_step

By how much b is increased in each iteration

Value

A numeric d \times d variogram matrix with integer entries

Examples


generate_random_integer_Gamma(5, 2, 0.1)

Generate random Huesler-Reiss Models

Description

Generates a random connected graph and Gamma matrix with conditional independence structure corresponding to that graph.

Usage

generate_random_model(d, graph_type = "general", ...)

Arguments

d

Number of vertices in the graph

graph_type

"tree", "block", "decomposable", "complete", or "general"

...

Further arguments passed to functions generating the graph and Gamma matrix

Examples

set.seed(1)
d <- 12

generate_random_model(d, 'tree')
generate_random_model(d, 'block')
generate_random_model(d, 'decomposable')
generate_random_model(d, 'general')
generate_random_model(d, 'complete')

Generate a random symmetric positive definite matrix

Description

Generates a random d \times d symmetric positive definite matrix. This is done by generating a random d \times d matrix B, then computing B %*% t(B), and then normalizing the matrix to approximately single digit entries.

Usage

generate_random_spd_matrix(d, bMin = -10, bMax = 10, ...)

Arguments

d

Number of rows/columns

bMin

Minimum value of entries in B

bMax

Maximum value of entries in B

...

Ignored, only allowed for compatibility

Get Danube flow graph

Description

Returns an igraph::graph object representing the flow graph of the danube dataset.

Usage

getDanubeFlowGraph(stationIndices = NULL, directed = FALSE)

Arguments

stationIndices

Logical or numerical vector. Indicating which stations to include.

directed

Logical. Whether the graph should be directed (in the direction of flow).

Value

An igraph::graph object.

Get filtered flight delays

Description

Get filtered flight delay data, containing only a selection of dates and airports. Currently, all possible selections correspond to the case study in Hentschel et al. (2022).

Usage

getFlightDelayData(
  what = c("delays", "IATAs", "dates"),
  airportFilter = c("all", "tcCluster", "tcAll"),
  dateFilter = c("all", "tcTrain", "tcTest", "tcAll"),
  delayFilter = c("totals", "arrivals", "departures")[1]
)

Arguments

what

Whether to get the array of delays (numerical), or just the vector of airport codes ("IATAs", strings) or dates (as strings). Specify exactly one.

airportFilter

Which airports to include. Specify exactly one. See details below.

dateFilter

Which dates to include. Specify exactly one. See details below.

delayFilter

Which kinds of delays to include. Specify one or more. Possible values are "arrivals", "departures", and "totals" (computed as sum of arrival and departure delays).

Details

The provided lists of airports and dates correspond to the ones used in the case study of Hentschel et al. (2022). The argument airportFilter="tcCluster" corresponds to the airports in the analyzed "Texas Cluster", airportFilter="tcAll" corresponds to all airports used in the previous clustering step, airportFilter="all" corresponds to all airports in the dataset.

Similarly, dateFilter="tcTrain" selects the dates from the training set, dateFilter="tcTest" the ones from the test/validation set. To get the union of these sets, specify dateFilter="tcAll". To get all dates in the dataset (possibly more than for "tcAll"), specify dateFilter="all".

Value

If what="IATAs" or what="dates", a character vector. If required, it can be converted to Date objects using as.Date().

If what="delays", a three-dimensional array or two-dimensional matrix, with dimensions corresponding to dates, airports, and delay types.

References

Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.

Get flight graph

Description

Convert the info from flights$flightCounts to an igraph::graph object.

Usage

getFlightGraph(IATAs = NULL, years = NULL, minNFlights = 1, directed = FALSE)

Arguments

IATAs

Character vector. IATA codes of airports to include

years

Character vector. Years to include (as strings).

minNFlights

Numerical scalar. Minimum number of flights on a connection to be included as an edge.

directed

Logical scalar. Whether flights A->B and B->A should be considered separately.

Value

An igraph::graph object containing a vertex for each airport and an edge whenever there are at least minNFlights between two airports.

Examples

g <- getFlightGraph()

Get package data

Description

Get private data sets from inst/extdata.

Usage

getPackageData(filename, isRDS = TRUE)

Arguments

filename

Base name of the file. E.g. "FILENAME" for a file inst/extdata/FILENAME.

isRDS

Whether the file is an .RDS file.

Value

If isRDS=TRUE an R object. If isRDS=FALSE an environment, containing the R objects from the file.

Get the submatrix corresponding to a subgraph

Description

Both the graph and subgraph need to have persistent IDs If graph==NULL it is assumed to have pIDs 1, 2, ...

Usage

getSubMatrixForSubgraph(fullMatrix, subgraph, graph = NULL)

Get alert function

Description

Get a function that can be used to alert the user of invalid inputs. Returns the value implied by the overwrite argument, or the option "graphicalExtremes.default.alert", falling back to warning() if neither is specified.

Usage

get_alert_function(overwrite = NULL)

Arguments

overwrite

NULL or TRUE to read the option value, FALSE to return a dummy function, or a function that takes an arbitrary number of strings as arguments (e.g. stop()).

Value

A function that takes an arbitrary number of strings as arguments.

Get Cliques and Separators of a graph

Description

Finds all cliques, separators, and (recursively) separators of separators in a graph.

Usage

get_cliques_and_separators(
  graph,
  sortIntoLayers = FALSE,
  includeSingletons = FALSE
)

Arguments

graph

An [igraph::graph] object

Value

A list of vertex sets that represent the cliques and (recursive) separators of graph, ordered such that separators come before cliques they separate.

Get Cliques and Separators of a graph

Description

Finds all cliques, separators, and (recursively) separators of separators in a graph.

Usage

get_cliques_and_separators_OLD(graph)

Arguments

graph

An [igraph::graph] object

Value

A list of vertex sets that represent the cliques and (recursive) separators of graph

Number of cores to be used in parallel computations

Description

Helper function that returns the number of cores to be used in parallel computations. Will always be 1 on Windows. On other systems, this value can be set using setOption('graphicalExtremes.mc.cores', ...).

Usage

get_mc_cores(overwrite = NULL)

Arguments

overwrite

Use this value (if it is valid and not on Windows)

Value

An integer to be used as number of cores

Tolerances to be used in computations

Description

Helper function that returns the tolerance to be used in internal computations.

Usage

get_small_tol(overwrite = NULL)

get_large_tol(overwrite = NULL)

Arguments

overwrite

NULL or numeric scalar. If specified, use this value instead of the option value.

Details

There are two different tolerances used in the package, for details see graphicalExtremes-package. The default values for these tolerances can be set using the options "graphicalExtremes.tol.small" and "graphicalExtremes.tol.large".

Value

A non-negative numerical scalar

Graph equality

Description

Produce true if two graphs have same vertices and edges (ordered)

Usage

graphs_equal(g1, g2)

Arguments

g1

igraph::graph

g2

igraph::graph

Value

logical indicating if the graphs are equal

Full censored log-likelihood of HR model

Description

Computes the full (censored) log-likelihood of HR model.

Usage

logLH_HR(data, Gamma, cens = FALSE)

Arguments

data

Numeric n \times d matrix, containing observations following a multivariate HR Pareto distribution.

Gamma

Numeric d \times d matrix, representing a variogram matrix \Gamma.

cens

Boolean. If true, then censored log-likelihood is computed. By default, cens = FALSE.

Value

Numeric. The full censored log-likelihood of HR model.

Compute censored exponent measure

Description

Computes the (censored) exponent measure density of HR distribution.

Usage

logdVK_HR(x, K, Gamma)

Arguments

x

Numeric vector with d positive elements where the censored exponent measure is to be evaluated.

K

Integer vector, subset of 1, ..., d, the index set that is not censored. Or logical vector of length d, indicating entries that are not censored.

Gamma

d x d variogram matrix or numeric vector with d(d-1)/2 elements, containing the upper triangular part of a variogram matrix.

Value

Numeric. The censored exponent measure of the HR distribution. If no entries are censored, the result of ⁠logdV_HR(x, par⁠ is returned.

Compute the exponent measure density of HR distribution

Description

Computes the exponent measure density of HR distribution.

Usage

logdV_HR(x, Gamma = NULL, Theta = NULL)

Arguments

x

Numeric n \times d matrix or vector with d elements.

Gamma

d x d variogram matrix or numeric vector with d(d-1)/2 elements, containing the upper triangular part of a variogram matrix.

Theta

d x d precision matrix or numeric vector with d(d-1)/2 elements, containing the upper triangular part of a precision matrix.

Details

Both Gamma and Theta are needed internally, but if one is missing it is computed from the other one.

Value

Numeric. The censored exponent measure of the HR distribution.

Compute Huesler-Reiss log-likelihood, AIC, and BIC

Description

Computes (censored) Huesler-Reiss log-likelihood, AIC, and BIC values.

Usage

loglik_HR(data, p = NULL, graph = NULL, Gamma, cens = FALSE)

Arguments

data

Numeric n \times d matrix. It contains observations following a multivariate HR Pareto distribution.

p

graph

An [igraph::graph] object or NULL. The graph must be undirected and connected. If no graph is specified, the complete graph is used.

Gamma

Numeric n \times d matrix. It represents a variogram matrix \Gamma.

cens

Boolean. If true, then censored log-likelihood is computed. By default, cens = FALSE.

Value

Numeric vector c("loglik"=..., "aic"=..., "bic"=...) with the evaluated log-likelihood, AIC, and BIC values.

Convert indices to numerical indices

Description

Converts (possibly) logical indices to numerical ones. Also ensures unique indices and sorts them if specified.

Usage

make_numeric_indices(ind, n = NULL, unique = TRUE, sort = TRUE)

Arguments

ind

The numerical or logical index vector

n

Max numerical index (used if ind is logical and might be recycled)

unique

Whether to keep every (numerical) index at most once

sort

Whether to sort the numerical indices

Value

A numerical index vector

Create a list of separators

Description

Creates a list of separator set, such that every pair of non-adjacent vertices in graph is completely disconnected by the removal of (at least) one of the separator sets from the graph.

Usage

make_sep_list(graph, details = TRUE)

Arguments

graph

A graph

details

Return detailed infos (default TRUE)

Value

A list of numeric vectors

Marginalize multivariate Pareto dataset

Description

Marginalize a multivariate Pareto dataset data with respect to the variables in set_indices.

Usage

mparetomargins(data, set_indices)

Arguments

data

Numeric n \times d matrix. A dataset containing observations following a multivariate Pareto distribution.

set_indices

Numeric vector with at most d different elements in 1, ..., d. The variables with respect to which to marginalize the multivariate distribution.

Value

Numeric n \times m matrix, where m is the length of set_indices. Marginalized multivariate Pareto data.

Order Cliques

Description

Orders the cliques in a connected decomposable graph so that they fulfill the running intersection property.

Usage

order_cliques(cliques)

Create Gamma or Theta from vector

Description

Convert parameter vector par (upper triangular part of Gamma/Theta matrix) to full Gamma/Theta, or vice versa.

Usage

par2Matrix(par, allowMatrix = FALSE, allowNull = FALSE, zeroRowSums = FALSE)

par2Gamma(par, allowMatrix = FALSE, allowNull = FALSE)

par2Theta(par, allowMatrix = FALSE, allowNull = FALSE)

matrix2par(M, allowVector = FALSE, allowNull = FALSE)

Arguments

par

Numeric vector with d elements. Upper triangular part of a Gamma/Theta matrix.

allowMatrix

If TRUE and par is already a matrix, return it as is.

allowNull

If TRUE and par is NULL, return NULL.

zeroRowSums

If TRUE the diagonal is set to (-1) times the rowSums.

M

Matrix

allowVector

If TRUE and M is already a vector, return it as is.

Value

Numeric matrix d \times d. Full Gamma/Theta matrix corresponding to par.

Upper triangular part of M (or M itself/NULL if allowed)

Factory: parToMatrices

Description

Creates a helper function to convert a parameter vector par to a Gamma and/or Theta matrix.

Usage

parToMatricesFactory(
  graph,
  init = NULL,
  fixParams = integer(0),
  parIsTheta = FALSE,
  checkValidity = TRUE
)

Arguments

graph

par represents entries corresponding to the edges of graph.

init

The values used for fixed parameters

fixParams

The indices (logical or numeric) of fixed parameters in the full parameter vector.

parIsTheta

TRUE if par represents entries in Theta (otherwise Gamma)

checkValidity

Whether to check if the implied Gamma/Theta is a valid parameter matrix.

Value

A function parToMatrices(par, forceGamma=FALSE, forceTheta=FALSE), which takes a parameter vector and returns either NULL or a list with entries Gamma, Theta. The function returns NULL if checkValidity==TRUE and par implies an invalid matrix. Otherwise, depending on parIsTheta, forceTheta, and forceGamma, one or both of Gamma and Theta are matrices implied by par.

Plot Danube River Flow Data

Description

Plotting function to visualize the river flow data from the danube dataset. Requires ggplot2 to be installed.

Usage

plotDanube(
  stationIndices = NULL,
  graph = NULL,
  directed = NULL,
  plotStations = TRUE,
  plotConnections = TRUE,
  labelStations = FALSE,
  returnGGPlot = FALSE,
  useStationVolume = FALSE,
  useConnectionVolume = FALSE,
  mapCountries = c("Germany"),
  vertexColors = NULL,
  vertexShapes = NULL,
  edgeColors = NULL,
  xyRatio = NULL,
  clipMap = 1.2,
  useLatex = FALSE,
  edgeAlpha = 0.2
)

plotDanubeIGraph(
  stationIndices = NULL,
  graph = NULL,
  directed = NULL,
  labelStations = TRUE,
  vertexColors = NULL,
  vertexShapes = NULL,
  edgeColors = NULL,
  ...
)

Arguments

stationIndices

Logical or numerical vector, indicating the stations to be plotted.

graph

An igraph::graph object or NULL to use the flow graph.

directed

Logical. Whether to consider the flow graph as directed.

plotStations

Logical. Whether to plot the stations.

plotConnections

Logical. Whether to plot the connections.

labelStations

Logical. Whether to label stations.

returnGGPlot

If TRUE, a ggplot2::ggplot object is returned and not plotted immediately.

useStationVolume

Logical. Whether to indicate flow volume at a station by circle size.

useConnectionVolume

Logical. Whether to indicate flow volume on a connection by line width.

mapCountries

Which country borders to show using ggplot2::map_data('world', mapCountries).

vertexColors

Vector with color information for vertices.

vertexShapes

Vector with shape information for vertices.

edgeColors

Vector with color information for edges.

xyRatio

Approximate X-Y-ratio (w.r.t. distance on the ground) of the area shown in the plot.

clipMap

Logical or numeric scalar. Whether to ignore the map image when determining the axis limits of the plot. If it is a positive scalar, the plot limits are extended by that factor.

useLatex

Whether to format numbers etc. as latex code (useful when plotting to tikz).

edgeAlpha

Numeric scalar between 0 and 1. The alpha value to be used when plotting edges/connections.

...

Passed through to igraph::plot.igraph.

Details

The values of vertexColors, vertexShapes, and edgeColors are interpreted differently by ggplot2::geom_point/ggplot2::geom_segment and igraph::plot.igraph().

plotDanube uses a combination of ggplot2 functions to plot the graph.

plotDanubeIGraph uses igraph::plot.igraph to plot the graph.

Examples

# Basic plot
graphicalExtremes::plotDanube()

# Plot flow volumes
graphicalExtremes::plotDanube(
    clipMap = 1.2,
    useConnectionVolume = TRUE,
    useStationVolume = TRUE
)

# Plot other graph structures
nStations <- nrow(graphicalExtremes::danube$info)
g <- igraph::erdos.renyi.game(nStations, nStations, 'gnm')
graphicalExtremes::plotDanube(
    clipMap = 1.2,
    graph = g
)

Plot flight data

Description

Plotting function to visualize the flight connections from the flights dataset. This function requires the package ggplot2 to be installed.

Usage

plotFlights(
  airportIndices = NULL,
  airports_sel = NULL,
  connections_sel = NULL,
  graph = NULL,
  plotAirports = TRUE,
  plotConnections = TRUE,
  labelAirports = FALSE,
  returnGGPlot = FALSE,
  useAirportNFlights = FALSE,
  useConnectionNFlights = FALSE,
  minNFlights = 0,
  map = "state",
  vertexColors = NULL,
  vertexShapes = NULL,
  edgeColors = NULL,
  xyRatio = NULL,
  clipMap = FALSE,
  useLatex = FALSE,
  edgeAlpha = 0.2
)

Arguments

airportIndices

The indices of the airports (w.r.t. airports_sel) to include.

airports_sel

The airports to plot. Might be further subset by arguments airportIndices, graph. If NULL, then flights⁠$airports⁠ will be used.

connections_sel

A three columns data frame as output by flightCountMatrixToConnectionList(). If NULL, then flights⁠$nFlights⁠ will be used to construct one.

graph

An optional igraph::graph object, containing a flight graph to plot. Vertices should either match the selected airports in number and order, or be named with the corresponding IATA codes of the airports they represent.

plotAirports

Logical. Whether to plot the airports specified.

plotConnections

Logical. Whether to plot the connections specified.

labelAirports

Logical. Whether to show the IATA code next to each plotted airport.

returnGGPlot

If TRUE, a ggplot2::ggplot object is returned and not plotted immediately.

useAirportNFlights

Logical. Whether to vary the size of the circles representing airports in the plot, according to the number of flights at that airport.

useConnectionNFlights

Logical. Whether to vary the size of the edges representing connections in the plot, according to the number of flights on that connection.

minNFlights

Numeric scalar. Only plot connections with at least this many flights.

map

String or data.frame or NULL. What map to use as the background image. Strings are passed to ggplot2::map_data(), data frames are assumed to be the output of ggplot2::map_data().

vertexColors

Optional vector, named with IATA codes, to be used as colors for the vertices/airports.

vertexShapes

Optional vector, named with IATA codes, to be used as shapes for the vertices/airports. Is coerced to character.

edgeColors

Optional vector or symmetric matrix (character or numeric), to be used as colors for edges/connections. If this is a vector, its entries must match the plotted connections (in the order specified in connections_sel or implied by igraph::get.edgelist). If this is a matrix, its row/column names must be IATA codes, or its rows/columns match the plotted airports (in number and order).

xyRatio

Approximate X-Y-ratio (w.r.t. distance on the ground) of the area shown in the plot.

clipMap

Logical or numeric scalar. Whether to ignore the map image when determining the axis limits of the plot. If it is a positive scalar, the plot limits are extended by that factor.

useLatex

Whether to format numbers etc. as latex code (useful when plotting to tikz).

edgeAlpha

Numeric scalar between 0 and 1. The alpha value to be used when plotting edges/connections.

Value

If returnGGPlot is TRUE, a ggplot2::ggplot object, otherwise NULL.

Examples

# Plot all airports in the dataset
plotFlights(plotConnections = FALSE, map = 'world')

# Plot a selection of airports
plotFlights(c('JFK', 'SFO', 'LAX'), useConnectionNFlights = TRUE, useAirportNFlights = TRUE)

# Plot airports with a custom connections graph
IATAs <- c('ACV', 'BFL', 'EUG', 'SFO', 'MRY')
graph <- igraph::make_full_graph(length(IATAs))
plotFlights(IATAs, graph=graph, clipMap = 1.5)

Sampling of a multivariate Pareto distribution

Description

Simulates exact samples of a multivariate Pareto distribution.

Usage

rmpareto(
  n,
  model = c("HR", "logistic", "neglogistic", "dirichlet"),
  d = NULL,
  par
)

Arguments

n

Number of simulations.

model

The parametric model type; one of:

HR (default),
logistic,
neglogistic,
dirichlet.

d

Dimension of the multivariate Pareto distribution. In some cases this can be NULL and will be inferred from par.

par

Respective parameter for the given model, that is,

\Gamma, numeric d \times d variogram matrix, if model = HR.
\theta \in (0, 1), if model = logistic.
\theta > 0, if model = neglogistic.
\alpha, numeric vector of size d with positive entries, if model = dirichlet.

Details

The simulation follows the algorithm in Engelke and Hitz (2020). For details on the parameters of the Huesler-Reiss, logistic and negative logistic distributions see Dombry et al. (2016), and for the Dirichlet distribution see Coles and Tawn (1991).

Value

Numeric n \times d matrix of simulations of the multivariate Pareto distribution.

References

Coles S, Tawn JA (1991). “Modelling extreme multivariate events.” J. R. Stat. Soc. Ser. B Stat. Methodol., 53, 377–392.

Dombry C, Engelke S, Oesting M (2016). “Exact simulation of max-stable processes.” Biometrika, 103, 303–317.

Engelke S, Hitz AS (2020). “Graphical models for extremes (with discussion).” J. R. Stat. Soc. Ser. B Stat. Methodol., 82, 871–932.

Examples

## A 4-dimensional HR distribution
n <- 10
d <- 4
G <- cbind(
  c(0, 1.5, 1.5, 2),
  c(1.5, 0, 2, 1.5),
  c(1.5, 2, 0, 1.5),
  c(2, 1.5, 1.5, 0)
)

rmpareto(n, "HR", d = d, par = G)

## A 3-dimensional logistic distribution
n <- 10
d <- 3
theta <- .6
rmpareto(n, "logistic", d, par = theta)

## A 5-dimensional negative logistic distribution
n <- 10
d <- 5
theta <- 1.5
rmpareto(n, "neglogistic", d, par = theta)

## A 4-dimensional Dirichlet distribution
n <- 10
d <- 4
alpha <- c(.8, 1, .5, 2)
rmpareto(n, "dirichlet", d, par = alpha)

Sampling of a multivariate Pareto distribution on a tree

Description

Simulates exact samples of a multivariate Pareto distribution that is an extremal graphical model on a tree as defined in Engelke and Hitz (2020).

Usage

rmpareto_tree(n, model = c("HR", "logistic", "dirichlet")[1], tree, par)

Arguments

n

Number of simulations.

model

The parametric model type; one of:

HR (default),
logistic,
dirichlet.

tree

Graph object from igraph package. This object must be a tree, i.e., an undirected graph that is connected and has no cycles.

par

Respective parameter for the given model, that is,

\Gamma, numeric d \times d variogram matrix, where only the entries corresponding to the edges of the tree are used, if model = HR. Alternatively, can be a vector of length d-1 containing the entries of the variogram corresponding to the edges of the given tree.
\theta \in (0, 1), vector of length d-1 containing the logistic parameters corresponding to the edges of the given tree, if model = logistic.
a matrix of size (d - 1) \times 2, where the rows contain the parameters vectors \alpha of size 2 with positive entries for each of the edges in tree, if model = dirichlet.

Details

Value

Numeric n \times d matrix of simulations of the multivariate Pareto distribution.

References

Examples

## A 4-dimensional HR tree model

my_tree <- igraph::graph_from_adjacency_matrix(rbind(
  c(0, 1, 0, 0),
  c(1, 0, 1, 1),
  c(0, 1, 0, 0),
  c(0, 1, 0, 0)
),
mode = "undirected"
)
n <- 10
Gamma_vec <- c(.5, 1.4, .8)
set.seed(123)
rmpareto_tree(n, "HR", tree = my_tree, par = Gamma_vec)

## A 4-dimensional Dirichlet model with asymmetric edge distributions

alpha <- cbind(c(.2, 1, .5), c(1.5, .6, .8))
rmpareto_tree(n, model = "dirichlet", tree = my_tree, par = alpha)

Sampling of a multivariate max-stable distribution

Description

Simulates exact samples of a multivariate max-stable distribution.

Usage

rmstable(n, model = c("HR", "logistic", "neglogistic", "dirichlet")[1], d, par)

Arguments

n

Number of simulations.

model

The parametric model type; one of:

HR (default),
logistic,
neglogistic,
dirichlet.

d

Dimension of the multivariate Pareto distribution.

par

Respective parameter for the given model, that is,

\Gamma, numeric d \times d variogram matrix, if model = HR.
\theta \in (0, 1), if model = logistic.
\theta > 0, if model = neglogistic.
\alpha, numeric vector of size d with positive entries, if model = dirichlet.

Details

The simulation follows the extremal function algorithm in Dombry et al. (2016). For details on the parameters of the Huesler-Reiss, logistic and negative logistic distributions see Dombry et al. (2016), and for the Dirichlet distribution see Coles and Tawn (1991).

Value

Numeric n \times d matrix of simulations of the multivariate max-stable distribution.

References

Examples

## A 4-dimensional HR distribution
n <- 10
d <- 4
G <- cbind(
  c(0, 1.5, 1.5, 2),
  c(1.5, 0, 2, 1.5),
  c(1.5, 2, 0, 1.5),
  c(2, 1.5, 1.5, 0)
)

rmstable(n, "HR", d = d, par = G)

## A 3-dimensional logistic distribution
n <- 10
d <- 3
theta <- .6
rmstable(n, "logistic", d, par = theta)

## A 5-dimensional negative logistic distribution
n <- 10
d <- 5
theta <- 1.5
rmstable(n, "neglogistic", d, par = theta)

## A 4-dimensional Dirichlet distribution
n <- 10
d <- 4
alpha <- c(.8, 1, .5, 2)
rmstable(n, "dirichlet", d, par = alpha)

Sampling of a multivariate max-stable distribution on a tree

Description

Simulates exact samples of a multivariate max-stable distribution that is an extremal graphical model on a tree as defined in Engelke and Hitz (2020).

Usage

rmstable_tree(n, model = c("HR", "logistic", "dirichlet")[1], tree, par)

Arguments

n

Number of simulations.

model

The parametric model type; one of:

HR (default),
logistic,
dirichlet.

tree

Graph object from igraph package. This object must be a tree, i.e., an undirected graph that is connected and has no cycles.

par

Respective parameter for the given model, that is,

\Gamma, numeric d \times d variogram matrix, where only the entries corresponding to the edges of the tree are used, if model = HR. Alternatively, can be a vector of length d-1 containing the entries of the variogram corresponding to the edges of the given tree.
\theta \in (0, 1), vector of length d-1 containing the logistic parameters corresponding to the edges of the given tree, if model = logistic.
a matrix of size (d - 1) \times 2, where the rows contain the parameter vectors \alpha of size 2 with positive entries for each of the edges in tree, if model = dirichlet.

Details

The simulation follows a combination of the extremal function algorithm in Dombry et al. (2016) and the theory in Engelke and Hitz (2020) to sample from a single extremal function. For details on the parameters of the Huesler-Reiss, logistic and negative logistic distributions see Dombry et al. (2016), and for the Dirichlet distribution see Coles and Tawn (1991).

Value

Numeric n \times d matrix of simulations of the multivariate max-stable distribution.

References

Examples

## A 4-dimensional HR tree model

my_tree <- igraph::graph_from_adjacency_matrix(rbind(
  c(0, 1, 0, 0),
  c(1, 0, 1, 1),
  c(0, 1, 0, 0),
  c(0, 1, 0, 0)
),
mode = "undirected"
)
n <- 10
Gamma_vec <- c(.5, 1.4, .8)
rmstable_tree(n, "HR", tree = my_tree, par = Gamma_vec)

## A 4-dimensional Dirichlet model with asymmetric edge distributions

alpha <- cbind(c(.2, 1, .5), c(1.5, .6, .8))
rmstable_tree(n, model = "dirichlet", tree = my_tree, par = alpha)

Simulate HR extremal functions

Description

Simulates the Huesler-Reiss extremal functions

Usage

simu_px_HR(n, d, idx, trend, chol_mat)

Arguments

n

Number of simulations.

d

Dimension of the multivariate Pareto distribution. In some cases this can be NULL and will be inferred from par.

idx

Integer. Index corresponding to the variable over which the extremal function is simulated.

trend

Numeric. Trend corresponding to the variable idx.

chol_mat

Numeric matrix d \times d. Cholesky decomposition of the desired covariance matrix.

Value

Numeric n \times d matrix. Simulated data.

Simulate Dirichlet extremal functions

Description

Simulates Dirichlet extremal functions

Usage

simu_px_dirichlet(n, d, idx, alpha)

Arguments

n

Number of simulations.

d

Dimension of the multivariate Pareto distribution. In some cases this can be NULL and will be inferred from par.

idx

Integer or numeric vector with n elements. Index/indices from 1 to d, that determine which extremal function to simulate.

alpha

Numeric vector of size d.

Value

Numeric n \times d matrix. Simulated data.

Simulate logistic extremal functions

Description

Simulates logistic extremal functions

Usage

simu_px_logistic(n, d, idx, theta)

Arguments

n

Number of simulations.

d

Dimension of the multivariate Pareto distribution. In some cases this can be NULL and will be inferred from par.

idx

Integer or numeric vector with n elements. Index/indices from 1 to d, that determine which extremal function to simulate.

theta

Numeric — assume 0 < \theta < 1.

Value

Numeric n \times d matrix. Simulated data.

Simulate negative logistic extremal functions

Description

Simulates negative logistic extremal functions

Usage

simu_px_neglogistic(n, d, idx, theta)

Arguments

n

Number of simulations.

d

Dimension of the multivariate Pareto distribution. In some cases this can be NULL and will be inferred from par.

idx

Integer or numeric vector with n elements. Index/indices from 1 to d, that determine which extremal function to simulate.

theta

Numeric — assume \theta > 0.

Value

Numeric n \times d matrix. Simulated data.

Simulate HR extremal functions on a tree

Description

Simulates the Huesler-Reiss extremal functions on a tree

Usage

simu_px_tree_HR(n, Gamma_vec, A)

Arguments

n

Number of simulations.

Gamma_vec

Numeric vector with d-1 elements, where d is the number of nodes in the tree (and d-1 is the number of edges).

A

Numeric d \times (d-1) matrix; the rows represent the nodes in the tree, the columns represent the edges. For a fixed node k = 1, \dots, d, each entry (i, j) is equal to 1 if the edge in position j is on the directed path from node k to node i in the polytree rooted at node k.

Value

Numeric n \times d matrix. Simulated data.

Simulate Dirichlet extremal functions on a tree

Description

Simulates Dirichlet extremal functions on a tree

Usage

simu_px_tree_dirichlet(n, alpha.start, alpha.end, A)

Arguments

n

Number of simulations.

alpha.start

Numeric vector with d-1 elements, where d is the number of nodes in the tree (and d-1 is the number of edges).

alpha.end

Numeric vector with d-1 elements, where d is the number of nodes in the tree (and d-1 is the number of edges).

A

Value

Numeric n \times d matrix. Simulated data.

Simulate logistic extremal functions on a tree

Description

Simulates logistic extremal functions on a tree

Usage

simu_px_tree_logistic(n, theta, A)

Arguments

n

Number of simulations.

theta

Numeric vector with 1 or d-1 elements. Assume that the entry are such that 0 < \theta < 1.

A

Value

Numeric n \times d matrix. Simulated data.

Split graph into invariant subgraphs

Description

Split graph into invariant subgraphs

Usage

split_graph(g)

Arguments

g

igraph::graph object

Value

List of invariant subgraphs

Uniform margin

Description

Rescale the vector x empirically to uniform margin.

Usage

unif(x, na.rm = FALSE)

Arguments

x

Numeric vector.

na.rm

Logical. If TRUE, missing values are removed. If FALSE, missing values are kept as such.

Value

Numeric vector with entries rescaled to uniform margins

graphicalExtremes: Statistical methodology for graphical extreme value models.

Description

Details

Author(s)

References

See Also

Conversion between Huesler-Reiss parameter matrices

Description

Usage

Arguments

Details

Value

See Also

Compute theoretical \chi in 3D

Description

Usage

Arguments

Value

Convert matrix to graph

Description

Usage

Arguments

Value

See Also

Compute exponent measure

Description

Usage

Arguments

Details

Value

Computes the Z-matrix

Description

Usage

Arguments

Censor dataset

Description

Usage

Arguments

Value

HR parameter matrix checks

Description

Usage

Arguments

Details

Value

See Also

Check input graph

Description

Usage

Arguments

Value

See Also

Check input graph and partial matrix

Description

Usage

Arguments

Value

See Also

Identify pairs of vertices that are split by a separator

Description

Usage

Arguments

Value

Transformation between \chi and \Gamma

Description

Usage

Arguments

Details

Value

See Also

Completion of Gamma matrices

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Completion of decomposable Gamma matrices

Compute theoretical `\chi` in 3D

Transformation between `\chi` and `\Gamma`

Empirical estimation of extremal correlation matrix `\chi`