% preamble taken from the vignette in package strucchange
%
\documentclass[nojss,article]{jss}
\usepackage[T1]{fontenc}
%\usepackage{a4wide}
\usepackage[left=2cm,right=2cm,bottom=15mm]{geometry}
\usepackage{graphicx,color,alltt}
\usepackage[authoryear,round,longnamesfirst]{natbib}
\usepackage{hyperref}
\usepackage{amsmath, amsfonts}
% \usepackage{Sweave}
%\definecolor{Red}{rgb}{0.7,0,0}
%\definecolor{Blue}{rgb}{0,0,0.8}
%\definecolor{hellgrau}{rgb}{0.55,0.55,0.55}
%\newcommand{\E}{\mbox{$\mathsf{E}$}}
%\newcommand{\VAR}{\mbox{$\mathsf{VAR}$}}
%\newcommand{\COV}{\mbox{$\mathsf{COV}$}}
%\newcommand{\p}{\mbox{$\mathsf{P}$}}
%\newcommand{\email}[1]{\href{mailto:#1}{\normalfont\texttt{#1}}}
%\newenvironment{smallexample}{\begin{alltt}\small}{\end{alltt}}
%\setlength{\parskip}{0.5ex plus0.1ex minus0.1ex}
%\setlength{\parindent}{0em}
%
%\bibpunct{(}{)}{;}{a}{}{,}
%
%\newcommand{\ui}{\underline{i}}
%\newcommand{\oi}{\overline{\imath}}
%\RequirePackage{color}
%\definecolor{Red}{rgb}{0.5,0,0}
%\definecolor{Blue}{rgb}{0,0,0.5}
%\definecolor{hellgrau}{rgb}{0.55,0.55,0.55}
%
%\hypersetup{%
%hyperindex,%
%colorlinks,%
%linktocpage,%
%plainpages=false,%
%linkcolor=Blue,%
%citecolor=Blue,%
%urlcolor=Red,%
%pdfstartview=Fit,%
%pdfview={XYZ null null null}%
%}
% \SweaveOpts{engine=R,eps=FALSE}
%\VignetteIndexEntry{Brief guide to R package cvar}
%\VignetteDepends{cvar}
%\VignetteKeywords{expected shortfall, ES, CVaR, VaR, value at risk}
%\VignettePackage{cvar}

<<echo=FALSE>>=
library(cvar)
pd <- packageDescription("cvar")
@

\newcommand{\VaR}[2][\alpha]{\text{VaR}_{#1}(#2)}
\newcommand{\ES}[2][\alpha]{\text{ES}_{#1}(#2)}

\title{Brief guide to R package \pkg{cvar}}
\author{Georgi N. Boshnakov \\ University of Manchester}
\Plainauthor{Georgi N. Boshnakov}

\Address{
  Georgi N. Boshnakov\\
  School of Mathematics\\
  The University of Manchester\\
  Oxford Road, Manchester M13 9PL, UK\\
  URL: \url{http://www.maths.manchester.ac.uk/~gb/}
}

%\date{}
%\maketitle

\Abstract{
  Package \pkg{cvar} is a small package with, essentially, two functions ---
  \code{ES()} for computing the expected shortfall and \code{VaR()} for
  Value at Risk. The user specifies the distribution by supplying one of the
  functions that define a continuous distribution---currently this can be a
  quantile function (qf), cumulative distribution function (cdf) or
  probability density function (pdf). Virtually any continuous distribution
  can be specified. This vignette is part of package \pkg{cvar},
  version~\Sexpr{pd$Version}. %$
}

\Keywords{expected shortfall, ES, CVaR, VaR, value at risk}
\Plainkeywords{expected shortfall, ES, CVaR, VaR, value at risk}

\begin{document}

\section{Introduction}
\label{sec:introduction}

There is a huge number of functions for computations with distributions in
core R and in contributed packages. Pdf's, cdf's, quantile functions and
random number generators are covered comprehensively. The coverage of
expected shortfall is more patchy, but a large collection of distributions,
including functions for expected shortfall, is provided by
\citet{VaRES2013}. \citet{PerformanceAnalytics2018} and
\citet{actuarJSS2008} provide packages covering comprehensively various
aspects of risk measurement, including some functions for expected
shortfall.

Package \pkg{cvar} is a small package with, essentially, two functions ---
\code{ES} for computing the expected shortfall and \code{VaR} for Value at
Risk. The user specifies the distribution by supplying one of the functions
that define a continuous distribution---currently this can be a quantile
function (qf), cumulative distribution function (cdf) or probability density
function (pdf). Virtually any continuous distribution can be specified. The
computations are done directly from the definitions, see
e.g. \citet{acerbi2002expected}. The functions are vectorised over the
parameters of the distributions, making bulk computations more convenient,
for example for forecasting or model evaluation.
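For instance, a single vectorised call computes the VaR of several normal
distributions at once (a small illustration added here, with made-up
parameter values; it is not one of the package's standard examples):
<<>>=
## hypothetical parameter values, for illustration only
cvar::VaR(qnorm, p_loss = 0.05, mean = c(0, 0.1, 0.2), sd = c(1, 2, 3))
@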
The name of this package, ``cvar'', comes from \emph{Conditional Value at
Risk} (CVaR), which is an alternative term for expected shortfall. We chose
to use the standard names \code{ES} and \code{VaR}, despite the possibility
of name clashes with same-named functions in other packages, rather than
invent alternatives that are possibly difficult to remember. Just call the
functions as \code{cvar::ES} and \code{cvar::VaR} if necessary.

Location-scale transformations can be specified separately from the other
distribution parameters. This is useful when such parameters are not
provided directly by the distribution at hand. The use of these parameters
often leads to more efficient computations and better numerical accuracy,
even if the distribution has its own parameters for this purpose. Some of
the examples for \code{VaR} and \code{ES} illustrate this for the Gaussian
distribution.

Since VaR is a quantile, functions computing it for a given distribution are
convenience functions. \code{VaR} exported by \pkg{cvar} could nevertheless
be attractive in certain workflows because of its vectorised distribution
parameters, the location-scale transformation, and the possibility to
compute it from cdf's when quantile functions are not available.

\section{Value at Risk and Expected Shortfall}

%To avoid confusion, we give the mathematical definitions here.

We use the traditional definition of VaR as the negated lower quantile. More
specifically, let $Y$ be the variable of interest, such as the return on a
financial asset. Suppose that it is modelled as a random variable with
distribution function $F^{Y}(y)$. Then VaR is defined as%
\footnote{Beware that a definition without negation is also used in both the
  literature and in the R packages.}
\begin{equation*}
  \VaR{Y} = - q_{\alpha}^{Y} ,
\end{equation*}
where $q_{\alpha}^{Y}$ is the lower $\alpha$-quantile of $Y$, i.e. we have
\begin{equation*}
  \alpha = F^{Y}(q_{\alpha}^{Y}) = P(Y \le q_{\alpha}^{Y}) .
\end{equation*}
Typical values for $\alpha$ are $0.05$ and $0.01$. Equivalently, we could
write\footnote{This equation shows why some authors use the ``big numbers'',
  e.g. 0.95 and 0.99, in the notation for the VaR.}
\begin{equation*}
  \VaR{Y} = q_{1-\alpha}^{-Y} .
\end{equation*}

The expected shortfall is the average of the VaR over all levels
$\gamma \le \alpha$:
\begin{equation*}
  \ES{Y} = \frac{1}{\alpha}\int_{0}^{\alpha}\VaR[\gamma]{Y} \,d\gamma .
\end{equation*}

Suppose now that $Y$ is obtained from another random variable, $X$, by a
linear transformation:
\begin{equation*}
  Y = a + bX , \qquad b > 0 .
\end{equation*}
When this is the case, there is a simple relation between the VaR's and ES's
of $Y$ and $X$:
\begin{align*}
  \VaR{Y} &= -a + b \VaR{X} , \\
  \ES{Y}  &= -a + b \ES{X} .
\end{align*}
In practice, $X$ is often chosen to be standardised, but this is not
necessary. Note also that if many VaR's and ES's are needed, say for
normally distributed variables, they can be computed very efficiently using
this property.
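For the normal distribution these quantities have well-known closed forms:
if $Z$ is standard normal and $z_{\alpha} = q_{\alpha}^{Z}$, then
$\VaR{Z} = -z_{\alpha}$ and $\ES{Z} = \phi(z_{\alpha})/\alpha$, where
$\phi$ is the standard normal density. The following chunk (an extra check
added here, not one of the package's standard examples) compares these
closed forms with the numerical results from \pkg{cvar}:
<<>>=
alpha <- 0.05
z <- qnorm(alpha)
## closed-form VaR and ES for the standard normal distribution
c(VaR = -z, ES = dnorm(z) / alpha)
## numerical values computed by cvar; these should agree with the above
c(VaR = cvar::VaR(qnorm, p_loss = alpha), ES = cvar::ES(qnorm, p_loss = alpha))
@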
\section{VaR}
\label{sec:var}

Here we compute the VaR associated with a standard normal r.v. The three
variants are equivalent, since 0.05 and \code{"qf"} are the defaults for the
last two arguments.
<<>>=
cvar::VaR(qnorm, p_loss = 0.05, dist.type = "qf")
cvar::VaR(qnorm, p_loss = 0.05)
cvar::VaR(qnorm)
@

\code{p\_loss} can be a vector:
<<>>=
cvar::VaR(qnorm, p_loss = c(0.01, 0.05))
@

Let's set some more realistic values for the parameters of the normal
distribution. Suppose that the daily returns on some stock have sample mean
$0.006408553$ and sample variance $0.0004018977$. Then
$N(0.006408553, 0.0004018977)$ can be taken as the normal distribution
fitted to the data.
<<>>=
muA <- 0.006408553
sigma2A <- 0.0004018977
@

This computes VaR using the fitted normal distribution:
<<>>=
res1 <- cvar::VaR(qnorm, p_loss = 0.05, mean = muA, sd = sqrt(sigma2A))
@

For the normal distribution we could also use the intercept-slope arguments,
since the parameters are precisely the intercept and the slope:
<<>>=
res2 <- cvar::VaR(qnorm, p_loss = 0.05, intercept = muA, slope = sqrt(sigma2A))
abs(res2 - res1) # 0, intercept/slope equivalent to mean/sd
@

If we compute VaR using the cdf, the intercept-slope method has some
numerical advantage:
<<>>=
## with cdf the precision depends on solving an equation
res1a <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", mean = muA, sd = sqrt(sigma2A))
res2a <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = muA, slope = sqrt(sigma2A))
abs(res1a - res2) # 3.287939e-09
abs(res2a - res2) # 5.331195e-11, intercept/slope better numerically
@

Of course, it is almost always better to use the quantile function when it
is available. We can use a smaller tolerance to improve the precision (the
value \verb+.Machine$double.eps^0.75+ = \Sexpr{.Machine$double.eps^0.75} is
probably excessive):
<<>>=
## as above, but increase the precision; this is probably excessive
res1b <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", mean = muA, sd = sqrt(sigma2A), tol = .Machine$double.eps^0.75)
res2b <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = muA, slope = sqrt(sigma2A), tol = .Machine$double.eps^0.75)
abs(res1b - res2) # 6.938894e-18, both within machine precision
abs(res2b - res2) # 1.040834e-16
@

The relative precision is also good:
<<>>=
abs((res1b - res2)/res2) # 2.6119e-16, both within machine precision
abs((res2b - res2)/res2) # 3.91785e-15
@

For examples with vectorised distribution parameters we use data from
\pkg{PerformanceAnalytics}. Compute the means and variances of the variables
in the data and store them in vectors:
<<>>=
data(edhec, package = "PerformanceAnalytics")
mu <- apply(edhec, 2, mean)
sigma2 <- apply(edhec, 2, var)
musigma2 <- cbind(mu, sigma2)
@

We compute VaR using \code{PerformanceAnalytics::VaR}:
<<>>=
## analogous calculation with PerformanceAnalytics::VaR
vPA <- apply(musigma2, 1, function(x)
    PerformanceAnalytics::VaR(p = .95, method = "gaussian", invert = FALSE,
                              mu = x[1], sigma = x[2], weights = 1))
@
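As a quick sanity check (this chunk is an addition, not one of the package's
standard examples), recall from the theoretical section that the Gaussian
VaR has the closed form $-(\mu + \sigma z_{\alpha})$, so \code{vPA} can be
verified directly:
<<>>=
## closed-form Gaussian VaR at level 0.05 for each column of edhec
vDirect <- -(mu + sqrt(sigma2) * qnorm(0.05))
max(abs(vPA - vDirect)) # should be practically zero
@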
The results below are compared to the values, \code{vPA}, obtained above.
The computations are similar to the previous scalar examples, but the
distribution parameters are vectors. The first pair of results are
numerically the same:
<<>>=
vAz1 <- cvar::VaR(qnorm, p_loss = 0.05, mean = mu, sd = sqrt(sigma2))
vAz2 <- cvar::VaR(qnorm, p_loss = 0.05, intercept = mu, slope = sqrt(sigma2))
max(abs(vPA - vAz1)) # 5.551115e-17
max(abs(vPA - vAz2)) # same
@

Computing VaR from the cdf shows some advantage for the location-scale
method:
<<>>=
vAz1a <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", mean = mu, sd = sqrt(sigma2))
vAz2a <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = mu, slope = sqrt(sigma2))
max(abs(vPA - vAz1a)) # 3.287941e-09
max(abs(vPA - vAz2a)) # 1.465251e-10, intercept/slope better
@

The advantage remains for smaller tolerance:
<<>>=
vAz1b <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", mean = mu, sd = sqrt(sigma2), tol = .Machine$double.eps^0.75)
vAz2b <- cvar::VaR(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = mu, slope = sqrt(sigma2), tol = .Machine$double.eps^0.75)
max(abs(vPA - vAz1b)) # 4.374869e-13
max(abs(vPA - vAz2b)) # 3.330669e-16
@

\section{Expected shortfall}
\label{sec:expected-shortfall}

\code{ES} has essentially the same arguments as \code{VaR}. The examples
below are obtained almost literally by replacing the calls to \code{VaR()}
with calls to \code{ES()}. Here we compute the ES associated with a standard
normal r.v. The three variants are equivalent, since 0.05 and \code{"qf"}
are the defaults for the last two arguments.
<<>>=
cvar::ES(qnorm, p_loss = 0.05, dist.type = "qf")
cvar::ES(qnorm, p_loss = 0.05)
cvar::ES(qnorm)
@

\code{p\_loss} can be a vector:
<<>>=
cvar::ES(qnorm, p_loss = c(0.01, 0.05))
@

Let's set some more realistic values for the parameters of the normal
distribution:
<<>>=
muA <- 0.006408553
sigma2A <- 0.0004018977
@

This demonstrates the use of \code{intercept} and \code{slope}. These
arguments use the linear transformation property discussed in the
theoretical section. For the normal distribution the parameters are
precisely the intercept and the slope:
<<>>=
res1 <- cvar::ES(qnorm, p_loss = 0.05, mean = muA, sd = sqrt(sigma2A))
res2 <- cvar::ES(qnorm, p_loss = 0.05, intercept = muA, slope = sqrt(sigma2A))
abs(res2 - res1)
@

If we compute ES using the cdf, the intercept-slope method has some
numerical advantage:
<<>>=
## with cdf the precision depends on solving an equation
res1a <- cvar::ES(pnorm, p_loss = 0.05, dist.type = "cdf", mean = muA, sd = sqrt(sigma2A))
res2a <- cvar::ES(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = muA, slope = sqrt(sigma2A))
abs(res1a - res2)
abs(res2a - res2) # intercept/slope better numerically
@

We can use a smaller tolerance to improve the precision (the value
\verb+.Machine$double.eps^0.75+ = \Sexpr{.Machine$double.eps^0.75} is
probably excessive):
<<>>=
## as above, but increase the precision; this is probably excessive
res1b <- cvar::ES(pnorm, p_loss = 0.05, dist.type = "cdf", mean = muA, sd = sqrt(sigma2A), tol = .Machine$double.eps^0.75)
res2b <- cvar::ES(pnorm, p_loss = 0.05, dist.type = "cdf", intercept = muA, slope = sqrt(sigma2A), tol = .Machine$double.eps^0.75)
abs(res1b - res2)
abs(res2b - res2)
@

The relative precision is also good:
<<>>=
abs((res1b - res2)/res2)
abs((res2b - res2)/res2)
@

\newpage{}

\bibliography{REFERENCES}

\end{document}