---
title: "Compute Impulse Response Functions to Structural Shocks using **bayesianVARs**"
author: Stefan Haan^[University of Klagenfurt, mailto:sthaan@edu.aau.at]
output: html_document
bibliography: irfs.bib
link-citations: true
reference-section-title: References
vignette: >
  %\VignetteIndexEntry{Compute Impulse Response Functions to Structural Shocks using **bayesianVARs**}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

The structural VAR model of order $p$ assumes the vector-valued observations $\left\{ y_t , t = 1, \dots, T \right\}$ are generated by the process^[We suppress the intercept of the equation.]
\begin{equation*}
	y_t' B_0 = y_{t-1}' B_1 + \dots + y_{t-p}' B_p + \omega_t'.
\end{equation*}
The components of $\omega_t$ are required to be uncorrelated (i.e., the variance-covariance matrix $\Sigma_{\omega}$ is a diagonal matrix) and referred to as _structural shocks_.

The model parameter $B_0$ is not identified. The model deserves to be called "structural", only if $B_0$ reflects the instantaneous causal relationships between the variables. Theoretical considerations are used to place restrictions on the entries of $B_0$ [@luetkepohl2005, section 10.2.2]. Following [@rubioramirez2010] and [@arias2014], in order to identify $B_0$, we allow the user to specify identifying zero and sign restrictions for several transformations of the structural parameters:

- $B_0$ itself,
- the contemporaneous effects $\mathrm{IR}_0 := B_0'^{-1}$,
- the structural coefficients $B_1, B_2, \dots, B_p$, and
- the long-run effects (sum of the impulse responses over all horizons $h$)
\begin{equation*}
\mathrm{IR}_\infty := \left(B_0' - \sum_{l=1}^{p} B_l' \right)^{-1}.
\end{equation*}

An _impulse response function_ describes the effects of an shock to the system over time.
Intuitively, IRFs answer the question: "What is the effect of a shock $\varepsilon_t$ hitting the system at time $t$ on
$y_{t+h}$ for $h = 1,2,\dots$, given that no other shocks hit the system?" [@koop1996].
Formally, we define IRFs as the Jacobian
\begin{equation*}
	\Theta_h := \frac{\partial y_{t+h}}{\partial \omega_t'}.
\end{equation*}

If the (homoskedastic) factor model is used, we define the impulse reponses with respect to innovations in the factors $f_t$ instead:
\begin{equation*}
\frac{\partial y_{t+h}}{\partial f_t'}.
\end{equation*}
In this case, identification of the IRFs is achieved by zero and sign restrictions on the factor loadings $\Lambda$ (which are also the contemporaneous effects $\mathrm{IR}_0$) and/or the long-run effects.

One outstanding feature of **bayesianVARs** is the ability to account for time-varying innovation variances using the _stochastic volatility_ models implemented in **stochvol** and **factorstochvol** [@vigbayesianVARs:bayesianVARs-vignette]. For these models however, all columns of $B_0$ are already identified up to sign [@lutkepohl2024]. This result generalizes to the VAR model with factor stochastic volatility [@thesishaan2025, Theorem 2]. If a heteroskedastic model is fitted with `bvar` (the default), the function `irf` uses the estimated innovation variances at time $T$.

# Usage examples
We present several practical examples in **R** that show how **bayesianVARs** can be used to obtain IRFs. The example [Optimism shocks](#optimism-shocks) is intended to show typical usage and reproduces Figure 4(a) from [@arias2014]. In the example [A Monetary SVAR](#a-monetary-svar) we show how to relax restrictions that cannot be satisfied. In particular, we replace a zero restriction with a custom "approximately zero restriction" using rejection sampling.
	
## Optimism shocks
[@beaudry2011] analyze the relevance of optimism shocks on business cycle phenomena such as a boom in output, investment, consumption, and hours worked.	[@beaudry2011] propose several identification schemes for optimism shocks. All identification schemes have in common that an optimism shock should be associated with an increase in stock prices. Additionally, the optimism shock should not be associated with improvements in technology (measured by _total factor productivity_) or expansionary monetary policy, which can be competing causes for an increase in stock prices. The most agnostic identification scheme imposes only a zero restriction on productivity and a positive sign restriction on stock prices on the contemporaneous effects.
	
[@arias2014] criticize the approach used by [@beaudry2011] to calculate the IRFs for its upward bias and artificially narrow confidence intervals, and come to the conclusion that the reported results are not significant, if the IRFs are drawn from the correct distribution. [@bsvarSIGNs] replicate this using the R package **bsvarSIGNs**. In this example we replicate this result once more using **bayesianVARs**.

First, we load the `optimism` dataset from the **bsvarSIGNs** package.
```{r, message=FALSE}
library(bsvarSIGNs)
data(optimism)
```

We then fit a homoskedastic $\mathrm{VAR}(4)$ model using **bayesianVARs**.
```{r, results="hide", cache=TRUE}
library(bayesianVARs)
prior_sigma <- specify_prior_sigma(
	M = ncol(optimism),
	type = "cholesky",
	cholesky_heteroscedastic = FALSE
)
mod <- bvar(optimism * 100, lags = 4L, prior_sigma = prior_sigma)
```

We construct following matrix describing the zero and sign restrictions on the contemporaneous effects.
```{r, echo=FALSE}
restrictions <- matrix(
	NA, ncol(optimism), ncol(optimism),
	dimnames=list(colnames(optimism), paste0("shock", 1:ncol(optimism)))
)
restrictions["productivity", "shock1"] <- 0
restrictions["stock_prices", "shock1"] <- 1
```
```{r}
show(restrictions)
```
Here, `NA` signifies an unrestricted parameter, `0` a zero restriction on the parameter and `1` imposes a positive sign restriction.
Since shocks 2-5 are not identified, we calculate IRFs only for the first shock.
```{r}
#| calculate-optimism-irfs, cache=TRUE,
#| fig.width=4, fig.cap="IRFs to an one standard-deviation optimism shock."

shock1 <- diag(ncol(optimism))[, 1, drop=FALSE]
show(shock1)

struct_restr <- specify_structural_restrictions(
	mod,
	restrictions_B0_inv_t = restrictions
)
shock1_irf <- irf(
	mod,
	ahead = 40,
	structural_restrictions = struct_restr,
	shocks = shock1
)

plot(shock1_irf, quantiles = c(0.16, 0.50, 0.84))
```
The 68% confidence intervals we compute resemble the intervals shown in [@arias2014, Figure~4(a)]. We can see that the confidence intervals of consumption and hours worked include zero at all times. The optimism shock identified in this manner has therefore no statistically significant effect on any of the these variables in our model. Stock prices seem to be highly persistent, with significant effects for at least 40 periods after the initial shock.

## A Monetary SVAR
[@rubioramirez2010] show that a five-variable structural VAR model of monetary policy is globally identified, if some normalization rule for the signs of the structural parameters exists and the following zero restrictions on $B_0$ are imposed:
\begin{equation*}
	B_0 = \begin{pmatrix}
		& \text{PS} & \text{PS} & \text{MP} & \text{MD} & \text{Inf} \\
		\log Y & * & * & 0 & * & * \\
		\log P & 0 & * & 0 & * & * \\
		R & 0 & 0 & * & * & * \\
		\log M & 0 & 0 & * & * & * \\
		\log P_c & 0 & 0 & 0 & 0 & * 
	\end{pmatrix}.
\end{equation*}
Variables:

* $Y$ ... gross domestic product (GDP)
* $P$ ... GDP deflator
* $R$ ... nominal short-term interest rate
* $M$ ... M3 money supply
* $P_c$ ... commodity prices

Shocks:

* PS ... production sector
* MP ... monetary policy
* MD ... money demand
* Inf ... information in the commodity market

**bayesianVARs** will report an error, if one attempts to calculate the IRFs to the structural shocks identified by the restrictions above:
```{r, echo=FALSE}
library(bayesianVARs)
```
```{r, monetary-svar-restr1, cache=TRUE, results="hide"}
train_data <- 100 * usmacro_growth[
  , c("GDPC1", "GDPCTPI", "GS1", "M2REAL", "CPIAUCSL")
]
prior_sigma <- specify_prior_sigma(
  M = ncol(train_data),
  type = "cholesky",
  cholesky_heteroscedastic = FALSE
)
mod <- bvar(train_data, lags = 5L, prior_sigma = prior_sigma)

restrictions_B0 <- rbind(
  c(1 ,NA,0 ,NA,NA),
  c(0 ,1 ,0 ,NA,NA),
  c(0 ,0 ,1 ,NA,NA),
  c(0 ,0 ,NA,1 ,NA),
  c(0 ,0 ,0 ,0 ,1 )
)
restrictions <- specify_structural_restrictions(
  mod,
  restrictions_B0 = restrictions_B0
)
```
```{r, monetary-svar-restr1-error, error=TRUE}
irf1 <- irf(mod, ahead = 4, structural_restrictions = restrictions)
```
How can we deal with this situation? One option is to relax one of the zero restrictions to an "approximately zero" restriction. Good candidates for the zero restriction on $B_0$ to be relaxed are any of the restrictions in column two ("PS" shock) or three ("MP" shock). Note that if we drop any of those restrictions, the number of zeros does not exceed those of a triangular matrix and **bayesianVARs** should be able to find structural parameters.
	
**bayesianVARs** does not implement "approximately zero" restrictions, but the desired behavior can be achieved by estimating a slightly larger model and then conditioning on the restriction by rejecting all posterior samples which do not meet it.^[Because we might end up rejecting many samples, a potentially useful feature is the possibility to cheaply generate multiple structural parameter samples for each reduced form parameter sample by passing the parameter `randomized_max_rotations_per_sample` to the method `irf`. However, one should be aware that this can only help with identification uncertainty, not the uncertainty in the reduced form estimates. In this example, the rotation is uniquely determined and the effective sample size will not increase.]

First, we draw a much large number of reduced form samples, knowing most of them will be rejected.
```{r, monetary-svar-more-samples, cache=TRUE, results="hide"}
mod <- bvar(
  train_data, lags = 5L, draws = 20*3000, prior_sigma = prior_sigma
)
```

We then obtain samples for the structural parameter $B_0$, dropping the zero restriction "$(B_0)_{3,2} = 0$". This can be thought of allowing the contemporaneous effect $R \rightarrow P$ to be different than zero. 
```{r}
restrictions_B0_relaxed <- rbind(
  c(1 ,NA,0 ,NA,NA),
  c(0 ,1 ,0 ,NA,NA),
  c(0 ,NA,1 ,NA,NA),
  c(0 ,0 ,NA,1 ,NA),
  c(0 ,0 ,0 ,0 ,1 )
)
```

The relaxed restrictions have columns with $0, 1, \dots, M-1$ zero restrictions and exactly one sign restriction per column. In this case, an unique solution exists for every reduced-form draw.
```{r}
restrictions_relaxed <- specify_structural_restrictions(
  mod, restrictions_B0 = restrictions_B0_relaxed
)
irf_relaxed <- irf(
  mod, ahead = 4,
  structural_restrictions = restrictions_relaxed
)
B0_relaxed <- extractB0(irf_relaxed)
```

Find the samples which fulfill the approximately zero condition $\lvert (B_0)_{3,2} \rvert < 0.01$ and make sure enough are remaining.
```{r}
B0_31_approx_zero <- abs(B0_relaxed[3,2,]) < 0.01
sum(B0_31_approx_zero)
```

Finally, plot the IRFs, conditional on `B0_31_approx_zero`:
```{r}
irf_conditional <- irf_relaxed[,,,B0_31_approx_zero]
class(irf_conditional) <- class(irf_relaxed)
plot(irf_conditional)
```

## Joint credible regions
[@inoue2021] criticize the common practice of plotting the point-wise quantiles of the IRFs, like we did in the previous examples. They draw attention to the fact, that that the point-wise posterior median (or mean) is not necessarily equal to _any_ of the impulse response vectors that satisfy the identifying restrictions.

Following [@inoue2021, section 2.4], we allow the user to visualize the $(1-\alpha)~\text{100%}$ joint credible set by sorting all IRF draws $\{ \Phi^{(r)} \}_{r=1\dots R}$ in ascending order of
\begin{equation*}
\hat Q(\Phi^{(r)})
:= \hat{\mathbb{E}}_\Phi\left[L(\Phi^{(r)}, \Phi) \right]
= \frac{1}{R} \sum_{s=1}^{R} L\left(\Phi^{(r)}, \Phi^{(s)} \right)
\end{equation*}
and plotting only the first $(1-\alpha)~\text{100%}$ draws. We use absolute loss for the loss function $L$. The draw with the smallest value of $\hat Q$ is referred to as the _Bayes estimator_ (shown as the black trajectory in the plot below).

Consider the following example, in which we deliberately do not use any sign-normalization: Using **bayesianVARs**, we estimate a two factor model on the entire `usmacro_growth` dataset, containing 21 variables.
```{r, cache=TRUE, results="hide"}
prior_sigma <- specify_prior_sigma(
  M = ncol(usmacro_growth),
  type = "factor",
  factor_factors = 2,
  factor_restrict = "none",
  factor_heteroskedastic = FALSE
)
mod <- bvar(100 * usmacro_growth, lags = 5L, prior_sigma = prior_sigma)
```
We require the first factor shock to have no long-run effects on the level of GDP,^[
Note that the dataset `usmacro_growth` contains log-differences, not the levels of `GDPC1`.
To see why the restriction above implies no long-run effects on the _level_, let
\begin{equation*}
	y_t := \log(\text{gdp}_t) - \log(\text{gdp}_{t-1}).
\end{equation*}
For any finite horizon $H < \infty$, summing up the impulse responses to the first factor shock $f_{1,t}$ gives us
\begin{equation*}
\sum_{h=0}^{H}\frac{\partial y_{t+h}}{\partial f_{1,t}}
= \frac{\partial}{\partial f_{1,t}} \sum_{h=0}^{H} y_{t+h}
= \frac{\partial}{\partial f_{1,t}} \left(\log(\text{gdp}_{t+H}) - \log(\text{gdp}_{t-1}) \right)
= \frac{\partial \log(\text{gdp}_{t+H})}{\partial f_{1,t}}.
\end{equation*}
The restriction on $\text{IR}_\infty$ now requires the left-hand side of the equation above to converge to zero as $H \rightarrow \infty$.
] but do not impose any other identifying restrictions.

```{r, echo=FALSE}
restrictions_long_run_ir <- matrix(
  NA, nrow=ncol(usmacro_growth), ncol=prior_sigma$prior_sigma_cpp$factor_factors,
  dimnames=list(vars=colnames(usmacro_growth), factors=NULL)
)
```
```{r}
show(restrictions_long_run_ir)
restr <- specify_structural_restrictions(
  mod,
  restrictions_long_run_ir = restrictions_long_run_ir,
)

shock1 <- diag(2)[, 1, drop = FALSE]
```
Because our identification scheme admits sign switches, roughly half of all trajectories have opposite signs and the medians turn out to be about zero for all time horizons. Yet, zero would be a bad estimate: The trajectory plot (with `hairy=TRUE`) reveals that our model predicts a statistically significant effect, we are just unsure about its sign.
```{r, cache=TRUE, fig.height=3}
x <- irf(mod, hairy = FALSE, structural_restrictions = restr, shocks = shock1)
plot(x, vars = "PCECTPI")
```
```{r, cache=TRUE, fig.height=3}
x <- irf(mod, hairy = TRUE, structural_restrictions = restr, shocks = shock1)
plot(x, vars = "PCECTPI", quantiles=0.68, default_hair_color = "#FF000005")
```