---
title: "05 Effect of prior information on transition probabilities and fertility"
author: "Drew Tyre"
date: "`r format(Sys.Date(), '%b %d, %Y')`"
output:
  rmarkdown::html_vignette:
    fig_caption: yes
vignette: >
  %\VignetteIndexEntry{05 Effect of prior information on transition probabilities and fertility}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
bibliography: references.bib
---

## Borrowing strength

Tremblay and McCarthy [-@tremblay2014bayesian] estimated transition probabilities and recruitment rates for another epiphytic orchid from a series of small populations. Although some transitions were not observed in all populations, the presence of a few observations in other populations allowed the statistical models to "borrow strength" from each other and generate estimated probabilities for all transitions. Unfortunately, many (45%; @SalgueroCompadre2015) plant demography studies are carried out at single sites or few time periods. In those cases there is no information to fill in the blanks for non-observed transitions. 

Here we explore the population consequences of setting a prior accounting for counting constraints. Ignoring the problem treats the transition as a fixed zero. This can have severe consequences for matrix structure [@stott2010reducibility]. The observed transitions are multinomial rather than binomial events. This can be accommodated by using a Dirichlet prior distribution, albeit at the cost of increased complexity in the calculations. 
@eberhardt1986grizzly had a small sample size of grizzlies in some age classes. Morris and Doak [pg 197, -@morris2002quantitative] recognized the potential problem with small sample sizes for rare transitions, but their recommendation is to use the mathematically more complex approach of Tremblay and McCarthy [-@tremblay2014bayesian]. Is there anything that can be done on the simple side?

## The matrix

We will use observations of an epiphytic Orchid *Lepanthes eltoroensis* Stimson an endemic species from Puerto Rico [@tremblay2003population].

```{r setup, message=FALSE, echo=-7}
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)
library(ggplot2)
library(popbio)
library(raretrans)

# Raymond's theme modifications
rlt_theme <- theme(axis.title.y = element_text(colour="grey20",size=15,face="bold"),
        axis.text.x = element_text(colour="grey20",size=15, face="bold"),
        axis.text.y = element_text(colour="grey20",size=15,face="bold"),  
        axis.title.x = element_text(colour="grey20",size=15,face="bold"))
data("L_elto")
A <- projection.matrix(as.data.frame(L_elto), 
                       stage="stage", fate="next_stage", 
                       fertility="fertility", sort=c("p","j","a"))
knitr::kable(A, digits=2, 
             caption = "Average transition matrix over all census periods and populations.")
```

```{r hidethis, include = FALSE}
knitr::opts_chunk$set(fig.width = 5, fig.height = 5)
```

Data was collected from 1850 individuals which were permanently identified and tracked over a 6 year period in intervals of 6 months. Thus transitions are for six month intervals and not yearly transitions as is more frequently published.  The life history stages included in this analysis are seedlings (individuals without a lepanthiform sheet on any of the leaves), juveniles (individuals with no evidence of present or past reproductive effort, the base of the inflorescence are persistent), and adult (with presence of active or inactive inflorescences). The average population growth rate over all years and populations is `r format(popbio::lambda(A), digits=2)`, suggesting a rapid decline. 

```{r pop_by_time, fig.cap = "Population size over time."}
plotdata <- L_elto %>% group_by(POPNUM, year, stage) %>%
  tally() %>%
  group_by(POPNUM, year) %>%
  summarise(N = sum(n))

gg1 <- ggplot(plotdata, aes(x=year, y=N, group=POPNUM)) +
  geom_line() + rlt_theme
plotsumm <- plotdata %>% group_by(year) %>%
  summarize(N=sum(N))
gg2 <- ggplot(plotsumm, aes(x=year, y=N)) +
  geom_line() + rlt_theme
ggboth <- gridExtra::arrangeGrob(gg1, gg2)
grid::grid.draw(ggboth)

```

OK, now we want to generate matrices for each year and popnum. We have to change the variables into factors and control the levels so that the resulting matrices end up with the right shape with zeros for the missing transitions. Also have to coerce each grouped tbl_df to a regular data.frame using `as.data.frame()` before passing to `projection.matrix()`. 
```{r}
allA <- L_elto %>% 
  mutate(stage = factor(stage, levels=c("p","j","a","m")),
         fate = factor(next_stage, levels=c("p","j","a","m"))) %>%
  group_by(POPNUM, year) %>%
  do(A = projection.matrix(as.data.frame(.), 
                           stage="stage", fate="fate", 
                           fertility="fertility", sort=c("p","j","a")),
     TF = projection.matrix(as.data.frame(.), 
                           stage="stage", fate="fate", 
                           fertility="fertility", sort=c("p","j","a"), TF = TRUE),
     N = get_state_vector(as.data.frame(.), 
                           stage="stage", sort=c("p","j","a"))) %>%
  
  filter(year < 12) # last time period all zero for obvious reason

```

We end up with all the matrices for each population and year in a `list-column` of the result. Working with this thing is a bit awkward. The best strategy is from [Jenny Bryans' talk on list columns](https://speakerdeck.com/jennybc/putting-square-pegs-in-round-holes-using-list-cols-in-your-dataframe) and uses mutate with map_*(listcolumn, function).

```{r histLmbdaByYear, message=FALSE, fig.cap="histograms of asymptotic population growth for each year."}

library(popdemo)
allA <- ungroup(allA) %>% 
  mutate(A = map(A, matrix, nrow=3, ncol=3,
                 dimnames=list(c("p","j","a"),c("p","j","a")))) %>%
  mutate(lmbda = map_dbl(A, lambda),
         ergodic = map_dbl(A, isErgodic),
         irreduc = map_dbl(A, isIrreducible),
         primitv = map_dbl(A, isPrimitive))
ggplot(allA, aes(x=lmbda)) + geom_histogram(bins = 25) + facet_wrap(~year)+
 ylab("Frequency")+
  xlab(expression(paste(lambda, ", asymptotic population growth"))) + rlt_theme

```

Now, we need to figure out how many matrices are missing transitions. The full matrix has observed transitions in every cell. 

```{r checkErgIrr1, echo=-4}
allA <- ungroup(allA) %>% mutate(missing = map(A, ~which(.x==0)),
                                 n_missing = map_dbl(missing, length))
missing_summary <- summary(allA$n_missing)
ergo_irr <- as.data.frame(with(allA, table(ergodic, irreduc)))

knitr::kable(ergo_irr, caption = "Ergodicity and irreducibility of individual transition matrices.")
```

Out of 11 time periods and `r length(unique(allA$POPNUM))` populations there are a total of `r nrow(allA)` transition matrices. Some periods are missing for some populations. Only `r ergo_irr[4,3]` of these matrices are irreducible. All of the matrices have 2 or more transitions that are zero but known to be possible. 

```{r exampleMatrices, echo=FALSE}
#144 popn 905 year 9 10 individuals go extinct
badA144 <- allA$A[[144]]
badN144 <- allA$N[[144]]
#205 popn 914 year 6 lambda == 1 
badA205 <- allA$A[[205]]
badN205 <- allA$N[[205]]
badA71 <- allA$A[[71]]
badN71 <- allA$N[[71]]
#71 popn 250 year 5 lambda = 0.93
BaseTable <- 
  tribble(~Population, ~Year, ~Stage,     ~N,        ~seedling,    ~juvenile,   ~adult,
          905,             9, "seedling", badN144[1], badA144[1,1],badA144[1,2],badA144[1,3],
          905,             9, "juvenile", badN144[2], badA144[2,1],badA144[2,2],badA144[2,3],
          905,             9, "adult",    badN144[3], badA144[3,1],badA144[3,2],badA144[3,3],
          905,             9, "dead",     NA,         1-sum(badA144[,1]), 1-sum(badA144[,2]), 1-sum(badA144[,3]), 
          914,             6, "seedling", badN205[1], badA205[1,1],badA205[1,2],badA205[1,3],
          914,             6, "juvenile", badN205[2], badA205[2,1],badA205[2,2],badA205[2,3],
          914,             6, "adult",    badN205[3], badA205[3,1],badA205[3,2],badA205[3,3],
          914,             6, "dead",     NA,         1-sum(badA205[,1]), 1-sum(badA205[,2]), 1-sum(badA205[,3]), 
          250,             5, "seedling", badN71[1], badA71[1,1],badA71[1,2],badA71[1,3],
          250,             5, "juvenile", badN71[2], badA71[2,1],badA71[2,2],badA71[2,3],
          250,             5, "adult",    badN71[3], badA71[3,1],badA71[3,2],badA71[3,3],
          250,             5, "dead",     NA,         1-sum(badA71[,1]), 1-sum(badA71[,2]), 1-sum(badA71[,3]))

# make some space for the prior versions
RLT_Tprior <- matrix(c(0.25, 0.025, 0.0,
                       0.05, 0.9,   0.025,
                       0.01, 0.025, 0.95,
                       0.69, 0.05,  0.025), 
                     byrow = TRUE, nrow = 4, ncol = 3)
RLT_Fprior <- matrix(c(0.0, 0.0, 0.025,
                       0.0, 0.0, 0.0,
                       0.0, 0.0, 0.0), 
                     byrow = TRUE, nrow = 3, ncol = 3)

wUnifTable <- bind_cols(BaseTable, BaseTable[,5:7], BaseTable[,5:7])
wUnifTable[1:3, 8:10] <- fill_transitions(allA$TF[[144]], allA$N[[144]])
wUnifTable[4, 8:10] <- as.list(1-colSums(wUnifTable[1:3, 8:10]))
wUnifTable[5:7, 8:10] <- fill_transitions(allA$TF[[205]], allA$N[[205]])
wUnifTable[8, 8:10] <- as.list(1-colSums(wUnifTable[5:7, 8:10]))
wUnifTable[9:11, 8:10] <- fill_transitions(allA$TF[[71]], allA$N[[71]])
wUnifTable[12, 8:10] <- as.list(1-colSums(wUnifTable[9:11, 8:10]))
wUnifTable[1:3, 11:13] <- fill_transitions(allA$TF[[144]], allA$N[[144]], P = RLT_Tprior, priorweight = -1)
wUnifTable[4, 11:13] <- as.list(1-colSums(wUnifTable[1:3, 11:13]))
wUnifTable[5:7, 11:13] <- fill_transitions(allA$TF[[205]], allA$N[[205]], P = RLT_Tprior, priorweight = -1)
wUnifTable[8, 11:13] <- as.list(1-colSums(wUnifTable[5:7, 11:13]))
wUnifTable[9:11, 11:13] <- fill_transitions(allA$TF[[71]], allA$N[[71]], P = RLT_Tprior, priorweight = -1)
wUnifTable[12, 11:13] <- as.list(1-colSums(wUnifTable[9:11, 11:13]))
knitr::kable(wUnifTable, caption = "Two problematic matrices and the matrix with the largest sample size; $\\lambda$ = 0, 1, and 0.92 for population 905, 914 and 205 respectively. The six columns on the right show the effect of assuming a uniform prior for transitions with a total weight of 1, and an RLT prior with total weight equal to 1.  Where columns do not sum to 1 the remaining probability represents mortality.", digits=3, escape = TRUE)
```

## Ignore the problem

Ignoring this problem and calculating stochastic $\lambda$ for each population could yield non-sensical results. For example, the geometric mean $\lambda$ calculated using simulation fails for 6 out of the 23 populations. These are the populations with one matrix that has $\lambda = 0$ at some point. 

```{r stochGrowthRateBad, fig.cap="Stochastic population growth rates for 17 out of 23 orchid populations. "}
sgr <- allA %>% split(.$POPNUM) %>%
  map(~stoch.growth.rate(.x$A, verbose = FALSE))

# distribute results of stoch.growth.rate() into a tbl
xtrc <- function(x){
  data.frame(approx=x$approx,
                sim = x$sim,
                lowerci = x$sim.CI[1],
                upperci = x$sim.CI[2])
}
sgr <- bind_rows(map(sgr, xtrc), .id="POPNUM") 
filter(sgr, !is.na(sim)) %>%
ggplot(aes(x=approx, y = sim)) + geom_point() + 
  geom_errorbar(aes(ymin=lowerci, ymax=upperci)) + 
  geom_abline(slope = 1, intercept=0) + 
  rlt_theme +
  xlab(expression(paste("Tuljapurkar's approximate stochastic ", log(lambda)))) +
  ylab(expression(paste("Stochastic ", log(lambda), " by simulation")))
```

The six populations where the simulated values are unavailable have very low approximate growth rates as well. All the stochastic growth rates are less than 1. Note that these calculations assume each matrix is perfectly observed; the only stochasticity is between year environmental stochasticity as represented by the observed sample of years. 

## Fill in including constraints on transitions

This strategy uses the transitions in a smarter way that maintains the constraint that all survival/transition probabilities have to add up to 1. We do this by using a dirichlet prior with the function `fill_transitions()`. This function takes the matrix as a list of 2 components, a matrix of transition probabilities $T$ and a matrix of fertility contributions $F$. The Dirichlet prior only applies to the $T$ matrix. If not specified, the function assumes a uniform prior across the $m+1$ fates for each stage. The extra category is for individuals that do not survive. Let's look at a single example to see how it works, using population 231 in period 1. This matrix is non-ergodic and reducible, and there are individuals in every stage. 

```{r}
# 
tmat <- allA[23,"TF"][[1]][[1]]$T ## wow! hard to get at ... 
fmat <- allA[23,"TF"][[1]][[1]]$F
N <- allA[23,"N"][[1]][[1]]
tmat + fmat
```

The matrix is reducible because the 2nd column and 2nd row could be eliminated without affecting population growth; this is because the `r N[2]` juveniles died. The matrix is not ergodic because there is no path from the 2nd stage to either of the other stages. 

```{r}
isErgodic((tmat + fmat)[-2,-2])
isIrreducible((tmat + fmat)[-2,-2])
all.equal(lambda((tmat + fmat)[-2,-2]), lambda(tmat + fmat))
```

So, to add a uniform dirichlet prior with a weight of 1 to $T$, we have 4 fates (3 + death), so each fate adds 0.25 to the matrix of *observed* fates (not the transition matrix!).

```{r}
# get matrix of observed fates, including a row for mortality
# apply(tmat, 1, function(x)x * N)
(TNmat <- L_elto %>% 
  mutate(stage = factor(stage, levels=c("p","j","a","m")),
         fate = factor(next_stage, levels=c("p","j","a","m"))) %>%
  filter(POPNUM == allA$POPNUM[23], year == allA$year[23]+1) %>%
  as.data.frame() %>% 
  xtabs(~fate + stage, data = .) %>% 
  as.matrix())
(TNmat2 <- (TNmat + 0.25)[,-4]) # re-cycles automatically, and drop last column (no one starts as m)
```

This matrix has the dirichlet parameters for each state. To convert this to a transition matrix we have to normalize each column by the sum of that column, then drop the last row to return to a square matrix.

```{r}
columnsums <- colSums(TNmat2)
(tmat2 <- sweep(TNmat2, 2, columnsums, FUN = "/")[-4,])
lambda(tmat2 + fmat)
isErgodic(tmat2 + fmat)
isIrreducible(tmat2 + fmat)

```

Comparing these matrices, we can see that all transitions are now represented, even those that were not observed in the actual data. We can automate this exercise for all matrices.

```{r fillDirichlet, fig.cap = "Asymptotic growth rate using a uniform prior with a total weight of 1 vs. the asymptotic growth rate for the observed data.", warning=FALSE}
testing <- allA %>% mutate(Adirch = map2(TF, N, fill_transitions, returnType = "A"),
                           ldirch = map_dbl(Adirch, lambda),                 
         irrdirch = map_dbl(Adirch, isIrreducible),
         ergdirch = map_dbl(Adirch, isErgodic),
         TN = map2(TF, N, fill_transitions, returnType = "TN"))

ggplot(testing, aes(x=lmbda, y=ldirch)) + geom_point() + 
  geom_abline(slope = 1, intercept=0) +
  xlab(expression(paste(lambda, " ignoring zeros"))) + 
  ylab(expression(paste(lambda, " using a uniform prior"))) +
  rlt_theme
```

```{r, eval=FALSE, include=FALSE}
testing %>% summarize(m_ldirch = mean(ldirch)) %>% kableExtra::kable()
```

The matrices with $\lambda = 0$ typically have very few individuals that all die. Matrices with $\lambda = 1$ usually have observed transitions only on the main diagonal. 

```{r checkErgIrr}
ergo_irr <- as.data.frame(with(testing, table(ergdirch, irrdirch)))

knitr::kable(ergo_irr, caption = "Ergodicity and irreducibility of individual transition matrices.")

```

```{r stochDirchVsBad}
sgr_unif <- testing %>% split(.$POPNUM) %>%
  map(~stoch.growth.rate(.x$Adirch, verbose = FALSE))

sgr_unif <- bind_rows(map(sgr_unif, xtrc), .id="POPNUM") 
compare_approx <- tibble(unif = sgr_unif$approx,
                             ignored = sgr$approx)
ggplot(compare_approx, aes(x=ignored, y = unif)) + 
  geom_point() + 
  geom_abline(slope = 1, intercept=0) + 
  xlab(expression(paste("log(",lambda,") ignoring zeros"))) + 
  ylab(expression(paste("log(",lambda,") with uniform prior"))) +
  rlt_theme
```

So using a Dirichlet prior to fill in the gaps works, and doesn't appear to distort the stochastic population growth rate; all populations still have negative growth. Most populations end up with stochastic growth rates that are lower than with the problematic matrices. All resulting matrices are now ergodic and irreducible.

## Fill in using uninformative priors on fertility

The other component of the matrix is the fertility. We do this by using a Gamma prior with the function `fill_fertility()`. This function takes the matrix as a list of 2 components, a matrix of transition probabilities $T$ and a matrix of fertility contributions $F$. The Gamma prior only applies to the $F$ matrix. If not specified, the function assumes an uniformative prior with $\alpha = 0.00001, \beta = 0.00001$ across all stages. Again, demonstrate with a single example. The posterior $\alpha$ is given by the sum of the prior $\alpha = 0.00001$ and the number of offspring next year (2 in this case), while the posterior $\beta$ is simply the sum of the prior and the number of adults contributing to the observed reproduction.

```{r}
post_alpha = 0.00001 + 2
post_beta = 0.00001 + N[3]
```

The expected value for the posterior $F$ is $\alpha / \beta$.

```{r}
fmat2 <- fmat
fmat2[1,3] <- post_alpha / post_beta
fmat2

lambda(tmat + fmat2)
isErgodic(tmat+fmat2)
isIrreducible(tmat+fmat2)
```

The uninformative prior on fertility makes very little difference to lambda, and does not help with issues of ergodicity and irreducibility. 

```{r fillGamma, fig.cap = "Asymptotic growth rate using a uniform prior with a total weight of 1 vs. the asymptotic growth rate for the observed data.", warning=FALSE}
testing <- allA %>% mutate(Agamma = map2(TF, N, fill_fertility, 
                                         alpha = 0.00001, 
                                         beta =  0.00001, 
                                         priorweight = 1, returnType = "A"),
                           lgamma = map_dbl(Agamma, lambda),                 
         irrgamma = map_dbl(Agamma, isIrreducible),
         erggamma = map_dbl(Agamma, isErgodic))

ggplot(testing, aes(x=lmbda, y=lgamma)) + geom_point() + 
  geom_abline(slope = 1, intercept=0) +
  xlab(expression(paste(lambda, " ignoring zeros"))) + 
  ylab(expression(paste(lambda, " using an uninformative prior"))) +
  rlt_theme
```

The matrices with $\lambda = 0$ typically have very few individuals. Matrices with $\lambda = 1$ usually have transitions only on the main diagonal. The uninformative Gamma prior does not change $\lambda$, and does not improve the ergodicity and irreducibility issue. When there is at least 1 adult present, the expected value of fertility is very close to zero. When there are no adults present, all those matrices cannot reach the adult stage and so the expected value of fertility does not affect $\lambda$. However, this is not a necessary condition. 

```{r checkErgIrrGamma}
ergo_irr <- as.data.frame(with(testing, table(erggamma, irrgamma)))

knitr::kable(ergo_irr, caption = "Ergodicity and irreducibility of individual transition matrices.")

```

## Effects of informative priors

We extracted an informative prior from an expert on epiphytic orchids (RLT) to compare with the uniform prior. In particular, we considered the effects of weighting the prior by different fractions of the actual sample size for each population and year. 

```{r makePriors, eval=TRUE}
RLT_Tprior <- matrix(c(0.25, 0.025, 0.0,
                       0.05, 0.9,   0.025,
                       0.01, 0.025, 0.95,
                       0.69, 0.05,  0.025), 
                     byrow = TRUE, nrow = 4, ncol = 3)
RLT_Fprior <- matrix(c(NA_real_, NA_real_, 0.025,
                       NA_real_, NA_real_, NA_real_,
                       NA_real_, NA_real_, NA_real_), 
                     byrow = TRUE, nrow = 3, ncol = 3)
unif_gamma <- matrix(c(NA_real_, NA_real_, 0.00001,
                       NA_real_, NA_real_, NA_real_,
                       NA_real_, NA_real_, NA_real_), 
                     byrow = TRUE, nrow = 3, ncol = 3)
```

```{r calcPriorLambda, include = FALSE}
F1 <- RLT_Fprior
F1[is.na(F1)] <- 0
prior_lambda <- lambda(RLT_Tprior[-4,] + F1)
```

The RLT prior gives $\lambda = `r format(prior_lambda, digits = 2)`$. In an ideal world this prior would be extracted prior to collecting the data. In this case the prior is only being used to demonstrate the technique.

```{r fillDirichletPrior1, fig.cap="Asymptotic growth rates using prior information on transitions and fertility vs. the raw observations alone. The horizontal red line indicates $\\lambda$ for the RLT prior. Points for all four priors shown in transparent grey."}
diffPriors <- list()
diffPriors[["Uniform"]] <- testing %>% 
  mutate(Tprior = map2(TF, N, fill_transitions, returnType = "T"),
         Fprior = map2(TF, N, fill_fertility,
                       alpha = unif_gamma,
                       beta = unif_gamma, priorweight = -1, returnType = "F"),
         Aprior = map2(Tprior, Fprior, `+`),
         lprior = map_dbl(Aprior, lambda))

diffPriors[["RLT, weight = 1"]] <- testing %>% 
  mutate(Tprior = map2(TF, N, fill_transitions, 
                       P = RLT_Tprior, returnType = "T"),
         Fprior = map2(TF, N, fill_fertility,
                       alpha = RLT_Fprior,
                       beta = unif_gamma + 1, priorweight = -1, returnType = "F"),
         Aprior = map2(Tprior, Fprior, `+`),
         lprior = map_dbl(Aprior, lambda))

diffPriors[["RLT, weight = 0.5N"]] <- testing %>% 
  mutate(Tprior = map2(TF, N, fill_transitions, 
                       P = RLT_Tprior, priorweight = 0.5, returnType = "T"),
         Fprior = map2(TF, N, fill_fertility,
                       alpha = RLT_Fprior,
                       beta = unif_gamma + 1, priorweight = 0.5, returnType = "F"),
         Aprior = map2(Tprior, Fprior, `+`),
         lprior = map_dbl(Aprior, lambda))

diffPriors[["RLT, weight = N"]] <- testing %>% 
  mutate(Tprior = map2(TF, N, fill_transitions, 
                       P = RLT_Tprior, priorweight = 1, returnType = "T"),
         Fprior = map2(TF, N, fill_fertility,
                       alpha = RLT_Fprior,
                       beta = unif_gamma + 1, priorweight = 1, returnType = "F"),
         Aprior = map2(Tprior, Fprior, `+`),
         lprior = map_dbl(Aprior, lambda))

diffPriors <- bind_rows(diffPriors, .id="prior")
diffPriors$prior <- factor(diffPriors$prior, levels = c("Uniform", "RLT, weight = 1",
                                                        "RLT, weight = 0.5N", "RLT, weight = N"))
ggplot(diffPriors, aes(x=lmbda, y=lprior)) + 
  geom_point() + 
  geom_point(data=select(diffPriors, -prior), alpha=0.1) + 
  geom_abline(slope = 1, intercept=0, linetype = 2) +
  ylab(expression(paste(lambda, " including prior information"))) + 
  xlab(expression(paste(lambda, " from raw observations"))) + 
  facet_wrap(~prior) + 
  geom_hline(yintercept = 0.96, color="red", linetype = 2 ) + 
  theme_bw() + rlt_theme
```

### Interactive Exploration

You can interactively explore the posterior beta densities and the effects of these informative priors across different populations and years of *Lepanthes eltoroensis* by running the built-in Shiny application:

```{r, eval=FALSE}
raretrans::run_app()
```

# Obtaining credible intervals on vital rates and $\lambda$

Recognizing that the observed transitions are realizations of a Dirichlet distribution gives us a very simple way to derive credible intervals for the transition probabilities in our matrix. The marginal distribution of a single transition is a beta distribution as described above, and we can simply report the 2.5% and 97.5% percentiles of that distribution to provide a credible interval on a transition rate. These intervals do shift and shrink as the weight on the prior increases.

```{r credibleIntervals1, echo=FALSE, fig.width = 7, fig.height = 7, fig.cap="Confidence limits or Credible intervals for transition probabilities out of seedling stage in Population 250, year 5. Shaded bands correspond to 50% (darkest), 90%, 95% and 99.9% (lightest) credible intervals. Smaller intervals are not visible in the p->a transitions when the upper boundary of the interval is smaller than 0.01. Labels are median (lower 95% CI, upper 95% CI). Note that where the median and lower interval are equal to 0.001, the actual value is < 0.001.", warning=FALSE}
# get multinomial confidence interval for Likelihood based interval on NONE
myfunc <- function(x){
  colnames(x) <- c("lcl", "ucl")
  x
}
# this doesn't give the same result as the beta distribution
mnCI <- list(ci_95 = MultinomialCI::multinomialCI(c(1, 7, 0, 3), alpha = 0.05),
             ci_80 = MultinomialCI::multinomialCI(c(1, 7, 0, 3), alpha = 0.2),
             ci_50 = MultinomialCI::multinomialCI(c(1, 7, 0, 3), alpha = 0.5)) %>%
  map(myfunc) %>%
  map(as_data_frame) %>%
  bind_rows(.id = "interval_width")%>%
  mutate(p = rep(c(1, 7, 0, 3) / 11, times=3),
         prior = factor("None", levels = c("None", "Uninformative", "RLT, w = 1", "RLT, w = 0.5N", "RLT, w = N")), 
         transition =  factor(rep(c("s to s", "s to j", "s to a", "s to m"), times=3), levels = c("s to s", "s to j", "s to a", "s to m"))) %>%
  filter(transition != "s to m")

xx <- crossing(prior = c("None", "Uninformative", "RLT, w = 1", "RLT, w = 0.5N", "RLT, w = N"),
         transition = c("s to s", "s to j", "s to a", "s to m"),
         prob = seq(0.001,0.999, 0.001))
params <- tribble(
  ~prior, ~transition, ~a_obs, ~a_prior,
  "None", "s to s",     1,    0,
  "None", "s to j",     7,    0,
  "None", "s to a",     0,    0,
  "None", "s to m",     3,    0,
  "Uninformative", "s to s",     1,.25,    
  "Uninformative", "s to j",     7,.25,
  "Uninformative", "s to a",     0,.25,
  "Uninformative", "s to m",     3,.25,
  "RLT, w = 1", "s to s",     1,.25,    
  "RLT, w = 1", "s to j",     7,.05,
  "RLT, w = 1", "s to a",     0,.01,
  "RLT, w = 1", "s to m",     3,.69,
  "RLT, w = 0.5N", "s to s",     1, 1.375,    
  "RLT, w = 0.5N", "s to j",     7, .275,
  "RLT, w = 0.5N", "s to a",     0, .055,
  "RLT, w = 0.5N", "s to m",     3, 3.795,
  "RLT, w = N", "s to s",     1, 2.75,    
  "RLT, w = N", "s to j",     7, .55,
  "RLT, w = N", "s to a",     0, .11,
  "RLT, w = N", "s to m",     3, 7.59
) 
params <- group_by(params, prior) %>%
  mutate(b_obs = sum(a_obs) - a_obs,
         b_prior = sum(a_prior) - a_prior,
         a = a_obs + a_prior,
         b = b_obs + b_prior,
         lowerci = qbeta(0.025, a, b),
         median = paste0("median=",format(qbeta(0.5, a, b),digits=1,nsmall=3)),
         upperci = qbeta(0.975, a, b),
         expected = paste0("mean=",format(a / (a+b), digits=1, nsmall=3)),
         x = 0.9, y=12) %>%
  ungroup()

# cut strategy doesn't work
# build T/F column for each polygon
xx <- left_join(xx, params, by = c("prior", "transition")) %>%
  mutate(dens = dbeta(prob, a, b),
         prior_dens = dbeta(prob, a_prior, b_prior),
         cump = pbeta(prob, a, b),
         in95 = cump > 0.025 & cump < 0.975,
         in90 = cump > 0.05 & cump < 0.95,
         in80 = cump > 0.1 & cump < 0.9,
         in50 = cump > 0.25 & cump < 0.75,
         prior = factor(prior, levels = c("None", "Uninformative", "RLT, w = 1", "RLT, w = 0.5N", "RLT, w = N")),
         transition = factor(transition, levels = c("s to s", "s to j", "s to a", "s to m"))) 

#xx$layer <- forcats::fct_collapse(xx$layer,
#                                  full = c("full", "ufull"),
#                                  `95%` = c("95%", "u95%"),
#                                  `90%` = c("90%", "u90%"))
params$prior <- factor(params$prior, levels = c("None", "Uninformative", "RLT, w = 1", "RLT, w = 0.5N", "RLT, w = N"))
params$transition <- factor(params$transition, levels = c("s to s", "s to j", "s to a", "s to m"))

# fix alphas
# see discussions at https://stackoverflow.com/questions/17311917/ggplot2-the-unit-of-size 
# to see where I get the conversion factor for points - size to keep fonts the same.
ggplot(filter(xx, transition != "s to m", prior != "None"), aes(x = prob, y=dens)) + #geom_area(aes(alpha = layer)) +
  geom_area(alpha = 0.25) +
  geom_area(data = function(x)x[x$in95,], alpha = 0.25) + 
  geom_area(data = function(x)x[x$in80,], alpha = 0.25) + 
  geom_area(data = function(x)x[x$in50,], alpha = 1) + 
  geom_line(aes(y=prior_dens)) + 
  theme_classic() +
  geom_errorbarh(data = mnCI, mapping = aes(x = p, xmin=lcl, xmax=ucl, size=interval_width, y = 1))+
  geom_point(data = mnCI, mapping = aes(x=p, y=1), color = "white") +
  facet_grid(prior~transition) + #, labeller=label_wrap_gen(width = 15)) + 
  xlab("Transition Probability") + 
  ylab("Probability Density") + 
  ylim(c(0,15)) + 
  geom_text(data=filter(params, transition != "s to m"), 
            mapping=aes(x=x,y=y, label=expected),
            size=12/.pt, hjust=1) +
  geom_text(data=filter(params, transition != "s to m"), 
            mapping=aes(x=x,y=y-3, label=median),
            size=12/.pt, hjust=1) + 
  rlt_theme + theme(panel.spacing = unit(1, "lines"),
                    strip.text = element_text(size=11)) +
  scale_x_continuous(breaks = c(0,0.5,1.0)) + 
  scale_size_manual(values = 
                      c("ci_50"=3, "ci_80" = 2, "ci_95" = 1),
                    guide = FALSE)
```

Obtaining credible intervals on $\lambda$ requires simulation, because it is difficult (or maybe impossible) to use the characteristic equation to combine the probability distributions of the matrix entries to obtain the distribution of $\lambda$. 

```{r lambda_ci, fig.cap="Distribution of asymptotic population growth rates for population 250, year 5, with different prior information."}
#71 popn 250 year 5 lambda = 0.93
diffPriors_lci <- list()
samples = 10000
diffPriors_lci[["Uninformative"]] <- tibble(A = sim_transitions(allA$TF[[71]], 
                                                          allA$N[[71]], 
                                                          samples = samples), 
                                      lprior = map_dbl(A, lambda))

diffPriors_lci[["RLT, weight = 1"]] <- tibble(A = sim_transitions(allA$TF[[71]], allA$N[[71]], 
                                                                  P=RLT_Tprior, 
                                                                  alpha = RLT_Fprior,
                                                                  beta = unif_gamma + 1, 
                                                                  samples = samples), 
                                              lprior = map_dbl(A, lambda))

diffPriors_lci[["RLT, weight = 0.5N"]] <- tibble(A = sim_transitions(allA$TF[[71]], allA$N[[71]], 
                                                                     P=RLT_Tprior,
                                                                     alpha = RLT_Fprior,
                                                                     beta = unif_gamma + 1, 
                                                                     priorweight = 0.5,
                                                                     samples = samples), 
                                                 lprior = map_dbl(A, lambda))

diffPriors_lci[["RLT, weight = N"]] <- tibble(A = sim_transitions(allA$TF[[71]], allA$N[[71]], 
                                                                  P=RLT_Tprior, 
                                                                  alpha = RLT_Fprior,
                                                                  beta = unif_gamma + 1,
                                                                  priorweight = 1,
                                                                  samples = samples), 
                                              lprior = map_dbl(A, lambda))

diffPriors_lci <- bind_rows(diffPriors_lci, .id="prior")
diffPriors_lci$prior <- factor(diffPriors_lci$prior, levels = c("Uninformative", "RLT, weight = 1",
                                                                "RLT, weight = 0.5N", "RLT, weight = N"))

diffPriors_lci_sum <- group_by(diffPriors_lci, prior) %>% 
  summarize(pgt1 = sum(lprior > 1)/samples,
            lmedian = median(lprior),
            lmean = mean(lprior),
            label = paste0("p(lambda > 1) == ",pgt1)) %>% 
  mutate(x = 0.7, y = 10) %>% 
  ungroup()
ggplot(diffPriors_lci, aes(x=lprior)) + 
  geom_density(fill="grey75") + 
  geom_vline(data = diffPriors_lci_sum,
             mapping = aes(xintercept = lmedian), linetype = 2, color = "red") + 
  geom_text(data = diffPriors_lci_sum,
            mapping = aes(x = x, y = y, label = label), parse = TRUE, hjust = "left") + 
  facet_wrap(~prior) + 
  theme_classic() + 
  rlt_theme + 
  xlab(expression(paste(lambda," asymptotic population growth"))) +
  ylab("Posterior density")
```

\clearpage

# References