Mixing model in Short Creek

George G. Vega Yon, Ph.D.

2026-02-03

Warning

The data and model used in this vignette are for demonstration purposes only and do not reflect real-world conditions accurately. As in most agent-based models, the simulations performed here are simplified representations of the reality and provide rich information for scenario modeling, but not necessarily for forecasting or parameter estimation.

During the 2025 US Measles outbreak, the number of cases in the state of Utah started to climb quickly in the Short Creek region (on the Utah-Arizona border). This vignette shows how we can use the mixing model of the measles R package to do scenario modeling of the situation using both age and school data.

Set up

We start by loading the measles R package. Since the package depends on epiworldR, epiworldR is loaded automatically too:

library(measles)
Loading required package: epiworldR
Thank you for using epiworldR! Please consider citing it in your work.
You can find the citation information by running
  citation("epiworldR")
===================== {measles} note =====================
The installed version of the {epiworldR} R package
(0.11.0.1) is not the latest available. Since {epiworldR}
is actively used in public health responses, the package
is continually updated. Please install version 0.11.2.0.
==========================================================

For the data, we need two datasets: information about the population distribution by age and vaccination status, and the contact matrix. The measles R package comes with a pre-processed dataset that was created using the multigroup.vaccine R package. This data uses real vaccination rates from publicly available school records, and population age structure and composition from the latest US census. Vaccination rates for the non-school-aged population were imputed based on assumptions and do not reflect the actual vaccination information for those age groups. We can load the needed data using the data() function:

data(short_creek, package = "measles")
data(short_creek_matrix, package = "measles")

Let’s take a look at the datasets:

short_creek
   age_labels agepops agelims vacc_rate
1      under1     113       1 0.0000000
2        1to4     450       4 0.2355556
3     5to11s1     281      11 0.0569395
4     5to11s2     393      11 0.3180662
5     5to11s3     213      11 0.3661972
6    12to13s4      85      13 0.2235294
7    12to13s5     149      13 0.3557047
8    12to13s6      83      13 0.5783133
9    14to17s7     178      17 0.1460674
10   14to17s8     169      17 0.2307692
11   14to17s9     320      17 0.2812500
12     18to24     892      24 0.2331839
13     25to44    1279      44 0.5770133
14     45to69     962      69 0.9209979
15        70+      63      90 1.0000000

As we can see, the vaccination rates in this area are significantly low. The vaccination rate needed to avert an outbreak is close to the 0.93 percent of the population. Let’s now look at the first few entries of the contact matrix

# Looking at the first 5 columns
short_creek_matrix[, 1:5] |>
  round(2)
         under1 1to4 5to11s1 5to11s2 5to11s3
under1     0.10 0.17    0.05    0.07    0.04
1to4       0.02 0.35    0.06    0.09    0.05
5to11s1    0.01 0.06    0.50    0.08    0.05
5to11s2    0.01 0.06    0.06    0.52    0.05
5to11s3    0.01 0.06    0.06    0.08    0.48
12to13s4   0.00 0.01    0.05    0.07    0.04
12to13s5   0.00 0.01    0.05    0.07    0.04
12to13s6   0.00 0.01    0.05    0.07    0.04
14to17s7   0.00 0.01    0.01    0.02    0.01
14to17s8   0.00 0.01    0.01    0.02    0.01
14to17s9   0.00 0.01    0.01    0.02    0.01
18to24     0.00 0.02    0.01    0.02    0.01
25to44     0.01 0.07    0.04    0.06    0.03
45to69     0.01 0.05    0.03    0.04    0.02
70+        0.00 0.05    0.03    0.05    0.03
# Looking at it as a heatmap
image(short_creek_matrix)

One important thing to note is that the data in short_creek must coincide with that in the contact matrix, particularly, there should be one entry in the data.frame for each row/column of the contact matrix.

Creating the model

To create the model, we do the following:

N <- sum(short_creek$agepops)

measles_model <- ModelMeaslesMixing(
  n                            = N,
  prevalence                   = 1 / N,
  contact_rate                 = 15 / 0.9 / 4,
  transmission_rate            = 0.9,
  vax_efficacy                 = 0.97,
  contact_matrix               = short_creek_matrix,
  hospitalization_rate         = 0.1,
  hospitalization_period       = 7,
  days_undetected              = 2,
  quarantine_period            = 21,
  quarantine_willingness       = 0.9,
  isolation_willingness        = 0.9,
  isolation_period             = 4,
  prop_vaccinated              = 0.95,
  contact_tracing_success_rate = 0.8,
  contact_tracing_days_prior   = 4
)

Now, since this is a mixing model, we are required to specify the entities. To do so, we can either use the functions entity() and add_entity(), or use the wrapper add_entities_from_dataframe() as follows:

# Adding the entities to the model
add_entities_from_dataframe(
  model = measles_model,
  entities = short_creek,
  col_name = "age_labels",
  col_number = "agepops",
  as_proportion = FALSE
)

We can also specify the vaccination rates. There are multiple ways in which we can do that. The default behavior of the model is, at each run, randomly sample prop_vaccinated percent of the population and assign the vaccine. Thus, the number of vaccinated individuals will be constant across simulations. Nonetheless, since the schools and age groups have different vaccination rates, it is better to use the distribute_tool_to_entities() function. Like the add_entities_from_dataframe() function, the distribute_tool_to_entities() function simplifies the process:

# Creating the distribution function
dist_fun <- distribute_tool_to_entities(
  prevalence = short_creek$vacc_rate,
  as_proportion = TRUE
)

# Setting the distribution function
measles_model |>
  get_tool(0) |>
  set_distribution_tool(dist_fun)

Now, we can specify what to record from the simulation.

measles_model |>
  run_multiple(
    ndays = 100,
    nsims = 100,
    seed  = 8812,
    saver = make_saver("outbreak_size", "hospitalizations"),
    nthreads = 2L
  )
Starting multiple runs (100) using 2 thread(s)
_________________________________________________________________________
_________________________________________________________________________
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| done.

After calling the function run_multiple(), the C++ function will write the information to disk. Before we read in the data, we can take a look at the summary of the model, which will give us an overview of the last run, including how much time it spend per simulation, and what is the transsition matrix for the current run.

summary(measles_model)
________________________________________________________________________________
________________________________________________________________________________
SIMULATION STUDY

Name of the model   : Measles with Mixing and Quarantine
Population size     : 5630
Agents' data        : (none)
Number of entities  : 15
Days (duration)     : 100 (of 100)
Number of viruses   : 1
Last run elapsed t  : 0.00s
Total elapsed t     : 5.00s (100 runs)
Last run speed      : 4.30 million agents x day / second
Average run speed   : 9.97 million agents x day / second
Rewiring            : off

Global events:
 - Update infected individuals (runs daily)

Virus(es):
 - Measles

Tool(s):
 - Vaccine

Model parameters:
 - (IGNORED) Vax improved recovery : 0.5000
 - Contact rate                    : 4.1667
 - Contact tracing days prior      : 4.0000
 - Contact tracing success rate    : 0.8000
 - Days undetected                 : 2.0000
 - Hospitalization period          : 7.0000
 - Hospitalization rate            : 0.1000
 - Incubation period               : 12.0000
 - Isolation period                : 4.0000
 - Isolation willingness           : 0.9000
 - Prodromal period                : 4.0000
 - Quarantine period               : 21.0000
 - Quarantine willingness          : 0.9000
 - Rash period                     : 3.0000
 - Transmission rate               : 0.9000
 - Vaccination rate                : 0.9500
 - Vax efficacy                    : 0.9700

Distribution of the population at time 100:
  - ( 0) Susceptible             : 5629 -> 2017
  - ( 1) Exposed                 :    1 -> 75
  - ( 2) Prodromal               :    0 -> 33
  - ( 3) Rash                    :    0 -> 7
  - ( 4) Isolated                :    0 -> 4
  - ( 5) Isolated Recovered      :    0 -> 17
  - ( 6) Detected Hospitalized   :    0 -> 11
  - ( 7) Quarantined Exposed     :    0 -> 5
  - ( 8) Quarantined Susceptible :    0 -> 0
  - ( 9) Quarantined Prodromal   :    0 -> 1
  - (10) Quarantined Recovered   :    0 -> 0
  - (11) Hospitalized            :    0 -> 7
  - (12) Recovered               :    0 -> 3453

Transition Probabilities:
 - Susceptible              0.99  0.01     -     -     -     -     -  0.00  0.00     -     -     -     -
 - Exposed                     -  0.89  0.08     -     -     -     -  0.02     -  0.00     -     -     -
 - Prodromal                   -     -  0.74  0.25  0.00     -     -     -     -  0.01     -     -     -
 - Rash                        -     -     -  0.17  0.17  0.29  0.05     -     -     -     -  0.05  0.28
 - Isolated                    -     -     -  0.02  0.31  0.55  0.09     -     -     -     -  0.01  0.04
 - Isolated Recovered          -     -     -     -     -  0.60     -     -     -     -     -     -  0.40
 - Detected Hospitalized       -     -     -     -     -     -  0.84     -     -     -     -     -  0.16
 - Quarantined Exposed         -  0.02  0.00     -     -     -     -  0.90     -  0.09     -     -     -
 - Quarantined Susceptible  0.05     -     -     -     -     -     -     -  0.95     -     -     -     -
 - Quarantined Prodromal       -     -  0.02     -  0.25     -     -     -     -  0.73     -     -     -
 - Quarantined Recovered       -     -     -     -     -     -     -     -     -     -     -     -     -
 - Hospitalized                -     -     -     -     -     -     -     -     -     -     -  0.85  0.15
 - Recovered                   -     -     -     -     -     -     -     -     -     -     -     -  1.00

To retrieve the results, we use the run_multiple_get_results() function:

# Extracting the results
ans <- measles_model |>
  run_multiple_get_results(
    freader = data.table::fread
  )

# Taking a look at the structure
str(ans)
List of 2
 $ outbreak_size   :Classes 'epiworld_multiple_save_i', 'data.table' and 'data.frame':  10100 obs. of  5 variables:
  ..$ sim_num      : int [1:10100] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ date         : int [1:10100] 0 1 2 3 4 5 6 7 8 9 ...
  ..$ virus_id     : int [1:10100] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ virus        : chr [1:10100] "Measles" "Measles" "Measles" "Measles" ...
  ..$ outbreak_size: int [1:10100] 1 1 1 1 1 1 1 1 4 7 ...
  ..- attr(*, ".internal.selfref")=<externalptr> 
  ..- attr(*, "what")= chr "outbreak_size"
 $ hospitalizations:Classes 'epiworld_multiple_save_i', 'data.table' and 'data.frame':  7483 obs. of  6 variables:
  ..$ sim_num : int [1:7483] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ date    : int [1:7483] 18 25 26 29 30 31 32 33 34 35 ...
  ..$ virus_id: int [1:7483] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ tool_id : int [1:7483] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
  ..$ count   : int [1:7483] 1 1 1 1 5 2 5 4 6 4 ...
  ..$ weight  : int [1:7483] 1 1 1 1 5 2 5 4 6 4 ...
  ..- attr(*, ".internal.selfref")=<externalptr> 
  ..- attr(*, "what")= chr "hospitalizations"
 - attr(*, "class")= chr [1:2] "epiworld_multiple_save" "list"

The function call will get us the results as a list of data.frames (data.table objects in this case). We will use the data.table package to manipulate the information.

# Converting into data.table format for convenience
library(data.table)
outbreak_size <- ans$outbreak_size |> as.data.table()
hospitalizations <- ans$hospitalizations |> as.data.table()

Outbreak size

Finally, since we are only interested about the final outbreak size (in this case), can will collapse the data to get the total number of cases at the final simulation day. Subsequently, we can plot the results using the hist function:

# Aggregating to get the final
with(
  outbreak_size,
  {
    plot(
      x = date,
      y = outbreak_size,
      col = adjustcolor("blue", alpha.f = .2),
      pch = 19,
      ylab = "Cases",
      xlab = "Day",
      main = "Measles outbreak size in Short Creek",
      sub = "Mixing model with age and school data (100 simulations)"
    )
})

We can also investigate the distribution of the final counts

# Aggregating to get the final
with(
  outbreak_size[date == max(date)],
  {
    hist(
      outbreak_size,
      main = "Measles outbreak size in Short Creek",
      sub = "Mixing model with age and school data (100 simulations)",
      breaks = 20
    )
})

Hospitalizations

In the case of the hospitalizations, we can draw a similar figure. The hospitalizations data contains the following information:

  1. Cases per virus (just measles in this case).
  2. Cases per tool (or the lack of). Information about “no tool” is recorded with a tool_id == -1.
  3. The counts (how many records in the data).
  4. Weights.

The weights attribute is to ensure that we take into consideration counting individuals more than once. For instance, if a model has more than one tool (not just vaccination), the individual who has two tools would be included twice in count. Thus, if we wanted to count the raw number of hospitalization cases, we would add across weight, but the count variable yields how many individuals were hospitalized under that combination of tool and virus id.

hosp_tot <- hospitalizations[, .(total = sum(count)), by = .(sim_num, tool_id)]
hosp_tot[, status := fifelse(tool_id == -1, "Unvax", "Vax")]
hosp_tot[, tool_id := NULL]

We can create a couple of boxplot to show how many cases we see per vaccination status:

hosp_tot <- merge(
  hosp_tot[status == "Unvax", .(sim_num, unvax = total)],
  hosp_tot[status == "Vax", .(sim_num, vax = total)],
  all = TRUE
)

# Filling zeros
hosp_tot[, unvax := fcoalesce(unvax, 0L)]
hosp_tot[, vax := fcoalesce(vax, 0L)]

hosp_tot[, .(vax, unvax)] |>
  as.matrix() |>
  boxplot(
    ylab = "Count",
    xlab = "Status",
    main = "Hospitalization per vaccination status",
    sub  = "Mixing model with age and school data (100 simulations)"
  )

Final comments

Ultimately, based on the current results, we would be expecting all unvaccinated individuals to become infected, with a small number of vaccinated individuals also getting infected. This is consistent with the high R0 of measles and the low vaccination rates in the area. Nonetheless, these results should be taken with caution, as this simulation is a simplification of reality that does not consider, for example, individuals’ changing their behavior as a function of what is observed; it would be natural to expect some level of precaution, e.g., reducing contact rates, would be observed in such a situation.