Counts of finds can be treated as a depositional phenomenon (see
the vignette Measures of Homogeneity for Depositional
Contexts), but it is just as much of interest to evaluate the
homogeneity between contexts for the number of artifacts that were “in use.” Treating
counts of finds as minimum thresholds in a random right-censored model,
arkhaia offers functionality for simulating potential
distributions of use counts. For more information, see the paper
“Random Right Censoring of Archaeological Count Data” (under
review).
Consider counts of artifacts of five types (A to E) at two different sites, S1 and S2, and in two different time periods, T1 and T2. Have the distributional patterns of artifact use according to type become more or less homogeneous across sites over time? This is difficult to assess directly in this example, because types A and B are not attested in the second time period (T2), so the two tables differ in size:
x1 <- c(2, 0, 10, 11, 5)
x2 <- c(1, 1, 17, 23, 3)
x3 <- c(2, 81, 11)
x4 <- c(5, 0, 1)
T1 <- matrix(c(x1, x2), ncol = 2)
T2 <- matrix(c(x3, x4), ncol = 2)
colnames(T1) <- c("S1T1", "S2T1")
rownames(T1) <- LETTERS[1:5]
colnames(T2) <- c("S1T2", "S2T2")
rownames(T2) <- LETTERS[3:5]
T1
#>   S1T1 S2T1
#> A    2    1
#> B    0    1
#> C   10   17
#> D   11   23
#> E    5    3
T2
#>   S1T2 S2T2
#> C    2    5
#> D   81    0
#> E   11    1
First, for counts of artifacts in use (as opposed to those deposited),
the rate of a Poisson distribution is estimated for each artifact type
with the pois_rcens() function, treating the counts for each type as
minimum thresholds (i.e., it is not simply computing counts per context).
For this task, the matrix should have types along columns, not rows.
Zeros can be either retained or omitted (the latter is recommended):
T1 <- t(T1)
T2 <- t(T2)
pois_rcens(T1, omit_zero = TRUE) # omit_zero = TRUE is default
#>     A     B     C     D     E
#>  4.00  0.01 15.30 18.17  6.32
pois_rcens(T1, omit_zero = FALSE)
#>     A     B     C     D     E
#>  4.00  2.00 15.30 18.17  6.32
To generate a sample contingency table of use counts, the function
trunc_pois() ensures that all counts are greater than or
equal to those in the contingency table of counts deposited:
set.seed(9)
trunc_pois(T1) # omitting zeros
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    4    0   17   20    8
#> [2,]    3    1   17   24   13
trunc_pois(T2) # omitting zeros
#>      [,1] [,2] [,3]
#> [1,]    4   81   11
#> [2,]    5    0    4
One can estimate the effect size of the table (e.g., bias-corrected
Cramér’s \(V\)) in order to assess the
homogeneity of its distribution. This can be done more effectively with
the function VB_trunc_pois(), which generates samples of \(V_B\) by
repeating the simulation over a specified number of iterations. The samples
for the two time periods are referred to below as VB_T1 and VB_T2.
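The idea behind this step can be sketched in base R. This is a minimal illustration under stated assumptions, not the arkhaia implementation: the helper names r_pois_lower(), sim_use_table(), and cramers_v_bc() are hypothetical, the rates are copied from the pois_rcens(T1) output shown earlier, the sampler uses a generic inverse-CDF method rather than whatever trunc_pois() does internally, and the bias correction is assumed to be the one proposed by Bergsma (2013).

```r
# Observed (deposited) counts for T1: sites in rows, types in columns
obs <- rbind(S1T1 = c(A = 2, B = 0, C = 10, D = 11, E = 5),
             S2T1 = c(A = 1, B = 1, C = 17, D = 23, E = 3))

# Rates copied from the pois_rcens(T1) output shown earlier
rates <- c(A = 4.00, B = 0.01, C = 15.30, D = 18.17, E = 6.32)

# Hypothetical helper: draw X ~ Poisson(lambda) conditional on X >= lower,
# by inverse-CDF sampling on the interval (P(X < lower), 1)
r_pois_lower <- function(lower, lambda) {
  p_below <- ppois(lower - 1, lambda)      # equals 0 when lower is 0
  u <- runif(length(lower), min = p_below, max = 1)
  pmax(qpois(u, lambda), lower)            # guard against boundary rounding
}

# Hypothetical helper: one simulated table of use counts, cell by cell
sim_use_table <- function(obs, rates) {
  lambda <- rates[as.vector(col(obs))]     # one rate per cell, column-major
  sim <- obs
  sim[] <- r_pois_lower(as.vector(obs), lambda)
  sim
}

# Hypothetical helper: bias-corrected Cramer's V (assumed Bergsma 2013 form)
cramers_v_bc <- function(tab) {
  # Drop empty rows and columns so that expected counts are never zero
  tab <- tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]
  n <- sum(tab)
  r <- nrow(tab)
  k <- ncol(tab)
  if (r < 2 || k < 2 || n < 2) return(NA_real_)
  expected <- outer(rowSums(tab), colSums(tab)) / n
  phi2 <- sum((tab - expected)^2 / expected) / n
  phi2_bc <- max(0, phi2 - (k - 1) * (r - 1) / (n - 1))
  k_bc <- k - (k - 1)^2 / (n - 1)
  r_bc <- r - (r - 1)^2 / (n - 1)
  sqrt(phi2_bc / min(k_bc - 1, r_bc - 1))
}

# Repeat the simulation and collect one V_B value per simulated table
set.seed(9)
vb <- replicate(1000, cramers_v_bc(sim_use_table(obs, rates)))
summary(vb)
```

Because max(0, ...) clips the corrected statistic at zero, a share of the simulated values will be exactly zero, which is the zero inflation addressed in the next step.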
As the bias-corrected estimate \(V_B\) will be zero-inflated, it is recommended to retain samples on the support \((0,1)\) rather than \([0,1]\):
VB_T1 <- VB_T1[VB_T1 > 0]
VB_T2 <- VB_T2[VB_T2 > 0]
VB_T1 <- sample(VB_T1, 10^3, replace = TRUE)
VB_T2 <- sample(VB_T2, 10^3, replace = TRUE)
To evaluate the change in the homogeneity of artifact distributions between the two time periods, the difference of the distributions is evaluated as \(D = V_{B1} - V_{B2}\). A smaller \(V_B\) indicates a weaker association between site and type, and hence a more homogeneous distribution, so if \(D > 0\) then the second time period has a more homogeneous distribution than the first.
As the distribution of \(D\) is entirely negative, it instead indicates that the first time period has a more homogeneous distribution of artifact types than the second period.
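One way to summarize \(D\) numerically is with quantiles and the proportion of positive differences. The sketch below is illustrative only: vb1 and vb2 are placeholder samples drawn from arbitrary beta distributions so that the code runs on its own, and they are not results produced by arkhaia. In practice the resampled VB_T1 and VB_T2 vectors would take their place.

```r
set.seed(1)

# Placeholder samples standing in for VB_T1 and VB_T2 (NOT arkhaia output)
vb1 <- rbeta(1000, shape1 = 2, shape2 = 20)
vb2 <- rbeta(1000, shape1 = 20, shape2 = 5)

# Element-wise difference of the two equally sized samples
D <- vb1 - vb2

# Central 95% interval and median of the difference
d_interval <- quantile(D, probs = c(0.025, 0.5, 0.975))

# Share of samples in which the second period is the more homogeneous one
p_positive <- mean(D > 0)

d_interval
p_positive
```

An interval lying wholly below zero, together with a proportion near zero, corresponds to the "entirely negative" case described above.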