Type: | Package |
Title: | Binary and Categorical Image Similarity Index |
Version: | 0.2.4 |
Description: | Computes a structural similarity metric (after the style of MS-SSIM for images) for binary and categorical 2D and 3D images. Can be based on accuracy (simple matching), Cohen's kappa, Rand index, adjusted Rand index, Jaccard index, Dice index, normalized mutual information, or adjusted mutual information. In addition, has fast computation of Cohen's kappa, the Rand indices, and the two mutual informations. Implements the methods of Thompson and Maitra (2020) <doi:10.48550/arXiv.2004.09073>. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | Rcpp |
RoxygenNote: | 7.3.2 |
URL: | https://gzt.github.io/catsim/ |
BugReports: | https://github.com/gzt/catsim/issues |
LinkingTo: | Rcpp, testthat |
Suggests: | testthat, covr, knitr, rmarkdown |
VignetteBuilder: | knitr |
Depends: | R (≥ 3.6) |
NeedsCompilation: | yes |
Packaged: | 2024-09-30 20:22:18 UTC; gzt |
Author: | Geoffrey Thompson |
Maintainer: | Geoffrey Thompson <gzthompson@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-10-01 05:00:02 UTC |
CatSIM: Binary and Categorical Image Structural Similarity Metric
Description
Computes a structural similarity metric (after the style of MS-SSIM for images) for binary and categorical 2D and 3D images. Can be based on accuracy (simple matching), Cohen's kappa, Rand index, adjusted Rand index, Jaccard index, Dice index, normalized mutual information, or adjusted mutual information. In addition, has fast computation of Cohen's kappa, the Rand indices, and the two mutual informations.
Author(s)
Maintainer: Geoffrey Thompson gzthompson@gmail.com (ORCID)
See Also
Useful links:
Report bugs at https://github.com/gzt/catsim/issues
A hand-constructed image from Besag (1986)
Description
An 100 \times 88
matrix representing a two-color
hand-drawn scene designed specifically to contain some
awkward features for an image reconstruction method
evaluated in the paper.
Usage
data(besag)
Format
an 100 \times 88
matrix with entries 1
and 2
denoting the color of the corresponding pixels. The example code will
produce the image as it is in the original paper. To use as a 0-1
binary
dataset, either use besag - 1
or besag %% 2
.
References
J. Besag, “On the statistical analysis of dirty pictures,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 48, no. 3, pp. 259–279, 1986. doi:10.1111/j.2517-6161.1986.tb01412.x
Examples
image(besag[, 88:1])
Categorical Structural Similarity Index Metric (whole image)
Description
This computes the categorical or binary structural similarity index metric on a whole-image scale. The difference between this and the default 2-D method is that this considers the whole image at once and one scale rather than computing the index over a sliding window and downsampling to consider it at other scales.
Usage
binssim(
x,
y,
alpha = 1,
beta = 1,
gamma = 1,
c1 = 0.01,
c2 = 0.01,
method = "Cohen",
...
)
Arguments
x , y |
binary or categorical image |
alpha |
normalizing parameter, by default 1 |
beta |
normalizing parameter, by default 1 |
gamma |
normalizing parameter, by default 1 |
c1 |
small normalization constant for the |
c2 |
small normalization constant for the |
method |
whether to use Cohen's kappa ( |
... |
Constants can be passed to the components of the index. |
Value
Structural similarity index.
Examples
set.seed(20181207)
x <- matrix(sample(1:4, 10000, replace = TRUE), nrow = 100)
y <- x
for (i in 1:100) y[i, i] <- 1
for (i in 1:99) y[i, i + 1] <- 1
binssim(x, y)
Multiscale Categorical Structural Similarity Index Measure (2D)
Description
The categorical structural similarity index measure for 2D categorical or binary images for multiple scales. The default is to compute over 5 scales.
Usage
catmssim_2d(
x,
y,
levels = NULL,
weights = NULL,
window = 11,
method = "Cohen",
...,
random = "random"
)
Arguments
x , y |
a binary or categorical image |
levels |
how many levels of downsampling to use. By default, 5. If
|
weights |
a vector of weights for the different scales. By default,
equal to |
window |
by default 11 for 2D and 5 for 3D images,
but can be specified as a
vector if the window sizes differ by dimension.
The vector must have the same number of
dimensions as the inputted |
method |
whether to use Cohen's kappa ( |
... |
additional constants can be passed to internal functions. |
random |
whether to have deterministic PRNG ( |
Value
a value less than 1 indicating the similarity between the images.
Examples
set.seed(20181207)
x <- matrix(sample(0:3, 128^2, replace = TRUE), nrow = 128)
y <- x
for (i in 1:128) y[i, i] <- 0
for (i in 1:127) y[i, i + 1] <- 0
catmssim_2d(x, y, method = "Cohen", levels = 2) # the default
# now using a different similarity score (Jaccard Index)
catmssim_2d(x, y, method = "NMI")
Multiscale Categorical Structural Similarity Index Measure for a Cube (3D)
Description
The categorical structural similarity index measure for 3D
categorical or binary images for multiple scales.
The default is to compute over 5 scales.
This computes a 3D measure based on 5 \times 5 \times 5
windows by default with 5 levels of downsampling.
Usage
catmssim_3d_cube(
x,
y,
levels = NULL,
weights = NULL,
window = 5,
method = "Cohen",
...,
random = "random"
)
Arguments
x , y |
a binary or categorical image |
levels |
how many levels of downsampling to use. By default, 5. If
|
weights |
a vector of weights for the different scales. By default,
equal to |
window |
by default 11 for 2D and 5 for 3D images,
but can be specified as a
vector if the window sizes differ by dimension.
The vector must have the same number of
dimensions as the inputted |
method |
whether to use Cohen's kappa ( |
... |
additional constants can be passed to internal functions. |
random |
whether to have deterministic PRNG ( |
Value
a value less than 1 indicating the similarity between the images.
Examples
set.seed(20181207)
dim <- 16
x <- array(sample(0:4, dim^3, replace = TRUE), dim = c(dim, dim, dim))
y <- x
for (j in 1:dim) {
for (i in 1:dim) y[i, i, j] <- 0
for (i in 1:(dim - 1)) y[i, i + 1, j] <- 0
}
catmssim_3d_cube(x, y, weights = c(.75, .25))
# Now using a different similarity score
catmssim_3d_cube(x, y, weights = c(.75, .25), method = "Accuracy")
Multiscale Categorical Structural Similarity Index Measure by Slice (3D)
Description
The categorical structural similarity index measure for 3D categorical or binary images for multiple scales. The default is to compute over 5 scales. This computes a 2D measure for each x-y slice of the z-axis and then averages over the z-axis.
Usage
catmssim_3d_slice(
x,
y,
levels = NULL,
weights = NULL,
window = 11,
method = "Cohen",
...,
random = "random"
)
Arguments
x , y |
a binary or categorical image |
levels |
how many levels of downsampling to use. By default, 5. If
|
weights |
a vector of weights for the different scales. By default,
equal to |
window |
by default 11 for 2D and 5 for 3D images,
but can be specified as a
vector if the window sizes differ by dimension.
The vector must have the same number of
dimensions as the inputted |
method |
whether to use Cohen's kappa ( |
... |
additional constants can be passed to internal functions. |
random |
whether to have deterministic PRNG ( |
Value
a value less than 1 indicating the similarity between the images.
Examples
set.seed(20181207)
dim <- 8
x <- array(sample(0:4, dim^5, replace = TRUE), dim = c(dim^2, dim^2, dim))
y <- x
for (j in 1:(dim)) {
for (i in 1:(dim^2)) y[i, i, j] <- 0
for (i in 1:(dim^2 - 1)) y[i, i + 1, j] <- 0
}
catmssim_3d_slice(x, y, weights = c(.75, .25)) # by default method = "Cohen"
# compare to some simple metric:
mean(x == y)
Multiscale Categorical Structural Similarity Index Measure
Description
The categorical structural similarity index measure for 2D or 3D categorical or
binary images for multiple scales. The default is to compute over 5 scales.
This determines whether this is a 2D or 3D image and applies the appropriate
windowing, weighting, and scaling. Additional arguments can be passed.
This is a wrapper function for the 2D and 3D functions whose functionality
can be accessed through the ... arguments. This function is a wrapper for the
catmssim_2d()
, catmssim_3d_slice()
, and
catmssim_3d_cube()
functions.
Usage
catsim(
x,
y,
...,
cube = TRUE,
levels = NULL,
weights = NULL,
method = "Cohen",
window = NULL
)
Arguments
x , y |
a binary or categorical image |
... |
additional arguments, such as window, can be passed as well as arguments for internal functions. |
cube |
for the 3D method, whether to use the true 3D method
(cube or |
levels |
how many levels of downsampling to use. By default, 5. If
|
weights |
a vector of weights for the different scales. By default,
equal to |
method |
whether to use Cohen's kappa ( |
window |
by default 11 for 2D and 5 for 3D images, but can be
specified as a vector if the window sizes differ by dimension.
The vector must have the same number of
dimensions as the inputted |
Value
a value less than 1 indicating the similarity between the images.
Examples
set.seed(20181207)
dim <- 16
x <- array(sample(0:4, dim^3, replace = TRUE), dim = c(dim, dim, dim))
y <- x
for (j in 1:dim) {
for (i in 1:dim) y[i, i, j] <- 0
for (i in 1:(dim - 1)) y[i, i + 1, j] <- 0
}
catsim(x, y, weights = c(.75, .25))
# Now using a different similarity score
catsim(x, y, levels = 2, method = "accuracy")
# with the slice method:
catsim(x, y, weights = c(.75, .25), cube = FALSE, window = 8)
Diversity Indices
Description
gini()
is a measure of diversity that goes by a
number of different names, such as the probability of interspecific encounter
or the Gibbs-Martin index. It is 1 - sum(p_i^2)
, where p_i
is the
probability of observing class i.
The corrected Gini-Simpson index, ginicorr
takes the
index and corrects it so that the maximum possible is 1. If there are
k
categories, the maximum possible of the uncorrected index is
1-1/k
. It corrects the index by dividing by the maximum.
k
must be specified.
The modified Gini-Simpson index is similar to the unmodified,
except it uses the square root of the summed squared
probabilities, that is, 1 - \sqrt{ sum(p_i^2)}
, where p_i
is the
probability of observing class i.
The modified corrected Gini index then
corrects the modified index for the number of categories, k
.
Usage
gini(x)
ginicorr(x, k)
sqrtgini(x)
sqrtginicorr(x, k)
Arguments
x |
binary or categorical image or vector |
k |
number of categories |
Value
The index (between 0 and 1), with 0 indicating no variation and 1
being maximal. The Gini index is bounded above by 1-1/k
for a group
with k
categories. The modified index is bounded above by
1-1/\sqrt{k}
. The corrected indices fix this by dividing by the
maximum.
Examples
x <- rep(c(1:4), 5)
gini(x)
x <- rep(c(1:4), 5)
ginicorr(x, 4)
x <- rep(c(1:4), 5)
sqrtgini(x)
x <- rep(c(1:4), 5)
sqrtginicorr(x, 4)
An example fMRI phantom
Description
A 128 \times 128
activation map for a slice of an fMRI
phantom and an anatomical reference.
Usage
data(hoffmanphantom)
Format
a 128 \times 128 \times 2
array with the
first slice an activation map for an MRI phantom and the second an
anatomical overlay. NA
values are outside the surface. The
activation map (hoffmanphantom[,,1]
) is 1 if activated,
0 otherwise. The second layer (hoffmanphantom[,,2]
)
indicates the anatomical structure. Approximately 3.8 percent of the
pixels are activated in this slice.
References
E. Hoffman, P. Cutler, W. Digby, and J. Mazziotta, “3-D phantom to simulate cerebral blood flow and metabolic images for PET,” Nuclear Science, IEEE Transactions on, vol. 37, pp. 616 – 620, 05 1990.
I. A. Almodóvar-Rivera and R. Maitra, “FAST adaptive smoothed thresholding for improved activation detection in low-signal fMRI,” IEEE Transactions on Medical Imaging, vol. 38, no. 12, pp. 2821–2828, 2019.
Examples
image(hoffmanphantom[, , 2], col = rev(gray(0:15 / 16))[1:4], axes = FALSE)
image(hoffmanphantom[, , 1],
add = TRUE, zlim = c(0.01, 1),
col = c("yellow", "maroon")
)
Similarity Indices
Description
The Rand index, rand_index, computes the agreement between two different clusterings or partitions of the same set of objects. The inputs to the function should be binary or categorical and of the same length.
The adjusted Rand index, adj_rand
,
computes a corrected version
of the Rand index, adjusting for the probability
of chance agreement of clusterings. A small constant is added to the
numerator and denominator of the adjusted Rand index to ensure stability
when there is a small or 0 denominator, as it is possible to have a zero
denominator.
Cohen's kappa, cohen_kappa
,
is an inter-rater agreement metric for two raters which
corrects for the probability of chance agreement. Note
there is a difference here
between this measure and the Rand indices and mutual information:
those consider the similarities of the groupings of points,
while this considers how often the
raters agreed on individual points.
Like the Rand index, the mutual information
computes the agreement between two different clusterings or
partitions of the same set of objects. If H(X)
is the
entropy of some probability distribution X
, then
the mutual information of two distributions is
I(X;Y) = -H(X,Y) +H(X) + H(Y)
.
The normalized mutual information, normalized_mi
, is defined here as:
2I(X;Y)/(H(X)+H(Y)),
but is set to be 0 if both H(X) and H(Y) are 0.
The adjusted mutual information, adjusted_mi
,
is a correction of the mutual information to account
for the probability of chance agreement in a manner similar to the
adjusted Rand index
or Cohen's kappa.
Usage
rand_index(x, y, na.rm = FALSE)
adj_rand(x, y, na.rm = FALSE)
cohen_kappa(x, y, na.rm = FALSE)
normalized_mi(x, y, na.rm = FALSE)
adjusted_mi(x, y, na.rm = FALSE)
Arguments
x , y |
a numeric or factor vector or array |
na.rm |
whether to remove |
Value
the similarity index, which is between 0 and 1 for most of the options. The adjusted Rand and Cohen's kappa can be negative, but are bounded above by 1.
References
W. M. Rand (1971). "Objective criteria for the evaluation of clustering methods". Journal of the American Statistical Association. American Statistical Association. 66 (336): 846–850. doi:10.2307/2284239
Lawrence Hubert and Phipps Arabie (1985). "Comparing partitions". Journal of Classification. 2 (1): 193–218. doi:10.1007/BF01908075
Cohen, Jacob (1960). "A coefficient of agreement for nominal scales". Educational and Psychological Measurement. 20 (1): 37–46. doi:10.1177/001316446002000104
Jaccard, Paul (1912). "The distribution of the flora in the alpine zone,” New Phytologist, vol. 11, no. 2, pp. 37–50. doi:10.1111/j.1469-8137.1912.tb05611.x
Nguyen Xuan Vinh, Julien Epps, and James Bailey (2010). Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. J. Mach. Learn. Res. 11 (December 2010), 2837–2854. https://jmlr.org/papers/v11/vinh10a.html
Examples
x <- rep(0:5, 5)
y <- c(rep(0:5, 4), rep(0, 6))
# Simple Matching, or Accuracy
mean(x == y)
# Hamming distance
sum(x != y)
rand_index(x, y)
adj_rand(x, y)
cohen_kappa(x, y)
normalized_mi(x, y)
adjusted_mi(x, y)