Type: | Package |
Title: | Density-Weighted Proportion |
Version: | 1.1 |
Date: | 2023-06-30 |
Description: | Fit a Poisson regression to carcass distance data and integrate over the searched area at a wind farm to estimate the fraction of carcasses falling in the searched area and format the output for use as the dwp parameter in the 'GenEst' or 'eoa' package for estimating bird and bat mortality, following Dalthorp, et al. (2022) <doi:10.48550/arXiv.2201.10064>. |
Depends: | R (≥ 3.6.0) |
License: | CC0 |
LazyData: | true |
Encoding: | UTF-8 |
Imports: | boot, expint, GenEst, gtools, invgamma, magrittr, MASS, matrixStats, methods, mvtnorm, numDeriv, plotrix, pracma, sf, statmod, VGAM |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2023-06-30 19:36:40 UTC; ddalt |
Author: | Daniel Dalthorp [aut, cre], Manuela Huso [aut] |
Maintainer: | Daniel Dalthorp <ddalthorp@protonmail.com> |
Repository: | CRAN |
Date/Publication: | 2023-07-01 07:40:05 UTC |
Calculate Area of Intersection inside Circle and Square with Common Center
Description
Calculate Area of Intersection inside Circle and Square with Common Center
Usage
Acins(r, s)
Arguments
r |
radius of the circle (vector or scalar) |
s |
half-width of the square (scalar) |
Value
vector of intersections of interiors of circles with squaure
Examples
# calculate area in annulus intersecting square
s <- 10 # radius or half-width of square
r <- c(11, 12) # inner and outer radii of circle
diff(Acins(r, s)) # intersection of square and annulus
# figure to illustrate the calculated area:
theta <- seq(0, 2 * pi, length = 1500)
plot(0, xlim = max(r) * c(-1, 1), ylim = max(r) * c(-1, 1),
xlab = "x", ylab = "y", asp = 1, bty = "n", type = "n")
xi <- r[1] * cos(theta)
yi <- r[1] * sin(theta)
xo <- r[2] * cos(theta)
yo <- r[2] * sin(theta)
i1 <- which(abs(xi) <= s & abs(yi) <= s)
i2 <- which(abs(xo) <= s & abs(yo) <= s)
i2 <- sort(i2, decreasing = TRUE)
xi <- xi[i1]
yi <- yi[i1]
xo <- xo[i2]
yo <- yo[i2]
polygon(col = 8, border = NA,
x = c(xi[xi >= 0 & yi >= 0], xo[xo >= 0 & yo >= 0]),
y = c(yi[xi >= 0 & yi >= 0], yo[xo >= 0 & yo >= 0]))
polygon(col = 8, border = NA,
x = c(xi[xi <= 0 & yi >= 0], xo[xo <= 0 & yo >= 0]),
y = c(yi[xi <= 0 & yi >= 0], yo[xo <= 0 & yo >= 0]))
polygon(col = 8, border = NA,
x = c(xi[xi <= 0 & yi <= 0], xo[xo <= 0 & yo <= 0]),
y = c(yi[xi <= 0 & yi <= 0], yo[xo <= 0 & yo <= 0]))
polygon(col = 8, border = NA,
x = c(xi[xi >= 0 & yi <= 0], xo[xo >= 0 & yo <= 0]),
y = c(yi[xi >= 0 & yi <= 0], yo[xo >= 0 & yo <= 0]))
lines(r[1] * cos(theta), r[1]* sin(theta))
lines(r[2]* cos(theta), r[2] * sin(theta))
rect(-s, -s, s, s)
# calculate areas in series of 1 m annuli extending to corner of square
s <- 10.5 # radius of square (center to side)
diff(Acins(r = 0:ceiling(sqrt(2) * s), s))
Probability Distributions for Carcasses Versus Distance from Turbine
Description
PDFs and CDFs that are required by ddd
, pdd
and
qdd
but are not included among the standard R distributions. Relying on
custom code and included here are the Maxwell-Boltzmann (pmb
and
dmb
), xep0 (Pareto), xep1, xepi0 (inverse gamma), xep2 (Rayleigh),
xep02, xep12, xep012, xep123, and xep0123. Not included here are the
distributions that can be calculated using standard probability functions from
base R, namely the exponential, truncated normal, lognormal, gamma (xep01), and
chisquared distributions and the inverse gaussian, which is calculated using
statmod::dinvgauss
and statmod::dinvgauss
. The functions are
designed for vector x
or q
and scalar parameters.
Usage
dmb(x, a)
pmb(q, a)
dxep1(x, b1)
pxep1(q, b1)
pxep02(q, b0, b2)
dxep02(x, b0, b2)
dxep12(x, b1, b2)
pxep12(x, b1, b2)
dxep123(x, b1, b2, b3, const = NULL)
pxep123(x, b1, b2, b3, const = NULL)
dxepi0(x, shape, scale)
pxepi0(x, shape, scale)
dxep0123(x, b0, b1, b2, b3, const = NULL)
pxep0123(x, b0, b1, b2, b3, const = NULL)
dxep012(x, b0, b1, b2, const = NULL)
pxep012(x, b0, b1, b2, const = NULL)
dxep2(x, s2)
pxep2(x, s2)
dxep0(x, a)
pxep0(x, a)
Arguments
x , q |
vector of distances |
a , b0 , b1 , b2 , b3 , shape , scale , s2 |
parameters used in the respective distributions. |
const |
(optional) scalar normalizing constant for distributions that are
numerically integrated using |
Details
An xep distribution is calculated by dividing its kernel (for the
densities) or the integral of its kernel (for the cumulative distributions)
by the normalizing constant, which is the integral of the kernel from 0 to Inf.
The kernel of an xep distribution is defined as x e^{P(x)}
,
where P(x)
is a polynomial with terms defined by the suffix on xep. For
example, the kernel of xep12 would be x e^{b_1*x + b_2*x^2}
. A 0
in the suffix indicates a log(X)
term and an i
indicates a 1/x
term. The parameters of the xep
distributions are some combination of b_0, b_1, b_2, b_3
. The parameterizations of the
inverse gamma (xepi0), Rayleigh (xep2), and Pareto (xep0) follow the standard
conventions of shape
and scale
for the inverse gamma, s2
=
s^2
for the Rayleigh, and a
= a
for the Pareto (with
a scale or location parameter of 1 and PDF = a/x^(x + 1)
with support (1, Inf).
The Maxwell-Boltzmann is a one-parameter family with parameter a
and PDF
f(a) = \sqrt{2/\pi}\frac{x^2 e^{-x^2/(2a^2)}}{a^3}
. The kernel is
f(a) = x^2 e^{-x^2}
, which has a simple closed-form
integral that involves the error function (pracma::erf
).
Value
vector of probability densities or cumulative probabilities
Check validity of format of custom prior for M
Description
Check validity of format of custom prior for M
Usage
MpriorOK(prior)
Arguments
prior |
a custom prior for M must be a matrix with columns for M and and associated probabalities P(M = m). The M column must begin at 0 and the probabilities must sum to 1. |
Value
boolean. Is the prior formatted properly?
Plot dd and ddArray Objects
Description
Plot CDF, PDF, or rcd (relative carcass density) for a single
carcass dispersion glm model (dd
object) or a list of
models (ddArray
object).
Usage
## S3 method for class 'ddArray'
plot(
x,
type = "CDF",
extent = "full",
distr = "all",
xmax = NULL,
resolution = 250,
mod_highlight = NULL,
...
)
## S3 method for class 'dd'
plot(
x,
type = "CDF",
extent = "full",
xmax = NULL,
resolution = 250,
nsim = 1000,
CL = 0.9,
...
)
## S3 method for class 'fmod'
plot(x, ...)
## S3 method for class 'polygonLayout'
plot(x, ...)
## S3 method for class 'layoutSimple'
plot(x, ...)
## S3 method for class 'psiHat'
plot(x, ...)
## S3 method for class 'dwphat'
plot(x, ...)
Arguments
x |
model(s) to plot |
type |
Type or representation of carcass dispersion to plot:
|
extent |
Plot dispersions as fraction of total carcasses ( |
distr |
vector of names of distributions to plot or |
xmax |
maximum distance to show in the graph; if |
resolution |
The number of line segments to break the curves into when
plotting (i.e., |
mod_highlight |
Character string giving the name of the model to
highlight by plotting it last and with |
... |
arguments that may be passed to plotting functions |
nsim |
Number of simulation reps to use for estimating confidence bounds
for |
CL |
confidence level to show in a |
Details
ddArray
objects are plotted with lines in order
of decreasing AICc, so that the "better" models are closer to the top and
more prominent. The model with the lowest AICc ("best" model) is plotted
last with a heavier line than the others.
For dd
objects, the curve for the MLE of the parameters
is plotted, along with a 100CL
% confidence bounds determined for
nsim
simulation reps
The legend follows the ordering given by modelFilter
with
the default sieve or, if extent = "win"
by (1) delta AICc < 10,
(2) the absence of high-influence points, and (2) AICc. The best model
according to the filter is listed first, with a heavier line than the others;
the remaining distributions are listed in descending order, with the best
models in the leftmost column.
Value
Plot displayed; no return value.
Subset a Set of Fitted Dispersion Models (ddArray
)
Description
Subset a Set of Fitted Dispersion Models (ddArray
)
Usage
## S3 method for class 'ddArray'
x[distr]
Arguments
x |
object to subset |
distr |
vector of names or integer indices of distributions to extract from fitted models |
Details
Subset the ddArray object as if it were a simple vector
Value
list of selected models with crucial statistics
Subset Simulated Dispersion Parameters while Preserving Attributes
Description
Subset Simulated Dispersion Parameters while Preserving Attributes
Usage
## S3 method for class 'ddSim'
x[i, j, ...]
Arguments
x |
object to subset |
i , j |
row and column indices to subset |
... |
ignored |
Details
Subset the ddSim object as if it were a simple matrix or array
Value
array with simulated beta parameters from the glm model, their conversion to distribution parameters. NOTE: subsetting to a column or a row returns a matrix rather than a vector. This simplifies the coding and makes it easier to maintain integrity of data structures, but behavior differs from what is done when subsetting standard R matrices and arrays to a single column or row. Also unlike with standard R arrays and matrices, the class structure and attributes are preserved upon subsetting.
Add Carcasses to a Site Layout
Description
After the site layout is analyzed and structured by rings for analysis,
carcass data may still need to be added to the site data. addCarcass
grabs carcass location data from a shape file or data frame and formats it
into ring data, with carcass tallies in every 1m ring from the turbine to
the maximum search distance away from any turbine.
Usage
addCarcass(x, ...)
## S3 method for class 'shapeCarcass'
addCarcass(
x,
data_ring,
plotLayout = NULL,
ncarcReset = TRUE,
ccCol = NULL,
...
)
## S3 method for class 'data.frame'
addCarcass(
x,
data_ring,
ccCol = NULL,
ncarcReset = TRUE,
unitCol = "turbine",
rCol = "r",
...
)
Arguments
x |
carcass data to insert into |
... |
ignored |
data_ring |
ring data for receiving carcass data from |
plotLayout |
(optional) |
ncarcReset |
boolean to direct the function to set the carcass counts
in all the rings to 0 before adding the new carcasses (default) or to add the
new carcasses to the old totals ( |
ccCol |
name of carcass class column (optional). Typically, the "carcass class" would be for carcass characteristics that would be expected to affect distances that carcasses would fall from the turbine. For example, distances would not be expected to be the same for large and small carcasses, and bats may have significantly different distance distributions than small birds. The ccCol could also be used for subsetting by any covariate that would be expected to interact with carcass distance distributions, like season (if winds vary by season) or turbine type (if the site has a diverse mix of turbines of different sizes or types). Additionally, ccCol can be used to subset the data by area (for example, NW, NE, SW, SE; or hilltop, river bank) or any other discrete covariate that the user may be interested in. |
unitCol |
name of unit column |
rCol |
name of column with carcass distances |
Value
an object of class rings
with a tally of the number of
of carcasses discovered in each concentric 1m ring from the turbine to the
most distant point searched.
Examples
data(layout_simple)
data(carcass_simple)
sitedata <- initLayout(layout_simple)
ringdata <- prepRing(sitedata)
ringsWithCarcasses <- addCarcass(carcass_simple, data_ring = ringdata)
Calculate Akaike Information Criterion (AICc) for Distance Distributions
Description
functions for calculating AICc for carcass dispersion models.
Usage
aic(x, ...)
## S3 method for class 'ddArray'
aic(x, extent = "full", ...)
## S3 method for class 'ddArraycc'
aic(x, extent = "full", ...)
## S3 method for class 'dd'
aic(x, ...)
Arguments
x |
list of models ( |
... |
ignored |
extent |
Include only the extensible models ( |
Value
Data frame with AICc and deltaAICc for all models in x
Names of the Named Distributions
Description
Names of the Named Distributions
Usage
alt_names
Format
An object of class character
of length 10.
Example Carcass Distances to Accompany the Polygon Layout Data Set
Description
Example Carcass Distances to Accompany the Polygon Layout Data Set
Usage
carcass_polygon
Format
A data frame illustrating an import format for carcass distances.
There are columns with turbine IDs (turbine
) and carcass distances
from turbine (r
). Distances (r
) are in meters from the nearest
turbine. The data set is used in the "polygon" example in the User Guide.
Carcass Data to Accompany the Simple Geometry Data Format
Description
Carcass Data to Accompany the Simple Geometry Data Format
Usage
carcass_simple
Format
An object of class data.frame
with 55 rows and 2 columns.
Full, Simulated carcass_simple Data Set (Including Locations and Missed Carcasses)
Description
Full, Simulated carcass_simple Data Set (Including Locations and Missed Carcasses)
Usage
carcass_simple0
Format
An object of class data.frame
with 100 rows and 6 columns.
Convert GLM Coefficients into Named Distribution Parameters
Description
Convert GLM Coefficients into Named Distribution Parameters
Usage
cof2parms(x, ...)
## S3 method for class 'matrix'
cof2parms(x, distr, ...)
## S3 method for class 'numeric'
cof2parms(x, distr, ...)
## S3 method for class 'dd'
cof2parms(x, ...)
## S3 method for class 'matrix'
parms2cof(x, distr, ...)
Arguments
x |
object (vector or matrix of parameters, dd, or glm)
with named glm parameters ( |
... |
ignored |
distr |
name of the distribution |
Value
matrix of parameters
Check Whether GLM Coefficients Give Proper Distribution
Description
In order for a fitted GLM to convert to a proper distance distribution, its
integral from 0 to Inf must be finite. As a rule, when the leading coefficient
is positive, the integral diverges as the upper bound of integration approaches
infinity, and cofOK
would return FALSE
. Likewise, in some cases,
the GLM coefficients yield an integral that diverges as the lower bound
approaches 0, in which case cofOK
returns FALSE
as well.
cofOK0
and cofOKInf
check the left and right tails of the
candidate distribution, repectively, for convergence.
Usage
cofOK(cof, distr)
cofOK0(cof, distr)
cofOKInf(cof, distr)
Arguments
cof |
vector or matrix of named glm parameters (with |
distr |
name of the distribution |
Value
boolean vector (or scalar)
Names of the GLM Coefficients for Each Distribution
Description
Names of the GLM Coefficients for Each Distribution
Usage
cof_name
Format
A list with the names of the coefficients (vector of character strings)
as they appear in the $coefficients
value returned from glm
for
each model.
Constraints on GLM Coefficients for Extensibility to a Distribution
Description
Constraints on GLM Coefficients for Extensibility to a Distribution
Usage
constraints
Format
A list of matrices giving the upper and lower bounds that each
model's coefficients must meet for the model to be extensible to a
distribution. The parscale
column may be used in optim
for
fitting a truncated weighted likelihood model.
Constraints on Parameters to Assure Extensibility
Description
Constraints on Parameters to Assure Extensibility
Usage
constraints_par
Format
An object of class list
of length 17.
Extract Parameters from a Distance Model (dd
) and Format as ddSim
Object
Description
This is a utility function called internally by dwp
functions
to extract parameters from a fitted dd
model and
formats them for analysis and calculation.
Usage
dd2ddSim(dd)
Arguments
dd |
|
Value
ddSim
object with 1 row
Calculate CI for CDF, PDF, or quantile
Description
Calculate a confidence interval for the CDF, PDF, or quantile of a carcass distance distribution.
Usage
ddCI(
mod,
x,
type = "CDF",
CL = 0.9,
nsim = 1000,
extent = "full",
zrad = 200,
na.tol = 0.1
)
Arguments
mod |
a |
x |
distance from turbine (scalar or vector) or probability (for quantile) |
type |
|
CL |
confidence level for the confidence interval(s) |
nsim |
number of simulation draws to base the estimate of |
extent |
whether to calculate |
zrad |
an ad hoc radius to integrate to when the (uncommon) simulated parameter estimates do not result in an extensible distribution. In effect, This replaces NAs with 1s in CDFs and with 0s in PDFs. |
na.tol |
maximum fraction of invalid parameter sets to discard when
constructing CIs; abort if |
Value
array (ddCI
class) with columns for distance and the CI bounds
Fit Distance Distribution Model(s)
Description
Fit generalized linear models (glm) for distance distribution models corresponding to standard forms [xep1, xep01 (gamma), xep2 (Rayleigh), xep02, xep12, xep012, xep123, xep0123 (normal-gamma with x = tau), lognormal, truncated normal, Maxwell Boltzmann, and constant] and supplentary forms [exponential, chi-squared, inverse gamma, and inverse Gaussian].
The glm is converted to a probability distribution by dividing by a
normalizing constant, namely the integral of the glm evaluated from 0 to
infinity. In some cases (most notably when the leading coefficient of the
glm is positive so the fitted curve does not converge to zero as x increases),
converted to a probability distribution. In these cases, the distribution
parameters are given as NA
, but the fitted model itself is saved.
Usage
ddFit(x, ...)
## S3 method for class 'data.frame'
ddFit(
x,
distr = "standard",
scVar = NULL,
rCol = "r",
expoCol = "exposure",
ncarcCol = "ncarc",
silent = FALSE,
...
)
## S3 method for class 'rings'
ddFit(
x,
distr = "standard",
scVar = NULL,
rCol = "r",
expoCol = "exposure",
ncarcCol = "ncarc",
silent = FALSE,
...
)
## S3 method for class 'list'
ddFit(
x,
distr = "standard",
scVar = NULL,
rCol = "r",
expoCol = "exposure",
ncarcCol = "ncarc",
silent = FALSE,
...
)
## S3 method for class 'xyLayout'
ddFit(
x,
distr = "standard",
scVar = NULL,
notSearched = NULL,
rCol = "r",
ncarcCol = "ncarc",
unitCol = "turbine",
silent = FALSE,
...
)
## S3 method for class 'ringscc'
ddFit(
x,
distr = "standard",
scVar = NULL,
rCol = "r",
expoCol = "exposure",
ncarcCol = "ncarc",
silent = FALSE,
...
)
Arguments
x |
a search plot layout object to fit carcass distribution models to. The layout may be a data frame with columns for ring radii, exposure (or searched area in each ring), search class variable (optional), and number of carcasses in each ring; |
... |
ignored |
distr |
names (vector of character strings) of glm distribution templates
to fit. Default is |
scVar |
Search class variable to include in the model (optional). |
rCol |
name of the distance column (which gives the outer radii of the rings).
This will be correct by default for objects coming from |
expoCol |
name of the column with the exposure, which is the area in the ring
with outer radius |
ncarcCol |
name of the column with tallies of carcasses by ring. This
will be correct by default for objects coming from |
silent |
set |
notSearched |
the name of the level (if any) in |
unitCol |
name of the column with turbine IDs |
Value
A list of fitted glm models as dd
objects in a
ddArray
object if a vector of distributions is fit, or a single dd
object if a single model is fit. The dd
objects are
lists that include the following elements:
glm
the fitted model
$distr
name of the distribution (
"xep01"
, etc.)$parms
vector of distribution parameter estimates (or
NA
if the model based on the MLE is not extensible)$varbeta
the variance-covariance matrix of the glm parameter estimates. NOTE: This is identical to the covariance matrix from the glm, which can be extracted via
summary(x)$cov.unscaled
$scVar
name of the (optional) search class variable (or
NULL
)$ncarc
number of carcasses
$aicc
the AICc value of the fit
$n
number of rings
$k
number of parameters
$srad
search radius
When a dd
object is printed, only a small subset of the elements are
shown. To see a full list of the objects, use names(x)
. The elements
can be extracted in the usual R way via $
or [[x]]
.
Examples
data(layout_simple)
data(carcass_simple)
sitedata <- initLayout(layout_simple) # initialize
ringdata <- prepRing(sitedata) # format site layout data for modeling
ringsWithCarcasses <- addCarcass(carcass_simple, data_ring = ringdata) # add carcasses to site
distanceModels <- ddFit(ringsWithCarcasses) # fit distance models
Print S3 Objects in dwp
Package
Description
dd
, ddArray
, and
fmod
objects are lists consisting of a great
amount of data. Only a few of the elements are printed automatically. Other
elements of object x
can be viewed and extracted as with other lists
in R, namely, by using the x$element
or x[[element]]
operator,
where element
is the name of one of the elements of x
, all of
which can be viewed via names(x)
.
Usage
## S3 method for class 'dd'
print(x, ...)
## S3 method for class 'ddArray'
print(x, ...)
## S3 method for class 'fmod'
print(x, ...)
Arguments
x |
|
... |
ignored |
Value
no return value; output printed to the console
Simulation of Dispersion Parameters
Description
Simulation of Dispersion Parameters
Usage
ddSim(x, ...)
## S3 method for class 'dd'
ddSim(x, nsim = 1000, extent = "full", ...)
Arguments
x |
object to simulate from |
... |
ignored |
nsim |
number of simulation draws |
extent |
simulate according to full distribution, including extrapolation
beyond the search radius ( |
Value
array with simulated beta parameters from the glm model, and their conversion to distribution parameters
Calculate Probability Functions for Distance Distributions
Description
Calculate the standard d/p/q/r family of R probability functions for distance
distributions (dd
) as well as the relative carcass
density (rcd
). Usage broadly parallels that of the d/p/q/r probability
functions like dnorm
, pnorm
, qnorm
, and
rnorm
.
Usage
ddd(x, model, parms = NULL, extent = "full", zrad = 200)
pdd(q, model, parms = NULL, extent = "full", zrad = 200, silent = FALSE)
qdd(p, model, parms = NULL, extent = "full", zrad = 200, subdiv = 1000)
rdd(n, model, parms = NULL, extent = "full", zrad = 200, subdiv = 1000)
rcd(x, model, parms = NULL, extent = "full", zrad = 200)
Arguments
x , q , p , n |
numeric, |
model |
either a |
parms |
model parameters; required if model is specified as a character
string rather than a |
extent |
for a full distribution extrapolated beyond the search radius
to account for all carcasses, use |
zrad |
the distance at which carcass density is assumed to be zero; to be used only in simulation reps in which simulated parameters do not yield extensible distributions, essentially returning 0 rather than NA for those pathological cases. |
silent |
If |
subdiv |
if the number of values to calculate with |
Details
The probability density function (PDF(x) = f(x) = ddd(x, ...)
)
gives the probability that a carcass falls in a 1 meter ring centered at the
turbine and with an outer radius of x
meters. The cumulative distribution
function [CDF(x) = F(x) = pdd(x, ...)
] gives the
probability that a carcass falls within x
meters from the turbine. For
a given probability, p
, the inverse CDF [qdd(p,...)
] gives the
p
quantile of carcass distances. For example, qdd(0.5,...)
gives the median carcass distance, and qdd(0.9, ...)
gives the radius
that 90% of the carcasses are expected to fall in. Random carcass distances
can be generated using rdd
.
The relative carcass density function(rcd
) gives relative carcass
densities at a point x
meters from a turbine. In general, rcd is
proportional to PDF(x)/x, normalized so that the surface of rotation of rcd(x)
has total volume of 1. There are more stringent contstraints on the allowable
parameters in the fitted (or simulated) glm's because the integral of PDF(x)/x
must converge.
Distributions may be extrapolated beyond the search radius to account for all
carcasses, including those that land beyond the search radius
(extent = "full"
), or may be restricted to carcasses falling within the
searched area (extent = "win"
). Typically, in estimating dwp
for
a fatality estimator like eoa
or GenEst
, the full distributions
would be used.
The probability functions have a number of purposes. A few of the more commonly used are listed below.
- PDF and CDF (
ddd
andpdd
): -
to calculate the probability that carcass lands at a distance
x
meters from the turbine (or, more precisely, within 0.5 meters ofx
) or withinx
meters from the turbine, use a scalar value ofx
and a single model (dd
orddSim
) withddd
orpdd
, repspectively;to account for uncertainty in the probabilities at
x
, useddd
orpdd
for with scalarx
and a simulated set of parameters from the fitted model (ddSim
object). This would be useful for calculating confidence intervals for the probabilities;to calculate probabilities for a range of
x
values according to a single model, use a vectorx
with add
object or addSim
object with one row. This would be useful for drawing graphs of PDFs or CDFs;to calculate simulated probabilites for a range of
x
values, use a vectorx
and addSim
object of simulated parameter sets. This would be useful for drawing confidence regions around a fitted PDF or CDF.
- Inverse CDF (
qdd
): -
to calculate the distance that 100
p
% of the carcasses are expected to fall, use a scalarp
in the interval (0, 1) and a single model (dd
) or parameter set (ddSim
with one row);to calculate account for the uncertainty in estimating the inverse CDF for a given
p
, use a scalarp
and addSim
object. This would be useful for calculating a confidence interval for, say, the median or the expected 90th percentile of carcass distances;to calculate the inverse CDF for a range of probabilities for a single model, use a vector
p
and a single model (dd
orddSim
object with one row.
- Random Carcasses Distances (
rdd
): -
to generate
n
random carcass distances for a given (fixed) model, use add
object or addSim
object with a single row;to generate
n
random carcass distances for a model and account for the uncertainty in estimating the model, use addSim
object withn
rows, wheren
is also used as then
argument in the call tordd
.
- Relative Carcass Density (per m^2):
-
to calculate the relative carcass density at a number of distances, use a vector
x
. This would be useful in generating maps of carcass density at a site.
Value
vector or matrix of values; a vector is returned unless model
is a ddSim
object with more than one row and is to be calculated for
more than one value (x
, q
, p
), in which case an array
with dimensions length(x)
by nrow(model)
is returned (where
"x
" is x
, q
, or p
, depending on whether ddd
,
pdd
, or qdd
is called).
Examples
data(layout_simple)
data(carcass_simple)
sitedata <- initLayout(layout_simple)
ringdata <- prepRing(sitedata)
ringsWithCarcasses <- addCarcass(carcass_simple, data_ring = ringdata)
distanceModels <- ddFit(ringsWithCarcasses)
modelEvaluations <- modelFilter(distanceModels)
bestModel <- modelEvaluations$filtered
pdd(100, model = bestModel) # estimated fraction of carcasses within 100m
ddd(1:150, model = bestModel) # estimated PDF of the carcass distances
qdd(0.9, model = bestModel) # estimated 0.9 quantile of carcass distances
rdd(1000, model = bestModel) # 1000 random draws from estimated carcass distribution
An Ordering of the Models by Degree of the Polynomial
Description
An Ordering of the Models by Degree of the Polynomial
Usage
degOrder
Format
An object of class character
of length 17.
Full List of Names of Distributions
Description
Full List of Names of Distributions
Usage
distr_names
Format
An object of class character
of length 17.
Density-Weighted Proportion
Description
This package is designed to analyze carcass dispersion data and fit models of carcass density as function of distance from turbine.
Data sets
carcass_polygon
carcass_simple
layout_eagle
layout_polygon
layout_simple
layout_xy
xyr
sieve_default
Main Command-Line Functions
initLayout, prepRing, readCarcass, addCarcass
import and format data
ddFit
fit carcass distribution models
estpsi, estdwp
estimate probability that carcass will lie in the searched area (psi) and the fraction of carcasses lying in the searched area ('dwp')
formatGenEst, exportGenEst
format and export 'dwphat' objects for use with GenEst
aic, modelFilter, stats, ddCI
statistics for fitted models
plot
S3 function for
ddArray
,dd
,fmod
,polygonLayout
,psiHat
,dwpHat
objects.ddd, pdd, qdd, rdd, rcd
probability functions for distance distributions
Potentially Useful Calculation and Editing Functions
,
getncarc
extract the number of carcasses per turbine from a data set; method for many types of objects
,
cof2parms, cofOK, cofOK0, cofOKInf, constraints
functions for manipulating and checking model coefficients
Acins
calculate the area of the intersection of a circle and square sharing a common center
rmat, off
functions for constructing functions out of distribution information
exclude
simple function for excluding items from a superset
Estimate DWP
Description
Estimate the density-weighted proportion (DWP) of carcasses lying in the
searched area at each turbine at a site. The calculation requires prior
estimation of the expected proportion (psi
) and the
number of carcasses found (ncarc
). NOTE: The
carcass counts affect the uncertainty in the estimate of the fraction
of carcasses in the searched area (DWP), and ncarc
is required for
accounting for uncertainty in estimates of DWP.
Usage
estdwp(x, ...)
## S3 method for class 'psiHat'
estdwp(x, ncarc, nboot = NULL, forGenEst = FALSE, silent = TRUE, ...)
## S3 method for class 'psiHatcc'
estdwp(x, ncarc, nboot = NULL, forGenEst = FALSE, silent = TRUE, ...)
Arguments
x |
Either (1) |
... |
ignored |
ncarc |
vector of total carcass count at each turbine represented in x. |
nboot |
number of parametric bootstrap iterations for estimating CIs |
forGenEst |
format the results for importing into GenEst (boolean) |
silent |
suppress messages from the fitting of a beta distribution in internal calculations that, if successful, increase the speed of the calculations by 20-200x. The message would signal that this acceleration cannot be applied. |
Value
list
Estimate Probability Carcass lands in Searched Area
Description
Estimated probability that carcass lands in searched area. This
is an intermediate step in estimating dwp but is also interesting in its own
right. The estimation involves integrating the modeled carcass distribution
(model
) over the search plots at the turbines. Data for the search
plots is stored in the generic argument, x
, which can take any of a
number of different forms, as described in the Arguments
section (below).
Usage
estpsi(x, ...)
## S3 method for class 'rings'
estpsi(x, model, extent = "full", nsim = 1000, zrad = 200, ...)
## S3 method for class 'ringscc'
estpsi(
x,
model,
modnames = NULL,
extent = "full",
nsim = 1000,
zrad = 200,
...
)
## S3 method for class 'xyLayout'
estpsi(x, model, extent = "full", nsim = 1000, zrad = 200, ...)
## S3 method for class 'rpA'
estpsi(x, model, extent = "full", nsim = 1000, zrad = 200, ...)
## S3 method for class 'rdat'
estpsi(x, model, extent = "full", nsim = 1000, zrad = 200, ...)
## S3 method for class 'data.frame'
estpsi(x, model, extent = "full", nsim = 1000, zrad = 200, ...)
Arguments
x |
data
|
... |
ignored |
model |
A fitted |
extent |
calculate dwp within searched radius only ( |
nsim |
number of parametric bootstrap iterations for accounting for
uncertainty in the estimator. Default is |
zrad |
radius |
modnames |
if |
Value
A psiHat
object, which is either 1) an array giving the
expected fraction of carcasses lying in the searched area at each turbine
with nsim
rows and one column for each turbine + one row for the
total; or 2) a list of such arrays, one for each carcass class if x
is a ringscc
object. The uncertainty in the expected fractions is
characterized by simulation and reflected in the variation in psi
values within each column.
Remove Particular Names from a Longer List
Description
Removes specific values (what
) from a longer vector of values (from
).
By default, from = mod_standard
, and the intent is to simplify the
subsetting of ddArray
objects created with the default standard models.
For example, dmod2 <- dmod[exclude("lognormal")]
would subset the list
of models in mod_standard
to exclude "lognomal"
. The default can
be overridden by providing a specific vector for from
(for example,
dmod[exclude("lognormal", from = names(dmod)])
).
Usage
exclude(what, from = mod_standard)
Arguments
what |
vector of distribution names to exclude |
from |
vector of distribution names to be excluded from |
Value
vector of names from "from" after excluding "what"
Export Estimated Density-Weighted Proportion to File in Proper GenEst Format
Description
GenEst imports DWP from files with comma-separated values (.csv), with a
column giving turbine ID and a column of DWP values for each carcass class.
Column lengths are equal to the number of turbines times the number of
simulation reps (typically, nsim = 1000
), giving nsim
copies of
DWP values for each turbine in a single column.
Usage
exportGenEst(dwp, file)
Arguments
dwp |
a |
file |
name of file to export the |
Details
NOTE: The .csv
file uses the English convention (used in USA, UK,
Mexico, China, India, Australia, Japan, Korea, and others) with the comma ( , )
and not the semi-colon ( ; ) to separate values among different columns and
uses the period ( . ) as the decimal mark. Although GenEst can seamlessly
accommodate either format, users in countries where the comma or other
character is used as a decimal mark may need to adjust their software settings
or edit the data to be able to view it in a spreadsheet program (such
as Excel).
Value
The function writes the formatted data to a .csv
file and
returns NULL.
Find suitable mmax for clipping improper priors for M
Description
Improper priors need to be clipped in order to be usable.
fmmax
and fmmax.ab
find values of m
that are large enough
that the probability of exceeding is less than 0.0001 (depends on g
and
X
).
Usage
fmmax(x, g)
fmmax.ab(x, pBa, pBb)
Arguments
x |
carcass count |
g |
overall carcass detection probability |
pBa , pBb |
parameters for beta distribution characterizing estimated |
Value
integer m
such that Pr(M >= m) < 0.0001
Format DWP Estimate for Use in GenEst
Description
GenEst requires dwp
data to be formatted as a data frame with columns
for turbine ID and for estimated dwp
for each carcass class. To
incorporate uncertainty in the estimates, nsim
simulated copies of
the basic format are appended to the columns in the data set.
Usage
formatGenEst(dwphat)
Arguments
dwphat |
a |
Value
an nsim*nturbine
by nclass + 1
data frame, with columns
for the turbine ID and for estimated dwp
for each carcass class (e.g.,
large
, medium
, small
, bat
).
Simple Function to Extract Carcass Counts
Description
Carcass counts are easy to extract from any of the data structures, but it
may be difficult to remember where to retrieve the data from for any particular
structure. getncarc
simplifies the task by having the same usage for
all data types.
Usage
getncarc(x, ...)
## S3 method for class 'ringscc'
getncarc(x, ...)
## S3 method for class 'rings'
getncarc(x, ...)
## S3 method for class 'xyLayout'
getncarc(x, ...)
## S3 method for class 'ddArray'
getncarc(x, ...)
## S3 method for class 'ddArraycc'
getncarc(x, ...)
## S3 method for class 'dd'
getncarc(x, ...)
Arguments
x |
a data structure with |
... |
ignored |
Value
scalar number of carcasses used in the fitted model (
dd
andddArray
objectsvector of numbers of carcasses of each size used in the fitted models (
ddArraycc
objects)vector of carcass counts at each turbine and total at the site (
xyLayout
andrings
objectslist of vectors of carcass counts at each turbine for each carcass class
Incomplete Gamma Function
Description
Incomplete Gamma Function
Usage
incGamma(a, x, lower = FALSE)
Arguments
a |
positive numeric vector |
x |
non-negative numeric vector |
lower |
boolean for calculating lower or upper incomplete gamma function |
Details
The upper incomplete gamma function, following Wolfram Alpha, namely,
incGamma(a, x) = \int_x^\infty e^{-t} * t^{a - 1}dt
,
calculated using pgamma. Within the dwp
package, incGamma
is
used in the calculation of the cumulative distribution function (CDF) of the
xep02 distribution (pxep02
). NOTE: The function pracma::incgam
also
calculates incomplete gamma with pracma::incgam(x, a) = incGamma(a, x)
,
but pracma::incgam
is not vectorized and not used here.
Value
scalar or vector of length = max(length(x), length(a)), with values of the shorter recycled to match the length of the longer a la pnorm etc.
Create a Data Structure or Map for the Site Layout
Description
Read plot layout data and perform premliminary error-checking
and formatting. Search plot layout data can come in any of several different
formats, including shape files for search area polygons and turbine locations,
R polygons, (x, y) coordinates, or simple description of search plot type for
each turbine (square, circular, road & pad). A vector of distances along with
a search radius is also accommodated by dwp
, but these can be directly
processed in prepRing
without preprocessing in initLayout
.
Usage
initLayout(
data_layout,
dataType = "simple",
unitCol = "turbine",
file_turbine = NULL,
radCol = "radius",
shapeCol = "shape",
padCol = "padrad",
roadwidCol = "roadwidth",
nRoadCol = "n_road",
xCol = "x",
yCol = "y",
ncarcCol = "ncarc",
scCol = NULL,
notSearched = NULL,
quiet = FALSE
)
Arguments
data_layout |
Either the name of a shape file (polygons or multipolygons) that delineates areas searched at each turbine; a .csv file with R polygons, (x, y) coordinates, or simple descriptions of search parameters at each turbine; or a data frame with r polygons, (x, y) coordinates, or simple plot layout descriptions. See "Details" for details. |
dataType |
An identifier for the type of data to be imported: |
unitCol |
Column name for turbine IDs. If |
file_turbine |
The name of a shape file (points) giving the turbine
locations for turbines listed in the |
radCol |
for |
shapeCol |
for |
padCol |
for |
roadwidCol |
for |
nRoadCol |
for |
xCol |
for |
yCol |
for |
ncarcCol |
for |
scCol |
for |
notSearched |
for |
quiet |
boolean for controlling whether progress of calculations and other notes are printed to the console as the function runs |
Details
All the layout types (except for vector, which is addressed elsewhere)
can accommodate patterns of seached and not searched areas. If the searched
areas are subdivided into different search classes with different detection
probabilities, then search plot layout data must be input either from shape files
with non-intersecting polgons delineating the search classes or from x-y grid
data. If there is more than one search class variable (for example, ground
cover and search schedule), then the covariates may be entered in separate
columns if the layout files give grid coordinates or may be combined into
one column in the shape files. For example, ground visibility may be easy or
difficult and search schedule may be 1-day or 14-day. These can be combined
into a single column with values of, say, easy1
, easy14
,
difficult1
, and difficult14
.
There must be a turbine ID column (unitCol
) in each of the files. The
individual turbine ID's must be syntactically valid R names, i.e., contain
combinations of letters, numbers, dot ( . ), and underscores ( _ ) only and
not begin with a number or a dot followed by a number. Other characters are
not allowed: no spaces, hyphens, parentheses, etc.
If shape files are to be imported, both shape files (search area polygons and turbine locations) must have their three standard, mandatory component files (.shp, .shx, .dbf) stored in the same folder. Only the name of the .shp should be entered in the function call. Other components are automatically searched for and processed if available.
Value
A list or data frame with components appropriate to the type of data
imported. The data structure is returned as an S3 class object, which other
functions in dwp
can recognize and properly process. There is minimal
processing on the data after importing, but the structures are lightly error-checked
and formatted for more thorough processing, depending on data type and analysis
objectives. Typically, the layout data will be later processed by a call to
prepRing
to create a characterization of the searched area at
the site by "rings", with tallies of searched area, search classes, and
fraction of area searched in concentric, 1 meter rings around each turbine.
The format of the output depends on the format of the input. There are
several possibilities, including, each of which is an S3 object with
characteristics specific to the imported data:
shapeLayout
List with elements:
-
$layout
= turbine search area configurations (polygons and multipolygons) fromdata_layout
shape file as ansf
object. -
$layoutAdj
= polygons from$layout
but recentered at (0, 0) -
$turbines
= turbine data (assf
object) -
$unitCol
= name of the column with turbine IDs (character string) -
$tset
= turbine names (vector of character strings) -
$tcenter
= locations of turbine centers (nturb by 2 array) with UTMs of turbine locations, extracted and simplified from$turbines
. Column names areX
andY
, measuring meters from a reference point. Row names are the names of the turbines.
-
simpleLayout
Data frame with columns:
-
turbine
= turbine IDs (syntactically valid R names) -
radius
= search radius. Ifshape = "square"
, then radius is 1/2 the width of the square. -
shape
= general descriptor of the shape of the search plot as"square"
,"circular"
, or"RP"
(for roads and pads search). -
padrad
= radius of the turbine pad (assumed circular) -
roadwidth
= width of the access road(s) -
n_road
= number of access roads
-
polygonLayout
-
List of polygons, one for each turbine. The maximum search radius at any turbine is assigned as an attribute (
attr(, "rad")
). xyLayout
List with elements:
-
xydat
= data frame with columns for turbine names, x and y coordinates of 1m grid centers spanning the searched area, number of carcasses found in each grid cell, and optional covariates. -
tcenter
= matrix giving turbine locations (x, y
), with row names = turbine names. -
ncarc
= vector giving the number of carcasses found at each turbine. -
unitCol
= name of the column where turbine IDs are stored inxydat
. -
tset
= names of the searched turbines
-
Examples
data(layout_simple)
# converts properly formatted dataframe to 'simpleLayout' object
initLayout(layout_simple)
data(layout_xy)
initLayout(layout_xy, dataType = "xy")
data(layout_polygon)
initLayout(layout_polygon, dataType = "polygon", unitCol = "turbine")
Example Bare Vector Format for Eagle Data
Description
Example Bare Vector Format for Eagle Data
Usage
layout_eagle
Format
An object of class data.frame
with 60 rows and 3 columns.
Example Polygon Data for Site Layout
Description
Example Polygon Data for Site Layout
Usage
layout_polygon
Format
A data frame illustrating the required raw data format for using
standard R polygons to characterize a site layout. There must be three
columns: one giving turbine IDs (turbine
) and columns for the x
and y
coordinates that delineate the plot layouts. Turbine IDs must be
syntactitally valid R names, that is, combinations of letters, numbers,
underscores ( _ ) and periods ( . ) and no spaces, hyphens, or other special
characters. Names must not begin with a number, so 1, 2, 3, ...
,
3B1
, and .72S
are NOT valid names. Coordinates shoud be in
meters relative to a turbine at (0, 0).
Example Simple Geometry Data Format for Site Layout
Description
Example Simple Geometry Data Format for Site Layout
Usage
layout_simple
Format
A data frame illustrating an import format for a simple description of a
site layout by turbine. Each turbine (turbine
) is classed according to
the shape of its search plot, either circular
, square
, or
RP
(search on the roads and turbine pad only). For circular
plots, all ground within radius
meters of the turbine is searched.
For square
plots, the radius
is half the width of the square
along the x-axis, NOT the distance to the corner. For RP
plots, the
radius
is the maximum distance searched on the roads. The geometry of
the RP
also includes a circular turbine pad with radius padrad
,
a road width of roadwidth
meters, and the number of roads
(n_road
) searched out to radius
meters from the turbine.
Example Data for Site Layout on an (x, y) Grid
Description
Example Data for Site Layout on an (x, y) Grid
Usage
layout_xy
Format
A data frame illustrating the required raw data format for using
a grid format to characterize a site layout. There are five
columns: one giving turbine IDs (turbine
), columns for the x
and y
coordinates on 1 m. grids that overlay the search plot, a
column giving the carcass count in each grid cell, and a column giving the
distance of each cell from the turbine. There is only one turbine, searched
on road and pad out. Coordinates are in meters relative to the turbine at
(0, 0).
Names of All the Available Models
Description
Names of All the Available Models
Usage
mod_all
Format
An object of class character
of length 17.
Vector of Colors Used in Graphs of Fitted Models
Description
Vector of Colors Used in Graphs of Fitted Models
Usage
mod_color
Format
An object of class character
of length 17.
Vector of Line Types Used in Graphs of Fitted Models
Description
Vector of Line Types Used in Graphs of Fitted Models
Usage
mod_lty
Format
An object of class numeric
of length 17.
Vector of GLM Offsets for Available Models
Description
Vector of GLM Offsets for Available Models
Usage
mod_offset
Format
An object of class character
of length 17.
Vector of Names of Standard Models
Description
Vector of Names of Standard Models
Usage
mod_standard
Format
An object of class character
of length 12.
Vector of Names of Models Available for Grid Layout
Description
Vector of Names of Models Available for Grid Layout
Usage
mod_xy
Format
An object of class character
of length 8.
Run Models through a Sieve to Filter out Dubious Fits
Description
A set of fitted models (ddArray
) is filtered according
to a set of criteria that test for high AIC, high-influence points, and
plausibility of the tail probabilities of each fitted distribution.
modelFilter
will either auto-select the best model according to a set
of pre-defined, objective criteria or will will return all models that meet
a set of user-defined, or default criteria. A table of how the models
score according to each criterion is printed to the console.
Usage
modelFilter(dmod, sieve = "default", quiet = FALSE)
Arguments
dmod |
a |
sieve |
a list of criteria for ordering models |
quiet |
boolean to suppress ( |
Details
The criteria to test are entered in a list (sieve
) with components:
-
$rtail
= vector of probabilities that define a checkpoints on distributions to avoid situations where a model that may fit well within the range of data is nonetheless implausible because it predicts a significant or substantial probability of carcasses falling great distances from the nearest turbine. The default is to check whether or not a distribution predicts that less than 50% of carcasses fall within 80 meters, 90% within 120 meters, 95% within 150 meters, or 99% within 200 meters. Distributions that fall below any of these points (for example predicting only 42% within 80 meters or only 74% within 120 meters) fail the defaultrtail
test. The format of the default for the test is$rtail = c(p80 = 0.5, p120 = 0.90, p150 = 0.95, p200 = 0.99)
. Users may override the default by using, for example,sieve = list(rtail = c(p80 = 0.8, p120 = 0.99, p150 = 0.99, p200 = 0.999))
in the argument list for a more stringent test or for a situation where turbines are small or winds are light. Alternatively, users may forego the test altogether by enteringsieve = list(rtail = FALSE)
. If specific probabilities are provided, they must be in a vector of length 4 with names "p80
" etc. as in the examples above. -
$ltail
= vector of probabilities that define checkpoints on distributions to avoid situations where the search radius is short and a distribution that fits the limited data set well but crashes to zero just outside the search radius. The default is to check whether or not a distribution predicts that greater than 50% of carcasses fall with 20 meters or 90% within 50 meters. Distributions that pass above either of these checkpoints (for example predicting 61% of carcasses within 20 meters or 93% within 50 meters) are eliminated by the defaultltail
test. The format of the default for the test is$ltail = c(p20 = 0.5, p50 = 0.90)
. Users may override the default by using, for example,sieve = list(rtail = c(p20 = 0.6, p50 = 0.8))
in the argument list for a situation where it is known that carcasses beyond 50 meters are common. -
$aic
= a numeric scalar cutoff value for model's delta AICc scores. Models with AICc scores exceeding the minimum AICc among all the fitted models bysieve$aic
or more fail the test. The default value is 10. Users may override the default by using, for example,sieve = list(aic = 7)
in the argument list to use a delta AIC score of 7 as the cutoff or may forego the test altogether by settingsieve = list(aic = FALSE)
-
$hin
=TRUE
orFALSE
to test for high influence points, the presence of which cast doubt on the reliability of the model. The function defines "high influence" as models with high leverage points, namely, points with\frac{h}{1 - h} > \frac{2p}{n - 2p}
(whereh
is leverage,p
is the number of parameters in the model, andn
is the search radius) with Cook's distance> 8/(n - 2*p)
. The criteria for high influence points were adapted from Brian Ripley's GLM diagnostics packageboot
(glm.diag
). The test is perhaps most valuable in identifying distributions with high probability of carcasses landing well beyond what could reasonably be expected.
Several choices of pre-defined sieve
s are available (or, as described
above, users may define their own criteria):
sieve = "default"
The models are ordered by the following criteria:
extensibility
weight of right tail (discounting models that predict implausibly high proportions of carcasses beyond the search radius)
weight of the left tail (discounting models that predict implausibly high proportions of carcasses near the turbines)
AICc test (discounting models with delta AICc > 10)
high influence points (discounting models in which one or more of the data points exert a high influence on the fitted model, according to Ripley's GLM diagnostics package
boot
(glm.diag
))ranking by AICc
Precise definitions of the default sieve parameters are given in
sieve_default
.sieve = NULL
Returns a list of the extensible models without scoring them by other model selection criteria.
sieve = "win"
Sorts models by high-influence points and AICc
sieve = list(<custom>)
User provides a custom sieve, which may be a modification of the default sieve or de novo. To modify the default, use, for example,
sieve = list(hin = FALSE)
to disable thehin
test but keep the other default tests, orsieve = list(aic = 7)
to use 7 rather than 10 as the AIC cutoff, orsieve = list(ltail = c(p20 = 0.3, p50 = 0.8))
to use a more stringent left tail test that requires CDF graphs to pass below the points (20, 0.3) and (50, 0.8). Customltail
andrtail
parameters must match the formats of the default tests, but their probabilities may vary. To turn off theaic
filter, usesieve = list(aic = Inf)
. To turn off theltail
filter, usesieve = list(ltail = c(p20 = 1, p50 = 1))
. To turn off thertail
filter, usesieve = list(rtail = c(p80 = 0, p120 = 0, p150 = 0, p200 = 0))
. These custom components may be mixed and matched as desired.
Value
An fmod
object, which is an unordered list of extensible models if
sieve = NULL
; otherwise, a list of class fmod
with following
components:
$filtered
the selected
dd
object or addArray
list of models that passed the tests$scores
a matrix with all models tested (rownames = model names) and the results of each test (columns
aic_test
,rtail
,ltail
,hin
,aic
)$sieve
the test criteria, stored in a list with
-
$aic_test
= cutoff for AIC -
$hin
= boolean to indicate whether high influence points were considered -
$rtail
= numeric vector giving the probabilities that the right tail of the distribution must exceed at distances of 80, 120, 150, and 200 meters in order to pass -
$ltail
= numeric vector giving the probabilities that the left tail of the distribution must NOT exceed at distances of 20 and 50 meters in order to pass
-
models
a list (
ddArray
object) of all models testednote
notes on the tests
When a fmod
object is printed, only a small subset of the elements are
shown. To see a full list of the objects, use names(x)
, where x
is the name of the fmod
return value. The elements
can be extracted in the usual R way via, for example, x$sieve
or
x[["sieve"]]
.
Examples
data(layout_simple)
data(carcass_simple)
sitedata <- initLayout(layout_simple)
ringdata <- prepRing(sitedata)
ringsWithCarcasses <- addCarcass(carcass_simple, data_ring = ringdata)
distanceModels <- ddFit(ringsWithCarcasses)
stats(distanceModels)
stats(distanceModels[["tnormal"]])
stats(distanceModels[["lognormal"]])
Convert Distribution Name + Parameters to ddSim
Object
Description
Utility function to format a distribution name and a vector of its parameter values
to a ddSim
object for use in the p/d/r/q family of functions
Usage
mpp2ddSim(distr, parms)
Arguments
distr |
character string giving the name of one of the models fit by
|
parms |
vector of parameters for the |
Value
a ddSim object with srad = NA
Vector of Names of Models with Natural Offset
Description
The natural offset is the area (m^2) searched at a given distance. Models that use the natural offset are referred to as "natural". Other models may use different offsets which alter the shape of the curve with distance in an a priori way that is unaffected by the data.
Usage
natural
Format
An object of class logical
of length 17.
Utility Function for Constructing Offsets for GLMs
Description
This is a simple utility function for calculating offsets when exposure
is assumed to be 100
calculating fitted distributions (PDF
and CDF
) but cannot be
used in the fitting of the distributions themselves because it does not
account for incomplete search coverages at given distances.
Usage
off(r, distr)
Arguments
r |
vector of distances |
distr |
name of the distribution to calculate the offset for |
Value
A vector of offset values to use with distances r
when fitting
the distr
model.
Check Parameter Value Validity the Distribution
Description
Performs a ouick check on whether the parameters given in parms
are
valid for the distr
.
Usage
parOK(parms, distr)
Arguments
parms |
vector or matrix of named glm parameters (with |
distr |
name of the distribution |
Value
vector (or scalar) of 0s, 1s, and 2s to indicate whether the parameters are non-extensible (i.e., flat-out bogus), valid for the distribution, or valid for the distribution and give finite point densities.
Default Graphics Parameters
Description
Default Graphics Parameters
Usage
par_default
Format
An object of class list
of length 65.
List of Names of the Distribution Parameters Associated with Respective Models
Description
List of Names of the Distribution Parameters Associated with Respective Models
Usage
parm_name
Format
An object of class list
of length 17.
Calculate posterior distribution of M and extract statistics (M* and CI)
Description
Calculation of the posterior distribution of total mortality
(M
) given the carcass count, overall detection probability (g
),
and prior distribtion; calculation of summary statistics from the
posterior distribution of M
, including M*
and credibility
intervals.
Usage
postM(x, g, prior = "IbinRef", mmax = NA)
postM.ab(x, Ba, Bb, prior = "IbinRef", mmax = NULL)
calcMstar(pMgX, alpha)
MCI(pMgX, crlev = 0.95)
Arguments
x |
carcass count |
g |
overall carcass detection probability |
prior |
prior distribution of |
mmax |
cutoff for prior of M (large max requires large computing resources but does not help in the estimation) |
Ba , Bb |
parameters for beta distribution characterizing estimated |
pMgX |
posterior distribution of |
crlev , alpha |
credibility level ( |
Details
The functions postM
and postM.ab
return the posterior
distributions of M|(X, g)
and M|(X, Ba, Bb)
, respectively, where
Ba
and Bb
are beta distribution parameters for the estimated
detection probability. postM
and postM.ab
include options to
to specify a prior distribution for M
and a limit for truncating the
prior to disregard implausibly large values of M
and make the
calculations tractable in certain cases where they otherwise might not be.
Use postM
when g
is fixed and known; otherwise, use postM.ab
when uncertainty in g
is characterized in a beta distribution with
parameters Ba
and Bb
. The non-informative, integrated reference
prior for binomial random variables is the default (prior = "IbinRef"
).
Other options include "binRef", "IbetabinRef", and "betabinRef", which are
the non-integrated and integrated forms of the binomial and betabinomial
reference priors (Berger et al., 2012). For X > 2
, the integrated and
non-integrated reference priors give virtually identical posteriors. However,
the non-integrated priors assign infinite weight to m = 0
and return a
posterior of Pr(M = 0| X = 0, \hat{g}) = 1
, implying absolute certainty
that the total number of fatalities was 0 if no carcasses were observed. In
addition, a uniform prior may be specified by prior = "uniform". Alternatively,
a custom prior may be given as a 2-dimensional array with columns for m
and Pr(M = m)
, respectively. The first column (m
) must be
sequential integers starting at m = 0
. The second column gives the
probabilities associated with m
, which must be non-negative and sum to 1.
The named priors ("IbinRef"
, "binRef"
, "IbetabinRef"
,
and "betabinRef"
) are functions of m
and defined on m=0,1,2,...
without upper bound. However, the posteriors can only be calculated for a
finite number of m
's up to a maximum of mmax
, which is set by
default to the smallest value of m
such that
Pr(X \leq x | m, \hat{g}) < 0.0001
, where x
is the observed
carcass count, or, alternatively, mmax
may be specified by the user.
Value
The functions postM
and postM.ab
return the posterior
distributions of M | (X, g)
and M | (X, Ba, Bb)
, respectively.
The functions calcMstar
and MCI
return M^*
value and
credibility interval for the given posterior distribution, pMgX
(which may be the return value of postM
or postM.ab
) and
\alpha
value or credibility level.
Format a Search Layout into Rings for Analysis
Description
A function for creating a characterization of the search plot at each turbine by rings. The ground around each turbine is divided into 1 meter concentric rings out to the limit of the search plot. The amount of area searched in each ring and search class (if a search class column is present in the data) at each turbine is calculated, along with the fraction of area searched in each ring. In addition, sum totals of the area in each ring and the average fraction of the area searched in each ring across all turbines at the site are tallied as well. This is a convenient structure for the Poisson regressions that are used to estimate the carcass distributions with respect to distance from the turbines, the probabilities of carcasses landing in the searched areas, and the fraction of carcasses in the searched area.
Usage
prepRing(x, ...)
## S3 method for class 'shapeLayout'
prepRing(x, scVar = NULL, notSearched = NULL, silent = FALSE, ...)
## S3 method for class 'simpleLayout'
prepRing(x, ...)
## S3 method for class 'numeric'
prepRing(x, srad, ...)
## S3 method for class 'polygonLayout'
prepRing(x, ...)
Arguments
x |
a search plot layout as imported and processed by
|
... |
ignored |
scVar |
name of the search class variable (optional), a column in the
shape file for the search polygons. |
notSearched |
name of the search class(es) in |
silent |
Processing shape files into rings may take several minutes. By
default, |
srad |
search radius for data when |
Value
an object of class rings
, which is a list with components
$rdat
list of data frames giving the area searched (
"exposure"
), in a 1 meter ring with outer radius"r"
and the number of carcasses found"ncarc"
in each ring, with search classscVar
optional. There is also a summary data frame$rdat[["total"]]
that sums the exposures and carcass counts for all turbines across the site. The$rdat[["total"]]
is the data frame used in fitting the GLMs.$rpA
list of data frames giving the proportion of area included in the searches (
"pinc"
) in each ring ("r"
). and the number of carcasses found"ncarc"
in each ring, with search classscVar
optional. There is also a summary data frame that sums the exposures and carcass counts for all turbines across the site. The$rpA
data frames are used in estimating the probability of carcasses falling in the searched area at each turbine, which, in turn is used for calculatingdwp
$srad
the maximum search radius at any of the turbines
$ncarc
vector of the number of carcasses at each turbine with names equal to the turbine names.
$scVar
name of the search class variable(s) or
NULL
$tcenter
locations of turbine centers (nturb x 2 matrix) with UTMs of turbine locations. Column names are
X
andY
. Row names are the names of the turbines.
Internal Utility Function to Parse and Format Model for Calculating Psihat
Description
Internal Utility Function to Parse and Format Model for Calculating Psihat
Usage
prepmod(model, nsim)
Arguments
model |
|
nsim |
number of simulation reps. If |
Value
ddSim
object of simulated distribution model parameters
Simple Extension of a dd
Model beyond the Search Radius
Description
Extend a distance model beyond the search radius via multiplication by a fixed,
assumed constant rather than the default normalization used for extensible
models. psi_extend
should not be used with psiHat
objects that
were calculated with extent = "full"
.
Usage
psi_extend(psi, fwin)
Arguments
psi |
|
fwin |
fraction of carcasses assumed to lie within the search radius. If
|
Value
psiHat
object extended beyond the search radius
Import Carcass Observations Locations from Shape Files
Description
Carcass coordiates (x, y) and turbine IDs are read using st_read
and formatted for adding to rings
data structures for analysis.
Usage
readCarcass(file_cod, unitCol = "turbine", quiet = FALSE)
Arguments
file_cod |
name of the file with carcass observation data. Currently, the
function requires a shape file, which gives the carcass locations on the same
coordinate system that is used for the turbines. The geometry is a simple
features points file, consisting of at least the three mandatory files standard
components (.shp, .shx, .dbf) stored in the same directory. Only the name of the
.shp is required (for example, |
unitCol |
name of column with turbine IDs. Column name and turbine IDs
must match those of the |
quiet |
boolean for directing the function to print calculation progress
updates and messages to the console. This should be set to |
Value
a shapeCarcass
object, which is a list with $carcasses
,
which is a sf
representation of the shape file file_cod
data;
$unitCol
, which is the name of the unit column; and $ncarc
, which
is a vector of carcass counts at the turbines listed in unitCol
. The
elements of the $ncarc
are named by turbines at which they were found.
Simple Utility Function Used in Optimizing the GLM
Description
A simple utility function that is used in fitting a GLM, creating a matrix of "x" values for use in the polynomial part of a xep-type model.
Usage
rmat(r, distr)
Arguments
r |
vector of distances (>=0) |
distr |
name of the distribution |
Value
array with length(r)
rows and p columns, where p is the number
of parameters in the glm (including the intercept). The first column is all
1s, and the remaining columns are functions of r, specifically, log(r), r,
r^2, r^3, or 1/r, depending on what the distribution requires.
Test Criteria for Model Selection
Description
Test Criteria for Model Selection
Usage
sieve_default
Format
A list containing the parameters used for test criteria in model
selection an ddArray
objects. The sieve_default
values are used as
a default in modelFilter
. If desired, users may create their
own tests, using sieve_default
as a template. The same list elements
must all be present and have the same structure as the defaults, namely:
$aic
the cutoff for DeltaAIC scores; models with higher scores are removed from further consideration. Default is
$aic = 10
$hin
a boolean to indicate whether or not to use high leverage points as a criterion for model selection. Default is
$hin = TRUE
$rtail
a vector of probabilities that the fitted model must exceed at 80, 120, 150, and 200 meters. Default is
rtail = c(p80 = 0.50, p120 = 0.90, p150 = 0.95, p200 = 0.99)
. Custom test parameters must be a vector probabilities with "p80", "p120", "p150", and "p200" in the names.ltail
a vector of probabilities that a fitted model must not exceed at 20 and 50 meters. Default is
ltail = c(p20 = 0.50, p50 = 0.90)
. Custom test parameters must be a vector of probabilities with "p20" and "p50" innames
.
Test Criteria for Model Selection within Search Area
Description
Test Criteria for Model Selection within Search Area
Usage
sieve_win
Format
A list containing the parameters used for test criteria in model
selection in ddArray
objects. The sieve_win
values are used
when either sieve = "win"
or extent = "win"
in arg list of
modelFilter
. The sieve parameters are:
$aic = 10
the cutoff for DeltaAIC scores; models with higher scores are removed from further consideration.
$hin = T
a boolean to indicate whether or not to use high leverage points as a criterion for model selection.
$rtail
Appropriate only for extrapolating beyond the search radius. Automatically disabled via
rtail = sieve_default$rtail * 0
.ltail
Appropriate only for extrapolating beyond the search radius. Automaticall disabled via
ltail = sieve_default$ltail * 0 + 1
Display a Tables of Summary Statistics for Distance Distributions
Description
Calculate summary statistics for a single distance distribution
(dd
object), an array of distance distributions all fit to the
same data set (ddArray
), or a list of arrays of distance
distributions fit for different carcass classes but the same site layout
(ddArraycc
).
Usage
stats(x, ...)
## S3 method for class 'dd'
stats(x, extent = "full", zrad = 200, ...)
## S3 method for class 'ddArray'
stats(x, extent = "full", zrad = 200, ...)
## S3 method for class 'ddArraycc'
stats(x, extent = "full", zrad = 200, ...)
Arguments
x |
list of models ( |
... |
ignored |
extent |
distributions within searched area ( |
zrad |
maximum distance that carcasses can lie (only used when glm parameters not extensible to Inf) |
Value
list (or list of lists if x
is ddArray
) with $model
giving the model parameters and $stats
giving the median, and 75th,
90th, and 95th quantiles of carcass distances and the estimated probability
a carcass falls within the search area according to each model
Subset Data from an Imported and Formatted Shape File
Description
Subset Data from an Imported and Formatted Shape File
Usage
## S3 method for class 'shapeCarcass'
subset(x, subset, select, ...)
## S3 method for class 'shapeLayout'
subset(x, subset, select, ...)
Arguments
x |
object to be subsetted |
subset |
values to subset by. For example, to subset |
select |
the name of the column with the values to subset by. For example,
to subset |
... |
ignored |
Value
object of the same class as x
, subsetted to values of select
equal to some element in subset
.
Locations of All Carcasses in Grid Data
Description
Locations of All Carcasses in Grid Data
Usage
xyr
Format
matrix with columns x, y, r for all 100 carcasses in the simulation to generate the carcass data for the xy grid that was searched on road and pad.