Type: Package
Title: Disaster Victim Identification
Version: 3.3.0
Description: Joint DNA-based disaster victim identification (DVI), as described in Vigeland and Egeland (2021) <doi:10.21203/rs.3.rs-296414/v1>. Identification is performed by optimising the joint likelihood of all victim samples and reference individuals. Individual identification probabilities, conditional on all available information, are derived from the joint solution in the form of posterior pairing probabilities. 'dvir' is part of the 'pedsuite' collection of packages for pedigree analysis.
License: GPL-3
URL: https://github.com/magnusdv/dvir
BugReports: https://github.com/magnusdv/dvir/issues
Depends: pedtools (≥ 2.6.0), R (≥ 4.1.0)
Imports: forrel (≥ 1.5.2), pbapply, pedFamilias, pedprobr (≥ 0.8.0), ribd, verbalisr (≥ 0.7.1)
Suggests: knitr, rmarkdown
Encoding: UTF-8
Language: en-GB
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2024-09-09 08:49:07 UTC; magnu
Author: Magnus Dehli Vigeland [aut, cre], Thore Egeland [aut]
Maintainer: Magnus Dehli Vigeland <m.d.vigeland@medisin.uio.no>
Repository: CRAN
Date/Publication: 2024-09-09 19:00:17 UTC

dvir: Disaster Victim Identification

Description

Joint DNA-based disaster victim identification (DVI), as described in Vigeland and Egeland (2021) doi:10.21203/rs.3.rs-296414/v1. Identification is performed by optimising the joint likelihood of all victim samples and reference individuals. Individual identification probabilities, conditional on all available information, are derived from the joint solution in the form of posterior pairing probabilities. 'dvir' is part of the 'pedsuite' collection of packages for pedigree analysis.

Author(s)

Maintainer: Magnus Dehli Vigeland m.d.vigeland@medisin.uio.no (ORCID)

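Authors:

  • Thore Egeland

See Also

Useful links:

  • https://github.com/magnusdv/dvir

  • Report bugs at https://github.com/magnusdv/dvir/issues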


Posterior pairing probabilities

Description

Compute posterior pairing and non-pairing probabilities, based on a prior and the output from jointDVI().

Usage

Bmarginal(jointRes, missing, prior = NULL)

Arguments

jointRes

Output from jointDVI().

missing

Character vector with names of missing persons.

prior

A numeric vector of length equal to the number of rows in jointRes. Default is a flat prior.

Details

The prior assigns a probability to each assignment, i.e., to each row of jointRes. If the prior is not specified, a flat prior is used. The prior need not sum to 1, since the user may prefer to put a flat prior on the a priori possible assignments only.
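As a rough illustration of the calculation described above, the following base R sketch weights each assignment by prior times likelihood, normalises, and then sums the resulting posteriors per victim. The toy table, its lik column and the helper marginalSketch() are hypothetical and only serve the illustration; this is not the implementation used by Bmarginal().

```r
# Toy joint table with hypothetical likelihoods (not produced by jointDVI())
joint = data.frame(
  V1  = c("M1", "*", "M1", "*"),
  V2  = c("M2", "M2", "*", "*"),
  lik = c(8, 1, 0.5, 0.5)
)

# Hypothetical helper: posterior pairing probabilities from a joint table
marginalSketch = function(joint, victims, missing, prior = NULL) {
  # Flat prior if none is given; the prior need not sum to 1
  if (is.null(prior)) prior = rep(1, nrow(joint))

  # Posterior of each assignment (row): proportional to prior * likelihood
  w = prior * joint$lik
  post = w / sum(w)

  # For each victim v and target m, sum the posteriors of rows with v = m
  targets = c(missing, "*")
  res = matrix(0, nrow = length(victims), ncol = length(targets),
               dimnames = list(victims, targets))
  for (v in victims) {
    for (m in targets) {
      res[v, m] = sum(post[joint[[v]] == m])
    }
  }
  res
}

marg = marginalSketch(joint, victims = c("V1", "V2"), missing = c("M1", "M2"))
marg

# A prior excluding all but the first assignment puts all mass on that row
marg1 = marginalSketch(joint, c("V1", "V2"), c("M1", "M2"),
                       prior = c(1, 0, 0, 0))
marg1
```

Each row of the resulting matrix sums to 1, since every assignment pairs a victim with exactly one missing person or with '*'.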

Value

A matrix. Row i gives the posterior probability that victim i is one of the missing persons or someone else, denoted '*'.

See Also

jointDVI()

Examples

jointRes = jointDVI(example1)

Bmarginal(jointRes, example1$missing)

# Artificial example: all but optimal solution excluded by prior
Bmarginal(jointRes, example1$missing, prior = c(1, rep(0,26)))



Data used in the book Kling et al. (2021)

Description

Data used in the last example of Chapter 4 in Kling et al. (2021) "Mass Identifications: Statistical Methods in Forensic Genetics". There are 2 female victims and 2 male victims. There are four reference families with 2 missing females and 2 missing males. There are 21 markers. An 'equal' mutation model with rate 0.005 is specified.

Usage

KETPch4

Format

A dviData object with the following content:

Examples

KETPch4

plotDVI(KETPch4, nrowPM = 4)



Data used in the book Kling et al. (2021)

Description

Data used in Example 4.8.1 in Kling et al. (2021) "Mass Identifications: Statistical Methods in Forensic Genetics". The victims are V1 and V2, both females. There is one reference family with 2 missing persons, both females. There are 21 markers, no mutation model.

Usage

KETPex481

Format

A dviData object with the following content:

Examples

plotDVI(KETPex481, marker = 1)


Data used in the book Kling et al. (2021)

Description

Data used in Exercise 4.9.7 in Kling et al. (2021) "Mass Identifications: Statistical Methods in Forensic Genetics". There are 3 female victims and 3 reference families with 3 missing females. There are 23 markers, equal mutation model, rate 0.001.

Usage

KETPex497

Format

A dviData object with the following content:

Examples

plotDVI(KETPex497, nrowPM = 3)



Data used in the book Kling et al. (2021)

Description

Data used in Exercise 4.9.8 in Kling et al. (2021) "Mass Identifications: Statistical Methods in Forensic Genetics". There are 2 female victims and 1 male. There is one reference family with 2 missing females and one missing male. There are 16 markers, equal mutation model, rate 0.001.

Usage

KETPex498

Format

A dviData object with the following content:

Examples


plotDVI(KETPex498, nrowPM = 3)


AM-driven DVI

Description

AM-driven identification, i.e., considering one AM family at a time. Simple families (exactly 1 missing) are handled directly from the LR matrix, while nonsimple families are analysed with dviJoint().

Usage

amDrivenDVI(
  dvi,
  fams = NULL,
  threshold = 10000,
  threshold2 = max(1, threshold/10),
  verbose = TRUE
)

Arguments

dvi

A dviData object.

fams

A character vector; the names of families to consider. By default, all families. Special keywords: "simple" (all families with exactly 1 missing) and "nonsimple" (all families with > 1 missing).

threshold

LR threshold for 'certain' match.

threshold2

LR threshold for 'probable' match (in simple families).

verbose

A logical.

Details

Note: This function assumes that undisputed identifications have been removed. Strange outputs may occur otherwise.

Value

A list with entries dviReduced and summary.

Examples

w = amDrivenDVI(example2)
w$summary
w$dviReduced

# Bigger example: Undisputed first
u = findUndisputed(planecrash)
u$summary

# AM-driven analysis of the remaining
amDrivenDVI(u$dviReduced, threshold2 = 500)


Direct match LR

Description

Computes the likelihood ratio comparing the hypothesis that two samples are from the same individual against the hypothesis that they are from unrelated individuals.

Usage

directMatch(x, y, g1 = NULL, g2 = NULL, .skipChecks = FALSE)

Arguments

x, y

Typed singletons.

g1, g2

(Optional) Named character vectors with genotypes for x and y respectively.

.skipChecks

A logical indicating that various input checks can be skipped, e.g. when called by mergePM().

Value

A nonnegative likelihood ratio.
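To make the hypotheses concrete, here is a single-marker sketch in base R. It assumes Hardy-Weinberg equilibrium and no genotyping error, dropout or mutation, in which case the textbook LR for two identical genotypes is one over the genotype frequency, and 0 for differing genotypes. The helper directMatchSketch() is hypothetical; the calculation done by directMatch() may differ in its details.

```r
# Hypothetical helper: single-marker direct-match LR under HWE, no errors
directMatchSketch = function(g1, g2, afreq) {
  # Split genotypes like "1/2" into sorted allele pairs
  a1 = sort(strsplit(g1, "/")[[1]])
  a2 = sort(strsplit(g2, "/")[[1]])

  # Different genotypes cannot come from the same individual in this model
  if (!identical(a1, a2)) return(0)

  # Genotype probability: p^2 for homozygotes, 2pq for heterozygotes
  p = afreq[a1]
  genoProb = if (a1[1] == a1[2]) p[[1]]^2 else 2 * p[[1]] * p[[2]]

  # P(both genotypes | same person) / P(both | unrelated) = 1 / genoProb
  1 / genoProb
}

afreq = c("1" = 0.01, "2" = 0.99)
lrSame = directMatchSketch("1/1", "1/1", afreq)  # 1 / 0.01^2
lrDiff = directMatchSketch("1/1", "2/2", afreq)  # genotypes differ
lrHet  = directMatchSketch("1/2", "2/1", afreq)  # allele order is ignored
```

With several independent markers, the per-marker ratios would be multiplied.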

See Also

mergePM().

Examples


pm = singletons(c("V1", "V2", "V3")) |>
  addMarker(V1 = "1/1", V2 = "2/2", V3 = "1/1",
            afreq = c("1" = 0.01, "2" = 0.99), name = "L1")

directMatch(pm[[1]], pm[[2]])
directMatch(pm[[1]], pm[[3]])


Compare DVI approaches

Description

Compare the efficiency of different computational approaches to DVI.

Usage

dviCompare(
  dvi,
  true,
  refs = typedMembers(am),
  methods = 1:6,
  markers = NULL,
  threshold = 1,
  simulate = TRUE,
  db = getFreqDatabase(am),
  Nsim = 1,
  returnSims = FALSE,
  seed = NULL,
  numCores = 1,
  verbose = TRUE
)

Arguments

dvi

A dviData object, typically created with dviData().

true

A character vector of the same length as dvi$pm, with the true solution, e.g., true = c("M2", "M3", "*") if the truth is V1 = M2, V2 = M3 and V3 unmatched.

refs

Character vector with names of the reference individuals. By default the typed members of dvi$am.

methods

A subset of the numbers 1,2,3,4,5,6.

markers

If simulate = FALSE: A vector indicating which markers should be used.

threshold

An LR threshold passed on to the sequential methods.

simulate

A logical, indicating if simulations should be performed.

db

A frequency database used for simulation, e.g., forrel::NorwegianFrequencies. By default the frequencies attached to dvi$am are used.

Nsim

A positive integer; the number of simulations.

returnSims

A logical: If TRUE, the simulated data are returned without any DVI comparison.

seed

A seed for the random number generator, or NULL.

numCores

The number of cores used in parallelisation. Default: 1.

verbose

A logical.

Details

The following methods are available for comparison, through the methods parameter:

  1. Sequential, without LR updates

  2. Sequential, with LR updates

  3. Sequential (undisputed) + joint (remaining). Always return the most likely solution(s).

  4. Joint - brute force. Always return the most likely solution(s).

  5. Like 3, but return winner(s) only if LR > threshold; otherwise the empty assignment.

  6. Like 4, but return winner(s) only if LR > threshold; otherwise the empty assignment.

Value

A list of solution frequencies for each method, and a vector of true positive rates for each method.

Examples


refs = "R1"
db = forrel::NorwegianFrequencies[1:3]

# True solution
true = c("M1", "M2", "M3")

# Run comparison

# dviCompare(example1, refs, true = true, db = db, Nsim = 2, seed = 123)


# Alternatively, simulations can be done first...
sims = dviCompare(example1, refs, true = true, simulate = TRUE,
                  db = db, Nsim = 2, seed = 123, returnSims = TRUE)

# ... and computations after:

# dviCompare(sims, refs, true = true, simulate = FALSE)



DVI data

Description

DVI data

Usage

dviData(pm, am, missing, generatePairings = TRUE)

checkDVI(
  dvi,
  pairings = NULL,
  errorIfEmpty = FALSE,
  ignoreSex = FALSE,
  verbose = TRUE
)

Arguments

pm

A list of singletons: The victim samples.

am

A list of pedigrees: The reference families.

missing

A character vector with names of missing persons.

generatePairings

A logical. If TRUE (default) a list of sex-compatible pairings is included as part of the output.

dvi

A dviData object.

pairings

A list of pairings.

errorIfEmpty

A logical.

ignoreSex

A logical.

verbose

A logical.

Value

An object of class dviData, which is basically a list of pm, am, missing and pairings.

Examples

dvi = dviData(pm = singleton("V1"), am = nuclearPed(1), missing = "3")
dvi

checkDVI(dvi)


Finds Generalised Likelihood Ratios (GLRs)

Description

Based on a dviData object, or output from dviJoint() or jointDVI(), the GLR, i.e., the ratio of the maximum likelihood under H0 to the maximum under H1, is calculated for specified hypotheses.

Usage

dviGLR(dvi, pairings = generatePairings(dvi), dviRes = NULL)

Arguments

dvi

A dviData object.

pairings

List. See details.

dviRes

A data frame. Output from jointDVI().

Details

The Generalised Likelihood Ratio (GLR) statistic is defined as the ratio of the maximum likelihood among the alternatives satisfying H0 (numerator) to the maximum among those satisfying H1 (denominator). The default pairings = dvir::generatePairings(dvi) tests all hypotheses. Specific tests can be specified as shown in the Examples: pairings = list(V1 = "M1") gives a test for H0: V1 = M1 against H1: V1 != M1. If dviRes is not provided, it is calculated using jointDVI().
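The definition can be illustrated with a small base R sketch operating on a toy joint table. The table, its lik column and the helper glrSketch() are hypothetical; the sketch only mirrors the definition of GLR and of the strict variant SGLR (minimum instead of maximum in the numerator), and is not the code used by dviGLR().

```r
# Toy joint table with hypothetical likelihoods for each assignment
joint = data.frame(
  V1  = c("M1", "*", "M1", "*"),
  V2  = c("M2", "M2", "*", "*"),
  lik = c(8, 1, 0.5, 0.5)
)

# Hypothetical helper: GLR and SGLR for H0: victim = missing
glrSketch = function(joint, victim, missing) {
  inH0 = joint[[victim]] == missing   # assignments consistent with H0
  num = joint$lik[inH0]               # likelihoods under H0
  den = joint$lik[!inH0]              # likelihoods under H1
  c(GLR = max(num) / max(den), SGLR = min(num) / max(den))
}

res = glrSketch(joint, victim = "V1", missing = "M1")
res
```

In this toy table the assignments with V1 = M1 have likelihoods 8 and 0.5, and the remaining ones 1 and 0.5, so the two statistics differ considerably.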

Value

A data frame with GLRs and SGLR (strict GLR, max replaced by min in the numerator).

Examples

# dviGLR(example2, pairings = list(V1 = "M1"))

# All tests with output from jointDVI
# dviGLR(example2)


Joint DVI search

Description

This is a redesign of jointDVI(), with narrower scope (no preprocessing steps) and more informative output. The output includes the pairwise GLR matrix based on the joint table.

Usage

dviJoint(
  dvi,
  assignments = NULL,
  ignoreSex = FALSE,
  disableMutations = FALSE,
  maxAssign = 1e+05,
  numCores = 1,
  cutoff = 0,
  verbose = TRUE,
  progress = verbose
)

Arguments

dvi

A dviData object, typically created with dviData().

assignments

A data frame containing the assignments to be considered in the joint analysis. By default, this is automatically generated by taking all combinations from dvi$pairings.

ignoreSex

A logical, only relevant if dvi$pairings is NULL, so that candidate pairings have to be generated.

disableMutations

Either TRUE, FALSE (default) or NA. If NA, mutation modelling is applied only in families where the reference data are incompatible with the pedigree unless at least one mutation has occurred.

maxAssign

A positive integer. If the number of assignments going into the joint calculation exceeds this, the function will abort with an informative error message. Default: 1e5.

numCores

An integer; the number of cores used in parallelisation. Default: 1.

cutoff

A number; if non-negative, the output table is restricted to LRs equal to or exceeding this value.

verbose

A logical.

progress

A logical, indicating if a progress bar should be shown.

Value

A data frame.

Examples

dviJoint(example2)


Simulate genotypes in a DVI dataset

Description

Simulates genotypes for the references and missing persons in each AM family, and transfers them to the PM singletons according to the indicated matching. Remaining victims are simulated as unrelated.

Usage

dviSim(
  dvi,
  N = 1,
  refs = typedMembers(dvi$am),
  truth = NULL,
  seed = NULL,
  conditional = FALSE,
  simplify1 = TRUE,
  verbose = FALSE
)

Arguments

dvi

A dviData object.

N

The number of complete simulations to be performed.

refs

A character indicating reference individuals. By default, the typed members of the input. If conditional = TRUE, the refs should be a subset of the typed members, and the simulations are conditional on these.

truth

A named vector of the format c(vic1 = mis1, vic2 = mis2, ...).

seed

An integer seed for the random number generator.

conditional

A logical, by default FALSE. If TRUE, references are kept unchanged, while the missing persons are simulated conditional on these.

simplify1

A logical, by default TRUE, removing the outer list layer when N = 1. See Value.

verbose

A logical.

Value

If N = 1, a dviData object similar to the input, but with new genotypes for the pm samples. If N > 1, a list of dviData objects.

See Also

forrel::profileSim().

Examples


# Simulate refs and missing once and plot:
ex = dviSim(example2, N = 1, truth = c(V1 = "M1", V2 = "M2", V3 = "M3"))
plotDVI(ex, marker = 1)

# Two simulations and plot for the first
ex = dviSim(example2, N = 2, truth = c(V1 = "M1", V2 = "M2", V3 = "M3"),
            seed = 1729)
plotDVI(ex[[1]], marker = 1)



A complete pipeline for solving a DVI case

Description

This wraps several other functions into a complete pipeline for solving a DVI case.

Usage

dviSolve(
  dvi,
  threshold = 10000,
  threshold2 = max(1, threshold/10),
  maxIncomp = 2,
  ignoreSex = FALSE,
  limit = 0,
  verbose = TRUE,
  debug = FALSE
)

Arguments

dvi

A dviData object.

threshold

LR threshold for 'significant' match.

threshold2

LR threshold for 'probable' match. By default set to threshold/10.

maxIncomp

An integer passed on to findExcluded(). A pairing is excluded if the number of incompatible markers exceeds this.

ignoreSex

A logical, by default FALSE.

limit

A number passed on to findUndisputed(); only pairwise LR values above this are considered.

verbose, debug

Logicals.

Value

A data frame.

Examples

dviSolve(example2)
dviSolve(example2, threshold = 5, verbose = FALSE)


DVI dataset: Generational trio

Description

A proof-of-concept dataset involving three missing members (child, father, grandfather) of a single family. With the given data, stepwise victim identification fails to find the correct solution, while joint identification succeeds.

Usage

example1

Format

A dviData object with the following content:

Examples

   
example1

plotDVI(example1, marker = 1)

jointDVI(example1)  


DVI dataset: Two reference families

Description

A small DVI example with three victims, and three missing persons from two reference families.

Usage

example2

Format

A dviData object with the following content:

Examples

   
example2

plotDVI(example2, marker = 1, nrowPM = 3)

jointDVI(example2)


Exclude pairings

Description

Disallow certain pairings by removing them from the list dvi$pairings of candidate pairings for a given victim sample.

Usage

excludePairing(dvi, victim, missing)

Arguments

dvi

A dviData object.

victim

The name of a single victim sample.

missing

The name(s) of one or several missing individuals.

Value

A dviData object.

Examples

# Disallow V1 = M1 in the `example2` dataset:
ex = excludePairing(example2, victim = "V1", missing = "M1")
jointDVI(ex, verbose = FALSE)

# Compare with original
jointDVI(example2, verbose = FALSE)

# The only difference is in the `pairings` slot:
ex$pairings
example2$pairings


Dataset: Exclusion example

Description

This data is based on a real case, but pedigrees have been changed and marker data simulated to preserve anonymity.

Usage

exclusionExample

Format

A dviData object with the following content:

Examples


exclusionExample


Combinations without duplications

Description

This is similar to expand.grid() except that combinations with repeated elements are not included. The element "*" is treated separately, and is allowed to be repeated.
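The behaviour described above can be mimicked, inefficiently, by generating the full grid and then discarding rows with repeated elements other than "*". The base R sketch below does this; the helper nodupSketch() is hypothetical and is not the algorithm used by expand.grid.nodup(), which presumably avoids building the full grid.

```r
# Hypothetical helper: filter the full grid instead of building it cleverly
nodupSketch = function(lst) {
  g = expand.grid(lst, stringsAsFactors = FALSE)

  # Keep a row if its elements, ignoring "*", contain no duplicates
  keep = apply(g, 1, function(r) {
    r = r[r != "*"]
    anyDuplicated(r) == 0
  })

  g[keep, , drop = FALSE]
}

# Three victims, each of whom may be M1, M2 or unmatched
lst = list(V1 = c("*", "M1", "M2"),
           V2 = c("*", "M1", "M2"),
           V3 = c("*", "M1", "M2"))

full = expand.grid(lst, stringsAsFactors = FALSE)
res = nodupSketch(lst)
nrow(full)  # all combinations
nrow(res)   # combinations in which no missing person is used twice
```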

Usage

expand.grid.nodup(lst, max = 1e+05)

Arguments

lst

A list of vectors.

max

A positive integer. If the number of combinations (according to a preliminary lower bound) exceeds this, the function aborts with an informative error message. Default: 1e5.

Value

A data frame.

See Also

expand.grid()

Examples


lst = list(1, 1:2, 3:4)

# Compare
expand.grid.nodup(lst)
expand.grid(lst)

# Typical use case for DVI
lst2 = generatePairings(example1)
expand.grid.nodup(lst2)


Convert a Familias file to DVI data

Description

This is a wrapper for pedFamilias::readFam() that reads Familias files with DVI information.

Usage

familias2dvir(
  famfile,
  victimPrefix = NULL,
  familyPrefix = NULL,
  refPrefix = NULL,
  missingPrefix = NULL,
  missingFormat = NULL,
  othersPrefix = NULL,
  verbose = FALSE,
  missingIdentifier = "^Missing"
)

Arguments

famfile

Path to Familias file.

victimPrefix

Prefix used to label PM individuals.

familyPrefix

Prefix used to label the AM families.

refPrefix

Prefix used to label the reference individuals, i.e., the typed members of the AM families.

missingPrefix

Prefix used to label the missing persons. At most one of missingPrefix and missingFormat can be given.

missingFormat

A string indicating family-wise labelling of missing persons, using [FAM], [IDX], [MIS] as placeholders with the following meanings (see Examples):

  • [FAM]: family index

  • [IDX]: index of missing person within the family

  • [MIS]: index within all missing persons
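To illustrate how such a format string expands, the base R sketch below substitutes the three placeholders literally. The helper formatMissing() is hypothetical and only demonstrates the placeholder semantics listed above; it is not taken from familias2dvir().

```r
# Hypothetical helper: expand [FAM], [IDX] and [MIS] in a format string
formatMissing = function(fmt, fam, idx, mis) {
  out = gsub("[FAM]", as.character(fam), fmt, fixed = TRUE)
  out = gsub("[IDX]", as.character(idx), out, fixed = TRUE)
  gsub("[MIS]", as.character(mis), out, fixed = TRUE)
}

# First missing person of family 2, being the 4th missing person overall
lab1 = formatMissing("M[FAM]-[IDX]", fam = 2, idx = 1, mis = 4)
lab2 = formatMissing("MP[MIS]", fam = 2, idx = 1, mis = 4)
lab1
lab2
```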

othersPrefix

Prefix used to label other untyped individuals. Use "" for numeric labels (1, 2, ...).

verbose

A logical, passed on to readFam().

missingIdentifier

A character of length 1 used to identify missing persons in the Familias file. The default chooses everyone whose label begins with "Missing".

Details

The sex of the missing persons needs to be checked, as this information may not be correctly recorded in the fam file.

Value

A dviData object.

See Also

dviData(), relabelDVI()

Examples


# Family with three missing
file = system.file("extdata", "dvi-example.fam", package="dvir")

# Read file without relabelling
y = familias2dvir(file)
plotDVI(y)

# With relabelling
z = familias2dvir(file, missingFormat = "M[FAM]-[IDX]",
                   refPrefix = "ref", othersPrefix = "E")
plotDVI(z)


Excluded individuals and pairings in a DVI dataset

Description

Analysing exclusions is often an efficient way to reduce large DVI datasets. A pairing V = M is excluded if it implies (too many) genetic inconsistencies. The function findExcluded() identifies and removes (i) victim samples with too many inconsistencies against all missing persons, (ii) missing persons with too many inconsistencies against all victim samples, and (iii) inconsistent pairings among the remaining.

Usage

findExcluded(
  dvi,
  maxIncomp = 2,
  pairings = NULL,
  ignoreSex = FALSE,
  verbose = TRUE
)

exclusionMatrix(dvi, pairings = NULL, ignoreSex = FALSE)

Arguments

dvi

A dviData object.

maxIncomp

An integer. A pairing is excluded if the number of incompatible markers exceeds this.

pairings

A list of possible pairings for each victim. By default, dvi$pairings is used, or, if this is NULL, generatePairings(dvi, ignoreSex).

ignoreSex

A logical, by default: FALSE.

verbose

A logical, by default TRUE.

Details

The main calculation in findExcluded() is done by exclusionMatrix(), which records the number of incompatible markers for each pairwise comparison.
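Given such a matrix of inconsistency counts, the three reduction steps can be sketched in base R as follows. The matrix values are made up, and the order and details of the steps are an assumption made for illustration; findExcluded() itself may proceed differently.

```r
# Hypothetical exclusion matrix: number of incompatible markers per pairing
em = matrix(c(0, 5, 4,
              3, 0, 6,
              7, 8, 9),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("V1", "V2", "V3"), c("M1", "M2", "M3")))
maxIncomp = 2

# A pairing is excluded if the number of incompatible markers exceeds maxIncomp
excl = em > maxIncomp

# (i) victims excluded against all missing persons
victimsOut = rownames(em)[apply(excl, 1, all)]

# (ii) missing persons excluded against all victims
missingOut = colnames(em)[apply(excl, 2, all)]

# (iii) remaining candidate pairings, with excluded pairings dropped
keepV = setdiff(rownames(em), victimsOut)
keepM = setdiff(colnames(em), missingOut)
allowed = !excl[keepV, keepM, drop = FALSE]
pairings = lapply(keepV, function(v) keepM[allowed[v, ]])
names(pairings) = keepV

victimsOut
missingOut
pairings
```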

Value

A list with the following entries:

See Also

findUndisputed(). See also forrel::findExclusions() for analysis of a specific pairwise comparison.

Examples


e = findExcluded(icmp)
e$summary
e$exclusionMatrix

# The exclusion matrix can also be computed directly:
exclusionMatrix(icmp)

# Inspect a particular pair: M4 vs V4
forrel::findExclusions(icmp$am, id = "M4", candidate = icmp$pm$V4)

# Plot one of the incompatible markers
plotDVI(icmp, pm = 4, marker ="D7S820")


Nonidentifiable missing persons

Description

A missing person in a DVI case is nonidentifiable if unrelated to all (genotyped) reference individuals and all other missing persons in the reference family. It is often wise to ignore such individuals in jointDVI() and other analyses, to relieve the computational burden.

Usage

findNonidentifiable(dvi)

Arguments

dvi

A dviData object, typically created with dviData().

Details

The implementation uses ribd::kinship() to identify individuals having kinship coefficient 0 with all relevant individuals.

Value

A list with the following entries:

Examples

# Example 1: No nonidentifiables in dataset `example1`
findNonidentifiable(example1)

# Example 2: Add nonidentifiable person "A"
amNew = example1$am[[1]] |>
  addSon(parents = c("NN", "A"))
missNew = c(example1$missing, "A")

dvi = dviData(pm = example1$pm, am = amNew, missing = missNew)
plotDVI(dvi, textAbove = c(A = "nonidentif."))

findNonidentifiable(dvi)


Undisputed identifications in a DVI problem

Description

This function uses the pairwise LR matrix to find undisputed matches between victims and missing individuals. An identification V_i = M_j is called undisputed, relative to a threshold T, if the corresponding likelihood ratio LR_{i,j} ≥ T and LR_{i,j} is at least T times greater than all other pairwise LRs involving V_i or M_j.

Usage

findUndisputed(
  dvi,
  pairings = NULL,
  ignoreSex = FALSE,
  threshold = 10000,
  strict = FALSE,
  relax = !strict,
  limit = 0,
  nkeep = NULL,
  numCores = 1,
  verbose = TRUE
)

Arguments

dvi

A dviData object, typically created with dviData().

pairings

A list of possible pairings for each victim. If NULL, all sex-consistent pairings are used.

ignoreSex

A logical.

threshold

A non-negative number. If no pairwise LR exceeds this, the iteration stops.

strict

A logical affecting the definition of being undisputed (see Details). Default: FALSE.

relax

Deprecated; use strict = FALSE instead.

limit

A nonnegative number, by default 0. Only pairwise LR values above this are considered.

nkeep

An integer, or NULL. If given, only the nkeep most likely pairings are kept for each victim.

numCores

An integer; the number of cores used in parallelisation. Default: 1.

verbose

A logical. Default: TRUE.

Details

If the parameter strict is set to TRUE, the last criterion is replaced with the stronger requirement that all other pairwise LRs involving V_i or M_j must be at most 1.
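The two variants of the criterion can be written as a single check on the LR matrix. The base R sketch below does so for one cell of a toy matrix; the helper isUndisputed() and the LR values are hypothetical, missing values are not handled, and the iterative removal of identified pairs performed by findUndisputed() is not reproduced.

```r
# Hypothetical helper: is the pairing (row i, column j) undisputed?
# i and j are integer indices into the pairwise LR matrix
isUndisputed = function(LR, i, j, threshold = 1e4, strict = FALSE) {
  x = LR[i, j]

  # All other pairwise LRs involving victim i or missing person j
  others = c(LR[i, -j], LR[-i, j])

  # First criterion: the LR itself must reach the threshold
  if (x < threshold) return(FALSE)

  # Second criterion: dominate the others by a factor of `threshold`,
  # or (strict) require that all others are at most 1
  if (strict) all(others <= 1) else all(x >= threshold * others)
}

# Toy LR matrix: victims in rows, missing persons in columns
LR = matrix(c(1e6, 20, 0,
              0.5, 5e4, 2),
            nrow = 3,
            dimnames = list(c("V1", "V2", "V3"), c("M1", "M2")))

u11 = isUndisputed(LR, 1, 1)                 # V1 = M1, default variant
s11 = isUndisputed(LR, 1, 1, strict = TRUE)  # V2 has LR 20 > 1 against M1
u22 = isUndisputed(LR, 2, 2)                 # V2 = M2: 5e4 < 1e4 * 20
```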

Value

A list with the following entries:

See Also

pairwiseLR(), findExcluded()

Examples



u1 = findUndisputed(planecrash, verbose = FALSE)
u1$summary 

# With `strict = TRUE`, the match M3 = V2 goes away
u2 = findUndisputed(planecrash, strict = TRUE, verbose = FALSE)
u2$summary

# Reason: M3 has LR > 1 also against V7
u2$LRmatrix[, "M3"] |> round(2)



DVI dataset: Family of fire victims

Description

A family with three missing persons after a fire, and one reference individual. This example is featured in the GLR paper (Egeland & Vigeland, 2024).

Usage

fire

Format

A dviData object with the following content:

Examples

fire

plotDVI(fire, marker = 1)

jointDVI(fire)


Format final summary table

Description

Combines and harmonises summary tables from different DVI analyses.

Usage

formatSummary(dfs, orientation = c("AM", "PM"), columns = NULL, dvi = NULL)

Arguments

dfs

A list of data frames.

orientation

Either "AM" or "PM", controlling column order and sorting.

columns

An (optional) character vector with column names in the wanted order.

dvi

A dviData object used for sorting. Note that if given, this must contain all victims and families.

Details

The default column order is controlled by orientation, which has the following effect:

Columns (in any of the data frames) other than these are simply ignored.

Value

A data frame.

Examples

u = findUndisputed(planecrash)
a = amDrivenDVI(u$dviReduced, threshold2 = 500)

u$summary
a$summary

formatSummary(list(u$summary, a$summary$AM))
formatSummary(list(u$summary, a$summary$PM), orientation = "PM", dvi = planecrash)


Sex-consistent pairings

Description

Generate a list of sex-consistent pairings for each victim in a DVI problem. By default, the empty pairing (denoted *) is included for each victim.
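The idea can be sketched in base R using plain named vectors of sex codes (1 = male, 2 = female) instead of pedigree objects. The helper pairingsSketch() is hypothetical, ignores individuals of unknown sex, and places the symbol "*" first by assumption; it is only meant to show what sex-consistent means here.

```r
# Hypothetical helper: sex-consistent candidate pairings for each victim
pairingsSketch = function(vicSex, misSex, includeEmpty = TRUE,
                          ignoreSex = FALSE) {
  lapply(vicSex, function(s) {
    # Candidates: all missing persons, or only those with the same sex
    cand = if (ignoreSex) names(misSex) else names(misSex)[misSex == s]
    if (includeEmpty) c("*", cand) else cand
  })
}

vicSex = c(V1 = 1, V2 = 2)                  # one male and one female victim
misSex = c(M1 = 1, M2 = 1, M3 = 1, M4 = 2)  # three males, one female

p = pairingsSketch(vicSex, misSex)
p

# Ignoring sex, every victim can be paired with every missing person
pAll = pairingsSketch(vicSex, misSex, ignoreSex = TRUE)
```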

Usage

generatePairings(dvi, includeEmpty = TRUE, ignoreSex = FALSE)

Arguments

dvi

A dviData object, typically created with dviData().

includeEmpty

A logical. If TRUE (default), the do-nothing symbol (*) is included for each victim.

ignoreSex

A logical.

Value

A list of character vectors. Each vector is a subset of missing, plus the character * denoting no pairing.

See Also

jointDVI()

Examples


pm = singletons(c("V1", "V2"), sex = 1:2)
          
missing = paste0("M", 1:4)
am = list(nuclearPed(children = missing[1:3]),
          nuclearPed(children = missing[4], sex = 2))

dvi = dviData(pm, am, missing)
generatePairings(dvi)


Get AM component of selected individuals

Description

Get AM component of selected individuals

Usage

getFamily(dvi, ids)

Arguments

dvi

A dviData object.

ids

A vector of ID labels of members of dvi$am.

Value

A vector of the same length as ids, containing the family names (if dvi$am is named) or component indices (otherwise) of the ids individuals.

Examples

getFamily(example2, ids = example2$missing)


Find the simple families of a DVI dataset

Description

Extract the names (if present) or indices of the simple reference families, i.e., the families containing exactly 1 missing person.

Usage

getSimpleFams(dvi)

Arguments

dvi

A dviData object.

Value

A character (if dvi$am has names) or integer vector.

See Also

getFamily()

Examples

# No simple families
simple1 = getSimpleFams(example1)
stopifnot(length(simple1) == 0)

# Second family is simple
simple2 = getSimpleFams(example2)
stopifnot(simple2 == "F2")


DVI dataset: Family grave

Description

Family grave data in Kling et al. (2021) "Mass Identifications: Statistical Methods in Forensic Genetics". There are 5 female victims and 3 male victims. There is one reference family with 5 missing females and 3 missing males. There are 23 markers, no mutation model.

Usage

grave

Format

A dviData object with the following content:

Examples

grave

# plotDVI(grave, marker = 1)

# jointDVI(grave)


DVI dataset: A large reference pedigree

Description

DVI dataset based loosely on the ICMP 2017 workshop material https://www.few.vu.nl/~ksn560/Block-III-PartI-KS-ISFG2017.pdf (page 18). There are 3 female victims, 2 male victims and 6 missing persons of both sexes. We have renamed the individuals and simulated data for 13 CODIS markers (see Details).

Usage

icmp

Format

A dviData object with the following content:

Details

The 13 markers are, in order: CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, FGA, TH01, TPOX, and vWA.

Source code for the simulation, and a file containing the allele frequencies, can be found in the data-raw folder of the GitHub repository: https://github.com/magnusdv/dvir.

Examples

icmp

# plotDVI(icmp)

# Markers and allele frequencies
db = pedtools::getFreqDatabase(icmp$pm)
db


Joint DVI search

Description

Victims are given as a list of singletons, and references as a list of pedigrees. All possible assignments are evaluated and solutions ranked according to the likelihood.

Usage

jointDVI(
  dvi,
  pairings = NULL,
  ignoreSex = FALSE,
  assignments = NULL,
  limit = 0,
  nkeep = NULL,
  undisputed = TRUE,
  markers = NULL,
  threshold = 10000,
  strict = FALSE,
  relax = !strict,
  disableMutations = NA,
  maxAssign = 1e+05,
  numCores = 1,
  check = TRUE,
  verbose = TRUE
)

compactJointRes(jointRes, LRthresh = NULL)

Arguments

dvi

A dviData object, typically created with dviData().

pairings

A list of possible pairings for each victim. If NULL, all sex-consistent pairings are used.

ignoreSex

A logical.

assignments

A data frame containing the assignments to be considered in the joint analysis. By default, this is automatically generated by taking all combinations from pairings.

limit

A nonnegative number, by default 0. Only pairwise LR values above this are considered.

nkeep

An integer, or NULL. If given, only the nkeep most likely pairings are considered for each victim.

undisputed

A logical, by default TRUE.

markers

A vector indicating which markers should be included in the analysis. By default all markers are included.

threshold

A positive number, passed on to findUndisputed(). Default: 1e4.

strict

A logical, passed on to findUndisputed(). Default: FALSE.

relax

Deprecated.

disableMutations

A logical, or NA (default). The default action is to disable mutations in all reference families without Mendelian errors.

maxAssign

A positive integer. If the number of assignments going into the joint calculation exceeds this, the function will abort with an informative error message. Default: 1e5.

numCores

An integer; the number of cores used in parallelisation. Default: 1.

check

A logical, indicating if the input data should be checked for consistency.

verbose

A logical.

jointRes

A data frame produced by jointDVI().

LRthresh

A positive number, used as upper limit for the LR comparing the top result with all others.

Value

A data frame. Each row describes an assignment of victims to missing persons, accompanied by its log likelihood, the LR compared to the null (i.e., no identifications), and the posterior corresponding to a flat prior.

The function compactJointRes() removes columns without assignments, and solutions whose LR compared with the top result is below 1/LRthresh.
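The effect of this compaction can be sketched in base R on a toy table. The column name loglik, the helper compactSketch() and the order of the two steps are assumptions made for illustration; the actual column names and logic of compactJointRes() may differ.

```r
# Toy joint result: one column per victim plus a hypothetical loglik column
jointRes = data.frame(
  V1 = c("M1", "M1", "*"),
  V2 = c("*", "*", "*"),
  loglik = c(-10, -12, -30)
)

# Hypothetical helper mirroring the description of compactJointRes()
compactSketch = function(jointRes, victims, LRthresh = NULL) {
  if (!is.null(LRthresh)) {
    # LR of each solution compared with the top result
    rel = exp(jointRes$loglik - max(jointRes$loglik))
    # Drop solutions whose LR against the top result is below 1/LRthresh
    jointRes = jointRes[rel >= 1 / LRthresh, , drop = FALSE]
  }

  # Drop victim columns without any assignment (only "*")
  empty = vapply(victims, function(v) all(jointRes[[v]] == "*"), logical(1))
  jointRes[, setdiff(names(jointRes), victims[empty]), drop = FALSE]
}

res = compactSketch(jointRes, victims = c("V1", "V2"), LRthresh = 100)
res
```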

See Also

pairwiseLR(), findUndisputed()

Examples

jointDVI(example2)


Identify and merge matching PM samples

Description

Computes the direct matching LR of each pair of samples, and merges the matching samples.

Usage

mergePM(
  pm,
  threshold = 10000,
  method = c("mostcomplete", "first", "combine"),
  verbose = TRUE
)

Arguments

pm

A list of typed singletons.

threshold

LR threshold for positive identification.

method

A keyword indicating how to merge matching samples. See Details.

verbose

A logical.

Details

The available methods for merging matched samples are:

Value

A list with the following entries:

See Also

directMatch().

Examples


pm = singletons(c("V1", "V2", "V3")) |> 
  addMarker(V1 = "1/1", V2 = "2/2", V3 = "1/1", 
            afreq = c("1" = 0.01, "2" = 0.99), name = "L1")

mergePM(pm)


The number of assignments for a DVI problem

Description

The number of victims and missing persons of each sex is given. The number of possible assignments, i.e., the number of ways the victims can be identified with the missing persons, is calculated.
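One way to arrive at this number is the following counting argument, sketched in base R. Within one sex, choose k victims and k missing persons and match them in k! ways, summing over k; assuming that victims are only paired with missing persons of the same sex, the counts for the two sexes multiply. The helpers nAssignOneSex() and ncombSketch() are hypothetical and not necessarily how ncomb() is implemented.

```r
# Hypothetical helper: number of assignments within one sex
nAssignOneSex = function(nV, nMP) {
  k = 0:min(nV, nMP)   # number of identified victims
  sum(choose(nV, k) * choose(nMP, k) * factorial(k))
}

# Hypothetical counterpart of ncomb(): product over the two sexes
ncombSketch = function(nVfemales, nMPfemales, nVmales, nMPmales) {
  nAssignOneSex(nVfemales, nMPfemales) * nAssignOneSex(nVmales, nMPmales)
}

# 3 male victims and 2 male missing persons: 1 + 6 + 6 assignments
n = ncombSketch(0, 0, 3, 2)
n  # 13
```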

Usage

ncomb(nVfemales, nMPfemales, nVmales, nMPmales)

Arguments

nVfemales

Integer. The number of female victims.

nMPfemales

Integer. The number of female missing persons.

nVmales

Integer. The number of male victims.

nMPmales

Integer. The number of male missing persons.

Value

The total number of possible assignments.

Examples


# Example
m1 = ncomb(5,5,5,5)

# Example: 3 male victims; 2 male missing persons.
# The number of a priori possible assignments is
m1 = ncomb(0,0,3,2) # 13


# Compare with the complete list of assignments
m2 = expand.grid.nodup(list(V1 = c("*", "M1", "M2"),
                            V2 = c("*", "M1", "M2"),
                            V3 = c("*", "M1", "M2")))
stopifnot(m1 == nrow(m2))
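
The count can also be reproduced from first principles. The sketch below is a base-R illustration, not the package's implementation. Within one sex, choosing k victims, k missing persons and one of the k! ways of pairing them gives choose(nV, k) * choose(nMP, k) * k! assignments with exactly k identifications. The two sexes are then combined by multiplication, assuming that only sex-consistent pairings are allowed. The function names are hypothetical.

```r
# Number of assignments within one sex: sum over the number k of identifications
nAssignOneSex = function(nV, nMP) {
  k = 0:min(nV, nMP)
  sum(choose(nV, k) * choose(nMP, k) * factorial(k))
}

# The sexes are handled independently, so the counts multiply
nAssign = function(nVfemales, nMPfemales, nVmales, nMPmales) {
  nAssignOneSex(nVfemales, nMPfemales) * nAssignOneSex(nVmales, nMPmales)
}

nAssign(0, 0, 3, 2)  # 13, as in the example above
```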


Pairwise LR matrix

Description

For a given DVI problem, compute the matrix consisting of pairwise likelihood ratios LR_{i,j} comparing V_i = M_j to the null. The output may be reduced by specifying arguments limit or nkeep.

Usage

pairwiseLR(
  dvi,
  pairings = NULL,
  ignoreSex = FALSE,
  limit = 0,
  nkeep = NULL,
  check = TRUE,
  numCores = 1,
  verbose = FALSE
)

Arguments

dvi

A dviData object, typically created with dviData().

pairings

A list of possible pairings for each victim. If NULL, all sex-consistent pairings are used.

ignoreSex

A logical.

limit

A nonnegative number controlling the pairings slot of the output: Only pairings with LR greater than or equal to limit are kept. If zero (default), pairings with LR > 0 are kept.

nkeep

An integer, or NULL. If given, only the nkeep most likely pairings are kept for each victim.

check

A logical, indicating if the input data should be checked for consistency.

numCores

An integer; the number of cores used in parallelisation. Default: 1.

verbose

A logical.

Value

A list with 3 elements:

Examples

pairwiseLR(example1, verbose = TRUE)
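
The effect of limit and nkeep can be sketched in base R on a made-up LR matrix. The helper keepPairings() below is hypothetical and only mimics the filtering rules described under Arguments; the actual output of pairwiseLR() may be organised differently.

```r
# Made-up pairwise LR matrix: victims in rows, missing persons in columns
LRmat = matrix(c(120, 0,   0.5,
                 3,   40,  0,
                 0,   0.2, 900),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("V1", "V2", "V3"), c("M1", "M2", "M3")))

# Illustrative helper (not part of dvir): candidate pairings for each victim
keepPairings = function(LRmat, limit = 0, nkeep = NULL) {
  sapply(rownames(LRmat), function(v) {
    lrs = LRmat[v, ]
    # limit = 0 keeps LR > 0; otherwise keep LR >= limit
    keep = if (limit == 0) lrs > 0 else lrs >= limit
    lrs = sort(lrs[keep], decreasing = TRUE)
    if (!is.null(nkeep)) lrs = head(lrs, nkeep)
    names(lrs)
  }, simplify = FALSE)
}

keepPairings(LRmat)              # all pairings with LR > 0
keepPairings(LRmat, limit = 1)   # only pairings with LR >= 1
keepPairings(LRmat, nkeep = 1)   # the single most likely pairing per victim
```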


DVI dataset: Simulated plane crash

Description

A simulated dataset based on Exercise 3.3 in Egeland et al. "Relationship Inference with Familias and R" (2015).

Usage

planecrash

Format

A dviData object with the following content:

Details

The 15 markers are CSF1PO, D13S317, D16S539, D18S51, D21S11, D3S1358, D5S818, D7S820, D8S1179, FGA, PENTA_D, PENTA_E, TH01, TPOX, and VWA.

Source code for the simulation, and a file containing the allele frequencies, can be found in the data-raw folder of the GitHub repository: https://github.com/magnusdv/dvir.

Examples

planecrash

# plotDVI(planecrash)

# Markers and allele frequencies
db = pedtools::getFreqDatabase(planecrash$pm)
db


Plot a DVI problem

Description

Plot a DVI problem

Usage

plotDVI(
  dvi,
  pm = TRUE,
  am = TRUE,
  style = 1,
  famnames = NA,
  hatched = typedMembers,
  frames = TRUE,
  titles = c("PM", "AM"),
  cex = NA,
  col = 1,
  lwd = 1,
  fill = NA,
  carrier = NULL,
  widths = NULL,
  heights = NULL,
  nrowPM = NA,
  nrowAM = NA,
  dev.height = NULL,
  dev.width = NULL,
  newdev = !is.null(c(dev.height, dev.width)),
  ...
)

Arguments

dvi

A dviData object, typically created with dviData().

pm

Either a logical indicating if the PM data should be plotted (as a set of singletons), or a vector of indices selecting a subset of the PM samples. Default: TRUE.

am

Either a logical indicating if the AM families data should be plotted, or a vector of indices selecting a subset of the families. Default: TRUE.

style

An integer (currently 1 or 2) indicating the style of the plot.

famnames

A logical. If NA (default), family names are included if there are multiple families.

hatched

A character vector of ID labels, or the name of a function. By default, typed individuals are hatched.

frames

A logical, by default TRUE.

titles

A character vector of length 2, giving the titles of the PM and AM parts of the plot.

fill, cex, col, lwd, carrier

Arguments passed on to pedtools::plot.ped().

widths, heights

Numeric vectors with relative column widths / row heights, to be passed on to layout().

nrowPM

The number of rows in the array of PM singletons.

nrowAM

The number of rows in the array of AM families.

dev.height, dev.width

Plot height and width in inches. These are optional, and only relevant if newdev = TRUE.

newdev

A logical indicating if a new plot window should be opened.

...

Further parameters to be passed on to pedtools::plot.ped(), e.g., marker, cex, cex.main, symbolsize.

Examples


plotDVI(example2)

# Override default layout of PM singletons
plotDVI(example2, nrowPM = 1)

# Subset
plotDVI(example2, pm = 1:2, am = 1, titles = c("PM (1-2)", "AM (1)"))

# AM only
plotDVI(example2, pm = FALSE, titles = "AM families")

# Further plot options
plotDVI(example2, frames = FALSE, marker = 1, cex = 1.2, nrowPM = 3, nrowAM = 2,
  textAnnot = list(inside = list(c(M1 = "?", M2 = "?", M3 = "?"), cex = 1.5)))


Plot DVI solution

Description

A version of plotDVI() tailor-made to visualise identified individuals, for example as reported by jointDVI().

Usage

plotSolution(dvi, assignment, k = 1, format = "[S]=[M]", ...)

Arguments

dvi

A dviData object.

assignment

A named character vector of the format c(victim = missing, ...), or a data frame produced by jointDVI().

k

An integer; the row number when assignment is a data frame.

format

A string indicating how identified individuals should be labelled, using [M] and [S] as placeholders for the missing person and the matching sample, respectively. (See Examples.)

...

Further arguments passed on to plotDVI().

Value

NULL.

Examples


res = jointDVI(example2, verbose = FALSE)

plotSolution(example2, res)

# With line break in labels
plotSolution(example2, res, format = "[M]=\n[S]")

# With genotypes for marker 1
plotSolution(example2, res, marker = 1)

# Non-optimal solutions
plotSolution(example2, res, k = 2, pm = FALSE)
plotSolution(example2, res, k = 2, cex = 1.3)


Plot undisputed identifications

Description

Plot undisputed identifications

Usage

plotUndisputed(dvi, undisputed, ...)

Arguments

dvi

A dviData object.

undisputed

A data frame containing the undisputed matches, typically the entry undisputed in output from findUndisputed(). Only the first three columns are used.

...

Further arguments passed on to plotSolution().

See Also

findUndisputed(), plotSolution()

Examples


# Example
res = findUndisputed(example2, threshold = 2, verbose = FALSE)
u = res$summary
plotUndisputed(example2, u, marker = 1)


Automatic labelling of a DVI dataset

Description

Relabel the individuals and families in a DVI dataset.

Usage

relabelDVI(
  dvi,
  victims = NULL,
  victimPrefix = NULL,
  familyPrefix = NULL,
  refs = NULL,
  refPrefix = NULL,
  missingPrefix = NULL,
  missingFormat = NULL,
  othersPrefix = NULL
)

Arguments

dvi

A dviData object, typically created with dviData().

victims

A named vector of the form c(old = new) with names for the PM samples, or a function to be applied to the existing names.

victimPrefix

Prefix used to label PM individuals.

familyPrefix

Prefix used to label the AM families.

refs

A named vector of the form c(old = new) with names for the typed references, or a function to be applied to the existing names.

refPrefix

Prefix used to label the reference individuals, i.e., the typed members of the AM families.

missingPrefix

Prefix used to label the missing persons. At most one of missingPrefix and missingFormat can be given.

missingFormat

A string indicating family-wise labelling of missing persons, using [FAM], [IDX], [MIS] as placeholders with the following meanings (see Examples):

  • [FAM]: family index

  • [IDX]: index of missing person within the family

  • [MIS]: index within all missing persons

othersPrefix

Prefix used to label other untyped individuals. Use "" for numeric labels (1, 2, ...).

Value

A dviData object.

Examples


# Builtin dataset `example2`
relabelDVI(example2,
           victimPrefix  = "vic",
           familyPrefix  = "fam",
           refPrefix     = "ref",
           missingPrefix = "mp")

# Family-wise labelling of missing persons
relabelDVI(example2, missingFormat = "M[FAM]-[IDX]")
relabelDVI(example2, missingFormat = "M[IDX] (F[FAM])")
relabelDVI(example2, missingFormat = "fam[FAM].m[IDX]")
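
How the placeholders in missingFormat expand can be sketched with plain string substitution. The family layout below is made up, and expandFormat() is a hypothetical helper rather than part of dvir; the labels actually produced by relabelDVI() depend on the dataset.

```r
# Made-up bookkeeping: three missing persons in two families
fam = c(1, 1, 2)                       # [FAM]: family index
idx = ave(fam, fam, FUN = seq_along)   # [IDX]: index within the family
mis = seq_along(fam)                   # [MIS]: index among all missing persons

# Illustrative helper (not part of dvir): fill in the placeholders
expandFormat = function(format, fam, idx, mis) {
  mapply(function(f, i, m) {
    s = gsub("[FAM]", f, format, fixed = TRUE)
    s = gsub("[IDX]", i, s, fixed = TRUE)
    gsub("[MIS]", m, s, fixed = TRUE)
  }, fam, idx, mis, USE.NAMES = FALSE)
}

expandFormat("M[FAM]-[IDX]", fam, idx, mis)
expandFormat("M[IDX] (F[FAM])", fam, idx, mis)
```

Using fixed = TRUE matters here, since the square brackets would otherwise be read as a regular expression character class.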


Sequential DVI search

Description

Performs a sequential matching procedure based on the pairwise LR matrix. In each step the pairing corresponding to the highest LR is selected and included as a match if the LR exceeds the given threshold. By default (updateLR = TRUE), the pairwise LRs are recomputed in each step after including the data from the identified sample.

Usage

sequentialDVI(
  dvi,
  updateLR = TRUE,
  threshold = 1,
  check = TRUE,
  verbose = TRUE,
  debug = FALSE
)

Arguments

dvi

A dviData object, typically created with dviData().

updateLR

A logical. If TRUE, the LR matrix is updated in each iteration.

threshold

A non-negative number. If no pairwise LR values exceed this, the iteration stops.

check

A logical, indicating if the input data should be checked for consistency.

verbose

A logical, by default TRUE.

debug

A logical, by default FALSE. If TRUE, the LR matrix is printed in each step.

Details

If, at any point, the highest LR is obtained by more than one pairing, the process branches off and produces multiple solutions. (See Value.)

Value

A list with two elements:

Examples

# Without LR updates
sequentialDVI(example1, updateLR = FALSE)

# With LR updates (default). Note two branches!
r = sequentialDVI(example1)

# Plot the two solutions
plotSolution(example1, r$matches, k = 1)
plotSolution(example1, r$matches, k = 2)

# Add `debug = TRUE` to see the LR matrix in each step
sequentialDVI(example1, debug = TRUE)

# The output can be fed into `jointDVI()`:
jointDVI(example1, assignments = r$matches)
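
The selection step described above can be sketched in base R for the simpler case without LR updates. The LR matrix is made up and greedyMatch() is a hypothetical helper. Unlike sequentialDVI(), this sketch does not branch on ties; it simply takes the first maximum.

```r
# Made-up pairwise LR matrix: victims in rows, missing persons in columns
LRmat = matrix(c(120, 5,  0.5,
                 30,  40, 0,
                 0,   2,  0.8),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("V1", "V2", "V3"), c("M1", "M2", "M3")))

# Illustrative helper (not part of dvir): greedy matching on a fixed LR matrix
greedyMatch = function(LRmat, threshold = 1) {
  matches = character(0)
  while (length(LRmat) > 0 && max(LRmat) > threshold) {
    # Position of the (first) highest LR; ties are not branched in this sketch
    pos = which(LRmat == max(LRmat), arr.ind = TRUE)[1, ]
    vic = rownames(LRmat)[pos[["row"]]]
    matches[vic] = colnames(LRmat)[pos[["col"]]]
    # Remove the identified victim and missing person from further steps
    LRmat = LRmat[-pos[["row"]], -pos[["col"]], drop = FALSE]
  }
  matches
}

greedyMatch(LRmat)                  # V1 = M1, V2 = M2; V3 stays unidentified
greedyMatch(LRmat, threshold = 50)  # only V1 = M1 exceeds the threshold
```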


Set identifications manually

Description

Manually set one or several identifications in a DVI dataset. Typically, these are obtained by external means, e.g., fingerprints, dental records etc.

Usage

setPairing(
  dvi,
  match = NULL,
  victim = NULL,
  missing = NULL,
  Conclusion = "Provided",
  Comment = "",
  verbose = TRUE
)

Arguments

dvi

A DVI dataset.

match

A named vector of the format c(vic1 = miss1, vic2 = miss2, ...).

victim

A vector of victim sample names. If NULL, defaults to names(match).

missing

A vector of missing person names, of the same length as victim. If NULL, defaults to as.character(match).

Conclusion

A character passed on to the Conclusion column of the output summary.

Comment

A character passed on to the Comment column of the output summary.

verbose

A logical, by default TRUE.

Details

The command setPairing(dvi, c("V" = "M")) does the following:

Value

A list with the following entries:

Examples

x = setPairing(example2, match = c("V3" = "M2"))
x$dviReduced
x$summary

# Alternative syntax, using `victim` and `missing`
y = setPairing(planecrash, victim = c("V4", "V5"), missing = c("M4", "M5"),
           Conclusion = "External evidence", Comment = "Dental")
y$dviReduced
y$summary


Extract a subset of a DVI dataset

Description

Extract a subset of a DVI dataset

Usage

subsetDVI(dvi, pm = NULL, am = NULL, missing = NULL, verbose = TRUE)

Arguments

dvi

A dviData object.

pm

A vector with names or indices of victim samples. By default, all are included.

am

A vector with names or indices of AM components. By default, components without remaining missing individuals are dropped.

missing

A vector with names or indices of missing persons. By default, all missing persons in the remaining AM families are included.

verbose

A logical.

Value

A dviData object.

Examples


subsetDVI(example2, pm = 1:2) |> plotDVI()
subsetDVI(example2, pm = "V1", am = 1) |> plotDVI()
subsetDVI(example2, missing = "M3") |> plotDVI()


Swap orientation of an assignment table

Description

This function switches the roles of victims and missing persons in a table of assignments, from PM-oriented (victims as column names) to AM-oriented (missing persons as column names), and vice versa. In both versions, each row describes the same assignment vector.

Usage

swapOrientation(df, from = NULL, to = NULL)

Arguments

df

A data frame. Each row is an assignment, with * representing non-pairing.

from

A character vector; either victims or missing persons. By default, the column names of df. This argument is only needed when df has additional columns, as in output tables of jointDVI().

to

The column names of the transformed data frame. If missing, the unique elements of df are used. An error is raised if to does not contain all elements of df (except *).

Value

A data frame with nrow(df) rows and length(to) columns.

Examples

df = example1 |> generatePairings() |> expand.grid.nodup()
df
swapOrientation(df)

# Swapping twice recovers the original table (the swap is an involution)
stopifnot(identical(swapOrientation(swapOrientation(df)), df))
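
The change of orientation itself can be sketched in base R on a small hand-made table. The helper swapSketch() is hypothetical and ignores the from argument; it only illustrates the idea that each entry and its column name trade places, with * marking non-pairing.

```r
# PM-oriented table: one column per victim, entries are missing persons or "*"
pmTab = data.frame(V1 = c("M1", "*", "M2"),
                   V2 = c("M2", "M1", "*"))

# Illustrative helper (not part of dvir): swap the roles of columns and entries
swapSketch = function(df, to = setdiff(sort(unique(unlist(df))), "*")) {
  res = lapply(to, function(target) {
    # For each row, find the column (if any) whose entry equals `target`
    apply(df, 1, function(r) {
      hit = names(df)[r == target]
      if (length(hit) == 1) hit else "*"
    })
  })
  names(res) = to
  as.data.frame(res)
}

amTab = swapSketch(pmTab)   # AM-oriented: one column per missing person
amTab
swapSketch(amTab)           # swapping back recovers the content of pmTab
```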


Dataset: Symmetry examples

Description

A toy DVI dataset illustrating various forms of undecidability due to symmetries in the solutions.

Usage

symmetricSibs

Format

A dviData object with the following content:

Examples


symmetricSibs