Help for package fxTWAPLS

Title:

An Improved Version of WA-PLS

Version:

0.1.3

Description:

The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. It also provides a leave-out cross-validation scheme that, in each cycle, removes the test site together with the sites that are both geographically and climatically close to it, to avoid the risk of pseudo-replication.

License:

GPL-3

Encoding:

UTF-8

URL:

https://github.com/special-uor/fxTWAPLS/, https://special-uor.github.io/fxTWAPLS/, https://research.reading.ac.uk/palaeoclimate/

BugReports:

https://github.com/special-uor/fxTWAPLS/issues/

Imports:

doFuture, foreach, future, geosphere, ggplot2, JOPS, MASS, parallel, progressr

Suggests:

magrittr, progress, scales, tictoc

Depends:

R (≥ 3.6)

RoxygenNote:

7.3.1

Language:

en-GB

NeedsCompilation:

Packaged:

2024-06-25 11:27:00 UTC; r.villegas-diaz

Author:

Mengmeng Liu [aut], Iain Colin Prentice [aut], Cajo J. F. ter Braak [aut], Sandy P. Harrison [aut], Roberto Villegas-Diaz [aut, cre], SPECIAL Research Group @ University of Reading [cph]

Maintainer:

Roberto Villegas-Diaz <r.villegas-diaz@outlook.com>

Repository:

CRAN

Date/Publication:

2024-06-25 11:50:13 UTC

fxTWAPLS: An Improved Version of WA-PLS

Description

Author(s)

Maintainer: Roberto Villegas-Diaz r.villegas-diaz@outlook.com (ORCID)

Authors:

Mengmeng Liu m.liu18@imperial.ac.uk (ORCID)
Iain Colin Prentice c.prentice@imperial.ac.uk (ORCID)
Cajo J. F. ter Braak cajo.terbraak@wur.nl (ORCID)
Sandy P. Harrison s.p.harrison@reading.ac.uk (ORCID)

Other contributors:

SPECIAL Research Group @ University of Reading [copyright holder]

TWA-PLS predict function

Description

TWA-PLS predict function

Usage

TWAPLS.predict.w(TWAPLSoutput, fossil_taxa)

Arguments

TWAPLSoutput

The output of the TWAPLS.w training function, either with or without fx correction.

fossil_taxa

Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon.

Value

A list of the reconstruction results. Each element in the list is described below:

fit: the fitted values using each number of components.
nPLS: the total number of components extracted.
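To make the idea of a tolerance-weighted reconstruction concrete, the sketch below computes, for a single component, a weighted average of taxon optima in which each taxon is also weighted by the inverse square of its tolerance. This weighting is assumed here for illustration. The function `twa_predict` and all numbers are hypothetical; the package's own prediction works from the stored components, standardisation parameters and regression coefficients of the training output, so this is not its implementation.

```r
# Hypothetical single-component tolerance-weighted average (not the package code).
twa_predict <- function(fossil_taxa, optimum, tolerance) {
  y <- as.matrix(fossil_taxa)
  w <- 1 / tolerance^2                       # narrow-tolerance taxa weigh more
  as.numeric((y %*% (optimum * w)) / (y %*% w))
}

# Two made-up fossil samples (rows) and three taxa (columns)
fossil <- rbind(c(10, 0, 0), c(5, 5, 0))
optimum <- c(2, 8, 5)      # made-up taxon optima
tolerance <- c(1, 2, 1)    # made-up taxon tolerances

recon <- twa_predict(fossil, optimum, tolerance)
print(recon)
```

The first sample contains only the first taxon, so its reconstruction equals that taxon's optimum; in the second sample the wide-tolerance taxon pulls the estimate less than its abundance alone would suggest.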

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]

## Train
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

## Predict
fossil_t_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin, core)
fossil_tf_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin, core)
fossil_t_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin2, core)
fossil_tf_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin2, core)

## End(Not run)

TWA-PLS training function

Description

TWA-PLS training function, which can perform fx correction. 1/fx^2 correction will be applied at step 7.

Usage

TWAPLS.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

usefx

Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.

Value

A list of the training results, which will be used by the predict function. Each element in the list is described below:

fit: the fitted values using each number of components.
x: the observed modern climate values.
taxon_name: the name of each taxon.
optimum: the updated taxon optimum.
comp: each component extracted (will be used in step 7 regression).
u: taxon optimum for each component (step 2).
t: taxon tolerance for each component (step 2).
z: a parameter used in standardization for each component (step 5).
s: a parameter used in standardization for each component (step 5).
orth: a list that stores orthogonalization parameters (step 4).
alpha: a list that stores regression coefficients (step 7).
meanx: mean value of the observed modern climate values.
nPLS: the total number of components extracted.
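The elements `u` and `t` above are taxon optima and tolerances. As a simplified illustration of what such quantities mean, the sketch below computes an abundance-weighted mean and an abundance-weighted spread of the climate values at which each taxon occurs. The helper `wa_optimum_tolerance` and the data are hypothetical, and any corrections the package may apply (for example for the effective number of occurrences) are deliberately left out.

```r
# Hypothetical helper: abundance-weighted mean (optimum) and spread (tolerance)
# of the climate values at which each taxon occurs. No corrections applied.
wa_optimum_tolerance <- function(modern_taxa, modern_climate) {
  y <- as.matrix(modern_taxa)
  total <- colSums(y)
  u <- as.numeric(crossprod(y, modern_climate)) / total
  dev2 <- outer(modern_climate, u, "-")^2     # squared distance site-to-optimum
  tol <- sqrt(colSums(y * dev2) / total)
  list(u = u, t = tol)
}

# Four made-up sites (rows) and two taxa (columns)
taxa <- cbind(taxonA = c(1, 1, 0, 0), taxonB = c(0, 1, 2, 1))
climate <- c(0, 2, 4, 6)

res <- wa_optimum_tolerance(taxa, climate)
print(res)
```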

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Training
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

## End(Not run)

TWA-PLS training function v2

Description

TWA-PLS training function, which can perform fx correction. 1/fx correction will be applied at step 2 and step 7.

Usage

TWAPLS.w2(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

usefx

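Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.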

Value

A list of the training results, which will be used by the predict function. Each element in the list is described below:

fit: the fitted values using each number of components.
x: the observed modern climate values.
taxon_name: the name of each taxon.
optimum: the updated taxon optimum.
comp: each component extracted (will be used in step 7 regression).
u: taxon optimum for each component (step 2).
t: taxon tolerance for each component (step 2).
z: a parameter used in standardization for each component (step 5).
s: a parameter used in standardization for each component (step 5).
orth: a list that stores orthogonalization parameters (step 4).
alpha: a list that stores regression coefficients (step 7).
meanx: mean value of the observed modern climate values.
nPLS: the total number of components extracted.
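The v2 training function applies a 1/fx correction when estimating optima (step 2). A minimal sketch of what that could look like, assuming the correction simply divides each site's abundances by the frequency of its climate value before averaging, is given below. The helper `fx_weighted_optimum` and the data are hypothetical and do not reproduce the package code.

```r
# Hypothetical 1/fx-weighted optimum: sites whose climate value is sampled
# often (large fx) contribute less to the taxon optimum.
fx_weighted_optimum <- function(modern_taxa, modern_climate, fx) {
  y <- as.matrix(modern_taxa) / fx   # row i is divided by fx[i]
  as.numeric(crossprod(y, modern_climate) / colSums(y))
}

taxa <- cbind(taxonA = c(1, 1, 1))   # one taxon, equally abundant at three sites
climate <- c(0, 0, 6)                # climate value 0 is sampled twice
fx <- c(2, 2, 1)                     # made-up frequencies of the climate values

unweighted <- fx_weighted_optimum(taxa, climate, fx = rep(1, 3))
weighted <- fx_weighted_optimum(taxa, climate, fx = fx)
print(c(unweighted = unweighted, weighted = weighted))
```

Without the correction the over-sampled climate value dominates and the optimum is 2; with it, both distinct climate values count equally and the optimum moves to 3.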

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Training
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

## End(Not run)

WA-PLS predict function

Description

WA-PLS predict function

Usage

WAPLS.predict.w(WAPLSoutput, fossil_taxa)

Arguments

WAPLSoutput

The output of the WAPLS.w training function, either with or without fx correction.

fossil_taxa

Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon.

Value

A list of the reconstruction results. Each element in the list is described below:

fit: The fitted values using each number of components.
nPLS: The total number of components extracted.
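Because `fit` holds the values obtained with each number of components, a reconstruction for a chosen number of components is a single column of it. The sketch below uses a mock object rather than real package output, and assumes that `fit` is a matrix with one column per number of components; `mock_output` and its values are hypothetical.

```r
# Mock object with the two documented elements (values are made up)
mock_output <- list(
  fit = matrix(
    c(1.0, 2.0, 1.2, 2.1, 1.3, 2.4),
    nrow = 2,
    dimnames = list(NULL, paste0("comp", 1:3))
  ),
  nPLS = 3
)

nsig <- 2                             # number of components chosen beforehand
recon <- mock_output$fit[, nsig]      # one reconstructed value per fossil sample
print(recon)
```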

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]

## Train
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)
## Predict
fossil_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_Tmin, core)
fossil_f_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin, core)
fossil_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_Tmin2, core)
fossil_f_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin2, core)

## End(Not run)

WA-PLS training function

Description

WA-PLS training function, which can perform fx correction. 1/fx^2 correction will be applied at step 7.

Usage

WAPLS.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

usefx

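Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.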

Value

A list of the training results, which will be used by the predict function. Each element in the list is described below:

fit: the fitted values using each number of components.
x: the observed modern climate values.
taxon_name: the name of each taxon.
optimum: the updated taxon optimum (u* in the WA-PLS paper).
comp: each component extracted (will be used in step 7 regression).
u: taxon optimum for each component (step 2).
z: a parameter used in standardization for each component (step 5).
s: a parameter used in standardization for each component (step 5).
orth: a list that stores orthogonalization parameters (step 4).
alpha: a list that stores regression coefficients (step 7).
meanx: mean value of the observed modern climate values.
nPLS: the total number of components extracted.
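The 1/fx^2 correction at step 7 can be pictured as a weighted regression of the observed climate on the extracted components. The sketch below, which assumes the correction enters as regression weights and uses a single made-up component, compares an unweighted and a weighted fit with base R's `lm`. All values are hypothetical.

```r
comp1 <- 1:6                                     # made-up first component
climate <- c(5.1, 7.9, 11.2, 13.8, 17.3, 19.9)   # made-up observed climate
fx <- c(3, 3, 1, 1, 2, 2)                        # made-up sampling frequencies

unweighted <- lm(climate ~ comp1)
weighted <- lm(climate ~ comp1, weights = 1 / fx^2)  # down-weight frequent values

print(rbind(unweighted = coef(unweighted), weighted = coef(weighted)))
```

Sites in densely sampled parts of the climate range receive small weights, so the fitted coefficients are driven less by where sampling happens to be concentrated.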

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Training
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

## End(Not run)

WA-PLS training function v2

Description

WA-PLS training function, which can perform fx correction. 1/fx correction will be applied at step 2 and step 7.

Usage

WAPLS.w2(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

usefx

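Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.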

Value

A list of the training results, which will be used by the predict function. Each element in the list is described below:

fit: the fitted values using each number of components.
x: the observed modern climate values.
taxon_name: the name of each taxon.
optimum: the updated taxon optimum (u* in the WA-PLS paper).
comp: each component extracted (will be used in step 7 regression).
u: taxon optimum for each component (step 2).
z: a parameter used in standardization for each component (step 5).
s: a parameter used in standardization for each component (step 5).
orth: a list that stores orthogonalization parameters (step 4).
alpha: a list that stores regression coefficients (step 7).
meanx: mean value of the observed modern climate values.
nPLS: the total number of components extracted.

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Training
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

## End(Not run)

Pseudo-removed leave-out cross-validation

Description

Pseudo-removed leave-out cross-validation

Usage

cv.pr.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  trainfun,
  predictfun,
  pseudo,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  test_mode = TRUE,
  test_it = 5
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

trainfun

Training function you want to use, either WAPLS.w or TWAPLS.w.

predictfun

Predict function you want to use: if trainfun is WAPLS.w, then this should be WAPLS.predict.w; if trainfun is TWAPLS.w, then this should be TWAPLS.predict.w.

pseudo

The geographically and climatically close sites to each test site, obtained from get_pseudo function.

usefx

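Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.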

cpus

Number of CPUs for simultaneous iterations to execute, check parallel::detectCores() for available CPUs on your machine.

test_mode

Boolean flag to execute the function with a limited number of iterations, test_it, for testing purposes only.

test_it

Number of iterations to use in the test mode.

Value

Pseudo-removed leave-out cross-validation results.
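What distinguishes this scheme from ordinary leave-one-out is the set of sites kept for training in each cycle. The sketch below shows that selection only, assuming that `pseudo` is a list whose i-th element holds the row indices of the sites close to site i. The helper `training_rows` and the example list are hypothetical.

```r
# Hypothetical helper: rows kept for training when site i is the test site
training_rows <- function(i, pseudo, n) {
  setdiff(seq_len(n), c(i, pseudo[[i]]))
}

# Assumed format: sites 1-2 are close to each other, as are sites 4-5
pseudo <- list(c(1, 2), c(1, 2), integer(0), c(4, 5), c(4, 5))

rows_for_1 <- training_rows(1, pseudo, n = 5)   # 3 4 5
rows_for_3 <- training_rows(3, pseudo, n = 5)   # 1 2 4 5
print(rows_for_1)
print(rows_for_3)
```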

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)

cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  pseudo_Tmin,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)

# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  pseudo_Tmin,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()

## End(Not run)

Leave-one-out cross-validation

Description

Leave-one-out cross-validation, as in rioja (https://cran.r-project.org/package=rioja).

Usage

cv.w(
  modern_taxa,
  modern_climate,
  nPLS = 5,
  trainfun,
  predictfun,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  test_mode = FALSE,
  test_it = 5
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

nPLS

The number of components to be extracted.

trainfun

Training function you want to use, either WAPLS.w or TWAPLS.w.

predictfun

Predict function you want to use: if trainfun is WAPLS.w, then this should be WAPLS.predict.w; if trainfun is TWAPLS.w, then this should be TWAPLS.predict.w.

usefx

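Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.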

cpus

Number of CPUs for simultaneous iterations to execute, check parallel::detectCores() for available CPUs on your machine.

test_mode

Boolean flag to execute the function with a limited number of iterations, test_it, for testing purposes only.

test_it

Number of iterations to use in the test mode.

Value

Leave-one-out cross-validation results.
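The general shape of a leave-one-out loop is sketched below with a deliberately simple stand-in model (plain weighted averaging) in place of the package's training and predict functions. The helpers `wa_train` and `wa_predict` and the simulated data are hypothetical; only the train-on-all-but-one, predict-the-one pattern is the point.

```r
# Stand-in model (hypothetical): plain weighted averaging
wa_train <- function(taxa, climate) {
  as.numeric(crossprod(taxa, climate)) / colSums(taxa)   # taxon optima
}
wa_predict <- function(optimum, taxa) {
  as.numeric(taxa %*% optimum) / rowSums(taxa)
}

set.seed(1)
n <- 25
climate <- runif(n, 0, 20)                          # made-up climate values
taxa <- matrix(runif(n * 4, 0.1, 1), nrow = n)      # made-up abundances

# Leave each site out in turn, train on the rest, predict the left-out site
loo_pred <- vapply(seq_len(n), function(i) {
  optimum <- wa_train(taxa[-i, , drop = FALSE], climate[-i])
  wa_predict(optimum, taxa[i, , drop = FALSE])
}, numeric(1))

rmsep <- sqrt(mean((loo_pred - climate)^2))
print(rmsep)
```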

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

## LOOCV
test_mode <- TRUE # It should be set to FALSE before running
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)

# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  fxTWAPLS::TWAPLS.w2,
  fxTWAPLS::TWAPLS.predict.w,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>% fxTWAPLS::pb()

## End(Not run)

Get frequency of the climate value

Description

Function to get the frequency of the climate value, which will be used to provide fx correction for WA-PLS and TWA-PLS.

Usage

fx(x, bin, show_plot = FALSE)

Arguments

x

Numeric vector with the modern climate values.

bin

Binwidth to get the frequency of the modern climate values.

show_plot

Boolean flag to show a plot of fx ~ x.

Value

Numeric vector with the frequency of the modern climate values.
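One way to picture a binned frequency is to count how many climate values share a bin with each value. The sketch below does this in base R; the helper `binned_frequency`, the choice of bin origin (the minimum of `x`) and the data are assumptions, so results need not match the package's `fx` exactly.

```r
# Hypothetical helper: for each value, the number of values in the same bin
binned_frequency <- function(x, bin) {
  bin_id <- floor((x - min(x)) / bin)          # bin index of each value
  counts <- table(bin_id)
  as.numeric(counts[as.character(bin_id)])     # frequency of each value's bin
}

x <- c(1.2, 1.3, 2.7, 4.1, 4.4, 4.9)           # made-up climate values
freq <- binned_frequency(x, bin = 1)
print(freq)
```

The result has one entry per input value, which is the shape needed to weight each sampling site by its own frequency.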

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Get the frequency of each climate variable fx
fx_Tmin <- fxTWAPLS::fx(modern_pollen$Tmin, bin = 0.02, show_plot = TRUE)
fx_gdd <- fxTWAPLS::fx(modern_pollen$gdd, bin = 20, show_plot = TRUE)
fx_alpha <- fxTWAPLS::fx(modern_pollen$alpha, bin = 0.002, show_plot = TRUE)

## End(Not run)

Get frequency of the climate value with p-spline smoothing

Description

Function to get the frequency of the climate value with p-spline smoothing, which will be used to provide fx correction for WA-PLS and TWA-PLS.

Usage

fx_pspline(x, bin, show_plot = FALSE)

Arguments

x

Numeric vector with the modern climate values.

bin

Binwidth to get the frequency of the modern climate values; the resulting curve is then p-spline smoothed.

show_plot

Boolean flag to show a plot of fx ~ x.

Value

Numeric vector with the frequency of the modern climate values.
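The effect of smoothing the binned counts can be illustrated in base R. Note that the sketch below swaps in `stats::smooth.spline` (a penalised smoothing spline) for the p-spline used by the package, so it only conveys the idea; the helper `smoothed_frequency` and the simulated data are hypothetical.

```r
# Hypothetical helper: bin the values, smooth the counts, evaluate at each x
smoothed_frequency <- function(x, bin) {
  breaks <- seq(min(x), max(x) + bin, by = bin)
  h <- hist(x, breaks = breaks, plot = FALSE)
  fit <- smooth.spline(h$mids, h$counts)   # stand-in smoother, not a p-spline
  pmax(predict(fit, x)$y, 0)               # clip negative values to zero
}

set.seed(1)
x <- rnorm(200)                            # made-up climate values
fx_smooth <- smoothed_frequency(x, bin = 0.25)
print(summary(fx_smooth))
```

Compared with raw bin counts, the smoothed frequency changes gradually between neighbouring values instead of jumping at bin edges.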

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Get the frequency of each climate variable fx
fx_pspline_Tmin <- fxTWAPLS::fx_pspline(
  modern_pollen$Tmin,
  bin = 0.02,
  show_plot = TRUE
)
fx_pspline_gdd <- fxTWAPLS::fx_pspline(
  modern_pollen$gdd,
  bin = 20,
  show_plot = TRUE
)
fx_pspline_alpha <- fxTWAPLS::fx_pspline(
  modern_pollen$alpha,
  bin = 0.002,
  show_plot = TRUE
)

## End(Not run)

Get the distance between points

Description

Get the distance between points; the output will be used in get_pseudo.

Usage

get_distance(point, cpus = 4, test_mode = FALSE, test_it = 5)

Arguments

point

Each row represents a sampling site, the first column is longitude and the second column is latitude, both in decimal format.

cpus

Number of CPUs for simultaneous iterations to execute, check parallel::detectCores() for available CPUs on your machine.

test_mode

Boolean flag to execute the function with a limited number of iterations, test_it, for testing purposes only.

test_it

Number of iterations to use in the test mode.

Value

Distance matrix; the i-th row contains the distances between the i-th sampling site and all sampling sites.
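For readers who want to see what such a distance matrix contains, the sketch below builds one with the haversine formula in base R. The helper `haversine_matrix`, the spherical Earth radius of 6371 km and the coordinates are assumptions; the package relies on geosphere, whose distance function and radius may differ, so values need not agree exactly.

```r
# Hypothetical helper: great-circle distances (in metres) between all sites
haversine_matrix <- function(point, radius = 6371000) {
  lon <- point[, 1] * pi / 180               # first column: longitude
  lat <- point[, 2] * pi / 180               # second column: latitude
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Three made-up sites on the equator and the prime meridian
sites <- data.frame(Long = c(0, 0, 90), Lat = c(0, 1, 0))
d <- haversine_matrix(sites)
print(round(d / 1000))                       # distances in km
```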

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()

## End(Not run)

Get geographically and climatically close sites

Description

Get the sites which are both geographically and climatically close to the test site, which could result in pseudo-replication and inflate the cross-validation statistics. The output will be used in cv.pr.w.

Usage

get_pseudo(dist, x, cpus = 4, test_mode = FALSE, test_it = 5)

Arguments

dist

Distance matrix which contains the distance from other sites.

x

The modern climate values.

cpus

Number of CPUs for simultaneous iterations to execute, check parallel::detectCores() for available CPUs on your machine.

test_mode

Boolean flag to execute the function with a limited number of iterations, test_it, for testing purposes only.

test_it

Number of iterations to use in the test mode.

Value

The geographically and climatically close sites to each test site.
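The selection can be sketched as a joint filter on distance and climate difference. In the sketch below the helper `find_pseudo`, the 50 km distance threshold and the climate threshold of 2 % of the climate range are assumptions chosen for illustration, not values taken from this help page.

```r
# Hypothetical helper: for each site, indices of sites that are close both
# geographically (dist in metres) and climatically. Thresholds are assumed.
find_pseudo <- function(dist, x,
                        max_dist = 50000,
                        max_diff = 0.02 * diff(range(x))) {
  lapply(seq_along(x), function(i) {
    which(dist[i, ] < max_dist & abs(x - x[i]) < max_diff)
  })
}

# Three made-up sites: 1 and 2 are 10 km apart, 3 is far away
dist <- matrix(
  c(0, 10000, 900000,
    10000, 0, 905000,
    900000, 905000, 0),
  nrow = 3, byrow = TRUE
)
x <- c(10.0, 10.1, 20.0)                  # made-up climate values

pseudo <- find_pseudo(dist, x)
print(pseudo)
```

In this sketch each site appears in its own set because its distance to itself is zero, which is harmless when the test site is removed from training anyway.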

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
  point,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
pseudo_Tmin <- fxTWAPLS::get_pseudo(
  dist,
  modern_pollen$Tmin,
  cpus = 2, # Remove the following line
  test_mode = test_mode
) %>%
  fxTWAPLS::pb()

## End(Not run)

Show progress bar

Description

Show progress bar

Usage

pb(expr, ...)

Arguments

expr

R expression.

...

Arguments passed on to progressr::with_progress

cleanup: If TRUE, all progression handlers will be shut down at the end regardless of whether the progression is complete or not.
delay_terminal: If TRUE, output and conditions that may end up in the terminal will be delayed.
delay_stdout: If TRUE, standard output is captured and relayed at the end just before any captured conditions are relayed.
delay_conditions: A character vector specifying base::condition classes to be captured and relayed at the end after any captured standard output is relayed.
interrupts: Controls whether interrupts should be detected or not. If TRUE and an interrupt is signaled, progress handlers are asked to report on the current amount of progress when the evaluation was terminated by the interrupt, e.g. when a user pressed Ctrl-C in an interactive session, or a batch process was interrupted because it ran out of time. Note that it's optional for a progress handler to support this and only some do.
interval: (numeric) The minimum time (in seconds) between successive progression updates from handlers.
enable: (logical) If FALSE, then progress is not reported. The default is to report progress in interactive mode but not batch mode. See below for more details.

Value

Return data from the function called.

Plot the residuals

Description

Plot the residuals. The black line is the zero line; the red line is the locally estimated scatterplot smoothing (LOESS) curve, which shows the degree of local compression.

Usage

plot_residuals(train_output, col)

Arguments

train_output

Training output, can be the output of WA-PLS, WA-PLS with fx correction, TWA-PLS, or TWA-PLS with fx correction.

col

The column of the fitted values to plot, in other words, the number of components you want to use.

Value

Plotting status.
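A residual plot of this kind can be approximated in base R. The sketch below uses simulated values, takes residuals as fitted minus observed (the sign convention is an assumption) and draws the smooth with `stats::loess`; it illustrates the diagnostic and is not the package's ggplot2 code.

```r
set.seed(1)
x <- seq(-10, 10, length.out = 60)          # made-up observed climate
fitted <- 0.8 * x + rnorm(60, sd = 1)       # made-up, compressed fitted values
residuals <- fitted - x                     # assumed sign convention

smooth <- loess(residuals ~ x)
smooth_values <- predict(smooth)            # smoothed residual at each x

plot(x, residuals, xlab = "Observed", ylab = "Residual")
abline(h = 0, col = "black")                # zero line
lines(x, smooth_values, col = "red")        # local trend of the residuals
```

A red curve that slopes away from the zero line towards either end of the range indicates that extreme values are being pulled towards the centre.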

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

nsig <- 3 # This should be got from the random t-test of the cross validation
fxTWAPLS::plot_residuals(fit_tf_Tmin2, nsig)

## End(Not run)

Plot the training results

Description

Plot the training results. The black line is the 1:1 line; the red line is the linear regression line between the fitted values and x, which shows the degree of overall compression.

Usage

plot_train(train_output, col)

Arguments

train_output

Training output, can be the output of WA-PLS, WA-PLS with fx correction, TWA-PLS, or TWA-PLS with fx correction.

col

The column of the fitted values to plot, in other words, the number of components you want to use.

Value

Plotting status.
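The overall compression that this plot visualises can be summarised by a regression slope. The sketch below, using simulated values and assuming the fitted values are regressed on the observed ones, draws the 1:1 line and the regression line in base R and reports the slope; it is an illustration rather than the package's plotting code.

```r
set.seed(1)
x <- seq(-10, 10, length.out = 60)          # made-up observed climate
fitted <- 0.8 * x + rnorm(60, sd = 1)       # made-up, compressed fitted values

reg <- lm(fitted ~ x)                       # assumed direction of the regression
slope <- unname(coef(reg)[2])

plot(x, fitted, xlab = "Observed", ylab = "Fitted")
abline(a = 0, b = 1, col = "black")         # 1:1 line
abline(reg, col = "red")                    # regression line
print(slope)
```

A slope below 1 means the fitted values span a narrower range than the observations, which is what overall compression refers to.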

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
  taxa,
  modern_pollen$Tmin,
  nPLS = 5,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02
)

nsig <- 3 # This should be got from the random t-test of the cross validation
fxTWAPLS::plot_train(fit_tf_Tmin2, nsig)

## End(Not run)

Random t-test

Description

Perform a random t-test on the cross-validation results.

Usage

rand.t.test.w(cvoutput, n.perm = 999)

Arguments

cvoutput

Cross-validation output either from cv.w or cv.pr.w.

n.perm

The number of permutations used to obtain the p-value, which assesses whether using the current number of components is significantly different from using one component less.
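A permutation p-value of this kind can be sketched by randomly flipping the sign of each site's difference in squared prediction error between two consecutive numbers of components. Whether rand.t.test.w follows exactly this procedure is an assumption; the helper `rand_p_value` and the simulated predictions are hypothetical.

```r
# Hypothetical helper: sign-flip randomisation test on squared-error differences
rand_p_value <- function(obs, pred_prev, pred_cur, n.perm = 999) {
  d <- (obs - pred_prev)^2 - (obs - pred_cur)^2   # per-site gain in squared error
  t_obs <- mean(d)
  t_perm <- replicate(
    n.perm,
    mean(d * sample(c(-1, 1), length(d), replace = TRUE))
  )
  (sum(t_perm >= t_obs) + 1) / (n.perm + 1)
}

set.seed(1)
obs <- rnorm(30)           # made-up observed climate
pred_prev <- obs + 1       # large errors with one component less
pred_cur <- obs + 0.1      # small errors with the current number of components

p <- rand_p_value(obs, pred_prev, pred_cur)
print(p)
```

With more permutations the smallest attainable p-value, 1 / (n.perm + 1), becomes smaller, at the cost of run time.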

Value

A matrix of the statistics of the cross-validation results. Each component is described below:

R2

the coefficient of determination (the larger, the better the fit).

Avg.Bias

average bias.

Max.Bias

maximum bias.

Min.Bias

minimum bias.

RMSEP

root-mean-square error of prediction (the smaller, the better the fit).

delta.RMSEP

the percent change of RMSEP using the current number of components compared with using one component less.

p

assesses whether using the current number of components is significantly different from using one component less, which is used to choose the last significant number of components to avoid over-fitting.

The degree of overall compression is assessed by fitting a linear regression between the cross-validation results and the observed climate values.

Compre.b0: the intercept.
Compre.b1: the slope (the closer to 1, the less the overall compression).
Compre.b0.se: the standard error of the intercept.
Compre.b1.se: the standard error of the slope.
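Several of these statistics are straightforward to compute from observed and cross-validated values. The sketch below shows one plausible set of definitions in base R; the helper `cv_stats`, the sign convention of the bias (predicted minus observed), the direction of the compression regression and the data are assumptions, so numbers may differ from the package's output.

```r
# Hypothetical helper: summary statistics for one number of components
cv_stats <- function(obs, pred) {
  reg <- lm(pred ~ obs)                        # compression regression (assumed)
  c(
    R2 = cor(obs, pred)^2,
    Avg.Bias = mean(pred - obs),               # assumed sign convention
    RMSEP = sqrt(mean((pred - obs)^2)),
    Compre.b0 = unname(coef(reg)[1]),
    Compre.b1 = unname(coef(reg)[2])
  )
}

obs <- c(1, 2, 3, 4)                              # made-up observed climate
one_comp <- cv_stats(obs, c(1.5, 2.5, 2.5, 3.5))  # made-up CV predictions
two_comp <- cv_stats(obs, c(1.1, 2.1, 2.9, 3.9))

# Percent change of RMSEP relative to using one component less
delta_rmsep <- unname(
  (two_comp["RMSEP"] - one_comp["RMSEP"]) / one_comp["RMSEP"] * 100
)
print(rbind(one_comp, two_comp))
print(delta_rmsep)
```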

Examples

## Not run: 

## Random t-test
rand_pr_tf_Tmin2 <- fxTWAPLS::rand.t.test.w(cv_pr_tf_Tmin2, n.perm = 999)

# note: choose the last significant number of components based on the p-value,
# see details at Liu Mengmeng, Prentice Iain Colin, ter Braak Cajo J. F.,
# Harrison Sandy P.. 2020 An improved statistical approach for reconstructing
# past climates from biotic assemblages. Proc. R. Soc. A. 476: 20200346.
# <https://doi.org/10.1098/rspa.2020.0346>

## End(Not run)

Calculate Sample Specific Errors

Description

Calculate Sample Specific Errors

Usage

sse.sample(
  modern_taxa,
  modern_climate,
  fossil_taxa,
  trainfun,
  predictfun,
  nboot,
  nPLS,
  nsig,
  usefx = FALSE,
  fx_method = "bin",
  bin = NA,
  cpus = 4,
  seed = NULL,
  test_mode = FALSE,
  test_it = 5
)

Arguments

modern_taxa

The modern taxa abundance data, each row represents a sampling site, each column represents a taxon.

modern_climate

The modern climate value at each sampling site.

fossil_taxa

Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon.

trainfun

Training function you want to use, either WAPLS.w or TWAPLS.w.

predictfun

Predict function you want to use: if trainfun is WAPLS.w, then this should be WAPLS.predict.w; if trainfun is TWAPLS.w, then this should be TWAPLS.predict.w.

nboot

The number of bootstrap cycles you want to use.

nPLS

The number of components to be extracted.

nsig

The significant number of components to use to reconstruct past climates, this can be obtained from the cross-validation results.

usefx

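Boolean flag indicating whether or not to use fx correction.

fx_method

Binned or p-spline smoothed fx correction: if usefx = FALSE, this should be NA; otherwise, "bin" uses the fx function and "pspline" uses the fx_pspline function.

bin

Binwidth used to get fx, needed for both the binned and the p-spline smoothed methods. If usefx = FALSE, this should be NA.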

cpus

Number of CPUs for simultaneous iterations to execute, check parallel::detectCores() for available CPUs on your machine.

seed

Seed for reproducibility.

test_mode

Boolean flag to execute the function with a limited number of iterations, test_it, for testing purposes only.

test_it

Number of iterations to use in the test mode.

Value

The bootstrapped standard error for each site.
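The bootstrap idea behind a sample-specific error can be sketched with a simple stand-in model: resample the modern sites with replacement, retrain, predict every fossil sample, and take the spread of the predictions across bootstrap cycles. The helpers `wa_train` and `wa_predict` (plain weighted averaging) and the simulated data are hypothetical, and treating the standard deviation across cycles as the error is an assumption about what the package reports.

```r
# Stand-in model (hypothetical): plain weighted averaging
wa_train <- function(taxa, climate) {
  as.numeric(crossprod(taxa, climate)) / colSums(taxa)   # taxon optima
}
wa_predict <- function(optimum, taxa) {
  as.numeric(taxa %*% optimum) / rowSums(taxa)
}

set.seed(1)
n <- 40
climate <- runif(n, 0, 20)                        # made-up modern climate
taxa <- matrix(runif(n * 5, 0.1, 1), nrow = n)    # made-up modern abundances
fossil <- matrix(runif(3 * 5, 0.1, 1), nrow = 3)  # three made-up fossil samples
nboot <- 50

# One column of predictions per bootstrap cycle
boot_pred <- replicate(nboot, {
  idx <- sample(n, replace = TRUE)                # resample modern sites
  wa_predict(wa_train(taxa[idx, , drop = FALSE], climate[idx]), fossil)
})

sse <- apply(boot_pred, 1, sd)                    # one error per fossil sample
print(sse)
```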

Examples

## Not run: 
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")

# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]

# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]

## SSE
nboot <- 5 # Recommended 1000
nsig <- 3 # This should be got from the random t-test of the cross validation

sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
  modern_taxa = taxa,
  modern_climate = modern_pollen$Tmin,
  fossil_taxa = core,
  trainfun = fxTWAPLS::TWAPLS.w2,
  predictfun = fxTWAPLS::TWAPLS.predict.w,
  nboot = nboot,
  nPLS = 5,
  nsig = nsig,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2,
  seed = 1
)

# Run with progress bar
`%>%` <- magrittr::`%>%`
sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
  modern_taxa = taxa,
  modern_climate = modern_pollen$Tmin,
  fossil_taxa = core,
  trainfun = fxTWAPLS::TWAPLS.w2,
  predictfun = fxTWAPLS::TWAPLS.predict.w,
  nboot = nboot,
  nPLS = 5,
  nsig = nsig,
  usefx = TRUE,
  fx_method = "bin",
  bin = 0.02,
  cpus = 2,
  seed = 1
) %>% fxTWAPLS::pb()

## End(Not run)
