Type: Package
Title: Quality Control–based Robust LOESS Signal Correction
Version: 0.1.3
Description: An R implementation of quality control–based robust LOESS (local polynomial regression fitting) signal correction for metabolomics data analysis, described in Dunn, W., Broadhurst, D., Begley, P. et al. (2011) <doi:10.1038/nprot.2011.335>. The optimisation of LOESS's span parameter using generalized cross-validation (GCV) is provided as an option. In addition to signal correction, 'qcrlscR' includes some utility functions like batch shifting and data filtering.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Depends: R (≥ 3.0.0)
URL: https://github.com/wanchanglin/qcrlscR
BugReports: https://github.com/wanchanglin/qcrlscR/issues
Encoding: UTF-8
RoxygenNote: 7.3.2
LazyLoad: yes
LazyData: yes
ZipData: No
NeedsCompilation: no
Packaged: 2025-04-17 16:17:18 UTC; wanch
Author: Wanchang Lin [aut, cre], Warwick Dunn [aut]
Maintainer: Wanchang Lin <wanchanglin@hotmail.com>
Repository: CRAN
Date/Publication: 2025-04-22 13:20:02 UTC

qcrlscR: Quality Control–based Robust LOESS Signal Correction

Description

qcrlscR implements quality control–based robust LOESS signal correction for metabolomics data analysis.

Main functions

qcrlscR provides functions for signal correction of metabolomics data. Users can optimise the LOESS span within the range 0.05 to 0.95. The package also provides simple functions for missing value filtering: if the data set used for signal correction has a large proportion of missing values, users need to filter the data based on the percentage of missing values. A straightforward batch shifting function is provided for removing batch effects that are still present after the QC-RLSC process. Users can assess the quality of the signal correction with PCA plots (unsupervised) or PLS and LDA plots (supervised) from other R packages.

Package context

This package does not use "tidyverse"; it relies only on base R for simplicity and ease of maintenance. A vignette (in R and PDF format) is located in \qcrlsc\examples.

Author(s)

Maintainer: Wanchang Lin wanchanglin@hotmail.com

Authors:

  • Warwick Dunn

See Also

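Useful links:

  • https://github.com/wanchanglin/qcrlscR

  • Report bugs at https://github.com/wanchanglin/qcrlscR/issues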


Batch shifting

Description

Remove batch effects within each block.

Usage

batch.shift(x, y, method = "mean", overall_average = TRUE)

Arguments

x

a data matrix.

y

a categorical vector (factor) of batch/block information.

method

method for shifting.

overall_average

a logical value to indicate whether or not an overall average will be added after shifting.

Value

a shifted data matrix.

References

Silvia Wagner et al., Tools in Metabonomics: An Integrated Validation Approach for LC-MS Metabolic Profiling of Mercapturic Acids in Human Urine. Anal. Chem., 2007, 79 (7), pp 2918-2926, DOI: 10.1021/ac062153w

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## batch shifting
cls.bl <- factor(meta$batch)
res <- batch.shift(data, cls.bl, overall_average = TRUE)
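
The idea behind mean-based batch shifting can be sketched in base R. This is an illustration only, not the package's implementation: the helper name shift_by_batch and the toy data are hypothetical, and it assumes that method = "mean" subtracts each batch's feature means and that overall_average = TRUE adds the overall feature means back.

```r
# Hypothetical helper (assumed behaviour, not taken from the package source):
# centre every feature within each batch, then optionally add the overall
# feature mean back so values stay on the original scale.
shift_by_batch <- function(x, y, overall_average = TRUE) {
  x <- as.matrix(x)
  y <- factor(y)
  overall <- colMeans(x, na.rm = TRUE)
  for (lev in levels(y)) {
    idx <- y == lev
    batch_mean <- colMeans(x[idx, , drop = FALSE], na.rm = TRUE)
    x[idx, ] <- sweep(x[idx, , drop = FALSE], 2, batch_mean, "-")
  }
  if (overall_average) x <- sweep(x, 2, overall, "+")
  x
}

# Toy data: two batches, the second offset by +5 in every feature
set.seed(42)
toy <- matrix(rnorm(40), nrow = 10, ncol = 4)
toy[6:10, ] <- toy[6:10, ] + 5
batch <- rep(c("b1", "b2"), each = 5)

shifted <- shift_by_batch(toy, batch)
# Per-batch means of the first feature now coincide
tapply(shifted[, 1], batch, mean)
```

With overall_average = TRUE in this sketch, the overall feature means of the shifted matrix equal those of the input, so only the between-batch offsets are removed.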

man_qc: test data for QC-RLSC

Description

This HPLC data set includes 4 batches with missing values.

Usage

man_qc

Format

A list with data matrix and meta data:

data

A data frame with 462 replicates (row) and 656 features (column)

meta

A data frame with 2 columns:

  • batch: 4 batches

  • sample_type: QC and Sample

Examples

man_qc
t(sapply(man_qc, dim))
## Select data matrix and meta data
data <- man_qc$data
meta <- man_qc$meta
## Select batches and data types
cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)

Filtering variable based on the percentage of missing values

Description

This function calculates the percentage of missing values for each feature and keeps those features whose missing value percentage is below the specified threshold.

Usage

mv.filter(x, thres = 0.3)

Arguments

x

a data matrix. The columns are features.

thres

threshold of missing values. Features with a missing value proportion below this threshold will be kept. The value must be between 0 and 1.

Value

a list whose contents include dat, the filtered data.

See Also

Other missing value processing: mv.filter.qc(), mv.perc()

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)
## missing values filtering
tmp <- mv.filter(data, thres = 0.15)
data_f <- tmp$dat
## compare
dim(data_f)
dim(data)
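
The filtering rule described above can be sketched in base R on a small toy data frame. The helper name filter_by_missing and the list elements idx and rate are hypothetical; only the dat element is taken from the example above.

```r
# Hypothetical helper mirroring the described behaviour: keep the columns
# whose proportion of NA values is below `thres`.
filter_by_missing <- function(x, thres = 0.3) {
  stopifnot(thres >= 0, thres <= 1)
  mv_rate <- colMeans(is.na(x))   # proportion of missing values per feature
  keep <- mv_rate < thres
  list(dat = x[, keep, drop = FALSE], idx = keep, rate = mv_rate)
}

toy <- data.frame(
  f1 = c(1, 2, 3, 4, 5),          # no missing values
  f2 = c(NA, 2, 3, 4, 5),         # 20% missing
  f3 = c(NA, NA, NA, 4, 5)        # 60% missing
)
res <- filter_by_missing(toy, thres = 0.3)
names(res$dat)                    # f3 exceeds the threshold and is dropped
```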

Data filtering based on "qc" missing values

Description

Data filtering based on "qc" missing values

Usage

mv.filter.qc(x, y, thres = 0.3)

Arguments

x

a data matrix.

y

a character vector of sample types with values "sample", "qc" and "blank".

thres

threshold of missing values. Features with a missing value proportion below this threshold will be kept.

Value

a list whose contents include dat, the filtered data.

See Also

Other missing value processing: mv.filter(), mv.perc()

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)
## missing values filtering based on QC
cls.qc <- factor(meta$sample_type)
tmp <- mv.filter.qc(data, cls.qc, thres = 0.15)
data_f <- tmp$dat
## compare
dim(data_f)
dim(data)
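
QC-based filtering differs from plain filtering in that the missing value rate is computed on the QC rows only. A minimal base R sketch follows; the helper name filter_by_qc_missing, its qc_label argument and the case-insensitive label matching are assumptions for illustration, not the package's code.

```r
# Hypothetical helper: compute the missing value proportion per feature on
# QC rows only, then keep features below `thres` for all rows.
filter_by_qc_missing <- function(x, y, thres = 0.3, qc_label = "qc") {
  is_qc <- tolower(as.character(y)) == qc_label   # assumed: match ignores case
  mv_rate <- colMeans(is.na(x[is_qc, , drop = FALSE]))
  keep <- mv_rate < thres
  list(dat = x[, keep, drop = FALSE], idx = keep)
}

toy <- data.frame(
  f1 = c(1, NA, 3, NA, 5, NA),    # complete in QC rows, missing in samples
  f2 = c(NA, 2, NA, 4, NA, 6)     # missing in every QC row
)
type <- c("qc", "sample", "qc", "sample", "qc", "sample")
res <- filter_by_qc_missing(toy, type, thres = 0.3)
names(res$dat)                    # only f1 is kept
```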

Missing value percentage

Description

Calculate missing value percentage.

Usage

mv.perc(x)

Arguments

x

a vector, matrix or data frame.

Value

missing value percentage.

See Also

Other missing value processing: mv.filter(), mv.filter.qc()

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)

Univariate outlier detection

Description

Perform outlier detection using a univariate method.

Usage

outl.det.u(x, method = c("percentile", "median"))

Arguments

x

a numeric vector.

method

method for univariate outlier detection. Only percentile and median are supported.

Details

Value

a logical vector.

References

Wilcox R R, Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy, Springer 2010 (2nd edition), pages 31-35.

Examples

x <- c(2, 3, 4, 5, 6, 7, NA, 9, 50, 50)
outl.det.u(x, "percentile")
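
For orientation, one widely used median-based criterion is the MAD-median rule, which flags a value when its distance from the median, scaled by the MAD, exceeds 2.24. The sketch below is base R only; the helper name mad_median_outlier is hypothetical, and whether method = "median" in this package uses the same rule and cutoff is an assumption.

```r
# MAD-median rule (assumed to resemble the "median" method; not taken from
# the package source). stats::mad() already applies the 1.4826 consistency
# constant, so the scaled distance is compared with the cutoff directly.
mad_median_outlier <- function(x, cutoff = 2.24) {
  med <- median(x, na.rm = TRUE)
  abs(x - med) / mad(x, na.rm = TRUE) > cutoff
}

x <- c(2, 3, 4, 5, 6, 7, NA, 9, 50, 50)
flag <- mad_median_outlier(x)
which(flag)   # positions of the two values of 50; the NA stays NA in `flag`
```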

QC based robust LOESS signal correction (QC-RLSC)

Description

QC based robust LOESS (locally estimated scatterplot smoothing) signal correction (QC-RLSC)

Usage

qc.rlsc(x, y, method = c("subtract", "divide"), opti = TRUE, ...)

Arguments

x

A data frame with samples (row) and variables (column).

y

A vector of sample type labels with values "qc" and "sample".

method

Data scaling method.

opti

A logical value indicating whether or not to optimise 'span'.

...

Other parameters for 'loess'.

Details

This function uses only sample type information (QC or Sample) for signal correction; it does not require batch information. Users may apply a batch elimination routine, such as batch.shift() in this package or others, to remove batch effects after signal correction.
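
The principle for a single feature can be sketched in base R: fit a robust LOESS curve to the QC intensities against injection order, evaluate it at every injection, and remove it. The simulated data, the fixed span of 0.75 and the re-centring on the QC median are assumptions for illustration, not the package's exact procedure.

```r
set.seed(1)
n     <- 60
ord   <- seq_len(n)                            # injection order
is_qc <- ord %% 5 == 1 | ord == n              # QC every 5th injection, plus the last
x     <- 10 + 0.02 * ord + rnorm(n, sd = 0.1)  # one feature with linear drift

# Fit a robust LOESS curve to the QC intensities only
qc  <- data.frame(ord = ord[is_qc], val = x[is_qc])
fit <- loess(val ~ ord, data = qc, span = 0.75, degree = 2,
             family = "symmetric")

# Evaluate the drift curve at every injection (QCs and samples)
drift <- predict(fit, newdata = data.frame(ord = ord))

# Two assumed correction variants, both re-centred on the QC median
ref        <- median(qc$val)
x_subtract <- x - drift + ref
x_divide   <- x / drift * ref

# Residual trend over injection order before and after correction
c(raw       = unname(coef(lm(x ~ ord))[2]),
  corrected = unname(coef(lm(x_subtract ~ ord))[2]))
```

The first and last injections are QCs in this toy design so that every sample lies inside the range covered by the fitted curve.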

If the data matrix has missing values, users should filter the data based on the percentage of missing values. No missing value imputation is needed.

An option is also provided to optimise the LOESS span in the range 0.05 to 0.95. The R code is adapted from https://bit.ly/3zBo3Qn.
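
A span search by generalized cross-validation can be sketched as follows. The helper name loess_gcv, the simulated QC data and the grid step of 0.05 are assumptions; the score used here is the standard GCV form n * RSS / (n - tr(L))^2, which may differ in detail from the code the package adapts.

```r
# GCV score for a fitted loess object: n * RSS / (n - tr(L))^2,
# where tr(L) is the trace of the smoother matrix stored in `trace.hat`.
loess_gcv <- function(fit) {
  n   <- fit$n
  rss <- sum(residuals(fit)^2)
  n * rss / (n - fit$trace.hat)^2
}

set.seed(2)
qc <- data.frame(ord = seq(1, 101, by = 4))    # 26 simulated QC injections
qc$val <- 10 + sin(qc$ord / 20) + rnorm(nrow(qc), sd = 0.1)

spans <- seq(0.05, 0.95, by = 0.05)
score <- vapply(spans, function(s) {
  # Very small spans can fail or warn with few QC points; score those as NA
  tryCatch(
    loess_gcv(loess(val ~ ord, data = qc, span = s, degree = 2)),
    warning = function(w) NA_real_,
    error   = function(e) NA_real_
  )
}, numeric(1))

best_span <- spans[which.min(score)]           # which.min() skips NA scores
best_span
```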

Value

A corrected data frame.

References

Dunn et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols 6, 1060–1083 (2011)

See Also

Other QC-RLSC function: qc.rlsc.wrap()

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta

cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)


## apply QC-RLSC with optimisation of 'span'
res_1 <- qc.rlsc(data, cls.qc, method = "subtract", opti = TRUE)

## apply QC-RLSC without optimisation of 'span'
res_2 <- qc.rlsc(data, cls.qc, method = "subtract", opti = FALSE)


Wrapper function for QC-RLSC

Description

Wrapper function for QC-RLSC

Usage

qc.rlsc.wrap(
  dat,
  cls.qc,
  cls.bl,
  method = c("subtract", "divide"),
  intra = FALSE,
  opti = TRUE,
  log10 = TRUE,
  outl = TRUE,
  shift = TRUE,
  ...
)

Arguments

dat

A data frame with samples (row) and variables (column).

cls.qc

A vector of sample type labels with values "qc" and "sample".

cls.bl

A vector of batch indicator labels.

method

Data scaling method. Supports "subtract" and "divide".

intra

A logical value indicating whether signal correction is performed inside each batch ("intra-batch") or not ("inter-batch").

opti

A logical value indicating whether or not 'span' parameters are optimised.

log10

A logical value indicating whether or not the data set is log10-transformed. If the transformation is applied, the inverse transformation is performed afterwards.

outl

A logical value indicating whether or not QC outlier detection is employed. If TRUE, QC outliers are replaced by the median of the QC samples.

shift

A logical value indicating whether or not batch shift is applied after signal correction.

...

Other parameters for 'loess'.

Value

A corrected data frame.

See Also

Other QC-RLSC function: qc.rlsc()

Examples

names(man_qc)
data <- man_qc$data
meta <- man_qc$meta

cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)

## apply QC-RLSC wrapper function
method <- "divide"     # "subtract"
intra <- TRUE
opti <- TRUE
log10 <- TRUE
outl <- TRUE
shift <- TRUE


res <- qc.rlsc.wrap(data, cls.qc, cls.bl, method, intra, opti, log10,
                    outl, shift)
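
To make the roles of intra, log10 and shift concrete, the base R sketch below chains the corresponding steps for one simulated feature: log10 transform, robust LOESS correction inside each batch, alignment of batch means, and back-transformation. The helper name correct_log, the order of the steps and the toy data are assumptions for illustration; QC outlier handling (outl) is omitted.

```r
set.seed(3)
n_per <- 40
batch <- rep(c("b1", "b2"), each = n_per)
ord   <- seq_along(batch)                      # global injection order
pos   <- rep(seq_len(n_per), times = 2)        # position within each batch
is_qc <- pos %% 5 == 1 | pos == n_per          # QCs, incl. first and last per batch
# Multiplicative drift that restarts in each batch
x <- 1000 * (1 + 0.01 * pos) * exp(rnorm(length(ord), sd = 0.02))

# Hypothetical corrector on the log scale ("subtract" variant, re-centred
# on the QC median)
correct_log <- function(lx, ord, is_qc) {
  qc  <- data.frame(ord = ord[is_qc], val = lx[is_qc])
  fit <- loess(val ~ ord, data = qc, span = 0.75, family = "symmetric")
  lx - predict(fit, newdata = data.frame(ord = ord)) + median(qc$val)
}

lx  <- log10(x)                                # log10 = TRUE
out <- lx
for (b in unique(batch)) {                     # intra = TRUE: one fit per batch
  i <- batch == b
  out[i] <- correct_log(lx[i], ord[i], is_qc[i])
}
out <- out - ave(out, batch) + mean(out)       # shift = TRUE: align batch means
corrected <- 10^out                            # reverse the log10 transform

# Within-batch trend before and after correction (batch b1)
i <- batch == "b1"
c(raw = cor(x[i], pos[i]), corrected = cor(corrected[i], pos[i]))
```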