Type: | Package |
Title: | Quality Control–based Robust LOESS Signal Correction |
Version: | 0.1.3 |
Description: | An R implementation of quality control–based robust LOESS (local polynomial regression fitting) signal correction for metabolomics data analysis, described in Dunn, W., Broadhurst, D., Begley, P. et al. (2011) <doi:10.1038/nprot.2011.335>. Optimisation of the LOESS span parameter using generalized cross-validation (GCV) is provided as an option. In addition to signal correction, 'qcrlscR' includes utility functions such as batch shifting and data filtering. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Depends: | R (≥ 3.0.0) |
URL: | https://github.com/wanchanglin/qcrlscR |
BugReports: | https://github.com/wanchanglin/qcrlscR/issues |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
LazyLoad: | yes |
LazyData: | yes |
ZipData: | No |
NeedsCompilation: | no |
Packaged: | 2025-04-17 16:17:18 UTC; wanch |
Author: | Wanchang Lin [aut, cre], Warwick Dunn [aut] |
Maintainer: | Wanchang Lin <wanchanglin@hotmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-04-22 13:20:02 UTC |
qcrlscR: Quality Control–based Robust LOESS Signal Correction
Description
qcrlscR implements quality control–based robust LOESS signal correction for metabolomics data analysis.
Main functions
The qcrlscR package provides functions for metabolomics data signal
correction. It allows users to optimise the LOESS span in the range between
0.05 and 0.95. The package also provides simple functions for missing value
filtering: if the data set used for signal correction has a large proportion
of missing values, users should first filter the data based on the
percentage of missing values. A straightforward batch shifting function is
provided to remove batch effects that are still present after the QC-RLSC
process. Users can assess the quality of the signal correction with PCA
plots (unsupervised) or PLS and LDA plots (supervised) from other R
packages.
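As a minimal sketch of the PCA check mentioned above, the following uses prcomp() from base R's stats package on a made-up matrix with an artificial batch offset. The object names (x, batch, scores) and the simulated data are assumptions for illustration only; with real data, the corrected matrix returned by the package would take the place of x.

```r
set.seed(1)

## Simulated data: 40 samples, 10 features, two batches (illustrative only)
n <- 40
p <- 10
batch <- factor(rep(c("b1", "b2"), each = n / 2))
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[batch == "b2", ] <- x[batch == "b2", ] + 3   # artificial batch offset

## PCA on centred and scaled features
pca <- prcomp(x, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]
var_expl <- pca$sdev^2 / sum(pca$sdev^2)

## Score plot coloured by batch; clear separation suggests a remaining batch effect
if (interactive()) {
  plot(scores, col = as.integer(batch), pch = 19,
       xlab = sprintf("PC1 (%.0f%%)", 100 * var_expl[1]),
       ylab = sprintf("PC2 (%.0f%%)", 100 * var_expl[2]))
}
```

If the batches still form separate clusters in the score plot after signal correction, batch shifting may be worth applying.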
Package context
This package does not use "tidyverse"; it relies only on base R for
simplicity and ease of maintenance. A vignette (in R and PDF format) is
located in \qcrlsc\examples.
Author(s)
Maintainer: Wanchang Lin wanchanglin@hotmail.com
Authors:
Warwick Dunn
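See Also
Useful links:
https://github.com/wanchanglin/qcrlscR
Report bugs at https://github.com/wanchanglin/qcrlscR/issues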
Batch shifting
Description
Remove batch effects within each block.
Usage
batch.shift(x, y, method = "mean", overall_average = TRUE)
Arguments
x | a data matrix. |
y | a categorical vector (e.g. a factor) giving the batch/block of each sample. |
method | method for shifting. The default is "mean". |
overall_average | a logical value indicating whether or not the overall average is added after shifting. |
Value
a shifted data matrix.
References
Silvia Wagner et al., Tools in Metabonomics: An Integrated Validation Approach for LC-MS Metabolic Profiling of Mercapturic Acids in Human Urine. Anal. Chem., 2007, 79 (7), pp 2918-2926, DOI: 10.1021/ac062153w
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## batch shifting
cls.bl <- factor(meta$batch)
res <- batch.shift(data, cls.bl, overall_average = TRUE)
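The idea behind mean-based batch shifting can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper name batch_shift_sketch is hypothetical, and the sketch assumes that each feature is centred on its batch mean and that the overall feature mean is then added back when overall_average is TRUE.

```r
## Hypothetical helper: centre every feature within each batch, then
## optionally add the overall feature mean back.
batch_shift_sketch <- function(x, y, overall_average = TRUE) {
  x <- as.matrix(x)
  y <- factor(y)
  overall <- colMeans(x, na.rm = TRUE)
  out <- x
  for (lev in levels(y)) {
    idx <- y == lev
    centre <- colMeans(x[idx, , drop = FALSE], na.rm = TRUE)
    out[idx, ] <- sweep(x[idx, , drop = FALSE], 2, centre, "-")
  }
  if (overall_average) out <- sweep(out, 2, overall, "+")
  out
}

## Toy data: two batches of three samples, two features
toy <- matrix(c(1, 2, 3, 11, 12, 13,
                5, 5, 5, 9, 9, 9), ncol = 2)
bl <- rep(c("a", "b"), each = 3)
shifted <- batch_shift_sketch(toy, bl)
```

After shifting, both batches share the same mean for every feature, while the spread within each batch is unchanged.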
man_qc: test data for QC-RLSC
Description
This HPLC data set includes 4 batches with missing values.
Usage
man_qc
Format
A list with data matrix and meta data:
- data
A data frame with 462 replicates (row) and 656 features (column)
- meta
A data frame with 2 columns:
batch: 4 batches
sample_type: QC and Sample
Examples
man_qc
t(sapply(man_qc, dim))
## Select data matrix and meta data
data <- man_qc$data
meta <- man_qc$meta
## Select batches and data types
cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)
Filtering variable based on the percentage of missing values
Description
This function calculates the percentage of missing values for each feature and keeps those features whose missing value percentage is below the specified threshold.
Usage
mv.filter(x, thres = 0.3)
Arguments
x | a data matrix. The columns are features. |
thres | threshold of missing values. Features with a missing value percentage below this threshold are kept. The value must be between 0 and 1. |
Value
a list with contents:
- dat: the filtered data matrix.
- idx: a logical vector indicating which features are kept.
See Also
Other missing value processing: mv.filter.qc(), mv.perc()
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)
## missing values filtering
tmp <- mv.filter(data, thres = 0.15)
data_f <- tmp$dat
## compare
dim(data_f)
dim(data)
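The filtering rule described above can be sketched in base R as follows. The helper name mv_filter_sketch is hypothetical, and the sketch assumes the missing value rate is expressed as a proportion between 0 and 1 and compared with thres using a strict "less than".

```r
## Hypothetical helper: keep features whose proportion of NA is below 'thres'
mv_filter_sketch <- function(x, thres = 0.3) {
  perc <- colMeans(is.na(x))   # proportion of missing values per feature
  idx <- perc < thres          # TRUE for features to keep
  list(dat = x[, idx, drop = FALSE], idx = idx)
}

## Toy data: f1 has 25% missing, f2 has 75% missing, f3 has none
toy <- data.frame(f1 = c(1, NA, 3, 4),
                  f2 = c(NA, NA, NA, 4),
                  f3 = c(1, 2, 3, 4))
res <- mv_filter_sketch(toy, thres = 0.3)
names(res$dat)
```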
Data filtering based on "qc" missing values
Description
Filter features based on the percentage of missing values in the "qc" samples.
Usage
mv.filter.qc(x, y, thres = 0.3)
Arguments
x | a data matrix. |
y | a character vector of sample types with values "sample", "qc" and "blank". |
thres | threshold of missing values. Features with a missing value percentage below this threshold are kept. |
Value
a list with contents:
- dat: the filtered data matrix.
- idx: a logical vector indicating which features are kept.
See Also
Other missing value processing: mv.filter(), mv.perc()
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)
## missing values filtering based on QC
cls.qc <- factor(meta$sample_type)
tmp <- mv.filter.qc(data, cls.qc, thres = 0.15)
data_f <- tmp$dat
## compare
dim(data_f)
dim(data)
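A QC-restricted variant of the same idea can be sketched in base R. The helper name mv_filter_qc_sketch and the exact matching of the label "qc" are assumptions for illustration; the sketch computes the missing value rate on the QC rows only and then filters all rows by that rate.

```r
## Hypothetical helper: compute the NA proportion on QC rows only,
## then keep the features (for all rows) whose QC rate is below 'thres'
mv_filter_qc_sketch <- function(x, y, thres = 0.3, qc_label = "qc") {
  qc_rows <- as.character(y) == qc_label
  perc <- colMeans(is.na(x[qc_rows, , drop = FALSE]))
  idx <- perc < thres
  list(dat = x[, idx, drop = FALSE], idx = idx)
}

## Toy data: f1 is complete in the QCs, f2 is mostly missing in the QCs
type <- c("qc", "sample", "qc", "sample", "qc", "sample")
toy <- data.frame(f1 = c(1, NA, 2, NA, 3, NA),
                  f2 = c(NA, 1, NA, 2, 3, 4))
res_qc <- mv_filter_qc_sketch(toy, type, thres = 0.3)
names(res_qc$dat)
```

Note that f1 is kept even though half of its values are missing overall, because none of them are missing in the QC samples.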
Missing value percentage
Description
Calculate missing value percentage.
Usage
mv.perc(x)
Arguments
x | a vector, matrix or data frame. |
Value
missing value percentage.
See Also
Other missing value processing: mv.filter(), mv.filter.qc()
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
## check missing value rates
tail(sort(mv.perc(data)), 20)
Univariate outlier detection
Description
Perform outlier detection using univariate method.
Usage
outl.det.u(x, method = c("percentile", "median"))
Arguments
x | a numeric vector. |
method | method for univariate outlier detection. Only "percentile" and "median" are supported. |
Details
An observation is flagged as an outlier when:
- median: the absolute difference between the observation and the sample median is larger than 2 times the Median Absolute Deviation divided by 0.6745.
- percentile: it is either smaller than the 1st quartile minus 1.5 times the IQR, or larger than the 3rd quartile plus 1.5 times the IQR.
Value
a logical vector.
References
Wilcox R R, Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy, Springer 2010 (2nd edition), pages 31-35.
Examples
x <- c(2, 3, 4, 5, 6, 7, NA, 9, 50, 50)
outl.det.u(x, "percentile")
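The two rules given in Details can be written out in base R as a minimal sketch. The helper name outl_sketch is hypothetical; the sketch assumes missing values are ignored when computing the median, MAD and quartiles, and that missing observations stay NA in the result, which may differ from what outl.det.u() returns.

```r
## Hypothetical helper implementing the two rules from the Details section
outl_sketch <- function(x, method = c("percentile", "median")) {
  method <- match.arg(method)
  if (method == "percentile") {
    q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
    iqr <- q[2] - q[1]
    out <- x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
  } else {
    med <- median(x, na.rm = TRUE)
    ## Median Absolute Deviation divided by 0.6745
    madn <- median(abs(x - med), na.rm = TRUE) / 0.6745
    out <- abs(x - med) > 2 * madn
  }
  unname(out)
}

x <- c(2, 3, 4, 5, 6, 7, NA, 9, 50, 50)
which(outl_sketch(x, "percentile"))
which(outl_sketch(x, "median"))
```

For this vector both rules flag only the two values of 50 (positions 9 and 10).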
QC based robust LOESS signal correction (QC-RLSC)
Description
QC based robust LOESS (locally estimated scatterplot smoothing) signal correction (QC-RLSC)
Usage
qc.rlsc(x, y, method = c("subtract", "divide"), opti = TRUE, ...)
Arguments
x | A data frame with samples (row) and variables (column). |
y | A vector of sample types with values "qc" and "sample". |
method | Data scaling method, either "subtract" or "divide". |
opti | A logical value indicating whether or not to optimise 'span'. |
... | Other parameters for 'loess'. |
Details
This function uses only sample type information (QC or Sample) for signal
correction. It does not require batch information. Users may apply a batch
elimination routine, such as batch.shift() in this package or others, to
remove batch effects after signal correction.
If the data matrix has missing values, users should filter the data based on the percentage of missing values. No missing value imputation is needed.
An option is also provided to optimise the LOESS span in the range between
0.05 and 0.95. The R code is modified from https://bit.ly/3zBo3Qn.
Value
A corrected data frame.
References
Dunn et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols 6, 1060–1083 (2011)
See Also
Other QC-RLSC function:
qc.rlsc.wrap()
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)
## apply QC-RLSC with optimisation of 'span'
res_1 <- qc.rlsc(data, cls.qc, method = "subtract", opti = TRUE)
## apply QC-RLSC without optimisation of 'span'
res_2 <- qc.rlsc(data, cls.qc, method = "subtract", opti = FALSE)
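To illustrate the general idea for a single feature, the sketch below fits a robust LOESS curve to the QC intensities against injection order, picks the span by a GCV grid search, and then corrects all injections by subtraction or division. It is a sketch under stated assumptions, not the package's code: the simulated data, the helper gcv_loess, the narrower span grid (0.2 to 0.95, chosen because the toy set has few QCs) and the re-centring on the QC median are all illustrative choices.

```r
set.seed(42)

## Toy run: 201 injections, a QC at every 4th position (first and last are QCs)
ord <- 1:201
is_qc <- ord %in% seq(1, 201, by = 4)
raw <- 10 + 0.01 * ord + rnorm(length(ord), sd = 0.1)   # linear drift + noise
qc_df <- data.frame(ord = ord[is_qc], y = raw[is_qc])

## GCV score of a robust LOESS fit for a given span (hypothetical helper)
gcv_loess <- function(span, df) {
  fit <- tryCatch(
    suppressWarnings(
      loess(y ~ ord, data = df, span = span, family = "symmetric")
    ),
    error = function(e) NULL
  )
  if (is.null(fit)) return(Inf)
  n <- nrow(df)
  score <- mean(fit$residuals^2) / (1 - fit$trace.hat / n)^2
  if (is.finite(score)) score else Inf
}

## Grid search for the span with the smallest GCV score
spans <- seq(0.2, 0.95, by = 0.05)
scores <- sapply(spans, gcv_loess, df = qc_df)
best_span <- spans[which.min(scores)]

## Fit on the QCs only, then predict the drift for every injection
fit <- loess(y ~ ord, data = qc_df, span = best_span, family = "symmetric")
trend <- predict(fit, newdata = data.frame(ord = ord))

## Two correction variants, re-centred on the QC median (an assumption)
corrected_sub <- raw - trend + median(qc_df$y)
corrected_div <- raw / trend * median(qc_df$y)
```

Because the first and last injections are QCs, every injection lies inside the fitted range, so predict() does not return NA at the ends of the run.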
Wrapper function for QC-RLSC
Description
Wrapper function for QC-RLSC
Usage
qc.rlsc.wrap(
  dat,
  cls.qc,
  cls.bl,
  method = c("subtract", "divide"),
  intra = FALSE,
  opti = TRUE,
  log10 = TRUE,
  outl = TRUE,
  shift = TRUE,
  ...
)
Arguments
dat | A data frame with samples (row) and variables (column). |
cls.qc | A vector of sample types with values "qc" and "sample". |
cls.bl | A vector of batch indicators. |
method | Data scaling method. "subtract" and "divide" are supported. |
intra | A logical value indicating whether signal correction is performed inside each batch ("intra-batch") or not ("inter-batch"). |
opti | A logical value indicating whether or not the 'span' parameter is optimised. |
log10 | A logical value indicating whether or not the data set is log10 transformed. If the transformation is applied, it is reversed after correction. |
outl | A logical value indicating whether or not QC outlier detection is employed. If TRUE, QC outliers are replaced by the median of the QCs. |
shift | A logical value indicating whether or not batch shift is applied after signal correction. |
... | Other parameters for 'loess'. |
Value
A corrected data frame.
See Also
Other QC-RLSC function:
qc.rlsc()
Examples
names(man_qc)
data <- man_qc$data
meta <- man_qc$meta
cls.qc <- factor(meta$sample_type)
cls.bl <- factor(meta$batch)
## apply QC-RLSC wrapper function
method <- "divide" # "subtract"
intra <- TRUE
opti <- TRUE
log10 <- TRUE
outl <- TRUE
shift <- TRUE
res <- qc.rlsc.wrap(data, cls.qc, cls.bl, method, intra, opti, log10,
                    outl, shift)
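The effect of the log10 and outl options can be illustrated for the QC values of a single feature with a small base R sketch. The intensities are made up, and the details are assumptions: the IQR rule is used for detection, and the replacement value is the median of all QC values on the log10 scale. The wrapper itself may differ in both respects.

```r
## Made-up QC intensities for one feature; the fifth value is an obvious outlier
qc_raw <- c(100, 110, 95, 105, 1000, 98)

## log10 transformation before correction
qc_log <- log10(qc_raw)

## Flag QC outliers with the IQR rule and replace them with the QC median
q <- quantile(qc_log, c(0.25, 0.75))
iqr <- q[2] - q[1]
is_out <- unname(qc_log < q[1] - 1.5 * iqr | qc_log > q[2] + 1.5 * iqr)
qc_log[is_out] <- median(qc_log)

## ... signal correction on the log10 scale would take place here ...

## Reverse the log10 transformation
qc_back <- 10^qc_log
```

Only the fifth value is replaced; the other QC values return to their original scale unchanged.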