Type: Package
Title: Truncated Rank Correlation
Version: 0.2
Date: 2025-04-15
Maintainer: Donghyeon Yu <dyu@inha.ac.kr>
Description: Implements truncated rank correlation (TRC), a measure of similarity between a pair of mass spectrometry (MS) experiments. To provide a robust similarity metric for noisy high-dimensional data, TRC uses truncated top ranks (top m-ranks) when calculating the correlation. For more details see Lim et al. (2019), "Truncated rank correlation as a robust measure of test-retest reliability in mass spectrometry data" <doi:10.1515/sagmb-2018-0056>.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
URL: https://sites.google.com/site/dhyeonyu/software
Packaged: 2025-04-22 21:27:10 UTC; dyu
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2025-04-24 17:20:02 UTC
Author: Johan Lim [aut], Donghyeon Yu [aut, cre], Hsun-chih Kuo [aut], Hyungwon Choi [aut], Scott Walmsley [aut]

Kendall's tau for two vector observations

Description

Calculates Kendall's tau for two observation vectors. It is provided mainly for checking the package's internal calculations.

Usage

k_tau(X,Y)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

Details

Kendall's tau for two vector observations.
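
As a rough cross-check that does not depend on the package, Kendall's tau can be computed in base R by counting concordant and discordant pairs. The helper name kendall_by_pairs is hypothetical and is not part of the package; the sketch assumes the data contain no ties, in which case it should agree with cor(..., method = "kendall").

```r
# Kendall's tau computed directly from concordant and discordant pairs.
# kendall_by_pairs is a hypothetical helper, not a function of this package.
# Assumption: no ties in x or y (true almost surely for continuous data).
kendall_by_pairs <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  n <- length(x)
  # Signs of all pairwise differences, stored as n x n matrices
  sx <- sign(outer(x, x, "-"))
  sy <- sign(outer(y, y, "-"))
  # Each unordered pair is counted twice, so divide by n * (n - 1)
  sum(sx * sy) / (n * (n - 1))
}

set.seed(1)
x <- rnorm(50)
y <- x + rnorm(50)

tau_pairs <- kendall_by_pairs(x, y)
tau_base  <- cor(x, y, method = "kendall")
all.equal(tau_pairs, tau_base)
```

The same comparison can be made against k_tau(x, y) once the package is loaded.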

Value

tau

A calculated Kendall's tau value.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

# Kendall's tau
ktau <- k_tau(S1,S2)
ktau


Procedure for estimating the null distribution of the TRC tau with the m value chosen by the proposed rule.

Description

Procedure for estimating the null distribution of the TRC tau with the m value chosen by the proposed rule.

Usage

null_perm(X,Y,nperm=1000,start=3,range_m=0.5,span=0.5,seed=21,all_m=FALSE)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

nperm

The number of permutations used to estimate the null distribution (default: 1000).

start

A lower bound of the search region for the threshold rank m (default: 3).

range_m

A proportion of the length of X that specifies the end of the search region for m (default: 0.5).

span

The parameter alpha, which controls the degree of smoothing in the loess function (default: 0.5).

seed

An initial seed for the permutation (default: 21).

all_m

A logical flag for returning permuted TRC tau values for all m values (default: FALSE).

Details

The null distributions of the TRC tau (with m chosen by the proposed rule), Kendall's tau, and Pearson's correlation are estimated from the permuted samples.
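
The general idea of a permutation null distribution can be sketched in base R for Kendall's tau and Pearson's correlation. This is only an illustration: perm_null is a hypothetical helper, and it assumes that each permutation shuffles one of the two vectors, which may differ from how null_perm permutes internally. The TRC tau is left out because its definition is not reproduced here.

```r
# Permutation null distributions for Kendall's tau and Pearson's correlation.
# perm_null is a hypothetical helper, not the package's null_perm().
# Assumption: each permutation reshuffles y, which breaks its pairing with x.
perm_null <- function(x, y, nperm = 1000, seed = 21) {
  stopifnot(length(x) == length(y))
  set.seed(seed)
  ktau <- numeric(nperm)
  rho  <- numeric(nperm)
  for (b in seq_len(nperm)) {
    y_perm  <- sample(y)
    ktau[b] <- cor(x, y_perm, method = "kendall")
    rho[b]  <- cor(x, y_perm, method = "pearson")
  }
  list(perm_ktau = ktau, perm_rho = rho)
}

set.seed(2)
x <- rnorm(40)
y <- x + rnorm(40)

res <- perm_null(x, y, nperm = 200)
summary(res$perm_ktau)
summary(res$perm_rho)
```

Because the seed is set inside the helper, repeated calls with the same arguments return the same permuted values.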

Value

perm_trc

A vector of TRC tau values from the permuted samples with the m value chosen by the proposed rule.

hist_m

A vector of the chosen m values for permutations.

perm_ktau

A vector of Kendall's tau values from the permuted samples.

perm_rho

A vector of Pearson's correlation values from the permuted samples.

perm_trc_all_m

A matrix of permuted TRC tau values for all m values, in which each column stores the permuted TRC tau values for the corresponding m value.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

null_res = null_perm(S1,S2,nperm=1000,start=3,range_m=0.5,span=0.2,seed=21,all_m=FALSE)


Procedure for estimating the null distribution of the TRC tau with a given m value

Description

Procedure for estimating the null distribution of the TRC tau with a given m value.

Usage

null_perm_m0(X,Y,nperm=1000,m=5,seed=21)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

nperm

The number of permutations used to estimate the null distribution (default: 1000).

m

A rank threshold for the calculation of TRC tau (default: 5).

seed

An initial seed for the permutation (default: 21).

Details

The null distribution of the TRC tau with a given m value is estimated from the permuted samples.

Value

perm_tau

A vector of calculated TRC tau values from the permuted samples.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

null_res = null_perm_m0(S1,S2,nperm=1000,m=5,seed=21)


Pearson's correlation for two vector observations

Description

Calculates Pearson's correlation for two observation vectors. It is provided mainly for checking the package's internal calculations.

Usage

rho(X,Y)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

Details

Pearson's correlation for two vector observations.
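
For reference, Pearson's correlation can be written out from its definition in a few lines of base R and compared with cor(). The helper name pearson_by_hand is hypothetical and is not part of the package.

```r
# Pearson's correlation from its definition:
# sum of cross-products of centered values over the product of their norms.
# pearson_by_hand is a hypothetical helper, not a function of this package.
pearson_by_hand <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

set.seed(4)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)

r_hand <- pearson_by_hand(x, y)
r_base <- cor(x, y)
all.equal(r_hand, r_base)
```

The same comparison can be made against rho(x, y) once the package is loaded.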

Value

rho

A calculated Pearson's correlation value.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

# Pearson's correlation
pcor = rho(S1,S2)
pcor

Procedure for calculating p-values

Description

Procedure for calculating p-values of Pearson's rho, Kendall's tau, and the TRC tau for the two-sided test of the null hypothesis that the correlation is equal to 0, based on the null distribution estimated by permutation.

Usage

trc_cor_test(X,Y, nperm=10000,start=3,range_m=0.8, span=0.5, seed=21, m0=NULL)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

nperm

The number of permutations used to estimate the null distribution (default: 10000).

start

A lower bound of the search region for the threshold rank m (default: 3).

range_m

A proportion of the length of X that specifies the end of the search region for m (default: 0.8).

span

The parameter alpha, which controls the degree of smoothing in the loess function (default: 0.5).

seed

An initial seed for the permutation (default: 21).

m0

A specific m value for which the TRC tau and its p-value are additionally reported (default: NULL, not reported).

Details

The p-values are calculated from the null distributions of the TRC tau with a given m value, Kendall's tau, and Pearson's correlation, each estimated from the permuted samples.
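
A two-sided permutation p-value can be sketched as the share of permuted statistics that are at least as extreme as the observed one. The helper perm_p_two_sided and its center argument are hypothetical; the exact convention used inside trc_cor_test (for example, whether the null distribution is centered at its mean) is not documented here, so this is one common choice rather than the package's rule.

```r
# Two-sided permutation p-value: proportion of null values whose distance
# from `center` is at least that of the observed statistic.
# perm_p_two_sided and `center` are hypothetical, not part of the package.
perm_p_two_sided <- function(observed, null_values, center = 0) {
  stopifnot(length(observed) == 1, length(null_values) >= 1)
  mean(abs(null_values - center) >= abs(observed - center))
}

# Toy null distribution with four permuted statistics
null_values <- c(-1, -0.2, 0.1, 0.6)

# Two of the four values (-1 and 0.6) are at least as extreme as 0.5
p_zero <- perm_p_two_sided(0.5, null_values)
p_zero

# Same data, but measuring extremeness around the mean of the null values
p_centered <- perm_p_two_sided(0.5, null_values, center = mean(null_values))
p_centered
```

With real output, null_values could be one of the vectors returned by null_perm, such as perm_ktau, and observed the matching statistic computed on the original data.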

Value

measure

A vector of the calculated Pearson's rho, Kendall's tau, and TRC tau with m chosen by the proposed rule if m0 = NULL; if m0 is specified, a vector of the calculated Pearson's rho, Kendall's tau, TRC tau with m0, and TRC tau with m chosen by the proposed rule.

p_val

A vector of p-values for Pearson's rho, Kendall's tau, and TRC tau with m chosen by the proposed rule if m0 = NULL; if m0 is specified, a vector of p-values for Pearson's rho, Kendall's tau, TRC tau with m0, and TRC tau with m chosen by the proposed rule.

chs_m

The m value chosen by the proposed procedure.

mean_perm_trc

The mean of the null distribution of the TRC tau estimated by permutation.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

trc_cor_test(S1,S2, nperm=1000,start=3,range_m=0.8, span=0.2, seed=21, m0=NULL)



Procedure for the choice of m for the TRC tau

Description

Procedure for the choice of m for the TRC tau.

Usage

trc_m_search(X,Y,start=3,range_m=0.8,span=0.3)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

start

A lower bound of a search region for the threshold rank m (default: 3).

range_m

A proportion of length of X for specifying the end of the search region for m (default: 0.8).

span

The parameter alpha, which controls the degree of smoothing in the loess function (default: 0.3).

Details

The thresholding rank m is chosen by the proposed procedure in Lim et al. (2019).
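
The effect of the span argument can be seen with stats::loess on simulated data. The sketch below does not reproduce the selection rule of Lim et al. (2019); the grid m_grid, the sine-shaped signal, and the roughness helper rough are all made up for illustration. It only shows that a smaller span follows the noise more closely, while a larger span gives a smoother curve.

```r
# Illustration of the loess `span` parameter on simulated data.
# m_grid, signal, noisy, and rough() are hypothetical and are not taken
# from the package or from the selection rule it implements.
set.seed(3)
m_grid <- 3:80
signal <- sin(m_grid / 12)
noisy  <- signal + rnorm(length(m_grid), sd = 0.3)

fit_small_span <- loess(noisy ~ m_grid, span = 0.2)
fit_large_span <- loess(noisy ~ m_grid, span = 0.8)

# Roughness measure: sum of squared second differences of the fitted values
rough <- function(fit) sum(diff(fitted(fit), differences = 2)^2)

roughness <- c(span_0.2 = rough(fit_small_span),
               span_0.8 = rough(fit_large_span))
roughness
```

In trc_m_search the smoothed quantity is km_tau_vec, and the fitted curve is returned as km_tau_loess, so plotting both for several span values is a direct way to judge the amount of smoothing.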

Value

tau

A calculated TRC tau value with the chosen m value (chs_m).

chs_m

The chosen m value.

km_tau_vec

A vector of calculated k_m * TRC tau values for each m in the search region [start, floor(range_m*n)], where n is the length of X.

km_tau_loess

The fitted values of km_tau_vec obtained by local regression with the loess function.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

# tau_m
trc_res = trc_m_search(S1,S2,start=3,range_m=0.8,span=0.2)
trc_res$tau
trc_res$chs_m


Truncated Rank Correlation

Description

TRC tau is a robust correlation measure based on truncated rank values.

Usage

trc_tau(X,Y,m=5)

Arguments

X

An observed data vector from the first condition.

Y

An observed data vector from the second condition.

m

A rank threshold for the calculation of TRC tau (default: 5).

Details

Given a rank threshold m, trc_tau calculates the TRC tau value.

Value

tau

A calculated TRC tau value.

References

Lim, J., Yu, D., Kuo, H., Choi, H., and Walmsley, S. (2019). Truncated Rank Correlation as a robust measure of test-retest reliability in mass spectrometry data. Statistical Applications in Genetics and Molecular Biology, 18(4).

Examples

p = 100
sig_z = 1.15
sig_e = 1
mu_z = 2
mu_e = 8
m0 = 30

S1 = rnorm(p,mean=mu_e,sd=sig_e)
S2 = rnorm(p,mean=mu_e,sd=sig_e)
    
if(m0!=0)
{
   X = mu_z + rnorm(m0,mean=0,sd=sig_z)
   indx = 1:p
   s_indx = sort(sample(indx,m0))
   S1[s_indx] = S1[s_indx] + X
   S2[s_indx] = S2[s_indx] + X
}
      
S1 = exp(S1)
S2 = exp(S2)

tau0 = trc_tau(S1,S2,m=m0)
tau0