Additional functions

library(cardinalR)

These are helper functions included in the package.

Generating background noise

The gen_bkgnoise() function generates multivariate Gaussian noise to serve as background data in high-dimensional spaces.

# Example: Generate 4D background noise
bkg_data <- gen_bkgnoise(n = 500, p = 4, 
                         m = c(0, 0, 0, 0), s = c(2, 2, 2, 2))
head(bkg_data)
#> # A tibble: 6 × 4
#>       x1     x2     x3     x4
#>    <dbl>  <dbl>  <dbl>  <dbl>
#> 1  2.54   2.83  -2.84   2.19 
#> 2  1.81  -0.134  0.522  4.74 
#> 3 -1.74  -3.47   0.744  3.26 
#> 4  0.350  1.10  -0.416 -1.27 
#> 5  2.61   3.86   2.04   0.964
#> 6 -2.34   1.65  -0.111  1.27

The generated data has independent dimensions with specified means (m) and standard deviations (s).
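To make the roles of m and s concrete, the same idea can be sketched in base R. This is only an illustration, not the cardinalR implementation: sketch_bkgnoise() is a hypothetical name, and the x1, ..., xp column names are assumed simply to mirror the output above.

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_bkgnoise <- function(n, p, m, s) {
  stopifnot(length(m) == p, length(s) == p)
  # Draw each dimension independently from a normal with mean m[j] and sd s[j].
  cols <- lapply(seq_len(p), function(j) rnorm(n, mean = m[j], sd = s[j]))
  names(cols) <- paste0("x", seq_len(p))
  as.data.frame(cols)
}

set.seed(1)
noise <- sketch_bkgnoise(n = 500, p = 4, m = rep(0, 4), s = rep(2, 4))
dim(noise)                   # 500 rows, 4 columns
round(sapply(noise, sd), 1)  # each column's sd should be close to 2
```

Because every column is drawn separately, the dimensions are uncorrelated by construction.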

Randomizing rows

randomize_rows() shuffles the rows of the input data into a random order.

randomized_data <- randomize_rows(bkg_data)
head(randomized_data)
#> # A tibble: 6 × 4
#>      x1    x2      x3     x4
#>   <dbl> <dbl>   <dbl>  <dbl>
#> 1  1.17 2.23   0.537  -3.45 
#> 2 -1.11 2.57   1.83   -1.92 
#> 3  3.91 0.901  2.16   -0.724
#> 4  1.19 2.34   1.85    2.60 
#> 5 -2.87 2.40  -0.364  -2.14 
#> 6  2.98 1.48   0.0949  2.04
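Row shuffling amounts to indexing the data with a random permutation of its row numbers. A minimal base-R sketch, assuming nothing about how cardinalR does it internally (sketch_randomize_rows() is a hypothetical name):

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_randomize_rows <- function(data) {
  # sample.int(n) returns a random permutation of 1..n.
  data[sample.int(nrow(data)), , drop = FALSE]
}

toy <- data.frame(id = 1:5, value = c(10, 20, 30, 40, 50))
shuffled <- sketch_randomize_rows(toy)

# Same rows, new order: sorting by id recovers the original values.
identical(shuffled$value[order(shuffled$id)], toy$value)
```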

Relocating clusters

relocate_clusters() translates clusters along any dimension(s). It centers each cluster (subtracting its mean) and then adds a translation vector taken from the supplied matrix vert_mat, which has one row per cluster and one column per dimension. In the example below, cluster 3 ends up centered at (0, 0, 5, 0).

df <- tibble::tibble(
  x1 = rnorm(12),
  x2 = rnorm(12),
  x3 = rnorm(12),
  x4 = rnorm(12),
  cluster = rep(1:3, each = 4)
)

vert_mat <- matrix(c(
  5, 0, 0, 0,
  0, 5, 0, 0,
  0, 0, 5, 0
), nrow = 3, byrow = TRUE)

relocated_df <- relocate_clusters(df, vert_mat)
head(relocated_df)
#> # A tibble: 6 × 5
#>       x1     x2     x3     x4 cluster
#>    <dbl>  <dbl>  <dbl>  <dbl>   <int>
#> 1  0.533 -0.127  5.79   1.82        3
#> 2 -0.533  0.360  4.23  -1.08        3
#> 3  0.365  4.71  -0.483 -0.652       2
#> 4  0.928  0.307  3.71  -1.67        3
#> 5 -1.04   4.44  -0.122 -0.386       2
#> 6 -0.410  4.69  -0.407  2.13        2
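The center-then-translate step can be sketched in base R as follows. This is an illustration under stated assumptions, not the package code: sketch_relocate() is a hypothetical name, cluster labels are assumed to be the integers 1, 2, ... indexing the rows of the translation matrix, and the sketch keeps the original row order.

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_relocate <- function(data, vert_mat) {
  coord_cols <- setdiff(names(data), "cluster")
  for (k in unique(data$cluster)) {
    rows <- data$cluster == k
    block <- as.matrix(data[rows, coord_cols])
    # Subtract the cluster mean, then add row k of vert_mat.
    centered <- sweep(block, 2, colMeans(block))
    data[rows, coord_cols] <- sweep(centered, 2, vert_mat[k, ], "+")
  }
  data
}

set.seed(1)
toy <- data.frame(x1 = rnorm(12), x2 = rnorm(12),
                  cluster = rep(1:3, each = 4))
targets <- matrix(c( 5,  0,
                     0,  5,
                    -5, -5), nrow = 3, byrow = TRUE)
moved <- sketch_relocate(toy, targets)

# Each cluster mean now equals its row of `targets` (up to floating point).
aggregate(cbind(x1, x2) ~ cluster, data = moved, FUN = mean)
```

Centering first means the final cluster mean depends only on the translation matrix, not on where the cluster started.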

Generating rotation matrices

The gen_rotation() function creates a rotation matrix in high-dimensional space from a list of planes, each given as a pair of axis indices, and rotation angles in degrees.


rotations_4d <- list(
  list(plane = c(1, 2), angle = 60),
  list(plane = c(3, 4), angle = 90)
)

rot_mat <- gen_rotation(p = 4, planes_angles = rotations_4d)
rot_mat
#>           [,1]       [,2]         [,3]          [,4]
#> [1,] 0.5000000 -0.8660254 0.000000e+00  0.000000e+00
#> [2,] 0.8660254  0.5000000 0.000000e+00  0.000000e+00
#> [3,] 0.0000000  0.0000000 6.123234e-17 -1.000000e+00
#> [4,] 0.0000000  0.0000000 1.000000e+00  6.123234e-17
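A matrix like the one above can be built by composing one plane (Givens) rotation per entry in the list. The sketch below is an assumption about the construction, not the package source; sketch_rotation() is a hypothetical name, and the degree-to-radian conversion is inferred from the printed values (cos 60° = 0.5).

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_rotation <- function(p, planes_angles) {
  rot <- diag(p)
  for (pa in planes_angles) {
    i <- pa$plane[1]
    j <- pa$plane[2]
    theta <- pa$angle * pi / 180  # assumes angles are given in degrees
    g <- diag(p)                  # identity except in the (i, j) plane
    g[i, i] <- cos(theta)
    g[i, j] <- -sin(theta)
    g[j, i] <- sin(theta)
    g[j, j] <- cos(theta)
    rot <- g %*% rot
  }
  rot
}

r <- sketch_rotation(4, list(
  list(plane = c(1, 2), angle = 60),
  list(plane = c(3, 4), angle = 90)
))

# A rotation matrix is orthogonal with determinant 1.
all.equal(crossprod(r), diag(4))
all.equal(det(r), 1)
```

The tiny 6.123234e-17 entries in the output above are floating-point residue from cos(pi / 2), not meaningful values.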

Normalizing data

When combining clusters or transforming data geometrically, magnitudes can differ drastically. The normalize_data() function rescales the entire dataset to fit within [-1, 1] by dividing every value by the dataset's maximum absolute value.

norm_data <- normalize_data(bkg_data)
head(norm_data)
#>            x1          x2          x3         x4
#> 1  0.32966791  0.36746895 -0.36831134  0.2838941
#> 2  0.23503031 -0.01744658  0.06767172  0.6148489
#> 3 -0.22513820 -0.45013699  0.09646890  0.4227887
#> 4  0.04540999  0.14280257 -0.05402839 -0.1647246
#> 5  0.33877160  0.50095126  0.26458171  0.1250784
#> 6 -0.30422734  0.21405969 -0.01444044  0.1646372
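The rescaling can be sketched in a few lines of base R. This is an illustration consistent with the output above, where every value is divided by the same constant; sketch_normalize() is a hypothetical name, not the package function.

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_normalize <- function(data) {
  # One global scale factor: the largest absolute value in the whole dataset.
  max_abs <- max(abs(as.matrix(data)))
  data / max_abs
}

toy <- data.frame(x1 = c(2, -8, 4), x2 = c(1, 0.5, -2))
scaled <- sketch_normalize(toy)
scaled$x1                # 0.25 -1.00 0.50
range(as.matrix(scaled)) # stays within [-1, 1]
```

Using a single factor for all columns, rather than scaling each column separately, preserves the relative distances between points.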

Generating cluster locations

To place clusters in different positions, gen_clustloc() generates k points in p dimensions in a simplex-like arrangement, so that the cluster centers are as close to mutually equidistant as possible. In the output below, each column holds the coordinates of one center.


centers <- gen_clustloc(p = 4, k = 5)
head(centers)
#>            [,1]        [,2]      [,3]       [,4]        [,5]
#> [1,] -0.1871367  0.74955875 0.1813817 -0.7647630  0.02095925
#> [2,]  0.5307424 -0.04382745 0.4547328 -0.8621243 -0.07952343
#> [3,] -1.6393512 -1.59734549 0.0829876  1.4700328  1.68367622
#> [4,] -0.8793591 -0.05091480 0.6331562  0.1565104  0.14060736
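For intuition about what "equidistant" means here, the sketch below constructs an exact regular simplex: k points in p dimensions that are all the same distance apart. It is a different, deterministic construction chosen for illustration, not the method gen_clustloc() uses; sketch_simplex() is a hypothetical name, and the sketch only handles the case p >= k - 1, where exact equidistance is possible.

```r
# Hypothetical helper for illustration; not the cardinalR source.
sketch_simplex <- function(p, k) {
  stopifnot(k >= 2, p >= k - 1)
  e <- diag(k)                          # k basis vectors, pairwise sqrt(2) apart
  centered <- sweep(e, 2, colMeans(e))  # center at the origin; rank is k - 1
  sv <- svd(centered)
  keep <- seq_len(k - 1)
  # Coordinates of the k points in the (k - 1)-dimensional subspace they span.
  coords <- sv$u[, keep, drop = FALSE] %*% diag(sv$d[keep], k - 1)
  out <- matrix(0, nrow = p, ncol = k)  # one column per center
  out[keep, ] <- t(coords)
  out
}

centers_sketch <- sketch_simplex(p = 4, k = 5)
# All pairwise distances between the 5 centers are equal (sqrt(2)).
round(dist(t(centers_sketch)), 6)
```

When k exceeds p + 1 no exactly equidistant arrangement exists, which is why the text above says "as close to mutually equidistant as possible".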

Numeric generators

Two helper functions, gen_nproduct() and gen_nsum(), generate numeric vectors of positive integers: gen_nsum() matches a user-specified target sum exactly, while gen_nproduct() matches a target product approximately.

The function gen_nsum(n, k) divides a total sum n into k positive integers. It first assigns an equal base value to each element and then randomly distributes any remainder, ensuring the elements sum exactly to n.

gen_nsum(n = 100, k = 3)
#> [1] 34 33 33
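The base-plus-remainder scheme described above can be sketched as follows. This is a minimal illustration, not the package code: sketch_nsum() is a hypothetical name, and giving each leftover unit to a distinct, randomly chosen element is an assumption about how the remainder is spread.

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
sketch_nsum <- function(n, k) {
  stopifnot(n >= k)             # needed for every part to be positive
  parts <- rep(n %/% k, k)      # equal base value
  extra <- n %% k               # remainder still to hand out
  if (extra > 0) {
    idx <- sample.int(k, extra) # distinct random positions
    parts[idx] <- parts[idx] + 1
  }
  parts
}

set.seed(1)
parts <- sketch_nsum(n = 100, k = 3)
sum(parts)   # always exactly 100
sort(parts)  # 33 33 34, in some random order before sorting
```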

The function gen_nproduct(n, p) aims to produce p positive integers whose product is approximately n. It starts with all elements equal to the rounded p-th root of n and iteratively adjusts elements up or down in a randomized manner until the product is within a small tolerance of n. This accommodates the fact that exact integer solutions for a given product are often impossible.

gen_nproduct(n = 500, p = 4)
#> [1] 4 5 5 5
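The adjust-until-close loop can be sketched like this. It is an illustration under stated assumptions rather than the package implementation: sketch_nproduct() is a hypothetical name, and the 10% relative tolerance and the iteration cap are made-up parameters for the sketch.

```r
# Hypothetical stand-in for illustration; not the cardinalR source.
# `tol` and `max_iter` are assumed parameters, not part of gen_nproduct().
sketch_nproduct <- function(n, p, tol = 0.1, max_iter = 1000) {
  parts <- rep(max(1, round(n^(1 / p))), p)  # start at the rounded p-th root
  for (iter in seq_len(max_iter)) {
    current <- prod(parts)
    if (abs(current - n) <= tol * n) break   # close enough to the target
    j <- sample.int(p, 1)                    # pick a random element to adjust
    if (current < n) {
      parts[j] <- parts[j] + 1
    } else if (parts[j] > 1) {
      parts[j] <- parts[j] - 1               # never drop below 1
    }
  }
  parts
}

set.seed(1)
dims <- sketch_nproduct(n = 500, p = 4)
prod(dims)  # 500: starts at 5^4 = 625, one decrement gives 4 * 5 * 5 * 5
```

In this case the target is hit exactly, as in the gen_nproduct() output above; for most n the loop stops at a nearby product instead.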