---
title: 'ClusteredNeuroVec: Parcel-based 4D Analysis'
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{ClusteredNeuroVec: Parcel-based 4D Analysis}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
params:
family: red
css: albers.css
resource_files:
- albers.css
- albers.js
includes:
in_header: |-
---
```{r setup, include=FALSE}
if (requireNamespace("ggplot2", quietly = TRUE)) ggplot2::theme_set(neuroim2::theme_neuro(base_family = params$family))
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(neuroim2)
set.seed(123)
```
## Overview
`ClusteredNeuroVec` provides an efficient representation for parcellated 4D neuroimaging data where voxels are grouped into clusters or parcels. Instead of storing a time-series for every voxel, it stores one time-series per cluster, making it ideal for analyses using brain atlases like Schaefer-Yeo parcellations.
## Why ClusteredNeuroVec?
Traditional neuroimaging analyses often involve:
- Reducing voxel-level data to parcel/ROI averages
- Working with brain atlases that group voxels into regions
- Performing searchlight analyses at the parcel level rather than the voxel level
`ClusteredNeuroVec` makes these workflows more efficient while maintaining compatibility with standard `NeuroVec` operations.
## Creating a ClusteredNeuroVec
### From scratch with synthetic data
```{r basic-creation}
# Create a simple 3D space with mask
space <- NeuroSpace(c(10, 10, 10), spacing = c(2, 2, 2))
mask_data <- array(TRUE, c(10, 10, 10))
mask_data[1:3, 1:3, 1:3] <- FALSE # exclude corner
mask <- LogicalNeuroVol(mask_data, space)
# Create cluster assignments (e.g., 5 random clusters)
n_masked <- sum(mask_data)
cluster_ids <- sample(1:5, n_masked, replace = TRUE)
cvol <- ClusteredNeuroVol(mask, cluster_ids)
# Create synthetic 4D data
vec_space <- NeuroSpace(c(10, 10, 10, 20), spacing = c(2, 2, 2))
vec_data <- array(rnorm(10 * 10 * 10 * 20), dim = c(10, 10, 10, 20))
vec <- NeuroVec(vec_data, vec_space)
# Create ClusteredNeuroVec
cv <- ClusteredNeuroVec(vec, cvol)
print(cv)
```
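The reduction performed by the constructor can be pictured in plain base R. The following is a minimal sketch, assuming each cluster's time-series is the mean over its voxels (as the array-access comments later in this vignette suggest). It is not the neuroim2 implementation, and the names `vox_ts`, `cluster_sums`, `cluster_sizes`, and `cluster_means` are illustrative only.

```r
# Base-R sketch of parcel averaging (illustrative, not the neuroim2 code path)
set.seed(1)
n_time <- 4
cluster_ids <- rep(1:3, each = 4)   # hypothetical labels for 12 masked voxels
vox_ts <- matrix(rnorm(n_time * length(cluster_ids)), nrow = n_time)  # T x N

# rowsum() adds up rows that share a group label, so put voxels in rows first
cluster_sums <- rowsum(t(vox_ts), group = cluster_ids)   # K x T
cluster_sizes <- as.vector(table(cluster_ids))           # voxels per cluster
cluster_means <- t(cluster_sums / cluster_sizes)         # T x K

dim(cluster_means)  # T rows, K columns
```

The result has the same T x K orientation as the cluster matrix described in the next section.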
### Key properties
```{r properties}
# Dimensions: still 4D (x, y, z, time)
dim(cv)
# Number of clusters
num_clusters(cv)
# Access cluster time-series matrix (T x K)
ts_matrix <- as.matrix(cv, by = "cluster")
dim(ts_matrix) # 20 time points x 5 clusters
```
## Array-like access
`ClusteredNeuroVec` behaves like a regular 4D array:
```{r array-access}
# Extract 3D volume at time point 1
vol_t1 <- cv[,,,1]
dim(vol_t1) # 10 x 10 x 10
# All voxels in the same cluster have the same value
# (they share the cluster's mean time-series)
# Get time-series at a specific voxel
ts <- series(cv, 5, 5, 5)
length(ts) # 20 time points
```
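One way to picture the array view is as a lookup: each voxel is redirected to the column of its cluster in the T x K matrix. The sketch below uses base R only. `voxel_cluster` and `voxel_series()` are hypothetical names, and returning `NA` for voxels outside the mask is an assumption of this sketch rather than confirmed package behavior.

```r
# T = 3 time points, K = 2 clusters
ts_matrix <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)

# Hypothetical cluster label per voxel; NA marks a voxel outside the mask
voxel_cluster <- c(1L, 2L, 2L, NA, 1L)

voxel_series <- function(v) {
  k <- voxel_cluster[v]
  # Assumed fill value for out-of-mask voxels
  if (is.na(k)) return(rep(NA_real_, nrow(ts_matrix)))
  ts_matrix[, k]
}

voxel_series(2)                              # column 2 of ts_matrix
identical(voxel_series(2), voxel_series(3))  # TRUE: voxels 2 and 3 share a cluster
```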
## Cluster searchlight analysis
Perform searchlight analysis at the cluster level using centroid distances:
```{r searchlight}
# K-nearest neighbor searchlight (3 nearest clusters; cv has only 5 clusters)
windows_knn <- cluster_searchlight_series(cv, k = 3)
length(windows_knn) # One window per cluster
# Look at first window
win1 <- windows_knn[[1]]
dim(values(win1)) # dimensions cover 20 time points and 3 neighboring clusters
# Radius-based searchlight (e.g., 15mm radius)
windows_radius <- cluster_searchlight_series(cv, radius = 15)
```
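The neighbor selection behind a centroid-based searchlight can be sketched with base R alone. This is an illustration under stated assumptions, not the package's algorithm: the centroid coordinates are made up, and whether a window includes the seed cluster itself is an assumption here.

```r
# Hypothetical cluster centroids in mm (one row per cluster)
centers <- rbind(c(0, 0, 0), c(2, 0, 0), c(0, 6, 0), c(10, 10, 10))
d <- as.matrix(dist(centers))  # K x K Euclidean distance matrix

# k-nearest: for each seed, the k closest clusters (the seed itself comes first)
k <- 2
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Radius-based: for each seed, every cluster whose centroid lies within `radius` mm
radius <- 7
within_radius <- lapply(seq_len(nrow(d)), function(i) which(d[i, ] <= radius))

knn[1, ]            # clusters 1 and 2
within_radius[[4]]  # only cluster 4: it is isolated at this radius
```

Because the distance matrix is K x K, the cost of this step depends on the number of clusters rather than the number of voxels.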
## Real-world example: Schaefer parcellation
```{r real-world, eval=FALSE}
# Load fMRI data
fmri_data <- read_vec("subject01_task.nii.gz")
# Load Schaefer atlas (example with 400 parcels)
atlas <- read_vol("Schaefer2018_400Parcels_7Networks.nii.gz")
mask <- atlas > 0
# Create ClusteredNeuroVol from atlas
cvol <- ClusteredNeuroVol(mask, as.integer(atlas[mask]))
# Create parcellated representation
cv <- ClusteredNeuroVec(fmri_data, cvol)
# Now you have 400 time-series (one per parcel) instead of ~200,000 voxels
parcels <- as.matrix(cv, by = "cluster")
dim(parcels) # T x 400
# Perform connectivity analysis at parcel level
cor_matrix <- cor(parcels)
dim(cor_matrix) # 400 x 400
```
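A parcel-level correlation matrix is symmetric, so downstream steps often keep only the unique parcel pairs. The sketch below shows one possible way to do that on synthetic data with base R. The parcel names and the `edges` data frame are illustrative and not part of neuroim2.

```r
set.seed(42)
# Synthetic T x K matrix: 50 time points, 4 parcels
parcels <- matrix(rnorm(50 * 4), nrow = 50,
                  dimnames = list(NULL, paste0("parcel", 1:4)))
cor_matrix <- cor(parcels)  # K x K, symmetric, unit diagonal

# Row/column indices of the upper triangle (each pair once, no self-pairs)
upper <- which(upper.tri(cor_matrix), arr.ind = TRUE)
edges <- data.frame(
  from = colnames(cor_matrix)[upper[, "row"]],
  to   = colnames(cor_matrix)[upper[, "col"]],
  r    = cor_matrix[upper]
)

nrow(edges)  # K * (K - 1) / 2 = 6 unique pairs
```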
## Integration with existing workflows
`ClusteredNeuroVec` works with other neuroim2 functions; for example, `centroids()` returns the cluster centers:
```{r integration}
# Use with split_reduce for custom aggregation
# (ClusteredNeuroVec already uses this internally)
# Scale time-series within each cluster
# (if scale_series is implemented for ClusteredNeuroVec)
# cv_scaled <- scale_series(cv, center = TRUE, scale = TRUE)
# Get cluster centroids for visualization
centers <- centroids(cv)
head(centers) # x, y, z coordinates
```
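If `scale_series` is not available for `ClusteredNeuroVec`, the T x K matrix can be standardized directly. The following is a minimal sketch using base R's `scale()` on a small made-up matrix. In practice the input would be the result of `as.matrix(cv, by = "cluster")`.

```r
# Made-up T x K matrix: 4 time points, 2 clusters
ts_matrix <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), nrow = 4)

# scale() centers and scales each column, i.e. each cluster's time-series
ts_scaled <- scale(ts_matrix, center = TRUE, scale = TRUE)

colMeans(ts_scaled)      # approximately 0 for every cluster
apply(ts_scaled, 2, sd)  # 1 for every cluster
```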
## Performance benefits
Storing K cluster time-series instead of N voxel time-series (each of length T) reduces cost in several ways:
- Memory usage: O(K × T) instead of O(N × T)
- Searchlight neighbor search by pairwise distances: O(K²) between cluster centroids instead of O(N²) between voxels
- Typical reduction: 100-1000x fewer time-series
For a typical fMRI dataset:
- Voxel-level: ~200,000 voxels × 500 time points = 100M values
- Parcel-level: 400 parcels × 500 time points = 200K values (a 500-fold reduction)
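The figures above can be checked with a few lines of arithmetic. This sketch assumes values are stored as 8-byte doubles and ignores object overhead and the cluster label volume, so the sizes are rough estimates.

```r
n_voxels  <- 200000
n_parcels <- 400
n_time    <- 500
bytes_per_value <- 8  # assumption: double precision storage

voxel_values  <- n_voxels * n_time   # 1e8 values
parcel_values <- n_parcels * n_time  # 2e5 values

voxel_mb  <- voxel_values * bytes_per_value / 1e6   # 800 MB
parcel_mb <- parcel_values * bytes_per_value / 1e6  # 1.6 MB
reduction <- voxel_values / parcel_values           # 500

c(voxel_mb = voxel_mb, parcel_mb = parcel_mb, reduction = reduction)
```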
## Summary
`ClusteredNeuroVec` provides:
- Efficient storage for parcellated 4D data
- Full array-like access semantics
- Cluster-aware searchlight operations
- Seamless integration with existing neuroim2 workflows
It's ideal for:
- Atlas-based analyses
- Connectivity studies
- Parcellated machine learning
- Any workflow that aggregates voxels to regions