The XeniumIO package provides functions to import 10X
Genomics Xenium Analyzer data into R. The package is designed to work
with the output of the Xenium Analyzer, which is a software tool that
processes Visium spatial gene expression data. The package provides
functions to import the output of the Xenium Analyzer into R, and to
create a TENxXenium object that can be used with other
Bioconductor packages.
The 10X suite of packages support multiple file formats. The following table lists the supported file formats and the corresponding classes that are imported into R.
| Extension | Class | Imported as |
|---|---|---|
| .h5 | TENxH5 | SingleCellExperiment w/ TENxMatrix |
| .mtx / .mtx.gz | TENxMTX | SummarizedExperiment w/ dgCMatrix |
| .tar.gz | TENxFileList | SingleCellExperiment w/ dgCMatrix |
| peak_annotation.tsv | TENxPeaks | GRanges |
| fragments.tsv.gz | TENxFragments | RaggedExperiment |
| .tsv / .tsv.gz | TENxTSV | tibble |
| Extension | Class | Imported as |
|---|---|---|
| spatial.tar.gz | TENxSpatialList | DataFrame list * |
| .parquet | TENxSpatialParquet | tibble * |
| Extension | Class | Imported as |
|---|---|---|
| .zarr.zip | TENxZarr | (TBD) |
The TENxXenium class has a metadata slot
for the experiment.xenium file. The resources
slot is a TENxFileList or TENxH5 object
containing the cell feature matrix. The coordNames slot is
a vector specifying the names of the columns in the spatial data
containing the spatial coordinates. The sampleId slot is a
scalar specifying the sample identifier.
TENxXenium(
resources = "path/to/matrix/folder/or/file",
xeniumOut = "path/to/xeniumOut/folder",
sample_id = "sample01",
format = c("mtx", "h5"),
boundaries_format = c("parquet", "csv.gz"),
spatialCoordsNames = c("x_centroid", "y_centroid"),
...
)The format argument specifies the format of the
resources object, either “mtx” or “h5”. The
boundaries_format allows the user to choose whether to read
in the data using the parquet or csv.gz
format.
Note that the xeniumOut unzipped folder must contain the
following files:
*outs
├── cell_feature_matrix.h5
├── cell_feature_matrix.tar.gz
| ├── barcodes.tsv*
| ├── features.tsv*
| └── matrix.mtx*
├── cell_feature_matrix.zarr.zip
├── experiment.xenium
├── cells.csv.gz
├── cells.parquet
├── cells.zarr.zip
[...]
Note that currently the zarr format is not supported as
the infrastructure is currently under development.
The resources slot should either be the
TENxFileList from the mtx format or a
TENxH5 instance from an h5 file. The
boundaries can either be a TENxSpatialParquet instance or a
TENxSpatialCSV. These classes are automatically
instantiated by the constructor function.
## Class "TENxXenium" [package "XeniumIO"]
##
## Slots:
##
## Name: resources
## Class: TENxFileList_OR_TENxH5
##
## Name: boundaries
## Class: TENxSpatialParquet_OR_TENxSpatialCSV
##
## Name: coordNames
## Class: character
##
## Name: sampleId
## Class: character
##
## Name: colData
## Class: TENxSpatialParquet
##
## Name: metadata
## Class: XeniumFile
import methodThe import method for a TENxXenium instance
returns a SpatialExperiment class object. Dispatch is only
done on the con argument. See ?BiocIO::import
for details on the generic. The import function call is
meant to be a simple call without much input. For more details in the
package, see ?TENxXenium.
## Method Definition:
##
## function (con, format, text, ...)
## {
## sce <- import(con@resources, ...)
## metadata <- import(con@metadata)
## coldata <- import(con@colData)
## SpatialExperiment::SpatialExperiment(assays = list(counts = assay(sce)),
## rowData = rowData(sce), mainExpName = mainExpName(sce),
## altExps = altExps(sce), sample_id = con@sampleId, colData = as(coldata,
## "DataFrame"), spatialCoordsNames = con@coordNames,
## metadata = list(experiment.xenium = metadata, polygons = import(con@boundaries)))
## }
## <bytecode: 0x5585b99e7f40>
## <environment: namespace:XeniumIO>
##
## Signatures:
## con format text
## target "TENxXenium" "ANY" "ANY"
## defined "TENxXenium" "ANY" "ANY"
The following code snippet demonstrates how to import a Xenium
Analyzer output into R. The TENxXenium object is created by
specifying the path to the xeniumOut folder. The
TENxXenium object is then imported into R using the
import method for the TENxXenium class.
First, we cache the ~12 MB file to avoid downloading it multiple times (via BiocFileCache).
zipfile <- paste0(
"https://mghp.osn.xsede.org/bir190004-bucket01/BiocXenDemo/",
"Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip"
)
destfile <- XeniumIO:::.cache_url_file(zipfile)## adding rname 'https://mghp.osn.xsede.org/bir190004-bucket01/BiocXenDemo/Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip'
We then create an output folder for the contents of the zipped file.
We use the same name as the zip file but without the extension (with
tools::file_path_sans_ext).
outfold <- file.path(
tempdir(), tools::file_path_sans_ext(basename(zipfile))
)
if (!dir.exists(outfold))
dir.create(outfold, recursive = TRUE)We now unzip the file and use the outfold as the
exdir argument to unzip. The
outfold variable and folder will be used as the
xeniumOut argument in the TENxXenium
constructor. Note that we use the ref = "Gene Expression"
argument in the import method to pass down to the internal
splitAltExps function call. This will set the
mainExpName in the SpatialExperiment
object.
unzip(
zipfile = destfile, exdir = outfold, overwrite = FALSE
)
TENxXenium(xeniumOut = outfold) |>
import(ref = "Gene Expression")## class: SpatialExperiment
## dim: 5006 36
## metadata(2): experiment.xenium polygons
## assays(1): counts
## rownames(5006): ENSMUSG00000052595 ENSMUSG00000030111 ...
## ENSMUSG00000055670 ENSMUSG00000027596
## rowData names(3): ID Symbol Type
## colnames(36): aaamobki-1 aaclkaod-1 ... olbjkpjc-1 omjmdimk-1
## colData names(13): cell_id transcript_counts ... segmentation_method
## sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(5): Deprecated Codeword Genomic Control Negative Control
## Codeword Negative Control Probe Unassigned Codeword
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(0):
Note that you may also use the swapAltExp function to
set a mainExpName in the SpatialExperiment but
this is not recommended. The operation returns a
SingleCellExperiment which has to be coerced back into a
SpatialExperiment. The coercion also loses some metadata
information particularly the spatialCoords.
TENxXenium(xeniumOut = outfold) |>
import() |>
swapAltExp(name = "Gene Expression") |>
as("SpatialExperiment")## class: SpatialExperiment
## dim: 5006 36
## metadata(1): TENxFileList
## assays(1): counts
## rownames(5006): ENSMUSG00000052595 ENSMUSG00000030111 ...
## ENSMUSG00000055670 ENSMUSG00000027596
## rowData names(3): ID Symbol Type
## colnames(36): aaamobki-1 aaclkaod-1 ... olbjkpjc-1 omjmdimk-1
## colData names(13): cell_id transcript_counts ... segmentation_method
## sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(5): Genomic Control Negative Control Codeword Negative
## Control Probe Unassigned Codeword Deprecated Codeword
## spatialCoords names(0) :
## imgData names(0):
The dataset was obtained from the 10X Genomics website under the X0A
v3.0 section and is a subset of the Xenium Prime 5K Mouse Pan Tissue
& Pathways Panel. The link to the data can be seen as the
url input above and shown below for completeness.
sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] XeniumIO_1.3.2 TENxIO_1.13.2
[3] SingleCellExperiment_1.33.0 SummarizedExperiment_1.41.0
[5] Biobase_2.71.0 GenomicRanges_1.63.1
[7] Seqinfo_1.1.0 IRanges_2.45.0
[9] S4Vectors_0.49.0 BiocGenerics_0.57.0
[11] generics_0.1.4 MatrixGenerics_1.23.0
[13] matrixStats_1.5.0 BiocStyle_2.39.0
loaded via a namespace (and not attached):
[1] httr2_1.2.2 rjson_0.2.23 xfun_0.54
[4] bslib_0.9.0 lattice_0.22-7 tzdb_0.5.0
[7] vctrs_0.6.5 tools_4.5.2 parallel_4.5.2
[10] curl_7.0.0 tibble_3.3.0 RSQLite_2.4.5
[13] blob_1.2.4 pkgconfig_2.0.3 BiocBaseUtils_1.13.0
[16] Matrix_1.7-4 dbplyr_2.5.1 assertthat_0.2.1
[19] lifecycle_1.0.4 compiler_4.5.2 codetools_0.2-20
[22] htmltools_0.5.9 sys_3.4.3 buildtools_1.0.0
[25] sass_0.4.10 yaml_2.3.12 crayon_1.5.3
[28] pillar_1.11.1 jquerylib_0.1.4 DelayedArray_0.37.0
[31] cachem_1.1.0 magick_2.9.0 abind_1.4-8
[34] tidyselect_1.2.1 digest_0.6.39 purrr_1.2.0
[37] dplyr_1.1.4 arrow_22.0.0 maketools_1.3.2
[40] fastmap_1.2.0 grid_4.5.2 cli_3.6.5
[43] SparseArray_1.11.9 magrittr_2.0.4 S4Arrays_1.11.1
[46] withr_3.0.2 readr_2.1.6 filelock_1.0.3
[49] rappdirs_0.3.3 bit64_4.6.0-1 rmarkdown_2.30
[52] XVector_0.51.0 bit_4.6.0 hms_1.1.4
[55] SpatialExperiment_1.21.0 memoise_2.0.1 evaluate_1.0.5
[58] knitr_1.50 BiocIO_1.21.0 BiocFileCache_3.1.0
[61] rlang_1.1.6 Rcpp_1.1.0.8 glue_1.8.0
[64] DBI_1.2.3 BiocManager_1.30.27 VisiumIO_1.7.5
[67] vroom_1.6.7 jsonlite_2.0.0 R6_2.6.1