1 Introduction

This vignette demonstrates a complete RNA-seq differential expression workflow using VISTA (Visualization and Integrated System for Transcriptomic Analysis). We’ll use the well-known airway dataset from Bioconductor, which contains RNA-seq data from airway smooth muscle cells treated with dexamethasone.

1.1 What is VISTA?

VISTA streamlines RNA-seq analysis by:

  • Unifying DE workflows: Wraps DESeq2 and edgeR with consistent output
  • Simplifying visualization: 28+ publication-ready plotting functions
  • Integrating enrichment: Built-in MSigDB, GO, and KEGG analysis
  • Ensuring reproducibility: S4 class structure with comprehensive metadata

1.2 Dataset Overview

The airway dataset (Himes et al. 2014) includes:

  • Samples: 8 human airway smooth muscle cell lines
  • Treatment: 4 treated with dexamethasone, 4 untreated
  • Sequencing: Illumina HiSeq 2000
  • Features: ~64,000 genes

Reference: Himes BE et al. (2014). “RNA-Seq transcriptome profiling identifies CRISPLD2 as a glucocorticoid responsive gene that modulates cytokine function in airway smooth muscle cells.” PLoS One 9(6): e99625.

2 Installation and Setup

if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

BiocManager::install(c("VISTA", "airway", "org.Hs.eg.db"))
install.packages("ggplot2")
# Load required packages
library(VISTA)
library(ggplot2)         # For plotting functions
library(airway)          # Dataset
library(org.Hs.eg.db)    # Human gene annotations
library(magrittr)    # %>% 

If required packages are missing, install them before rendering this vignette.

3 Data Preparation

3.1 Load the airway dataset

# Load the SummarizedExperiment object
data("airway", package = "airway")

# Examine the structure
airway
#> class: RangedSummarizedExperiment 
#> dim: 63677 8 
#> metadata(1): ''
#> assays(1): counts
#> rownames(63677): ENSG00000000003 ENSG00000000005 ... ENSG00000273492
#>   ENSG00000273493
#> rowData names(10): gene_id gene_name ... seq_coord_system symbol
#> colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
#> colData names(9): SampleName cell ... Sample BioSample

3.2 Extract counts and metadata

# Extract count matrix
counts_matrix <- assay(airway, "counts")

# Preview counts (first 5 genes, first 4 samples)
counts_matrix[1:5, 1:4]
#>                 SRR1039508 SRR1039509 SRR1039512 SRR1039513
#> ENSG00000000003        679        448        873        408
#> ENSG00000000005          0          0          0          0
#> ENSG00000000419        467        515        621        365
#> ENSG00000000457        260        211        263        164
#> ENSG00000000460         60         55         40         35

# Extract sample metadata
sample_metadata <- as.data.frame(colData(airway))
sample_metadata
#>            SampleName    cell   dex albut        Run avgLength Experiment
#> SRR1039508 GSM1275862  N61311 untrt untrt SRR1039508       126  SRX384345
#> SRR1039509 GSM1275863  N61311   trt untrt SRR1039509       126  SRX384346
#> SRR1039512 GSM1275866 N052611 untrt untrt SRR1039512       126  SRX384349
#> SRR1039513 GSM1275867 N052611   trt untrt SRR1039513        87  SRX384350
#> SRR1039516 GSM1275870 N080611 untrt untrt SRR1039516       120  SRX384353
#> SRR1039517 GSM1275871 N080611   trt untrt SRR1039517       126  SRX384354
#> SRR1039520 GSM1275874 N061011 untrt untrt SRR1039520       101  SRX384357
#> SRR1039521 GSM1275875 N061011   trt untrt SRR1039521        98  SRX384358
#>               Sample    BioSample
#> SRR1039508 SRS508568 SAMN02422669
#> SRR1039509 SRS508567 SAMN02422675
#> SRR1039512 SRS508571 SAMN02422678
#> SRR1039513 SRS508572 SAMN02422670
#> SRR1039516 SRS508575 SAMN02422682
#> SRR1039517 SRS508576 SAMN02422673
#> SRR1039520 SRS508579 SAMN02422683
#> SRR1039521 SRS508580 SAMN02422677

# Simplify treatment labels for clarity
sample_metadata$treatment <- ifelse(
  sample_metadata$dex == "trt",
  "Dexamethasone",
  "Untreated"
)

3.3 Prepare data for VISTA

VISTA includes helper functions that standardize counts and sample metadata before calling create_vista():

prepared_counts <- read_vista_counts(
  counts_matrix,
  format = "matrix"
)

prepared_samples <- read_vista_metadata(
  sample_metadata %>%
    tibble::as_tibble() %>%
    dplyr::rename("sample_names" = "Run")
)

matched_inputs <- match_vista_inputs(prepared_counts, prepared_samples)

# Preview the create_vista-ready count table
matched_inputs$counts[1:5, 1:5]
#>                         gene_id SRR1039508 SRR1039509 SRR1039512 SRR1039513
#> ENSG00000000003 ENSG00000000003        679        448        873        408
#> ENSG00000000005 ENSG00000000005          0          0          0          0
#> ENSG00000000419 ENSG00000000419        467        515        621        365
#> ENSG00000000457 ENSG00000000457        260        211        263        164
#> ENSG00000000460 ENSG00000000460         60         55         40         35

For a more complete guide covering file-derived sample-name repair and starter metadata generation with derive_vista_metadata(), see the pkgdown article Preparing Counts and Metadata for VISTA.

The matched sample sheet now has stable sample_names aligned to the count columns:

matched_inputs$sample_info[, c("sample_names", "cell", "treatment", "dex")]
#>            sample_names    cell     treatment   dex
#> SRR1039508   SRR1039508  N61311     Untreated untrt
#> SRR1039509   SRR1039509  N61311 Dexamethasone   trt
#> SRR1039512   SRR1039512 N052611     Untreated untrt
#> SRR1039513   SRR1039513 N052611 Dexamethasone   trt
#> SRR1039516   SRR1039516 N080611     Untreated untrt
#> SRR1039517   SRR1039517 N080611 Dexamethasone   trt
#> SRR1039520   SRR1039520 N061011     Untreated untrt
#> SRR1039521   SRR1039521 N061011 Dexamethasone   trt

4 Create VISTA Object

4.1 Using DESeq2 backend

The primary method for creating a VISTA object:

# Create VISTA object with DESeq2 backend
vista <- create_vista(
  counts = matched_inputs$counts,
  sample_info = matched_inputs$sample_info,
  column_geneid = matched_inputs$column_geneid,
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "deseq2",
  min_counts = 10,
  min_replicates = 2,
  log2fc_cutoff = 1.0,
  pval_cutoff = 0.05,
  p_value_type = "padj"
)

# Examine the VISTA object
vista
#> class: VISTA 
#> dim: 17199 8 
#> metadata(12): de_results de_summary ... design comparison
#> assays(1): norm_counts
#> rownames(17199): ENSG00000000003 ENSG00000000419 ... ENSG00000273487
#>   ENSG00000273488
#> rowData names(1): baseMean
#> colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
#> colData names(11): SampleName cell ... sizeFactor sample_names

The VISTA object stores:

  • Normalized counts in assays
  • Sample metadata in colData
  • Gene annotations in rowData
  • DE results in metadata

4.2 Validate object integrity

create_vista() runs validation by default (validate = TRUE). You can also run it explicitly:

validate_vista(vista, level = "full")

For advanced users importing a pre-built SummarizedExperiment, use as_vista() and then validate:

se <- SummarizedExperiment::SummarizedExperiment(
  assays = list(norm_counts = norm_counts(vista)),
  colData = S4Vectors::DataFrame(sample_info(vista), row.names = sample_info(vista)$sample_names),
  rowData = S4Vectors::DataFrame(row_data(vista), row.names = rownames(norm_counts(vista)))
)
vista2 <- as_vista(se, group_column = "treatment")
validate_vista(vista2, level = "full")

4.3 Alternative: Using edgeR backend

# Create VISTA object with edgeR backend
vista_edger <- create_vista(
  counts = count_data,
  sample_info = sample_info,
  column_geneid = "gene_id",
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "edger",  # Use edgeR instead of DESeq2
  min_counts = 10,
  min_replicates = 2,
  log2fc_cutoff = 1.0,
  pval_cutoff = 0.05,
  p_value_type = "padj"
)

4.4 Alternative: Using limma-voom backend

# Create VISTA object with limma-voom backend
vista_limma <- create_vista(
  counts = count_data,
  sample_info = sample_info,
  column_geneid = "gene_id",
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "limma",
  min_counts = 10,
  min_replicates = 2,
  log2fc_cutoff = 1.0,
  pval_cutoff = 0.05,
  p_value_type = "padj"
)

4.5 Advanced: covariates, design formula, and consensus mode

# Covariate-adjusted model (additive design)
vista_cov <- create_vista(
  counts = count_data,
  sample_info = sample_info,
  column_geneid = "gene_id",
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "deseq2",
  covariates = c("cell")
)

# Equivalent explicit model formula
vista_formula <- create_vista(
  counts = count_data,
  sample_info = sample_info,
  column_geneid = "gene_id",
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "deseq2",
  design_formula = "~ cell + treatment"
)

# Run both DESeq2 and edgeR and keep consensus as active source
vista_both <- create_vista(
  counts = count_data,
  sample_info = sample_info,
  column_geneid = "gene_id",
  group_column = "treatment",
  group_numerator = "Dexamethasone",
  group_denominator = "Untreated",
  method = "both",
  consensus_mode = "intersection",  # or "union"
  result_source = "consensus"       # or "deseq2"/"edger"
)

# Access source-specific outputs
comparisons(vista_both, source = "consensus")
comparisons(vista_both, source = "deseq2")
comparisons(vista_both, source = "edger")

# Switch the active source used by plotting functions
vista_both <- set_de_source(vista_both, "edger")

4.6 Add gene annotations

Enhance the object with gene symbols and descriptions:

vista <- set_rowdata(
  vista,
  orgdb = org.Hs.eg.db,
  columns = c("SYMBOL", "GENENAME", "ENTREZID"),
  keytype = "ENSEMBL"
)

# View updated gene annotations
head(rowData(vista))
#> DataFrame with 6 rows and 4 columns
#>                  baseMean      SYMBOL               GENENAME    ENTREZID
#>                 <numeric> <character>            <character> <character>
#> ENSG00000000003  709.7752      TSPAN6          tetraspanin 6        7105
#> ENSG00000000419  521.1362        DPM1 dolichyl-phosphate m..        8813
#> ENSG00000000457  237.5619       SCYL3 SCY1 like pseudokina..       57147
#> ENSG00000000460   58.0338       FIRRM FIGNL1 interacting r..       55732
#> ENSG00000000971 5826.6231         CFH    complement factor H        3075
#> ENSG00000001036 1284.4108       FUCA2   alpha-L-fucosidase 2        2519

5 Explore the Results

5.1 Access differential expression results

# Get comparison names
comp_names <- names(comparisons(vista))
comp_names
#> [1] "Dexamethasone_VS_Untreated"

# Get DE results for the comparison
de_results <- comparisons(vista)[[1]]
head(de_results)
#>                         gene_id   baseMean      log2fc      lfcSE       stat
#> ENSG00000000003 ENSG00000000003  709.77518 -0.38027500 0.17108885 -2.2226756
#> ENSG00000000419 ENSG00000000419  521.13616  0.20227695 0.09533472  2.1217555
#> ENSG00000000457 ENSG00000000457  237.56187  0.03272066 0.12278673  0.2664837
#> ENSG00000000460 ENSG00000000460   58.03383 -0.11851609 0.30815777 -0.3845955
#> ENSG00000000971 ENSG00000000971 5826.62312  0.43942357 0.25944519  1.6937048
#> ENSG00000001036 ENSG00000001036 1284.41081 -0.24322719 0.11682555 -2.0819691
#>                     pvalue      padj regulation
#> ENSG00000000003 0.02623769 0.1206592      Other
#> ENSG00000000419 0.03385828 0.1448222      Other
#> ENSG00000000457 0.78986672 0.9229371      Other
#> ENSG00000000460 0.70053714 0.8844114      Other
#> ENSG00000000971 0.09032139 0.2843077      Other
#> ENSG00000001036 0.03734529 0.1541257      Other

# Get DEG summary
deg_summary(vista)
#> $Dexamethasone_VS_Untreated
#>   regulation     n
#> 1       Down   388
#> 2      Other 16346
#> 3         Up   465

# Get analysis cutoffs
cutoffs(vista)
#> $log2fc
#> [1] 1
#> 
#> $pval
#> [1] 0.05
#> 
#> $p_value_type
#> [1] "padj"
#> 
#> $method
#> [1] "deseq2"
#> 
#> $min_counts
#> [1] 10
#> 
#> $min_replicates
#> [1] 2
#> 
#> $covariates
#> character(0)
#> 
#> $design_formula
#> NULL
#> 
#> $consensus_mode
#> NULL
#> 
#> $consensus_log2fc
#> NULL
#> 
#> $active_source
#> [1] "deseq2"

5.2 Count significant genes

# Extract upregulated genes
up_genes <- get_genes_by_regulation(
  vista,
  sample_comparisons = comp_names[1],
  regulation = "Up",
  #top_n = 50
)

# Extract downregulated genes
down_genes <- get_genes_by_regulation(
  vista,
  sample_comparisons = comp_names[1],
  regulation = "Down",
  #top_n = 50
)

# Summary
cat("Upregulated genes:", length(up_genes[[1]]), "\n")
#> Upregulated genes: 465
cat("Downregulated genes:", length(down_genes[[1]]), "\n")
#> Downregulated genes: 388

6 Quality Control Visualizations

6.1 Sample Correlation Heatmap

Check sample relationships and potential batch effects.

6.1.1 Basic correlation heatmap

  get_corr_heatmap(vista, label_size = 18,base_size = 18, viridis_direction = -1)

6.1.2 Customize color scheme

# Reverse viridis color direction
get_corr_heatmap(
  vista,
  viridis_direction = -1,
  viridis_option = "plasma",
  label_size = 18,
  base_size = 18
)

6.1.3 Show correlation values

# Display correlation coefficients
get_corr_heatmap(
  vista,
  viridis_direction = -1,
  show_corr_values = TRUE,
  col_corr_values = 'white',
  viridis_option = "mako",
  label_size = 14,
  base_size = 14
) 

6.2 Principal Component Analysis (PCA)

Visualize sample clustering and variation.

6.2.1 Basic PCA with labels

get_pca_plot(
  vista,
  label = TRUE,label_size = 5
)

6.2.2 PCA colored by different metadata

# Shape points by cell line
get_pca_plot(
  vista,
  label = TRUE,
  label_size = 5,
  shape_by = "cell"
)

6.2.3 PCA with top variable genes

# Use top 500 most variable genes
get_pca_plot(
  vista,
  top_n_genes = 500,
  show_clusters = TRUE,
  shape_by = "cell"
)

6.2.4 PCA with custom circle size

# Larger points for better visibility
get_pca_plot(
  vista,
  label = TRUE,
  point_size = 15,label_size = 5
)

6.2.5 PCA without labels

# Clean plot without sample labels
get_pca_plot(
  vista,
  label = FALSE,
  point_size = 12
)

6.3 Multidimensional Scaling (MDS)

Alternative dimensionality reduction method.

6.3.1 Basic MDS plot

get_mds_plot(
  vista,
  label = TRUE
)

6.3.2 MDS with top variable genes

get_mds_plot(
  vista,
  top_n_genes = 500,
  label = TRUE
)

6.3.3 MDS with custom shapes

# Shape points by cell line
get_mds_plot(
  vista,
  label = TRUE,
  shape_by = "cell"
)

6.4 Uniform Manifold Approximation and Projection (UMAP)

Non-linear sample embedding for exploratory structure.

6.4.1 Basic UMAP plot

get_umap_plot(
  vista,
  label = TRUE
)

6.4.2 UMAP colored by a user-defined metadata column

get_umap_plot(
  vista,
  color_by = "cell",
  shape_by = "treatment",
  label = TRUE
)

7 Differential Expression Visualizations

7.1 DEG Count Summary

7.1.1 Basic count barplot

get_deg_count_barplot(vista)

7.1.2 Faceted by regulation

get_deg_count_barplot(
  vista,
  facet_by = "regulation"
)

7.2 Volcano Plot

Classic volcano plot showing log2FC vs -log10(p-value).

7.2.1 Basic volcano plot

get_volcano_plot(
  vista,
  sample_comparison = comp_names[1]
)

7.2.2 Customize cutoffs and labels

get_volcano_plot(
  vista,
  sample_comparison = comp_names[1],
  log2fc_cutoff = 1.5,
  pval_cutoff = 0.01,
  label_size = 3,
  display_id = "SYMBOL"
)

7.2.3 Custom colors

# Custom colors for up/down regulated genes
get_volcano_plot(
  vista,
  sample_comparison = comp_names[1],
  col_up = "red",
  col_down = "blue",
  display_id = "SYMBOL"
)

7.3 MA Plot

Mean expression vs log2 fold-change.

7.3.1 Basic MA plot

get_ma_plot(
  vista,
  sample_comparison = comp_names[1]
)

7.3.2 Label top genes

get_ma_plot(
  vista,
  sample_comparison = comp_names[1],
  label_n = 10,
  point_size = 2,
  display_id = "SYMBOL"
)

7.3.3 Custom cutoffs

get_ma_plot(
  vista,
  sample_comparison = comp_names[1],
  label_n = 5,
  display_id = "SYMBOL",
  point_size = 1.5,
  alpha = 0.8
)

8 Expression Pattern Analysis

8.1 Prepare gene sets

# Get top 50 DEGs by adjusted p-value
de_table <- comparisons(vista)[[1]]

top_degs <- get_genes_by_regulation(vista, names(comparisons(vista))[1],regulation = "Both",top_n = 50)[[1]]  # top 50 by abs fold change 

top_up <- get_genes_by_regulation(vista, names(comparisons(vista))[1],regulation = "Up",top_n = 50)[[1]]  

top_down <- get_genes_by_regulation(vista, names(comparisons(vista))[1],regulation = "Down",top_n = 50)[[1]]  


cat("Top upregulated genes:\n")
#> Top upregulated genes:
print(top_up)
#>  [1] "ENSG00000179593" "ENSG00000109906" "ENSG00000250978" "ENSG00000132518"
#>  [5] "ENSG00000171819" "ENSG00000127954" "ENSG00000249364" "ENSG00000137673"
#>  [9] "ENSG00000100033" "ENSG00000168481" "ENSG00000168309" "ENSG00000264868"
#> [13] "ENSG00000152583" "ENSG00000163884" "ENSG00000177575" "ENSG00000127324"
#> [17] "ENSG00000101342" "ENSG00000270689" "ENSG00000268894" "ENSG00000152779"
#> [21] "ENSG00000128045" "ENSG00000096060" "ENSG00000152463" "ENSG00000173838"
#> [25] "ENSG00000273259" "ENSG00000101347" "ENSG00000187288" "ENSG00000219565"
#> [29] "ENSG00000211445" "ENSG00000143127" "ENSG00000128917" "ENSG00000170214"
#> [33] "ENSG00000163083" "ENSG00000178723" "ENSG00000248187" "ENSG00000157152"
#> [37] "ENSG00000170323" "ENSG00000231246" "ENSG00000233117" "ENSG00000157514"
#> [41] "ENSG00000189221" "ENSG00000165995" "ENSG00000182836" "ENSG00000112936"
#> [45] "ENSG00000269289" "ENSG00000174697" "ENSG00000179094" "ENSG00000187193"
#> [49] "ENSG00000006788" "ENSG00000102760"
cat("\nTop downregulated genes:\n")
#> 
#> Top downregulated genes:
print(top_down)
#>  [1] "ENSG00000128285" "ENSG00000267339" "ENSG00000019186" "ENSG00000183454"
#>  [5] "ENSG00000146006" "ENSG00000122679" "ENSG00000155897" "ENSG00000143494"
#>  [9] "ENSG00000141469" "ENSG00000108700" "ENSG00000162692" "ENSG00000175489"
#> [13] "ENSG00000183092" "ENSG00000250657" "ENSG00000136267" "ENSG00000214814"
#> [17] "ENSG00000261121" "ENSG00000105989" "ENSG00000122877" "ENSG00000188176"
#> [21] "ENSG00000131771" "ENSG00000165272" "ENSG00000184564" "ENSG00000079101"
#> [25] "ENSG00000188501" "ENSG00000119714" "ENSG00000223811" "ENSG00000130487"
#> [29] "ENSG00000166670" "ENSG00000165388" "ENSG00000013293" "ENSG00000123405"
#> [33] "ENSG00000145777" "ENSG00000140600" "ENSG00000124134" "ENSG00000146250"
#> [37] "ENSG00000116991" "ENSG00000126878" "ENSG00000197046" "ENSG00000128165"
#> [41] "ENSG00000084710" "ENSG00000173110" "ENSG00000123689" "ENSG00000106003"
#> [45] "ENSG00000181634" "ENSG00000154864" "ENSG00000182732" "ENSG00000136999"
#> [49] "ENSG00000015520" "ENSG00000095585"

8.2 Expression Heatmaps

8.2.1 Basic heatmap

get_expression_heatmap(vista)

8.2.2 Heatmap with explicit gene set

get_expression_heatmap(
  vista,
  sample_group = levels(colData(vista)$treatment),
  genes = top_degs,
  display_id = "SYMBOL",
  show_row_names = TRUE
)

8.2.3 Heatmap with k-means clustering

get_expression_heatmap(
  vista,
  sample_group = levels(colData(vista)$treatment),
  genes = top_degs,
  display_id = "SYMBOL",
  kmeans_k = 3,
  show_row_names = TRUE
)

8.2.4 Heatmap with column annotations

get_expression_heatmap(
  vista,
  sample_group = levels(colData(vista)$treatment),
  genes = top_degs,
  show_row_names = TRUE,
  display_id = "SYMBOL",
  kmeans_k = 3,
  cluster_row_slice = FALSE,
  summarise_replicates = FALSE,
  annotate_columns = TRUE
)

8.2.5 Heatmap with multiple column annotations and cluster_by

# Use multiple sample-level columns in top annotation.
# By default, columns are split by the first annotation column.


get_expression_heatmap(
  vista,
  sample_group = levels(colData(vista)$treatment),
  genes = top_degs,
  show_row_names = FALSE,
  display_id = "SYMBOL",
  summarise_replicates = FALSE,
  annotate_columns = c("treatment", "cell"),
  cluster_by = "cell"
)

8.2.6 Heatmap showing each replicate

get_expression_heatmap(
  vista,
  sample_group = levels(colData(vista)$treatment),
  genes = top_degs,
  display_id = "SYMBOL",
  summarise_replicates = FALSE,
  show_row_names = FALSE,
  annotate_columns = TRUE,
  kmeans_k = 2
)

8.3 Expression Barplots

8.3.1 Basic barplot

get_expression_barplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL",
)+theme_minimal(base_size = 16)

8.3.2 Log-transformed with statistics

# Add statistical comparisons between groups
get_expression_barplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL",
  log_transform = TRUE,
  stats_group = TRUE  # Enable statistical annotations
) + theme_minimal(base_size = 16)

8.3.3 Per-sample barplot for selected genes

get_expression_barplot(
  vista,
  genes = top_up[1:2],
  display_id = "SYMBOL",
  by = "sample",
  sample_order = "group"
)+ theme(text = element_text(size = 16))

8.3.4 Compare up and down regulated genes

# Compare expression of both up- and down-regulated genes
selected_genes <- c(top_up[1:3], top_down[1:3])
get_expression_barplot(
  vista,
  genes = selected_genes,
  display_id = "SYMBOL",
  log_transform = TRUE,
  stats_group = TRUE
)+ theme(text = element_text(size = 16))

8.4 Expression Boxplots

8.4.1 Basic boxplot

get_expression_boxplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL"
)+ theme(text = element_text(size = 16))

8.4.2 Boxplot without faceting

# All genes overlaid on same plot
get_expression_boxplot(
  vista,
  genes = top_up[1:3],
  display_id = "SYMBOL",
  facet_by = "none"
) + theme(text = element_text(size = 16))

8.4.3 Boxplot with faceting by gene

# Each gene in separate panel - must specify facet_by = "gene"
get_expression_boxplot(
  vista,
  genes = top_up[1:3],
  display_id = "SYMBOL",
  facet_by = "gene",
  facet_scales = "free_y"
) + theme(text = element_text(size = 16))

8.4.4 Boxplot with gene facets AND statistics

# Each gene in separate panel WITH statistical comparisons
get_expression_boxplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL",
  log_transform = TRUE,
  facet_by = "gene",
  facet_scales = "free_y",
  stats_group = TRUE,  # Add statistics to each gene panel
  p.label = "p.signif"
)+ theme(text = element_text(size = 16))

8.4.5 Pooled genes with statistics

# Pool all genes together for group comparison with statistical test
get_expression_boxplot(
  vista,
  genes = top_up[1:5],
  display_id = "SYMBOL",
  log_transform = TRUE,
  pool_genes = TRUE,
  facet_by = "none",
  stats_group = TRUE,  # Required for statistical annotations
  p.label = "p.signif"
)+ theme(text = element_text(size = 16))

8.4.6 Log-transformed with p-values

# Show statistical comparisons between treatment groups
get_expression_boxplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL",
  log_transform = TRUE,
  stats_group = TRUE,  # Enable statistical annotations
  p.label = "p.signif"
)+ theme(text = element_text(size = 16))

8.5 Expression Violin Plots

8.5.1 Basic violin plot

get_expression_violinplot(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL"
)+ theme(text = element_text(size = 16))

8.5.2 Violin with log2 transformation

get_expression_violinplot(
  vista,
  genes = top_up[1:4],,
  display_id = "SYMBOL",
  value_transform = "none"
)+ theme(text = element_text(size = 16))

8.5.3 Violin with z-score transformation

get_expression_violinplot(
  vista,
  genes = top_up[1:4],
  value_transform = "zscore"
)+ theme(text = element_text(size = 16))

8.6 Additional Expression Plots

8.6.1 Density plot

get_expression_density(
  vista,
  genes = top_up[1:50],
  log_transform = TRUE
)+ theme(text = element_text(size = 16))

8.6.2 Scatter plot (sample vs sample)

# Compare two samples
samples <- colnames(vista)
get_expression_scatter(
  vista,
  sample_x = samples[1],
  sample_y = samples[2],
  log_transform = TRUE,
  label_n = 50,display_id = "SYMBOL",label_size = 4
)+ theme(text = element_text(size = 16))

8.6.3 Line plot (expression across samples)

get_expression_lineplot(
  vista,
  genes = top_up[1:3],
  log_transform = TRUE,display_id = "SYMBOL",
  by = "sample",facet_by = "none",
  group_column = "treatment",sample_group = c("Untreated","Dexamethasone")
)

8.6.4 Lollipop plot

get_expression_lollipop(
  vista,
  genes = top_up[1:4],
  display_id = "SYMBOL",
  log_transform = TRUE
)

8.6.5 Per-sample lollipop plot

get_expression_lollipop(
  vista,
  genes = top_up[1:2],
  display_id = "SYMBOL",
  by = "sample",
  sample_order = "expression"
)

8.6.6 Joyplot by treatment group

# Ridges by treatment group - shows distribution for each group
get_expression_joyplot(
  vista,
  genes = top_up[1:5],
  log_transform = TRUE,
  y_by = "group",      # Each treatment group gets a ridge
  color_by = "group"   # Color by treatment group
)

8.6.7 Joyplot by sample

# Ridges by individual sample - shows distribution for each sample
get_expression_joyplot(
  vista,
  genes = top_up[1:3],
  log_transform = TRUE,
  y_by = "sample",     # Each sample gets a ridge
  color_by = "group"   # Color by treatment group
)

8.6.8 Raincloud plot (expression)

get_expression_raincloud(
  vista,
  genes = top_up,
  value_transform = "log2",
  summarise = TRUE,
  facet_by = "none",
  id.long.var = "gene",
  stats_group = TRUE
)

9 Functional Enrichment Analysis

9.1 MSigDB Enrichment

9.1.1 Hallmark gene sets - Upregulated

msig_up <- get_msigdb_enrichment(
  vista,
  sample_comparison = comp_names[1],
  regulation = "Up",
  msigdb_category = "H",  # Hallmark gene sets
  species = "Homo sapiens",
  from_type = "ENSEMBL"
)

# View top enriched pathways
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  head(msig_up$enrich@result[, c("Description", "pvalue", "p.adjust", "Count")])
}
#>                                                                           Description
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB                     HALLMARK_TNFA_SIGNALING_VIA_NFKB
#> HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
#> HALLMARK_ADIPOGENESIS                                           HALLMARK_ADIPOGENESIS
#> HALLMARK_UV_RESPONSE_DN                                       HALLMARK_UV_RESPONSE_DN
#> HALLMARK_HYPOXIA                                                     HALLMARK_HYPOXIA
#> HALLMARK_P53_PATHWAY                                             HALLMARK_P53_PATHWAY
#>                                                  pvalue     p.adjust Count
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB           3.153757e-13 1.576878e-11    28
#> HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION 1.136118e-06 2.840294e-05    19
#> HALLMARK_ADIPOGENESIS                      2.272727e-04 3.787878e-03    15
#> HALLMARK_UV_RESPONSE_DN                    3.664291e-04 4.580364e-03    12
#> HALLMARK_HYPOXIA                           7.260526e-04 6.050438e-03    14
#> HALLMARK_P53_PATHWAY                       7.260526e-04 6.050438e-03    14

9.1.2 Hallmark gene sets - Downregulated

msig_down <- get_msigdb_enrichment(
  vista,
  sample_comparison = comp_names[1],
  regulation = "Down",
  msigdb_category = "H",
  species = "Homo sapiens",
  from_type = "ENSEMBL"
)

if (!is.null(msig_down$enrich) && nrow(msig_down$enrich@result) > 0) {
  head(msig_down$enrich@result[, c("Description", "pvalue", "p.adjust", "Count")])
}
#>                                                                           Description
#> HALLMARK_P53_PATHWAY                                             HALLMARK_P53_PATHWAY
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB                     HALLMARK_TNFA_SIGNALING_VIA_NFKB
#> HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
#> HALLMARK_MTORC1_SIGNALING                                   HALLMARK_MTORC1_SIGNALING
#> HALLMARK_MYOGENESIS                                               HALLMARK_MYOGENESIS
#> HALLMARK_APOPTOSIS                                                 HALLMARK_APOPTOSIS
#>                                                  pvalue     p.adjust Count
#> HALLMARK_P53_PATHWAY                       1.669778e-06 8.348888e-05    17
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB           4.166736e-04 1.041684e-02    13
#> HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION 1.377933e-03 2.296555e-02    12
#> HALLMARK_MTORC1_SIGNALING                  4.197724e-03 4.197724e-02    11
#> HALLMARK_MYOGENESIS                        4.197724e-03 4.197724e-02    11
#> HALLMARK_APOPTOSIS                         8.312106e-03 6.597627e-02     9

9.2 Enrichment Visualizations

9.2.1 VISTA dotplot (default)

# VISTA's wrapper function
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  get_enrichment_plot(msig_up$enrich, top_n = 10)
}

9.2.2 Barplot (clusterProfiler native)

# Use generic barplot with enrichResult method
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  barplot(msig_up$enrich, showCategory = 10)
}

9.2.3 Dotplot with customization

# Customized dotplot with more categories
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  enrichplot::dotplot(msig_up$enrich, showCategory = 20, font.size = 12)
}

9.2.4 Network plot (clusterProfiler native)

# Gene-concept network showing gene-pathway relationships
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  enrichplot::cnetplot(msig_up$enrich, showCategory = 5)
}

9.2.5 Chord diagram (gene–pathway relationships)

The chord diagram reveals which hub genes drive multiple enriched pathways and how much redundancy exists across terms. Chords can be coloured by fold-change when a VISTA object is supplied.

# Pathway-coloured chord diagram (no VISTA object needed)
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  get_enrichment_chord(
    msig_up,
    top_n    = 6,
    color_by = "pathway",
    title    = "Gene-Pathway Membership (Up-regulated, Hallmark)"
  )
}

# Fold-change coloured chords
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  get_enrichment_chord(
    msig_up,
    vista  = vista,
    sample_comparison = names(comparisons(vista))[1],
    top_n      = 6,
    color_by   = "foldchange",
    display_id = "SYMBOL",
    title      = "Gene-Pathway Chord (coloured by log2FC)"
  )
}

# Show only hub genes shared across 2+ pathways
if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  chord_result <- get_enrichment_chord(
    msig_up,
    vista    = vista,
    sample_comparison   = names(comparisons(vista))[1],
    top_n        = 8,
    min_pathways = 2,
    color_by     = "regulation",
    display_id   = "SYMBOL",
    title        = "Hub Genes Bridging Multiple Pathways"
  )

  # Inspect hub genes returned invisibly
  if (length(chord_result$hub_genes)) {
    cat("Hub genes:", paste(head(chord_result$hub_genes, 10), collapse = ", "), "\n")
  }
}

#> Hub genes: ENSG00000003402, ENSG00000060718, ENSG00000067082, ENSG00000099860, ENSG00000102804, ENSG00000118689, ENSG00000120129, ENSG00000122641, ENSG00000130066, ENSG00000131979

9.3 Pathway-Specific Expression Heatmaps

Use enrichment output to extract pathway genes and visualize their expression directly.

9.3.1 Extract genes from top pathways

if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  pathway_gene_list <- get_pathway_genes(
    msig_up$enrich,
    top_n = 3,
    return_type = "list"
  )

  # Preview first few genes per pathway
  lapply(pathway_gene_list, head, n = 5)
}
#> $HALLMARK_TNFA_SIGNALING_VIA_NFKB
#> [1] "ENSG00000102804" "ENSG00000143878" "ENSG00000142871" "ENSG00000119508"
#> [5] "ENSG00000179094"
#> 
#> $HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
#> [1] "ENSG00000143878" "ENSG00000142871" "ENSG00000187498" "ENSG00000166825"
#> [5] "ENSG00000107796"
#> 
#> $HALLMARK_ADIPOGENESIS
#> [1] "ENSG00000148175" "ENSG00000241399" "ENSG00000187498" "ENSG00000127083"
#> [5] "ENSG00000095637"

9.3.2 Heatmap of genes from top enriched pathways

if (!is.null(msig_up$enrich) && nrow(msig_up$enrich@result) > 0) {
  get_pathway_heatmap(
    x = vista,
    enrichment = msig_up$enrich,
    sample_group = c("Untreated", "Dexamethasone"),
    top_n = 2,
    gene_mode = "union",
    max_genes = 60,
    value_transform = "zscore",
    display_id = "SYMBOL",
    annotate_columns = TRUE,
    summarise_replicates = FALSE,
    show_row_names = FALSE
  )
}

9.4 GO Enrichment

9.4.1 Biological Process

go_bp <- get_go_enrichment(
  vista,
  sample_comparison = comp_names[1],
  regulation = "Up",
  ont = "BP",  # Biological Process
  species = "Homo sapiens",
  from_type = "ENSEMBL"
)

if (!is.null(go_bp$enrich) && nrow(go_bp$enrich@result) > 0) {
  head(go_bp$enrich@result[, c("Description", "pvalue", "p.adjust", "Count")], n = 10)
}
#>                                                                                    Description
#> GO:0030198                                                   extracellular matrix organization
#> GO:0043062                                                extracellular structure organization
#> GO:0045229                                       external encapsulating structure organization
#> GO:0060986                                                         endocrine hormone secretion
#> GO:0032970                                          regulation of actin filament-based process
#> GO:0071375                                       cellular response to peptide hormone stimulus
#> GO:0050886                                                                   endocrine process
#> GO:0003012                                                               muscle system process
#> GO:0035360 positive regulation of peroxisome proliferator activated receptor signaling pathway
#> GO:0043500                                                                   muscle adaptation
#>                  pvalue    p.adjust Count
#> GO:0030198 3.993655e-07 0.000877259    20
#> GO:0043062 4.228448e-07 0.000877259    20
#> GO:0045229 4.475811e-07 0.000877259    20
#> GO:0060986 3.589831e-06 0.004386731     8
#> GO:0032970 4.017769e-06 0.004386731    22
#> GO:0071375 4.476256e-06 0.004386731    20
#> GO:0050886 7.259704e-06 0.006098152    10
#> GO:0003012 1.573717e-05 0.011566821    24
#> GO:0035360 2.458483e-05 0.016062090     4
#> GO:0043500 3.708999e-05 0.021236982    10

9.4.2 GO Visualization

if (!is.null(go_bp$enrich) && nrow(go_bp$enrich@result) > 0) {
  get_enrichment_plot(go_bp$enrich, top_n = 20)
}

9.5 Gene Set Enrichment Analysis (GSEA)

GSEA uses ranked gene lists to identify pathways enriched at the top or bottom of the ranking. VISTA automatically prepares the ranked list from your differential expression results.

9.5.1 GSEA with MSigDB Hallmark gene sets

# Run GSEA using VISTA's native function
gsea_results <- get_gsea(
  vista,
  sample_comparison = comp_names[1],
  set_type = "msigdb",
  from_type = "ENSEMBL",
  species = "Homo sapiens",
  msigdb_category = "H",  # Hallmark gene sets
  pvalueCutoff = 0.05,
  pAdjustMethod = "BH"
)

# Show results
if (!is.null(gsea_results$enrich) && nrow(gsea_results$enrich@result) > 0) {
  head(gsea_results$enrich@result[, c("Description", "NES", "pvalue", "p.adjust")], n = 10)
}
#>                                                                       Description
#> HALLMARK_ADIPOGENESIS                                       HALLMARK_ADIPOGENESIS
#> HALLMARK_ANDROGEN_RESPONSE                             HALLMARK_ANDROGEN_RESPONSE
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB                 HALLMARK_TNFA_SIGNALING_VIA_NFKB
#> HALLMARK_XENOBIOTIC_METABOLISM                     HALLMARK_XENOBIOTIC_METABOLISM
#> HALLMARK_APICAL_JUNCTION                                 HALLMARK_APICAL_JUNCTION
#> HALLMARK_OXIDATIVE_PHOSPHORYLATION             HALLMARK_OXIDATIVE_PHOSPHORYLATION
#> HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY
#> HALLMARK_IL2_STAT5_SIGNALING                         HALLMARK_IL2_STAT5_SIGNALING
#> HALLMARK_UV_RESPONSE_DN                                   HALLMARK_UV_RESPONSE_DN
#> HALLMARK_UV_RESPONSE_UP                                   HALLMARK_UV_RESPONSE_UP
#>                                               NES       pvalue     p.adjust
#> HALLMARK_ADIPOGENESIS                    2.009082 1.955439e-07 9.777196e-06
#> HALLMARK_ANDROGEN_RESPONSE               1.637375 2.058698e-03 3.097810e-02
#> HALLMARK_TNFA_SIGNALING_VIA_NFKB         1.600438 5.869613e-04 1.467403e-02
#> HALLMARK_XENOBIOTIC_METABOLISM           1.522777 3.097810e-03 3.097810e-02
#> HALLMARK_APICAL_JUNCTION                 1.507349 7.012408e-03 4.011578e-02
#> HALLMARK_OXIDATIVE_PHOSPHORYLATION       1.490505 2.547099e-03 3.097810e-02
#> HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY 1.477932 4.362144e-02 1.557909e-01
#> HALLMARK_IL2_STAT5_SIGNALING             1.470051 7.220840e-03 4.011578e-02
#> HALLMARK_UV_RESPONSE_DN                  1.437179 1.334729e-02 6.066949e-02
#> HALLMARK_UV_RESPONSE_UP                  1.437035 1.262173e-02 6.066949e-02

9.5.2 GSEA with GO Biological Process

# Run GSEA with GO terms
gsea_go <- get_gsea(
  vista,
  sample_comparison = comp_names[1],
  set_type = "go",
  from_type = "ENSEMBL",
  orgdb = org.Hs.eg.db,
  ont = "BP",  # Biological Process
  pvalueCutoff = 0.05
)

# Show results
if (!is.null(gsea_go$enrich) && nrow(gsea_go$enrich@result) > 0) {
  head(gsea_go$enrich@result[, c("Description", "NES", "pvalue", "p.adjust")], n = 10)
}
#>                                                                           Description
#> GO:0014888                                                 striated muscle adaptation
#> GO:0097501                                               stress response to metal ion
#> GO:0071276                                           cellular response to cadmium ion
#> GO:0035357               peroxisome proliferator activated receptor signaling pathway
#> GO:0071294                                              cellular response to zinc ion
#> GO:0035358 regulation of peroxisome proliferator activated receptor signaling pathway
#> GO:0071280                                            cellular response to copper ion
#> GO:0014896                                                         muscle hypertrophy
#> GO:0014897                                                striated muscle hypertrophy
#> GO:0003300                                                 cardiac muscle hypertrophy
#>                 NES       pvalue    p.adjust
#> GO:0014888 2.159841 4.762650e-06 0.006984229
#> GO:0097501 2.096181 4.178383e-06 0.006984229
#> GO:0071276 2.054801 1.421299e-05 0.012523671
#> GO:0035357 2.050339 1.247851e-04 0.042759711
#> GO:0071294 2.027774 3.458939e-06 0.006984229
#> GO:0035358 2.016158 2.092395e-04 0.044503071
#> GO:0071280 2.010250 1.083187e-04 0.039300585
#> GO:0014896 1.993604 5.614769e-05 0.026639921
#> GO:0014897 1.988450 8.102969e-05 0.033319408
#> GO:0003300 1.984857 1.999583e-04 0.044503071

9.5.3 GSEA enrichment overview

# Show all significant pathways using VISTA's visualization
if (!is.null(gsea_results$enrich) && nrow(gsea_results$enrich@result) > 0) {
  get_enrichment_plot(gsea_results$enrich, top_n = 15)
}

9.5.4 GSEA plot for top pathway

# Show enrichment plot for the top pathway
if (!is.null(gsea_results$enrich) && nrow(gsea_results$enrich@result) > 0) {
  # Create GSEA enrichment plot with running enrichment score
  enrichplot::gseaplot2(
    gsea_results$enrich,
    geneSetID = 1,  # Top pathway
    title = gsea_results$enrich@result$Description[1],
    pvalue_table = TRUE
  )
}

9.5.5 GSEA plot for multiple pathways

# Show top 3 pathways together
if (!is.null(gsea_results$enrich) && nrow(gsea_results$enrich@result) > 0) {
  enrichplot::gseaplot2(
    gsea_results$enrich,
    geneSetID = 1:3,  # Top 3 pathways
    pvalue_table = TRUE,
    ES_geom = "line"
  )
}

9.5.6 GSEA with GO visualization

# Visualize GO GSEA results
if (!is.null(gsea_go$enrich) && nrow(gsea_go$enrich@result) > 0) {
  get_enrichment_plot(gsea_go$enrich, top_n = 15)
}

9.6 KEGG Pathway Enrichment

9.6.1 KEGG upregulated genes

kegg_up <- get_kegg_enrichment(
  vista,
  sample_comparison = comp_names[1],
  regulation = "Up",
  species = "Homo sapiens",
  from_type = "ENSEMBL"
)

if (!is.null(kegg_up$enrich) && nrow(kegg_up$enrich@result) > 0) {
  head(kegg_up$enrich@result[, c("Description", "pvalue", "p.adjust", "Count")], n = 10)
}

9.6.2 KEGG downregulated genes

kegg_down <- get_kegg_enrichment(
  vista,
  sample_comparison = comp_names[1],
  regulation = "Down",
  species = "Homo sapiens",
  from_type = "ENSEMBL"
)

if (!is.null(kegg_down$enrich) && nrow(kegg_down$enrich@result) > 0) {
  head(kegg_down$enrich@result[, c("Description", "pvalue", "p.adjust", "Count")])
}

9.6.3 KEGG Visualization

if (!is.null(kegg_up$enrich) && nrow(kegg_up$enrich@result) > 0) {
  get_enrichment_plot(kegg_up$enrich, top_n = 15)
}

10 Fold-Change Analysis

10.1 Fold-change Matrix

Useful for comparing multiple comparisons:

fc_matrix <- get_foldchange_matrix(vista)
head(fc_matrix, n = 10)
#>                 Dexamethasone_VS_Untreated
#> ENSG00000000003                -0.38027500
#> ENSG00000000419                 0.20227695
#> ENSG00000000457                 0.03272066
#> ENSG00000000460                -0.11851609
#> ENSG00000000971                 0.43942357
#> ENSG00000001036                -0.24322719
#> ENSG00000001084                -0.03060707
#> ENSG00000001167                -0.49118392
#> ENSG00000001460                -0.13476909
#> ENSG00000001461                -0.04363732

10.2 Fold-change Barplot and Lollipop

10.2.1 Per-gene fold-change barplot

get_foldchange_barplot(
  vista,
  genes = top_up[1:3],
  sample_comparisons = comp_names,
  display_id = "SYMBOL",
  facet_by = "gene"
)

10.2.2 Per-gene fold-change lollipop

get_foldchange_lollipop(
  vista,
  sample_comparison = comp_names[1],
  genes = top_up[1:3],
  display_id = "SYMBOL",
  facet_by = "gene"
)

10.3 Fold-change Raincloud

get_foldchange_raincloud(
  vista,
  sample_comparisons = comp_names,
  facet_by = "none",
  id.long.var = "gene_id",
  stats_group = TRUE
)

10.4 Fold-change Heatmap

10.4.1 Basic FC heatmap

get_foldchange_heatmap(vista)

10.4.2 FC heatmap for selected genes

# Select genes with large fold-changes
fc_genes <- rownames(fc_matrix)[abs(fc_matrix[, 1]) > 2][1:30]

get_foldchange_heatmap(
  vista,
  sample_comparisons = comp_names,
  genes = fc_genes,
  display_id = "SYMBOL"
)

10.4.3 FC heatmap with gene names

get_foldchange_heatmap(
  vista,
  sample_comparisons = comp_names,
  genes = fc_genes[1:25],
  show_row_names = TRUE,
  display_id = "SYMBOL"
)

10.4.4 FC heatmap for specific gene set

# Use top upregulated genes
get_foldchange_heatmap(
  vista,
  sample_comparisons = comp_names,
  genes = top_up,
  show_row_names = TRUE,
  display_id = "SYMBOL"
)

11 Export Results

11.1 Export DE results to file

# Export complete DE table with annotations
de_annotated <- merge(
  comparisons(vista)[[1]],
  as.data.frame(rowData(vista)),
  by.x = "gene_id",
  by.y = "row.names"
)

# Write to CSV
write.csv(
  de_annotated,
  file = "airway_dexamethasone_vs_untreated_DE_results.csv",
  row.names = FALSE
)

# Export significant genes only
sig_genes <- de_annotated[de_annotated$regulation %in% c("Up", "Down"), ]
write.csv(
  sig_genes,
  file = "airway_significant_DEGs.csv",
  row.names = FALSE
)

11.2 Save VISTA object

# Save the complete VISTA object for later use
saveRDS(vista, file = "airway_vistaect.rds")

# Load it back
# vista <- readRDS("airway_vistaect.rds")

12 Summary

12.1 Workflow Completed

In this workflow, we:

  1. ✅ Loaded the airway RNA-seq dataset
  2. ✅ Created a VISTA object with DESeq2 analysis
  3. ✅ Added gene annotations from org.Hs.eg.db
  4. ✅ Performed quality control (PCA, MDS, UMAP, correlation)
  5. ✅ Visualized differential expression (volcano, MA plots)
  6. ✅ Analyzed expression patterns (heatmaps, barplots, boxplots, violin/raincloud plots, and more)
  7. ✅ Performed functional enrichment (MSigDB, GO, KEGG)
  8. ✅ Explored fold-change patterns
  9. ✅ Generated publication-ready visualizations

12.2 Key Features Demonstrated

  • Single-function workflow: create_vista() handles DE analysis
  • Consistent interface: All plot functions follow the same pattern
  • Flexible visualizations: Easy to customize colors, labels, thresholds
  • Multiple plot types: 29+ plotting functions for every analysis need
  • Integrated enrichment: No need to wrangle gene IDs manually
  • Publication-ready: All plots are ggplot2/ComplexHeatmap objects

12.3 Plotting Functions Used

12.3.1 QC Plots

  • get_corr_heatmap() - Sample correlation
  • get_pca_plot() - Principal component analysis
  • get_mds_plot() - Multidimensional scaling
  • get_umap_plot() - Nonlinear sample embedding

12.3.2 DE Visualization

  • get_deg_count_barplot() - DEG summary counts
  • get_volcano_plot() - Volcano plots
  • get_ma_plot() - MA plots

12.3.3 Expression Plots

  • get_expression_heatmap() - Expression heatmaps
  • get_expression_barplot() - Expression barplots
  • get_expression_boxplot() - Expression boxplots
  • get_expression_violinplot() - Violin plots
  • get_expression_density() - Density plots
  • get_expression_scatter() - Sample-vs-sample scatter
  • get_expression_lineplot() - Expression across samples
  • get_expression_lollipop() - Lollipop plots
  • get_expression_joyplot() - Ridgeline plots

12.3.4 Enrichment Plots

  • get_enrichment_plot() - Generic enrichment visualization
  • get_msigdb_enrichment() - MSigDB enrichment
  • get_go_enrichment() - GO enrichment
  • get_kegg_enrichment() - KEGG pathway enrichment
  • get_pathway_genes() - Extract genes driving enriched pathways
  • get_pathway_heatmap() - Plot pathway-derived expression heatmaps
  • get_enrichment_chord() - Chord diagram of gene-pathway relationships

12.3.5 Fold-Change

  • get_foldchange_matrix() - Extract FC matrix
  • get_foldchange_heatmap() - Visualize FC patterns

12.4 Next Steps

  • Try with your own data
  • Explore edgeR backend: method = "edger"
  • Explore limma-voom backend: method = "limma"
  • Test multiple comparisons simultaneously
  • Customize plots with ggplot2 themes
  • Generate automated reports with run_vista_report()
  • Integrate with downstream tools

13 Session Information

sessionInfo()
#> R version 4.6.0 RC (2026-04-17 r89917)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.24-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] magrittr_2.0.5              org.Hs.eg.db_3.23.1        
#>  [3] AnnotationDbi_1.73.1        airway_1.31.0              
#>  [5] SummarizedExperiment_1.41.1 Biobase_2.71.0             
#>  [7] GenomicRanges_1.63.2        Seqinfo_1.1.0              
#>  [9] IRanges_2.45.0              S4Vectors_0.49.2           
#> [11] BiocGenerics_0.57.1         generics_0.1.4             
#> [13] MatrixGenerics_1.23.0       matrixStats_1.5.0          
#> [15] ggplot2_4.0.2               VISTA_0.99.8               
#> [17] BiocStyle_2.39.0           
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.6.0           ggplotify_0.1.3         ggpp_0.6.0             
#>   [4] tibble_3.3.1            polyclip_1.10-7         enrichit_0.1.4         
#>   [7] lifecycle_1.0.5         httr2_1.2.2             rstatix_0.7.3          
#>  [10] doParallel_1.0.17       edgeR_4.9.8             processx_3.8.7         
#>  [13] lattice_0.22-9          MASS_7.3-65             backports_1.5.1        
#>  [16] limma_3.67.1            sass_0.4.10             rmarkdown_2.31         
#>  [19] jquerylib_0.1.4         yaml_2.3.12             otel_0.2.0             
#>  [22] ggtangle_0.1.1          EnhancedVolcano_1.29.1  DBI_1.3.0              
#>  [25] RColorBrewer_1.1-3      abind_1.4-8             purrr_1.2.2            
#>  [28] msigdbr_26.1.0          yulab.utils_0.2.4       tweenr_2.0.3           
#>  [31] rappdirs_0.3.4          aisdk_1.1.0             circlize_0.4.18        
#>  [34] gdtools_0.5.0           enrichplot_1.31.5       ggrepel_0.9.8          
#>  [37] tidytree_0.4.7          RSpectra_0.16-2         codetools_0.2-20       
#>  [40] DelayedArray_0.37.1     DOSE_4.5.1              ggforce_0.5.0          
#>  [43] shape_1.4.6.1           tidyselect_1.2.1        aplot_0.2.9            
#>  [46] farver_2.1.2            jsonlite_2.0.0          GetoptLong_1.1.1       
#>  [49] Formula_1.2-5           ggridges_0.5.7          iterators_1.0.14       
#>  [52] systemfonts_1.3.2       foreach_1.5.2           tools_4.6.0            
#>  [55] ggnewscale_0.5.2        treeio_1.35.0           Rcpp_1.1.1-1           
#>  [58] glue_1.8.1              gridExtra_2.3           SparseArray_1.11.13    
#>  [61] xfun_0.57               DESeq2_1.51.7           qvalue_2.43.0          
#>  [64] dplyr_1.2.1             withr_3.0.2             BiocManager_1.30.27    
#>  [67] fastmap_1.2.0           GGally_2.4.0            ggpointdensity_0.2.1   
#>  [70] callr_3.7.6             digest_0.6.39           R6_2.6.1               
#>  [73] gridGraphics_0.5-1      colorspace_2.1-2        Cairo_1.7-0            
#>  [76] GO.db_3.23.1            ggrain_0.1.2            dichromat_2.0-0.1      
#>  [79] RSQLite_2.4.6           tidyr_1.3.2             fontLiberation_0.1.0   
#>  [82] FNN_1.1.4.1             httr_1.4.8              htmlwidgets_1.6.4      
#>  [85] S4Arrays_1.11.1         scatterpie_0.2.6        ggstats_0.13.0         
#>  [88] uwot_0.2.4              pkgconfig_2.0.3         gtable_0.3.6           
#>  [91] blob_1.3.0              ComplexHeatmap_2.27.1   S7_0.2.1-1             
#>  [94] XVector_0.51.0          clusterProfiler_4.19.7  htmltools_0.5.9        
#>  [97] fontBitstreamVera_0.1.1 carData_3.0-6           bookdown_0.46          
#> [100] clue_0.3-68             scales_1.4.0            png_0.1-9              
#> [103] ggfun_0.2.0             knitr_1.51              rjson_0.2.23           
#> [106] reshape2_1.4.5          nlme_3.1-169            curl_7.0.0             
#> [109] GlobalOptions_0.1.4     cachem_1.1.0            stringr_1.6.0          
#> [112] parallel_4.6.0          pillar_1.11.1           grid_4.6.0             
#> [115] vctrs_0.7.3             ggpubr_0.6.3            car_3.1-5              
#> [118] tidydr_0.0.6            cluster_2.1.8.2         evaluate_1.0.5         
#> [121] tinytex_0.59            magick_2.9.1            cli_3.6.6              
#> [124] locfit_1.5-9.12         compiler_4.6.0          rlang_1.2.0            
#> [127] crayon_1.5.3            ggsignif_0.6.4          labeling_0.4.3         
#> [130] ps_1.9.3                forcats_1.0.1           plyr_1.8.9             
#> [133] fs_2.1.0                ggiraph_0.9.6           stringi_1.8.7          
#> [136] viridisLite_0.4.3       BiocParallel_1.45.0     assertthat_0.2.1       
#> [139] babelgene_22.9          Biostrings_2.79.5       lazyeval_0.2.3         
#> [142] GOSemSim_2.37.2         fontquiver_0.2.1        Matrix_1.7-5           
#> [145] patchwork_1.3.2         bit64_4.6.0-1           KEGGREST_1.51.1        
#> [148] statmod_1.5.1           igraph_2.2.3            broom_1.0.12           
#> [151] memoise_2.0.1           bslib_0.10.0            ggtree_4.1.2           
#> [154] bit_4.6.0               polynom_1.4-1           ape_5.8-1              
#> [157] gson_0.1.0

14 References

Appendix

  • Himes BE et al. (2014). RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells. PLoS One, 9(6), e99625.
  • Love MI, Huber W, Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15, 550.
  • Robinson MD, McCarthy DJ, Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139-140.
  • Subramanian A et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 102(43), 15545-15550.
  • Yu G et al. (2012). clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS, 16(5), 284-287.