---
title: "Introduction to sumExtras"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to sumExtras}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

# Set gtsummary print engine for proper rendering
options(gtsummary.print_engine = "gt")
```

```{r setup}
#| eval: false
library(sumExtras)
library(gtsummary)
library(dplyr)

# Apply the recommended JAMA theme
use_jama_theme()
```

```{r setup2}
#| warning: false
#| message: false
#| echo: false
library(sumExtras)
library(gtsummary)
library(dplyr)

# Apply the recommended JAMA theme
use_jama_theme()
```

## Overview

If you've worked with gtsummary before, you're familiar with the typical workflow of building summary tables: creating a base table with `tbl_summary()`, then progressively adding features like overall columns, p-values, and formatting tweaks. While gtsummary's modular approach provides flexibility, the same sequence of functions appears repeatedly in analysis scripts.

sumExtras streamlines this process by providing convenience functions that apply commonly used formatting patterns in a single step. The package handles three main pain points:

1. **Repetitive styling workflows** - Combining multiple formatting steps into one function call
2. **Inconsistent missing value displays** - Standardizing how NA values appear across tables
3. **Manual variable labeling** - Automating label assignment from data dictionaries

This vignette will get you started with the core functions. For more specialized workflows, see:

- `vignette("labeling")` - Comprehensive guide to automatic variable labeling across tables and plots
- `vignette("styling")` - Advanced table styling and formatting techniques

## The `extras()` Function

The signature function of this package, `extras()`, consolidates the most common table enhancements into a single step.
At minimum, it adds bold labels, removes the "Characteristic" header, and standardizes missing value display. With default settings, it also adds an overall column and p-values.

### Basic Usage

:::::: {style="display: flex; gap: 15px; margin-bottom: 20px; align-items: flex-start;"}

::: {style="flex: 1; max-width: 48%;"}

#### Standard gtsummary workflow

```{r extras-comparison-standard, eval=FALSE}
trial |>
  tbl_summary(by = trt) |>
  add_overall() |>
  add_p() |>
  bold_labels() |>
  modify_header(label ~ "")
```

:::

::: {style="flex: 1; max-width: 48%;"}

#### Equivalent using extras()

```{r extras-comparison-extras, eval=FALSE}
trial |>
  tbl_summary(by = trt) |>
  extras()
```

:::

::::::

```{r build-extras-comparison, echo=FALSE}
table_standard <- trial |>
  tbl_summary(by = trt) |>
  add_overall() |>
  add_p() |>
  bold_labels() |>
  modify_header(label ~ "")

table_extras <- trial |>
  tbl_summary(by = trt) |>
  extras()
```

:::::: {style="display: flex; gap: 15px; margin-bottom: 30px; align-items: flex-start;"}

::: {style="flex: 1; max-width: 48%;"}

```{r render-standard, echo=FALSE}
table_standard
```

:::

::: {style="flex: 1; max-width: 48%;"}

```{r render-extras, echo=FALSE}
table_extras
```

:::

::::::

Both approaches produce the same result, but `extras()` requires less code and ensures consistency across your analysis.
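
If you want to check the equivalence yourself rather than compare the rendered tables by eye, one option is to compare the underlying cell contents. The sketch below is only an illustration: the object names `standard` and `streamlined` are made up for this example, and whether the comparison reports any differences may depend on your installed versions of gtsummary and sumExtras.

```{r, eval=FALSE}
library(sumExtras)
library(gtsummary)
library(dplyr)

# Step-by-step gtsummary version
standard <- trial |>
  tbl_summary(by = trt) |>
  add_overall() |>
  add_p() |>
  bold_labels() |>
  modify_header(label ~ "")

# Single-call version
streamlined <- trial |>
  tbl_summary(by = trt) |>
  extras()

# Compare cell contents, ignoring rendered column headers and styling.
# all.equal() returns TRUE or a description of the differences it found.
all.equal(
  as_tibble(standard, col_labels = FALSE),
  as_tibble(streamlined, col_labels = FALSE)
)
```

Comparing the tibble representations sidesteps differences in HTML output, so any reported mismatch points at actual cell values rather than styling.
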

### Customizing Output

You can control which features are applied using the function arguments:

```{r}
# Table without p-values
trial |>
  tbl_summary(by = trt) |>
  extras(pval = FALSE)

# Table without overall column
trial |>
  tbl_summary(by = trt) |>
  extras(overall = FALSE)

# Overall column as last column (default is to set it as first)
trial |>
  tbl_summary(by = trt) |>
  extras(last = TRUE)
```

For projects with consistent table formatting requirements, you can define styling parameters once and reuse them:

```{r}
# Define standard table settings for a project
standard_table_args <- list(
  pval = TRUE,
  overall = TRUE,
  last = TRUE
)

# Apply consistently across multiple tables
trial |>
  select(age, grade, stage, trt) |>
  tbl_summary(by = trt) |>
  extras(.args = standard_table_args)
```

## Cleaning Missing Values

One subtle but important aspect of table presentation is how missing or undefined values are displayed. gtsummary tables can show various representations of missing data: "0 (NA%)", "NA (NA)", "NA, NA", etc. These inconsistencies create visual clutter and make tables harder to scan.
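
To make the goal of this cleanup concrete, here is a minimal base R sketch of the kind of normalization involved. `standardize_cells()` is a hypothetical helper written for this illustration, not a sumExtras function, and its two patterns are assumptions chosen only to cover the example strings above. The package's own matching rules may differ.

```{r}
# Hypothetical helper (NOT part of sumExtras): replace "empty-looking"
# cell strings with a single placeholder
standardize_cells <- function(x, placeholder = "---") {
  # Zero counts such as "0 (0%)" or "0 (NA%)"
  zero_count <- grepl("^0 \\((0|NA)%\\)$", x)
  # Cells made up only of NA tokens, such as "NA (NA)" or "NA, NA"
  all_na <- grepl("^NA( \\(NA\\)|, NA)$", x)
  x[zero_count | all_na] <- placeholder
  x
}

cells <- c("0 (NA%)", "NA (NA)", "NA, NA", "0 (0%)", "47 (25, 58)")
standardize_cells(cells)
```

Only the last cell, which holds real summary statistics, is left unchanged. Every other variant collapses to the same placeholder.
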
The `clean_table()` function (which is called automatically by `extras()`) standardizes all zero (`0 (0%)`) or missing value representations to "---":

```{r create-missing-data, echo=FALSE}
# Create data with some missing patterns
trial_missing <- trial |>
  mutate(
    age = if_else(trt == 'Drug B', NA_real_, age),
    marker = if_else(trt == 'Drug A', NA_real_, marker)
  )
```

:::::: {style="display: flex; gap: 15px; margin-bottom: 20px; align-items: flex-start;"}

::: {style="flex: 1; max-width: 48%;"}

#### Without cleaning

```{r clean-comparison-without, eval=FALSE}
trial_missing |>
  tbl_summary(by = trt)
```

:::

::: {style="flex: 1; max-width: 48%;"}

#### With clean_table()

```{r clean-comparison-with, eval=FALSE}
trial_missing |>
  tbl_summary(by = trt) |>
  clean_table()
```

:::

::::::

```{r build-clean-comparison, echo=FALSE}
table_without_clean <- trial_missing |>
  tbl_summary(by = trt)

table_with_clean <- trial_missing |>
  tbl_summary(by = trt) |>
  clean_table()
```

:::::: {style="display: flex; gap: 15px; margin-bottom: 30px; align-items: flex-start;"}

::: {style="flex: 1; max-width: 48%;"}

```{r render-without-clean, echo=FALSE}
table_without_clean
```

:::

::: {style="flex: 1; max-width: 48%;"}

```{r render-with-clean, echo=FALSE}
table_with_clean
```

:::

::::::

You can also use `clean_table()` independently if you prefer to build tables step-by-step:

```{r}
#| warning: false
#| message: false
trial_missing |>
  tbl_summary(by = trt) |>
  add_overall() |>
  add_p() |>
  clean_table()
```

## Quick Start: Automatic Labeling

One of the most time-consuming aspects of creating publication-ready tables is labeling variables with human-readable descriptions.
sumExtras provides a streamlined labeling system using data dictionaries:

```{r}
# Create a simple dictionary
dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "trt", "Chemotherapy Treatment",
  "age", "Age at Enrollment (years)",
  "marker", "Marker Level (ng/mL)",
  "stage", "T Stage",
  "grade", "Tumor Grade"
)

# Apply labels automatically
trial |>
  tbl_summary(by = trt, include = c(age, grade, marker)) |>
  add_auto_labels(dictionary = dictionary) |>
  extras()
```

The `add_auto_labels()` function handles several common situations:

- Pass a dictionary explicitly, or let it find one in your environment automatically
- Works with pre-labeled data (from haven, Hmisc, or manual labeling)
- Manual labels in `tbl_summary()` always override automatic ones
- Compatible with regression tables via `tbl_regression()`

### What About More Complex Labeling Workflows?

The labeling system goes well beyond this basic example. You can:

- Use one dictionary for both gtsummary tables and ggplot2 visualizations
- Control label priority when you have multiple label sources
- Set up cross-package workflows with `apply_labels_from_dictionary()`
- Understand how R's label attribute system works under the hood

For comprehensive coverage of these workflows and real-world examples, see **`vignette("labeling")`**.

## Basic Theme Setup

sumExtras is designed to work best with the JAMA compact theme. Use `use_jama_theme()` to apply this theme to all gtsummary tables in your session:

```{r}
# Apply JAMA compact theme (typically done once at the beginning)
use_jama_theme()
```

This is equivalent to calling `gtsummary::set_gtsummary_theme(gtsummary::theme_gtsummary_compact("jama"))` but provides a more convenient interface. You can reset to the default gtsummary theme at any time with `gtsummary::reset_gtsummary_theme()`.

For information about matching gt table styles, creating styled group headers, and advanced formatting techniques, see **`vignette("styling")`**.
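
To see what the theme changes in practice, you can print the same table with the theme active and again after resetting it. This is a small sketch that uses only `use_jama_theme()` and `gtsummary::reset_gtsummary_theme()` as described above. The choice of `age` and `grade` as example variables is arbitrary.

```{r, eval=FALSE}
library(sumExtras)
library(gtsummary)

# Build and print a table while the JAMA compact theme is active
use_jama_theme()
trial |>
  tbl_summary(by = trt, include = c(age, grade))

# Restore gtsummary's defaults, then build and print the same table again
reset_gtsummary_theme()
trial |>
  tbl_summary(by = trt, include = c(age, grade))

# Re-apply the theme if later tables in the session should use it
use_jama_theme()
```

Each table is printed right after it is built, so its appearance reflects the theme that was active at that point.
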

## Putting It All Together

Here's a simple workflow demonstrating how these core functions work together:

```{r}
# 1. Define your dictionary (typically done once per project)
my_dictionary <- tibble::tribble(
  ~Variable, ~Description,
  "trt", "Chemotherapy Treatment",
  "age", "Age at Enrollment (years)",
  "marker", "Marker Level (ng/mL)",
  "stage", "T Stage",
  "grade", "Tumor Grade",
  "response", "Tumor Response"
)

# 2. Set the recommended theme (once per session)
use_jama_theme()

# 3. Create a clean, labeled table with one function call
trial |>
  select(age, marker, stage, grade, response, trt) |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |>
  add_auto_labels(dictionary = my_dictionary) |>
  extras()
```

That's it! With just a few lines of code, you have a publication-ready table with automatic labeling, clean missing values, bold labels, an overall column, and p-values.

## Next Steps

This vignette covered the essential functions to get you started quickly. For more advanced usage:

- **`vignette("labeling")`** - Learn about the complete labeling system, including cross-package workflows with ggplot2, controlling label priority, working with pre-labeled data, and real-world analysis examples
- **`vignette("styling")`** - Explore advanced styling techniques including group headers, background colors, text formatting, and creating visually polished tables

For detailed information about individual functions, see their help documentation:

- `?extras` - Main styling function
- `?clean_table` - Missing value standardization
- `?add_auto_labels` - Automatic variable labeling
- `?use_jama_theme` - Apply JAMA compact theme

The package is designed to reduce repetitive code while maintaining the flexibility of gtsummary's modular approach. Use as much or as little as fits your workflow.