---
title: "Getting Started with splineplot"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with splineplot}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  dpi = 100
)
```

```{r setup}
library(splineplot)
library(mgcv)
library(survival)
library(splines)
library(ggplot2)
```

## Introduction

The `splineplot` package provides a unified interface for visualizing spline effects from various regression models. This vignette walks through the basic usage of the package.

## Preparing Your Data

First, let's create some sample data to work with:

```{r data}
set.seed(42)
n <- 500

# Continuous predictor
age <- rnorm(n, mean = 50, sd = 10)

# Non-linear effect
true_effect <- -0.05*(age - 50) + 0.001*(age - 50)^3/100

# Various outcomes
time_to_event <- rexp(n, rate = exp(true_effect))
event_status <- rbinom(n, 1, 0.8)
binary_outcome <- rbinom(n, 1, plogis(true_effect))
count_outcome <- rpois(n, lambda = exp(true_effect/2))
continuous_outcome <- true_effect + rnorm(n, 0, 0.5)

# Create data frame
data <- data.frame(
  age = age,
  time = time_to_event,
  status = event_status,
  binary = binary_outcome,
  count = count_outcome,
  continuous = continuous_outcome
)
```

## GAM Models

### Cox Proportional Hazards

A GAM with the Cox proportional hazards family (`cox.ph()`) allows flexible modeling of survival data:

```{r gam-cox}
# Fit GAM Cox model; cox.ph() takes the event indicator
# (1 = event, 0 = censored) through the weights argument
gam_cox <- gam(time ~ s(age),
               family = cox.ph(),
               weights = status,
               data = data)

# Create spline plot
splineplot(gam_cox, data,
           ylim = c(0.5, 2.0),
           xlab = "Age (years)",
           ylab = "Hazard Ratio")
```

The plot shows:

- The smooth effect of age on the hazard
- 95% confidence intervals (dotted lines)
- A reference point (diamond) where HR = 1
- A histogram showing the distribution of the data

### Logistic Regression

For binary outcomes:

```{r gam-logistic}
gam_logit <- gam(binary ~ s(age),
                 family = binomial(),
                 data = data)

splineplot(gam_logit, data,
           ylim = c(0.5, 2.0),
           ylab = "Odds Ratio")
```

### Poisson Regression

For count data:

```{r gam-poisson}
gam_poisson <- gam(count ~ s(age),
                   family = poisson(),
                   data = data)

splineplot(gam_poisson, data,
           ylab = "Rate Ratio")
```

## GLM with Splines

When you prefer parametric splines over GAM smooths:

### Natural Splines (ns)

```{r glm-ns}
glm_ns <- glm(binary ~ ns(age, df = 4),
              family = binomial(),
              data = data)

splineplot(glm_ns, data,
           ylim = c(0.5, 2.0))
```

### B-splines (bs)

```{r glm-bs}
glm_bs <- glm(count ~ bs(age, df = 4),
              family = poisson(),
              data = data)

splineplot(glm_bs, data)
```

## Cox Models with Splines

For survival analysis without GAM:

```{r cox-ns}
cox_ns <- coxph(Surv(time, status) ~ ns(age, df = 4),
                data = data)

splineplot(cox_ns, data,
           ylim = c(0.5, 2.0))
```

## Customizing Your Plots

### Reference Values

By default, the reference value is the median. You can change this:

```{r custom-ref}
splineplot(gam_cox, data,
           refx = 45,  # Set reference at age 45
           ylim = c(0.5, 2.0))
```

### Confidence Interval Styles

Choose between dotted lines (default) or ribbon style:

```{r ci-styles}
# Ribbon style confidence intervals
splineplot(gam_logit, data,
           ribbon_ci = TRUE,
           ylim = c(0.5, 2.0))
```

### Histogram Options

You can toggle the histogram display:

```{r histogram}
splineplot(gam_cox, data,
           show_hist = FALSE,
           ylim = c(0.5, 2.0))
```

### Log Scale

For odds ratios, rate ratios, or hazard ratios, you might prefer a log scale:

```{r log-scale}
splineplot(gam_logit, data,
           log_scale = TRUE)
```

## Interaction Terms

The package automatically detects and visualizes interaction terms:

```{r interaction}
# Add a grouping variable
data$group <- factor(sample(c("Male", "Female"), n, replace = TRUE))

# Fit model with interaction
gam_interact <- gam(time ~ s(age, by = group),
                    family = cox.ph(),
                    weights = status,
                    data = data)

# Plot shows separate curves for each group
splineplot(gam_interact, data,
           ylim = c(0.5, 2.0))
```

## Tips for Best Results

1. **Model Choice**: Use GAM for maximum flexibility, or GLM with splines for a parametric approach
2. **Degrees of Freedom**: Higher df allows more flexibility but may overfit
3. **Reference Point**: Choose a meaningful reference value for interpretation
4. **Confidence Intervals**: Ribbon style is visually appealing; dotted lines show precision better
5. **Sample Size**: Ensure adequate sample size for stable spline estimates

## Conclusion

The `splineplot` package simplifies the visualization of non-linear effects across different model types. It handles the complexity of extracting and transforming model predictions while providing consistent, publication-ready output.
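
As a follow-up to the degrees-of-freedom tip, here is a minimal sketch of one way to compare spline flexibility before plotting. It is not part of `splineplot`; it uses only `splines::ns()`, `glm()`, and `AIC()` from base R. The simulated data and the names `sketch_data`, `df_grid`, and `aic_by_df` are hypothetical and exist only for this illustration, and AIC is just one possible criterion.

```{r df-comparison}
library(splines)  # ns() for natural cubic splines

# Simulated data for this sketch only (hypothetical names)
set.seed(1)
x <- rnorm(300, mean = 50, sd = 10)
y <- rbinom(300, 1, plogis(-0.05 * (x - 50)))
sketch_data <- data.frame(x = x, y = y)

# Fit the same logistic model with increasing spline df and record the AIC
df_grid <- 2:6
aic_by_df <- sapply(df_grid, function(k) {
  fit <- glm(y ~ ns(x, df = k), family = binomial(), data = sketch_data)
  AIC(fit)
})
names(aic_by_df) <- paste0("df", df_grid)

# Lower AIC suggests a better trade-off between fit and complexity
aic_by_df
```

If the AIC stops improving as df grows, the extra flexibility is probably fitting noise, and the smaller df is the safer choice for the model passed to `splineplot()`.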