Title: Quantitative Text Kit
Version: 1.1.1
Description: Support package for the textbook "An Introduction to Quantitative Text Analysis for Linguists: Reproducible Research Using R" (Francom, 2024) <doi:10.4324/9781003393764>. Includes functions to acquire, clean, and analyze text data as well as functions to document and share the results of text analysis. The package is designed to be used in conjunction with the book, but can also be used as a standalone package for text analysis.
License: GPL (≥ 3)
URL: https://cran.r-project.org/package=qtkit
BugReports: https://github.com/qtalr/qtkit/issues
SystemRequirements: Chromium-based browser (e.g., Chrome, Chromium, or Brave)
Depends: R (≥ 4.1)
Imports: chromote, dplyr, ggplot2, gutenbergr, kableExtra, knitr, Matrix, openai, rlang, xml2
Suggests: httptest, rmarkdown, testthat (≥ 3.0.0), webshot2, fs, tibble, glue, readr
Config/testthat/edition: 3
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.1
VignetteBuilder: knitr
Author: Jerid Francom
Maintainer: Jerid Francom <francojc@wfu.edu>
NeedsCompilation: no
Packaged: 2025-01-14 04:29:08 UTC; francojc
Repository: CRAN
Date/Publication: 2025-01-14 06:50:02 UTC
Add Package Citations to BibTeX File
Description
Adds citation information for R packages to a BibTeX file. Uses the knitr::write_bib function to generate and append package citations in BibTeX format.
Usage
add_pkg_to_bib(pkg_name, bib_file = "packages.bib")
Arguments
pkg_name: Character string. The name of the R package to add to the BibTeX file.
bib_file: Character string. The path and name of the BibTeX file to write to. Default is "packages.bib".
Details
The function will create the BibTeX file if it doesn't exist, or append to it if it does. It includes citations for both the specified package and all currently loaded packages.
Value
Invisible NULL. The function is called for its side effect of writing to the BibTeX file.
Examples
# Create a temporary BibTeX file
my_bib_file <- tempfile(fileext = ".bib")
# Add citations for dplyr package
add_pkg_to_bib("dplyr", my_bib_file)
# View the contents of the BibTeX file
readLines(my_bib_file) |> cat(sep = "\n")
Calculate Association Metrics for Bigrams
Description
This function calculates various association metrics (PMI, Dice's Coefficient, G-score) for bigrams in a given corpus.
Usage
calc_assoc_metrics(
data,
doc_index,
token_index,
type,
association = "all",
verbose = FALSE
)
Arguments
data: A data frame containing the corpus.
doc_index: Column in data which represents the document index.
token_index: Column in data which represents the token index.
type: Column in data which represents the tokens or terms.
association: A character vector specifying which metrics to calculate. Can be any combination of "pmi", "dice_coeff", "g_score", or "all". Default is "all".
verbose: A logical value indicating whether to keep the intermediate probability columns. Default is FALSE.
Value
A data frame with one row per bigram and columns for each calculated metric.
Examples
data_path <- system.file("extdata", "bigrams_data.rds", package = "qtkit")
data <- readRDS(data_path)
calc_assoc_metrics(data, doc_index, token_index, type)
Calculate Document Frequency
Description
Computes the document frequency (DF) for each term in a term-document matrix. DF is the number of documents in which each term appears at least once.
Usage
calc_df(tdm)
Arguments
tdm: A term-document matrix.
Value
A numeric vector of document frequencies for each term
Calculate Gries' Deviation of Proportions
Description
Computes the Deviation of Proportions (DP) measure developed by Stefan Th. Gries. DP measures how evenly distributed a term is across all parts of the corpus. The normalized version (DP_norm) is returned, which ranges from 0 (evenly distributed) to 1 (extremely clumped distribution).
Usage
calc_dp(tdm)
Arguments
tdm: A term-document matrix.
Value
A numeric vector of normalized DP values for each term
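For intuition, a minimal sketch of the DP_norm calculation in base R, assuming a term-document matrix with terms as rows and documents as columns (the package's internal implementation may differ in detail):

calc_dp_sketch <- function(tdm) {
  m <- as.matrix(tdm)                       # dense copy for clarity
  s <- colSums(m) / sum(m)                  # expected: share of the corpus in each document
  v <- m / rowSums(m)                       # observed: share of each term's occurrences per document
  dp <- 0.5 * rowSums(abs(sweep(v, 2, s)))  # DP: half the sum of absolute differences
  dp / (1 - min(s))                         # DP_norm: 0 (even) to 1 (clumped)
}

Lower values indicate a term spread evenly across documents; higher values indicate a term concentrated in a few documents.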
References
Gries, S. T. (2008). Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4), 403-437.
Calculate Inverse Document Frequency
Description
Computes the inverse document frequency (IDF) for each term in a term-document matrix. IDF is calculated as log(N/df) where N is the total number of documents and df is the document frequency of the term.
Usage
calc_idf(tdm)
Arguments
tdm: A term-document matrix.
Value
A numeric vector of inverse document frequencies for each term
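A minimal sketch of the DF and IDF calculations, assuming a term-document matrix with terms as rows and documents as columns (the package's internals may differ in detail):

library(Matrix)
tdm <- Matrix::Matrix(c(1, 0, 2,
                        3, 1, 0),
                      nrow = 2, byrow = TRUE, sparse = TRUE,
                      dimnames = list(c("cat", "dog"), paste0("doc", 1:3)))
df  <- Matrix::rowSums(tdm > 0)  # number of documents containing each term
idf <- log(ncol(tdm) / df)       # log(N / df); larger values mark rarer terms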
Calculate Normalized Entropy for Categorical Variables
Description
Computes the normalized entropy (uncertainty measure) for categorical variables, providing a standardized measure of dispersion or randomness in the data.
Usage
calc_normalized_entropy(x)
Arguments
x: A character vector or factor containing categorical data.
Details
The function:
- Handles both character vectors and factors as input
- Treats NA values as a separate category
- Normalizes entropy to the range [0, 1], where 0 indicates complete certainty (one category dominates) and 1 indicates maximum uncertainty (equal distribution)
The calculation process:
1. Computes category proportions
2. Calculates raw entropy using Shannon's formula
3. Normalizes by dividing by the maximum possible entropy
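A minimal sketch of this calculation process (the package's implementation may differ in detail, e.g., in its NA handling):

entropy_sketch <- function(x) {
  x <- addNA(factor(x), ifany = TRUE)  # treat NA as its own category
  p <- prop.table(table(x))            # category proportions
  p <- p[p > 0]                        # drop unused factor levels
  h <- -sum(p * log2(p))               # Shannon entropy in bits
  h / log2(length(p))                  # divide by the maximum entropy, log2(k)
}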
Value
A numeric value between 0 and 1 representing the normalized entropy:
- Values closer to 0 indicate less diversity/uncertainty
- Values closer to 1 indicate more diversity/uncertainty
Examples
# Calculate entropy for a simple categorical vector
x <- c("A", "B", "B", "C", "C", "C", "D", "D", "D", "D")
calc_normalized_entropy(x)
# Handle missing values
y <- c("A", "B", NA, "C", "C", NA, "D", "D")
calc_normalized_entropy(y)
# Works with factors too
z <- factor(c("Low", "Med", "Med", "High", "High", "High"))
calc_normalized_entropy(z)
Calculate Observed Relative Frequency
Description
Computes the observed relative frequency (ORF) for each term in a term-document matrix. ORF is the relative frequency expressed as a percentage (RF * 100).
Usage
calc_orf(tdm)
Arguments
tdm: A term-document matrix.
Value
A numeric vector of observed relative frequencies (as percentages) for each term
Internal Functions for Calculating Dispersion and Frequency Metrics
Description
A collection of internal helper functions that calculate various dispersion and frequency metrics from term-document matrices. These functions support the main calc_type_metrics function by providing specialized calculations for different statistical measures.
calc_rf computes the relative frequency (RF) for each term in a term-document matrix, representing how often each term occurs relative to the total corpus size.
Usage
calc_rf(tdm)
Arguments
tdm: A sparse term-document matrix (Matrix package format).
Details
The package implements these metrics:
Dispersion measures:
- Document Frequency (DF): Count of documents containing each term
- Inverse Document Frequency (IDF): Log-scaled inverse of DF, emphasizing rare terms
- Deviation of Proportions (DP): Gries' measure of distributional evenness, ranging from 0 (perfectly even) to 1 (completely clumped)
Frequency measures:
- Relative Frequency (RF): Term frequency normalized by total corpus size
- Observed Relative Frequency (ORF): RF expressed as a percentage (RF * 100)
Implementation notes:
- All functions expect a sparse term-document matrix as input
- Matrix operations are optimized using the Matrix package
- NA values are handled appropriately for each metric
- Results are returned as numeric vectors
The calculation process for calc_rf:
1. Sums occurrences of each term across all documents
2. Divides by the total corpus size (the sum of all terms)
3. Returns proportions between 0 and 1
Value
A numeric vector where each element represents a term's relative frequency in the corpus (range: 0-1)
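A minimal sketch of the RF and ORF calculations under the same assumptions as above (terms as rows; the package's internals may differ):

library(Matrix)
tdm <- Matrix::Matrix(c(1, 0, 2,
                        3, 1, 0),
                      nrow = 2, byrow = TRUE, sparse = TRUE,
                      dimnames = list(c("cat", "dog"), paste0("doc", 1:3)))
rf  <- Matrix::rowSums(tdm) / sum(tdm)  # relative frequency, range 0-1
orf <- rf * 100                         # observed relative frequency, as a percentage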
References
Gries, S. T. (2008). Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4), 403-437.
Calculate Frequency and Dispersion Metrics for Text Types
Description
Calculates various frequency and dispersion metrics for types (terms/tokens) in tokenized text data. Provides a comprehensive analysis of how types are distributed across documents in a corpus.
Usage
calc_type_metrics(data, type, document, frequency = NULL, dispersion = NULL)
Arguments
data: Data frame. Contains the tokenized text data with document IDs and types/terms.
type: Symbol. Column in data containing the types/terms.
document: Symbol. Column in data containing the document identifiers.
frequency: Character vector. Frequency metrics to calculate:
- NULL (default): Returns only type counts
- "all": All available metrics
- "rf": Relative frequency
- "orf": Observed relative frequency (per 100)
dispersion: Character vector. Dispersion metrics to calculate:
- NULL (default): Returns only type counts
- "all": All available metrics
- "df": Document frequency
- "idf": Inverse document frequency
- "dp": Gries' deviation of proportions
Details
The function creates a term-document matrix internally and calculates the requested metrics. Frequency metrics show how often types occur, while dispersion metrics show how evenly they are distributed across documents.
The 'dp' metric (Gries' Deviation of Proportions) ranges from 0 (perfectly even distribution) to 1 (completely clumped distribution).
Value
Data frame containing the requested metrics:
- type: Unique types from input data
- n: Raw frequency count
- rf: Relative frequency (if requested)
- orf: Observed relative frequency per 100 (if requested)
- df: Document frequency (if requested)
- idf: Inverse document frequency (if requested)
- dp: Deviation of proportions (if requested)
References
Gries, Stefan Th. (2023). Statistical Methods in Corpus Linguistics. In Readings in Corpus Linguistics: A Teaching and Research Guide for Scholars in Nigeria and Beyond, pp. 78-114.
Examples
data_path <- system.file("extdata", "types_data.rds", package = "qtkit")
df <- readRDS(data_path)
calc_type_metrics(
data = df,
type = letter,
document = doc_id,
frequency = c("rf", "orf"),
dispersion = "dp"
)
Calculate Probabilities for Bigrams
Description
Helper function that calculates joint and marginal probabilities for bigrams in the input data using dplyr. It processes the data to create bigrams and computes their probabilities along with individual token probabilities.
Usage
calculate_bigram_probabilities(data, doc_index, token_index, type)
Arguments
data: A data frame containing the corpus.
doc_index: Column name for document index.
token_index: Column name for token position.
type: Column name for the actual tokens/terms.
Value
A data frame containing:
- x: First token in bigram
- y: Second token in bigram
- p_xy: Joint probability of the bigram
- p_x: Marginal probability of first token
- p_y: Marginal probability of second token
Calculate Association Metrics
Description
Helper function that computes various association metrics for bigrams based on their probability distributions. Supports PMI (Pointwise Mutual Information), Dice's Coefficient, and G-score calculations.
Usage
calculate_metrics(bigram_probs, association)
Arguments
bigram_probs: A data frame containing bigram probability data with columns x, y, p_xy, p_x, and p_y (as returned by calculate_bigram_probabilities).
association: Character vector specifying which metrics to calculate.
Value
A data frame containing the original probability columns plus the requested association metrics:
- pmi: Pointwise Mutual Information
- dice_coeff: Dice's Coefficient
- g_score: G-score
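For reference, a sketch of two of these metrics from their standard textbook definitions, assuming the p_xy, p_x, and p_y columns described above (the package's exact formulas, particularly for g_score, may differ):

add_metrics_sketch <- function(bigram_probs) {
  within(bigram_probs, {
    dice_coeff <- 2 * p_xy / (p_x + p_y)  # Dice's Coefficient
    pmi <- log2(p_xy / (p_x * p_y))       # Pointwise Mutual Information
  })
}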
Clean Downloaded File Names
Description
Helper function that removes spaces from filenames in the target directory, replacing them with underscores.
Usage
clean_filenames(target_dir)
Arguments
target_dir: Character string of the target directory path.
Value
Invisible NULL, called for side effects
Check if Permission Confirmation is Needed
Description
Helper function that determines whether to prompt the user for permission confirmation based on the confirmed parameter.
Usage
confirm_if_needed(confirmed)
Arguments
confirmed: Logical indicating if permission is pre-confirmed.
Value
Logical indicating if permission is granted
Create Data Dictionary
Description
This function takes a data frame and creates a data dictionary. The data dictionary includes the variable name, a human-readable name, the variable type, and a description. If a model is specified, the function uses OpenAI's API to generate the information based on the characteristics of the data frame.
Usage
create_data_dictionary(
data,
file_path,
model = NULL,
sample_n = 5,
grouping = NULL,
force = FALSE
)
Arguments
data: A data frame to create a data dictionary for.
file_path: The file path to save the data dictionary to.
model: The ID of the OpenAI chat completion model to use for generating descriptions (see the OpenAI model documentation for available IDs). Default NULL.
sample_n: The number of rows to sample from the data frame to use as input for the model. Default is 5.
grouping: A character vector of column names to group by when sampling rows from the data frame for the model. Default NULL.
force: If TRUE, overwrite the file at file_path if it already exists. Default FALSE.
Value
A data frame containing the variable name, human-readable name, variable type, and description for each variable in the input data frame.
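Examples
A hypothetical usage sketch (not run; generating descriptions requires a configured OpenAI API key, and with model = NULL, the default, no API call should be made):
## Not run:
dict_file <- tempfile(fileext = ".csv")
dict <- create_data_dictionary(
  data = mtcars,
  file_path = dict_file
)
read.csv(dict_file)
## End(Not run)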
Create Data Origin Documentation
Description
Creates a standardized data origin documentation file in CSV format, containing essential metadata about a dataset's source, format, and usage rights.
Usage
create_data_origin(file_path, return = FALSE, force = FALSE)
Arguments
file_path: Character string. Path where the CSV file should be saved.
return: Logical. If TRUE, returns the data frame in addition to saving. Default is FALSE.
force: Logical. If TRUE, overwrites existing file at path. Default is FALSE.
Details
Generates a template with the following metadata fields:
- Resource name
- Data source (URL/DOI)
- Sampling frame (language, modality, genre)
- Collection dates
- Data format
- Schema description
- License information
- Attribution requirements
Value
If return=TRUE, returns a data frame containing the data origin template. Otherwise returns invisible(NULL).
Examples
tmp_file <- tempfile(fileext = ".csv")
create_data_origin(tmp_file)
read.csv(tmp_file)
Curate ENNTT Data
Description
This function processes and curates ENNTT (European Parliament) data from a specified directory. It handles both .dat files (containing XML metadata) and .tok files (containing text content).
Usage
curate_enntt_data(dir_path)
Arguments
dir_path: A string. The path to the directory containing the ENNTT data files. Must be an existing directory.
Details
The function expects a directory containing paired .dat and .tok files with matching names, as found in the raw ENNTT data (https://github.com/senisioi/enntt-release). The .dat files should contain XML-formatted metadata with the attributes:
- session_id: Unique identifier for the parliamentary session
- mepid: Member of European Parliament ID
- state: Country or state representation
- seq_speaker_id: Sequential ID within the session
The .tok files should contain the corresponding text content, one entry per line.
Value
A tibble containing the curated ENNTT data with columns:
- session_id: Parliamentary session identifier
- speaker_id: Speaker's MEP ID
- state: Representative's state/country
- session_seq: Sequential position in session
- text: Speech content
- type: Corpus type identifier
Examples
# Example using simulated data bundled with the package
example_data <- system.file("extdata", "simul_enntt", package = "qtkit")
curated_data <- curate_enntt_data(example_data)
str(curated_data)
Curate Single ENNTT File Pair
Description
Curate Single ENNTT File Pair
Usage
curate_enntt_file(dir_path, corpus_type)
Arguments
dir_path: Directory containing the files.
corpus_type: Type identifier for the corpus.
Value
Data frame of curated data
Curate SWDA data
Description
Process and curate Switchboard Dialog Act (SWDA) data by reading all .utt files from a specified directory and converting them into a structured format.
Usage
curate_swda_data(dir_path)
Arguments
dir_path: Character string. Path to the directory containing .utt files. Must be an existing directory.
Details
The function expects a directory containing .utt files, or subdirectories with .utt files, as found in the raw SWDA data (Linguistic Data Consortium, LDC97S62: Switchboard Dialog Act Corpus).
Value
A data frame containing the curated SWDA data with columns:
- doc_id: Document identifier
- damsl_tag: Dialog act annotation
- speaker_id: Unique speaker identifier
- speaker: Speaker designation (A or B)
- turn_num: Turn number in conversation
- utterance_num: Utterance number
- utterance_text: Actual spoken text
Examples
# Example using simulated data bundled with the package
example_data <- system.file("extdata", "simul_swda", package = "qtkit")
swda_data <- curate_swda_data(example_data)
str(swda_data)
Process a single SWDA utterance file
Description
Process a single SWDA utterance file
Usage
curate_swda_file(file_path)
Arguments
file_path: Character string. Path to the .utt file.
Value
A data frame containing processed data from the file
Download and Decompress Archive File
Description
Helper function that downloads an archive file to a temporary location and decompresses it to the target directory.
Usage
download_and_decompress(url, target_dir, ext)
Arguments
url: Character string of the archive file URL.
target_dir: Character string of the target directory path.
ext: Character string of the file extension.
Value
No return value, called for side effects
Extract Attributes from XML Line Node
Description
Extract Attributes from XML Line Node
Usage
extract_dat_attrs(line_node)
Arguments
line_node: XML node containing line attributes.
Value
Data frame of extracted attributes
Extract speaker information from document lines
Description
Extract speaker information from document lines
Usage
extract_speaker_info(doc_lines)
Arguments
doc_lines: Character vector of file lines.
Value
Named list of speaker information
Extract and process utterances from document lines
Description
Extract and process utterances from document lines
Usage
extract_utterances(doc_lines, speaker_info)
Arguments
doc_lines: Character vector of file lines.
speaker_info: List of speaker information.
Value
Data frame of processed utterances
Find ENNTT Files
Description
Find ENNTT Files
Usage
find_enntt_files(dir_path)
Arguments
dir_path: Directory to search for ENNTT files.
Value
Vector of unique corpus types
Detect Statistical Outliers Using IQR Method
Description
Identifies statistical outliers in a numeric variable using the Interquartile Range (IQR) method. Provides detailed diagnostics about the outlier detection process.
Usage
find_outliers(data, variable_name, verbose = TRUE)
Arguments
data: Data frame containing the variable to analyze.
variable_name: Unquoted name of the numeric variable to check for outliers.
verbose: Logical. If TRUE, prints diagnostic information about quartiles, fences, and number of outliers found. Default is TRUE.
Details
The function uses the standard IQR method for outlier detection:
1. Calculates Q1 (25th percentile) and Q3 (75th percentile)
2. Computes IQR = Q3 - Q1
3. Flags as outliers any values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]
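A minimal sketch of the fence computation in base R (not the package's exact implementation):

x <- mtcars$hp
q <- quantile(x, probs = c(0.25, 0.75))
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr   # lower fence
upper <- q[2] + 1.5 * iqr   # upper fence
x[x < lower | x > upper]    # values flagged as outliers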
Value
If outliers are found:
- A data frame containing the rows with outlier values
- Diagnostic information about quartiles and fences is printed
If no outliers are found:
- NULL is returned
- A confirmation message is printed
Diagnostic Output
- Variable name
- Q1 and Q3 values
- IQR value
- Upper and lower fence values
- Number of outliers found
Examples
data(mtcars)
find_outliers(mtcars, mpg)
find_outliers(mtcars, wt, verbose = FALSE)
Download and Extract Archive Files
Description
Downloads compressed archive files from a URL and extracts their contents to a specified directory. Supports multiple archive formats and handles permission confirmation.
Usage
get_archive_data(url, target_dir, force = FALSE, confirmed = FALSE)
Arguments
url: Character string. Full URL to the compressed archive file.
target_dir: Character string. Directory where the archive contents should be extracted.
force: Logical. If TRUE, overwrites existing data in target directory. Default is FALSE.
confirmed: Logical. If TRUE, skips permission confirmation prompt. Useful for reproducible workflows. Default is FALSE.
Details
Supported archive formats:
- ZIP (.zip)
- Gzip (.gz)
- Tar (.tar)
- Compressed tar (.tgz)
The function includes safety features:
- Permission confirmation for data usage
- Directory existence checks
- Archive format validation
- Automatic file cleanup
Value
Invisible NULL. Called for side effects:
- Downloads archive file
- Creates target directory if needed
- Extracts archive contents
- Cleans up temporary files
Examples
## Not run:
data_dir <- file.path(tempdir(), "data")
url <-
"https://raw.githubusercontent.com/qtalr/qtkit/main/inst/extdata/test_data.zip"
get_archive_data(
url = url,
target_dir = data_dir,
confirmed = TRUE
)
## End(Not run)
Get Works from Project Gutenberg
Description
Retrieves works from Project Gutenberg based on specified criteria and saves the data to a CSV file. This function is a wrapper for the gutenbergr package.
Usage
get_gutenberg_data(
target_dir,
lcc_subject,
birth_year = NULL,
death_year = NULL,
n_works = 100,
force = FALSE,
confirmed = FALSE
)
Arguments
target_dir: The directory where the CSV file will be saved.
lcc_subject: A character vector specifying the Library of Congress Classification (LCC) subjects to filter the works.
birth_year: An optional integer specifying the minimum birth year of authors to include.
death_year: An optional integer specifying the maximum death year of authors to include.
n_works: An integer specifying the number of works to retrieve. Default is 100.
force: A logical value indicating whether to overwrite existing data if it already exists.
confirmed: If TRUE, skips the confirmation prompt shown when more than 1000 works are requested. Useful for reproducible workflows. Default is FALSE.
Details
This function retrieves Gutenberg works based on the specified LCC subjects and optional author birth and death years. It checks if the data already exists in the target directory and provides an option to overwrite it. The function also creates the target directory if it doesn't exist. If the number of works is greater than 1000 and the 'confirmed' parameter is not set to TRUE, it prompts the user for confirmation. The retrieved works are filtered based on public domain rights in the USA and availability of text. The resulting works are downloaded and saved as a CSV file in the target directory.
For more information on Library of Congress Classification (LCC) subjects, refer to the Library of Congress Classification Guide: https://www.loc.gov/catdir/cpso/lcco/
Value
A message indicating whether the data was acquired or already existed on disk. The function writes the data files to disk in the specified target directory.
Examples
## Not run:
data_dir <- file.path(tempdir(), "data")
get_gutenberg_data(
target_dir = data_dir,
lcc_subject = "JC",
n_works = 5,
confirmed = TRUE
)
## End(Not run)
Process speaker turn information
Description
Process speaker turn information
Usage
process_speaker_info(speaker_turn, speaker_a_id, speaker_b_id)
Arguments
speaker_turn: Vector of speaker turns.
speaker_a_id: ID for speaker A.
speaker_b_id: ID for speaker B.
Value
List with processed speaker information
Validate Directory Path
Description
Validate Directory Path
Usage
validate_dir_path(dir_path)
Arguments
dir_path: Directory path to validate.
Validate Archive File Extension
Description
Helper function that checks if the file extension is supported (zip, gz, tar, or tgz).
Usage
validate_file_extension(ext)
Arguments
ext: Character string of the file extension.
Details
Stops execution if extension is not supported
Value
No return value, called for side effects
Validate Inputs for Association Metrics Calculation
Description
Helper function that validates the input parameters for the calc_assoc_metrics function. Checks data frame structure, column existence, and association metric specifications.
Usage
validate_inputs_cam(data, doc_index, token_index, type, association)
Arguments
data: A data frame to validate.
doc_index: Column name for document index.
token_index: Column name for token position.
type: Column name for the tokens/terms.
association: Character vector of requested association metrics.
Details
Stops execution with an error message if:
- data is not a data frame
- required columns are missing
- association contains invalid metric names
Value
No return value, called for side effects
Validate Inputs for Type Metrics Calculation
Description
Helper function that validates the input parameters for the calc_type_metrics function. Checks data frame structure, column existence, and metric specifications.
Usage
validate_inputs_ctm(data, type, document, frequency, dispersion)
Arguments
data: A data frame to validate.
type: Column name for the type/term variable.
document: Column name for the document ID variable.
frequency: Character vector of requested frequency metrics.
dispersion: Character vector of requested dispersion metrics.
Details
Stops execution with an error message if:
- data is not a data frame
- required columns are missing
- frequency contains invalid metric names
- dispersion contains invalid metric names
Value
No return value, called for side effects
Save ggplot Objects to Files
Description
A wrapper around ggsave that facilitates saving ggplot objects within knitr documents. Automatically handles file naming and directory creation, with support for multiple output formats.
Usage
write_gg(
gg_obj = NULL,
file = NULL,
target_dir = NULL,
device = "pdf",
theme = NULL,
...
)
Arguments
gg_obj: The ggplot to be written. If not specified, the last ggplot created will be written.
file: The name of the file to be written. If not specified, the label of the code block will be used.
target_dir: The directory where the file will be written. If not specified, the current working directory will be used.
device: The device to be used for saving the ggplot. Options include "pdf" (default), "png", "jpeg", "tiff", and "svg".
theme: The ggplot2 theme to be applied to the ggplot. Default is the theme specified in the ggplot2 options.
...: Additional arguments to be passed to the ggsave function.
Details
This function extends ggplot2::ggsave by:
- Using knitr code block labels for automatic file naming
- Creating target directories if they don't exist
- Supporting multiple output formats (PDF, PNG, JPEG, TIFF, SVG)
- Applying custom themes to plots before saving
Value
The path of the written file.
Examples
## Not run:
library(ggplot2)
plot_dir <- file.path(tempdir(), "plot")
# Write a ggplot object as a PDF file
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point()
write_gg(
gg_obj = p,
file = "plot_file",
target_dir = plot_dir,
device = "pdf"
)
unlink(plot_dir)
## End(Not run)
Write a kable object to a file
Description
A wrapper around kableExtra::save_kable that facilitates saving kable objects within knitr documents. Automatically handles file naming and directory creation, and supports multiple output formats with Bootstrap theming options.
Usage
write_kbl(
kbl_obj,
file = NULL,
target_dir = NULL,
device = "pdf",
bs_theme = "bootstrap",
...
)
Arguments
kbl_obj: The knitr_kable object to be written.
file: The name of the file to be written. If not specified, the name will be based on the current knitr code block label.
target_dir: The directory where the file will be written. If not specified, the current working directory will be used.
device: The device to be used for saving the file. Options include "pdf" (default), "html", "latex", "png", and "jpeg". Note that a Chromium-based browser (e.g., Google Chrome, Chromium, Microsoft Edge, or Brave) is required on your system for all options except "latex". If a suitable browser is not available, the function will stop and return an error message.
bs_theme: The Bootstrap theme to be applied to the kable object (only applicable for HTML output). Default is "bootstrap".
...: Additional arguments to be passed to the save_kable function.
Details
The function extends save_kable functionality by:
- Using knitr code block labels for automatic file naming
- Creating target directories if they don't exist
- Supporting multiple output formats (PDF, HTML, LaTeX, PNG, JPEG)
- Applying Bootstrap themes for HTML output
- Preserving table styling and formatting
For HTML output, the function supports all Bootstrap themes available in kableExtra. The default theme is "bootstrap".
Value
The path of the written file.
Examples
## Not run:
library(knitr)
table_dir <- file.path(tempdir(), "table")
mtcars_kbl <- kable(
x = mtcars[1:5, ],
format = "html"
)
# Write a kable object as a PDF file
write_kbl(
kbl_obj = mtcars_kbl,
file = "kable_pdf",
target_dir = table_dir,
device = "pdf"
)
# Write a kable as an HTML file with a custom Bootstrap theme
write_kbl(
kbl_obj = mtcars_kbl,
file = "kable_html",
target_dir = table_dir,
device = "html",
bs_theme = "flatly"
)
unlink(table_dir)
## End(Not run)
Write an R object as a file
Description
A wrapper around dput that facilitates saving R objects within knitr documents. Automatically handles file naming and directory creation, with support for preserving object structure and attributes.
Usage
write_obj(obj, file = NULL, target_dir = NULL, ...)
Arguments
obj: The R object to be written.
file: The name of the file to be written. If not specified, the label of the code block will be used.
target_dir: The directory where the file will be written. If not specified, the current working directory will be used.
...: Additional arguments to be passed to dput.
Details
This function extends dput functionality by:
- Using knitr code block labels for automatic file naming
- Creating target directories if they don't exist
- Preserving complex object structures and attributes
- Supporting all R object types (vectors, lists, data frames, etc.)
Objects saved with this function can be read back using the standard dget function.
Value
The path of the written file.
Examples
## Not run:
obj_dir <- file.path(tempdir(), "obj")
# Write a data frame as a file
write_obj(
obj = mtcars,
file = "mtcars_data",
target_dir = obj_dir
)
# Read the file back into an R session
my_mtcars <- dget(file.path(obj_dir, "mtcars_data"))
unlink(obj_dir)
## End(Not run)