| Title: | Fast Tidying of Data | 
| Version: | 0.4.0 | 
| Description: | Tidying functions built on 'data.table' to provide quick and efficient data manipulation with minimal overhead. | 
| Depends: | R (≥ 3.1) | 
| Imports: | data.table (≥ 1.13.4), cpp11 | 
| Suggests: | covr, dplyr, magrittr, remotes, spelling, testthat (≥ 3.0.0), tidyr, knitr | 
| LinkingTo: | cpp11 (≥ 0.2.6) | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| License: | GPL-3 | 
| RoxygenNote: | 7.3.0 | 
| Config/testthat/edition: | 3 | 
| Config/testthat/parallel: | true | 
| NeedsCompilation: | yes | 
| Packaged: | 2024-02-02 08:53:21 UTC; tysonbarrett | 
| Author: | Tyson Barrett | 
| Maintainer: | Tyson Barrett <t.barrett88@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-02-02 09:30:03 UTC | 
tidyfast: Fast Tidying of Data
Description
Tidying functions built on 'data.table' to provide quick and efficient data manipulation with minimal overhead.
Author(s)
Maintainer: Tyson Barrett t.barrett88@gmail.com (ORCID)
Other contributors:
- Mark Fairbanks [contributor] 
- Ivan Leung [contributor] 
- Indrajeet Patil patilindrajeet.science@gmail.com (ORCID) (@patilindrajeets) [contributor] 
Case When with data.table
Description
Does what dplyr::case_when() does, with the same syntax, but with
data.table::fcase() under the hood.
Usage
dt_case_when(...)
Arguments
| ... | statements of the form: condition ~ label, where the label is applied if the condition is met | 
Value
Vector of the same size as the input vector
Examples
x <- rnorm(100)
dt_case_when(
  x < median(x) ~ "low",
  x >= median(x) ~ "high",
  is.na(x) ~ "other"
)
library(data.table)
temp <- data.table(
  pseudo_id = c(1, 2, 3, 4, 5),
  x = sample(1:5, 5, replace = TRUE)
)
temp[, y := dt_case_when(
  pseudo_id == 1 ~ x * 1,
  pseudo_id == 2 ~ x * 2,
  pseudo_id == 3 ~ x * 3,
  pseudo_id == 4 ~ x * 4,
  pseudo_id == 5 ~ x * 5
)]
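Because the description says dt_case_when() uses data.table::fcase() under the hood, the first example can be compared with a direct fcase() call. This is a minimal sketch, assuming both tidyfast and data.table are installed. The fcase() call is written by hand for illustration and is not taken from the package source.

```r
library(data.table)
library(tidyfast)

set.seed(1)
x <- rnorm(100)

# formula interface, as in dplyr::case_when()
via_case_when <- dt_case_when(
  x < median(x) ~ "low",
  x >= median(x) ~ "high",
  is.na(x) ~ "other"
)

# hand-written fcase() call: alternating condition, value pairs
via_fcase <- fcase(
  x < median(x), "low",
  x >= median(x), "high",
  is.na(x), "other"
)

# check whether the two interfaces agree on this input
identical(via_case_when, via_fcase)
```

The formula form keeps each condition next to its label, which tends to be easier to read once there are more than two or three cases.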
Count
Description
Count the number of observations within groups
Usage
dt_count(dt_, ..., na.rm = FALSE, wt = NULL)
Arguments
| dt_ | the data table to count | 
| ... | the grouping variables | 
| na.rm | should any rows with missingness be removed before the count? Default is FALSE. | 
| wt | the weights assigned to the counts (same number of rows as the data) | 
Value
A data.table with counts for each group (or combination of groups)
Examples
library(data.table)
dt <- data.table(
  x = rnorm(1e5),
  y = runif(1e5),
  grp = sample(1L:3L, 1e5, replace = TRUE),
  wt = runif(1e5, 1, 100)
)
dt_count(dt, grp)
dt_count(dt, grp, na.rm = TRUE)
dt_count(dt, grp, na.rm = TRUE, wt = wt)
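For orientation, the counts described above can be sketched in plain data.table syntax. This is an illustration only: it assumes that an unweighted count is the number of rows per group and that a weighted count is the sum of the weights per group. It is not taken from the dt_count() source.

```r
library(data.table)

set.seed(42)
dt <- data.table(
  grp = sample(1L:3L, 1000, replace = TRUE),
  wt = runif(1000, 1, 100)
)

# unweighted: number of rows in each group
counts <- dt[, .N, keyby = grp]

# weighted: sum of the weights in each group
# (assumed to correspond to `wt = wt` in dt_count())
weighted <- dt[, .(N = sum(wt)), keyby = grp]

counts
weighted
```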
Fill with data.table
Description
Fills in values, similar to tidyr::fill(), within data.table. This function relies on the
Rcpp functions that drive tidyr::fill() but applies them within data.table.
Usage
dt_fill(
  dt_,
  ...,
  id = NULL,
  .direction = c("down", "up", "downup", "updown"),
  immutable = TRUE
)
Arguments
| dt_ | the data table (or if not a data.table then it is coerced with as.data.table) | 
| ... | the columns to fill | 
| id | the grouping variable(s) to fill within | 
| .direction | either "down" or "up" (down fills values down, up fills values up), or "downup" (down first then up) or "updown" (up first then down) | 
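| immutable | If TRUE, dt_ is treated as immutable (it will not be modified in place). Set immutable = FALSE to modify the input object in place. | 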
Value
A data.table with listed columns having values filled in
Examples
set.seed(84322)
library(data.table)
x <- 1:10
dt <- data.table(
  v1 = x,
  v2 = shift(x),
  v3 = shift(x, -1L),
  v4 = sample(c(rep(NA, 10), x), 10),
  grp = sample(1:3, 10, replace = TRUE)
)
dt_fill(dt, v2, v3, v4, id = grp, .direction = "downup")
dt_fill(dt, v2, v3, v4, id = grp)
dt_fill(dt, .direction = "up")
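To make the .direction values concrete, here is a small sketch using data.table::nafill() on a numeric column. It only illustrates what "down", "up", and "downup" mean within groups. nafill() is a stand-in chosen for this example, is not a drop-in replacement for dt_fill(), and the column names are hypothetical.

```r
library(data.table)

dt <- data.table(
  grp = c(1L, 1L, 1L, 2L, 2L, 2L),
  v = c(1, NA, NA, NA, 5, NA)
)

# "down": carry the last observed value forward within each group
dt[, down := nafill(v, type = "locf"), by = grp]

# "up": carry the next observed value backward within each group
dt[, up := nafill(v, type = "nocb"), by = grp]

# "downup": fill down first, then fill any remaining leading gaps upward
dt[, downup := nafill(nafill(v, type = "locf"), type = "nocb"), by = grp]

dt[]
```

Filling within groups matters here: the leading NA in group 2 is never filled from group 1, so "down" alone leaves it missing.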
Hoist: Fast Unnesting of Vectors
Description
Quickly unnest vectors nested in list columns. Still experimental (has some potentially unexpected behavior in some situations)!
Usage
dt_hoist(dt_, ...)
Arguments
| dt_ | the data table to unnest | 
| ... | the columns to unnest (must all be the same length when unnested); use bare names of the variables | 
Examples
library(data.table)
dt <- data.table(
  x = rnorm(1e5),
  y = runif(1e5),
  nested1 = lapply(1:10, sample, 10, replace = TRUE),
  nested2 = lapply(c("thing1", "thing2"), sample, 10, replace = TRUE),
  id = 1:1e5
)
dt_hoist(dt, nested1, nested2)
Fast Nesting
Description
Quickly nest data tables (similar to dplyr::group_nest()).
Usage
dt_nest(dt_, ..., .key = "data")
Arguments
| dt_ | the data table to nest | 
| ... | the variables to group by | 
| .key | the name of the list column; default is "data" | 
Value
A data.table with a list column containing data.tables
Examples
library(data.table)
dt <- data.table(
  x = rnorm(1e5),
  y = runif(1e5),
  grp = sample(1L:3L, 1e5, replace = TRUE)
)
dt_nest(dt, grp)
Pivot data from wide to long
Description
dt_pivot_longer() "lengthens" data, increasing the number of rows and
decreasing the number of columns. The inverse transformation is
dt_pivot_wider(). Syntax based on the tidyr equivalents.
Usage
dt_pivot_longer(
  dt_,
  cols = NULL,
  names_to = "name",
  values_to = "value",
  values_drop_na = FALSE,
  ...
)
Arguments
| dt_ | The data table to pivot longer | 
| cols | Column selection. If empty, uses all columns. Can use -colname to unselect column(s) | 
| names_to | Name of the new "names" column. Must be a string. | 
| values_to | Name of the new "values" column. Must be a string. | 
| values_drop_na | If TRUE, rows that contain NAs are dropped. | 
| ... | Additional arguments to pass to 'melt.data.table()' | 
Value
A reshaped data.table into longer format
Examples
library(data.table)
example_dt <- data.table(x = c(1, 2, 3), y = c(4, 5, 6), z = c("a", "b", "c"))
dt_pivot_longer(example_dt,
  cols = c(x, y),
  names_to = "stuff",
  values_to = "things"
)
dt_pivot_longer(example_dt,
  cols = -z,
  names_to = "stuff",
  values_to = "things"
)
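Since extra arguments are passed on to melt.data.table(), it can help to see the same reshape written directly with melt(). This is a sketch of a roughly equivalent call, not the package's internal code.

```r
library(data.table)

example_dt <- data.table(x = c(1, 2, 3), y = c(4, 5, 6), z = c("a", "b", "c"))

# x and y are stacked; z is kept as the identifier column
long <- melt(
  example_dt,
  measure.vars = c("x", "y"),
  variable.name = "stuff",
  value.name = "things"
)

long
```

Each of the three input rows contributes one output row per pivoted column, so the result has six rows.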
Pivot data from long to wide
Description
dt_pivot_wider() "widens" data, increasing the number of columns and
decreasing the number of rows. The inverse transformation is
dt_pivot_longer(). Syntax based on the tidyr equivalents.
Usage
dt_pivot_wider(dt_, id_cols = NULL, names_from, names_sep = "_", values_from)
Arguments
| dt_ | the data table to widen | 
| id_cols | A set of columns that uniquely identifies each observation. Defaults to all columns in the data table except for the columns specified in names_from and values_from. | 
| names_from | A pair of arguments describing which column (or columns) to get the name of the output column (names_from), and which column (or columns) to get the cell values from (values_from). | 
| names_sep | the separator between the names of the columns | 
| values_from | A pair of arguments describing which column (or columns) to get the name of the output column (names_from), and which column (or columns) to get the cell values from (values_from). | 
Value
A reshaped data.table into wider format
Examples
library(data.table)
example_dt <- data.table(
  z = rep(c("a", "b", "c"), 2),
  stuff = c(rep("x", 3), rep("y", 3)),
  things = 1:6
)
dt_pivot_wider(example_dt, names_from = stuff, values_from = things)
dt_pivot_wider(example_dt, names_from = stuff, values_from = things, id_cols = z)
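Because dt_pivot_longer() and dt_pivot_wider() are described as inverse transformations, a round trip is a quick sanity check. This is a minimal sketch, assuming tidyfast is installed. It checks only the shape of the result, since column order may differ from the input.

```r
library(data.table)
library(tidyfast)

example_dt <- data.table(x = c(1, 2, 3), y = c(4, 5, 6), z = c("a", "b", "c"))

# wide -> long: x and y are stacked into "stuff" (names) and "things" (values)
long <- dt_pivot_longer(example_dt,
  cols = c(x, y),
  names_to = "stuff",
  values_to = "things"
)

# long -> wide: the stacked values are spread back into one column per name
wide <- dt_pivot_wider(long, names_from = stuff, values_from = things)

dim(long)
dim(wide)
```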
Set Print Method
Description
The function allows the user to define options relating to the print method for data.table.
Usage
dt_print_options(
  class = TRUE,
  topn = 5,
  rownames = TRUE,
  nrows = 100,
  trunc.cols = TRUE
)
Arguments
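| class | should the variable class be printed? (options("datatable.print.class")) | 
| topn | the number of rows to print (both head and tail) if nrow(dt) > nrows (options("datatable.print.topn")) | 
| rownames | should rownames be printed? (options("datatable.print.rownames")) | 
| nrows | total number of rows to print (options("datatable.print.nrows")) | 
| trunc.cols | if TRUE, only the columns that fit in the console are printed (options("datatable.print.trunc.cols")) | 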
Value
None. This function is used for its side effect of changing options.
Examples
dt_print_options(
  class = TRUE,
  topn = 5,
  rownames = TRUE,
  nrows = 100,
  trunc.cols = TRUE
)
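Because the function works through side effects, it is worth knowing how to inspect and undo them. The sketch below sets the data.table print options directly with options(), assuming these are the options that dt_print_options() changes, and then restores the previous values.

```r
# set the data.table print options by hand, keeping the old values
old <- options(
  datatable.print.class = TRUE,
  datatable.print.topn = 5,
  datatable.print.rownames = TRUE,
  datatable.print.nrows = 100,
  datatable.print.trunc.cols = TRUE
)

# inspect one of the options that is now in effect
topn_now <- getOption("datatable.print.topn")
topn_now

# restore whatever was set before
options(old)
```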
Separate columns with data.table
Description
Separates a column of data into others by splitting on a separator or regular expression.
Usage
dt_separate(
  dt_,
  col,
  into,
  sep = ".",
  remove = TRUE,
  fill = NA,
  fixed = TRUE,
  immutable = TRUE,
  dev = FALSE,
  ...
)
Arguments
| dt_ | the data table (or if not a data.table then it is coerced with as.data.table) | 
| col | the column to separate | 
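| into | the names of the new columns created from splitting col | 
| sep | the regular expression stating how col should be separated. Default is ".". | 
| remove | should col be removed in the returned data table? Default is TRUE. | 
| fill | if empty, fill is inserted. Default is NA. | 
| fixed | logical. If TRUE match split exactly, otherwise use regular expressions. Has priority over perl. | 
| immutable | If TRUE, dt_ is treated as immutable (it will not be modified in place). Set immutable = FALSE to modify the input object in place. | 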
| dev | If  | 
| ... | arguments passed to data.table::tstrsplit() | 
Value
A data.table with a column split into multiple columns.
Examples
library(data.table)
d <- data.table(
  x = c("A.B", "A", "B", "B.A"),
  y = 1:4
)
# defaults
dt_separate(d, x, c("c1", "c2"))
# can keep the original column with `remove = FALSE`
dt_separate(d, x, c("c1", "c2"), remove = FALSE)
# need to assign when `immutable = TRUE` (default)
separated <- dt_separate(d, x, c("c1", "c2"), immutable = TRUE)
separated
# don't need to assign when `immutable = FALSE`
dt_separate(d, x, c("c1", "c2"), immutable = FALSE)
d
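The default combination of sep = "." and fixed = TRUE deserves a note: read as a regular expression, "." matches any character, so the literal match is what makes the default split on dots. The sketch below shows the same split with data.table::tstrsplit(). It is an illustration, assuming tstrsplit() is the underlying splitter, and is not the package's own code.

```r
library(data.table)

d <- data.table(
  x = c("A.B", "A", "B", "B.A"),
  y = 1:4
)

# split on a literal dot; short pieces are padded with NA
d[, c("c1", "c2") := tstrsplit(x, ".", fixed = TRUE, fill = NA)]

d[]
```

With fixed = FALSE the separator would need to be escaped, for example sep = "\\.", to get the same result.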
Select helpers
Description
These functions allow you to select variables based on their names.
-  dt_starts_with(): Starts with a prefix
-  dt_ends_with(): Ends with a suffix
-  dt_contains(): Contains a literal string
-  dt_everything(): Matches all variables
Usage
dt_starts_with(match)
dt_contains(match)
dt_ends_with(match)
dt_everything()
Arguments
| match | a character string to match to variable names | 
Value
None. To be used within the dt_pivot_* functions.
Examples
library(data.table)
# example of using it with `dt_pivot_wider()`
df <- data.table(row = 1, var = c("x", "y"), a = 1:2, b = 3:4)
pv <- dt_pivot_wider(df,
  names_from = var,
  values_from = c(dt_starts_with("a"), dt_ends_with("b"))
)
Uncount
Description
Uncount a counted data table
Usage
dt_uncount(dt_, weights, .remove = TRUE, .id = NULL)
Arguments
| dt_ | the data table to uncount | 
| weights | the column holding the count for each row | 
| .remove | should the weights variable be removed? | 
| .id | an optional new id variable, providing a unique id for each row | 
Value
A data.table with one row for each counted observation.
Examples
library(data.table)
dt_count <- data.table(
  x = LETTERS[1:3],
  w = c(2, 1, 4)
)
uncount <- dt_uncount(dt_count, w, .id = "id")
uncount[] # note that `[]` forces the printing
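Uncounting amounts to repeating each row as many times as its weight says. Here is a minimal sketch of that idea in plain data.table, using hypothetical object names. It illustrates the operation rather than the dt_uncount() implementation.

```r
library(data.table)

counted <- data.table(
  x = LETTERS[1:3],
  w = c(2, 1, 4)
)

# repeat row i exactly w[i] times
row_index <- rep(seq_len(nrow(counted)), counted$w)
expanded <- counted[row_index]

# drop the weights and add a running row id
expanded[, w := NULL]
expanded[, id := seq_len(.N)]

expanded[]
```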
Unnest: Fast Unnesting of Data Tables
Description
Quickly unnest data tables, particularly those nested by dt_nest().
Usage
dt_unnest(dt_, col, keep = TRUE)
Arguments
| dt_ | the data table to unnest | 
| col | the column to unnest | 
| keep | whether to keep the nested column, default is TRUE | 
Examples
library(data.table)
dt <- data.table(
  x = rnorm(1e5),
  y = runif(1e5),
  grp = sample(1L:3L, 1e5, replace = TRUE)
)
nested <- dt_nest(dt, grp)
dt_unnest(nested, col = data)
fcase from data.table
Description
See data.table::fcase() for details.