
data.table provides a high-performance version of base R’s
data.frame with syntax and feature enhancements for ease of
use, convenience and programming speed.
data.table??fread,
see also convenience
features for small data?fwriteIRanges::findOverlaps), non-equi
joins (i.e. joins using operators
>, >=, <, <=), aggregate on
join (by=.EACHI), update on
join?dcast
(pivot/wider/spread) and ?melt
(unpivot/longer/gather)list are supportedinstall.packages("data.table")
# latest development version (only if newer available)
data.table::update_dev_pkg()
# latest development version (force install)
install.packages("data.table", repos="https://rdatatable.gitlab.io/data.table")See the Installation wiki for more details.
Use data.table subset [ operator the same
way you would use data.frame one, but…
DT$ (like
subset() and with() but built-in)j
argument, not just list of columnsby to compute j expression
by grouplibrary(data.table)
DT = as.data.table(iris)
# FROM[WHERE, SELECT, GROUP BY]
# DT [i, j, by]
DT[Petal.Width > 1.0, mean(Petal.Length), by = Species]
# Species V1
#1: versicolor 4.362791
#2: virginica 5.552000example(data.table)data.table is widely used by the R community. It is
being directly used by hundreds of CRAN and Bioconductor packages, and
indirectly by thousands. It is one of the top
most starred R packages on GitHub, and was highly rated by the Depsy project. If you
need help, the data.table community is active on StackOverflow.
A list of packages that significantly support, extend, or make use of
data.table can be found in the Seal
of Approval document.
Guidelines for filing issues / pull requests: Contribution Guidelines.