Maintainer: | Mark van der Loo <mark.vanderloo@gmail.com> |
License: | GPL-3 |
Title: | Modify Data Using Externally Defined Modification Rules |
Type: | Package |
LazyLoad: | yes |
Description: | Data cleaning scripts typically contain a lot of 'if this change that' type of statements. Such statements are typically condensed expert knowledge. With this package, such 'data modifying rules' are taken out of the code and instead become parameters to the workflow. This allows one to maintain, document, and reason about data modification rules as separate entities. |
Version: | 0.9.0 |
Depends: | methods |
URL: | https://github.com/data-cleaning/dcmodify |
BugReports: | https://github.com/data-cleaning/dcmodify/issues |
Encoding: | UTF-8 |
Imports: | yaml, validate (≥ 1.1.3), lumberjack (≥ 1.3.1), settings, utils |
Suggests: | simplermarkdown, tinytest |
VignetteBuilder: | simplermarkdown |
Collate: | 'dplyr_verbs.R' 'guard.R' 'modifier.R' 'modify.R' 'validate.R' |
RoxygenNote: | 7.3.1 |
NeedsCompilation: | no |
Packaged: | 2024-03-27 15:07:31 UTC; mark |
Author: | Mark van der Loo |
Repository: | CRAN |
Date/Publication: | 2024-03-28 08:20:06 UTC |
Data Modification By Modifying Rules
Description
Data often contain errors and missing values. Experts can often correct commonly occurring errors based on simple conditional rules. This package facilitates the expression, management, and application of such rules on data sets.
The general workflow in dcmodify follows this pattern.
- Define or read a set of rules with modifier.
- Use modify to apply the modification rules to the data.
- Examine the results either graphically or by summary.
There are several convenience functions that allow one to define modification rules from the command line or through a (freeform or yaml) file, and to investigate and maintain the rules themselves. Please have a look at the introductory vignette:
vignette("introduction",package="dcmodify")
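The three workflow steps described above can be sketched as follows. This is a minimal illustration, assuming the built-in women data set and the same two rules used in the package examples. The inspection step uses only base R functions, which is a choice made for this sketch; the package may offer more convenient ways to examine results.

```r
library(dcmodify)

# Step 1: define a set of modification rules.
rules <- modifier(
  if (height < mean(height)) height <- 2 * height,
  if (weight > mean(weight)) weight <- weight / 2
)

# Step 2: apply the rules to a data set.
out <- modify(women, rules)

# Step 3: examine the result. Here we count the changed cells per column
# and summarise the modified data with base R (an assumption of this sketch).
changed <- colSums(out != women)
print(changed)
print(summary(out))
```

Keeping the rules in their own object means the same rule set can be applied to other data sets with the same columns.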
Author(s)
Maintainer: Mark van der Loo <mark.vanderloo@gmail.com> (ORCID)
Authors:
Edwin de Jonge (ORCID)
Other contributors:
Sjabbo Schaveling [contributor]
Floris Ruijter [contributor]
See Also
Useful links:
Report bugs at https://github.com/data-cleaning/dcmodify/issues
Create or read a set of data modification rules
Description
Create or read a set of data modification rules
Usage
modifier(..., .file, .data)
Arguments
... | A comma-separated list of modification rules. |
.file | (optional) A character vector of file locations. |
.data | (optional) A |
Value
An object of class modifier.
Examples
library(dcmodify)
# Double below-average heights and halve above-average weights
m <- modifier( if (height < mean(height)) height <- 2*height
             , if (weight > mean(weight)) weight <- weight/2 )
modify(women, m)
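Rules can also be read from file through the .file argument. The sketch below assumes that a free-form rule file is a plain text file with one rule per line; the temporary file and its name are hypothetical and only serve the illustration.

```r
library(dcmodify)

# Write two rules to a temporary free-form text file (hypothetical location).
rule_file <- tempfile(fileext = ".txt")
writeLines(c(
  "if (height < mean(height)) height <- 2 * height",
  "if (weight > mean(weight)) weight <- weight / 2"
), rule_file)

# Read the rules from file instead of typing them inline.
m_file <- modifier(.file = rule_file)

# The same rules defined on the command line, for comparison.
m_inline <- modifier(
  if (height < mean(height)) height <- 2 * height,
  if (weight > mean(weight)) weight <- weight / 2
)

# Both rule sets are expected to produce the same modified data.
same_result <- isTRUE(all.equal(modify(women, m_file), modify(women, m_inline)))
print(same_result)
```

Storing rules in a file keeps them under version control separately from the processing script.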
Store modification rules
Description
Store modification rules
Modify a data set
Description
Modify a data set
Usage
modify(dat, x, ref, ...)
## S4 method for signature 'data.frame,modifier,environment'
modify(dat, x, ref, logger = NULL, ...)
## S4 method for signature 'data.frame,modifier,ANY'
modify(dat, x, logger = NULL, ...)
## S4 method for signature 'data.frame,modifier,data.frame'
modify(dat, x, ref, logger = NULL, ...)
## S4 method for signature 'data.frame,modifier,list'
modify(dat, x, ref, logger = NULL, ...)
Arguments
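dat | A data.frame |
x | A modifier object |
ref | An environment, data.frame, or list (see the method signatures under Usage) |
... | Extra arguments. |
logger | Optional. A |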
Examples
library(dcmodify)
# Apply two conditional rules to the built-in 'women' data set
m <- modifier( if (height < mean(height)) height <- 2*height
             , if (weight > mean(weight)) weight <- weight/2 )
modify(women, m)
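To make clear what the rules in this example do, the sketch below writes the same two modifications as plain indexed assignments in base R. It assumes that each rule is evaluated over the whole column and that rules are applied one after another; the comparison with modify only runs when dcmodify is installed.

```r
# Base R version of the two rules, written as indexed assignments.
expected <- women

# Rule 1: double the heights that are below the mean height.
low <- expected$height < mean(expected$height)
expected$height[low] <- 2 * expected$height[low]

# Rule 2: halve the weights that are above the mean weight.
heavy <- expected$weight > mean(expected$weight)
expected$weight[heavy] <- expected$weight[heavy] / 2

# Optional check against modify(), skipped when dcmodify is not installed.
if (requireNamespace("dcmodify", quietly = TRUE)) {
  m <- dcmodify::modifier(
    if (height < mean(height)) height <- 2 * height,
    if (weight > mean(weight)) weight <- weight / 2
  )
  print(all.equal(dcmodify::modify(women, m), expected))
}
```

The rule-based version expresses the same logic without the index bookkeeping, which is the main point of keeping rules separate from code.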
Shortcut to modify data
Description
Shortcut to modify data
Usage
modify_so(dat, ...)
Arguments
dat | A |
... | A comma-separated list of modifying rules. |
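A short usage sketch for modify_so, assuming it behaves like calling modifier on the rules and then modify on the data, as the "shortcut" description suggests. The women data set and the rule are taken from the other examples in this manual.

```r
library(dcmodify)

# One-step version: pass the data and the rules directly.
out_short <- modify_so(women, if (height < mean(height)) height <- 2 * height)

# Two-step version for comparison: build the rule set, then apply it.
m <- modifier(if (height < mean(height)) height <- 2 * height)
out_long <- modify(women, m)

# Under the assumption above, both results should agree.
print(all.equal(out_short, out_long))
```

The shortcut is handy for interactive use, while a stored modifier object is easier to document and reuse.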
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- validate: as_yaml, description, description<-, do_by, export_yaml, label, label<-, max_by, mean_by, min_by, origin, origin<-, sum_by