---
title: "Piped Mode"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Piped Mode}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, echo=FALSE}
library(epitraxr)
```

# Overview

While the package functions can all be called individually (standard mode, described briefly in `vignette("epitraxr")`), we **recommend using the piped mode** of the epitraxr package because it results in much cleaner, more maintainable code. Instead of calling each report generator and saving its result separately, you use the pipe operator `|>` to chain multiple report generators together, and all of the reports are saved to a single `epitrax` object. You can then either work with the reports directly from the `epitrax` object, or add one of epitraxr's export functions to the pipe to write the reports to a supported format (e.g., CSV).

In the epitraxr package, functions that expect a "piped" input are identified by the prefix `epitrax_`. Within this family, report generators are typically prefixed with either `epitrax_preport_` or `epitrax_ireport_` (for public and internal reports, respectively), but can also have the prefix `epitrax_report` (for generators that can produce both public and internal reports). Export functions are prefixed with `epitrax_write_` (e.g., `epitrax_write_csvs()`).

This vignette walks you through each step of an epitraxr pipe, then shows the completed pipe running from end to end.

## Pipe Setup

### Create an `epitrax` object

The first step in piped mode is always to create an `epitrax` object with the `create_epitrax_from_file()` function. This object will contain the data, configuration options, and report settings needed for all reports in the pipe. The `epitrax` object is passed through each function in the pipe.
When a report is generated, it is appended to the appropriate list (public or internal) in the `epitrax` object before the object is passed to the next function in the pipe.

```{r get-epitrax}
data_fp <- "vignette-data/epitrax_data.csv"
epitrax <- create_epitrax_from_file(filepath = data_fp)

names(epitrax)
```

The `create_epitrax_from_file()` function reads the provided data file, then validates and formats the data. It adds the data to the `epitrax` object as `epitrax$data`. The function also extracts key information and summary statistics from the data and adds those to the `epitrax` object as well:

- `epitrax$diseases`: All diseases found in the data.
- `epitrax$yrs`: Years included in the data.
- `epitrax$report_year` and `epitrax$report_month`: The year and month treated as the "current" date for reports. Both default to the latest year/month in the data.
- `epitrax$internal_reports` and `epitrax$public_reports`: Lists to hold generated reports. Both are initially empty.

**Note:** All further functions in the pipe will expect an object of class `epitrax` as their first argument. Thus, `create_epitrax_from_file()` is the start of the pipe.

### Add Disease Lists

The next step is adding two disease lists, one for internal reports and one for public reports. If a disease on a list does not appear in the EpiTrax data, that means there were no reported cases of that disease in those years. That is still useful information that you may want to include in your reports.

The epitraxr package uses two lists because public reports typically include a subset of diseases, while internal reports typically include all tracked diseases.

Add the disease lists to the `epitrax` object using the `epitrax_set_report_diseases()` function.
```{r add-diseases}
disease_lists <- list(
  internal = "vignette-data/ireport_diseases.csv",
  public = "vignette-data/preport_diseases.csv"
)

epitrax <- epitrax_set_report_diseases(epitrax, disease_list_files = disease_lists)

names(epitrax)
```

The `epitrax` object now contains `report_diseases`, with `report_diseases$internal` and `report_diseases$public` holding the individual lists.

### Add Config

The last step is adding configuration options. These can be read from a list (`epitrax_set_config_from_list()`) or from a file (`epitrax_set_config_from_file()`). Configuration options provide report generators with important values, such as your area's current and previous population (used for converting counts to rates per 100k) and the trend threshold (used to determine if current counts are above or below historical counts).

```{r set-config}
config_file <- "vignette-data/config.yaml"
epitrax <- epitrax_set_config_from_file(epitrax, filepath = config_file)

names(epitrax)
```

The `epitrax` object now contains the `config` details:

```{r show-config}
epitrax$config
```

### Convenient Setup

Since these three operations must always occur before the report generators can be run, epitraxr provides the convenience function `setup_epitrax()`, which performs all three in a single call.

```{r conveniencefun}
epitrax <- setup_epitrax(
  filepath = data_fp,
  config_file = config_file,
  disease_list_files = disease_lists
)

names(epitrax)
```

## Running Report Generators

At this point, the `epitrax` object is ready to be piped into report generators.
To start, run `epitrax_ireport_annual_counts()` and `epitrax_ireport_monthly_counts_all_yrs()`, then inspect the list of reports:

```{r ireports-noargs}
epitrax <- epitrax_ireport_annual_counts(epitrax)
epitrax <- epitrax_ireport_monthly_counts_all_yrs(epitrax)

names(epitrax$internal_reports)
```

Call a few more report generators:

```{r morereports}
epitrax <- epitrax_ireport_monthly_avgs(epitrax)
epitrax <- epitrax_ireport_ytd_counts_for_month(epitrax)
epitrax <- epitrax_preport_month_crosssections(epitrax)
epitrax <- epitrax_preport_ytd_rates(epitrax)
```

The object now contains these internal reports:

```{r inspect-ireports}
names(epitrax$internal_reports)
```

And these public reports:

```{r inspect-preports}
names(epitrax$public_reports)
```

As you can see, each report generator simply appends the reports it creates to the appropriate list.

## Exporting Reports

While you may want to process the reports contained in the `epitrax` object in R, you will often export the generated reports to one of the formats supported by epitraxr.

### Setup Filesystem

To use export functions in epitraxr, you need to provide folder paths for internal and public reports. These are organized as a list. The `setup_filesystem()` function creates the folders (if they don't already exist) and optionally clears out any old reports from previous runs:

```{r set-fsys}
tmpdir <- tempdir()
fsys <- list(
  internal = file.path(tmpdir, "internal_reports"),
  public = file.path(tmpdir, "public_reports")
)
fsys <- setup_filesystem(folders = fsys, clear.reports = TRUE)
```

You can skip the `setup_filesystem()` function if you know your folders already exist and are ready to receive reports.

You will pass this `fsys` list to epitraxr export functions.

### Export to CSV

The most common export format is CSV, written with the `epitrax_write_csvs()` function.
```{r export-csv}
epitrax <- epitrax_write_csvs(epitrax, fsys = fsys)

list.files(fsys$internal)
list.files(fsys$public)
```

Typically, export functions are called at the end of the pipe. However, since export functions do not modify the `epitrax` object, you can safely insert these functions anywhere in the pipe.

```{r cleanup1, include = FALSE}
unlink(unlist(fsys, use.names = FALSE), recursive = TRUE)
```

## Full Pipe: Putting It All Together

Here is the full pipe described above:

```{r full-pipe}
# Data and config files
data_fp <- "vignette-data/epitrax_data.csv"
disease_lists <- list(
  internal = "vignette-data/ireport_diseases.csv",
  public = "vignette-data/preport_diseases.csv"
)
config_file <- "vignette-data/config.yaml"

# Setup filesystem
tmpdir <- tempdir()
fsys <- list(
  internal = file.path(tmpdir, "internal_reports"),
  public = file.path(tmpdir, "public_reports")
)
fsys <- setup_filesystem(folders = fsys, clear.reports = TRUE)

# Run report generation pipe
epitrax <- setup_epitrax(
  filepath = data_fp,
  config_file = config_file,
  disease_list_files = disease_lists
) |>
  epitrax_ireport_annual_counts() |>
  epitrax_ireport_monthly_counts_all_yrs() |>
  epitrax_ireport_monthly_avgs() |>
  epitrax_ireport_ytd_counts_for_month() |>
  epitrax_preport_month_crosssections() |>
  epitrax_preport_ytd_rates() |>
  epitrax_write_csvs(fsys = fsys)

length(epitrax$internal_reports)
list.files(fsys$internal)

length(epitrax$public_reports)
list.files(fsys$public)
```

```{r cleanup2, include = FALSE}
unlink(unlist(fsys, use.names = FALSE), recursive = TRUE)
```
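
If you are curious how a pipe like the one above can accumulate results, the following is a minimal, self-contained sketch in base R (the native pipe `|>` requires R 4.1 or later). It does not use epitraxr and makes no claim about the package's internals: the `toy` class and the functions `new_toy()`, `toy_ireport_total()`, and `toy_preport_mean()` are hypothetical names invented purely for illustration. Each step takes the object as its first argument, appends a report to one of two lists, and returns the object so the next step can use it.

```r
# Hypothetical stand-in for an `epitrax`-like object.
# This is NOT the real structure used by epitraxr.
new_toy <- function(values) {
  structure(
    list(
      data = values,
      internal_reports = list(),
      public_reports = list()
    ),
    class = "toy"
  )
}

# Hypothetical internal report generator:
# appends a report to the internal list and returns the whole object.
toy_ireport_total <- function(obj) {
  stopifnot(inherits(obj, "toy"))
  obj$internal_reports[["total"]] <- sum(obj$data)
  obj
}

# Hypothetical public report generator:
# same pattern, but appends to the public list.
toy_preport_mean <- function(obj) {
  stopifnot(inherits(obj, "toy"))
  obj$public_reports[["mean"]] <- mean(obj$data)
  obj
}

# Chain the steps with the native pipe.
toy <- new_toy(c(2, 4, 6)) |>
  toy_ireport_total() |>
  toy_preport_mean()

names(toy$internal_reports)
names(toy$public_reports)
```

Because every step returns the whole object, nesting the calls as `toy_preport_mean(toy_ireport_total(new_toy(c(2, 4, 6))))` gives the same result. The pipe only makes the order of the steps easier to read.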