---
title: "Convenient fetching of EZbakR outputs: EZget()"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{EZget}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

This vignette shows how to use the `EZget()` function provided by `EZbakR`. When your `EZbakRData` object contains multiple tables of a particular type, `EZget()` makes it much easier to extract the table of interest. As a part of this vignette, I will also describe how an `EZbakRData` object is organized.

```{r setup}
library(EZbakR)
```

## EZbakRData objects

Let's first analyze some simulated data to generate an `EZbakRData` object whose contents we can explore:

```{r}
simdata <- EZSimulate(nfeatures = 300, nreps = 2)

# Make initial EZbakRData object
ezbdo <- EZbakRData(simdata$cB, simdata$metadf)

# Estimate fractions twice, and don't overwrite the first analysis
# Second run will use different model; see EstimateFractions vignette for details
ezbdo <- EstimateFractions(ezbdo)
ezbdo <- EstimateFractions(ezbdo, strategy = 'hierarchical', overwrite = FALSE)

# Estimate kinetic parameters with three different strategies
# See EstimateKinetics vignettes for details.
ezbdo <- EstimateKinetics(ezbdo, repeatID = 1)
ezbdo <- EstimateKinetics(ezbdo, repeatID = 1, strategy = "shortfeed")
ezbdo <- EstimateKinetics(ezbdo, repeatID = 2, strategy = "shortfeed")
```

An `EZbakRData` object is a list that can contain the following items:

1. **cB**: The cB table you provided upon object creation.
2. **metadf**: The metadf table you provided upon object creation.
3. **fractions**: List of fractions estimates generated by `EstimateFractions()`.
4. **kinetics**: List of kinetic parameter estimates generated by `EstimateKinetics()`.
5. **averages**: List of parameter replicate averages generated by `AverageAndRegularize()`.
6. **comparisons**: List of comparisons of parameter averages, generated by `CompareParameters()`.
7. **dynamics**: List of dynamical systems model parameter estimates, generated by `EZDynamics()`.
8. **readcounts**: List of tables of read counts generated by various EZbakR functions.
9. **metadata**: List with elements corresponding to the lists of tables described above. Describes various features of the tables so that they can be fetched with `EZget()`.

Because an `EZbakRData` object is a list, its elements can be accessed in a few ways:

```{r}
# `$` notation:
ezbdo$fractions$feature

# `[[]]` notation with element names
ezbdo[['fractions']][['feature']]

# `[[]]` notation with numeric indices
ezbdo[[4]][[1]]
```

## Using EZget

`EZget()` provides an alternative strategy for getting a particular table. It has two required arguments:

1. `obj`: The `EZbakRData` object you would like to get a table from.
2. `type`: The type of table you are looking for. Options are "fractions", "kinetics", "readcounts", "averages", and "comparisons", which correspond to the lists of tables described above.

Most of the remaining parameters are search criteria that you specify. The full list can be seen in the function docs (`?EZget`). These all accept strings or vectors of strings as input, and each table's metadata is checked to see whether the provided string is contained in the corresponding metadata slot. For example, we can extract the kinetics table generated from the standard analysis like so:

```{r}
kinetics <- EZget(ezbdo, type = "kinetics", kstrat = "standard")
```

In some cases, multiple tables with the exact same metadata exist. For example, the metadata for `fractions` tables is:

* The feature columns by which reads were grouped. This is "feature" for both of our `fractions` tables.
* The mutational populations analyzed. This is "TC" for both of our `fractions` tables.
* The fraction_design table used. This is the standard fraction_design for a single mutation type analysis for both of our `fractions` tables.

Since we set `overwrite = FALSE` in our second run of `EstimateFractions()`, both tables were saved. What distinguishes them is a final piece of metadata saved for all tables: `repeatID`. This is a numerical ID that distinguishes multiple instances of the same table. The ID is 1 for the first such object created, 2 for the second, and so on. Thus, the analysis with the standard mixture model has a `repeatID` of 1, and the analysis with the hierarchical mixture model has a `repeatID` of 2. We can therefore access the latter like so:

```{r}
h_fxn <- EZget(ezbdo, type = 'fractions', repeatID = 2)
```

Three parameters tune `EZget()`'s behavior:

1. `returnNameOnly`: If `TRUE`, only the names of the tables consistent with the search criteria you specify are returned. This throws a warning if more than one table passes your criteria, but it does not error in this case. If `returnNameOnly` is `FALSE`, an error is thrown if more than one table matches your search criteria.
2. `exactMatch`: The `features` and `populations` arguments are the two arguments that can be vectors of strings. Setting `exactMatch` to `TRUE` forces the provided `features` and `populations` vectors to exactly match those in a table's metadata for that table to be returned. By default, the provided `features` and `populations` only need to all be contained in a table's metadata.
3. `alwaysCheck`: If only a single table of the relevant `type` is present in your `EZbakRData` object, `EZget()` automatically returns that table without checking whether the search criteria match. If you set `alwaysCheck` to `TRUE`, the table is searched for as normal and is only returned if its metadata match the search criteria.
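
As a rough sketch of how these three options might be used, the chunk below reuses the `ezbdo` and `simdata` objects created earlier in this vignette. The name `ezbdo_single` is introduced here purely for illustration, and the comments describe the behavior expected from the descriptions above rather than verified output.

```{r}
library(EZbakR)

# Assumes `simdata` and `ezbdo` from the earlier chunks are still in the session.

# returnNameOnly: get table names rather than a table. Two fractions tables
# match here, so a warning (not an error) is expected.
fxn_names <- EZget(ezbdo, type = "fractions", returnNameOnly = TRUE)

# exactMatch: require `features` and `populations` to match the table's
# metadata exactly; repeatID picks the first of the two fractions tables.
fxn_exact <- EZget(ezbdo, type = "fractions",
                   features = "feature", populations = "TC",
                   exactMatch = TRUE, repeatID = 1)

# alwaysCheck: `ezbdo_single` (hypothetical name) holds one fractions table,
# so the search criteria are only checked because alwaysCheck = TRUE.
ezbdo_single <- EstimateFractions(EZbakRData(simdata$cB, simdata$metadf))
fxn_checked <- EZget(ezbdo_single, type = "fractions",
                     features = "feature", alwaysCheck = TRUE)
```

Without `alwaysCheck = TRUE`, the last call would return the only `fractions` table in `ezbdo_single` regardless of the `features` value supplied.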