--- title: "Workflow of the rPACI package" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Workflow of the rPACI package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} references: - id: dramoslopez2011 title: Placido-based indices of corneal irregularity author: - family: Ramos-López given: Darío - family: Martínez-Finkelshtein given: Andrei - family: Castro-Luna given: Gracia M. - family: Piñero given: David - family: Alió given: Jorge L. container-title: Optometry and Vision Science volume: 88 URL: 'https://doi.org/10.1097/opx.0b013e3182279ff8' DOI: 10.1097/opx.0b013e3182279ff8 issue: 10 publisher: American Academy of Optometry page: 1220-1231 type: article-journal issued: year: 2011 month: 10 - id: dramoslopez2013 title: Screening subclinical keratoconus with Placido-based corneal indices author: - family: Ramos-López given: Darío - family: Martínez-Finkelshtein given: Andrei - family: Castro-Luna given: Gracia M. - family: Burguera-Giménez given: Neus - family: Vega-Estrada given: Alfredo - family: Piñero given: David - family: Alió given: Jorge L. container-title: Optometry and Vision Science volume: 90 URL: 'https://doi.org/10.1097/opx.0b013e3182843f2a' DOI: 10.1097/opx.0b013e3182843f2a issue: 4 publisher: American Academy of Optometry page: 335-343 type: article-journal issued: year: 2013 month: 4 - id: castroluna2020 title: Robust keratoconus detection with Bayesian network classifier for Placido based corneal indices author: - family: Castro-Luna given: Gracia M. - family: Martínez-Finkelshtein given: Andrei - family: Ramos-López given: Darío container-title: Contact Lens and Anterior Eye volume: 43 URL: 'https://doi.org/10.1016/j.clae.2019.12.006' DOI: 10.1016/j.clae.2019.12.006 issue: 4 publisher: Elsevier page: 366-372 type: article-journal issued: year: 2020 --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` The **rPACI** package is developed to facilitate the calculation and interpretation of several indices for detecting corneal irregularities, and especially keratoconus. These indices were introduced in @dramoslopez2011, @dramoslopez2013 and @castroluna2020, and proved to be effective in detecting keratoconus and early keratoconus. ## Package installation **rPACI** depends on the **bnlearn** package and uses functions from the **ggplot2**, **tidyr** and **ggpubr** packages as well. To install and load the package and its dependencies, open an R session and write the following code: ```{r, eval = F} install.packages("rPACI", dependencies = TRUE) library("rPACI") ``` ```{r, echo = F, warning = F, message=F} library("rPACI") ``` ## Available datasets The **rPACI** package contains 11 datasets, real and simulated, with two different formats. File name | Source | File format | Directory | Corneal condition ------- | ------- | ------- | ------- | ------- ds1.txt | Real measurement| rPACI | system.file("extdata", package="rPACI") | Normal ds2.txt | Simulated data | rPACI | system.file("extdata", package="rPACI") | Normal K01.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Keratoconus K02.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Keratoconus K03.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Keratoconus N01.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Normal N02.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Normal S01.txt | Real measurement | CSO | system.file("extdata", package="rPACI") | Keratoconus suspect simulated_2020_05_31.txt | Simulated data | rPACI| system.file("extdata/evolution/", package="rPACI") | Normal simulated_2020_10_31.txt | Simulated data | rPACI | system.file("extdata/evolution/", package="rPACI") | Keratoconus suspect simulated_2021_04_30.txt | Simulated data | rPACI | system.file("extdata/evolution/", package="rPACI") | Normal ## Reading data The package includes three functions to read data: `readCSO()`, `readrPACI()` and `readFile()`. The two former ones read data in specific formats (explained below), while the latter one is a wrapper function able to read both formats. In general, we recommend using the more general wrapper function `readFile()` to read any file format. ### readCSO() To begin with, external files with a corneal topography in the format exported by some Placido disk topographers, especially those from CSO (a commercial brand), can be loaded using the function `readCSO()`. We recommend reading the [Corneal topographers and data format](topographersDataFormat.html) vignette. This reading function allows to specify different parameters: * `filepath`: The name of the file to be read. * `ringsTotal`: The total (maximum) number of rings that may be available in the measurement (including incomplete rings or missing data; it depends on the particular device; by default 24). * `pointsPerRings`: The number of points per ring that are digitized in the measurement (it depends on the particular device; by default 256). * `ringsToUse`: The effective number of innermost rings to use (as long as they are complete if `onlyCompleteRings = TRUE`, otherwise it will be the actual number of complete rings; by default 15). * `onlyCompleteRings`: A boolean value indicating whether to use only rings with complete data or not (by default, TRUE). * `NAvalues`: A numerical value or vector indicating how NA values are codified in the file (by default c(-1, -1000)). In the following example, a sample file with the corneal topography of a normal eye (included in the package as `N01.txt`) is used: ```{r, eval = T} dataset1 = readCSO(filepath = system.file("extdata","N01.txt", package="rPACI")) ``` This function produces a `data.frame` in the usual format used by `rPACI`, i.e., a data frame with three columns (x and y coordinates of each point and its ring index) and a row per data point, according to the function parameters (by default, 24*256 = 6144 rows or data points). ```{r, eval = T} head(dataset1) ``` ### readrPACI() On the other hand, external files with a corneal topography in the format exported by rPACI can also be loaded using the function `readrPACI()`. This function has two arguments: the path to the file (`filepath`) and the character used as column separator in the file (`sep`). In the following example, a sample file with the rPACI format (included in the package) is loaded: ```{r, eval = F} dataset2 = readrPACI(filepath = system.file("extdata","ds1.txt", package="rPACI")) ``` This function returns a `data.frame` with the same characteristics as `readCSO()`. ### readFile() Finally, the general wrapper function `readFile()` can be used to read any of the aforementioned file formats. Internally, it determines the format of the specified file and applies either `readCSO()` or `readrPACI()` if possible, or else it throws an error (if none can be applied, which occurs when the file format does not fit any of these two available formats). This is the reading function recommended by default, as it is able to read any supported file format. The `readFile()` function takes the file path as a mandatory argument and, optionally, any of the arguments available for the other two reading functions. As a result, this function creates a `data.frame` in the usual format used by `rPACI`, as the other two functions do. In the following example, a sample file included in the package is loaded using the `readFile()` wrapper function: ```{r, eval = T} dataset3 = readFile(filepath = system.file("extdata","N01.txt", package="rPACI")) head(dataset3) ``` ## Saving data The function `writerPACI()` writes to disk a corneal topography dataset in the format used by `rPACI`. The file will be saved in structured plain text, which will possibly have a header, followed by a block of three separated columns, according to the usual format used by `rPACI`, i.e., a list with three columns (x and y coordinates of each point, and its ring index) and a row per data point. If the given `data.frame` (named `dataset`) was produced using the function `simulateData()`, the resulting text file will also include the `Parameters` attribute (`attr(dataset,'Parameters')`) in its header, i.e., the list of parameters used for the simulation. A file stored with `writerPACI()` can later be read using the general reader function `readFile()` or the specific reader function `readrPACI()`. The following piece of code shows how to use the `writerPACI()` function. ````{r, eval = F} # Simulating an elliptic dataset, with ellipses axis ratio of 0.8 and an orientation of 45 degrees. dataset = simulateData(rings = 18, pointsPerRing = 300, ellipticAxesRatio = 0.8, ellipticRotation = 45) # Now the dataset can be saved to file using 'writerPACI' (check the working directory before saving): writerPACI(dataset, "datasetFile.txt") ```` See more details about the file structure in the [Corneal topographers and data format](topographersDataFormat.html) vignette and about simulation parameters in the [Simulating corneal datasets](simulating.html) vignette. ## Computing the Placido irregularity indices When a corneal dataframe is available in the global environment, the [Placido irregularity indices](indicesDefinition.html) of this data set can be computed by calling function `computePlacidoIndices()`, which takes the object returned by `readFile()` (or `readCSO()` or `readrPACI()`) as its only argument: ```{r, eval = T} results_N = computePlacidoIndices(dataset1) results_N ``` The object returned by `computePlacidoIndices()` is of class `data.frame` and contains 12 columns and 1 row (because we are analyzing only one eye). The first column of the returned `data.frame` is the diagnosis, based on the GLPI index, which can be either `Irregular cornea` (GLPI $\geq$ 70), `Suspect cornea` (30 $\leq$ GLPI <70) or `Normal cornea` (GLPI <30). The next column is the Naive Bayes Index (NBI), which ranges between 0 and 100 and can be interpreted as the probability of suffering from keratoconus. The remaining columns correspond to the primary indices, defined in the [Mathematical definition of the indices](indicesDefinition.html) vignette. ## Plotting the results The results of the previous analysis can be plotted using function `plotSingleCornea()`, which takes 3 arguments: `dataset`, a `data.frame` containing the corneal topography data, i.e., the object returned by `readFile()` (or `readCSO()` or `readrPACI()`); `PlacidoIndices`, a `data.frame` containing the computed Placido indices, i.e., the object returned by `computePlacidoIndices()`; and, optionally, `filename`, a character vector to be displayed on the plot (for instance, the filename of the corneal topography dataset). ```{r, eval = T, fig.width = 7, fig.height = 4} plotSingleCornea(dataset1, results_N, filename = "N01.txt") ``` The left-hand side of the figure shows the input data, whereas the right-hand side shows two charts: the GLPI index plot, which visually indicates the value taken by this index on a colored scale of possible values, and the PI indices distribution, which shows the distribution of the $PI_1$, $PI_2$, $PI_3$ and $SL$ indices in a boxplot placed on a scale of possible values. The colors of the charts indicate whether the indices fall within the *normal cornea* region (green color), *suspicious cornea* region (orange color), or *irregular cornea* region (red color). In the example shown, we can clearly see that the eye is diagnosed as normal. ## Using the wrapper function to analyze one external file Alternatively to use the `readFile()`, `computePlacidoIndices()` and `plotSingleCornea()` functions, the wrapper function `analyzeFile()` can be used instead, which takes 2 arguments: `path`, which is a character string indicating the location of the corneal topography file, and `drawplot`, which is a logical argument indicating whether the results should be plotted or not. This function returns a `data.frame` containing the same information as the object returned by `computePlacidoIndices()` and, optionally, the plot returned by `plotSingleCornea()` if the argument `drawplot` is `TRUE`. ```{r, eval = T, fig.width = 7, fig.height = 4} res_K = analyzeFile(system.file("extdata","K01.txt", package="rPACI"), drawplot=TRUE) res_K ``` In this example, the results indicate that the patient's cornea is irregular. In this case, the plot shows a GLPI index of 100 and a distribution of PI indices ranging from 100 to 130. ## Using the wrapper function to analyze one dataset The function `analyzeFile()` reads a file stored in our computer. But, what if we already have a corneal data frame loaded in our global environment? We can use the wrapper function `analyzeDataset()` instead. This function is a wrapper of `computePlacidoIndices()` and `plotSingleCornea()`, so that it returns the same objects as `analyzeFile()`. This is a convenient function, for instance, when we use the `simulateData()` function (for more details on this function, see the [Simulating corneal datasets](simulating.html) vignette) or have previously loaded the data using the `readFile()` function. `analyzeDataset()` takes two arguments: `dataset`, which is a corneal topography dataset in the format supported by **rPACI**; and `drawplot`, which is an optional logical parameter indicating whether a plot of results should be displayed or not (by default, TRUE). ```{r, eval = T, fig.width = 7, fig.height = 4} # Generate a sample dataset dataset = simulateData(rings = 12, maximumMireDisplacement = 0.2, mireDisplacementAngle = 50) # Analyze the dataset res_dataset = analyzeDataset(dataset = dataset) res_dataset ``` ## Analyzing multiple patients simultaneously So far, `analyzeFile()` and `analyzeDataset()` can only be used to analyze one dataset (i.e., one patient's eye) at a time. In order to analyze multiple files simultaneously, the function `analyzeFolder()` can be used. This function takes 4 arguments: `path`, to indicate the location of the folder containing the files; `fileExtension`, which indicates the extension of the files, is set to 'txt' by default; `individualPlots`, which is an optional logical argument indicating whether the plot for each file should be displayed or not; and `summaryPlot`, which is an optional logical argument indicating whether a summary plot of all files analyzed should be displayed or not. If the argument `summaryPlot()` is `TRUE`, then a barplot showing the absolute frequency of each possible value of diagnosis is depicted. ```{r, eval = T, fig.width = 7, fig.height = 4} resultsAll = analyzeFolder(path = system.file("extdata", package="rPACI"), individualPlots = T, summaryPlot = T) resultsAll ``` This function returns a `data.frame` containing 13 columns and as many rows as files analyzed. The first 12 columns correspond to the diagnosis and the indices, as in the object returned by `computePlacidoIndices()`. The last column corresponds to the file name so that a specific patient can be easily found. To see the diagnosis for each analyzed file, the first and last columns can be selected: ```{r, eval = T} resultsAll[,c(13,1)] ``` Note that the rows are sorted from *Irregular* to *Normal* cornea. In this example, three eyes are diagnosed as *irregular*, one as *suspect*, and four as *normal*. This is an easy and straightforward way to check if any patient potentially suffers from keratoconus. ## Analyzing one patient over time In order to examine the evolution of a patient's eye over time, the function `analyzeEvolution()` can be used. This function takes two arguments: `data`, which can be either 1) the path of a folder that contains corneal topography files, with format supported by **rPACI**, or 2) a list containing properly formatted data (loaded from a file using the function `readFile()` (or `readCSO()` or `readrPACI()`), simulated using `simulateData()`, or by other ways, as long as it meets the dataset requirements). If `data` is a path to a folder, the second argument, `fileExtension`, must be specified and all the files (with the given extension) in that folder will be assumed to be corneal topography files of a patient's eye and, therefore, will be loaded. Moreover, it will be assumed that the temporal arrangement is the alphabetical order of the filenames. Therefore, it is advised to use a proper file name, for instance using this date format: 'YYYY-MM-DD.txt'. On the other hand, if the data are stored in a list, it will be assumed that the temporal order corresponds with the index of each dataset in the list. The next example simulates three patient's measurements over time using the function `simulateData()` and then analyzes these files: ````{r, eval = F, fig.width = 7, fig.height = 4} # Simulate the patient's measures over time dataT1 = simulateData(rings = 12, maximumMireDisplacement = 0.15, mireDisplacementAngle = 10) dataT2 = simulateData(rings = 12, maximumMireDisplacement = 0.15, mireDisplacementAngle = 45) dataT3 = simulateData(rings = 12, maximumMireDisplacement = 0.2, mireDisplacementAngle = 50) # Create a list containing the data data = list( dataT1 = dataT1, dataT2 = dataT2, dataT3 = dataT3 ) # Analyze the data over time analyzeEvolution(data = data) ```` This other example imports the corneal data from disk and analyzes the files: ````{r, eval = T, fig.width = 7, fig.height = 4} # Specify a folder path to analyze a patient's evolution over time analyzeEvolution(data = system.file("extdata/evolution/", package="rPACI"), fileExtension = 'txt') ```` The `analyzeEvolution()` function returns a `data.frame` containing 13 columns and as many rows as files analyzed. The first 12 columns correspond to the diagnosis and the indices, as in the object returned by `computePlacidoIndices()`. The last column corresponds to the time step at which the measures were taken. Moreover, two temporal plots are returned. The left-hand side plot shows the $GLPI$ index, represented by a red line, and the boxplots of the primary indices $PI_1$, $PI_2$, $PI_3$, and $SL$ over time. The right-hand side plot shows the times series of these five indices, $GLPI$, $PI_1$, $PI_2$, $PI_3$, and $SL$, individually. Finally, both plots present a colored background, corresponding with the final diagnosis of the patient. In this example, the results indicate that the patient's cornea was normal at time 1, suspected at time 2, and irregular at time 3. ## Summary To summarize, the following description chart can help decide which function to use depending on the specific case to analyze. ![](diagram.png) ````{R, echo = F, eval = F, fig.width = 4, fig.height = 4} # library(DiagrammeR) # DiagrammeR::grViz("digraph { # # graph [layout = dot] # # # define the global styles of the nodes. We can override these in box if we wish # node [shape = rounded, style = filled, fillcolor = Pink] # # step1 [label = 'How many patients?', fillcolor = Beige] # step2 [label = 'How many measures?', fillcolor = Beige] # step3 [label = 'Data on file or \n data.frame?', fillcolor = Beige] # leaf1 [label = 'analyzeFile()'] # leaf2 [label= 'analyzeDataset()'] # leaf3 [label = 'analyzeEvolution()'] # leaf4 [label = 'analyzeFolder()'] # # # # edge definitions with the node IDs # step1 -> step2 [label='One'] # step1 -> leaf4 [label='Several'] # step2 -> step3 [label='Single'] # step2 -> leaf3 [label='Repeated'] # step3 -> leaf2 [label='Data loaded in \nglobal environment'] # step3 -> leaf1 [label='External \nfile'] # }") ```` ## References