For this example we take the case of Viral Sinusitis and
several treatments as events. We set our
minEraDuration = 7, minCombinationDuration = 7
and combinationWindow = 7. We treat multiple events of
Viral Sinusitis as separate cases by setting
concatTargets = FALSE. When set to TRUE it
would append multiple cases, which might be useful for time invariant
target cohorts like chronic conditions.
library(CDMConnector)
library(dplyr)
library(TreatmentPatterns)
cohortSet <- readCohortSet(
  path = system.file(package = "TreatmentPatterns", "exampleCohorts")
)
con <- DBI::dbConnect(
  drv = duckdb::duckdb(),
  dbdir = eunomiaDir()
)
cdm <- cdmFromCon(
  con = con,
  cdmSchema = "main",
  writeSchema = "main"
)
cdm <- generateCohortSet(
  cdm = cdm,
  cohortSet = cohortSet,
  name = "cohort_table",
  overwrite = TRUE
)## ℹ Generating 8 cohorts## ℹ Generating cohort (1/8) - acetaminophen✔ Generating cohort (1/8) - acetaminophen [163ms]
## ℹ Generating cohort (2/8) - amoxicillin✔ Generating cohort (2/8) - amoxicillin [147ms]
## ℹ Generating cohort (3/8) - aspirin✔ Generating cohort (3/8) - aspirin [141ms]
## ℹ Generating cohort (4/8) - clavulanate✔ Generating cohort (4/8) - clavulanate [140ms]
## ℹ Generating cohort (5/8) - death✔ Generating cohort (5/8) - death [93ms]
## ℹ Generating cohort (6/8) - doxylamine✔ Generating cohort (6/8) - doxylamine [146ms]
## ℹ Generating cohort (7/8) - penicillinv✔ Generating cohort (7/8) - penicillinv [137ms]
## ℹ Generating cohort (8/8) - viralsinusitis✔ Generating cohort (8/8) - viralsinusitis [197ms]cohorts <- cohortSet %>%
  # Remove 'cohort' and 'json' columns
  select(-"cohort", -"json", -"cohort_name_snakecase") %>%
  mutate(type = c("event", "event", "event", "event", "exit", "event", "event", "target")) %>%
  rename(
    cohortId = "cohort_definition_id",
    cohortName = "cohort_name",
  )
outputEnv <- computePathways(
  cohorts = cohorts,
  cohortTableName = "cohort_table",
  cdm = cdm,
  minEraDuration = 7,
  combinationWindow = 7,
  minPostCombinationDuration = 7,
  concatTargets = FALSE
)## -- Qualifying records for cohort definitions: 1, 2, 3, 4, 5, 6, 7, 8
##  Records: 14041
##  Subjects: 2693
## -- Removing records < minEraDuration (7)
##  Records: 11347
##  Subjects: 2159
## >> Starting on target: 8 (viralsinusitis)
## -- Removing events outside window (startDate: 0 | endDate: 0)
##  Records: 8327
##  Subjects: 2142
## -- splitEventCohorts
##  Records: 8327
##  Subjects: 2142
## -- Collapsing eras, eraCollapse (30)
##  Records: 8327
##  Subjects: 2142
## -- Iteration 1: minPostCombinationDuration (7), combinatinoWindow (7)
##  Records: 6799
##  Subjects: 2142
## -- Iteration 2: minPostCombinationDuration (7), combinatinoWindow (7)
##  Records: 6663
##  Subjects: 2142
## -- Iteration 3: minPostCombinationDuration (7), combinatinoWindow (7)
##  Records: 6662
##  Subjects: 2142
## -- After Combination
##  Records: 6662
##  Subjects: 2142
## -- filterTreatments (First)
##  Records: 6657
##  Subjects: 2142
## -- Max path length (5)
##  Records: 6653
##  Subjects: 2142
## -- treatment construction done
##  Records: 6653
##  Subjects: 2142results <- export(
  andromeda = outputEnv,
  minCellCount = 1,
  nonePaths = TRUE,
  outputPath = tempdir()
)## Wrote csv-files to: C:\Users\mvankessel\AppData\Local\Temp\RtmpY1v2gGNow that we ran our TreatmentPatterns analysis and have exported our
results, we can evaluate the output. The export() function
in TreatmentPatterns returns an R6 class of
TreatmentPatternsResults. All results are query-able from
this object. Additionally the files are written to the specified
outputPath. If no outputPath is set, only the
result object is returned, and no files are written.
If you would like to save the results to csv-, or zip-file after the fact you can still do this. Or upload it to a database:
# Save to csv-, zip-file
results$saveAsCsv(path = tempdir())## Wrote csv-files to: C:\Users\mvankessel\AppData\Local\Temp\RtmpY1v2gGresults$saveAsZip(path = tempdir(), name = "tp-results.zip")## Wrote zip-file to: C:\Users\mvankessel\AppData\Local\Temp\RtmpY1v2gG# Upload to database
connectionDetails <- DatabaseConnector::createConnectionDetails(
  dbms = "sqlite",
  server = file.path(tempdir(), "db.sqlite")
)
results$uploadResultsToDb(
  connectionDetails = connectionDetails,
  schema = "main",
  prefix = "tp_",
  overwrite = TRUE,
  purgeSiteDataBeforeUploading = FALSE
)## 
## Attaching package: 'DatabaseConnector'## The following objects are masked from 'package:CDMConnector':
## 
##     dbms, insertTable## Connecting using SQLite driver
## Uploading file: attrition.csv to table: attrition## - Preparing to upload rows 1 through 12## Inserting data took 0.0251 secs
## Uploading file: counts_age.csv to table: counts_age## - Preparing to upload rows 1 through 63## Inserting data took 0.0372 secs
## Uploading file: counts_sex.csv to table: counts_sex## - Preparing to upload rows 1 through 2## Inserting data took 0.016 secs
## Uploading file: counts_year.csv to table: counts_year## - Preparing to upload rows 1 through 52## Inserting data took 0.019 secs
## Uploading file: metadata.csv to table: metadata## - Preparing to upload rows 1 through 1## Inserting data took 0.00762 secs
## Uploading file: summary_event_duration.csv to table: summary_event_duration## - Preparing to upload rows 1 through 88## Inserting data took 0.0171 secs
## Uploading file: treatment_pathways.csv to table: treatment_pathways## - Preparing to upload rows 1 through 372## Inserting data took 0.0242 secs
## Uploading file: cdm_source_info.csv to table: cdm_source_info## - Preparing to upload rows 1 through 1## Inserting data took 0.0171 secs
## Uploading file: analyses.csv to table: analyses## - Preparing to upload rows 1 through 1## Warning: Column 'description' is of type 'logical', but this is not supported
## by many DBMSs. Converting to numeric (1 = TRUE, 0 = FALSE)## Inserting data took 0.0154 secs
## Uploading file: arguments.csv to table: arguments## - Preparing to upload rows 1 through 1## Inserting data took 0.0157 secs## Uploading data took 5.63 secsThe treatmentPathways file contains all the pathways found, with a frequency, pairwise stratified by age group, sex and index year.
head(results$treatment_pathways)We can see the pathways contain the treatment names we provided in
our event cohorts. Besides that we also see the paths are annoted with a
+ or -. The + indicates two
treatments are a combination therapy,
i.e. amoxicillin+clavulanate is a combination of
amoxicillin and clavulanate. The -
indicates a switch between treatments,
i.e. acetaminophen-penicillinv is a switch from
acetaminophen to penicillin v. Note that these
combinations and switches can occur in the same pathway,
i.e. acetaminophen-amoxicillin+clavulanate. The first
treatment is acetaminophen that switches to a
combination of amoxicillin and clavulanate.
The countsAge, countsSex, and countsYear contain counts per age, sex, and index year.
head(results$counts_age)head(results$counts_sex)head(results$counts_year)The summaryEventDuration contains summary statistics from different
events, across all found “lines”. A “line” is equal to the level in the
Sunburst or Sankey diagrams. The summary statistics allow for plotting
of boxplots with the plotEventDuration() function.
results$plotEventDuration()Not that besides our events there are two extra rows: mono-event, and combination-event. These are both types of events on average.
We see that most events last between 0 and 100 days. We can see that for combination-events and amoxicillin+clavulanate there is a tendency for events to last longer than that. amoxicillin+clavulanate most likely skews the duration in the combination-events group.
We can alter the x-axis to get a clearer view of the durations of the events:
results$plotEventDuration() +
  ggplot2::xlim(0, 100)
Now we can more clearly investigate particular treatments. We can see
that penicilin v tends to last quite short across all treatment
lines, while aspirin and acetaminophen seem to skew to
a longer duration.
Additionally we can also set a minCellCount for the
individual events.
results$plotEventDuration(minCellCount = 10) +
  ggplot2::xlim(0, 100)The metadata file is a file that contains information about the circumstances the analysis was performed in, and information about R, and the CDM.
results$metadataFrom the filtered treatmentPathways file we are able to create a sunburst plot.
The inner most layer is the first event that occurs, going outwards. This aligns with the event duration plot we looked at earlier.
results$plotSunburst()We can also create a Sankey Diagram, which in theory displays the same data. Additionally you see the Stopped node in the Sankey diagram. This indicates the end of the pathway. It is mostly a practical addition so that single layer Sankey diagrams can still be plotted.
results$plotSankey()