Title: | Clinical Trial Registry History |
Version: | 2.1.11 |
Description: | Retrieves historical versions of clinical trial registry entries from https://ClinicalTrials.gov. Package functionality and implementation for v 1.0.0 is documented in Carlisle (2022) <doi:10.1371/journal.pone.0270909>. |
License: | AGPL (≥ 3) |
URL: | https://github.com/bgcarlisle/cthist |
BugReports: | https://github.com/bgcarlisle/cthist/issues |
Imports: | assertthat, dplyr, httr, jsonlite, lubridate, magrittr, readr, rlang, stringr, tibble, purrr, tidyr |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2024-07-17 15:06:07 UTC; researchfairy |
Author: | Benjamin Gregory Carlisle |
Maintainer: | Benjamin Gregory Carlisle <murph@bgcarlisle.com> |
Repository: | CRAN |
Date/Publication: | 2024-07-17 16:10:02 UTC |
cthist: Clinical Trial Registry History
Description
Retrieves historical versions of clinical trial registry entries from https://ClinicalTrials.gov. Package functionality and implementation for v 1.0.0 is documented in Carlisle (2022) doi:10.1371/journal.pone.0270909.
Details
This package provides five functions for mass-downloading and interpreting historical clinical trial registry entry data from ClinicalTrials.gov.
The functions for downloading clinical trial registry data from DRKS that were provided in versions 1.0.0 to 1.3.0 have been deprecated, because drks.de was rewritten in a way that broke the previous web-scraping implementation.
clinicaltrials_gov_dates() downloads the dates on which clinical trial registry entries were updated from ClinicalTrials.gov.
clinicaltrials_gov_version() downloads a specified historical version of a clinical trial registry entry from ClinicalTrials.gov.
clinicaltrials_gov_download() mass-downloads clinical trial registry entry versions for one or many trials on ClinicalTrials.gov.
extract_publications() interprets a data frame provided by clinicaltrials_gov_download() and provides a new data frame with one row per publication of the type specified indexed by ClinicalTrials.gov per clinical trial registry history version.
overall_status_lengths() interprets a data frame provided by clinicaltrials_gov_download() or clinicaltrials_gov_dates() and provides a new data frame that indicates how many days were spent in each overall status.
Author(s)
Maintainer: Benjamin Gregory Carlisle murph@bgcarlisle.com (ORCID)
References
Carlisle, BG. Analysis of clinical trial registry entry histories using the novel R package cthist. medRxiv, 2022. doi: 10.1101/2022.01.20.22269538
See Also
Useful links:
https://github.com/bgcarlisle/cthist
Report bugs at https://github.com/bgcarlisle/cthist/issues
Download a table of dates on which a ClinicalTrials.gov registry entry was updated
Description
Download a table of dates on which a ClinicalTrials.gov registry entry was updated
Usage
clinicaltrials_gov_dates(nctids, status_change_only = FALSE, quiet = TRUE)
Arguments
nctids |
A character vector of well-formed NCT numbers, e.g. c("NCT00942747", "NCT03281616"). (A capitalized "NCT" followed by eight numerals with no spaces or hyphens.) |
status_change_only |
If TRUE, returns only the dates marked with a Recruitment Status change, default FALSE. |
quiet |
A boolean TRUE or FALSE. If TRUE, no messages will be printed during download. If FALSE, a progress message is printed for every registry entry downloaded. TRUE by default. |
Value
A table with three columns: the version number (starting from 0), the ISO-8601 formatted date on which there were clinical trial history version updates, and the trial's overall status on that date.
Examples
versions <- clinicaltrials_gov_dates("NCT00942747")
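The well-formedness rule given for nctids (a capitalized "NCT" followed by eight numerals, with no spaces or hyphens) can be checked locally before any download is attempted. The following is a minimal sketch in base R; the helper name is_wellformed_nctid() is hypothetical and is not part of cthist.

```r
# Hypothetical helper (not part of cthist): TRUE for strings made of a
# capitalized "NCT" followed by exactly eight digits.
is_wellformed_nctid <- function(x) {
  grepl("^NCT[0-9]{8}$", x)
}

# The first two IDs are the examples used in this manual; the rest are
# deliberately malformed.
candidates <- c("NCT00942747", "NCT03281616", "nct03281616", "NCT-0328161")
valid <- candidates[is_wellformed_nctid(candidates)]
```

Filtering in this way leaves only the IDs that match the documented pattern, so malformed entries can be inspected before being passed on.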
Mass-download registry entry historical versions from ClinicalTrials.gov
Description
This function will download all ClinicalTrials.gov registry records
for the NCT numbers specified. Rather than transcribing NCT numbers
by hand, it is recommended that you conduct a search for trials of
interest using the ClinicalTrials.gov web front-end and download
the result as a comma-separated value (CSV) file. The CSV can be
read into memory as a data frame, and the NCT Number column can
be passed directly to the function as the nctids argument.
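The CSV workflow described above might look like the following sketch. It writes a small stand-in CSV to a temporary file so that it runs offline; the "Study Title" column and its values are invented for illustration, and only the "NCT Number" column name comes from the text above.

```r
# Stand-in for a search result exported from the ClinicalTrials.gov web
# front-end. Only the "NCT Number" header is taken from the documentation.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"NCT Number","Study Title"',
  '"NCT00942747","Hypothetical title A"',
  '"NCT03281616","Hypothetical title B"'
), csv_path)

# check.names = FALSE keeps the header as "NCT Number" rather than
# letting read.csv() rewrite it to "NCT.Number".
search_results <- read.csv(csv_path, check.names = FALSE,
                           stringsAsFactors = FALSE)
nctids <- search_results[["NCT Number"]]

# The vector can then be passed on (requires network access):
# hv <- cthist::clinicaltrials_gov_download(nctids)
```

Reading the column by its exact header avoids transcription errors in the NCT numbers.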
Usage
clinicaltrials_gov_download(
  nctids,
  output_filename = NA,
  quiet = FALSE,
  earliest = FALSE,
  latest = FALSE
)
Arguments
nctids |
A character vector of well-formed NCT numbers, e.g. c("NCT00942747", "NCT03281616"). |
output_filename |
A character string for a filename into which the data frame will be written as a CSV, e.g. "historical_versions.csv". If no output filename is provided, the data frame of downloaded historical versions will be returned by the function as a data frame. |
quiet |
A boolean TRUE or FALSE. If TRUE, no messages will be printed during download. If FALSE, a progress message is printed for every version downloaded. FALSE by default. |
earliest |
A boolean TRUE or FALSE. If TRUE, only the earliest version of the registry entry will be downloaded; if FALSE, all versions will be downloaded. FALSE by default. Can be combined with latest. |
latest |
A boolean TRUE or FALSE. If TRUE, only the latest version of the registry entry will be downloaded; if FALSE, all versions will be downloaded. FALSE by default. Can be combined with earliest. |
Value
If an output filename is specified, on successful completion, this function returns TRUE and otherwise returns FALSE. If an output filename is not specified, on successful completion, this function returns a data frame containing the historical versions of the clinical trial that have been retrieved, and in case of error returns FALSE. After unsuccessful completion with an output filename specified, if the function is called again with the same NCT numbers and output filename, the function will check the output file for errors or incompletely downloaded registry entries, remove them and try to download the historical versions that are still needed, while preserving the ones that have already been downloaded correctly.
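Because a repeated call with the same NCT numbers and output filename resumes an incomplete download, the call can be wrapped in a simple retry loop. The following is a minimal sketch, assuming only the TRUE/FALSE return values described above; retry_until_true() is a hypothetical helper, not part of cthist, and a stub replaces the real download so that the sketch runs offline.

```r
# Hypothetical helper (not part of cthist): calls `attempt` until it
# returns TRUE or `max_tries` attempts have been made.
retry_until_true <- function(attempt, max_tries = 3) {
  for (i in seq_len(max_tries)) {
    if (isTRUE(attempt())) {
      return(TRUE)
    }
  }
  FALSE
}

# Stub standing in for the real download: fails once, then succeeds.
calls <- 0
flaky_download <- function() {
  calls <<- calls + 1
  calls >= 2
}

ok <- retry_until_true(flaky_download)
```

In practice, attempt could be something like function() cthist::clinicaltrials_gov_download(nctids, filename), so that each retry picks up the versions that are still missing from the output file.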
Examples
filename <- tempfile()
clinicaltrials_gov_download(c("NCT00942747", "NCT03281616"), filename)
hv <- clinicaltrials_gov_download("NCT00942747")
Download a registry entry version from ClinicalTrials.gov
Description
Download a registry entry version from ClinicalTrials.gov
Usage
clinicaltrials_gov_version(nctid, versionno = 0)
Arguments
nctid |
A character string including a well-formed ClinicalTrials.gov NCT Number, e.g. "NCT00942747". (A capitalized "NCT" followed by eight numerals with no spaces or hyphens.) |
versionno |
An integer version number, e.g. 3, where 0 is the earliest version of the trial in question, 1 is the version after that, etc. (Please note that this differs from the convention used in cthist v. <= 1.4.2, in which 1 is the earliest version of the trial in question.) If no version number is specified, the earliest version (0) will be downloaded. If -1 (negative one) is specified, the latest version will be downloaded. |
Value
A list containing the overall status, enrolment, start date, start date precision (month or day), primary completion date, primary completion date precision (month or day), primary completion date type, minimum age, maximum age, sex, accepts healthy volunteers, inclusion/exclusion criteria, outcome measures, overall contacts, central contacts, responsible party, lead sponsor, collaborators, locations, reason why the trial stopped (if provided), whether results are posted, references data, organization identifiers and other secondary trial identifiers.
Examples
version <- clinicaltrials_gov_version("NCT00942747", 1)
Takes a data frame of the type provided by clinicaltrials_gov_download() and returns a new data frame containing one row per publication of the publication type specified indexed on ClinicalTrials.gov for every version of the clinical trial record provided.
Description
This function does not connect to ClinicalTrials.gov, and only
interprets data that has already been downloaded by expanding the
nested JSON-encoded data in the references column provided by
clinicaltrials_gov_version().
Usage
extract_publications(df, types = c("RESULT", "BACKGROUND", "DERIVED"))
Arguments
df |
A data frame containing at least the following columns:
|
types |
A character vector of types to be returned, or a single character string if only one type is specified, e.g. "RESULT" or c("RESULT", "BACKGROUND"). Allowed types: "RESULT", "BACKGROUND", "DERIVED". |
Value
A data frame with all the original columns, as well as an
additional three columns: pmid, type and citation. The
new data frame will have one row per publication.
Examples
hv <- clinicaltrials_gov_download("NCT00942747", latest=TRUE)
extract_publications(hv)
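To illustrate what expanding a JSON-encoded references column involves, the following sketch builds a small invented data frame and unnests it with jsonlite. It is not the package's implementation: the helper expand_references(), the version_number column, the shape of the JSON and all values are assumptions made for the example. Only the output column names pmid, type and citation come from the Value section above.

```r
# Invented input: one row per registry version, with references stored
# as a JSON string. The JSON field names are assumptions.
hv_example <- data.frame(
  nctid = c("NCT00000001", "NCT00000001"),
  version_number = c(0L, 1L),
  references = c(
    "[]",
    paste0('[{"pmid":"111","type":"RESULT","citation":"Example A"},',
           '{"pmid":"222","type":"BACKGROUND","citation":"Example B"}]')
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one output row per publication of the requested
# types, keeping all original columns.
expand_references <- function(df,
                              types = c("RESULT", "BACKGROUND", "DERIVED")) {
  rows <- lapply(seq_len(nrow(df)), function(i) {
    refs <- jsonlite::fromJSON(df$references[i])
    # An empty JSON array parses to an empty list, not a data frame.
    if (!is.data.frame(refs) || nrow(refs) == 0) return(NULL)
    refs <- refs[refs$type %in% types,
                 c("pmid", "type", "citation"), drop = FALSE]
    if (nrow(refs) == 0) return(NULL)
    cbind(df[rep(i, nrow(refs)), , drop = FALSE], refs, row.names = NULL)
  })
  do.call(rbind, rows)
}

all_pubs <- expand_references(hv_example)
results_only <- expand_references(hv_example, types = "RESULT")
```

Versions with no indexed publications contribute no rows, which is why the output can have fewer rows than there are versions.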
Interpret downloaded version histories to determine how long in days a trial had any given overall status
Description
This function takes a data frame of the type produced by
clinicaltrials_gov_download() or clinicaltrials_gov_dates() and
interprets it to determine, for each clinical trial registry entry,
how many days were spent in each overall status (e.g. "RECRUITING",
"ACTIVE, NOT RECRUITING", etc.); upper and lower date bounds can
also be applied, to allow for returning only those dates that fall
within a time range of interest.
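The idea of counting days per overall status can be sketched in base R as follows. This illustrates the calculation only and is not the package's implementation; the helper status_days(), the column names version_date and overall_status, and all dates are assumptions made for the example.

```r
# Invented version history for a single trial.
versions_example <- data.frame(
  nctid = "NCT00000001",
  version_date = as.Date(c("2020-01-01", "2020-01-11", "2020-03-01")),
  overall_status = c("RECRUITING", "ACTIVE, NOT RECRUITING", "COMPLETED"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: each status lasts until the date of the next
# version. The final status only accrues days if an end date is given
# to carry it forward to; otherwise its interval is NA and is dropped.
status_days <- function(v, end_date = NA) {
  v <- v[order(v$version_date), ]
  next_date <- c(v$version_date[-1], as.Date(end_date))
  v$days <- as.numeric(next_date - v$version_date)
  aggregate(days ~ overall_status, data = v, FUN = sum)
}

bounded <- status_days(versions_example, end_date = "2020-03-11")
unbounded <- status_days(versions_example)
```

With the end date supplied, the final status is counted up to that date; without it, only statuses that were followed by a later version contribute days.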
Usage
overall_status_lengths(
  historical_versions,
  start_date = NA,
  end_date = NA,
  carry_forward_last_status = TRUE
)
Arguments
historical_versions |
A data frame of the type produced by
|
start_date |
A date or character string in YYYY-MM-DD format specifying a date. If specified, only the length of time that is after the given start date will be counted. |
end_date |
A date or character string in YYYY-MM-DD format specifying a date. If specified, only the length of time that is before the given end date will be counted. |
carry_forward_last_status |
Boolean TRUE or FALSE. |
Value
A data frame with two columns: nctid, which contains all
the distinct NCT numbers from the historical_versions data
frame provided, and days, which contains the number of