BiocCheck SummaryBiocCheck encapsulates Bioconductor package
guidelines and best practices, analyzing packages and reporting three
categories of issues:
BiocCheck will continue past an
ERROR, thus it is possible to have more than one, but it
will exit with an error code if run from the OS command line.)BiocCheckBiocCheck is meant to run within R on a directory
containing an R package, or a source tarball (.tar.gz
file):
BiocCheck takes options which can be seen with
?BioCheck.
Note that the --new-package option is turned on in the
Single Package Builder (SPB) during the new package submission
process.
BiocCheck be runBiocCheck should always be run after
R CMD check.
Note that BiocCheck is not a replacement for
R CMD check; it is complementary. It should be run after
R CMD check completes successfully.
BiocCheck can also be run via GitHub Actions, a
continuous integration system on GitHub. This service allows automatic
testing of R packages in a controlled build environment.
See the biocthis package for more details.
BiocCheckBiocCheck should be installed as follows:
BiocCheck outputActual BiocCheck output is shown below in
bold.
To skip the installation step, set the --install flag to
NULL or FALSE. BiocCheck loads
the installed package to run some of its checks. When this flag is set
to NULL or FALSE, it will search for the
package in the first path in the local library (i.e.,
.libPaths()[1]).
Similar to R CMD check, the --install flag
can optionally include a check:<file> key-value pair
indicating the file where the installation log can be found (primarily
for compiled packages) and it will be copied into the
<pkg>.BiocCheck output folder.
Note that libloc can be included along with the
--install flag to specify a custom library location for
package installation. By default, when not set, libloc
points to the lib folder in the package installation
directory.
Checking for deprecated package usage…
Can be disabled with --no-check-deprecated.
At present, this looks to see whether your package has a dependency
on the multicore package (ERROR).
Our recommendation is to use BiocParallel. Note that ‘fork’ clusters do not provide any gain from parallelizing code on Windows. Socket clusters work on all operating systems.
Also checks Deprecated Packages currently specified in
release and devel versions of Bioconductor (ERROR).
Checking for remote package usage…
Can be disabled with --no-check-remotes
Bioconductor only allows dependencies that are hosted on CRAN or
Bioconductor. The use of Remotes: in the DESCRIPTION to
specify a unique remote location is not allowed.
Checking for ‘LazyData: true’ usage…
For packages that include data, we recommend not including
LazyData: TRUE. This rarely proves to be a good thing. In
our experience it only slows down the loading of packages with large
data (NOTE).
Can be disabled with --no-check-version-num and
--no-check-R-ver.
Checking version number…
Version: field
in your DESCRIPTION file. If it doesn’t, it usually means
you did not build the tarball with R CMD build.
(ERROR)99 ‘y’ version in the x.y.z versioning
scheme (ERROR). Package versions starting with a non-zero
value will get flagged with a warning. Typical new package submissions
start with a zero ‘x’ version (e.g., 0.99.*;
WARNING). This is only done if the
--new-package option is supplied. An ‘x’ nonzero will only
be accepted if the package was pre-released or published under such a
case.ERROR).Depends: field of your
DESCRIPTION file, BiocCheck checks to make
sure that the R version specified matches the version currently used in
Bioconductor. This helps to prevent mixing of Bioconductor
release and devel versions (esp. when R versions differ) which is a
frequent source of confusion and errors (NOTE).For more information on package versions, see the Version Numbering HOWTO.
Can be disabled with --no-check-pkg-size and
--no-check-file-size.
Checking package size Checks that the package size meets Bioconductor requirements. The current package size limit is 10 MB for Software packages. Experiment Data and Annotation packages are excluded from this check. This check is only run if checking a source tarball. (ERROR)
Checking individual file sizes The current size
limit for all individual files is 5 MB. Checks inspect both package-wide
files and data files found in the data,
inst/extdata, and data-raw folders.
(WARNING)
It may be necessary to remove large files from your Git history; see Remove Large Data Files and Clean Git Tree
These can be disabled with the --no-check-bioc-views
option, which might be useful when checking non-Bioconductor
packages (since biocViews is a concept unique to
Bioconductor).
Checking biocViews…
Can be disabled with --no-check-bioc-views
biocViews field is present in the DESCRIPTION file
(ERROR).ERROR).WARNING).WARNING).recommendBiocViews() function from biocViews
to automatically suggest some biocViews for your package.More information about biocViews is available in the Using biocViews HOWTO.
The Bioconductor Build System (BBS) is our nightly build system and it has certain requirements. Packages which don’t meet these requirements can be silently skipped by BBS, so it’s important to make sure that every package meets the requirements.
Can be disabled with --no-check-bbs
Checking build system compatibility…
Checking for blank lines in DESCRIPTION… Checks
to make sure there are no blank lines in the DESCRIPTION file
(ERROR).
Checking if DESCRIPTION is well formatted…
Checks if the DESCRIPTION can be parsed with read.dcf
(ERROR)
Checking Description: field length… Checks that the Description field in the DESCRIPTION file has a minimum
WARNING if less than 50)WARNING if less than 20)NOTE if less than 3)Checking for whitespace in DESCRIPTION field
names… Checks to make sure there is no whitespace in
DESCRIPTION file field names (ERROR).
Checking that Package field matches dir/tarball
name… Checks to make sure that Package field of
DESCRIPTION file matches directory or tarball name
(ERROR).
Checking for Version field… Checks to make sure
a Version field is present in the DESCRIPTION file
(ERROR).
Checking for valid maintainer… Checks to make
sure the DESCRIPTION file has a valid Authors@R field which
resolves to a valid Maintainer (ERROR).
A valid Authors@R field consists of:
person.cre (creator) role.family or
given name defined.NOTE if not.Suggests that the maintainer provide an ORCID iD in the Authors@R
field as an argument in the person function, e.g.,
comment = c(ORCID = ...) (NOTE).
License: in the DESCRIPTION
file does not restrict use, e.g., to academic-use only
(ERROR). Licenses are compared to R’s internal database
provided at $R_HOME/share/licenses/license.db and read with
read.dcf. Licenses not listed in the database or with
spelling deviations e.g., GPL-3.0 vs GPL-3 are
flagged with a NOTE. A NOTE is also generated
if the license is not a valid SPDX license identifier (with the
exception of those already in the database file) or if the license
cannot be verified in the database. A NOTE is also
generated if the License: field is malformed, or the
database cannot be located. We also recommend developers to browse to
the choosealicense to
find a suitable license for their package as well as the SPDX License List website.== in the DESCRIPTION file
(ERROR).DESCRIPTION file to see whether
recommended fields i.e., ‘URL’, ‘Date’ and ‘BugReports’ are populated
(NOTE). Date field is checked for the format
YYYY-MM-DD.Depends: and Imports: fields;
if none (WARNING).Authors@R field has a person or organization with the
fnd (funder) role (NOTE).Can be disabled with --no-check-namespace.
exportPattern("^[[:alpha:]]+") or similar
(ERROR)..Rbuildignore file
(ERROR).<package_name>.BiocCheck folder
byproduct when running BiocCheck(".") locally does not get
included in the package directory (ERROR).R CMD build; therefore, inst/doc folder is not
needed (ERROR).Can be disabled with --no-check-vignettes.
Checking vignette directory…
vignettes directory exists
(ERROR).vignettes directory only contains
vignette sources (.Rmd, .qmd, .Rnw, .Rrst, .Rhtml,
.Rtex) (ERROR)..Rnw vignettes, if any found, suggest
RMarkdown (.Rmd) vignettes instead
(WARNING).qmd vignettes, if any found, a
‘SystemRequirements’ field must be present in the
DESCRIPTION file (WARNING). The field should
list quarto as a system requirement.ERROR).WARNING)ERROR)WARNING)eval=FALSE chunks is more
than 50% of the total (WARNING).eval=FALSE. The majority of vignette code is expected to be
evaluated (WARNING)BiocInstaller code
(WARNING)sessionInfo() or
session_info() for reproducibility
(NOTE).ERROR). Function calls in the
vignette that match install.* will produce a warning
(WARNING).ERROR).NOTE).Checking whether vignette is built with ‘R CMD build’…
Only run when --build-output-file is specified.
Analyzes the output of R CMD build to see if vignettes
are built. It simply looks for a line that starts:
* creating vignettes ...
If this line is not present, it means R has not detected
that a vignette needs to be built (ERROR).
If you have vignette sources yet still get this message, there could be several causes:
VignetteBuilder line in the
DESCRIPTION file.VignetteEngine line in the vignette
source.See knitr’s package vignette page,
or the Non-Sweave
vignettes section of “Writing R Extensions” for more
information.
Can be disabled with --no-check-library-calls and
--no-check-install-self.
NOTE) Check
for use of functions that install or update packages. This list
currently includes the use of install,
install.packages, install_packages
update.packages or biocLite.ERROR) It is not necessary to call
library() or require() on your own package
within code in the R directory or in man page examples. In these
contexts, your package is already loaded.Can be disabled with --no-check-coding-practices.
Checking coding practices…
Checks to see whether certain programming practices are found in the R directory.
We recommend that vapply() be used instead of
sapply(). Problems arise when the X argument
to sapply() has length 0; the return type is then a
list() rather than a vector or array.
(NOTE)
We recommend that seq_len() or
seq_along() be used instead of 1:.... This is
because the case 1:0 creates the sequence
c(1, 0) which may be an unexpected or unwanted result
(NOTE).
Single colon typos are checked for when a user inputs
‘package:function’ instead of using double colons (‘::’) to import a
function (ERROR).
Users should not download data from external hosting platforms.
This means avoiding references to major platforms such as GitHub,
GitLab, and BitBucket. For the same reason we do not import GitHub
packages, external data can be unstable and not well maintained.
Maintainers should re-use data already available in Bioconductor or
contribute an ExperimentHub, AnnotationHub or
similar package (ERROR).
A package should not download files at the time of loading or
attaching i.e., using library. Using
download.file and download should be avoided
and when found, an ERROR will be emitted.
paste and paste0 function calls within
signaling functions such as message, warning,
and stop are redundant and should be avoided
(NOTE). paste calls with the
collapse argument are ignored.
When notifying users, message should be used. When
cat and print are used, users will get a note
saying that these should only be used in show methods for classes
(NOTE).
message, warn*, and error
keywords should not be included in signal condition functions:
message, warning, and stop. This
is redundant and should be avoided (NOTE).
It is favorable to use the assignment arrow (‘<-’) over the
equals assignment (‘=’) for clarity in the code and legibility. Any use
of the = will be flagged with a NOTE.
New submissions should not use any .Deprecated,
.Defunct, lifeCycle,
deprecate_warn, or deprecate_stop function
calls (WARNING). Existing packages should evolve these
functions after a Bioconductor release according to the package
guidelines.
Checking for T… Checking for F…
It is bad practice to use T and F for
TRUE and FALSE. This is because T
and F are ordinary variables whose value can be altered,
leading to unexpected results, whereas the value of TRUE
and FALSE cannot be changed
(WARNING).
Avoid class membership checks with class() /
is() and == / !=. Developers
should use is(x, 'class') for S4 classes.
(WARNING)
Use system2() over system(). ‘system2’
is a more portable and flexible interface than
‘system’.(NOTE)
Use of set.seed() in R code. The
set.seed should not be set in R functions directly. The
user should always have the option for the set.seed and know when it is
being invoked. (WARNING)
Checking parsed R code in R directory, examples, vignettes…
BiocCheck parses the code in your package’s R directory,
and in evaluated man page and vignette examples to look for various
symbols, which result in issues of varying severity.
BiocCheck checks for direct slot access (via @
or slot()) to S4 objects in vignette and example code. This
code should always use accessors to interact with S4
classes. Since you may be using S4 classes (which don’t provide
accessors) from another package, the severity is only NOTE.
But if the S4 object is defined in your package, it’s
mandatory to write accessors for it and to use them
(instead of direct slot access) in all vignette and example code
(NOTE).browser()
causes the command-line R debugger to be invoked, and should not be used
in production code (though it’s OK to wrap such calls in a conditional
that evaluates to TRUE if some debugging option is set)
(WARNING).<<- is bad practice. It can over-write user-defined
symbols, and introduces non-linear paths of evaluation that are
difficult to debug (NOTE).Sys.setenv
function (ERROR).suppressWarnings and
suppressMessages is problematic as it usually indicates a
larger underlying issue with the fragility of the package codebase
(NOTE).Can be disabled with --no-check-function-len.
Checking function lengths…
BiocCheck prints an informative message about the length
(in lines) of your five longest functions (this includes functions in
your R directory and in evaluated man page and vignette examples).
If there are functions longer than 50 lines, BiocCheck
outputs (NOTE). You may want to consider breaking up long
functions into smaller ones. This is a basic refactoring technique that
results in code that’s easier to read, debug, test, reuse, and
maintain.
Can be disabled with --no-check-man-doc.
Checking man page documentation…
It can be handy to generate man page skeletons with
prompt() and/or RStudio. These skeletons contain comments
that look like this:
%% ~~ A concise (1-5 lines) description of the dataset. ~~
BiocCheck asks you to remove such comments
(NOTE).
Every man page must have a non-empty \value section
(WARNING). Rd pages without
\usage sections or documenting data sets are
excluded.
man page examples examples
Checking exported objects have runnable examples…
BiocCheck looks at all man pages which document exported
objects and lists the ones that don’t contain runnable examples (either
because there is no examples section or because its
examples are tagged with dontrun or donttest).
Runnable examples are a key part of literate programming and help ensure
that your code does what you say it does.
ERROR).BiocCheck lists the
missing ones and asks you to add runnable examples to them
(NOTE).dontrun or donttest.
Use of these functions is not recommended and shoud be justified
(NOTE). If exception is made the recommended usage is to
use donttest over dontrun (NOTE) as donttest requires valid
R code.Can be disabled with --no-check-news.
Checking package NEWS…
BiocCheck looks to see if there is a valid NEWS file
either in the ‘inst’ directory or in the top-level directory of your
package, and checks whether it is properly formatted
(NOTE).
The location and format of the NEWS file must be consistent with
?news. Meaning the file can be one of the following four
options:
inst/NEWS.Rd./NEWS.md./NEWSinst/NEWSNEWS files are a good way to keep users up-to-date on changes to your
package. Excerpts from properly formatted NEWS files will be included in
Bioconductor release announcements to tell users what has
changed in your package in the last release. In order for this to
happen, your NEWS file must be formatted in a specific way; you may want
to consider using an inst/NEWS.Rd file instead as the
format is more well-defined. Malformatted NEWS file outputs
WARNING.
More information on NEWS files is available in the help topic
?news.
Can be disabled with --no-check-unit-tests.
Checking unit tests…
We strongly recommend unit tests, though we do not at present require them. For more on what unit tests are, why they are helpful, and how to implement them, read our Unit Testing HOWTO.
At present we just check to see whether unit tests are present, and
if not, urge you to add them (NOTE).
Checking skip_on_bioc() in tests…
Can be disabled with --no-check-skip-bioc-tests.
Finds flag for skipping tests in the bioconductor environment
(NOTE)
Can be disabled with --no-check-formatting.
Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source…
There is no 100% correct way to format code. These checks adhere to
the Bioconductor
Style Guide (NOTE).
We think it’s important to avoid very long lines in code. Note that some text editors do not wrap text automatically, requiring horizontal scrolling in order to read it. Also note that R syntax is very flexible and whitespace can be inserted almost anywhere in an expression, making it easy to break up long lines.
These checks are run against not just R code, but the DESCRIPTION and NAMESPACE files as well as man pages and vignette source files. All of these files allow long lines to be broken up.
The output of this check includes the first few lines of offending
code for many of the various checks. To see the full output, users are
encouraged to run the check locally and browse to the
BiocCheck output folder that was created with the
<PackageName>.BiocCheck naming convention. The full
output log file is named 00BiocCheck.log.
There are several helpful packages that can be used for formatting of
R code to particular coding standards such as formatR and styler as well as
the “Reformat code” button in RStudio
Desktop. Each solution has its advantages, though styler works with
roxygen2 examples and is actively maintained. You can
re-format your code using styler as shown
below:
## Install styler if necessary
if (!requireNamespace("styler", quietly = TRUE)) {
install.packages("styler")
}
## Automatically re-format the R code in your package
styler::style_pkg(transformers = styler::tidyverse_style(indent_by = 4))If you are working with RStudio
Desktop use also the “Reformat code” button which will help you
break long lines of code. Alternatively, use formatR, though
beware that it can break valid R code involving both types of quotation
marks (" and ') and does not support
re-formatting roxygen2 examples. In general, it is best to
version control your code before applying any automatic re-formatting
solutions and implement unit tests to verify that your code runs as
intended after you re-format your code.
Checking if package already exists in CRAN… This
can be disabled with the --no-check-CRAN option. A package
with the same name (case differences are ignored) cannot exist on CRAN.
Packages submitted to Bioconductor must be removed from CRAN before the
next Bioconductor release (WARNING).
Checking if new package already exists in
Bioconductor… Only run if the
--new-package flag is turned on. A package with the same
name (case differences are ignored) cannot exist in
Bioconductor (ERROR).
Checking for bioc-devel mailing list subscription…
This only applies if BiocCheck is run on the
Bioconductor build machines, because this step requires special
authorization. This can be disabled with the
--no-check-bioc-help option.
Checking for support site registration…
Check that the package maintainer is register at our support site using the same
email address that is in the Maintainer field of their
package DESCRIPTION file (ERROR). This can be
disabled with the --no-check-bioc-help option.
The main place people ask questions about Bioconductor packages is the support site. Please register and then include your package name in the list of watched tags. When a question is asked and tagged with your package name, you’ll get an email.
Package name is in support site watched tags is now a requirement.
BiocCheckGitCloneBiocCheckGitClone provides a few additional
Bioconductor package checks that can only should be run on a
open source directory (raw Git clone) NOT a tarball. Reporting similarly
in three categories as discussed above:
ERROR.
WARNING.
NOTE.
Developers should check packages with BiocCheck() within
an R session while BiocCheckGitClone() is meant to run on
the source package folder i.e., the cloned git repository.
BiocCheckGitCloneBiocCheckGitClone is meant to run within R on a
directory containing an R package:
BiocCheckGitClonePlease see previous Installing BiocCheck section.
BiocCheckGitClone outputActual BiocCheckGitClone output is shown below in
bold.
Checking valid files
There are a number of files that should not be Git tracked. This
check notifies if any of these files are present
(ERROR)
The current list of files checked are given by this internal constant:
## [1] ".renviron" ".rprofile" ".rproj" ".rproj.user"
## [5] ".rhistory" ".rapp.history" ".o" ".sl"
## [9] ".so" ".dylib" ".a" ".dll"
## [13] ".def" ".ds_store" "unsrturl.bst" ".log"
## [17] ".aux" ".backups" ".cproject" ".directory"
## [21] ".dropbox" ".exrc" ".gdb.history" ".gitattributes"
## [25] ".gitmodules" ".hgtags" ".project" ".seed"
## [29] ".settings" ".tm_properties" ".rdata"
These files may be included in your personal directories but should
be added to a .gitignore file so they are not Git
tracked.
Checking DESCRIPTION
Default R CMD build behavior will format the DESCRIPTION file; After this occurs, it is hard to determine certain aspects of the original DESCRIPTION file. An example would be how the Authors and Maintainers are specified. The DESCRIPTION file is therefore checked in its raw original form.
Checking if DESCRIPTION is well formatted The
DESCRIPTION file must be properly formatted and able to be read in with
read.dcf() in order to function properly on the
Bioconductor build machines. This check attempts to
read.dcf("DESCRIPTION") and throws an ERROR if
mal-formatted. (ERROR)
Checking for valid maintainer While in the past
using the Author and Maintainer fields were acceptable, R has moved
towards using the Authors@R standard for listing package
contributors. This checks that Authors@R is utilized and that there are no instances
of Author or Maintainer in the DESCRIPTION (ERROR)
Checking that CITATION file is correctly formatted
BiocCheck tries to read the provided
CITATION file (i.e. not the one automatically generated by
each package) with readCitationFile() - this is expected to
be in the INST folder (NOTE).
readCitationFile() needs to work properly without the
package being installed. Most common causes of failure occur when trying
to use helper functions like packageVersion() or
packageDate() instead of using meta$Version or
meta$Date. See R
documentation for more information.
Here is an example of a formatted CITATION file. See the
GenomicRanges package CITATION file for
details.
## Warning in file(con, "r"): file("") only supports open = "w+" and open = "w+b":
## using the former
## <0-length citation>
CITATION files are expected to contain a
doi input within the bibentry() function call.
When a doi input is not present, a WARNING is
emitted as most modern publications should have an assigned DOI.
Note that citEntry() should be updated to
bibentry() as seen with
R CMD check --as-cran.
Bioconductor packages are not required to have a
CITATION file but it is useful both for users and for
tracking Bioconductor project-wide metrics. Maintainers should update
the CITATION file once a preprint or publication is
released. Packages that do not have a CITATION file are
flagged with a NOTE.
BiocCheckWe make an effort to reduce package reviewer burden and to increase
the quality of Bioconductor submissions via automated checks; therefore,
BiocCheck is a continually evolving package. Contributions
are certainly most welcome. Consider opening a pull request on GitHub
with unit tests and updates to both the NEWS file and
vignette. Thank you for being part of the community!
## R version 4.5.2 (2025-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] BiocCheck_1.47.18 BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] rappdirs_0.3.4 sass_0.4.10 generics_0.1.4
## [4] bitops_1.0-9 RSQLite_2.4.5 digest_0.6.39
## [7] magrittr_2.0.4 evaluate_1.0.5 fastmap_1.2.0
## [10] blob_1.3.0 jsonlite_2.0.0 graph_1.89.1
## [13] DBI_1.2.3 BiocManager_1.30.27 stringdist_0.9.17
## [16] XML_3.99-0.20 codetools_0.2-20 httr2_1.2.2
## [19] jquerylib_0.1.4 cli_3.6.5 rlang_1.1.7
## [22] dbplyr_2.5.1 Biobase_2.71.0 bit64_4.6.0-1
## [25] cachem_1.1.0 yaml_2.3.12 otel_0.2.0
## [28] BiocBaseUtils_1.13.0 parallel_4.5.2 tools_4.5.2
## [31] memoise_2.0.1 dplyr_1.1.4 filelock_1.0.3
## [34] BiocGenerics_0.57.0 curl_7.0.0 RUnit_0.4.33.1
## [37] buildtools_1.0.0 vctrs_0.7.1 R6_2.6.1
## [40] stats4_4.5.2 biocViews_1.79.0 BiocFileCache_3.1.0
## [43] lifecycle_1.0.5 RBGL_1.87.0 bit_4.6.0
## [46] pkgconfig_2.0.3 bslib_0.10.0 pillar_1.11.1
## [49] glue_1.8.0 xfun_0.56 tibble_3.3.1
## [52] tidyselect_1.2.1 sys_3.4.3 knitr_1.51
## [55] htmltools_0.5.9 rmarkdown_2.30 maketools_1.3.2
## [58] compiler_4.5.2 RCurl_1.98-1.17
BiocCheck SummaryBiocCheckBiocCheck
be runBiocCheckBiocCheck output
BiocCheckGitCloneBiocCheckGitCloneBiocCheckGitCloneBiocCheckGitClone output
BiocCheck