Package: wordpiece
Type: Package
Title: R Implementation of Wordpiece Tokenization
Version: 2.1.3
Authors@R: c(
    person(given = "Jonathan",
           family = "Bratt",
           role = c("aut", "cre"),
           email = "jonathan.bratt@macmillan.com",
           comment = c(ORCID = "0000-0003-2859-0076")),
    person(given = "Jon",
           family = "Harmon",
           role = c("aut"),
           email = "jonthegeek@gmail.com",
           comment = c(ORCID = "0000-0003-4781-4346")),
    person(given = "Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning",
           role = c("cph"))
    )
Description: Apply 'Wordpiece' (<arXiv:1609.08144>) tokenization to input text,
 given an appropriate vocabulary. The 'BERT' (<arXiv:1810.04805>) tokenization
 conventions are used by default.
Encoding: UTF-8
URL: https://github.com/macmillancontentscience/wordpiece
BugReports: https://github.com/macmillancontentscience/wordpiece/issues
Depends: R (>= 3.3.0)
License: Apache License (>= 2)
RoxygenNote: 7.1.2
Imports: dlr (>= 1.0.0), fastmatch (>= 1.1), memoise (>= 2.0.0),
        piecemaker (>= 1.0.0), rlang, stringi (>= 1.0), wordpiece.data
        (>= 1.0.2)
Suggests: covr, knitr, rmarkdown, testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2022-03-03 14:19:39 UTC; jonathan.bratt
Author: Jonathan Bratt [aut, cre] (<https://orcid.org/0000-0003-2859-0076>),
  Jon Harmon [aut] (<https://orcid.org/0000-0003-4781-4346>),
  Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jonathan Bratt <jonathan.bratt@macmillan.com>
Repository: CRAN
Date/Publication: 2022-03-03 15:10:02 UTC
Built: R 4.0.5; ; 2022-04-21 05:36:16 UTC; windows
