%
%% Copyright 2026, Joppe W. Bos and Kevin S. McCurley
%%
%% This work may be distributed and/or modified under the
%% conditions of the LaTeX Project Public License, either version 1.3c
%% of this license or (at your option) any later version.
%% The latest version of this license is in
%% https://www.latex-project.org/lppl.txt
%%
%% This work has the LPPL maintenance status `maintained'.
%%
%% The Current Maintainer of this work is Kevin S. McCurley,
%%
%%
%% This work consists of the files metacapture.sty, metacapture-doc.tex,
%% metacapture-doc.bib, metacapture-doc.pdf, and metacapture-sample.tex.
\DocumentMetadata{lang=en,debug={xmp-export}}
\def\pkgversion{0.9.1}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% It should work with many choices of documentclass.
\documentclass{article}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{tcolorbox}
\newcommand{\BibTeX}{{\rmfamily B\kern-.05em%
    \textsc{i\kern-.025em b}\kern-.08em%
    T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
\usepackage{verbatim}
\newcommand{\pkgname}{\texttt{metacapture}}
\usepackage{fancyvrb}
\usepackage{tabularx}
\usepackage{xurl}
\newcommand{\iacrcc}{\texttt{iacrcc}}
\newcommand{\cmd}[2][]{%
  \def\FirstArg{#1}%
  \ifx\FirstArg\empty%
    \texttt{\textbackslash{}#2}%
  \else%
    \texttt{\textbackslash{}#2\{#1\}}%
  \fi
}
\newcommand{\pkg}[1]{\texttt{#1}}
\makeatletter
\@ifclassloaded{iacrj}{}{\bibliographystyle{plainurl}
  \usepackage[cityrequired]{metacapture}
}
\makeatother
\title[plaintext={The metacapture LaTeX package},
       running={The \pkgname\ \LaTeX\ package v\pkgversion},
      ]{The \pkgname\ \textrm{\LaTeX} package v\pkgversion\footnote{Footnotes on titles do not use \cmd{thanks}}}
\subtitle[plaintext={Structured metadata from authors}]{Structured metadata from authors}
\addauthor[orcid = {0000-0003-1010-8157},
           inst = {1},
           onclick = {https://www.joppebos.com},
           email = {joppe.bos@nxp.com},
           surname = {Bos}
          ]{Joppe W.
Bos}
\addauthor[orcid = {0000-0001-7890-5430},
           inst = {2},
           footnote={Authors are allowed to have footnotes on their name.},
           email = {latex@digicrime.com},
           % onclick = {https://swcp.com/\%7Emccurley/index.html\#humor},
           surname = {McCurley},
          ]{Kevin S. McCurley}
\addaffiliation[ror = {031v4g827},
                street = {Interleuvenlaan 80},
                city = {Leuven},
                postcode = {3001},
                country = {Belgium},
                countrycode = {BE}
               ]{NXP Semiconductors}
\addaffiliation[country={United States},
                % countrycode={US}, we omit this.
                department={Department of Redundancy Department},
                state={California},
                city={San Jose}]{Unaffiliated}
\addfunding[country={United States}]{IACR}
\addkeywords[Metadata, publishing, LaTeX]{Metadata, publishing, \LaTeX}
\license{CC-BY-4.0}
%\license{CC0-1.0}
\makeatletter
\@ifclassloaded{iacrj}{\genericfootnote{We can add generic footnotes with \texttt{iacrj.cls}.}}{}
\makeatother
\usepackage{todonotes}
\usepackage{framed}
\newcommand{\todok}[1]{\todo[inline,color=green!20]{K: #1}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% end of preamble %%%%%%%%%%%%%%%%%%%%
%\RequirePackage[mathlines]{lineno}
%\def\linenumberfont{\normalfont\tiny\sffamily\color{gray}}
%\AtBeginDocument{\linenumbers}
%%%%%%%%%%%%
%\addbibresource{metacapture-doc.bib}
\begin{document}
\hypersetup{colorlinks=true}
\maketitle
\renewenvironment{abstract}{\begin{quote}\noindent\textbf{\textsf{Abstract.}}}{\end{quote}}
\begin{abstract}
This document describes the \pkgname\ \LaTeX\ package that can be used to capture metadata during the compilation of a \LaTeX\ document. This package is intended for use by document class designers as part of a journal publishing workflow. Authors provide their title, author information, affiliation, license, etc in macros that are used to produce the final document as well as a machine-parseable external text file. This external text file can then be used in a publishing workflow to provide HTML pages, JATS, and registration with indexing agencies like crossref.
This package comes with several implementations of a default \cmd{maketitle} command that can be invoked from the package at load time, or the document class designer can design their own using documented internal variables of the package.
\end{abstract}
\begin{textabstract}
This document describes the metacapture LaTeX package that can be used to capture metadata during the compilation of a LaTeX document. This package is intended for use by document class designers as part of a publishing workflow. Authors provide their title, author information, affiliation, license, etc in macros that are used to produce the final document as well as a machine-parseable external text file. This external text file can then be used in a publishing workflow to provide HTML pages, JATS, and registration with indexing agencies like crossref. This package also provides several implementations of a default maketitle command that can be invoked from the package at load time, or the document class designer can design their own using documented internal variables of the package.
\end{textabstract}
%% \ExplSyntaxOn
%% HERE: \g_metac_displayemails_tl
%% \ExplSyntaxOff

\section{Motivation}
\TeX\ was originally focused on typesetting and the appearance of the output on paper. With the later invention of \LaTeX, Lamport advised authors that
\begin{quote}As you are writing your document, you should be concerned with its logical structure, not its visual appearance. The \LaTeX\ approach to typesetting can therefore be characterized as \emph{logical design}.~\cite[\S 1.4]{latex}
\end{quote}
Users were encouraged to use high-level macros like \cmd{section}, and to leave decisions such as how much space to put before or after a section to the style in use. This separation between structure and appearance is an example of a more general concept from computer science known as ``separation of concerns''.
The goal of the \pkgname\ package is to extend this concept to metadata about a publication. Authors provide their metadata (e.g., title, subtitle, keywords, license, author names, emails, ORCID, funding, and affiliations) without any styling, and the display of this is left completely to the document class. We should mention that we insist on metadata consisting of only text elements, and not \LaTeX\ macros. We allow some simple things like accents \verb+\"u+ to slip through because they are easily converted to just text in a post-processing step. We also allow mathematics inside titles and abstracts. One of the purposes of the \pkgname\ package is to check that authors comply with the restriction to avoid user-defined macros in their metadata. Authors are still able to use macros and stylized text in their titles, subtitles, and abstracts, but we require authors to also supply ``plain text'' versions of all metadata.

\subsection{Previous approaches}
The \cmd{author} macro represents a fundamental limitation of the original \LaTeX\ \pkg{article} class, because authors are asked to include formatting as part of their author list using newline codes and macros such as \cmd{and} and \cmd{thanks}. If you peek under the covers, the default implementation of the \cmd{and} macro used to separate authors in the \cmd{author} macro is given in \texttt{latex.ltx} as
\begin{Verbatim}[samepage=true]
\DeclareRobustCommand\and{%   % \begin{tabular}
  \end{tabular}%
  \hskip 1em \@plus.17fil%
  \begin{tabular}[t]{c}}%   % \end{tabular}
\end{Verbatim}
The \cmd{and} macro therefore ends up serving two purposes, namely as a delimiter between author markup blobs and as a spacing instruction. This clearly violates the separation of concerns principle because it mixes structure and appearance. Some document classes such as \texttt{acmart} and \texttt{llncs} redefine the \cmd{and} macro for other purposes.
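
For concreteness, the following sketch shows a typical author block written for the standard \pkg{article} class. The names, institutions, and footnote text are invented for illustration. Line breaks, the \cmd{and} separator, and a \cmd{thanks} footnote are interleaved with the names, so the name of an author cannot be recovered without interpreting the markup.
\begin{Verbatim}[samepage=true]
% Illustrative only: names, institutions, and footnote text are made up.
\author{Alice Example\thanks{Supported by a hypothetical grant.}\\
        Example University\\
        \texttt{alice@example.org}
   \and Bob Sample\\
        Sample Institute}
\end{Verbatim}
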
In almost all cases, authors need to associate other metadata elements with their name, such as email, affiliations, ORCID, funding, etc. Depending on which document class they use, authors have various choices for this such as the \cmd{thanks} macro to create footnotes, or things like the \cmd{orcidlink} macro of the \texttt{orcidlink} package. Both of these are implemented as visual display macros, which again violates the separation of concerns principle. More modern document classes have recognized the need for metadata to be associated with articles and authors, and each of them has invented its own way to encode this data. The \pkg{llncs} class extends \pkg{article} and still uses a single \cmd{author} macro with authors separated by \cmd{and}, but intersperses other macros like \cmd{orcidID} and \cmd{inst} inside the \cmd{author} macro to annotate the individual authors. The implementations of the \cmd{inst} and \cmd{orcidID} macros are still based on layout rather than structure. The \pkg{IEEEtran} class also takes this approach. The \pkg{acmart}, \pkg{amsart}, and \pkg{revtex4-2} document classes all use one \cmd{author} command per author, with intervening macros such as \cmd{orcid}, \cmd{affiliation}, \cmd{email}, etc.\ to describe the metadata for each author. The \pkg{acmart} package also defines an \cmd{additionalaffiliation} macro in case the layout of affiliations takes too much space, but this places the burden of layout back on the author instead of the document class. The \pkg{elsarticle} document class also uses a sequence of \cmd{author} macros for the authors, but the main argument contains embedded footnote marks. Email addresses and home page links are inserted by intervening \cmd{ead} macros. Affiliations are specified with a main argument that consists of a set of key-value pairs.
This bears some resemblance to our approach, but the style of metadata entry determines the styling of frontmatter, which once again violates the separation of concerns principle. There have been several packages such as \pkg{titling}\footnote{See \url{https://ctan.org/pkg/titling}} and \pkg{authblk}\footnote{See \url{https://ctan.org/pkg/authblk}} that offer some flexibility in how authors provide their metadata, but none of them are sufficiently detailed for modern metadata requirements.

\section{Standard metadata schemas in publishing}
There has unfortunately been no effort among \LaTeX\ document classes to standardize the syntax for entering metadata, or even which fields to associate with an author.\footnote{For example, the \pkg{llncs} class associates emails with affiliations instead of authors.} This is annoying for authors who try to adapt their \LaTeX\ from one publisher format to another. Moreover, the metadata associated with authors and articles has become increasingly complicated over the years, with new requirements to identify authors by a unique ID (ORCID), as well as the need to identify institutions by their ROR ID\footnote{See \url{https://ror.org/}} and a need to identify funding sources with standard identifiers like the funder ID.\footnote{See \url{https://www.crossref.org/services/funder-registry/}} In the world of scholarly journal publishing, there have been several efforts to standardize schemas for metadata. One of the best examples of this is the schema used by \texttt{crossref.org} for requests to register a DOI.\footnote{See \url{https://data.crossref.org/reports/help/schema_doc/5.4.0/index.html}} Another well-designed schema is described in the Journal Article Tag Standard (JATS)\footnote{See \url{https://jats.nlm.nih.gov/publishing/tag-library/1.4/}} that is used as a structured document format by many publishers. We took our guidance from these two schemas in how to represent metadata in a \LaTeX\ document.
In fact, our workflow creates both the crossref format and the \texttt{} and \texttt{} sections of a JATS document. This package does not attempt to cover all possible metadata associated with an article, but see section~\ref{missing}. Authors and affiliations are listed independently, with an \texttt{inst} argument for an author to indicate which affiliation is associated to an author. Funding is associated to the document itself rather than the author, in keeping with the schemas provided by \texttt{crossref.org} and JATS. \section{Our solution} The processing of metadata really has four parts to it: \begin{enumerate} \item \label{supply}Authors use \LaTeX\ macros to supply their metadata in a well-structured format. \item \label{markup}When compiled, the metadata is used to perform visual markup of the front matter (e.g., title, authors, affiliations, keywords, etc). This is completely under the control of the document class, but this package supplies multiple versions of the \cmd{maketitle} macro to assist in the process. \item \label{extract}When the article is published, the metadata is {\em extracted} from the author-supplied document and used in the publishing workflow. This author-supplied metadata is combined with publisher-supplied metadata such as volume number, issue number, dates, etc. All of the metadata can then be registered with indexing agencies and used to supply structured data for the journal web pages and later harvesting agents. \item \label{xmp} The metadata may be embedded into the output PDF or HTML. \end{enumerate} This package addresses all four steps, and they are addressed in the subsections that follow. \subsection{Author-supplied metadata} The primary macros used by authors are \cmd{title}, \cmd{subtitle}, \cmd{addauthor}, \cmd{addaffiliation}, \cmd{addfunding}, \cmd{license}, and \cmd{addkeywords}. A complete description of these is in Section~\ref{authorusage}. 
The author enters only the data with these macros, omitting all formatting of how authors are to be displayed in the front matter. Macros other than accents are forbidden in the primary argument to \cmd{addauthor}, and in particular \cmd{thanks} is disabled. This is so that the package can clearly identify the name of the author. Any attributes to the author such as email are added as optional key-value pairs (\cmd{thanks} is replaced by a \pkg{footnote} attribute to \cmd{addauthor}). In our first implementation of metadata capture~\cite{tugboat}, the metadata extraction was intertwined with the document class \texttt{iacrcc}~\cite{iacrcc}. In this \pkgname\ package we have separated out the macros to capture the metadata from the formatting of metadata. This completes the separation of metadata capture from document formatting, and allows document classes to style their documents however they like.

\subsubsection{Abstracts}
There is a bit of ambiguity in what constitutes ``metadata'' about an article. While we have attempted to cover the most important elements, we also list some additional elements in Section~\ref{missing}. One element that is problematic is abstracts. These are supported metadata in the Crossref schema, and they have encouraged publishers to submit them as part of the Initiative for Open Abstracts (I4OA). While abstracts can be useful for summarization and discovery, there are a few problems associated with treating abstracts as metadata. For one thing, some journals treat them as copyrighted material, whereas many institutions like the Research Library Association argue that metadata should be made available under a CC0 license. Quite a few publishers including ACM, Elsevier, and Springer were \href{https://www.crossref.org/blog/open-abstracts-where-are-we/}{withholding abstracts} from their metadata in 2020. Another complication that arises with abstracts is in formatting.
The crossref schema requires an abstract to be formatted in JATS format, and the conversion from \LaTeX\ to JATS can be problematic. Some authors treat their abstracts as mini-articles and use all sorts of formatting including displayed equations, bibliographic references, tables, bulleted lists, etc. They also often use user-defined macros in their abstracts. For this reason, it can be complicated to encode abstracts as metadata. Due to the complications associated with abstracts, we have decided to pursue a middle ground between trying to restrict author content in abstracts and successfully capturing an abstract that can easily be encoded as metadata. We do not modify the \texttt{abstract} environment, but instead have a load-time option to require a \texttt{textabstract} environment that will result in the contents being captured to an external file.

\subsection{Display of metadata}\label{maketitle}
When it is displayed, the metadata of an article is called the ``front matter'', and there are many different styles for this to be displayed, often with a custom \cmd{maketitle} macro. Despite the name, the \cmd{maketitle} macro is often responsible for display of author information, and sometimes also responsible for display of abstract, keywords, and license. The display of front matter can be quite complicated, with authors having multiple affiliations, authors sharing affiliations, footnotes attached to titles and authors, etc. For example, \url{https://arxiv.org/pdf/2210.03375} has hundreds of authors, 75 affiliations, and 12 footnotes on author names (not surprisingly, they omit email addresses). The author metadata is inherently {\em relational}, with authors related to their affiliations, and other attributes. These relationships are often represented visually with footnote structures.\footnote{One problem with this is that the standard \cmd{footnote} macro does not work inside boxes that may be used to construct the front matter.
This shows up in some two-column formats because the title and author names are typically displayed in a block across both columns. We use the \texttt{footnotehyper} package to overcome this.} There are numerous common styles for displaying this information, including listing author affiliations under each author's name (repeating the information), or using footnotes to show affiliations for authors, or grouping authors together for a given institute, or authors ordered in some way (e.g., alphabetically or randomly). The \pkg{amsart} class places the affiliations {\em after} the body of the article as endnotes, and so does \pkg{OUP-EJ} for \emph{The Economic Journal}. This package supplies several ways to display the front matter of the document. This is done by having various implementations of \cmd{maketitle} that can be selected at load time. This particular document is typeset with the standard \texttt{article} document class, for which default values of \cmd{@title} and \cmd{@author} are supplied to just work out of the box with the existing \cmd{maketitle}. Document classes are free to use one of the built-in implementations of \cmd{maketitle}, but they can also provide their own. At present, the styles consist of the following (visual appearance of each is displayed in Appendix~\ref{appendix}): \begin{description} \item[\texttt{iacrj}] Author names are strung together in a list, with optional ORCID icons after their names, and footnotes to indicate which affiliations they belong to. Affiliations are listed individually under the block of author names. This is the official version used by the \texttt{iacrj.cls} document class for IACR journals. It is similar to the first style of \texttt{elsarticle}. See page~\pageref{iacrj}. \item[\texttt{acmsmall}] This is similar to the \texttt{acmsmall} style of \texttt{acmart.cls}, with one author per line in a vertical list, with author names in small caps followed by their affiliations and countries. 
See page~\pageref{acmsmall}. \item[\texttt{acmconf}] This is similar to the conference proceedings style of \texttt{acmart.cls}. Each author is listed in a block with their email and affiliations underneath their name. Shared affiliations are repeated under each author's name, and links to home page and ORCID are omitted. See page~\pageref{acmconf}. \item[\texttt{jems}] Modeled after the style used for the {\em Journal of the European Mathematical Society}, in which author names appear before the title, keywords after the abstract, and each author has an unnumbered footnote that includes the affiliation, email, and URL. See page~\pageref{jems}. \item[\texttt{inv}] A left-aligned style inspired in part by the style of {\em Inventiones mathematicae} in that it uses blocks of text to display emails. See page~\pageref{inv}. \item[\texttt{lipics}] This is modeled after the style of the Dagstuhl \texttt{lipics-v2021} document class. It shows icons for the author email, homepage, and orcid. See page~\pageref{lipics}. \item[\texttt{ams}] This is similar to what is used in \texttt{amsart}, namely a title and author names in small caps, with affiliations listed after the references. For some reason this style has author footnotes with \href{https://ctan.math.washington.edu/tex-archive/info/amscls-doc/Author_Handbook_Journals.pdf}{no footnote mark}, so the footnote has to mention the author to give context in the footnote. See page~\pageref{ams}. \end{description} The visual appearance of these styles can be seen in Appendix~\ref{appendix} at the end of this document. There is also a \texttt{sample.tex} file supplied with this package that can be used to test the combination of these \cmd{maketitle} styles with various document classes. With the exception of the \texttt{iacrj} style, none of these represent the official styles of their respective publishers. 
These styles are included to allow authors to choose a preferred style, but also to demonstrate the flexibility of the schema and to provide useful examples for document class designers who wish to implement their own \cmd{maketitle} using the internal variables documented in Section~\ref{variables}. We believe that this should simplify the construction of a \cmd{maketitle} macro, since the variables hold only metadata without formatting.

\subsubsection{Abstracts}
In the original \LaTeX\ document classes, the abstract was considered merely as a preliminary section of the document with special styling, and it would appear after the \cmd{maketitle} macro. Some document classes have started treating the abstract as part of the frontmatter, and delegate the display of it to \cmd{maketitle}. As a result, some document classes like \texttt{amsart}, \texttt{acmart}, \texttt{elsarticle}, and \texttt{REVTeX} now require the \texttt{abstract} environment to appear before the \cmd{maketitle} macro. Our implementations of \cmd{maketitle} can adapt to the \texttt{amsart}, \texttt{acmart}, and \texttt{elsarticle} document classes by invoking their internal commands to display the abstract when \cmd{maketitle} is invoked. There are other metadata elements that may need to be displayed, such as license, keywords, abstract, etc. The display of these is up to the document class. Our document class \texttt{iacrj} has implementations for visual display of license, abstract, and keywords, but also things like a volume number, issue number, DOI, Crossmark, etc. A document class can implement these elements in any manner they wish using the internal variables from this package that are defined in Section~\ref{variables}.

\subsection{Capture of metadata}
When a document that uses the \pkgname\ package is compiled, the author-supplied metadata is extracted from the \LaTeX\ and written into a \texttt{.meta} file that is machine-parseable.
The extraction of metadata in a machine-readable format during compilation makes it easy to build publishing workflow systems around \LaTeX, and this was a big part of the original motivation for this package. An example of this was used by the journal {\em IACR Communications in Cryptology}\footnote{See \url{https://cic.iacr.org/}} and the publishing pipeline system for this is available as open source.\footnote{Source code available at \url{https://github.com/IACR/latex-submit} and a demo is at \url{https://publishtest.iacr.org/}.} One part of that system is a \href{https://github.com/IACR/latex/tree/main/iacrcc/parser}{Python parser} for the file containing extracted metadata that is written by the package, but it should be easy to write another parser, because the extracted metadata has a simplified YAML-like structure. The structure of this file is described in Section~\ref{metafile} and a sample is given in Figure~\ref{samplemeta}. For more information on this workflow system, the reader is referred to~\cite{loweringthecost}. Most journal production workflows are proprietary and opaque, but it appears that some use parsing tools to extract the metadata directly from the \LaTeX\ source. Examples of this include the ACM workflow\footnote{Extraction tools are mentioned in \url{https://mirror.math.princeton.edu/pub/CTAN/macros/latex/contrib/acmart/acmart.pdf}} and the Dagstuhl \LaTeX\ project.\footnote{See \url{https://github.com/dagstuhl-publishing/latex}} This approach can be difficult because \LaTeX\ is a full programming language, and things like \cmd{ifx} conditionals make it difficult to reliably parse \LaTeX. This is one reason why we decided to use \LaTeX\ itself to produce the metadata in an external file. The only real parser for \TeX\ is the \TeX\ binary itself, but our approach avoids the problem.
It appears that the \texttt{aomart.cls} document class used for the Annals of Mathematics also follows the approach of writing metadata to an external file.

\subsection{Embedding metadata in PDF}\label{pdfmetadata}
There have been multiple attempts to provide packages for embedding metadata into PDF. These include the \texttt{hyperxmp}, \texttt{pdfx}, and \texttt{xmpincl} packages. The \LaTeX\ team is working on providing XMP metadata in the PDFs as part of their accessibility initiative~\cite{xmpinlatex}, and we expect this to be the eventual solution. We plan to support this as part of \pkgname\ when the API for the \href{https://ctan.math.washington.edu/tex-archive/macros/latex/contrib/pdfmanagement-testphase/l3pdfmeta.pdf}{\texttt{l3pdfmeta}} module becomes stable. Similar solutions should exist to inject the structured metadata into other output formats such as HTML or EPUB. We don't require the \pkg{hyperref} package to be loaded unless the \texttt{maketitle} package option is used or the \cmd{license} macro is used. If the \pkg{hyperref} package is loaded, then the \pkgname\ package will set the PDF metadata for \texttt{pdftitle} and \texttt{pdfkeywords}. If the \texttt{anonymous} option is not used, then it will also set \texttt{pdfauthor}. If \pkg{hyperref} is loaded, it should not be loaded with the \pkg{pdfusetitle} option.

\section{Options for loading}
The \pkgname\ package may be loaded by the document class but may also be loaded by the author. In any event, the \pkgname\ package must be loaded before the author specifies author, title, etc. \pkgname\ may be loaded with various options:
\begin{description}
\item[\texttt{maketitle=\textless{style}\textgreater}] If this is used, then the package provides a \cmd{maketitle}. The \texttt{style} can be any of the styles listed in Section~\ref{maketitle}. If this is not chosen, then the class must define its own \cmd{maketitle} that makes reference to internal variables of the package.
Note that the \cmd{maketitle} macro from \texttt{article.cls} will work out of the box, because under the covers we implement the \cmd{@title} and \cmd{@author} macros. This document is typeset using those values. See Section~\ref{maketitle} and Appendix~\ref{appendix}.
\item[\texttt{anonymous}] If chosen, then the implementations of \cmd{maketitle} that may be invoked with the \texttt{maketitle} option will not disclose author names or affiliations in the PDF. A document class should load with this option if it is intending to format for a blind peer review system.
\item[\texttt{licensereq}] This requires the document to specify a license with the \cmd{license} macro. At present we only support a few licenses (see section~\ref{license}). If a document class wishes to further restrict which license is acceptable, they can check the \cmd{METAC@license} variable at the end of the preamble.
\item[\texttt{countryrequired}] if chosen, then every affiliation is required to declare a \texttt{country} attribute.
\item[\texttt{cityrequired}] if chosen, then every affiliation is required to declare both a \texttt{city} and a \texttt{country} attribute.
\item[\texttt{textabstract}] if chosen, then the document must specify a separate ``text-only'' abstract that is free of macros other than mathematics in a \texttt{textabstract} environment that contains no user-defined macros. This abstract is in addition to the ordinary \texttt{abstract} environment, and results in a file named \cmd{jobname.abstract} that contains the abstract when the paper is compiled. We ask for such an abstract from authors so that we can capture an abstract that is suitable for indexing and HTML pages.
\item[\texttt{emailreq}] this takes one of three possible options \texttt{none,one,all} that indicates whether no emails are required for authors, at least one email is required for some author, or all authors must supply an email.
This option might be used by a document class that wishes to require a corresponding author.
\item[\texttt{orcidreq}] whether each author must have an ORCID. Keep in mind that some authors may refuse to use an ORCID. The ORCID of an author should probably only be included if it is supplied by the author themself.
\item[\texttt{notitlefootnote}] when selected, the \cmd{footnote} macro is disabled inside the main argument of \cmd{title}
%\item[\texttt{lefttitle}] when selected with \texttt{maketitle}, the title and authors will be left-aligned
\item[\texttt{footnotesymbols}] Some of the options for \texttt{maketitle} use a different style of footnote marker for affiliations from the rest of the footnotes. For example, in the \texttt{iacrj} style the footnotes on title and authors would ordinarily be labeled as a,b,c, but they are labeled as symbols \textasteriskcentered, \dag, \ddag, etc if the \texttt{footnotesymbols} option is also used. Note that this option should be used with caution, because at most 10 authors can have footnotes with this option.
\end{description}

\section{Usage by authors}\label{authorusage}
The main macros for authors that are provided by this package are \cmd{title}, \cmd{subtitle}, \cmd{license}, \cmd{addauthor}, \cmd{addaffiliation}, \cmd{addfunding}, and \cmd{addkeywords}. These can only be used in the preamble before \cmd[document]{begin}. There is also a \texttt{textabstract} environment to capture text-only versions of the abstract.
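
As a rough illustration, a minimal document using these macros might look like the sketch below. It is modeled on the preamble of this document: the names, addresses, funder, and choice of load option are placeholders, and the \cmd{DocumentMetadata} line is copied from this document's own preamble rather than being a documented requirement of the package.
\begin{Verbatim}
% A sketch only: all names and values below are invented placeholders.
\DocumentMetadata{lang=en}
\documentclass{article}
\usepackage[cityrequired]{metacapture}

% plaintext is needed here because the title contains a macro.
\title[plaintext = {A sample paper about LaTeX},
       running = {A sample paper}]{A sample paper about \LaTeX}
\subtitle[plaintext = {With structured metadata}]{With structured metadata}

\addauthor[inst = {1},
           email = {alice@example.org},
           surname = {Example}]{Alice Example}
% cityrequired asks for both city and country on every affiliation.
\addaffiliation[city = {Leuven},
                country = {Belgium}]{Example University}
\addfunding[country = {Belgium}]{Example Funding Agency}
\addkeywords[Metadata, LaTeX]{Metadata, \LaTeX}
\license{CC-BY-4.0}

\begin{document}
\maketitle
\begin{abstract}
A short abstract goes here.
\end{abstract}
\end{document}
\end{Verbatim}
Everything that describes the paper sits in the preamble as plain key-value data, and the document body contains no front matter markup beyond \cmd{maketitle} and the abstract.
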
\subsection{Title}
A title is added using the \cmd{title} macro, which has a number of optional arguments:
\newcommand{\argrow}[2]{\texttt{#1} & #2\\}
\newenvironment{arglist}{%
  \begin{flushleft}\renewcommand{\arraystretch}{1.2}\begin{tabular}{@{}lp{0.7\linewidth}}%
}{%
  \end{tabular}\end{flushleft}
}
\begin{arglist}
\argrow{running}{The running title intended for display in the headers.}
\argrow{plaintext}{A text version of the title (mandatory if macros are used in the title).}
% \argrow{footnote}{Add a footnote to the title. Only one is allowed.}
\end{arglist}
\noindent An example using all the optional arguments is shown below.
\begin{Verbatim}[samepage=true]
\title[running = {The iacrcc class},
       plaintext = {How to use the iacrcc LaTeX class},
      ]{How to use the \texttt{iacrcc} \LaTeX\ class\footnote{A
        revision of an earlier paper on arxiv.org}}
\end{Verbatim}
The \verb+plaintext+ option is only required if you use macros in your title (it is required in the example). Inline mathematics and accents like \verb+\"u+ are allowed in the main argument to \cmd{title}, and so are newlines \texttt{\textbackslash\textbackslash}. Note that \LaTeX\ has defaulted to UTF-8 input since 2018, so just ü is preferred to \verb+\"u+. Note also that \cmd{thanks} is disabled inside \cmd{title}, and \cmd{footnote} can optionally be disabled by loading \pkgname\ with the option \texttt{notitlefootnote}. See Section~\ref{footnotes} for information about footnotes. In our previous implementation from \texttt{iacrcc.cls}, we had a \texttt{subtitle} attribute, but that has now been moved into a separate \cmd{subtitle} macro in order to support a plain text version.

\subsubsection{Subtitle}
An author is always allowed to have a two-line title by inserting a newline \texttt{\textbackslash\textbackslash} into the main argument of \cmd{title}, but a subtitle would often be typeset in a smaller font.
The semantics of a subtitle are always a little unclear, but the most common definition is for a ``subordinate or explanatory title''.\footnote{The JATS standard states that ``The \texttt{\textless{subtitle}\textgreater} is a subordinate or auxiliary title that adds information to the full title or modifies the full title.''} If an author wishes to have a subtitle, they use the \cmd{subtitle} macro, which takes a \texttt{plaintext} attribute that is required if the main argument to \cmd{subtitle} contains any macros. A full example could be:
\begin{Verbatim}[samepage=true]
\subtitle[plaintext={A LaTeX tutorial}]{%
  A \LaTeX\ tutorial\protect\footnote{Thanks to Leslie Lamport}}
\end{Verbatim}
Note that footnotes need to be protected inside a subtitle. The \texttt{notitlefootnote} option also prevents \cmd{footnote} from being used inside \cmd{subtitle}. A document class is free to treat subtitles in any way it sees fit, but if the \cmd{title} macro is used with the \texttt{running} attribute, then the subtitle should probably not be added to a running title.

\subsection{Authors}
Author information is entered using the \cmd{addauthor}, \cmd{addaffiliation}, and \cmd{addfunding} macros. Authors are asked to enter this information in a structured way so that we can provide it to indexing agencies. The \cmd{author} macro is disabled. Authors are listed individually using repeated calls to the \cmd{addauthor} command, and these must appear before \cmd{begin\{document\}}. The \cmd{addauthor} macro has a number of optional arguments shown in Figure~\ref{addauthor}.
\begin{figure*}
\begin{arglist}
\argrow{inst}{A comma-separated list of 1-based indices, each specifying an institution in the affiliation array (see below).}
\argrow{orcid}{The ORCID of the author, specified using the 19-character format \texttt{xxxx-xxxx-xxxx-xxxx}.}
\argrow{footnote}{Create an author-specific footnote.}
\argrow{surname}{Indicate the surname of the author for indexing purposes.}
\argrow{onclick}{Provide a URL for the author, e.g., a home page.}
\argrow{email}{Define the e-mail address of this author. Note that the load option \texttt{emailreq} may place restrictions on whether an author needs to supply an e-mail address.}
\end{arglist}
\caption{Arguments to \cmd{addauthor}}
\label{addauthor}
\end{figure*}
The display of these elements by a document class may be customized in any way the document designer sees fit. In some of the \cmd{maketitle} implementations provided by \pkgname, the presence of the \texttt{orcid} attribute creates a small clickable ORCID logo next to the author's name looking like \OrcidLink{0000-0003-1010-8157}[auth]~that is a hyperlink to the author's ORCID home page. This is the authenticated logo for ORCID, but the unauthenticated version \OrcidLink{0000-0003-1010-8157}[unauth]~is also bundled into this package if your journal workflow requires it. Similarly, some of our implementations of \cmd{maketitle} display the \texttt{onclick} attribute with an icon like \AuthorLink{https://theonion.com/} or \homelink{https://theonion.com} displayed next to the author's name that is an active link to the URL. It's not obvious how to interpret the omission of the \texttt{inst} argument from \cmd{addauthor}. It's possible that the author has no affiliation, but it's also possible that the author is affiliated with all listed affiliations. That is a matter of policy for the document class.
In order to eliminate this ambiguity, the document class may choose to require the \texttt{inst} argument for every author, and use an empty \texttt{inst} argument in case the author has no affiliation. In the \cmd{maketitle} implementations supplied in this package, we have chosen to omit the footnotes on author names for affiliations in the following cases:
\begin{itemize}
\item if \cmd{addauthor} omits the \texttt{inst} attribute or it is empty,
\item if there is only a single author,
\item if there is only a single affiliation.
\end{itemize}
In the last two cases we also omit the numbers on the affiliations. The \texttt{inst} array serves two purposes, namely for appearance to link authors to affiliations, and for metadata processing in a journal workflow where author affiliations are reported. In the latter case the indices must be validated to make sure that they refer to actual entries in the affiliation array. Some downstream processors like crossref request author names to be broken into \texttt{given-name,surname} but this is in conflict with many existing cultural norms for author names (see~\cite{falsehoods}). \texttt{crossref} has a required element for surname, which is why we include this.
%% \todok{We recently had a case in CiC
%% with the author name ``Arthur Herlédan Le Merdy'' in which their
%% surname is ``Herlédan Le Merdy''. In this case the bibtex parser
%% failed, the python bibtex parser failed, and the HumanName parser
%% failed to identify the surname. Another example was an author named
%% ``Mahdi Rahimi'', where the HumanName parser failed to recognize
%% ``Mahdi'' as a given name. It is simply not feasible to reliably parse
%% names and recognize what the given name and surname should be, and
%% there is no real reason to require it other than alphabetic ordering
%% on author names.}
When the URL provided to the \texttt{onclick} option contains characters with a ``special'' meaning in \LaTeX{} they might render incorrectly.
For example, the URL
\begin{quote}
\verb+https://web.com/~foo/the best/#zoo+
\end{quote}
contains a tilde, a space, and a pound symbol \#. It would be encoded as
\begin{verbatim}
onclick = {https://web.com/\%7Efoo/the\%20best/\#zoo}
\end{verbatim}
An example using all the optional arguments is given below. In this case the author has \verb+inst={1,2}+ to indicate that they are affiliated with the first and second affiliations that are entered with \cmd{addaffiliation}:
\begin{Verbatim}[samepage=true]
\addauthor[orcid    = {0000-0000-0000-0000},
           inst     = {1,2},
           footnote = {Thanks to my supervisor for the support.},
           onclick  = {https://www.mypersonalwebpage.com},
           email    = {alice@accomplished.com},
           surname  = {Accomplished},
          ]{Alice Accomplished}
\end{Verbatim}
The \cmd{thanks} macro is disabled inside \cmd{addauthor}, so use the \verb+footnote+ option on \cmd{addauthor} instead. In fact, if an author attempts to use any non-accent macros inside the primary argument to \cmd{addauthor} it generates an error.

\subsection{Affiliations}
Affiliations are listed individually using the \cmd{addaffiliation} command \emph{after} the last author has been added using \cmd{addauthor}. It can only be used before \cmd{begin\{document\}}, and has several optional arguments:
\begin{arglist}
\argrow{ror}{The Research Organization Registry (ROR) identifier for this affiliation. This is the equivalent of ORCID for organizations. See \url{https://ror.org/}.}
\argrow{department}{Department or suborganization name.}
\argrow{street}{Street address.}
\argrow{city}{City name.}
\argrow{state}{State or province name.}
\argrow{postcode}{Zip or postal code.}
\argrow{country}{Country name. This is strongly recommended.}
\argrow{countrycode}{ISO-3166 Alpha-2 identifier for country. This is strongly recommended, and it eliminates ambiguity in country name (e.g., Österreich vs Austria). If \texttt{country} is omitted, this can be used to fill it in.
A list of these can be found at \url{https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2}.} \end{arglist} \noindent There is an online tool at \href{https://publish.iacr.org/funding}{\texttt{publish.iacr.org/funding}} to help you find ROR identifiers, and authors are strongly urged to include these. It is up to the implementation of \texttt{maketitle} to decide whether to show all attributes on an affiliation. Most implementations will use the name, \verb+city+ and \verb+country+ arguments. All arguments can be used to provide metadata to indexing agencies. A full invocation of \cmd{addaffiliation} would look like: \begin{Verbatim}[samepage=true] \addaffiliation[ror = {05f950310}, department = {Computer Security and Industrial Cryptography}, street = {Kasteelpark Arenberg 10, box 2452}, city = {Leuven}, state = {Vlaams-Brabant}, postcode = {3001}, country = {Belgium}, countrycode = {BE} ]{KU Leuven} \end{Verbatim} \subsection{Funding information} Authors should use the \texttt{\textbackslash addfunding} macro to make sure that funding agencies can find articles published under their sponsorship. An example is: \begin{verbatim} \addfunding[fundref = {100000001}, grantid = {CNS-1237235}, country = {United States}]{National Science Foundation} \addfunding[ror = {00pn5a327}, country = {United States}]{Rambus} \end{verbatim} \noindent In this example, the author acknowledges a grant from the National Science Foundation and support from Rambus (with no \texttt{grantid}). The inclusion of funding from an agency without a \texttt{grantid} might be appropriate if the author simply received support for a visit. 
The complete list of optional arguments for \texttt{\textbackslash addfunding} is: \begin{arglist} \argrow{fundref}{An identifier from the \href{https://publish.iacr.org/funding}{Crossref funder registry}.} \argrow{ror}{An identifier from the \href{https://publish.iacr.org/funding}{Research Organization Registry} (ROR).} \argrow{country}{The country of the funding agency.} \argrow{countrycode}{ISO-3166 Alpha-2 identifier for country.} \argrow{grantid}{The identifier of the grant that is assigned by the agency who provided it.} \end{arglist} \noindent You can use the online tool at \href{https://publish.iacr.org/funding}{\texttt{publish.iacr.org/funding}} to help you find \texttt{fundref} and \texttt{ror} identifiers. Note that \cmd{addfunding} \textbf{does not} automatically create footnotes or an acknowledgements section to identify funding - it only collects the metadata for indexing. If you wish to include such visible annotations, you can use the \texttt{footnote} option on \cmd{addauthor} or add a separate acknowledgements section. Some funding agencies have specific requirements for how they want to be acknowledged in the article. \subsection{Footnotes}\label{footnotes} Authors may be accustomed to using \cmd{thanks} for footnotes indicating affiliation, email, or funding, but the \cmd{thanks} macro is disabled and authors should use the methods described in this document. We provide the \texttt{footnote} attribute on authors so that they can add an arbitrary footnote to their name. This can be used for indicating that the author's affiliation for the work was different than their current affiliation, or to indicate contact address, or a previous name, etc. Some of the implementations of \cmd{maketitle} use footnotes to connect authors to their affiliations. Document designers often have specific requirements on footnotes, and one such requirement is supported by the \texttt{notitlefootnote} option of this package in case footnotes are not allowed on titles. 
It should be noted that footnotes are specifically tied to paper-oriented layouts, and can be problematic in HTML output.

\subsection{License}\label{license}
When the \texttt{licensereq} option is used upon load, the author needs to provide a supported license. At present the only acceptable licenses are the following Creative Commons licenses: \texttt{CC-BY-4.0}, \texttt{CC-BY-NC-4.0}, \texttt{CC-BY-NC-ND-4.0}, \texttt{CC-BY-NC-SA-4.0}, \texttt{CC-BY-ND-4.0}, and \texttt{CC0-1.0}. An example would look like:
\begin{verbatim}
\license{CC-BY-4.0}
\end{verbatim}

\subsection{Keywords}
Use \cmd{addkeywords}\{keyword1, keyword2\} to give a list of keywords or key phrases. This is an optional macro that should appear before the abstract. Individual keywords should be separated by commas. If the keywords contain math or macros, then you must supply an additional set of text-only keywords; for example:
\begin{verbatim}
\addkeywords[rings, arithmetic on Z]{
  rings, arithmetic on $\mathbb{Z}$}
\end{verbatim}

\subsection{Abstract}
A document class that loads the \pkgname\ package may format the abstract in any way it sees fit, but \pkgname\ also provides a mechanism for extracting a ``text-only'' abstract. If the author provides such an abstract within the \texttt{textabstract} environment, it will create a file named \texttt{\textbackslash{jobname}.abstract} that contains the contents. The purpose of the text-only abstract is to provide for indexing and production of {HTML} pages to describe the paper. As such, it is just as important as the classical \texttt{abstract} of a paper because it contains a textual summary that readers will use to decide if the paper is worth reading. The only difference is that the contents of the \texttt{textabstract} are restricted in what they may contain. Note that the contents of the \texttt{textabstract} will not be displayed in the final PDF except as metadata. Note also that \verb+\begin{textabstract}+ must appear on a line by itself.
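As an illustration, a text-only abstract might be supplied as in the sketch below. The wording of the abstract is invented for this example, and we deliberately keep it free of macros and mathematics, since the exact set of constructs that \texttt{textabstract} accepts is not spelled out here.
\begin{Verbatim}[samepage=true]
\begin{textabstract}
We describe a LaTeX package that captures the title, authors,
affiliations, and license of a document during compilation.
\end{textabstract}
\end{Verbatim}
If the main file is \texttt{main.tex}, the two lines of text above would be written to \texttt{main.abstract}.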
%% \section{Auxiliary files}
%% Users will already be familiar with the fact that running a latex compiler
%% will produce a number of auxiliary files, including the \texttt{.log},
%% \texttt{.aux}, \texttt{.bbl}, \texttt{.blg}, \texttt{.toc}, and
%% \texttt{.out} files produced by \texttt{bibtex}
%% and the \texttt{hyperref} package. If the main \LaTeX\ file is \texttt{main.tex},
%% then the \pkgname\ package will produce two additional files, namely
%% \texttt{main.meta} and \texttt{main.abstract}. The \texttt{main.meta}
%% contains all metadata from the paper, and the file \texttt{main.abstract} contains
%% the contents of the \texttt{textabstract} environment.

\section{Format of the \texttt{.meta} file}\label{metafile}
The \texttt{metacapture-doc.meta} file that is created when a \LaTeX\ document is compiled is similar to yaml. An example is shown in Figure~\ref{samplemeta}.
\newtcolorbox{smalltcolorbox}{fontupper=\footnotesize,colback=blue!5!white,boxrule=0.7pt}
\begin{figure*}
\begin{smalltcolorbox}
\begin{verbatim}
schema:0.9.1
title: The metacapture LaTeX package
subtitle: A demo with different styles and classes
author:
  name:Paul Erdős
  orcid:0000-1111-2222-3333
  inst:1,2
  footnote:Paul has a footnote
  email:erdos@att.com
  surname:Erdős
author:
  name:P\'al Tur\'an
  orcid:0000-0001-7890-5430
  inst:3
  footnote:Another remarkable Hungarian mathematician
  email:latex@digicrime.com
  surname:Tur\'an
affiliation:
  name:University of California, San Diego
  ror:0168r3w48
  department:Computer Science Department
  country:United States
affiliation:
  name:Mega Corporation
  department:Department of Redundancy Department
  city:Sunnydale
  state:California
  country:Elbonia
affiliation:
  name:Faber College
  country:Absurdistan
  department:Department of Unfundable Research
  city:Gottaknow
keywords: Metadata, publishing, LaTeX
license: CC-BY-4.0
\end{verbatim}
\caption{Sample \texttt{.meta} file that is described in Section~\ref{metafile}.
The \texttt{schema} attribute indicates the version of \pkgname\ that was used to create the file. The rest of the format should be fairly clear.} \label{samplemeta} \end{smalltcolorbox} \end{figure*} While this looks like yaml, it's not quite the same. The reader might wonder why we don't write yaml, and the real reason is that yaml requires enclosing strings inside double quotes if they contain any of the characters \verb+{}[]&*#?|-<>=!\%@:+, and those characters would need to be escaped. This would be a pain to implement in \LaTeX, and we don't need the full generality of yaml. The syntax of the \texttt{.meta} file is simplified by the fact that every value is on a single line. Note that the output format may contain macros in math mode, and also a few simple macros such as \cmd{'e}. The complete list of allowed macros is defined in \texttt{IsMacroAllowed\{\}}.

\section{Internal variables}\label{variables}
Those seeking to implement their own document class based on this package should make use of some internal variables. If a document class wishes to place additional restrictions on the metadata that is provided, then it can implement additional checks on these variables at the end of the preamble. An example might be to check that every author supplied a surname, or that every author supplied an affiliation. The most important internal variables are listed in Table~\ref{othervariables}. We believe that these are sufficient to construct any form of front matter that is desired, and we provide several implementations of a \cmd{maketitle} command that can be accessed through the load option \texttt{maketitle=\textless{style}\textgreater}. The first version of this package was written in LaTeX2$\epsilon$ syntax, but that made it complicated to store a list of authors, affiliations, or funding agencies.
The \pkgname\ package is now implemented using functionality from the LaTeX3 programming layer.\footnote{For those who are unfamiliar with this, we recommend reading \url{https://ctan.math.washington.edu/tex-archive/macros/latex/required/l3kernel/expl3.pdf} and the reference manual \url{https://ctan.math.washington.edu/tex-archive/macros/latex/required/l3kernel/interface3.pdf}.} In particular, this means that some variable names follow the general pattern of\\ \texttt{\textbackslash\textless{scope}\textgreater\_\textless{module}\textgreater\_\textless{name}\textgreater\_\textless{type}\textgreater}, where \begin{itemize} \setlength\itemsep{0pt} \item \texttt{\textless{scope}\textgreater} is either \texttt{g} or \texttt{l} for global or local variables, \item \texttt{\textless{module}\textgreater} is the string \texttt{metac}, which we use to denote the module, \item \texttt{\textless{name}\textgreater} is a variable name, \item \texttt{\textless{type}\textgreater} is a data type. \end{itemize} The two most important data types from the LaTeX3 programming layer are the \texttt{seq} and \texttt{prop} data structures. The \texttt{prop} data structure is a property list, and is much like a dictionary that holds key-value pairs. This is a natural match for storing each author, which is itself a set of key-value pairs. The same goes for each affiliation and each funder. The other important data structure is \texttt{seq}, which is a sequence. We use the variable \cmd{g\_metac\_author\_seq} to hold the sequence of authors. Due to a limitation of the \texttt{seq} and \texttt{prop} objects, the sequences hold only serialized versions of the author \texttt{prop} rather than the \texttt{prop} object itself.\footnote{Apparently the entry of a \texttt{seq} variable can only be ``balanced text'' as defined in the \TeX\ book. 
See \url{https://tex.stackexchange.com/questions/115700/can-i-store-sequences-in-sequences-with-expl3} and \url{https://github.com/latex3/latex3/issues/500} where the LaTeX team discussed such nested data structures and decided not to support them.} Finally, there is an additional data structure called a \texttt{clist} (a comma-separated list) that is useful for holding the lists of keywords. If any of the LaTeX3 variables are used in a document class, then the code has to be enclosed inside \cmd{ExplSyntaxOn}...\cmd{ExplSyntaxOff} groups. This is not a serious limitation, since it's much like the restriction to access variables that contain the \texttt{@} character inside \cmd{makeatletter}...\cmd{makeatother} blocks.
\newcommand{\vardesc}[2]{\item[#1]\hfill\\#2}
\begin{table*}
\begin{smalltcolorbox}
\begin{description}
\setlength\itemsep{0pt}
\vardesc{\cmd{g\_metac\_author\_seq}}{the list of authors, each of which is a serialized key-value \texttt{prop}}
\vardesc{\cmd{g\_metac\_affil\_seq}}{the list of affiliations}
\vardesc{\cmd{g\_metac\_funders\_seq}}{the list of funders}
\vardesc{\cmd{g\_metac\_keywords\_raw\_clist}}{The list of raw encoded keywords (may contain macros)}
\vardesc{\cmd{g\_metac\_keywords\_plaintext\_clist}}{The list of plaintext keywords}
\vardesc{\cmd{METAC@license}}{When \cmd{license} is called, this is set to the license identifier. This is an SPDX identifier because of our dependence on the \texttt{doclicense} package. An example is \texttt{CC-BY-4.0}.}
\vardesc{\cmd{if@metacapture@anonymous}}{Set if the \texttt{anonymous} option is used when loading the package.}
\vardesc{\cmd{g\_metac\_display\_emails\_tl}}{This is a comma-delimited list of \texttt{email,(name)} values that were constructed from calls to the \cmd{addauthor} macro.}
\vardesc{\cmd{@title}}{The formatted title supplied by the author as argument \texttt{\#2} of \cmd{title}. This does not include anything from \cmd{subtitle}.}
\vardesc{\cmd{g\_metac\_titleraw\_tl}}{The raw title supplied as the main argument to \cmd{title}.}
\vardesc{\cmd{g\_metac\_titlerunning\_tl}}{Optional running title supplied by the author.}
\vardesc{\cmd{g\_metac\_titleplain\_tl}}{Optional plain text title.}
%\vardesc{\cmd{METAC@title@footnote}}{Optional footnote for the title.\todok{no replacement?}}
\vardesc{\cmd{g\_metac\_subtitleraw\_tl}}{Optional subtitle.}
\vardesc{\cmd{g\_metac\_subtitleplain\_tl}}{Optional plaintext version of subtitle.}
\vardesc{\cmd{METAC@listofauthors}}{A list of author names separated by ', '}
%\vardesc{\cmd{@author}}{A marked up list of authors that is used internally by the \cmd{maketitle} of the package.}
\vardesc{\texttt{METAC@author@cnt}}{A counter for the number of authors. It is incremented each time \cmd{addauthor} is called.}
\vardesc{\texttt{METAC@email@cnt}}{A counter for the number of authors with email.}
\vardesc{\texttt{METAC@affil@cnt}}{A counter for the number of affiliations. It is incremented each time \cmd{addaffiliation} is called.}
\end{description}
\end{smalltcolorbox}
\caption{Internal variables that are set by calls to \cmd{addauthor}, \cmd{addaffiliation}, \cmd{addfunding}, \cmd{addkeywords}, \cmd{title}, \cmd{subtitle}, and \cmd{license}. Some of these are LaTeX3-specific, as indicated by the name used for them. All of these are available at the end of the preamble, because the commands to set them may only be used in the preamble.}
\label{othervariables}
\end{table*}
A complete tutorial on the use of \texttt{expl3} is beyond the scope of this article, but we hope that the source code of the package contains sufficiently many examples of how to use the variables.

\section{What's missing}\label{missing}
The purpose of this package is to capture author-supplied metadata rather than publisher-supplied metadata such as a DOI or page numbers.
Such publisher-supplied metadata is often encoded into the PDF of a publication, e.g.\ as a hyperlink to the DOI. We leave the handling of publisher-supplied metadata to the document class, but the \pkg{iacrj.cls} and our open-source workflow may prove useful as an example. The breadth of metadata for a publication has been growing in recent years. We have attempted to include only the minimal metadata elements that have clear definitions, are reported to Crossref, and are currently required in all disciplines. We expect that others may be needed in the future. The following list is not complete, but some candidates are:
\begin{description}
\item[Licenses] We currently only support a limited selection of licenses (e.g., we omit copyleft). It's possible that someone may wish to place different licenses on media embedded in the document. It's also possible that someone may wish to place different licenses on the \LaTeX\ source than the final document intended for readers. We do not cover these cases.
\item[Copyright] The \texttt{acmart} document class provides the \cmd{setcopyright} macro to stipulate additional copyright conditions such as \texttt{usgovmixed} to stipulate that some authors are employees of the US government. Authors may also wish to declare copyright limitations on selected portions of the document. Both JATS and the crossref schema currently support the elements \texttt{\textless{copyright-holder}\textgreater}, \texttt{\textless{copyright-statement}\textgreater}, and \texttt{\textless{copyright-year}\textgreater} that contain structured data. Both schemas allow them to be applied to subsections of the document so that a document may recognize copyright of a third party for embedded elements.
\item[Languages] We have no way for an author to express which languages are used in the document, or to provide language-specific versions of title, keywords, affiliation name, abstract, etc.
\item[Article categories] Some journals tag an article as a type, e.g., ``Commentary'', ``Research article'', ``Book Review'', or ``Survey''. These appear as \texttt{\textless{article-categories}\textgreater} in JATS.
\item[Affiliations] There are a number of other elements that might be associated with an affiliation, including address lines for a postal address, phone number, a URL, or other identifiers such as Grid, Ringgold, Scopus, etc.
\item[XMP] XMP stands for ``eXtensible Metadata Platform'', and is an XML standard for embedding metadata into PDF as well as other document formats. Unfortunately the schema lags badly behind other standards (it doesn't even have support for ORCID without resorting to a non-standard extension schema). See Section~\ref{pdfmetadata}.
\newcommand\CREDIT{CRediT}
\item[Contributor roles] There have been various attempts to define a taxonomy of roles played by authors. The \texttt{amsart.cls} document class allows specifying a {\em contributor} with \cmd{contrib} and a {\em role} argument to say things like ``with an appendix by N.\ Bourbaki'' after the list of authors. They do not appear to report this information to crossref. Perhaps the best known definition of contributor roles is \CREDIT, which stands for Contributor Roles Taxonomy, and has now become an ANSI/NISO standard.\footnote{See \url{https://credit.niso.org/}} Crossref has announced that they will support something like this in version~5.5 of their schema. There are several things that remain to be determined, like the role of AI agents in authorship, the degree of a role, whether ``translator'' is a recognized role, etc.
\item[Author bio] IEEE and other publishers may collect an author bio, and JATS also supports this. The model in JATS is pretty complex and supports titled sections.
\item[Other author IDs] ORCID is pretty common now, but some authors may not have them (e.g., a deceased author) and a publisher may wish to use their own namespace (e.g., SCOPUS or MathSciNet Author ID). \item[Author notes] Sometimes a particular author will receive a designation (e.g., a ``contact author'', or the author responsible for supplying data). This is in the \texttt{\textless{author-notes}\textgreater} element of JATS, and may have multiple authors referencing a single note. \item[Bibliographic references] Since most users of \LaTeX\ use \texttt{bibtex} or \texttt{biblatex}, it is natural to think of exporting bibliographic references as a structured part of the metadata for the article. There are several problems with this, including the fact that the fields for a \BibTeX\ entry are not well defined and the format has failed to evolve.\footnote{The original \BibTeX\ documentation says ``Don't take the field names too seriously''.} For example, authors may add things like a URL as part of the \texttt{url} field, or a \texttt{note} field, or a \texttt{howpublished} field. Moreover, packages like \texttt{biblatex} have added additional entry types and fields. Given the weaknesses of the \BibTeX\ format, we might consider an alternative export format. There are several such bibliographic database formats, but they are seldom used with \LaTeX, and they all suffer from deficiencies. 
These include RIS,\footnote{See \url{https://en.wikipedia.org/wiki/RIS_(file_format)}} Endnote, Zotero,\footnote{See \url{https://gist.github.com/pchemguy/19fa69fb4e74ef0cca0026aa0dbf5f42}} citeproc JSON,\footnote{See \url{https://github.com/citation-style-language/schema}} and JATS.\footnote{See \url{https://jats.nlm.nih.gov/publishing/tag-library/1.4/element/element-citation.html}} In our first effort at metadata extraction~\cite{tugboat}, we used a custom \BibTeX\ style to export the bibliography in a structured format, but that introduced additional problems because we wanted to follow the separation of concerns principle. In the end we decided that exporting bibliographic references is a big complicated mess that is better left to a high-level language. In our companion workflow software,\footnote{See \url{https://github.com/IACR/latex-submit}} we use Python to invoke \pkg{bibexport} to find the cited references, and then parse the bibtex files directly. This was complicated by the fact that we wanted to support both \pkg{biblatex} and \pkg{bibtex}.
\item[Name parts] Some agencies like crossref are attempting to gather names of authors in two parts, namely first and last (or given and family name). We have attempted to comply by allowing an optional surname field on author names, but this approach is flawed since names cannot be assumed to have the same structure across all cultures. See~\cite{falsehoods}. We also do not support alternate names for authors, and we do not support the author-supplied \texttt{name-style} attribute that crossref supports for an author to report that they have only a given name.
\item[Funding text] Some funding agencies have specific text that they want to be displayed to acknowledge them. Ideally this would appear both in the document itself and as part of the metadata on an HTML landing page.
We could address this by including an optional \texttt{text} attribute on \cmd{addfunding}.
\item[Funding groups] We currently support the name, identifier, and award number for a funder, but we may wish to provide further information like the name of a PI or the program within a larger funding organization. This would be driven by downstream requirements.
\item[Shared footnotes] We don't support shared footnotes for authors. These might be useful for a single statement that they contributed equally, or to identify all corresponding authors. We also don't support multiple footnotes on an author or a title.
\item[Multiple departments] Consider the case where author$_1$ is in the mathematics department of UCSB, and author$_2$ wishes to list both the mathematics and computer science departments in their affiliation. In this case it's not clear how the affiliations should be listed. One choice is to list UCSB twice, with author$_1$ specifying the mathematics department and author$_2$ specifying both departments in the \texttt{department} attribute. Alternatively, the UCSB affiliation would be listed once, with footnotes on each author used to indicate the department. The \pkg{acmart} document class has some support for this.
\item[Discipline-specific data] Some disciplines use additional metadata such as clinical trials that are registered with an International Standard Randomized Controlled Trial Number (ISRCTN), or the \verb+ClinicalTrials.gov+ number. We don't understand them well enough to include them here, but they seem like natural extensions.
\item[External documents] Some journal articles are explicitly linked to other documents or media. This could include supplementary material, former versions of the document, translations, related media, data, code, clinical information, etc.
\item[Keywords and taxonomies] Some disciplines also use specific taxonomies or keyword vocabularies (e.g., ACM Computing Classification System, AMS Mathematics Subject Classification, or the JEL classification system in economics). At present, we regard these as too publisher-specific to be included in this general package. A document class can always provide support for them. Both JATS and Crossref have support for keywords and/or subject classifications in their schemas. In both cases there is support for multiple classifications, with multiple vocabularies or assigning authorities.
\end{description}

\section{Package dependencies}
This package depends directly on several other packages, including the following:
\begin{description}
\item[\texttt{xstring}] This is used for \cmd{IfSubStr}.
\item[\texttt{footnote}] Authors are allowed to have footnotes attached to them, and these may be contained inside boxes in the \cmd{maketitle} implementations that the package provides. For this we use the \texttt{footnote} package for footnotes inside of boxes. We tried using the \texttt{footnotehyper} package but that package is too restrictive in how footnotes are defined.
\item[\texttt{alphalph}] Footnote labels may be alphabetic, depending on the load options.
\item[\texttt{tokcycle}] This is used to perform checks on metadata arguments to make sure that they contain ``only text'' that can safely be written to a plain text file.
\item[\texttt{listofitems}] This is used to process a list of macros that are allowed to appear in ``text-only'' arguments to macros. We use \cmd{readlist} from \texttt{listofitems} to read that list. We might be able to switch to native \texttt{clist} from \texttt{expl3} instead.
\item[\texttt{doclicense}] This is used to identify creative commons licenses in the \cmd{license} macro.
\item[\texttt{hyperref}] This is used to provide hyperlinks on footnotes and ORCID links, and it is also required by \texttt{doclicense}.
We try to load \texttt{hyperref} as late as possible so that it does not collide with options set by other packages or the document class.
\item[\texttt{fancyvrb}] This is used to write out the \texttt{textabstract} environment to a file.
\item[\texttt{xpatch}] This is used to patch an output macro from \texttt{fancyvrb}.
\item[\texttt{tikz}] This is used with the \texttt{svg.path} library to draw some icons like the home link and the ORCID link.
\end{description}

\section{Feedback}
Use the \pkgname\ GitHub project to report bugs and submit feature requests.\footnote{See \url{https://github.com/IACR/latex/tree/main/metacapture}} If your feature is relevant only to a specific discipline, then perhaps the natural thing to do is to extend the \pkgname\ package and add additional fields. Adding too many fields and too much complexity can make the documentation hard to digest.
%\printbibliography
\bibliography{metacapture-doc}
\appendix

\section{Appendix: Example styles for \cmd{maketitle}\label{appendix}}
This document was typeset with the \texttt{article} class and the default \cmd{@title} and \cmd{@author} (i.e., using \texttt{maketitle=none}). On the following pages we show the appearance of the different styles for \cmd{maketitle}. A class designer can of course make their own \cmd{maketitle} to suit their own needs, and hopefully these examples will be useful.
\newpage
\phantomsection{}
\pdfbookmark[2]{Demo of maketitle=iacrj }{iacrjdemo}
\setcounter{footnote}{0}
\makeatletter
\METAC@iacrj@maketitle
\makeatother
\begin{quote}
This uses \texttt{maketitle=iacrj}. Footnotes for affiliations are numbered, but footnote marks on title footnotes and author footnotes are alphabetic (they can also be symbols). The icon for a home page differs from the one used in \cmd{@author}.
\end{quote} \label{iacrj} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=acmsmall }{acmsmalldemo} \setcounter{footnote}{0} \makeatletter \METAC@acmsmall@maketitle \makeatother \begin{quote} This uses \texttt{maketitle=acmsmall}. Author names are in small caps. \end{quote} \label{acmsmall} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=acmconf }{acmconfdemo} \label{acmconf} \setcounter{footnote}{0} \savenotes \makeatletter \METAC@acmconf@maketitle \makeatother \spewnotes \begin{quote} This uses \texttt{maketitle=acmconf}. Each author is displayed in a block with repeated affiliations. It appears similar to the default \cmd{@author}, but the spacing is better for more than a couple of authors and links for ORCID and author home pages are omitted. \end{quote} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=jems }{jemsdemo} \label{jems} \setcounter{footnote}{0} \makeatletter \METAC@jems@maketitle \makeatother \begin{quote} This uses \texttt{maketitle=jems}. Author names appear above the title, and each author has an unnumbered footnote with their information. It's not clear what to do with footnotes on author names, and the journal class appears not to support them. \end{quote} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=inv }{invdemo} \label{inv} \setcounter{footnote}{0} \begin{savenotes} \makeatletter \METAC@inv@maketitle \makeatother \end{savenotes} \begin{quote} This uses \texttt{maketitle=inv}. Affiliations are listed below each author name, and are repeated for shared affiliations. Emails are listed after affiliations in a block. \end{quote} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=lipics }{lipicsdemo} \label{lipics} \setcounter{footnote}{0} \makeatletter \METAC@lipics@maketitle \makeatother \begin{quote} This uses \texttt{maketitle=lipics}. Author names have icons for email, home page, and ORCID. Affiliations are listed below each author name, and are repeated for shared affiliations. 
\end{quote} \newpage \phantomsection{} \pdfbookmark[2]{Demo of maketitle=ams }{amsdemo} \label{ams} \setcounter{footnote}{0} \makeatletter \METAC@ams@maketitle \makeatother \begin{quote} This uses \texttt{maketitle=ams}. Title and author names are in small caps. Author footnotes are unnumbered (for some reason this is the style for \texttt{amsart}). Each author's affiliation is listed at the end of the document as below. \end{quote} \end{document}