---
title: "OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software"
author: "Eugenia Galeota"
date: "`r Sys.Date()`"
output: BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{OnassisJavaLibs: Java Libraries to support Onassis, Ontology Annotation and Semantic Similarity software}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignettePackage{OnassisJavaLibs}
  %\VignetteDepends{OnassisJavaLibs}
---

# Introduction

OnassisJavaLibs is a data package containing compiled Java libraries in jar format, together with their source code, used by `r BiocStyle::Biocpkg("Onassis")`.

# Contents of the package

Conceptmapper Java (1.8 version) libraries (https://github.com/UCDenver-ccp/ccp-nlp) version 3.3.2, and semantic similarity libraries (https://github.com/UCDenver-ccp/ccp-nlp) version 0.9.5 have been compiled using Maven. They are available in jar format within the `extdata` directory of this package and can be located through `r system.file('extdata', 'java', package='OnassisJavaLibs')`.

# Accessing and compiling the Java source code

The source code for the Java libraries is available in the `java` directory of the package tarball. Users interested in compiling their own jar files can refer to the following information.

The `conceptmapper` subdirectory contains the Java code to annotate text with concepts from OBO ontologies. To create a jar file that includes all the needed dependencies, the source code can be compiled using Maven (Apache Maven 3.3.9 was used).

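Before starting a build, it can help to confirm that a JDK and Maven are reachable from the shell. The following is a minimal sketch and not part of the package: the helper name `require_tool` is hypothetical, and it only checks that the commands exist on the `PATH`, not that their versions match the Java 1.8 and Maven 3.3.9 mentioned above.

```bash
# Hypothetical helper (not part of OnassisJavaLibs): succeeds only if the
# named command can be found on the PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Check the two tools the Maven builds rely on; report a hint instead of aborting.
for tool in java mvn; do
  require_tool "$tool" || echo "install $tool before building" >&2
done
```

If either tool is reported as missing, the Maven goals will fail before any compilation starts, so it is worth fixing the environment first.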
The `slib` and `similarity` subdirectories contain the Java code to compute the semantic similarity between concepts. Each library can be compiled and installed with the following Maven goals, run from its own directory:

```{r, engine = 'bash', eval = FALSE}
# From the conceptmapper directory
# (system properties are passed to Maven with a single dash: -D)
mvn clean compile assembly:single -Dlog4j.configuration=log4j2.properties

# From the slib directory
mvn clean install

# From the similarity directory
mvn clean install assembly:single
```

# Use of the package

The methods and classes implemented in the described Java libraries can be used through R functions and methods available within Onassis. Alternatively, the Java code can be executed directly using `rJava`. For example, a dictionary can be created from an OBO ontology file with the following code:

```{r echo=TRUE, eval=TRUE}
library(rJava)

# Initializing the JVM
.jinit()

# Adding the path to the jar file
jarfilePath <- file.path(system.file('extdata', 'java',
                                     'conceptmapper-0.0.1-SNAPSHOT-jar-with-dependencies.jar',
                                     package='OnassisJavaLibs'))
.jaddClassPath(jarfilePath)

# Creating an instance of OntologyUtil with the sample OBO file
ontoutil <- .jnew("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil",
                  .jnew('java/io/File',
                        file.path(system.file('extdata', 'sample.cs.obo',
                                              package='OnassisJavaLibs'))))

# Creating the output file that will contain the conceptmapper dictionary
outputFile <- .jnew("java/io/File", "dict.xml")

# Building the dictionary from the OBO ontology
dictionary <- J("edu.ucdenver.ccp.nlp.wrapper.conceptmapper.dictionary.obo.OboToDictionary")$buildDictionary(
  outputFile,
  ontoutil,
  .jnull(),
  J("edu.ucdenver.ccp.datasource.fileparsers.obo.OntologyUtil")$SynonymType$EXACT
)
```

To compute the semantic similarity between two terms of the same ontology, classes in the similarity library can be used in this way:

```{r echo=TRUE, eval=TRUE}
# Adding the similarity library, which contains the Similarity class
# used to compute semantic similarities
jarfilePath <- file.path(system.file('extdata',
                                     'java',
                                     'similarity-0.0.1-SNAPSHOT-jar-with-dependencies.jar',
                                     package='OnassisJavaLibs'))
.jaddClassPath(jarfilePath)

# Creating an instance of the class Similarity
similarity <- .jnew("iit/comp/epigen/nlp/similarity/Similarity")

# Loading the ontology in a graph structure
file_obo <- file.path(system.file('extdata', 'sample.cs.obo',
                                  package='OnassisJavaLibs'))
ontology_graph <- similarity$loadOntology(file_obo)

# Setting the semantic similarity measures
measure_configuration <- similarity$setPairwiseConfig('resnik', 'seco')

# Terms of the ontology need to be converted into URIs
term1 <- 'http://purl.obolibrary.org/obo/CL_0000771'
term2 <- 'http://purl.obolibrary.org/obo/CL_0000988'
URI1 <- .jcast(similarity$createURI(term1), new.class = "org.openrdf.model.URI",
               check = FALSE, convert.array = FALSE)
URI2 <- .jcast(similarity$createURI(term2), new.class = "org.openrdf.model.URI",
               check = FALSE, convert.array = FALSE)

# Computing the semantic similarity score
similarity_score <- .jcall(similarity, "D", "pair_similarity", URI1, URI2,
                           .jcast(ontology_graph, new.class = "slib.graph.model.graph.G"),
                           measure_configuration)
similarity_score
```

# Acknowledgements

We would like to thank the library providers. The methods for the conceptmapper pipeline and for defining the ccp-nlp type system have been developed and published by the Regents of the University of Colorado under the BSD 3-clause license. The methods for computing the semantic similarities have been developed and published by the Ecole des mines d'Alès under the GPL-compatible CeCILL license. Both licenses are provided within the package.