---
title: "Saving and Sharing Graphs with the Caugi Format"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Saving and Sharing Graphs with the Caugi Format}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(caugi)
```

## Overview

The caugi package provides a native JSON-based serialization format for saving and loading causal graphs. This format enables reproducible research, data sharing, and caching of graph structures.

## Quick Start

### Writing Graphs

First, create a causal graph:

```{r}
cg <- caugi(
  A %-->% B + C,
  B %-->% D,
  C %-->% D,
  class = "DAG"
)
```

Then, write it to a file in the caugi format:

```{r}
tmp <- tempfile(fileext = ".caugi.json")
write_caugi(cg, tmp,
  comment = "Example causal graph",
  tags = c("research", "example")
)
```

That's it! The graph is now saved in a human-readable JSON file.

### Reading Graphs

You can read the graph back from the file, and verify it matches the original:

```{r}
cg_loaded <- read_caugi(tmp)
identical(edges(cg), edges(cg_loaded))
```

## The Caugi Format

### Structure

The caugi format uses a simple, human-readable JSON structure:

```{r echo=FALSE, comment=""}
cat(readLines(tmp), sep = "\n")
```

### Key Features

- **Versioned**: Schema version 1 with forward compatibility
- **Human-readable**: Uses node names and DSL operators (not indices)
- **Self-documenting**: Includes `$schema` reference for IDE validation
- **Metadata support**: Optional comments and tags

### Edge Types

The format supports all caugi edge types using their DSL operators:

| Operator | Description            | Graph Types              |
|:---------|:-----------------------|:-------------------------|
| `-->`    | Directed edge          | DAG, PDAG, ADMG, UNKNOWN |
| `---`    | Undirected edge        | UG, PDAG, UNKNOWN        |
| `<->`    | Bidirected edge        | ADMG, UNKNOWN            |
| `o->`    | Partially directed     | PDAG, UNKNOWN            |
| `--o`    | Partially undirected   | PDAG, UNKNOWN            |
| `o-o`    | Partial (both circles) | PDAG, UNKNOWN            |

## Working with the Format

### String Serialization

For programmatic use, you can serialize to/from strings:

```{r}
# Serialize to JSON string
json_str <- caugi_serialize(cg)
cat(substr(json_str, 1, 200), "...\n")

# Deserialize from JSON string
cg_from_json <- caugi_deserialize(json_str)
```

### Lazy Loading

For large graphs, you can defer building:

```{r}
# Read without building the Rust graph structure
cg_lazy <- read_caugi(tmp, lazy = TRUE)

# Build when needed
cg_lazy <- build(cg_lazy)
```

### Metadata

Add context to your graphs with comments and tags:

```{r}
write_caugi(cg, tmp,
  comment = "Mediation model from Study A",
  tags = c("mediation", "study-a", "validated")
)
```

## Different Graph Types

The format supports all caugi graph classes:

```{r}
# DAG
dag <- caugi(X %-->% Y, Y %-->% Z, class = "DAG")

# PDAG (with undirected edges)
pdag <- caugi(X %-->% Y, Y %---% Z, class = "PDAG")

# ADMG (with bidirected edges)
admg <- caugi(X %-->% Y, Y %<->% Z, class = "ADMG")

# UG (undirected graph)
ug <- caugi(X %---% Y, Y %---% Z, class = "UG")

# Save them all
write_caugi(dag, tempfile(fileext = ".caugi.json"))
write_caugi(pdag, tempfile(fileext = ".caugi.json"))
write_caugi(admg, tempfile(fileext = ".caugi.json"))
write_caugi(ug, tempfile(fileext = ".caugi.json"))
```

## File Extension Convention

We recommend using `.caugi.json` as the file extension to clearly indicate both the format and content type. This helps tools recognize the files and enables automatic handling by IDEs and validators.

## Schema Validation

All files generated by `write_caugi()` include a `$schema` field pointing to the formal JSON Schema specification:

```
https://caugi.org/schemas/caugi-v1.schema.json
```

This enables:

- **IDE support**: Autocomplete and inline validation in VS Code, IntelliJ, etc.
- **Automated validation**: Use standard JSON Schema validators
- **Documentation**: Hover hints in editors show field descriptions

## Performance

Serialization is implemented in Rust for high performance. Large graphs serialize and deserialize efficiently:

```{r eval=FALSE}
tmp_file <- tempfile(fileext = ".caugi.json")
large_dag <- generate_graph(n = 1000, m = 500, class = "DAG")
system.time(write_caugi(large_dag, tmp_file))
system.time(res <- read_caugi(tmp_file))
unlink(tmp_file)
```

```{r cleanup, include=FALSE}
# Clean up temp files
unlink(tmp)
```
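
The read and write calls shown earlier can be combined into a quick round-trip check across graph classes. The sketch below is illustrative only: `roundtrip_ok()` is a hypothetical helper, not part of caugi, and it assumes that comparing `edges()` output with `identical()`, as done for the DAG in the Quick Start, is also a suitable equality check for the other classes.

```r
library(caugi)

# Hypothetical helper (not part of caugi): write a graph to a temporary
# file, read it back, and compare the edge tables of both graphs.
roundtrip_ok <- function(cg) {
  path <- tempfile(fileext = ".caugi.json")
  on.exit(unlink(path), add = TRUE)  # remove the temp file on exit
  write_caugi(cg, path)
  identical(edges(cg), edges(read_caugi(path)))
}

# One small graph per class, mirroring the examples in this vignette
graphs <- list(
  dag  = caugi(X %-->% Y, Y %-->% Z, class = "DAG"),
  pdag = caugi(X %-->% Y, Y %---% Z, class = "PDAG"),
  admg = caugi(X %-->% Y, Y %<->% Z, class = "ADMG"),
  ug   = caugi(X %---% Y, Y %---% Z, class = "UG")
)

# Named logical vector with one entry per graph class
vapply(graphs, roundtrip_ok, logical(1))
```

Each temporary file is removed when the helper returns, so the check leaves nothing behind on disk.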