---
title: "Getting started with vcfheader"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with vcfheader}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
library(vcfheader)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center"
)
```

`vcfheader` provides fast inspection and auditing of Variant Call Format (VCF) file headers.

## TLDR

The main workflow is to parse a VCF header and write a standalone HTML report.

```{r eval = FALSE}
input <- parse_vcf_header("file.vcf")
vcfheader(
  input,
  file = "file_report.html"
)
```

## Example report preview

A preview of the bundled example HTML report is shown below.

```{r echo = FALSE, out.width = "100%"}
knitr::include_graphics("simple_vcfheader_screenshot1.png")
knitr::include_graphics("simple_vcfheader_screenshot2.png")
```

The report includes file metadata, contig summaries, INFO and FORMAT field definitions, and other header entries in a portable HTML layout.

## Overview

The `vcfheader` package reads and interprets Variant Call Format (VCF) header metadata without loading full variant records. It is designed for fast inspection, validation, and reporting of header structure in both small and large VCF files.

This vignette shows how to:

* read bundled example VCF headers
* parse the header into a structured `vcf_hdr` object
* inspect key metadata and inferred annotations
* generate an HTML report
* open a bundled example HTML report
* preview the report layout with a screenshot

## Bundled example files

The package examples use small VCF files shipped with the package, so the vignette works offline and is suitable for package checking.

```{r}
simple_vcf <- system.file("extdata", "simple.vcf", package = "vcfheader")
sv_vcf <- system.file("extdata", "sv44.vcf", package = "vcfheader")

basename(simple_vcf)
basename(sv_vcf)
```

## Read raw header lines

Use `read_vcf_header()` when you want only the original header lines.

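
For intuition about what "header lines" means here, the following is a minimal base-R sketch, not the package's implementation. It relies only on the VCF convention that meta-information lines start with `##` and the column header line starts with `#CHROM`. The helper name `read_header_sketch()` and its `chunk_size` argument are hypothetical and exist only for this illustration.

```r
# Hypothetical helper (not part of vcfheader): collect the leading lines that
# start with "#" and stop at the chunk containing the first variant record.
# `con` must be an already-open connection so successive reads continue
# where the previous one stopped.
read_header_sketch <- function(con, chunk_size = 100L) {
  header <- character(0)
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break                 # end of input
    first_record <- which(!startsWith(lines, "#"))[1]
    if (is.na(first_record)) {
      header <- c(header, lines)                   # whole chunk is header
    } else {
      header <- c(header, lines[seq_len(first_record - 1L)])
      break                                        # first variant record reached
    }
  }
  header
}

# A tiny in-memory VCF: two meta-information lines, the column header line,
# and one variant record.
vcf_text <- c(
  "##fileformat=VCFv4.2",
  "##INFO=<ID=DP,Number=1,Type=Integer,Description=\"Total Depth\">",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "1\t100\t.\tA\tG\t50\tPASS\tDP=10"
)

con <- textConnection(vcf_text)
header_lines <- read_header_sketch(con)
close(con)

length(header_lines)
```

Reading in chunks and stopping early is what keeps this kind of inspection cheap on large files: the variant records after the header are not loaded. The chunk that follows uses the package function itself on the bundled example file.
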
```{r}
hdr_lines <- read_vcf_header(simple_vcf)
head(hdr_lines, 5)
```

## Parse a VCF header

Use `parse_vcf_header()` to create a structured object containing file metadata, contigs, INFO and FORMAT definitions, sample names, warnings, and errors.

```{r}
hdr <- parse_vcf_header(simple_vcf)
hdr
```

## Inspect parsed content

The parsed object is a list-like S3 object with standard components.

```{r}
names(hdr)
```

File-level metadata:

```{r}
file_meta <- hdr$file
# Show only the file name so the output does not depend on the install path
file_meta$input_path <- basename(file_meta$input_path)
file_meta
```

Sample names:

```{r}
hdr$samples
```

INFO definitions:

```{r}
hdr$info[, c("ID", "Number", "Type", "Description")]
```

FORMAT definitions:

```{r}
hdr$format[, c("ID", "Number", "Type", "Description")]
```

Validation output:

```{r}
hdr$warnings
hdr$errors
```

## Inference from the header

The parser can add lightweight inferred annotations. For example, some common annotation systems can be guessed from reserved INFO tags.

```{r}
hdr$file$caller_guess
```

## Structural variant example

The structural-variant example includes symbolic ALT definitions and structural INFO fields.

```{r}
sv_hdr <- parse_vcf_header(sv_vcf)
sv_hdr
```

Contigs:

```{r}
# Select only the contig columns that are present in this header
contig_cols <- intersect(c("ID", "length", "assembly", "md5"), names(sv_hdr$contigs))
sv_hdr$contigs[, contig_cols, drop = FALSE]
```

ALT definitions:

```{r}
sv_hdr$alt[, c("ID", "Description")]
```

INFO fields:

```{r}
sv_hdr$info[, c("ID", "Number", "Type", "Description")]
```

## Generate an HTML report

`vcfheader()` writes a standalone HTML report. This is the main use case.

```{r}
# Use the bundled example so this chunk runs during package checks,
# and write to the session temporary directory rather than the working directory
hdr <- parse_vcf_header(simple_vcf)
out_file <- file.path(tempdir(), "file_report.html")

vcfheader(
  hdr,
  file = out_file
)

file.exists(out_file)
basename(out_file)
```

You can also let `vcfheader()` derive the output path from the original input file path.

```{r eval = FALSE}
vcfheader(hdr)
```

## Bundled example HTML report

The package also ships a prebuilt example report generated from the bundled `simple.vcf` file.

```{r}
example_report <- system.file(
  "extdata",
  "simple_vcfheader.html",
  package = "vcfheader"
)

basename(example_report)
file.exists(example_report)
```

This file can be opened directly in a browser after installation.

## Summary

`vcfheader` provides a quick way to inspect VCF header metadata, validate structure, extract structured definitions, and produce a portable HTML summary for review or reporting.

*For research use only.*

VCFheader is a free and open-source tool provided by Switzerland Omics. Use of the software and generated reports is permitted, including in commercial settings, under the MIT License. Attribution to Switzerland Omics and VCFheader should be retained where reasonably practicable. Further information: . Switzerland Omics® is a registered trade mark.

VCF specification references in this report relate to samtools and the broader HTS specifications ecosystem and are distributed under the MIT/Expat License by Genome Research Ltd. Source: . Further reading on HTS and file format specifications: .