--- title: Using the rip.opencv package author: Deepayan Sarkar and Kaustav Nandy vignette: > %\VignetteIndexEntry{Using the rip.opencv package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r opts, echo = FALSE, results = "hide", warning = FALSE, message = FALSE} knitr::opts_chunk$set(dev = "jpeg", cache = FALSE, autodep = TRUE, echo = TRUE, prompt = FALSE, comment = NA, warning = TRUE, message = TRUE, knitr.table.format = "html", fig.width = 6, fig.height = 7, dpi = 96) ``` # Introduction The `rip.opencv` package provides access to selected routines in the [OpenCV](https://opencv.org/) computer vision library. Rather than exposing OpenCV classes and methods directly through external pointers, the package uses standard R objects to represent corresponding OpenCV objects, and explicitly converts between the two forms as necessary. The idea behind this design is that most operations will be performed in R, and OpenCV is used to simply make an additional suite of operations available. Depending on use case, the [opencv](https://github.com/ropensci/opencv) R package may be more useful for you, and a workflow mixing the two is facilitated by the image import / export functions in the respective packages. As of now, the only OpenCV class that has an R analogue is [`cv::Mat`](https://docs.opencv.org/master/d3/d63/classcv_1_1Mat.html#details), which can be used to represent an n-dimensional dense numerical single-channel or multi-channel array. It is used by OpenCV to store real or complex-valued vectors and matrices, grayscale or color images, and various other kinds of data. The R analogue of `cv::Mat` is an S3 class named `"rip"`. Regardless of the number of channels, a `"rip"` object is stored as a matrix, with the number of channels recorded in an attribute along with the number of rows and columns. A `"rip"` object can be created in R, most simply from a numeric (or complex) matrix. ```{r} suppressMessages(library(rip.opencv)) v <- as.rip(t(volcano), channel = 1) v class(v) str(v) ``` An `image()` method for `"rip"` objects can be used to plot it. ```{r,fig.height=5} image(v) ``` \ None of the above involves calling any OpenCV routines. Suppose we now want to use the OpenCV function [cv::dft()](https://docs.opencv.org/master/d2/de8/group__core__array.html#gadd6cf9baf2b8b704a11b5f04aaf4f39d) to obtain a 2-D Discrete Fourier transform of `v`. This can be done using the high-level `rip.dft()` function as follows: ```{r} V <- rip.dft(v) # uses OpenCV function cv::dft() V str(V) ``` The return value is again an R matrix containing complex values wrapped in the `"rip"` class. We can use standard R functions for further processing. ```{r,fig.width=10, fig.height=4.5} par(mfrow = c(1, 2)) image(log(Mod(V)), main = "log-modulus of DFT coefficients") image(Arg(V), main = "argument of DFT coefficients") ``` \ # Using the `"rip"` object for image manipulation The most common use of a `"rip"` object is to represent a grayscale or color image. Grayscale images are essentially matrices, and their representation as a `"rip"` object is conceptually straightforward. Color images require multiple channels, and depending on the color space used, many different representations are possible. The `"rip"` class uses one particular representation, corresponding to the default `cv::Mat` representation for color images: For an n-channel image, rows correspond to rows of the image, and successive n-tuples of columns represent the n channels corresponding to each column of the the image (i.e., successive columns do not represent the same channel). When plotting a color image using the `image()` function, it is assumed that the channels represent colors in the RGB color space and that columns are in the BGR ordering (this is the OpenCV default). The `rip.import()` function uses the OpenCV function `cv::imread()` to read image data from a file. ```{r} logo_path <- file.path(R.home(), "doc/html/logo.jpg") rlogo <- rip.import(logo_path, type = "color") rlogo dim(rlogo) str(rlogo) ``` __Note__ that although the image is 155 x 200, the underlying representation is a 155 x 600 matrix. Extracting and manipulating individual channels in R is difficult with this representation. To facilitate such manipulation, the `as.array()` method can convert such an image into a 3-way array, with the third dimension representing channels in RGB order by default. By design, no array-like indexing operators have been defined for `"rip"` objects; vector and matrix-like indexing yields vectors and matrices, dropping the class attribute. ```{r} a <- as.array(rlogo) # add reverse.rgb = FALSE to retain column order str(a) ``` Individual channels can now be easily extracted. The `as.rip()` function can also handle 3-way arrays as input, interpreting it as a color image. This allows channels to be manipulated retaining the array structure before further processing. ```{r,fig.width=10, fig.height=4} par(mfrow = c(1,2)) N <- prod(dim(a)[1:2]) red <- a; red[,,-1] <- runif(N, 0, 255); image(as.rip(red)) green <- a; green[,,-2] <- runif(N, 0, 255); image(as.rip(green)) ``` \ Similar methods are also available to convert to and from `"raster"` objects, including those created by the `png` and `jpeg` packages. In fact, the `image()` method essentially converts a `"rip"` object to a `"raster"` object before plotting. # Modules The `rip.dft()` and `rip.import()` functions are exceptions rather than the rule, in the sense that they are two of the very few high-level functions available in the `rip.opencv` package. Most functionality is instead exposed through low-level interfaces to selected OpenCV routines through Rcpp modules. These are not expected to be called directly by the end-user, and are rather meant to be used by other packages; careless use may lead to the R session crashing. All modules in the package are available through the `rip.cv` variable, which is an environment containing named modules that group together similar functionality. ```{r} ls(rip.cv) ``` The grouping is somewhat arbitrary and may change as the package evolves. For example, the `IO` module contains interfaces to the `cv::imread()` and `cv::imwrite()` functions, which are also exposed through the high-level functions `rip.import()` and `rip.export()`. However, one can also call a function in the module directly. ```{r} rip.cv$IO$imread ``` We will use this function to read in a sample image. The result depends on a "read mode" flag, which is an integer code (the OpenCV documentation has details) that can be supplied explicitly. Some selected flags are available as named integer vectors in `rip.cv$enums`; these are not exhaustive but covers most common use cases. The following example reads in a color image in grayscale mode. ```{r} f <- system.file("sample/color.jpg", package = "rip.opencv", mustWork = TRUE) (imreadModes <- rip.cv$enums$ImreadModes) (x <- rip.cv$IO$imread(f, imreadModes["IMREAD_GRAYSCALE"])) ``` Once imported, the `"rip"` object can be manipulated as usual. ```{r,fig.width=10, fig.height=6} par(mfrow = c(1, 2)) image(x, rescale = FALSE) image(255 - x, rescale = FALSE) # negative ``` \ ## Example: convert color image to grayscale We can of course read in the image in color mode as well. ```{r} (x <- rip.cv$IO$imread(f, imreadModes["IMREAD_COLOR"])) ``` Suppose we now want to convert it into a grayscale image. One way to do so is to use the `cv::decolor()` function, which implements a contrast-preserving decolorization algorithm. Another alternative is to use the `cv::cvtColor()`. ```{r,fig.width=10, fig.height=3} y1 <- rip.cv$photo$decolor(x) convcodes <- rip.cv$enums$ColorConversionCodes y2 <- rip.cv$imgproc$cvtColor(x, convcodes["COLOR_BGR2GRAY"]) range(y1 - y2) par(mfrow = c(1, 4)) image(x, rescale = FALSE, main = "original") image(y1, rescale = FALSE, main = "decolor") image(y2, rescale = FALSE, main = "cvtColor") image(y1 - y2, rescale = TRUE, main = "difference") ``` \ The `cv::cvtColor()` function is designed for more general color space conversion, and can be used, for example, to go from the default BGR channel ordering to RGB, or an entirely different colorspace such as HSV. The `image()` function always assumes BGR or grayscale, so it will be confused by such changes. ```{r,fig.width=10, fig.height=6} par(mfrow = c(1, 2)) x.rgb <- rip.cv$imgproc$cvtColor(x, convcodes["COLOR_BGR2RGB"]) x.hsv <- rip.cv$imgproc$cvtColor(x, convcodes["COLOR_BGR2HSV"]) image(x.rgb) image(x.hsv) ``` \ ```{r,fig.width=10, fig.height=4} par(mfrow = c(1, 3)) image(rip.cv$photo$pencilSketch(x, color = TRUE, 80, 0.1, 0.02), rescale = FALSE) image(y <- rip.cv$photo$edgePreservingFilter(x, 2L, 60), rescale = FALSE) image(rip.cv$photo$stylization(y, 60), rescale = FALSE) ``` \ ## Other modules A full list of currently available modules and the functions in them can be obtained as follows (FIXME: find better way). ```{r} do.call(rbind, lapply(ls(rip.cv), function(m) data.frame(Module = m, Function = .DollarNames(rip.cv[[m]], "")))) ``` The choice of functions as well as their organization into modules are somewhat arbitrary, and details may change as the package evolves. Not all of these are well tested, and most are thin wrappers around OpenCV functions that take one or more `cv::Mat` objects as input and produce one as output. More routines may be added in future; an incomplete list of potential candidates are available [here](https://github.com/deepayan/rip/blob/main/rip.opencv/README.md). It is of course perfectly reasonable to want operations that combine multiple OpenCV functions, especially those that involve structures and classes more complicated than `cv::Mat`. Such functions (written in OpenCV) can also be wrapped, although there are few examples as of now. A very simple example is `rip.cv$IO$vfread`, which reads in a single frame from a video file. # High level wrappers Apart from the modules discussed above, the package provides a few high level functions for common tasks that do more error checking and give a more R like interface. These include: - `rip.import()`, `rip.export()` for file import and export respectively, - `rip.desaturate()` for converting color images to grayscale, supporting several methods, - `rip.pad()` to add borders to a matrix with various kinds of border extrapolation, - `rip.resize()` for resizing matrices using various kinds of interpolation (including bicubic and bilinear), - `rip.blur()` for various kinds of blurring, - `rip.flip()` to reverse row and column order, and - `rip.filter()` for filtering and convolution. In addition, the `rip.dft()` function, and its normalized version `rip.ndft()`, are designed to compute the 2-D DFT of a matrix and its inverse. These are particularly useful because they provide a non-trivial amount of sugar around `cv::dft()` in terms of handling complex values and combining various DFT flags. A related function `rip.shift()` shifts rows and columns by half, which is useful for changing DFT coefficients from a $[0, 2\pi]$ range to $[-\pi, \pi]$ and back. The project page on Gihub can be used for bug reports and patches.