---
title: "Prediction Power Heatmaps"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Prediction Power Heatmaps}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(out.width = "100%", cache = FALSE)
```

The function `make_pred_plot()` visualizes the output from `prediction_power()` as a heatmap. Each cell shows an expected conditional entropy value, where lower values indicate stronger prediction power. Diagonal entries correspond to prediction using a single predictor, while off-diagonal entries correspond to prediction using pairs of predictors.

```{r load-lib}
library(netropy)
```

We first edit the node attributes so that all variables have finite categorical range spaces. The variables `years` and `age` are discretized into three categories.

```{r data-edit}
df_att <- lawdata[[4]]
att_var <- data.frame(
  status = df_att$status - 1,
  gender = df_att$gender,
  office = df_att$office - 1,
  years = ifelse(df_att$years <= 3, 0,
    ifelse(df_att$years <= 13, 1, 2)
  ),
  age = ifelse(df_att$age <= 35, 0,
    ifelse(df_att$age <= 45, 1, 2)
  ),
  practice = df_att$practice,
  lawschool = df_att$lawschool - 1
)
```

The first rows of the edited attribute data are:

```{r}
head(att_var)
```

## Prediction Power

Assume we are interested in predicting `status`, which indicates whether a lawyer is an associate or a partner. We first compute the prediction power matrix:

```{r pred-pow}
pred_status <- prediction_power("status", att_var)
pred_status
```

### Heatmap Visualization

The matrix can be visualized using `make_pred_plot()`:

```{r pred-plot, fig.height=7, fig.width=8}
make_pred_plot(pred_status, "Prediction Power for Status")
```

Darker cells indicate lower expected conditional entropy and therefore stronger prediction power. The diagonal entries show prediction based on one variable, while the off-diagonal entries show prediction based on pairs of variables.

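
To make the cell values concrete, the following base R sketch computes an expected conditional entropy by hand on a small toy data set. It assumes the quantity is the difference of joint entropies, EH(Z | X) = H(X, Z) - H(X), measured in bits. The helpers `entropy_bits()` and `cond_entropy()` and the data frame `toy` are hypothetical illustrations and are not part of netropy, so the results are not guaranteed to match `prediction_power()` exactly.

```{r cond-entropy-sketch}
# Shannon entropy (base 2) of the joint distribution of all columns in `df`.
# Hypothetical helper, not a netropy function.
entropy_bits <- function(df) {
  p <- as.vector(table(df)) / nrow(df) # joint relative frequencies
  p <- p[p > 0]                        # drop empty cells (0 * log 0 = 0)
  -sum(p * log2(p))
}

# Assumed definition:
# EH(target | predictors) = H(predictors, target) - H(predictors)
cond_entropy <- function(df, target, predictors) {
  entropy_bits(df[, c(predictors, target), drop = FALSE]) -
    entropy_bits(df[, predictors, drop = FALSE])
}

# Toy data: x is an exact copy of z, y is unrelated to z
toy <- data.frame(
  z = c(0, 0, 1, 1, 0, 0, 1, 1),
  x = c(0, 0, 1, 1, 0, 0, 1, 1),
  y = c(0, 1, 0, 1, 0, 1, 0, 1)
)

cond_entropy(toy, "z", "x")         # 0: x fully determines z
cond_entropy(toy, "z", "y")         # 1: y leaves the full 1 bit of uncertainty
cond_entropy(toy, "z", c("x", "y")) # 0: the pair still determines z
```

In heatmap terms, the single predictor `x` and the pair (`x`, `y`) would appear as the darkest cells, while `y` alone would appear as a light diagonal cell.
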
### Changing Plot Colors

The colors can be adjusted using the `low` and `high` arguments. For example:

```{r col-change, fig.height=7, fig.width=8}
make_pred_plot(
  pred_status,
  "Prediction Power for Status",
  low = "steelblue",
  high = "white"
)
```

### Changing Text Size

The size of the cell labels can be controlled with `text_size`:

```{r text-change, fig.height=7, fig.width=8}
make_pred_plot(
  pred_status,
  "Prediction Power for Status",
  text_size = 6
)
```

## References

> Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. *Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique*, 129(1), 45-63. [link](https://doi.org/10.1177%2F0759106315615511)