The function make_pred_plot() visualizes the output from
prediction_power() as a heatmap. Each cell shows an
expected conditional entropy value, where lower values indicate stronger
prediction power. Diagonal entries correspond to prediction using a
single predictor, while off-diagonal entries correspond to prediction
using pairs of predictors.
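To make the quantity in each cell concrete, the following is a minimal base R sketch of the textbook conditional entropy, H(Y | X) = H(X, Y) - H(X), measured in bits. The helper names `entropy()` and `cond_entropy()` and the toy data are assumptions made for illustration; they are not part of the package, and the package's internal computation may differ in detail.

```r
# Shannon entropy (in bits) of the joint distribution of all columns in df
entropy <- function(df) {
  tab <- table(df)
  p <- as.vector(tab) / sum(tab)
  p <- p[p > 0]                 # empty cells contribute nothing
  -sum(p * log2(p))
}

# Conditional entropy of `target` given `predictors`: H(X, Y) - H(X)
cond_entropy <- function(df, target, predictors) {
  entropy(df[, c(predictors, target), drop = FALSE]) -
    entropy(df[, predictors, drop = FALSE])
}

# Hypothetical toy data: x determines y exactly, z is unrelated to y
toy <- data.frame(
  y = c(0, 0, 1, 1),
  x = c(0, 0, 1, 1),
  z = c(0, 1, 0, 1)
)

h_given_x  <- cond_entropy(toy, "y", "x")           # single predictor
h_given_z  <- cond_entropy(toy, "y", "z")           # single predictor
h_given_xz <- cond_entropy(toy, "y", c("x", "z"))   # pair of predictors
```

In this toy example the value is 0 bits when the predictor determines the outcome and 1 bit when it carries no information about a balanced binary outcome, which is why lower values are read as stronger prediction power.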
We first edit the node attributes so that all variables have finite categorical range spaces. The variables years and age are discretized into three categories.
# Node attributes are stored as the fourth element of lawdata
df_att <- lawdata[[4]]
att_var <- data.frame(
  status = df_att$status - 1,
  gender = df_att$gender,
  office = df_att$office - 1,
  # years: 0 if <= 3, 1 if 4 to 13, 2 if > 13
  years = ifelse(df_att$years <= 3, 0,
    ifelse(df_att$years <= 13, 1, 2)),
  # age: 0 if <= 35, 1 if 36 to 45, 2 if > 45
  age = ifelse(df_att$age <= 35, 0,
    ifelse(df_att$age <= 45, 1, 2)),
  practice = df_att$practice,
  lawschool = df_att$lawschool - 1
)

The first rows of the edited attribute data are:
##   status gender office years age practice lawschool
## 1      0      1      0     2   2        1         0
## 2      0      1      0     2   2        0         0
## 3      0      1      1     1   2        1         0
## 4      0      1      0     2   2        0         2
## 5      0      1      1     2   2        1         1
## 6      0      1      1     2   2        1         0
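The three-category recoding of years and age above uses nested ifelse() calls. As a possible alternative, the same binning can be sketched with base R's findInterval(). The input vector below is hypothetical and chosen only to include the cut points 3 and 13.

```r
# Hypothetical values, including the cut points 3 and 13
years <- c(1, 3, 4, 13, 14, 30)

# Nested ifelse(), mirroring the recoding used for att_var
nested <- ifelse(years <= 3, 0, ifelse(years <= 13, 1, 2))

# Same recoding with findInterval(); left.open = TRUE makes the
# intervals right-closed: (-Inf, 3], (3, 13], (13, Inf)
binned <- findInterval(years, c(3, 13), left.open = TRUE)

# TRUE when both approaches assign every value to the same category
same_coding <- all(nested == binned)
```

findInterval() returns integers while the nested ifelse() returns doubles, so the two results are compared with `==` rather than identical().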
Assume we are interested in predicting status, which
indicates whether a lawyer is an associate or a partner. We first
compute the prediction power matrix:
##           status gender office years   age practice lawschool
## status        NA     NA     NA    NA    NA       NA        NA
## gender        NA  0.695  0.818 0.404 0.514    0.871     0.800
## office        NA     NA  1.084 0.302 0.526    0.944     0.841
## years         NA     NA     NA 0.927 0.329    0.406     0.322
## age           NA     NA     NA    NA 1.007    0.683     0.617
## practice      NA     NA     NA    NA    NA    1.226     0.916
## lawschool     NA     NA     NA    NA    NA       NA     1.693
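The matrix can also be read programmatically. The sketch below re-enters the printed values by hand (so they are rounded to three decimals) and looks up the smallest diagonal and off-diagonal entries. The object names `pp`, `best_single` and `best_pair` are assumptions for illustration; in practice the matrix returned by prediction_power() would be used directly.

```r
vars <- c("status", "gender", "office", "years", "age", "practice", "lawschool")

# Upper-triangular matrix copied from the printed output above
pp <- matrix(NA_real_, 7, 7, dimnames = list(vars, vars))
pp["gender", 2:7]  <- c(0.695, 0.818, 0.404, 0.514, 0.871, 0.800)
pp["office", 3:7]  <- c(1.084, 0.302, 0.526, 0.944, 0.841)
pp["years", 4:7]   <- c(0.927, 0.329, 0.406, 0.322)
pp["age", 5:7]     <- c(1.007, 0.683, 0.617)
pp["practice", 6:7] <- c(1.226, 0.916)
pp["lawschool", 7] <- 1.693

# Smallest diagonal entry: the single predictor with the lowest value
single <- setNames(diag(pp), vars)
best_single <- names(which.min(single))

# Smallest off-diagonal entry: the pair of predictors with the lowest value
pairs <- pp
diag(pairs) <- NA
idx <- which(pairs == min(pairs, na.rm = TRUE), arr.ind = TRUE)
best_pair <- c(vars[idx[1, "row"]], vars[idx[1, "col"]])
```

With the values printed above, the lowest diagonal entry belongs to gender (0.695) and the lowest off-diagonal entry to the pair office and years (0.302).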
The matrix can be visualized using make_pred_plot():
Darker cells indicate lower expected conditional entropy and therefore stronger prediction power. As in the matrix, diagonal cells show prediction based on a single variable and off-diagonal cells show prediction based on a pair of variables. In this example the lowest value, 0.302, belongs to the pair office and years.
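The "darker means lower" convention can be illustrated independently of the package. The following base R sketch maps a few of the values from the matrix onto a palette that runs from a dark colour to a light one. The palette endpoints and the object names are assumptions for illustration and do not describe how make_pred_plot() chooses its colours internally.

```r
# A few values taken from the prediction power matrix
vals <- c(office_years = 0.302, gender = 0.695, lawschool = 1.693)

# Palette from dark (low values) to light (high values); colours are assumed
pal <- colorRampPalette(c("black", "white"))(50)

# Rescale each value to a palette index between 1 and length(pal)
rng <- range(vals)
idx <- 1 + floor((vals - rng[1]) / diff(rng) * (length(pal) - 1))

# Hex colour assigned to each value
cell_col <- setNames(pal[idx], names(vals))
```

The smallest value receives the darkest colour and the largest the lightest, so the strongest predictors stand out as the darkest cells.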
The colors can be adjusted using the low and
high arguments. For example:
Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.