This R package performs regularization of differential item functioning (DIF) parameters in item response theory (IRT) models using a penalized expectation-maximization algorithm.
regDIF can:
To get the current released version from CRAN:
install.packages("regDIF")

To get the current development version from GitHub:
# install.packages("devtools")
devtools::install_github("wbelzak/regDIF")

A simulated data example with 6 binary item responses and 3 background variables (age, gender, study) is available in the regDIF package:
library(regDIF)
head(ida)
#>   item1 item2 item3 item4 item5 item6 age gender study
#> 1     0     0     0     0     0     0  -2     -1    -1
#> 2     0     0     0     0     0     0   0     -1    -1
#> 3     0     0     0     0     0     0   3     -1    -1
#> 4     0     1     1     1     1     1   1     -1    -1
#> 5     0     0     0     0     0     0  -2     -1    -1
#> 6     1     0     0     0     0     0   1     -1    -1

First, the item responses and predictor values are specified separately:
item.data <- ida[, 1:6]
pred.data <- ida[, 7:9]

Second, the regDIF() function fits a sequence of models over 10
tuning parameter values using a penalized EM algorithm, which assumes
that a normally distributed latent variable affects all item responses:
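To give a feel for what a sequence of tuning values does, the sketch below applies the LASSO soft-thresholding rule to a few made-up DIF estimates over a decreasing grid of 10 tau values. The names `soft_threshold`, `dif_est`, and `tau_grid` are illustrative and not part of regDIF, and the parameter updates inside the package's EM algorithm are more involved than this one-step shrinkage.

```r
# Illustrative only: soft-thresholding, the shrinkage rule behind the LASSO.
# soft_threshold(), dif_est, and tau_grid are hypothetical names, not regDIF API.
soft_threshold <- function(z, tau) {
  sign(z) * pmax(abs(z) - tau, 0)
}

# Made-up unpenalized DIF estimates for four item-by-covariate effects.
dif_est <- c(item4.int.age = 0.30, item5.int.gender = -0.70,
             item2.int.study = 0.05, item3.slp.age = -0.10)

# Ten tuning values, from strong to no penalization.
tau_grid <- seq(0.8, 0, length.out = 10)

# One column per tau value; rows hold the shrunken effects.
path <- sapply(tau_grid, function(tau) soft_threshold(dif_est, tau))
colnames(path) <- round(tau_grid, 3)

# Count of non-zero effects at each tau value.
n_nonzero <- colSums(path != 0)
print(round(path, 3))
print(n_nonzero)
```

Larger tau values set more effects exactly to zero, and smaller values let them re-enter. In regDIF, the size of the grid is set with num.tau, as in the call that follows: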
fit <- regDIF(item.data, pred.data, num.tau = 10)

The DIF results are shown below:
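The summary reports a single optimal model together with its tau and BIC values. Assuming the optimal model is the one with the smallest BIC across the fitted tuning values, the selection step can be sketched with made-up numbers. Here `tau_vals` and `bic_vals` are hypothetical vectors, not objects returned by regDIF.

```r
# Illustrative only: choose the tuning value with the smallest BIC.
# tau_vals and bic_vals are made-up numbers, not regDIF output.
tau_vals <- c(0.40, 0.30, 0.20, 0.10, 0.05)
bic_vals <- c(4105.2, 4090.7, 4081.9, 4086.3, 4093.8)

best <- which.min(bic_vals)  # index of the smallest BIC
print(c(tau = tau_vals[best], bic = bic_vals[best]))
```

For the fitted object, summary() prints the selected values directly: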
summary(fit)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, num.tau = 10)
#> 
#> Optimal model (out of 10):
#>          tau          bic 
#>    0.1753246 4081.6941000 
#> 
#> Non-zero DIF effects:
#>    item4.int.age    item5.int.age item5.int.gender  item5.int.study 
#>           0.2153          -0.0897          -0.5717           0.6018 
#>  item4.slp.study item5.slp.gender 
#>          -0.0936          -0.1764

When estimation is slow, proxy data may be used in place of latent score estimation:
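A proxy is an observed stand-in for the latent score, and a simple choice is each respondent's sum score. The sketch below builds one from a small made-up response matrix; `resp` and `sum_score` are illustrative names, not regDIF objects.

```r
# Illustrative only: a sum-score proxy from made-up binary responses.
# resp and sum_score are hypothetical names, not regDIF objects.
resp <- data.frame(
  item1 = c(0, 1, 1, 0, 1),
  item2 = c(0, 1, 0, 0, 1),
  item3 = c(0, 1, 1, 1, 1)
)

# One proxy value per respondent: the number of items endorsed.
sum_score <- rowSums(resp)
print(sum_score)

# The proxy needs one entry per row of the item data.
stopifnot(length(sum_score) == nrow(resp))
```

With the ida data, the same idea is passed to regDIF() through the prox.data argument: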
fit_proxy <- regDIF(item.data, pred.data, prox.data = rowSums(item.data))
summary(fit_proxy)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, prox.data = rowSums(item.data))
#> 
#> Optimal model (out of 100):
#>          tau          bic 
#>    0.2766486 3540.8070000 
#> 
#> Non-zero DIF effects:
#> item3.int.gender    item4.int.age item5.int.gender  item5.int.study 
#>           0.0955           0.2200          -0.5118           0.7040 
#> item2.slp.gender  item4.slp.study item5.slp.gender 
#>           0.1102          -0.1413          -0.1384

Other penalty functions (besides LASSO) may also be used. For
instance, the elastic net penalty uses a second tuning parameter,
alpha, to vary the ratio of LASSO to ridge penalties:
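To make the role of alpha concrete, the sketch below evaluates an elastic net penalty under one common parameterization, tau * sum(alpha * |b| + (1 - alpha) * b^2 / 2), which is the convention used by glmnet. This form is an assumption made for illustration, and regDIF's internal scaling may differ; `enet_penalty` and `b` are hypothetical names.

```r
# Illustrative only: elastic net penalty in the glmnet-style parameterization.
# enet_penalty() and b are hypothetical names; regDIF's scaling may differ.
enet_penalty <- function(b, tau, alpha) {
  tau * sum(alpha * abs(b) + (1 - alpha) * b^2 / 2)
}

# Made-up DIF parameter values.
b <- c(0.5, -0.2, 0)

lasso_only <- enet_penalty(b, tau = 1, alpha = 1)    # sum(|b|)     = 0.7
ridge_only <- enet_penalty(b, tau = 1, alpha = 0)    # sum(b^2) / 2 = 0.145
mixed      <- enet_penalty(b, tau = 1, alpha = 0.5)  # halfway mix  = 0.4225
print(c(lasso = lasso_only, ridge = ridge_only, mixed = mixed))
```

Under this parameterization, alpha = 1 gives the pure LASSO penalty and alpha = 0 the pure ridge penalty. In regDIF the mix is set with the alpha argument, as in the following call: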
fit_proxy_net <- regDIF(item.data, pred.data, prox.data = rowSums(item.data), alpha = .5)
summary(fit_proxy_net)
#> Call:
#> regDIF(item.data = item.data, pred.data = pred.data, prox.data = rowSums(item.data), 
#>     alpha = 0.5)
#> 
#> Optimal model (out of 100):
#>          tau          bic 
#>    0.5685967 3563.7495000 
#> 
#> Non-zero DIF effects:
#> item3.int.gender    item4.int.age    item5.int.age item5.int.gender 
#>           0.0681           0.1672          -0.0939          -0.3463 
#>  item5.int.study item2.slp.gender  item4.slp.study item5.slp.gender 
#>           0.4346           0.0778          -0.1172          -0.1379

Please send any questions to wbelzak@gmail.com.