---
title: "Introduction to the TDCM Package"
author:
  - "Matthew J. Madison"
  - "Michael E. Cotterell"
date: "January 2024"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to the TDCM Package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  dpi = 300,
  out.width = "100%"
) # set output options
```

## Overview of the TDCM Package

The **TDCM** R package implements estimation of longitudinal diagnostic classification models (DCMs) using the transition diagnostic classification model (TDCM) framework described in [Madison & Bradshaw (2018)](https://doi.org/10.1007/s11336-018-9638-5). The TDCM is a longitudinal extension of the log-linear cognitive diagnosis model (LCDM) developed by [Henson, Templin & Willse (2009)](https://doi.org/10.1007/s11336-008-9089-5). Because the LCDM is a general DCM, many other DCMs can be embedded within the TDCM.

The **TDCM** package includes functions to estimate the single group (`TDCM::tdcm()`) and multigroup (`TDCM::mg.tdcm()`) TDCM and to summarize results of interest, including item parameter estimates, growth proportions, transition probabilities, transition reliability, attribute correlations, model fit, and growth plots. Internally, the **TDCM** package uses `CDM::gdina()` from the **CDM** package developed by [Robitzsch et al. (2022)](https://doi.org/10.18637/jss.v074.i02) to estimate TDCMs using a method described in [Madison et al. (2024)](https://doi.org/10.1007/s41237-023-00202-5).

This vignette provides an overview of the package's core functionality by walking through two examples. The code below can be copied into the R console and run. For more detailed video demonstrations of the package and its functionality, visit Matthew J. Madison's [Longitudinal DCMs page](http://www.matthewmadison.com/longdcms.html).

## Core Functionalities

* To estimate the single group and multigroup TDCM, use the `TDCM::tdcm()` and `TDCM::mg.tdcm()` functions, respectively.
* To extract item, person, and structural parameters from TDCM estimates, use the `TDCM::tdcm.summary()` and `TDCM::mg.tdcm.summary()` functions for single group and multigroup analyses, respectively. These summary functions produce a list of results that includes item parameter estimates, growth proportions, transition probability matrices, transition reliability, attribute correlations, and model fit.
* To compare TDCMs and assess relative fit, use the `TDCM::tdcm.compare()` function.
* To plot the results of a TDCM analysis, use the `TDCM::tdcm.plot()` function.
* To score responses using fixed item parameters from a previously calibrated model, use the `TDCM::tdcm.score()` function.

## Extended Functionalities

* Different DCMs (e.g., DINA, ACDM) can be modeled using the `TDCM::tdcm()` function by supplying an argument for its `rule` parameter. These correspond to the condensation rules available via `CDM::gdina()`.
* A different Q-matrix for each time point is supported by the `TDCM::tdcm()` function. To enable this functionality, an argument $\geq 2$ must be supplied for its `num.q.matrix` parameter, and an appropriately stacked Q-matrix must be supplied for its `q.matrix` parameter.
* Anchor (common) items between time points can be specified with the `anchor` parameter.
* For more than two time points, transitions can be defined differently (e.g., first-to-last, first-to-each, successive) with the `transition.option` parameter.
* Responses can be scored using fixed item parameters from a previously calibrated model using the `TDCM::tdcm.score()` function.

## Example 1: Single Group TDCM

Suppose we have a sample of 1000 fourth grade students.
They were assessed before and after a unit covering 4 measurement and data (MD) standards (attributes):

```{r, eval = TRUE}
standards <- paste0("4.MD.", 1:4)
standards
```

The students took the same 20-item assessment, five weeks apart. The goal is to examine how the students transition to proficiency on the 4 assessed attributes.

### Step 1: Load the Package and Sample Dataset

```{r, eval = TRUE}
# Load the TDCM package and sample dataset
library(TDCM)
data(data.tdcm01, package = "TDCM")

# Get item responses from sample data.
data <- data.tdcm01$data
head(data)

# Get Q-matrix from sample data and rename the attributes to match the standard.
q.matrix <- data.tdcm01$q.matrix
colnames(q.matrix) <- standards
q.matrix
```

### Step 2: Estimate the TDCM

To estimate the TDCM, let's make some decisions. The Q-matrix has some complex items measuring 2 attributes, so we initially estimate the full LCDM with two-way interactions (default). Since the students took the same assessment, we can assume measurement invariance and will test the assumption later.

```{r, eval = TRUE}
# Calibrate TDCM with measurement invariance assumed, full LCDM
model1 <- tdcm(data, q.matrix, num.time.points = 2)
```

### Step 3: Summarize the Results

To summarize results, use the `TDCM::tdcm.summary()` function. After running the summary function, we can examine item parameters, growth in attribute proficiency, transition probability matrices, individual transitions, and transitional reliability estimates.

```{r, eval = TRUE}
# Summarize the results
results1 <- tdcm.summary(model1, num.time.points = 2, attribute.names = standards)
```

To demonstrate interpretation, let's discuss some of the results.

```{r, eval = TRUE}
item.parameters <- results1$item.parameters
item.parameters
```

Item 1 measuring `4.MD.1` has an intercept estimate of `r results1$item.parameters[1]` and a main effect estimate of `r results1$item.parameters[2]`.

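To make the intercept and main effect concrete, the sketch below converts LCDM item parameters for an item measuring a single attribute into response probabilities with the inverse logit. The parameter values are illustrative placeholders rather than the estimates printed above, and `lcdm_item_probs()` is a hypothetical helper, not a function from the **TDCM** package.

```r
# Hypothetical helper (not part of TDCM): response probabilities for an item
# measuring one attribute under the LCDM, where
#   logit P(X = 1 | alpha) = intercept + main.effect * alpha
lcdm_item_probs <- function(intercept, main.effect) {
  c(
    non.proficient = plogis(intercept),
    proficient     = plogis(intercept + main.effect)
  )
}

# Illustrative values only; substitute the estimates from results1$item.parameters.
probs <- lcdm_item_probs(intercept = -1.87, main.effect = 2.375)
round(probs, 3)
```

With these placeholder values, a non-proficient student answers correctly with probability of about .13, and a proficient student with probability of about .62.
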
```{r, eval = TRUE}
growth <- results1$growth
growth
```

```{r, include = FALSE}
growth.change <- growth[, 2] - growth[, 1]
growth.similar <- paste0(round(mean(growth.change[1:3]) * 100, digits = 2), "%")
growth.outlier <- paste0(round(growth.change[4] * 100, digits = 2), "%")
```

With respect to growth, we see that students exhibited about the same amount of growth for `4.MD.1`, `4.MD.2`, and `4.MD.3` (about `r growth.similar` growth in proficiency), but showed larger gains for `4.MD.4` (about `r growth.outlier`).

```{r, eval = TRUE}
transition.probabilities <- results1$transition.probabilities
transition.probabilities
```

```{r, include = FALSE}
p1 <- transition.probabilities[,, 1]
p1 <- round(p1[[1,2]] * 100, digits = 2)
p1 <- paste0(p1, "%")
```

Examining the `4.MD.1` transition probability matrix, we see that of the students who started in non-proficiency, `r p1` of them transitioned into proficiency.

```{r, eval = TRUE}
transition.posteriors <- results1$transition.posteriors
head(transition.posteriors)
```

```{r, include = FALSE}
maxp01 <- max(head(transition.posteriors[,,1][,2]))
```

Examining the individual transition posterior probabilities, we see that Examinee 1 has a most likely transition of 0 → 1 (`r maxp01` probability).

```{r, eval = TRUE}
results1$reliability
```

Finally, transition reliability appears adequate, with average maximum transition posteriors ranging from .88 to .92 for the four attributes.

### Step 4: Assess Measurement Invariance

To assess measurement invariance, let's estimate a model without invariance assumed, then compare it to our first model. Here we see that AIC, BIC, and the likelihood ratio test prefer the model with invariance assumed. Therefore, item parameter invariance is a reasonable assumption and we can interpret results.

```{r, eval = TRUE}
# Estimate TDCM with measurement invariance not assumed.
model2 <- tdcm(data, q.matrix, num.time.points = 2, invariance = FALSE)

# Compare Model 1 (longitudinal invariance assumed) to Model 2 (invariance not assumed).
tdcm.compare(model1, model2)
```

### Step 5: Estimate other DCMs

To estimate other DCMs, change the `rule` argument. To specify one DCM across all items, include one specification. To specify a different DCM on each item, use a vector with length equal to the number of items. Below, we specify a DINA measurement model and a main effects model (ACDM). The comparisons show that the full LCDM fits better than both the DINA model and the main effects model.

```{r, eval = TRUE}
# Calibrate TDCM with measurement invariance assumed, DINA measurement model
model3 <- tdcm(data, q.matrix, num.time.points = 2, rule = "DINA")

# Calibrate TDCM with measurement invariance assumed, ACDM measurement model
model4 <- tdcm(data, q.matrix, num.time.points = 2, rule = "ACDM")

# Compare Model 1 (full LCDM) to Model 3 (DINA)
tdcm.compare(model1, model3)

# Compare Model 1 (full LCDM) to Model 4 (ACDM)
tdcm.compare(model1, model4)
```

### Step 6: Assess Absolute Fit

To assess absolute fit, extract model fit statistics from the results summary.

```{r, eval = TRUE}
results1$model.fit$Global.Fit.Stats
results1$model.fit$Global.Fit.Tests
results1$model.fit$Global.Fit.Stats2
results1$model.fit$Item.RMSEA
results1$model.fit$Mean.Item.RMSEA
```

### Step 7: Visualize

For a visual presentation of results, run the `tdcm.plot()` function:

```{r, eval = FALSE}
# plot results (check plot viewer for line plot and bar chart)
tdcm.plot(results1, attribute.names = standards)
```

## Example 2: Multigroup TDCM

Suppose now that we have a sample of 1700 fourth grade students. In this example, researchers wanted to evaluate the effects of an instructional intervention, so they randomly assigned students to either the control group (Group 1, N1 = 800) or the treatment group (Group 2, N2 = 900).
The goal was to see if the innovative instructional method resulted in more students transitioning into proficiency. As in Example 1, students were assessed before and after a unit covering four measurement and data (MD) standards (attributes; 4.MD.1 - 4.MD.4). The students took the same 20-item assessment five weeks apart.

**Step 1:** Load the package and Dataset #4 included in the package:

```{r, eval = TRUE}
#load the TDCM library
library(TDCM)

#read data, Q-matrix, and group labels
dat4 <- data.tdcm04$data
qmat4 <- data.tdcm04$q.matrix
groups <- data.tdcm04$groups
head(dat4)
```

**Step 2:** To estimate the multigroup TDCM, we will use the **mg.tdcm()** function. For this initial model, we will assume item invariance and group invariance. In the next step, we will test these assumptions.

```{r, eval = TRUE}
#calibrate mgTDCM with item and group invariance assumed, full LCDM
mg1 <- mg.tdcm(
  data = dat4, q.matrix = qmat4, num.time.points = 2, rule = "GDINA",
  groups = groups, group.invariance = TRUE, item.invariance = TRUE
)
```

**Step 3:** To assess measurement invariance, let's estimate three additional models:

- A model assuming item invariance (TRUE) and not assuming group invariance (FALSE)
- A model not assuming item invariance (FALSE) and assuming group invariance (TRUE)
- A model assuming neither item invariance (FALSE) nor group invariance (FALSE)

All model comparisons prefer the model with group and item invariance. Therefore, we can proceed to interpret Model 1.

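For intuition about what such comparisons weigh, the sketch below computes AIC, BIC, and a likelihood ratio test for two nested models from their log-likelihoods and parameter counts. All numbers are made up for illustration, and `compare_nested()` is a hypothetical helper rather than part of the **TDCM** package; the output of `tdcm.compare()` for fitted models may be laid out differently.

```r
# Hypothetical helper (not part of TDCM): compare a constrained model against
# a more general model that nests it.
compare_nested <- function(loglik0, npar0, loglik1, npar1, n) {
  ic <- function(ll, k) c(AIC = -2 * ll + 2 * k, BIC = -2 * ll + k * log(n))
  lr <- 2 * (loglik1 - loglik0)   # likelihood ratio statistic
  df <- npar1 - npar0             # number of extra free parameters
  list(
    constrained = ic(loglik0, npar0),
    general     = ic(loglik1, npar1),
    lrt         = c(statistic = lr, df = df,
                    p.value = pchisq(lr, df, lower.tail = FALSE))
  )
}

# Made-up values: the general model gains little fit for 40 extra parameters.
cmp <- compare_nested(loglik0 = -20500, npar0 = 60,
                      loglik1 = -20485, npar1 = 100, n = 1700)
cmp
```

In this made-up case, the lower AIC and BIC and the non-significant likelihood ratio test all favor the constrained (invariance) model.
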
```{r, eval = TRUE}
#calibrate mgTDCM with item invariance assumed, full LCDM
mg2 <- mg.tdcm(
  data = dat4, q.matrix = qmat4, num.time.points = 2,
  groups = groups, group.invariance = FALSE, item.invariance = TRUE
)

#calibrate mgTDCM with group invariance assumed, full LCDM
mg3 <- mg.tdcm(
  data = dat4, q.matrix = qmat4, num.time.points = 2,
  groups = groups, group.invariance = TRUE, item.invariance = FALSE
)

#calibrate mgTDCM with no invariance assumed, full LCDM
mg4 <- mg.tdcm(
  data = dat4, q.matrix = qmat4, num.time.points = 2,
  groups = groups, group.invariance = FALSE, item.invariance = FALSE
)

#compare Model 1 (group/item invariance) to Model 2 (no group invariance)
tdcm.compare(mg1, mg2)

#compare Model 1 (group/item invariance) to Model 3 (no item invariance)
tdcm.compare(mg1, mg3)

#compare Model 1 (group/item invariance) to Model 4 (no invariance)
tdcm.compare(mg1, mg4)
```

**Step 4:** To summarize results, use the **mg.tdcm.summary()** function. After running the summary function, we can examine item parameters, growth in attribute proficiency by group, transition probability matrices by group, individual transitions, and transitional reliability estimates.

To demonstrate interpretation, let's discuss some of the results. Item 1 measuring 4.MD.1 has an intercept estimate of -1.87 and a main effect estimate of 2.375. With respect to growth, we first see that the randomization appeared to work, as both groups had similar proficiency proportions at the first assessment. We then see that for all but the 4.MD.4 attribute, the treatment group showed increased growth in attribute proficiency.

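The group comparison described above amounts to a difference in differences. The following is a minimal sketch using made-up proficiency proportions for a single attribute; the numbers are not estimates from `data.tdcm04`, and the layout of `resultsmg1$growth` may differ from this hypothetical matrix.

```r
# Made-up proficiency proportions for one attribute; rows are groups,
# columns are time points. These are NOT estimates from data.tdcm04.
prof <- matrix(c(0.20, 0.45,
                 0.21, 0.60),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("Control", "Treatment"), c("T1", "T2")))

# Growth within each group, then the treatment-minus-control difference.
growth.by.group <- prof[, "T2"] - prof[, "T1"]
growth.diff <- growth.by.group[["Treatment"]] - growth.by.group[["Control"]]

growth.by.group
growth.diff
```

Similar baseline proportions (column `T1`) are what one would expect under successful randomization, and a positive `growth.diff` indicates larger gains in the treatment group.
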
```{r, eval = TRUE}
#summarize results
resultsmg1 <- mg.tdcm.summary(
  mg1, num.time.points = 2,
  attribute.names = c("4.MD.1", "4.MD.2", "4.MD.3", "4.MD.4"),
  group.names = c("Control", "Treatment")
)
resultsmg1$item.parameters
resultsmg1$growth
resultsmg1$transition.probabilities
head(resultsmg1$transition.posteriors)
resultsmg1$reliability
```

**Step 5:** For a visual presentation of results, run the **tdcm.plot()** function:

```{r, eval = TRUE}
#plot results (check plot viewer for line plots and bar charts)
tdcm.plot(
  resultsmg1,
  attribute.names = c("4.MD.1", "4.MD.2", "4.MD.3", "4.MD.4"),
  group.names = c("Control", "Treatment")
)
```

## References

* Madison, M. J., Chung, S., Kim, J., & Bradshaw, L. P. (2024). Approaches to estimating longitudinal diagnostic classification models. _Behaviormetrika_, 51, 7–19. doi:10.1007/s41237-023-00202-5
* George, A. C., Robitzsch, A., Kiefer, T., Gross, J., & Uenlue, A. (2016). The R Package CDM for cognitive diagnosis models. _Journal of Statistical Software_, 74(2), 1–24. doi:10.18637/jss.v074.i02
* Robitzsch, A., Kiefer, T., George, A. C., & Uenlue, A. (2022). _CDM: Cognitive Diagnosis Modeling_. R package version 8.2-6. https://CRAN.R-project.org/package=CDM