---
title: "Pathway case"
author: "Ingo Rohlfing"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Pathway case}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(MMRcaseselection)
```

The pathway case was originally proposed by Gerring ([2007](https://doi.org/10.1177%2F0010414006290784)). He defines the *pathway value* of a case as $|resid_{i_{reduced}}-resid_{i_{full}}|$, provided that $|resid_{i_{reduced}}| > |resid_{i_{full}}|$. Here, 'full' stands for the full regression model, 'reduced' for the model that lacks the pathway variable of theoretical interest, and $i$ is the case index. Following Gerring, one should only choose among the cases that meet the requirement $|resid_{i_{reduced}}| > |resid_{i_{full}}|$, that is, cases that the full model fits better than the reduced model.

In follow-up research, Weller and Barnes ([2014](https://doi.org/10.1017/CBO9781139644501)) propose a different calculation of the pathway value, $|resid_{i_{reduced}}|-|resid_{i_{full}}|$, without specifying an additional requirement about the relationship between the full model residuals and the reduced model residuals.

The function `pathway()` calculates both types of pathway values. It requires the full regression model and the reduced regression model as input, and both models must be `lm` objects. The data frame returned by the function contains all variables from the full model plus the following variables:

- `full_resid`: Residuals in the full model
- `reduced_resid`: Residuals in the reduced model
- `pathway_wb`: Pathway values as proposed by Weller and Barnes ($|resid_{i_{reduced}}|-|resid_{i_{full}}|$)
- `pathway_gvalue`: Pathway values as proposed by Gerring ($|resid_{i_{reduced}}-resid_{i_{full}}|$)
- `pathway_gstatus`: Binary character variable that is coded "yes" if $|resid_{i_{reduced}}| > |resid_{i_{full}}|$ is met and "no" otherwise.

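
The definitions above can be reproduced by hand, which may help to see how the two pathway values relate to each other. The following is a minimal sketch in base R only. It is not the internal implementation of `pathway()`, and the object names (`m_full`, `m_reduced`, `manual`) are illustrative assumptions.

```{r}
# Illustrative sketch in base R; not the package's internal implementation.
# Object names (m_full, m_reduced, manual) are hypothetical.
m_full    <- lm(mpg ~ disp + wt, data = mtcars) # full model
m_reduced <- lm(mpg ~ wt, data = mtcars)        # reduced model without 'disp'

r_full    <- resid(m_full)
r_reduced <- resid(m_reduced)

manual <- data.frame(
  full_resid      = r_full,
  reduced_resid   = r_reduced,
  pathway_wb      = abs(r_reduced) - abs(r_full), # Weller and Barnes
  pathway_gvalue  = abs(r_reduced - r_full),      # Gerring
  pathway_gstatus = ifelse(abs(r_reduced) > abs(r_full), "yes", "no")
)

# Gerring's requirement holds exactly when the Weller/Barnes value is positive
all((manual$pathway_wb > 0) == (manual$pathway_gstatus == "yes"))

# Reverse triangle inequality: |pathway_wb| never exceeds pathway_gvalue
all(abs(manual$pathway_wb) <= manual$pathway_gvalue + 1e-12)
```

Both checks should evaluate to `TRUE`. The two pathway values can differ in magnitude, but a case meets Gerring's requirement if and only if its Weller/Barnes value is positive.
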
```{r}
df_full <- lm(mpg ~ disp + wt, data = mtcars) # full model
df_reduced <- lm(mpg ~ wt, data = mtcars) # reduced model dropping 'disp' as pathway variable
pw_out <- pathway(df_full, df_reduced) # calculation of pathway variables
head(pw_out)
```

Pathway values are visualized differently from ordinary residuals because two models are involved, so an observed-vs-fitted plot is not meaningful. Following the approach by Weller and Barnes, the `pathway_xvr()` function plots the pathway values against the pathway variable. The option `pathway_type = "pathway_wb"` produces a plot for the Weller/Barnes values. The pathway variable is determined by the function and does not have to be specified. The plot is a `gg` object that can be customized with the usual `ggplot2` options.

```{r, fig.height = 6, fig.width = 6}
pathway_xvr(df_full, df_reduced, pathway_type = "pathway_wb")
```

The Gerring pathway values are plotted against the pathway variable if the option is `pathway_type = "pathway_gvalue"`. A color scheme distinguishes the cases that meet the pathway case requirement ("yes") from those that do not ("no").

```{r, fig.height = 6, fig.width = 6}
pathway_xvr(df_full, df_reduced, pathway_type = "pathway_gvalue")
```
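
To complement the plots, the cases can also be listed in a table. The following sketch uses base R only and rebuilds the relevant columns by hand, with the column names documented above, rather than calling `pathway()`. It keeps the cases that meet Gerring's requirement and sorts them by pathway value. The object names (`fit_full`, `fit_reduced`, `cases`, `eligible`) are illustrative assumptions, and ranking by the largest value is only one possible way to shortlist candidate cases.

```{r}
# Illustrative sketch in base R; columns are built by hand and only mirror
# the column names documented for the pathway() output.
fit_full    <- lm(mpg ~ disp + wt, data = mtcars) # full model
fit_reduced <- lm(mpg ~ wt, data = mtcars)        # reduced model without 'disp'

cases <- data.frame(
  disp           = mtcars$disp, # pathway variable
  pathway_wb     = abs(resid(fit_reduced)) - abs(resid(fit_full)),
  pathway_gvalue = abs(resid(fit_reduced) - resid(fit_full)),
  row.names      = rownames(mtcars)
)
cases$pathway_gstatus <- ifelse(
  abs(resid(fit_reduced)) > abs(resid(fit_full)), "yes", "no"
)

# Keep cases that meet Gerring's requirement, largest pathway value first
eligible <- cases[cases$pathway_gstatus == "yes", ]
eligible <- eligible[order(eligible$pathway_gvalue, decreasing = TRUE), ]
head(eligible, 5)
```

The rows at the top of `eligible` correspond to the "yes" cases that sit highest in the Gerring plot.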