---
title: "Pathway case"
author: "Ingo Rohlfing"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Pathway case}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(MMRcaseselection)
```

The pathway case was originally proposed by Gerring ([2007](https://doi.org/10.1177%2F0010414006290784)). He defines the *pathway value* of a case as $|resid_{i_{reduced}}-resid_{i_{full}}|$, provided that $|resid_{i_{reduced}}| > |resid_{i_{full}}|$. Here, 'full' stands for the full regression model, 'reduced' for the model that omits the pathway variable of theoretical interest, and $i$ is the case index. Following Gerring, one should only choose among the cases that meet the requirement $|resid_{i_{reduced}}| > |resid_{i_{full}}|$. In follow-up research, Weller and Barnes ([2014](https://doi.org/10.1017/CBO9781139644501)) propose a different calculation of the pathway value, $|resid_{i_{reduced}}|-|resid_{i_{full}}|$, without specifying an additional requirement about the relationship between the full-model and reduced-model residuals.
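The two definitions can give different values for the same case. The following sketch uses three hypothetical pairs of residuals (illustrative numbers that are not taken from any dataset) and base R only. The object names are chosen for this example and are not part of the package.

```{r}
# Hypothetical residuals for three cases (illustrative numbers only)
resid_reduced <- c(-2.0, 1.5, 0.5)
resid_full    <- c( 1.0, 1.0, 1.5)

# Gerring: absolute difference of the residuals, plus the requirement
gerring_value  <- abs(resid_reduced - resid_full)
gerring_status <- ifelse(abs(resid_reduced) > abs(resid_full), "yes", "no")

# Weller and Barnes: difference of the absolute residuals
wb_value <- abs(resid_reduced) - abs(resid_full)

data.frame(resid_reduced, resid_full, gerring_value, gerring_status, wb_value)
```

In the first case the residual changes its sign between the two models, so the Gerring value (3) is larger than the Weller/Barnes value (1). In the third case the reduced-model residual is smaller in absolute terms, so Gerring's requirement is not met and the Weller/Barnes value is negative.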

The function `pathway()` calculates both types of pathway values and requires the full regression model and the reduced regression model as input. Both models must be `lm` objects. The data frame returned by the function contains all variables from the full model plus the following variables:

- `full_resid`: Residuals in full model
- `reduced_resid`: Residuals in reduced model
- `pathway_wb`: Pathway values as proposed by Weller and Barnes ($|resid_{i_{reduced}}|-|resid_{i_{full}}|$)
- `pathway_gvalue`: Pathway values as proposed by Gerring ($|resid_{i_{reduced}}-resid_{i_{full}}|$)
- `pathway_gstatus`: Binary character variable that is coded "yes" if $|resid_{i_{reduced}}| > |resid_{i_{full}}|$ is met and "no" otherwise.

```{r}
df_full <- lm(mpg ~ disp + wt, data = mtcars) # full model
df_reduced <- lm(mpg ~ wt, data = mtcars) # reduced model drops 'disp' as pathway variable
pw_out <- pathway(df_full, df_reduced) # calculation of pathway variables
head(pw_out)
```
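The pathway variables can also be derived by hand from the definitions above. The following is a minimal sketch in base R, not the internal implementation of `pathway()`. The objects `full_model`, `reduced_model`, `manual` and `candidates` are names introduced for this illustration, and the column names simply mirror the ones listed above.

```{r}
# Refit the two models (same specification as in the example above)
full_model    <- lm(mpg ~ disp + wt, data = mtcars)
reduced_model <- lm(mpg ~ wt, data = mtcars)

# Collect the residuals of both models
manual <- data.frame(
  full_resid    = residuals(full_model),
  reduced_resid = residuals(reduced_model)
)

# Pathway values following the two definitions
manual$pathway_wb      <- abs(manual$reduced_resid) - abs(manual$full_resid)
manual$pathway_gvalue  <- abs(manual$reduced_resid - manual$full_resid)
manual$pathway_gstatus <- ifelse(
  abs(manual$reduced_resid) > abs(manual$full_resid), "yes", "no"
)

# Cases meeting Gerring's requirement, largest pathway value first
candidates <- manual[manual$pathway_gstatus == "yes", ]
head(candidates[order(candidates$pathway_gvalue, decreasing = TRUE), ])
```

Sorting the cases that meet the requirement by their pathway value is one possible way to shortlist candidates. Which case is finally chosen remains a substantive decision.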

The visualization of pathway values differs from the presentation of ordinary residuals because two models are involved, so an observed-vs-fitted plot is not meaningful. Following the approach by Weller and Barnes, the `pathway_xvr()` function plots the pathway values against the pathway variable. The option `pathway_type = "pathway_wb"` produces a plot for the Weller/Barnes values. The pathway variable is determined by the function and does not have to be specified. The plot is a `gg` object that can be customized with the usual `ggplot2` options.

```{r, fig.height = 6, fig.width = 6}
pathway_xvr(df_full, df_reduced, pathway_type = "pathway_wb")
```
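Because the plot is a `gg` object, additional `ggplot2` layers can be added with `+`. The following sketch reuses `df_full` and `df_reduced` from above and assumes that `ggplot2` is installed. The title and the theme are arbitrary choices made for this illustration.

```{r, fig.height = 6, fig.width = 6}
library(ggplot2)

# Store the plot and add a title and a different theme
p <- pathway_xvr(df_full, df_reduced, pathway_type = "pathway_wb")
p +
  labs(title = "Weller/Barnes pathway values") +
  theme_minimal()
```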

The Gerring pathway values are plotted against the pathway variable if the option is `pathway_type = "pathway_gvalue"`. A color scheme is used to distinguish the cases that meet the pathway case requirement ("yes") from those that do not ("no").

```{r, fig.height = 6, fig.width = 6}
pathway_xvr(df_full, df_reduced, pathway_type = "pathway_gvalue")
```