--- title: "Most typical and most deviant" author: "Ingo Rohlfing" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Most typical and most deviant} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, message = F} library(MMRcaseselection) ``` One can use four functions to choose four *types of cases* without classifying all cases as either typical or deviant. The *most typical* and *most deviant* cases are proposed by Seawright and Gerring ([2008](https://doi.org/10.1177/1065912907313077)). The most typical case has the smallest residual of all cases. The most deviant case has the largest residual of all cases. The two functions `most_typical()` and `most_deviant()` work in the same way and show you the case with its residual. The input into the function is an `lm` object. ```{r} df <- lm(mpg ~ disp + wt, data = mtcars) most_typical(df) most_deviant(df) ``` The most deviant case does not distinguish between cases that have a large negative and a large positive residual. Cases with a negative residual are *overpredicted* because the predicted outcome is higher than the observed outcome. Cases with a positive residual are *underpredicted* because the predicted outcome is lower than the observed outcome. It might not matter whether a case is overpredicted or underpredicted because both subtypes of outliers can have the same type of deviance. However, one might be interested in knowing whether a case has a positive or negative residual and what the most overpredicted and underpredicted cases are. This is what the functions `most_overpredicted()` and `most_underpredicted()` achieve, each taking an `lm` object as input. ```{r} # largest positive residual most_underpredicted(df) # largest negative residual most_overpredicted(df) ``` The package does not include functions for plotting the cases. There are multiple, very useful packages such as the [olsrr](https://olsrr.rsquaredacademy.com/articles/residual_diagnostics.html) package that can be used for the easy visualization of residuals.