--- title: "Plot model coefficients with `ggcoef_model()`" author: Joseph Larmarange output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Plot model coefficients with `ggcoef_model()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) ``` ```{r include=FALSE} if ( !broom.helpers::.assert_package("emmeans", boolean = TRUE) ) { knitr::opts_chunk$set(eval = FALSE) } ``` The purpose of `ggcoef_model()` is to quickly plot the coefficients of a model. It is an updated and improved version of `GGally::ggcoef()` based on `broom.helpers::tidy_plus_plus()`. For displaying a nicely formatted table of the same models, look at `gtsummary::tbl_regression()`. ## Quick coefficients plot To work automatically, this function requires the `{broom.helpers}`. Simply call `ggcoef_model()` with a model object. It could be the result of `stats::lm`, `stats::glm` or any other model covered by `{broom.helpers}`. ```{r ggcoef-reg} data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) ggcoef_model(mod_simple) ``` In the case of a logistic regression (or any other model for which coefficients are usually exponentiated), simply indicated `exponentiate = TRUE`. Note that a logarithmic scale will be used for the x-axis. ```{r ggcoef-titanic} d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) ggcoef_model(mod_titanic, exponentiate = TRUE) ``` ## Customizing the plot ### Variable labels You can use the `{labelled}` package to define variable labels. They will be automatically used by `ggcoef_model()`. Note that variable labels should be defined before computing the model. ```{r} library(labelled) tips_labelled <- tips |> set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) ggcoef_model(mod_labelled) ``` You can also define custom variable labels directly by passing a named vector to the `variable_labels` option. ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) ``` If variable labels are to long, you can pass `ggplot2::label_wrap_gen()` or any other labeller function to `facet_labeller.` ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) ``` Use `facet_row = NULL` to hide variable names. ```{r} ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) ``` ### Term labels Several options allows you to customize term labels. ```{r} ggcoef_model(mod_titanic, exponentiate = TRUE) ggcoef_model( mod_titanic, exponentiate = TRUE, show_p_values = FALSE, signif_stars = FALSE, add_reference_rows = FALSE, categorical_terms_pattern = "{level} (ref: {reference_level})", interaction_sep = " x " ) + ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) ``` By default, for categorical variables using treatment and sum contrasts, reference rows will be added and displayed on the graph. ```{r} mod_titanic2 <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial, contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3)) ) ggcoef_model(mod_titanic2, exponentiate = TRUE) ``` Continuous variables with polynomial terms defined with `stats::poly()` are also properly managed. ```{r} mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris) ggcoef_model(mod_poly) ``` Use `no_reference_row` to indicate which variables should not have a reference row added. ```{r} ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = "Sex" ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous() ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_categorical(), categorical_terms_pattern = "{level}/{reference_level}" ) ``` ### Elements to display Use `intercept = TRUE` to display intercepts. ```{r} ggcoef_model(mod_simple, intercept = TRUE) ``` You can remove confidence intervals with `conf.int = FALSE`. ```{r} ggcoef_model(mod_simple, conf.int = FALSE) ``` By default, significant terms (i.e. with a p-value below 5%) are highlighted using two types of dots. You can control the level of significance with `significance` or remove it with `significance = NULL`. ```{r} ggcoef_model(mod_simple, significance = NULL) ``` By default, dots are colored by variable. You can deactivate this behavior with `colour = NULL`. ```{r} ggcoef_model(mod_simple, colour = NULL) ``` You can display only a subset of terms with **include**. ```{r} ggcoef_model(mod_simple, include = c("time", "total_bill")) ``` It is possible to use `tidyselect` helpers. ```{r} ggcoef_model(mod_simple, include = dplyr::starts_with("t")) ``` You can remove stripped rows with `stripped_rows = FALSE`. ```{r} ggcoef_model(mod_simple, stripped_rows = FALSE) ``` Do not hesitate to consult the help file of `ggcoef_model()` to see all available options. ### ggplot2 elements The plot returned by `ggcoef_model()` is a classic `ggplot2` plot. You can therefore apply `ggplot2` functions to it. ```{r} ggcoef_model(mod_simple) + ggplot2::xlab("Coefficients") + ggplot2::ggtitle("Custom title") + ggplot2::scale_color_brewer(palette = "Set1") + ggplot2::theme(legend.position = "right") ``` ## Forest plot with a coefficient table `ggcoef_table()` is a variant of `ggcoef_model()` displaying a coefficient table on the right of the forest plot. ```{r} ggcoef_table(mod_simple) ggcoef_table(mod_titanic, exponentiate = TRUE) ``` You can easily customize the columns to be displayed. ```{r} ggcoef_table( mod_simple, table_stat = c("label", "estimate", "std.error", "ci"), ci_pattern = "{conf.low} to {conf.high}", table_stat_label = list( estimate = scales::label_number(accuracy = .001), conf.low = scales::label_number(accuracy = .01), conf.high = scales::label_number(accuracy = .01), std.error = scales::label_number(accuracy = .001), label = toupper ), table_header = c("Term", "Coef.", "SE", "CI"), table_witdhs = c(2, 3) ) ``` ## Multinomial models For multinomial models, simply use `ggcoef_multinom()`. Three types of visualizations are available: `"dodged"`, `"faceted"` and `"table"`. ```{r} library(nnet) hec <- as.data.frame(HairEyeColor) mod <- multinom( Hair ~ Eye + Sex, data = hec, weights = hec$Freq ) ggcoef_multinom( mod, exponentiate = TRUE ) ggcoef_multinom( mod, exponentiate = TRUE, type = "faceted" ) ``` ```{r, fig.height=9, fig.width=6} ggcoef_multinom( mod, exponentiate = TRUE, type = "table" ) ``` You can use `y.level_label` to customize the label of each level. ```{r} ggcoef_multinom( mod, type = "faceted", y.level_label = c("Brown" = "Brown\n(ref: Black)"), exponentiate = TRUE ) ``` ## Multi-components models Multi-components models such as zero-inflated Poisson or beta regression generate a set of terms for each of their components. You can use `ggcoef_multicomponents()` which is similar to `ggcoef_multinom()`. ```{r} library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) ggcoef_multicomponents(mod) ggcoef_multicomponents(mod, type = "f") ``` ```{r, fig.height=7, fig.width=6} ggcoef_multicomponents(mod, type = "t") ggcoef_multicomponents( mod, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") ) ``` ## Comparing several models You can easily compare several models with `ggcoef_compare()`. To be noted, `ggcoef_compare()` is not compatible with multinomial or multi-components models. ```{r} mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models) ggcoef_compare(models, type = "faceted") ``` ## Advanced users Advanced users could use their own dataset and pass it to `ggcoef_plot()`. Such dataset could be produced by `ggcoef_model()`, `ggcoef_compare()` or `ggcoef_multinom()` with the option `return_data = TRUE` or by using `broom::tidy()` or `broom.helpers::tidy_plus_plus()`. ## Supported models ```{r, echo=FALSE} broom.helpers::supported_models |> knitr::kable() ``` Note: this list of models has been tested. `{broom.helpers}`, and therefore `ggcoef_model()`, may or may not work properly or partially with other types of models.