--- title: "GxE Testing" author: "Alexia Jolicoeur-Martineau" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{GxE Testing} %\VignetteEncoding{UTF-8} --- **Note that you can cite this work as: ** *Jolicoeur-Martineau, A., Wazana, A., Szekely, E., Steiner, M., Fleming, A. S., Kennedy, J. L., Meaney M. J. & Greenwood, C.M. (2017). Alternating optimization for GxE modelling with weighted genetic and environmental scores: examples from the MAVAN study. arXiv preprint arXiv:1703.08111.* *Jolicoeur-Martineau, A., Belsky, J., Szekely, E., Widaman, K. F., Pluess, M., Greenwood, C., & Wazana, A. (2017). Distinguishing differential susceptibility, diathesis-stress and vantage sensitivity: beyond the single gene and environment model. arXiv preprint arXiv:1712.04058.* ## GxE testing From version 1.2.0 of the LEGIT package, we introduce a function to test for different types of gene-by-environment interactions (GxE) as per [Belsky et al. (2013)](https://www.researchgate.net/publication/256600905_FormalGXEtestJCPP2013). See our preprint: https://osf.io/preprints/psyarxiv/27uw8. We want to test if the GxE reflects diathesis stress, differential susceptibility or vantage sensitivity. The figure below is a graphical representation of the different types of interactions assuming a STRONG model: ```{r, out.width = "700px"} knitr::include_graphics("https://raw.githubusercontent.com/AlexiaJM/LEGIT/master/images/GxE_testing_strong.png") ``` The figure below is a graphical representation of the different types of interactions assuming a WEAK model: ```{r, out.width = "700px"} knitr::include_graphics("https://raw.githubusercontent.com/AlexiaJM/LEGIT/master/images/GxE_testing_weak.png") ``` To test for differential susceptibility, Widaman et al. (2013) estimate the cross-over point of the model. This can be done easily with LEGIT by simply adding a negative intercept to the environment E. This can be seen by the following: $$y = 2 + 3(E-c) + 5G(E-c) \\ E = q_1e_1 + q_2e_2 + ... + q_le_l$$ is equivalent to $$y = 2 + 3E' + 5GE' \\ E' = -c + q_1e_1 + q_2e_2 + ... + q_le_l$$ To test for vantage sensitivity and diathesis-stress, we can use the model above but fix the cross-over point to the expected minimum and maximum of $E$ respectively. ## Simulation of diathesis-stress, vantage sensitivity and differential susceptibility Let's look at a two-way interaction model with a cross-over point and a continuous outcome: $$\mathbf{g}_j \sim Binomial(n=1,p=.30) \\ j = 1, 2, 3, 4$$ $$\mathbf{e}_l \sim 10 Beta(\alpha=2,\beta=2) \\ l = 1, 2, 3$$ $$\mathbf{g} = .30\mathbf{g}_1 + .10\mathbf{g}_2 + .20\mathbf{g}_3 + .40\mathbf{g}_4 $$ $$ \mathbf{e} = .45\mathbf{e}_1 + .35\mathbf{e}_2 + .20\mathbf{e}_3$$ $$\mathbf{\mu} = 3 + \beta_E(\mathbf{e}-c) + 2\mathbf{g}(\mathbf{e}-c) $$ where $c$ and $\beta_E$ depend on the model assumed. $c = 0$ represents vantage sensitivity, $c = 10$ represents diathesis-stress. We set $c=5$ to represents differential susceptibility. $\beta_E = 0$ assumes a STRONG model while $\beta_E \ne 0$ assumes a WEAK model. $$ y \sim Normal(\mu,\sigma=1)$$ Note that the environmental score E is in the range [0,10]. ## Example 1: Vantage sensitivity WEAK When assuming vantage sensitivity WEAK, we let $c=0$ so that: $$\mathbf{\mu} = 3 + (\mathbf{e}-0) + 2\mathbf{g}(\mathbf{e}-0) $$ We generate N=250 observations and test all 4 possible models based on the BIC (other choices are also available: AIC, AICc, cross-validation and cross-validated AUC). It is important to not include G or E in the model formula because they will be added automatically. We start by assuming that we don't know the true minimum of the environmental score, thus we use option *crossover = c("min","max")* to find the minimum/maximum from the observed data (this is the default). ```{r} set.seed(777) library(LEGIT) ex_dia = example_with_crossover(N=250, c=0, coef_main = c(3,1,2), sigma=1, seed=7) GxE_test_BIC = GxE_interaction_test(data=ex_dia$data, genes=ex_dia$G, env=ex_dia$E, formula_noGxE = y ~ 1, crossover = c("min","max"), criterion="BIC") GxE_test_BIC$results ``` Here we see the results. Notice that vantage sensitivity WEAK is the model with the lowest BIC. Differential susceptibility WEAK has a slightly worse fit and the crossover point is not within observable range, thus we reject it. Note that we only show the element *results* of the output, but the 6 models (weak/strong vantage sensitivity, diathesis-stress and differential susceptibility) are also returned. Considering that we know that the minimum and maximum of E are 0 and 10 respectively, we can also refit the models with *crossover = c(0, 10)*. ```{r} GxE_test_BIC_ = GxE_interaction_test(data=ex_dia$data, genes=ex_dia$G, env=ex_dia$E, formula_noGxE = y ~ 1, crossover = c(0, 10), criterion="BIC") GxE_test_BIC_$results ``` We obtain similar results. We can then plot the best model: ```{r fig1, fig.height = 5, fig.width = 5} plot(GxE_test_BIC$fits$vantage_sensitivity_WEAK, xlim=c(0,10), ylim=c(3,13),cex.leg=1.4, cex.axis=1.5, cex.lab=1.5) ``` ## Example 2: Vantage sensitivity STRONG When assuming vantage sensitivity STRONG, we let $c=0$ and $\beta_E=0$ so that: $$\mathbf{\mu} = 3 + 2\mathbf{g}(\mathbf{e}-0) $$ We generate N=250 observations and test all 6 possible models based on the BIC assuming we don't know what is the minimum environmental score. ```{r} ex_dia_s = example_with_crossover(N=250, c=0, coef_main = c(3,0,2), sigma=1, seed=7) GxE_test_BIC = GxE_interaction_test(data=ex_dia_s$data, genes=ex_dia_s$G, env=ex_dia_s$E, formula_noGxE = y ~ 1, crossover = c("min","max"), criterion="BIC") GxE_test_BIC$results ``` We conclude that vantage sensitivity STRONG is the best model. We can then plot the best model: ```{r fig2, fig.height = 5, fig.width = 5} plot(GxE_test_BIC$fits$vantage_sensitivity_STRONG, xlim=c(0,10), ylim=c(3,13),cex.leg=1.4, cex.axis=1.5, cex.lab=1.5) ``` ## Example 3: Differential susceptibility WEAK When assuming differential susceptibility WEAK, we let $c=5$ (and use 8 instead of 3 as intercept to keep a similar outcome range) so that: $$\mathbf{\mu} = 8 + (\mathbf{e}-5) + 2\mathbf{g}(\mathbf{e}-5) $$ We generate N=250 observations and test all 6 possible models based on the BIC assuming we don't know what is the minimum environmental score. ```{r} ex_ds = example_with_crossover(N=250, c=5, coef_main = c(3+5,1,2), sigma=1, seed=7) GxE_test_BIC = GxE_interaction_test(data=ex_ds$data, genes=ex_ds$G, env=ex_ds$E, formula_noGxE = y ~ 1, crossover = c("min","max"), criterion="BIC") GxE_test_BIC$results ``` We conclude that differential susceptibility WEAK is the best model. We can then plot the best model: ```{r fig3, fig.height = 5, fig.width = 5} plot(GxE_test_BIC$fits$diff_suscept_WEAK, xlim=c(0,10), ylim=c(3,13),cex.leg=1.4, cex.axis=1.5, cex.lab=1.5) ``` ## Example 4: Differential susceptibility STRONG When assuming differential susceptibility STRONG, we let $c=5$ and $\beta_E=0$ (and use 8 instead of 3 as intercept to keep a similar outcome range) so that: $$\mathbf{\mu} = 8 + 2\mathbf{g}(\mathbf{e}-5) $$ We generate N=250 observations and test all 6 possible models based on the BIC assuming we don't know what is the minimum environmental score. ```{r} ex_ds_s = example_with_crossover(N=250, c=5, coef_main = c(3+5,0,2), sigma=1, seed=7) GxE_test_BIC = GxE_interaction_test(data=ex_ds_s$data, genes=ex_ds_s$G, env=ex_ds_s$E, formula_noGxE = y ~ 1, crossover = c("min","max"), criterion="BIC") GxE_test_BIC$results ``` We conclude that differential susceptibility STRONG is the best model. We can then plot the best model: ```{r fig4, fig.height = 5, fig.width = 5} plot(GxE_test_BIC$fits$diff_suscept_STRONG, xlim=c(0,10), ylim=c(3,13),cex.leg=1.4, cex.axis=1.5, cex.lab=1.5) ```