In this vignette, we fit two variable selection models: the first
(“null”) model has a uniform prior for all variables (the 442,001
genetic markers); the second model has higher prior probability for
genetic markers near cytokine signaling genes. This analysis is intended
to assess support for enrichment of Crohn’s disease risk factors near
cytokine signaling genes; a large Bayes factor means greater support for
this enrichment hypothesis. The data in this analysis consist of 442,001
SNPs genotyped for 1,748 cases and 2,938 controls. Note that file
cd.RData
cannot be made publicly available due to data
sharing restrictions, so this script is for viewing only.
Begin by loading a couple packages into the R environment.
library(lattice)
library(varbvs)
Set the random number generator seed.
set.seed(1)
load("cd.RData")
data(cytokine)
Here we compute the variational approximation given the assumption that all variables (genetic markers) are, a priori, equally likely to be included in the model.
<- varbvs(X,NULL,y,"binomial",logodds = -4,n0 = 0) fit.null
Next, compute the variational approximation given the assumption that genetic markers near cytokine signaling genes are more likely to be included in the model. For faster and more accurate computation of posterior probabilities, the variational parameters are initialized to the fitted values computed above with the exchangeable prior.
<- matrix(-4,442001,13)
logodds == 1,] <- matrix(-4 + seq(0,3,0.25),6711,13,byrow = TRUE)
logodds[cytokine <- varbvs(X,NULL,y,"binomial",logodds = logodds,n0 = 0,
fit.cytokine alpha = fit.null$alpha,mu = fit.null$mu,
eta = fit.null$eta,optimize.eta = TRUE)
Compute the Bayes factor.
<- varbvsbf(fit.null,fit.cytokine) BF
save(list = c("fit.null","fit.cytokine","map","cytokine","BF"),
file = "varbvs.demo.cytokine.RData")
Show two “genome-wide scans” from the multi-marker PIPs, with and without conditioning on enrichment of cytokine signaling genes.
<- which(fit.null$pip > 0.5 | fit.cytokine$pip > 0.5)
i <- paste0(round(map$pos[i]/1e6,digits = 2),"Mb")
var.labels print(plot(fit.null,groups = map$chr,vars = i,var.labels = NULL,
gap = 7500,ylab = "posterior prob."),
split = c(1,1,1,2),more = TRUE)
print(plot(fit.cytokine,groups = map$chr,vars = i,var.labels = var.labels,
gap = 7500,ylab = "posterior prob."),
split = c(1,2,1,2),more = FALSE)