This package contains functions for association testing in 2x2 tables (i.e., tables formed from two binary variables). The scientific setting that motivated its development was testing for associations between diseases and rare genetic variants in case-control studies. When the expected number of subjects possessing a variant is small, standard methods perform poorly (they usually tend to be overly conservative in controlling the Type I error).
The two alternative methods implemented in the package are permutation testing and approximate unconditional (AU) testing.
Permutation testing works by computing a test statistic T for the observed data, generating all plausible datasets with the same total number of exposed subjects, then adding up the probabilities of those datasets which give more extreme test statistics than T.
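To make the procedure above concrete, here is a minimal sketch of a permutation test for the score statistic. It is not the package's implementation. It assumes that, once the total number of exposed subjects is fixed, the number of exposed cases follows a hypergeometric distribution under the null, and that "more extreme" means a statistic at least as large as the observed one. The function name perm_score_sketch is hypothetical; its argument order (controls, cases, exposed controls, exposed cases) mirrors the calls shown in this document.

```r
# Hypothetical helper, not part of AUtests: permutation p-value for the
# score (Pearson chi-squared) statistic in a 2x2 table.
perm_score_sketch <- function(m0, m1, r0, r1) {
  n <- m0 + m1          # total subjects
  r <- r0 + r1          # total exposed subjects, held fixed
  if (r == 0 || r == n) return(1)  # degenerate table: only one dataset possible

  # Score statistic as a function of x, the number of exposed cases
  score_stat <- function(x) {
    expected <- r * m1 / n
    variance <- r * (n - r) * m0 * m1 / n^3
    (x - expected)^2 / variance
  }

  # Every value of x compatible with the fixed margins
  x_all <- max(0, r - m0):min(m1, r)
  # Null probability of each dataset: r exposed subjects drawn from
  # m1 cases and m0 controls
  probs <- dhyper(x_all, m1, m0, r)

  t_obs <- score_stat(r1)
  # Assumption: ties count as extreme; small tolerance guards against rounding
  sum(probs[score_stat(x_all) >= t_obs * (1 - 1e-7)])
}

perm_score_sketch(10, 10, 2, 5)
```

In a balanced design the hypergeometric distribution is symmetric, so this sketch should agree with the two-sided p-value from R's fisher.test.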
The perm.tests function returns p-values from permutation tests based on score, likelihood ratio, Wald (with and without regularization), and Firth statistics.
The following code runs the tests for a dataset containing 5,000 cases (55 with a minor allele of interest) and 15,000 controls (45 with a minor allele of interest):
library(AUtests)
# Example data, 1:3 case-control ratio
perm.tests(15000, 5000, 45, 55)
## score.p lr.p wald.p wald0.p firth.p
## 1.362901e-10 3.109880e-10 1.362901e-10 1.365853e-10 1.464917e-10
For comparison purposes, the basic.tests function returns p-values for the standard score, likelihood ratio, Wald, Firth, and Fisher's exact tests:
basic.tests(15000, 5000, 45, 55)
## score.p lr.p wald.p wald0.p firth.p
## 3.768763e-12 1.524214e-10 7.777712e-11 9.028622e-11 1.325086e-10
## fisher.p
## 1.464917e-10
AU testing works by computing a test statistic T for the observed data, generating all plausible datasets with any number of variants, then adding up the probabilities of those datasets which give more extreme test statistics than T.
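The following is a minimal sketch of how such an unconditional calculation could look for the score statistic. It is not the package's implementation. It assumes that exposure counts in controls and cases are independent binomials under the null, with the unknown exposure probability replaced by the pooled sample proportion, and that ties with the observed statistic count as extreme. The function name au_score_sketch is hypothetical.

```r
# Hypothetical helper, not part of AUtests: approximate unconditional
# p-value for the score statistic in a 2x2 table.
au_score_sketch <- function(m0, m1, r0, r1) {
  n <- m0 + m1
  # Assumption: plug in the pooled exposure proportion as the null probability
  p_hat <- (r0 + r1) / n

  # Score statistic for a table with x0 exposed controls and x1 exposed cases;
  # defined as 0 when nobody or everybody is exposed
  score_stat <- function(x0, x1) {
    r <- x0 + x1
    variance <- r * (n - r) * m0 * m1 / n^3
    stat <- (x1 - r * m1 / n)^2 / variance
    ifelse(variance > 0, stat, 0)
  }

  x0 <- 0:m0
  x1 <- 0:m1
  # Null probability and statistic for every possible (x0, x1) pair
  probs <- outer(dbinom(x0, m0, p_hat), dbinom(x1, m1, p_hat))
  stats <- outer(x0, x1, score_stat)

  t_obs <- score_stat(r0, r1)
  sum(probs[stats >= t_obs * (1 - 1e-7)])
}

au_score_sketch(20, 20, 2, 9)
```

Enumerating the full grid takes on the order of m0 times m1 evaluations, so this sketch is only practical for small samples.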
The au.tests function returns p-values from AU tests based on score, likelihood ratio, and Wald (with and without regularization) statistics. The au.firth function returns a p-value from the AU Firth test. It was implemented as a separate function due to its increased computational time.
The following code runs the tests for a dataset containing 10,000 cases (60 with a minor allele of interest) and 10,000 controls (45 with a minor allele of interest):
# Example data, balanced case-control ratio
au.tests(10000, 10000, 45, 60)
## score.p lr.p wald.p wald0.p
## 0.1420303 0.1430718 0.1431031 0.1431030
au.firth(10000, 10000, 45, 60)
## au.firth.p
## 0.143103
To gain precision or adjust for a confounding variable, it can be of interest to perform a stratified analysis. The perm.test.strat function implements a permutation likelihood ratio test that allows for categorical covariates, and the au.test.strat function implements a similar AU test. The functions take vectors of controls, cases, controls with the exposure, and cases with the exposure, where the i-th element of each vector corresponds to the count for the i-th stratum.
Consider the following example data, with two strata (i.e., a binary covariate):
m0list = c(500, 1250) # controls
m1list = c(150, 100) # cases
r0list = c(60, 20) # exposed controls
r1list = c(25, 5) # exposed cases
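As a quick check on these data, the sketch below sums each vector across strata to obtain pooled counts and computes a few per-stratum proportions. The variable names pooled, exposure_controls, exposure_cases, and case_fraction are illustrative only and are not part of the package.

```r
m0list = c(500, 1250)  # controls per stratum
m1list = c(150, 100)   # cases per stratum
r0list = c(60, 20)     # exposed controls per stratum
r1list = c(25, 5)      # exposed cases per stratum

# Pooled counts obtained by ignoring the covariate
pooled = c(m0 = sum(m0list), m1 = sum(m1list),
           r0 = sum(r0list), r1 = sum(r1list))
pooled  # m0 = 1750, m1 = 250, r0 = 80, r1 = 30

# Exposure frequency among controls and among cases, by stratum
exposure_controls = r0list / m0list
exposure_cases = r1list / m1list

# Proportion of subjects who are cases, by stratum
case_fraction = m1list / (m0list + m1list)
```

Both the exposure frequency and the proportion of cases are higher in the first stratum, which is the pattern that lets the covariate act as a confounder when the strata are pooled.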
A non-stratified analysis would yield a highly significant result:
perm.tests(1750, 250, 80, 30)
## score.p lr.p wald.p wald0.p firth.p
## 1.283296e-05 1.758305e-05 1.283296e-05 1.310045e-05 1.758305e-05
au.tests(1750, 250, 80, 30)
## score.p lr.p wald.p wald0.p
## 7.592631e-06 2.077567e-05 7.893789e-06 8.701288e-06
When adjusting for the covariate, however, the result is much less significant:
perm.test.strat(m0list, m1list, r0list, r1list)
## lrt.p
## 0.0460971
au.test.strat(m0list, m1list, r0list, r1list)
## lrt.p
## 0.04333194