We present some extensions to the package. These aim to demonstrate additional functionality of the package which has not yet been subject to extensive testing.
Since the Monte Carlo simulations required to generate the results
can be time-consuming, we have precomputed them and included them as a
data file in the package. This enables a quick overview of the methods
without having to rerun the simulations. Below, we load the precomputed
results using the data()
function. We also provide the code
used to generate these results later in this vignette.
Here, we demonstrate two other application scenarios, sensitivity analysis and compromise analysis.
Sensitivity analysis involves fixing the sample size, desired power, and alpha level, and determining the effect size that can be detected with the desired power. We demonstrate this functionality using a t-test.
The fixed parameters are the sample size N, the alpha level, and the goal power. We want to determine the corresponding effect size:
We define our simulation function:
simfun_sensitivity <- function(esize) {
# Generate a data set
dat <- rnorm(n = N, mean = esize)
# Test the hypothesis
res <- t.test(dat)
res$p.value < alpha
}
We can find the corresponding effect size using:
# Effect Size Finding res1 <- find.design(simfun
# = simfun_sensitivity, boundaries =
# c(0,1),integer=FALSE, power =
# goal_power,surrogate = 'gpr',evaluations=8000)
res1 <- extensions_results[[1]]
summary(res1)
#>
#> Call:
#> find.design(simfun = simfun_sensitivity, boundaries = c(0, 1),
#> power = goal_power, evaluations = 8000, surrogate = "gpr",
#> integer = FALSE)
#>
#> Design: esize = 0.431758284218821
#>
#> Power: 0.95127, SE: 0.0028
#> Evaluations: 8000, Time: 26.02, Updates: 16
#> Surrogate: Gaussian process regression
This leads us to an effect size of about 0.43, as measured by Cohen’s d. To confirm the correctness of this result, we check it analytically using the pwr package.
library(pwr)
esize_correct <- pwr.t.test(n = N, sig.level = alpha,
power = goal_power, type = "one.sample")$d
esize_correct
#> [1] 0.4293178
The analytical result is very close to the result obtained with mlpwr. With a small simulation study, we further confirm that the analytical result is correct.
In compromise analysis, the sample size and effect size are fixed in the DGF, and the goal is to find a decision criterion that optimizes the desired ratio of alpha and beta errors. Again, we demonstrate this functionality using a t-test scenario.
The fixed parameters are the sample size N, the effect size (given by Cohen’s d) and the desired ratio of the alpha and beta errors:
We now define our simulation function. Its argument is a given threshold for the test statistic. This function then generates two data sets - one under the null, one under the alternative hypothesis - and evaluates whether the use of this threshold leads to a Type I error (for the data set generated under the null hypothesis) or a Type II error (for the data set generated under the alternative hypothesis).
simfun_compromise <- function(crit) {
# Generate a data set
dat <- rnorm(n = N, mean = 0)
# Test the hypothesis
res <- t.test(dat)
a <- res$statistic > crit
# Generate a data set
dat <- rnorm(n = N, mean = esize)
# Test the hypothesis
res <- t.test(dat)
b <- res$statistic < crit
return(c(a, b))
}
We run this function one to see how the output looks - TRUE indicates that a Type I (1st entry) or Type II error (2nd entry) was observed:
However, we are not interested in the Type I and Type II errors themselves, but in their ratio. We therefore need to define a function that organizes how the individual simulations are aggregated:
We can find the optimal decision criterion for reaching the desired ratio of alpha and beta errors using:
# res2 <- find.design(simfun = simfun_compromise,
# boundaries = c(1,2), power =
# desired_ratio,integer = FALSE, aggregate_fun =
# aggregate_fun,surrogate='svr',use_noise=FALSE,evaluations=8000)
res2 <- extensions_results[[2]]
summary(res2)
#>
#> Call:
#> find.design(simfun = simfun_compromise, boundaries = c(1, 2),
#> power = desired_ratio, evaluations = 8000, surrogate = "svr",
#> aggregate_fun = aggregate_fun, integer = FALSE, use_noise = FALSE)
#>
#> Design: crit = 1.48455055391577
#>
#> Power: 1.00086, SE: NA
#> Evaluations: 8000, Time: 6.89, Updates: 6
#> Surrogate: Support vector regression
The analysis with mlpwr leads to a desired threshold of about 1.48. The ratio of Type I and Type II errors is given as “power” in this output and close to the desired value of 1.
Note that we used some additional arguments in this case. The ‘integer’ argument tells the algorithm whether to look for integer design parameters - this is set to ‘FALSE’ for our case of effect sizes. The ‘use_noise=FALSE’ indicates that we do not account for the Monte Carlo error at each design parameter. This functionality can be added in the future. Note also that we use ‘power’ as an argument to the find.design function and in the output of summary(res2) to indicate the desired ratio of alpha and beta error. This is a makeshift solution and we consider rewording this argument to ‘goal’ in the future.
To check the correctness of our result, we compare it with an analytical solution obtained from the t-distribution and an numerical optimizer:
fun <- \(alpha) {
beta <- alpha * desired_ratio
abs(qt(1 - alpha, ncp = 0, df = N - 1) # crit for alpha
-
qt(beta, ncp = esize * sqrt(N), df = N - 1) # crit for beta
)
}
alpha <- optim(0.1, fun, lower = 1e-09, upper = 1 -
1e-09, method = "L-BFGS-B")$par
crit <- qt(1 - alpha, ncp = 0, df = N - 1)
crit
#> [1] 1.503731
This leads to a threshold of about 1.50, which is very close to the solution of mlpwr. Using simulations, we can show that the analytical solution is indeed correct:
Some uses cases feature more complex cost functions. We cover an example here. It builds on the “ANOVA” simulation function in the “simulation_functions” vignette.
library(tidyr)
simfun_anova <- function(n, n.groups) {
# Generate a data set
groupmeans <- rnorm(n.groups, sd = 0.2) # generate groupmeans using cohen's f=.2
dat <- sapply(groupmeans, function(x) rnorm(n,
mean = x, sd = 1)) # generate data
dat <- dat |>
as.data.frame() |>
gather() # format
# Test the hypothesis
res <- aov(value ~ key, data = dat) # perform ANOVA
summary(res)[[1]][1, 5] < 0.01 # extract significance
}
We assume that there are different costs for different clusters. We assume that the costs per cluster is cheaper if there is only a small number of clusters. Adding additional clusters gets increasingly expensive (for example, because assessing a large number of groups is more costly).
Specifically, we assume that the first five clusters cost 10, the next five cost 15, the next five cost 20, and so on.
prices <- c(rep(10, 5), rep(15, 5), rep(20, 5), rep(25,
5), rep(30, 5), rep(35, 5), rep(40, 5))
costfun <- function(n, n.groups) {
5 * n + n.groups * sum(prices[1:n.groups])
}
We can now find a good design using:
# res3 <- find.design( simfun = simfun_anova,
# costfun = costfun, boundaries = list(n = c(10,
# 150), n.groups = c(5, 30)), power = .95 )
res3 <- extensions_results[[3]]
summary(res3)
#>
#> Call:
#> find.design(simfun = simfun_anova, boundaries = list(n = c(10,
#> 150), n.groups = c(5, 30)), power = 0.95, costfun = costfun)
#>
#> Design: n = 135, n.groups = 10
#>
#> Power: 0.95017, SE: 0.00584, Cost: 1925
#> Evaluations: 4000, Time: 13.6, Updates: 32
#> Surrogate: Gaussian process regression
The proposed solution contains 10 groups and 135 participants per group.
As an example for a study design with more than 2 dimensions, we consider a multilevel design with the three dimensions:
simfun_3d <- function(n.per.school, n.schools, n.obs) {
# generate data
school <- rep(1:n.schools, each = n.per.school *
n.obs)
student <- rep(1:(n.schools * n.per.school), each = n.obs)
pred <- factor(rep(c("old", "new"), n.per.school *
n.schools * n.obs, each = n.obs), levels = c("old",
"new"))
dat <- data.frame(school = school, student = student,
pred = pred)
params <- list(theta = c(0.5, 0, 0.5, 0.5), beta = c(0,
1), sigma = 1.5)
names(params$theta) <- c("school.(Intercept)",
"school.prednew.(Intercept)", "school.prednew",
"student.(Intercept)")
names(params$beta) <- c("(Intercept)", "prednew")
dat$y <- simulate.formula(~pred + (1 + pred | school) +
(1 | student), newdata = dat, newparams = params)[[1]]
# test hypothesis
mod <- lmer(y ~ pred + (1 + pred | school) + (1 |
student), data = dat)
pvalue <- summary(mod)[["coefficients"]][2, "Pr(>|t|)"]
pvalue < 0.01
}
costfun_3d <- function(n.per.school, n.schools, n.obs) {
100 * n.per.school + 200 * n.schools + 0.1 * n.obs *
n.per.school * n.schools
}
# res4 = find.design(simfun = simfun_3d, costfun
# = costfun_3d, boundaries = list(n.per.school =
# c(5, 25), n.schools = c(10, 30), n.obs =
# c(3,10)), power = .95)
res4 <- extensions_results[[4]]
summary(res4)
#>
#> Call:
#> find.design(simfun = simfun_3d, boundaries = list(n.per.school = c(5,
#> 25), n.schools = c(10, 30), n.obs = c(3, 10)), power = 0.95,
#> costfun = costfun_3d)
#>
#> Design: n.per.school = 15, n.schools = 15, n.obs = 3
#>
#> Power: 0.96948, SE: 0.00374, Cost: 4567.5
#> Evaluations: 4020, Time: 2237.39, Updates: 48
#> Surrogate: Gaussian process regression
In this case, going with the minimum number of observations per student seems to be the ideal choice.