This package assumes that a hierarchical testing procedure for the three-arm gold-standard non-inferiority design is applied. The first test aims to establish assay sensitivity of the trial. It is a test of superiority of the experimental treatment (T) against the placebo treatment (P). If assay sensitivity is successfully established, the treatment is tested for non-inferiority against the control treatment (C). Individual observations are assumed to be normally distributed, where higher values correspond to better treatment effects. Testing is assumed to be done via Z test statistics.
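As a rough illustration of this hierarchy, the following base R sketch computes the two Z statistics from raw observations. It assumes a known common standard deviation `sigma` and a non-inferiority margin `Delta` that is added to the T versus C difference. The function `hierarchical_z_test` and its arguments are hypothetical and not part of the package.

```r
# Hypothetical helper, not part of the package: hierarchical Z tests for a
# single-stage three-arm trial, assuming a known common standard deviation.
hierarchical_z_test <- function(x_T, x_P, x_C, Delta, sigma = 1, alpha = 0.025) {
  n_T <- length(x_T)
  n_P <- length(x_P)
  n_C <- length(x_C)
  # Superiority of T over P (assay sensitivity).
  z_TP <- (mean(x_T) - mean(x_P)) / (sigma * sqrt(1 / n_T + 1 / n_P))
  # Non-inferiority of T versus C with margin Delta (higher values are better).
  z_TC <- (mean(x_T) - mean(x_C) + Delta) / (sigma * sqrt(1 / n_T + 1 / n_C))
  crit <- qnorm(1 - alpha)
  assay_sensitivity <- z_TP > crit
  list(
    z_TP = z_TP,
    z_TC = z_TC,
    assay_sensitivity = assay_sensitivity,
    # The second test only counts if the first one was successful.
    non_inferiority = assay_sensitivity && (z_TC > crit)
  )
}

# Simulated data for illustration only.
set.seed(1)
res <- hierarchical_z_test(
  x_T = rnorm(400, mean = 0.4),
  x_P = rnorm(120, mean = 0),
  x_C = rnorm(400, mean = 0.4),
  Delta = 0.2
)
str(res)
```

Non-inferiority is only declared when the superiority test against placebo has already rejected, which is what makes the procedure hierarchical.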
We highly recommend reading our open-access article (Meis et al., 2022) where the theoretical background of this package is explained.
To showcase the capabilities of this package, we will reproduce some results from the paper in the following.
It should be noted that the results will not completely agree with the results from the paper, as the calculations in the paper used much lower error tolerances and more function evaluations.
To achieve results closer to the results from the paper, you can supply the following options, though this will significantly increase computation times:
mvnorm_algorithm = mvtnorm::Miwa(
# steps = 128,
steps = 4097,
checkCorr = FALSE,
maxval = 1000),
nloptr_opts = list(algorithm = "NLOPT_LN_SBPLX",
# xtol_abs = 1e-3,
# xtol_rel = 1e-2,
# maxeval = 2000,
xtol_abs = 1e-10,
xtol_rel = 1e-9,
maxeval = 2000,
print_level = 0)
You may also want to set print_progress = TRUE when running code interactively to see the progress of the optimization.
The designs in Table 2 from the paper are optimized to minimize the expected sample size under the alternative hypothesis.
This is (approximately) the first line in Table 2 from the paper:
tab1_D1 <- optimize_design_onestage(
alpha = .025,
beta = .2,
alternative_TP = .4,
alternative_TC = 0,
Delta = .2,
print_progress = FALSE
)
tab1_D1
#> Sample sizes (stage 1): T: 413, P: 125, C: 404
#> Efficacy boundaries (stage 1): Z_TP_e: 1.95996, Z_TC_e: 1.95996
#> Maximum overall sample size: 942
#> Placebo penalty at optimum (kappa * nP): 0.0
#> Objective function value: 942.0
#> Type I error for TP testing: 2.5%
#> Type I error for TC testing: 2.5%
#> Power: 80.2%
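The reported power can be cross-checked by simulation without the package. The sketch below rests on assumptions that are not stated in the output: a unit standard deviation in all arms, `alternative_TP = 0.4` read as the true mean difference T - P, and `alternative_TC = 0` read as the true mean difference T - C. Under these assumptions the estimate should land close to the reported power, up to simulation error.

```r
# Monte Carlo check of the single-stage design printed above.
# Assumptions (not taken from the package): unit standard deviation in all
# arms, true mean difference T - P of 0.4, true mean difference T - C of 0.
set.seed(2023)
n_sim <- 1e5
n <- c(T = 413, P = 125, C = 404)
Delta <- 0.2
crit <- 1.95996  # efficacy boundary from the output above

# Simulate the arm means directly instead of individual observations.
mean_T <- rnorm(n_sim, mean = 0.4, sd = 1 / sqrt(n[["T"]]))
mean_P <- rnorm(n_sim, mean = 0.0, sd = 1 / sqrt(n[["P"]]))
mean_C <- rnorm(n_sim, mean = 0.4, sd = 1 / sqrt(n[["C"]]))

z_TP <- (mean_T - mean_P) / sqrt(1 / n[["T"]] + 1 / n[["P"]])
z_TC <- (mean_T - mean_C + Delta) / sqrt(1 / n[["T"]] + 1 / n[["C"]])

# Power of the hierarchical procedure: both tests must reject.
power_hat <- mean(z_TP > crit & z_TC > crit)
power_hat
```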
This is (approximately) the second line in Table 2 from the paper:
optimize_design_twostage(
cP1 = tab1_D1$stagec[[1]]$P, # The allocation ratios are enforced to be
cC1 = tab1_D1$stagec[[1]]$C, # the same as in the optimal single-stage design.
cT2 = 1,
cP2 = tab1_D1$stagec[[1]]$P,
cC2 = tab1_D1$stagec[[1]]$C,
bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
bTC1f = -Inf,
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE
)
#> Sample sizes (stage 1): T: 224, P: 68, C: 219
#> Sample sizes (stage 2): T: 224, P: 68, C: 219
#> Efficacy boundaries (stage 1): Z_TP_e: 2.10510, Z_TC_e: 2.27093
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.27188, Z_TC_e: 2.10568
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1022
#> Expected sample size (H1): 801.2
#> Expected sample size (H0): 1020.3
#> Expected placebo group sample size (H1): 82.8
#> Expected placebo group sample size (H0): 134.8
#> Objective function value: 801.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.20%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
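The inverse normal combination test weights in the output describe how the stage-wise Z statistics are merged into one statistic at the second stage. A minimal sketch follows, assuming that the stage 2 efficacy boundary applies to the combined statistic `w1 * z1 + w2 * z2`. The helper `inverse_normal_combination` and the stage-wise statistics `z1` and `z2` are made up for illustration.

```r
# Hypothetical helper: inverse normal combination of two stage-wise Z statistics.
inverse_normal_combination <- function(z1, z2, w1, w2) {
  # The squared weights must sum to one so that the combined statistic is
  # standard normal under the null (tolerance covers the rounded output).
  stopifnot(abs(w1^2 + w2^2 - 1) < 1e-4)
  w1 * z1 + w2 * z2
}

# Values from the output above (T versus P comparison).
w1 <- 0.70711
w2 <- 0.70711
b1_e <- 2.10510  # stage 1 efficacy boundary
b2_e <- 2.27188  # stage 2 efficacy boundary

# Made-up stage-wise statistics, for illustration only.
z1 <- 1.8
z2 <- 1.6

stop_early <- z1 > b1_e
z_comb <- inverse_normal_combination(z1, z2, w1, w2)
reject_at_stage_2 <- !stop_early && z_comb > b2_e
z_comb
reject_at_stage_2
```

With equal stage-wise sample sizes, as enforced in this design, both weights equal the square root of one half.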
This is (approximately) the third line in Table 2 from the paper:
optimize_design_twostage(
bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
bTC1f = -Inf,
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE
)
#> Sample sizes (stage 1): T: 230, P: 90, C: 224
#> Sample sizes (stage 2): T: 202, P: 106, C: 191
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04997, Z_TC_e: 2.27978
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.39960, Z_TC_e: 2.09141
#> Inverse normal combination test weights (TP): w1: 0.69161, w2: 0.72227
#> Inverse normal combination test weights (TC): w1: 0.73218, w2: 0.68111
#> Maximum overall sample size: 1043
#> Expected sample size (H1): 787.2
#> Expected sample size (H0): 1040.3
#> Expected placebo group sample size (H1): 103.1
#> Expected placebo group sample size (H0): 193.9
#> Objective function value: 787.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
This is (approximately) the fourth line in Table 2 from the paper:
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
binding_futility = FALSE
)
#> Sample sizes (stage 1): T: 238, P: 84, C: 241
#> Sample sizes (stage 2): T: 201, P: 122, C: 185
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03084, Z_TC_e: 2.27784
#> Futility boundaries (stage 1): Z_TP_f: -0.29297, Z_TC_f: 0.57221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.47898, Z_TC_e: 2.08790
#> Inverse normal combination test weights (TP): w1: 0.66534, w2: 0.74654
#> Inverse normal combination test weights (TC): w1: 0.74431, w2: 0.66783
#> Maximum overall sample size: 1071
#> Expected sample size (H1): 775.4
#> Expected sample size (H0): 672.8
#> Expected placebo group sample size (H1): 97.9
#> Expected placebo group sample size (H0): 109.3
#> Objective function value: 775.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.43%
#> Probability of futility stop (H1): 5.33%
#> Probability of futility stop (H0): 77.96%
#> Minimum conditional power: 19.62%
#> Power: 80.01%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
This is (approximately) the fifth line in Table 2 from the paper:
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
binding_futility = TRUE
)
#> Sample sizes (stage 1): T: 229, P: 90, C: 231
#> Sample sizes (stage 2): T: 217, P: 107, C: 199
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04659, Z_TC_e: 2.29485
#> Futility boundaries (stage 1): Z_TP_f: 0.23336, Z_TC_f: 0.75795
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40505, Z_TC_e: 2.04331
#> Inverse normal combination test weights (TP): w1: 0.68710, w2: 0.72656
#> Inverse normal combination test weights (TC): w1: 0.72466, w2: 0.68911
#> Maximum overall sample size: 1073
#> Expected sample size (H1): 768.5
#> Expected sample size (H0): 619.9
#> Expected placebo group sample size (H1): 100.2
#> Expected placebo group sample size (H0): 103.5
#> Objective function value: 768.5
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.33%
#> Probability of futility stop (H0): 86.28%
#> Minimum conditional power: 34.17%
#> Power: 80.16%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
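To see why binding futility boundaries have to be obeyed, the local type I error of one comparison can be recomputed with and without the futility boundary. The sketch below is an assumption-laden simplification: it treats the T versus P test in isolation, assumes independent standard normal stage-wise statistics under the null, and ignores the interaction with the T versus C futility rule. The helper `local_type1_error` is hypothetical.

```r
# Hypothetical helper: local type I error of a two-stage inverse normal
# combination test for a single one-sided null hypothesis, assuming
# independent standard normal stage-wise statistics under the null.
local_type1_error <- function(b1_e, b1_f, b2_e, w1, w2) {
  # Rejection at stage 1.
  stage1 <- pnorm(b1_e, lower.tail = FALSE)
  # Continue to stage 2 (b1_f < Z1 <= b1_e), then reject with the
  # combined statistic w1 * Z1 + w2 * Z2 > b2_e.
  integrand <- function(z1) {
    dnorm(z1) * pnorm((b2_e - w1 * z1) / w2, lower.tail = FALSE)
  }
  stage2 <- integrate(integrand, lower = b1_f, upper = b1_e)$value
  stage1 + stage2
}

# T versus P boundaries and weights from the output above.
alpha_binding <- local_type1_error(2.04659, 0.23336, 2.40505, 0.68710, 0.72656)
# Same efficacy boundaries, but the futility boundary is never applied.
alpha_ignored <- local_type1_error(2.04659, -Inf, 2.40505, 0.68710, 0.72656)
c(binding = alpha_binding, ignored = alpha_ignored)
```

Ignoring the futility boundary always yields the larger of the two values, because trials that would have stopped for futility get a second chance to reject.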
Next, we will optimize a design under a combination of the null and alternative hypotheses.
This is (approximately) the third line in Table 3 from the paper:
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
binding_futility = TRUE,
lambda = 0.9
)
#> Sample sizes (stage 1): T: 227, P: 89, C: 231
#> Sample sizes (stage 2): T: 230, P: 98, C: 213
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05198, Z_TC_e: 2.26340
#> Futility boundaries (stage 1): Z_TP_f: 0.85517, Z_TC_f: 0.77016
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34293, Z_TC_e: 2.06018
#> Inverse normal combination test weights (TP): w1: 0.69370, w2: 0.72026
#> Inverse normal combination test weights (TC): w1: 0.71238, w2: 0.70180
#> Maximum overall sample size: 1088
#> Expected sample size (H1): 771.1
#> Expected sample size (H0): 587.6
#> Expected placebo group sample size (H1): 98.2
#> Expected placebo group sample size (H0): 95.6
#> Objective function value: 758.0
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.18%
#> Probability of futility stop (H0): 92.17%
#> Minimum conditional power: 43.67%
#> Power: 80.15%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
Now we will optimize a design under the alternative while putting an extra penalty on placebo group sample size.
This is (approximately) the fourth line in Table 2 from the paper:
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
binding_futility = TRUE,
kappa = 0.5
)
#> Sample sizes (stage 1): T: 239, P: 75, C: 237
#> Sample sizes (stage 2): T: 211, P: 114, C: 204
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03405, Z_TC_e: 2.25340
#> Futility boundaries (stage 1): Z_TP_f: 0.01742, Z_TC_f: 0.80964
#> Efficacy boundaries (stage 2): Z_TP_e: 2.46906, Z_TC_e: 2.06256
#> Inverse normal combination test weights (TP): w1: 0.65529, w2: 0.75538
#> Inverse normal combination test weights (TC): w1: 0.73076, w2: 0.68263
#> Maximum overall sample size: 1080
#> Expected sample size (H1): 767.5
#> Expected sample size (H0): 624.9
#> Expected placebo group sample size (H1): 89.9
#> Expected placebo group sample size (H0): 90.1
#> Objective function value: 812.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.59%
#> Probability of futility stop (H0): 85.70%
#> Minimum conditional power: 31.96%
#> Power: 80.09%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
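The reported objective function value is consistent with adding kappa times the expected placebo group sample size to the expected sample size, both under the alternative. The formula in the sketch below is inferred from the printed numbers and is an assumption, not a statement about the package internals.

```r
# Values copied from the output above.
asn_h1  <- 767.5  # expected sample size (H1)
asnp_h1 <- 89.9   # expected placebo group sample size (H1)
kappa   <- 0.5

# Assumed form of the penalized objective under the alternative.
objective <- asn_h1 + kappa * asnp_h1
objective
```

The result agrees with the printed objective function value up to the rounding of the inputs.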
Next, we will optimize a design under a combination of the null and alternative hypotheses while including a penalty on the placebo group sample size.
This is (approximately) the seventh line in Table 2 from the paper:
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
binding_futility = TRUE,
lambda = .9,
kappa = 1
)
#> Sample sizes (stage 1): T: 235, P: 71, C: 236
#> Sample sizes (stage 2): T: 222, P: 88, C: 224
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05815, Z_TC_e: 2.26759
#> Futility boundaries (stage 1): Z_TP_f: 0.75006, Z_TC_f: 0.78151
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34303, Z_TC_e: 2.05618
#> Inverse normal combination test weights (TP): w1: 0.67865, w2: 0.73446
#> Inverse normal combination test weights (TC): w1: 0.71693, w2: 0.69714
#> Maximum overall sample size: 1076
#> Expected sample size (H1): 776.6
#> Expected sample size (H0): 584.6
#> Expected placebo group sample size (H1): 83.8
#> Expected placebo group sample size (H0): 77.4
#> Objective function value: 846.6
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.26%
#> Probability of futility stop (H0): 91.74%
#> Minimum conditional power: 40.58%
#> Power: 80.07%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
Now we will optimize a design under the alternative while putting a penalty (eta = 1) on the maximum overall sample size.
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
eta = 1
)
#> Sample sizes (stage 1): T: 224, P: 84, C: 248
#> Sample sizes (stage 2): T: 190, P: 55, C: 167
#> Efficacy boundaries (stage 1): Z_TP_e: 2.25324, Z_TC_e: 2.52099
#> Futility boundaries (stage 1): Z_TP_f: -0.27777, Z_TC_f: -0.06567
#> Efficacy boundaries (stage 2): Z_TP_e: 2.09262, Z_TC_e: 2.00438
#> Inverse normal combination test weights (TP): w1: 0.76715, w2: 0.64146
#> Inverse normal combination test weights (TC): w1: 0.75346, w2: 0.65750
#> Maximum overall sample size: 968
#> Expected sample size (H1): 800.9
#> Expected sample size (H0): 711.8
#> Expected placebo group sample size (H1): 94.2
#> Expected placebo group sample size (H0): 104.3
#> Objective function value: 1768.9
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.49%
#> Probability of futility stop (H1): 1.30%
#> Probability of futility stop (H0): 62.00%
#> Minimum conditional power: 4.54%
#> Power: 80.13%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
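The objective function value above matches the expected sample size under the alternative plus eta times the maximum overall sample size. The helper below generalizes this to both penalty terms. The function `penalized_objective` is hypothetical, and the formula is inferred from the printed output rather than taken from the package source.

```r
# Hypothetical helper: objective under the alternative with optional penalties
# on the expected placebo sample size (kappa) and the maximum sample size (eta).
penalized_objective <- function(asn, asnp = 0, n_max = 0, kappa = 0, eta = 0) {
  asn + kappa * asnp + eta * n_max
}

# Values copied from the output above (eta = 1, no placebo penalty).
obj_eta <- penalized_objective(asn = 800.9, n_max = 968, eta = 1)
obj_eta
```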
The next design enforces a between-stage allocation ratio of one and allows no futility stops:
optimize_design_twostage(
cT2 = 1, # These three boundary conditions enforce a
cP2 = quote(cP1), # between-stage allocation ratio of one.
cC2 = quote(cC1), # The quote() command is necessary for this to work.
bTP1f = -Inf, # These two boundary conditions enforce no futility stops.
bTC1f = -Inf,
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE
)
#> Sample sizes (stage 1): T: 217, P: 87, C: 212
#> Sample sizes (stage 2): T: 217, P: 87, C: 212
#> Efficacy boundaries (stage 1): Z_TP_e: 2.06549, Z_TC_e: 2.28000
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34934, Z_TC_e: 2.10025
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1032
#> Expected sample size (H1): 789.8
#> Expected sample size (H0): 1029.7
#> Expected placebo group sample size (H1): 99.0
#> Expected placebo group sample size (H0): 172.3
#> Objective function value: 789.8
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.09%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
You can replace the default objective function with any quoted
expression. In the following example, we optimize the design parameters
to minimize the expected squared sample size under the alternative
hypothesis. These expressions can make use of internal objects created
in the objective evaluation methods; see the source code of
optimize_design_twostage in the optimization_methods.R file for more
information. ASN, ASNP, n and final_state_probs could be useful objects
for crafting a custom objective function.
optimize_design_twostage(
beta = 0.2,
alternative_TP = 0.4,
alternative_TC = 0,
Delta = 0.2,
print_progress = FALSE,
objective = quote((final_state_probs[["H1"]][["TP1E_TC1E"]] + final_state_probs[["H1"]][["TP1F_TC1F"]]) *
(n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]])^2 +
(final_state_probs[["H1"]][["TP1E_TC12E"]] + final_state_probs[["H1"]][["TP1E_TC12F"]]) *
(n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["C"]])^2 +
(final_state_probs[["H1"]][["TP12F_TC1"]] + final_state_probs[["H1"]][["TP12E_TC12E"]] +
final_state_probs[["H1"]][["TP12E_TC12F"]]) *
(n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["P"]] + n[[2]][["C"]])^2)
)
#> Sample sizes (stage 1): T: 265, P: 86, C: 250
#> Sample sizes (stage 2): T: 157, P: 106, C: 178
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04615, Z_TC_e: 2.29343
#> Futility boundaries (stage 1): Z_TP_f: 0.05568, Z_TC_f: 0.61221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40364, Z_TC_e: 2.06634
#> Inverse normal combination test weights (TP): w1: 0.70160, w2: 0.71257
#> Inverse normal combination test weights (TC): w1: 0.77851, w2: 0.62763
#> Maximum overall sample size: 1042
#> Expected sample size (H1): 776.9
#> Expected sample size (H0): 676.6
#> Expected placebo group sample size (H1): 97.0
#> Expected placebo group sample size (H0): 103.3
#> Objective function value: 636459.1
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.45%
#> Probability of futility stop (H1): 4.93%
#> Probability of futility stop (H0): 82.46%
#> Minimum conditional power: 14.48%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
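A quoted expression is not evaluated when it is written, only once the objects it refers to are available. The standalone sketch below mimics this with `eval()` and made-up stand-ins for `final_state_probs` and `n`. The state names follow the expression above, but the probabilities are purely illustrative, and evaluating via `eval()` on a list is an assumption about the mechanism, not a description of the package internals.

```r
# Made-up stand-ins for the internal objects; the probabilities are
# illustrative only and sum to one.
toy_objects <- list(
  final_state_probs = list(H1 = list(
    TP1E_TC1E = 0.30, TP1F_TC1F = 0.05,
    TP1E_TC12E = 0.25, TP1E_TC12F = 0.05,
    TP12F_TC1 = 0.02, TP12E_TC12E = 0.30, TP12E_TC12F = 0.03
  )),
  n = list(list(T = 265, P = 86, C = 250), list(T = 157, P = 106, C = 178))
)

# Quoted objective: expected squared sample size under H1.
objective <- quote({
  p <- final_state_probs[["H1"]]
  n_stage1 <- n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]]
  n_no_placebo <- n_stage1 + n[[2]][["T"]] + n[[2]][["C"]]
  n_full <- n_no_placebo + n[[2]][["P"]]
  (p[["TP1E_TC1E"]] + p[["TP1F_TC1F"]]) * n_stage1^2 +
    (p[["TP1E_TC12E"]] + p[["TP1E_TC12F"]]) * n_no_placebo^2 +
    (p[["TP12F_TC1"]] + p[["TP12E_TC12E"]] + p[["TP12E_TC12F"]]) * n_full^2
})

# The expression is evaluated with the toy objects as its environment.
expected_squared_n <- eval(objective, toy_objects)
expected_squared_n
```

The three terms correspond to stopping after stage 1, continuing without the placebo arm, and continuing with all three arms.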
Meis, J, Pilz, M, Herrmann, C, Bokelmann, B, Rauch, G, Kieser, M. Optimization of the two-stage group sequential three-arm gold-standard design for non-inferiority trials. Statistics in Medicine. 2023; 42(4): 536–558. doi:10.1002/sim.9630.