--- title: "Bioequivalence Tests for Parallel Trial Designs: 3 Arms, 3 Endpoints" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Bioequivalence Tests for Parallel Trial Designs: 3 Arms, 3 Endpoints} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} bibliography: 'references.bib' link-citations: yes --- ```{r setup, include=FALSE, message = FALSE, warning = FALSE} knitr::opts_chunk$set(echo = TRUE) knitr::opts_chunk$set(comment = "#>", collapse = TRUE) options(rmarkdown.html_vignette.check_title = FALSE) #title of doc does not match vignette title doc.cache <- T #for cran; change to F ``` In the `SimTOST` R package, which is specifically designed for sample size estimation for bioequivalence studies, hypothesis testing is based on the Two One-Sided Tests (TOST) procedure. [@sozu_sample_2015] In TOST, the equivalence test is framed as a comparison between the the null hypothesis of ‘new product is worse by a clinically relevant quantity’ and the alternative hypothesis of ‘difference between products is too small to be clinically relevant’. This vignette focuses on a parallel design, with 3 arms/treatments and 3 primary endpoints. # Introduction Similar to the example presented in [Bioequivalence Tests for Parallel Trial Designs: 2 Arms, 3 Endpoints](sampleSize_parallel_2A3E.html), where equivalence across multiple endpoints was assessed, this vignette extends the framework to trials involving two reference products. This scenario arises when regulators from different regions require comparisons with their locally sourced reference biosimilars. In many studies, the aim is to evaluate equivalence across multiple primary variables. For example, the European Medicines Agency (EMA) recommends demonstrating equivalence for both the Area Under the Curve (AUC) and the maximum concentration (Cmax) when endpoints. In this scenario, we additionally evaluate equivalence across three treatment arms, including a test biosimilar and two reference products, each sourced from a different regulatory region. This vignette demonstrates advanced techniques for calculating sample size in such trials, in which multiple endpoints and multiple reference products are considered. As an illustrative example, we use data from the phase 1 trial [NCT01922336](https://clinicaltrials.gov/study/NCT01922336#study-overview), which assessed pharmacokinetic (PK) endpoints following a single dose of SB2, its EU-sourced reference product (EU-INF), and its US-sourced reference product (US-INF) [@shin_randomized_2015]. ```{r inputdata, echo=FALSE} data <- data.frame("PK measure" = c("AUCinf ($\\mu$g*h/mL)","AUClast ($\\mu$g*h/mL)","Cmax ($\\mu$g/mL)"), "SB2" = c("38,703 $\\pm$ 11,114", "36,862 $\\pm$ 9133", "127.0 $\\pm$ 16.9"), "EU-INF" = c("39,360 $\\pm$ 12,332", "37,022 $\\pm$ 9398", "126.2 $\\pm$ 17.9"), "US-INF" = c("39,270 $\\pm$ 10,064", "37,368 $\\pm$ 8332", "129.2 $\\pm$ 18.8")) kableExtra::kable_styling(kableExtra::kable(data, col.names = c("PK measure", "SB2", "EU-INF", "US-INF"), caption = "Primary PK measures between test and reference product. 
Data represent arithmetic mean +- standard deviation."), bootstrap_options = "striped") ``` # Input Data As in [this vignette](sampleSize_parallel_2A3E.html), the following inputs are required: * Means for each endpoint and treatment arm (`mu_list`), * Standard deviations for each endpoint and arm (`sigma_list`), * Endpoint and arm names (`yname_list` and `arm_names`), * Arms to be compared within each comparator (`list_comparator`), * Endpoints to be compared within each comparator (`list_y_comparator`). In this example, we simultaneously compare the trial drug (SB2) to each reference biosimilar: * SB2 vs. EU-INF * SB2 vs. US-INF We define the required list objects: ```{r} # Mean values for each endpoint and arm mu_list <- list( SB2 = c(AUCinf = 38703, AUClast = 36862, Cmax = 127.0), EUINF = c(AUCinf = 39360, AUClast = 37022, Cmax = 126.2), USINF = c(AUCinf = 39270, AUClast = 37368, Cmax = 129.2) ) # Standard deviation values for each endpoint and arm sigma_list <- list( SB2 = c(AUCinf = 11114, AUClast = 9133, Cmax = 16.9), EUINF = c(AUCinf = 12332, AUClast = 9398, Cmax = 17.9), USINF = c(AUCinf = 10064, AUClast = 8332, Cmax = 18.8) ) ``` # Simultaneous Testing of Independent Co-Primary Endpoints In this analysis, we adopt the **Ratio of Means (ROM)** approach to evaluate bioequivalence across independent co-primary endpoints. ## Different Hypotheses across endpoints In this example, we aim to demonstrate: * Bioequivalence of SB2 vs. EU-INF for AUCinf and Cmax, and * Bioequivalence of SB2 vs. US-INF for AUClast and Cmax. The comparisons are specified as follows: ```{r} # Arms to be compared list_comparator <- list( EMA = c("SB2", "EUINF"), FDA = c("SB2", "USINF") ) # Endpoints to be compared list_y_comparator <- list( EMA = c("AUCinf", "Cmax"), FDA = c("AUClast", "Cmax") ) ``` For all endpoints, bioequivalence is established if the 90% confidence intervals for the ratios of the geometric means fall within the equivalence range of 80.00% to 125.00%. Below we define the equivalence boundaries for each comparison: ```{r} list_lequi.tol <- list( "EMA" = c(AUCinf = 0.8, Cmax = 0.8), "FDA" = c(AUClast = 0.8, Cmax = 0.8) ) list_uequi.tol <- list( "EMA" = c(AUCinf = 1.25, Cmax = 1.25), "FDA" = c(AUClast = 1.25, Cmax = 1.25) ) ``` Here, the `list_comparator` parameter specifies the arms being compared, while `list_lequi.tol` and `list_uequi.tol` define the lower and upper equivalence boundaries for the two endpoints. To calculate the required sample size for testing equivalence under the specified conditions, we use the [sampleSize()](../reference/sampleSize.html) function. This function computes the total sample size needed to achieve the target power while ensuring equivalence criteria are met. 
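
For each endpoint, this criterion can be restated as the usual pair of one-sided (TOST) hypotheses on the ratio of means. Writing $\mu_T$ and $\mu_R$ for the (geometric) means of the test and reference arms (notation introduced here only for illustration), with equivalence limits $\theta_L = 0.80$ and $\theta_U = 1.25$:

$$
H_0:\; \frac{\mu_T}{\mu_R} \le \theta_L \;\text{ or }\; \frac{\mu_T}{\mu_R} \ge \theta_U
\qquad \text{versus} \qquad
H_1:\; \theta_L < \frac{\mu_T}{\mu_R} < \theta_U.
$$

Equivalence for an endpoint is concluded only when both one-sided null hypotheses are rejected at level $\alpha = 0.05$, which corresponds to the $90\%$ (i.e., $1 - 2\alpha$) confidence interval for the ratio lying entirely inside $(\theta_L, \theta_U)$.
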
```{r}
library(SimTOST)

(N_ss <- sampleSize(power = 0.9,                            # Target power
                    alpha = 0.05,                           # Type I error rate
                    mu_list = mu_list,                      # Means for each endpoint and arm
                    sigma_list = sigma_list,                # Standard deviations
                    list_comparator = list_comparator,      # Comparator arms
                    list_y_comparator = list_y_comparator,  # Endpoints to compare
                    list_lequi.tol = list_lequi.tol,        # Lower equivalence boundaries
                    list_uequi.tol = list_uequi.tol,        # Upper equivalence boundaries
                    dtype = "parallel",                     # Trial design type
                    ctype = "ROM",                          # Test type: Ratio of Means
                    vareq = TRUE,                           # Assume equal variances
                    lognorm = TRUE,                         # Log-normal distribution assumption
                    nsim = 1000,                            # Number of stochastic simulations
                    seed = 1234))                           # Random seed for reproducibility
```

A total sample size of `r N_ss$response$n_total` patients (or `r N_ss$response$n_total/3` per arm) is required to demonstrate equivalence.

# Simultaneous Testing of Independent Primary Endpoints

## Equivalence for At Least 2 of the 3 Endpoints with Bonferroni Adjustment

In this example, we aim to establish equivalence for at least two out of three primary endpoints while accounting for multiplicity using the Bonferroni adjustment. The following assumptions are made:

* Equality of Variances: variances are assumed to be equal across groups (`vareq = TRUE`).
* Testing Parameter: the Ratio of Means (ROM) is used as the testing parameter (`ctype = "ROM"`).
* Design: a parallel trial design is assumed (`dtype = "parallel"`).
* Distribution: endpoint data follows a log-normal distribution (`lognorm = TRUE`).
* Correlation: endpoints are assumed to be independent, with no correlation between them (default `rho = 0`).
* Multiplicity Adjustment: the Bonferroni correction is applied to control for Type I error (`adjust = "bon"`).
* Equivalence Criterion: equivalence is required for at least two of the three endpoints (`k = 2`).

The comparisons and equivalence boundaries are defined as follows:

```{r}
# Endpoints to be compared for each comparator
list_y_comparator <- list(
  EMA = c("AUCinf", "AUClast", "Cmax"),
  FDA = c("AUCinf", "AUClast", "Cmax")
)

# Define lower equivalence boundaries for each comparator
list_lequi.tol <- list(
  EMA = c(AUCinf = 0.8, AUClast = 0.8, Cmax = 0.8),
  FDA = c(AUCinf = 0.8, AUClast = 0.8, Cmax = 0.8)
)

# Define upper equivalence boundaries for each comparator
list_uequi.tol <- list(
  EMA = c(AUCinf = 1.25, AUClast = 1.25, Cmax = 1.25),
  FDA = c(AUCinf = 1.25, AUClast = 1.25, Cmax = 1.25)
)
```

Finally, we calculate the required sample size:

```{r}
(N_mp <- sampleSize(power = 0.9,                            # Target power
                    alpha = 0.05,                           # Type I error rate
                    mu_list = mu_list,                      # Means for each endpoint and arm
                    sigma_list = sigma_list,                # Standard deviations
                    list_comparator = list_comparator,      # Comparator arms
                    list_y_comparator = list_y_comparator,  # Endpoints to compare
                    list_lequi.tol = list_lequi.tol,        # Lower equivalence boundaries
                    list_uequi.tol = list_uequi.tol,        # Upper equivalence boundaries
                    k = 2,                                  # Number of endpoints required to demonstrate equivalence
                    adjust = "bon",                         # Bonferroni adjustment for multiple comparisons
                    dtype = "parallel",                     # Trial design type (parallel group)
                    ctype = "ROM",                          # Test type: Ratio of Means
                    vareq = TRUE,                           # Assume equal variances across arms
                    lognorm = TRUE,                         # Log-normal distribution assumption
                    nsim = 1000,                            # Number of stochastic simulations
                    seed = 1234))                           # Random seed for reproducibility
```

## Unequal Allocation Rates Across Arms

In this example, we build upon the previous scenario but introduce unequal allocation rates across treatment arms.
Specifically, we require that the number of patients in the new treatment arm is double the number in each of the reference arms. This can be achieved by specifying the treatment allocation rate parameter (`TAR`). Rates are provided as a vector with one entry per arm, for example `TAR = c(2, 1, 1)`; in the call below, a named vector is used to make the mapping between rates and arms explicit. This ensures that for every two patients assigned to the new treatment arm, one patient is assigned to each reference arm.

```{r}
(N_mp2 <- sampleSize(power = 0.9,                            # Target power
                     alpha = 0.05,                           # Type I error rate
                     mu_list = mu_list,                      # Means for each endpoint and arm
                     sigma_list = sigma_list,                # Standard deviations
                     list_comparator = list_comparator,      # Comparator arms
                     list_y_comparator = list_y_comparator,  # Endpoints to compare
                     list_lequi.tol = list_lequi.tol,        # Lower equivalence boundaries
                     list_uequi.tol = list_uequi.tol,        # Upper equivalence boundaries
                     k = 2,                                  # Number of endpoints required to demonstrate equivalence
                     adjust = "bon",                         # Bonferroni adjustment for multiple comparisons
                     TAR = c("SB2" = 2, "EUINF" = 1, "USINF" = 1), # Treatment allocation rates
                     dtype = "parallel",                     # Trial design type (parallel group)
                     ctype = "ROM",                          # Test type: Ratio of Means
                     vareq = TRUE,                           # Assume equal variances across arms
                     lognorm = TRUE,                         # Log-normal distribution assumption
                     nsim = 1000,                            # Number of stochastic simulations
                     seed = 1234))                           # Random seed for reproducibility
```

Results indicate that `r N_mp2$response$n_SB2` patients are required for the active treatment arm (SB2), and `r N_mp2$response$n_EUINF` patients are required for each reference arm. The total sample size required is `r N_mp2$response$n_total`, which is larger than in the trial with an equal allocation ratio, for which the total sample size was `r N_mp$response$n_total`.

## Accounting for Participant Dropout

In the examples above, the sample size calculations assumed that all enrolled patients complete the trial. In practice, however, a certain percentage of participants drop out, which affects the required sample size. To account for this, we consider a 20% dropout rate across all treatment arms.

```{r}
(N_mp3 <- sampleSize(power = 0.9,                            # Target power
                     alpha = 0.05,                           # Type I error rate
                     mu_list = mu_list,                      # Means for each endpoint and arm
                     sigma_list = sigma_list,                # Standard deviations
                     list_comparator = list_comparator,      # Comparator arms
                     list_y_comparator = list_y_comparator,  # Endpoints to compare
                     list_lequi.tol = list_lequi.tol,        # Lower equivalence boundaries
                     list_uequi.tol = list_uequi.tol,        # Upper equivalence boundaries
                     k = 2,                                  # Number of endpoints required to demonstrate equivalence
                     adjust = "bon",                         # Bonferroni adjustment for multiple comparisons
                     TAR = c("SB2" = 2, "EUINF" = 1, "USINF" = 1), # Treatment allocation rates
                     dropout = c("SB2" = 0.20, "EUINF" = 0.20, "USINF" = 0.20), # Expected dropout rates
                     dtype = "parallel",                     # Trial design type (parallel group)
                     ctype = "ROM",                          # Test type: Ratio of Means
                     vareq = TRUE,                           # Assume equal variances across arms
                     lognorm = TRUE,                         # Log-normal distribution assumption
                     nsim = 1000,                            # Number of stochastic simulations
                     seed = 1234))                           # Random seed for reproducibility
```

Considering a 20% dropout rate, the total sample size required is `r N_mp3$response$n_total`.

# References