11 Scenario X: Chi-Squared Distribution
11.1 Details
In this scenario, we consider the case where the test statistic is Chi-squared distributed. This is useful when a test involves more than two treatment arms, or when a two-sided test with a normally distributed test statistic is conducted. In the first case, we assume binary endpoints, so the data form a \(2 \times k\) contingency table. In the second case, we conduct a two-sided Z-test; since the normal distribution is symmetric around \(0\), this test is equivalent to a one-sided test with test statistic \(Z^2\).
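Concretely, for a standard normal test statistic \(Z\) under the null hypothesis and a two-sided critical value \(c > 0\),
\[
P(|Z| > c) = P(Z^2 > c^2), \qquad Z^2 \sim \chi^2_1 \text{ under } H_0,
\]
so two-sided boundaries on \(Z\) translate into one-sided boundaries on \(Z^2\).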
11.2 Variant X-1: Contingency table with binary endpoints
11.2.1 Setup
Under the alternative, we assume a response rate of \(0.4\) in the first group, \(0.5\) in the second and \(0.6\) in the third group.
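The following sketch illustrates how this setup could be encoded. The constructor name Pearson2xK and the noncentrality formula are assumptions for illustration only; the values of alpha and min_power are inferred from the test output below.

library(adoptr)
alpha     <- 0.025
min_power <- 0.9
rates     <- c(0.4, 0.5, 0.6)  # response rates per arm
p_bar     <- mean(rates)
# per-subject noncentrality of Pearson's chi-squared statistic (assumed formula)
theta     <- sum((rates - p_bar)^2) / (p_bar * (1 - p_bar))
datadist  <- Pearson2xK(n_groups = 3)  # hypothetical constructor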
11.2.3 Initial design
The initial designs for the minimization process are chosen using the built-in function.
tbl_designs <- tibble(
  type    = c("one-stage", "group-sequential", "two-stage"),
  initial = list(
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "one-stage"),
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "group-sequential"),
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "two-stage")))
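The optimal column used in the tests below is produced by the optimization step, which is not shown in this excerpt. A minimal sketch following adoptr's minimize()/subject_to() pattern, assuming opts is an nloptr options list with a maxeval entry and that the priors and objective are named as they are used later in this section:

H_0        <- PointMassPrior(.0, 1)      # point mass at the null
prior      <- PointMassPrior(theta, 1)   # point mass at the alternative
ess        <- ExpectedSampleSize(datadist, prior)
toer_cnstr <- Power(datadist, H_0)   <= alpha
pow_cnstr  <- Power(datadist, prior) >= min_power
tbl_designs <- tbl_designs %>%
  mutate(optimal = purrr::map(
    initial, ~minimize(ess, subject_to(toer_cnstr, pow_cnstr),
                       ., opts = opts)))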
11.2.5 Test Cases
We first verify that the maximum number of iterations was not exceeded in any of the three cases.
tbl_designs %>%
  transmute(
    type,
    iterations = purrr::map_int(tbl_designs$optimal,
                                ~.$nloptr_return$iterations)) %>%
  {print(.); .} %>%
  {testthat::expect_true(all(.$iterations < opts$maxeval))}
## # A tibble: 3 × 2
## type iterations
## <chr> <int>
## 1 one-stage 24
## 2 group-sequential 6727
## 3 two-stage 5422
We then check via simulation that the type I error and power constraints are fulfilled.
tbl_designs %>%
  transmute(
    type,
    toer  = purrr::map(tbl_designs$optimal,
                       ~sim_pr_reject(.[[1]], .0, datadist)$prob),
    power = purrr::map(tbl_designs$optimal,
                       ~sim_pr_reject(.[[1]], theta, datadist)$prob)) %>%
  unnest(cols = c(toer, power)) %>%
  {print(.); .} %>% {
    testthat::expect_true(all(.$toer  <= alpha * (1 + tol)))
    testthat::expect_true(all(.$power >= min_power * (1 - tol)))}
## # A tibble: 3 × 3
## type toer power
## <chr> <dbl> <dbl>
## 1 one-stage 0.0251 0.900
## 2 group-sequential 0.0249 0.900
## 3 two-stage 0.0249 0.899
We expect the sample size function \(n_2\) to be monotonically decreasing.
testthat::expect_true(
  all(diff(
    # get the optimal two-stage design's n2 pivots
    tbl_designs %>%
      filter(type == "two-stage") %>%
      {.[["optimal"]][[1]]$design@n2_pivots}
  ) < 0))
Since the degrees of freedom of the three design classes are ordered as ‘two-stage’ > ‘group-sequential’ > ‘one-stage’, the expected sample sizes under the alternative should be ordered in reverse (‘two-stage’ smallest). Additionally, the expected sample sizes under both the null and the alternative are computed via evaluate() as well as via simulation.
ess0 <- ExpectedSampleSize(datadist, H_0)
tbl_designs %>%
  mutate(
    ess      = map_dbl(optimal, ~evaluate(ess, .$design)),
    ess_sim  = map_dbl(optimal, ~sim_n(.$design, theta, datadist)$n),
    ess0     = map_dbl(optimal, ~evaluate(ess0, .$design)),
    ess0_sim = map_dbl(optimal, ~sim_n(.$design, .0, datadist)$n)) %>%
  {print(.); .} %>% {
    # sim/evaluate same under alternative?
    testthat::expect_equal(.$ess, .$ess_sim, tolerance = tol_n, scale = 1)
    # sim/evaluate same under null?
    testthat::expect_equal(.$ess0, .$ess0_sim, tolerance = tol_n, scale = 1)
    # monotonicity with respect to degrees of freedom
    testthat::expect_true(all(diff(.$ess) < 0))}
## # A tibble: 3 × 7
## type initial optimal ess ess_sim ess0 ess0_sim
## <chr> <list> <list> <dbl> <dbl> <dbl> <dbl>
## 1 one-stage <OnStgDsg> <adptrOpR [3]> 184 184 184 184
## 2 group-sequential <GrpSqntD> <adptrOpR [3]> 157. 157. 154. 154.
## 3 two-stage <TwStgDsg> <adptrOpR [3]> 155. 155. 162. 162.
The expected sample size of the optimized designs under the alternative should be lower than the sample size of the corresponding initial designs.
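A check along these lines could look as follows. This is a sketch, assuming evaluate() accepts the initial design objects directly:

tbl_designs %>%
  mutate(
    ess_init = map_dbl(initial, ~evaluate(ess, .)),
    ess_opt  = map_dbl(optimal, ~evaluate(ess, .$design))) %>%
  {testthat::expect_true(all(.$ess_opt <= .$ess_init))}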
11.3 Variant X-2: Two-sided Z-test
In the second variant, we want to conduct a two-sided test with a normally distributed outcome. Unfortunately, a two-sided test in the setting of two-stage trials is more difficult to implement, as the decision whether to stop early or to continue cannot be expressed in the usual way (\(Z < c_f\) for futility, \(Z > c_e\) for efficacy, \(Z \in [c_f, c_e]\) for continuation). In the two-sided scenario, we would have to consider \(|Z|\): \(|Z| < c_f\) for futility, \(|Z| > c_e\) for efficacy, and \(|Z| \in [c_f, c_e]\) for continuation. On the \(Z\)-scale, this corresponds to four boundaries (\(\pm c_f\) and \(\pm c_e\)), which does not fit the adoptr framework. However, if \(Z \sim \mathcal{N}(0,1)\) under the null, we can use the fact that \(Z^2 \sim \chi^2_1\) to transform the two-sided test problem into a one-sided one: we stop early for futility if \(Z^2 < c_f\), stop early for efficacy if \(Z^2 > c_e\), and continue if \(Z^2 \in [c_f, c_e]\), where \(c_f\) and \(c_e\) are calculated by adoptr using a \(\chi^2\)-distribution.
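As a quick sanity check of this equivalence, the squared two-sided normal critical value equals the corresponding \(\chi^2_1\) quantile:

qnorm(0.975)^2        # 3.841459
qchisq(0.95, df = 1)  # 3.841459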
11.3.1 Setup
Under the alternative, we assume an effect size of \(0.4\). We want the type I error rate to satisfy \(\alpha \leq 0.05\), and the power should be higher than \(0.8\).
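A sketch of this variant's setup; the constructor name ZSquared is an assumption for illustration only.

alpha     <- 0.05
min_power <- 0.8
theta     <- 0.4                         # effect size under the alternative
datadist  <- ZSquared(two_armed = TRUE)  # hypothetical constructor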
11.3.3 Initial designs
The initial designs for the minimization process are chosen using the built-in function.
tbl_designs <- tibble(
  type    = c("one-stage", "group-sequential", "two-stage"),
  initial = list(
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "one-stage"),
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "group-sequential"),
    get_initial_design(theta, alpha, 1 - min_power,
                       dist = datadist, type_design = "two-stage")))
11.3.5 Test
We check that the maximum number of iterations was not exceeded in any of the three cases.
tbl_designs %>%
  transmute(
    type,
    iterations = purrr::map_int(tbl_designs$optimal,
                                ~.$nloptr_return$iterations)) %>%
  {print(.); .} %>%
  {testthat::expect_true(all(.$iterations < opts$maxeval))}
## # A tibble: 3 × 2
## type iterations
## <chr> <int>
## 1 one-stage 24
## 2 group-sequential 2563
## 3 two-stage 2383
We now verify the type I error and power constraints. To gain additional insight, we do not simulate the \(\chi^2\)-distributed test statistic directly; instead, we simulate normally distributed outcomes and square the resulting Z-statistics.
N <- 10^6
simulation <- function(design, theta, dist) {
  # dist is unused here; it is kept to mirror the signature of sim_pr_reject()
  count <- 0
  n_1   <- n1(design)  # stage-one sample size per group
  for (i in 1:N) {
    # stage one: simulate per-group outcomes and square the Z-statistic
    treatment <- rnorm(n_1, mean = theta)
    control   <- rnorm(n_1)
    Z_square  <- (sqrt(n_1 / 2) * (mean(treatment) - mean(control)))^2
    if (Z_square > design@c1e) {
      # early stop for efficacy
      count <- count + 1
    } else if (Z_square >= design@c1f) {
      # continuation region: recruit stage two and test again
      n_2        <- n2(design, Z_square)
      treatment2 <- rnorm(n_2, mean = theta)
      control2   <- rnorm(n_2)
      Z_final    <- (sqrt(n_2 / 2) * (mean(treatment2) - mean(control2)))^2
      if (Z_final > c2(design, Z_square)) {
        count <- count + 1
      }
    }
  }
  return(list(prob = count / N))
}
tbl_designs %>%
  transmute(
    type,
    toer  = purrr::map(tbl_designs$optimal,
                       ~simulation(.[[1]], .0, datadist)$prob),
    power = purrr::map(tbl_designs$optimal,
                       ~simulation(.[[1]], .4, datadist)$prob)) %>%
  unnest(cols = c(toer, power)) %>%
  {print(.); .} %>% {
    testthat::expect_true(all(.$toer  <= alpha * (1 + tol)))
    testthat::expect_true(all(.$power >= min_power * (1 - tol)))}
## # A tibble: 3 × 3
## type toer power
## <chr> <dbl> <dbl>
## 1 one-stage 0.0499 0.799
## 2 group-sequential 0.0501 0.798
## 3 two-stage 0.0499 0.798
The \(n_2\) function should be monotonically decreasing.
testthat::expect_true(
  all(diff(
    # get the optimal two-stage design's n2 pivots
    tbl_designs %>%
      filter(type == "two-stage") %>%
      {.[["optimal"]][[1]]$design@n2_pivots}
  ) < 0))
Since the degrees of freedom of the three design classes are ordered as ‘two-stage’ > ‘group-sequential’ > ‘one-stage’, the expected sample sizes under the alternative should be ordered in reverse (‘two-stage’ smallest). Additionally, the expected sample sizes under both the null and the alternative are computed via evaluate() as well as via simulation.
ess0 <- ExpectedSampleSize(datadist, H_0)
tbl_designs %>%
  mutate(
    ess      = map_dbl(optimal, ~evaluate(ess, .$design)),
    ess_sim  = map_dbl(optimal, ~sim_n(.$design, theta, datadist)$n),
    ess0     = map_dbl(optimal, ~evaluate(ess0, .$design)),
    ess0_sim = map_dbl(optimal, ~sim_n(.$design, .0, datadist)$n)) %>%
  {print(.); .} %>% {
    # sim/evaluate same under alternative?
    testthat::expect_equal(.$ess, .$ess_sim, tolerance = tol_n, scale = 1)
    # sim/evaluate same under null?
    testthat::expect_equal(.$ess0, .$ess0_sim, tolerance = tol_n, scale = 1)
    # monotonicity with respect to degrees of freedom
    testthat::expect_true(all(diff(.$ess) < 0))}
## # A tibble: 3 × 7
## type initial optimal ess ess_sim ess0 ess0_sim
## <chr> <list> <list> <dbl> <dbl> <dbl> <dbl>
## 1 one-stage <OnStgDsg> <adptrOpR [3]> 98 98 98 98
## 2 group-sequential <GrpSqntD> <adptrOpR [3]> 87.2 87.2 82.0 82.0
## 3 two-stage <TwStgDsg> <adptrOpR [3]> 87.1 87.1 83.2 83.2
As in Variant X-1, the optimized designs should have a lower expected sample size under the alternative than the corresponding initial designs.