11 Scenario X: Chi-Squared Distribution

11.1 Details

In this scenario, the test statistic follows a chi-squared distribution. This is useful for trials with more than two treatment arms and for two-sided tests based on a normally distributed test statistic. In the first case, we assume binary endpoints, which gives a \(2 \times k\) contingency table. In the second case, we conduct a two-sided Z-test. Because the standard normal distribution is symmetric around \(0\), this test is equivalent to a one-sided test with test statistic \(Z^2\).
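To make the first use case concrete, the following minimal sketch simulates a \(2 \times 3\) contingency table with binary endpoints and applies Pearson's chi-squared test from base R. The per-arm sample size `n_per_group`, the seed, and the response rates are illustrative assumptions and are not prescribed by adoptr. The sketch only shows that a \(2 \times k\) table leads to \(k - 1\) degrees of freedom.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

rates       <- c(0.4, 0.5, 0.6)  # assumed response rates per arm
n_per_group <- 100               # assumed sample size per arm (illustrative)

# Number of responders per arm
responders <- rbinom(length(rates), size = n_per_group, prob = rates)

# 2 x k table: row 1 = responders, row 2 = non-responders
tab <- rbind(responder     = responders,
             non_responder = n_per_group - responders)

# Pearson's chi-squared test of homogeneity across the k arms
res <- chisq.test(tab)
df  <- unname(res$parameter)  # degrees of freedom: (2 - 1) * (k - 1)
print(df)
```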

11.2 Variant X-1: Contingency table with binary endpoints

11.2.1 Setup

Under the alternative, we assume a response rate of \(0.4\) in the first group, \(0.5\) in the second and \(0.6\) in the third group.

rate_vec <- c(0.4, 0.5, 0.6)
theta <- get_tau_Pearson2xK(rate_vec)
datadist <- Pearson2xK(3)
H_0 <- PointMassPrior(.0, 1)
prior <- PointMassPrior(theta, 1)

alpha <- 0.025
min_power <- 0.9
toer_cnstr <- Power(datadist, H_0) <= alpha
pow_cnstr <- Power(datadist,prior) >= min_power
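As a rough cross-check of this setup, the sketch below computes a fixed-sample size in base R. It assumes equal allocation and a non-centrality parameter of \(n \cdot \tau\) with \(\tau = \sum_i (p_i - \bar{p})^2 / (\bar{p}(1 - \bar{p}))\), where \(n\) is the per-arm sample size. The helper `tau_2xk` is hypothetical and is not claimed to reproduce the internals of `get_tau_Pearson2xK`.

```r
# Hypothetical helper (not part of adoptr): standardized effect for a 2 x k
# table, assuming equal allocation and non-centrality parameter n * tau
tau_2xk <- function(p) {
  p_bar <- mean(p)
  sum((p - p_bar)^2) / (p_bar * (1 - p_bar))
}

rates     <- c(0.4, 0.5, 0.6)
alpha     <- 0.025
min_power <- 0.9
df        <- length(rates) - 1

tau  <- tau_2xk(rates)
crit <- qchisq(1 - alpha, df = df)  # rejection threshold under the null

# Power of the fixed-sample test with n observations per arm
power_fixed <- function(n) {
  pchisq(crit, df = df, ncp = n * tau, lower.tail = FALSE)
}

# Smallest per-arm sample size on the grid that reaches the target power
n_grid  <- 1:1000
n_fixed <- n_grid[which(power_fixed(n_grid) >= min_power)[1]]
print(n_fixed)
```

Under these assumptions, `n_fixed` can be compared with the sample size of the optimal one-stage design reported in the test cases of this variant.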

11.2.2 Objective

The expected sample size under the alternative shall be minimized.

ess <- ExpectedSampleSize(datadist, prior)

11.2.3 Initial design

The initial designs for the minimization process are chosen using the built-in function get_initial_design().

tbl_designs <- tibble(
    type    = c("one-stage", "group-sequential", "two-stage"),
    initial = list(
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "one-stage"),
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "group-sequential"),
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "two-stage")))

11.2.4 Optimization

tbl_designs <- tbl_designs %>% 
    mutate(
       optimal = purrr::map(initial, ~minimize(
         
          ess,
          subject_to(
              toer_cnstr,
              pow_cnstr
          ),
          
          initial_design = ., 
          opts           = opts)) )

11.2.5 Test Cases

We first verify that the maximum number of iterations was not exceeded in any of the three cases.

tbl_designs %>% 
  transmute(
      type, 
      iterations = purrr::map_int(tbl_designs$optimal, 
                                  ~.$nloptr_return$iterations) ) %>%
  {print(.); .} %>% 
  {testthat::expect_true(all(.$iterations < opts$maxeval))}
## # A tibble: 3 × 2
##   type             iterations
##   <chr>                 <int>
## 1 one-stage                24
## 2 group-sequential       6727
## 3 two-stage              5422

We then check via simulation that the type I error and power constraints are fulfilled.

tbl_designs %>% 
  transmute(
      type, 
      toer  = purrr::map(tbl_designs$optimal, 
                         ~sim_pr_reject(.[[1]], .0, datadist)$prob), 
      power = purrr::map(tbl_designs$optimal, 
                         ~sim_pr_reject(.[[1]], theta , datadist)$prob) ) %>% 
  unnest(., cols = c(toer, power)) %>% 
  {print(.); .} %>% {
  testthat::expect_true(all(.$toer  <= alpha * (1 + tol)))
  testthat::expect_true(all(.$power >= min_power * (1 - tol))) }
## # A tibble: 3 × 3
##   type               toer power
##   <chr>             <dbl> <dbl>
## 1 one-stage        0.0251 0.900
## 2 group-sequential 0.0249 0.900
## 3 two-stage        0.0249 0.899

We expect the sample size function \(n_2\) to be monotonically decreasing.

testthat::expect_true(
    all(diff(
        # get optimal two-stage design n2 pivots
        tbl_designs %>% filter(type == "two-stage") %>%
           {.[["optimal"]][[1]]$design@n2_pivots} 
        ) < 0) )

Since the degrees of freedom of the three design classes are ordered as ‘two-stage’ > ‘group-sequential’ > ‘one-stage’, the expected sample sizes (under the alternative) should be ordered in reverse (‘two-stage’ smallest). Here, degrees of freedom refers to the flexibility of the design class, not to the \(\chi^2\)-distribution. Additionally, the expected sample sizes under both the null and the alternative are computed via evaluate() and via simulation, and the two results are compared.

ess0 <- ExpectedSampleSize(datadist, H_0)

tbl_designs %>% 
    mutate(
        ess      = map_dbl(optimal,
                           ~evaluate(ess, .$design) ),
        ess_sim  = map_dbl(optimal,
                           ~sim_n(.$design, theta, datadist)$n ),
        ess0     = map_dbl(optimal,
                           ~evaluate(ess0, .$design) ),
        ess0_sim = map_dbl(optimal,
                           ~sim_n(.$design, .0, datadist)$n ) ) %>% 
    {print(.); .} %>% {
    # sim/evaluate same under alternative?
    testthat::expect_equal(.$ess, .$ess_sim, 
                           tolerance = tol_n,
                           scale = 1)
    # sim/evaluate same under null?
    testthat::expect_equal(.$ess0, .$ess0_sim, 
                           tolerance = tol_n,
                           scale = 1)
    # monotonicity with respect to degrees of freedom
    testthat::expect_true(all(diff(.$ess) < 0)) }
## # A tibble: 3 × 7
##   type             initial    optimal          ess ess_sim  ess0 ess0_sim
##   <chr>            <list>     <list>         <dbl>   <dbl> <dbl>    <dbl>
## 1 one-stage        <OnStgDsg> <adptrOpR [3]>  184     184   184      184 
## 2 group-sequential <GrpSqntD> <adptrOpR [3]>  157.    157.  154.     154.
## 3 two-stage        <TwStgDsg> <adptrOpR [3]>  155.    155.  162.     162.

The expected sample size under the alternative of the optimized one-stage design should not exceed that of the corresponding initial design.

testthat::expect_lte(
  evaluate(ess, 
             tbl_designs %>% 
                dplyr::pull(optimal) %>% 
                .[[1]]  %>%
                .$design ),
    evaluate(ess, 
             tbl_designs %>% 
                dplyr::pull(initial) %>% 
                .[[1]] ) )

11.3 Variant X-2: Two-sided Z-test

In the second variant, we conduct a two-sided test with normally distributed outcomes. A two-sided test is more difficult to implement for two-stage trials, because the decision to stop early or to continue can no longer be expressed in the usual way (\(Z < c_f\) for futility, \(Z > c_e\) for efficacy, \(Z \in [c_f, c_e]\) for continuation). Instead, we would need to consider \(|Z|\): \(|Z| < c_f\) for futility, \(|Z| > c_e\) for efficacy, and \(|Z| \in [c_f, c_e]\) for continuation. Expressed in terms of \(Z\), this requires \(4\) boundaries (\(\pm c_f\) and \(\pm c_e\)), which does not fit the adoptr framework. If \(Z \sim \mathcal{N}(0,1)\) under the null, however, \(Z^2\) follows a \(\chi^2\)-distribution with one degree of freedom, so the two-sided test problem can be transformed into a one-sided one. Hence, we stop early for futility if \(Z^2 < c_f\), stop early for efficacy if \(Z^2 > c_e\), and continue if \(Z^2 \in [c_f, c_e]\), where \(c_f\) and \(c_e\) are calculated by adoptr using a \(\chi^2\)-distribution.
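The equivalence between the two-sided Z-test and the one-sided test on \(Z^2\) can be illustrated with base R alone. The following minimal sketch is independent of adoptr; the seed and the number of draws are arbitrary assumptions.

```r
alpha <- 0.05

crit_z    <- qnorm(1 - alpha / 2)       # two-sided Z-test: reject if |Z| > crit_z
crit_chi2 <- qchisq(1 - alpha, df = 1)  # one-sided test: reject if Z^2 > crit_chi2

set.seed(1)      # arbitrary seed
z <- rnorm(1e5)  # test statistics simulated under the null

reject_two_sided <- abs(z) > crit_z
reject_chi2      <- z^2 > crit_chi2

# The squared normal threshold matches the chi-squared one
# (up to floating-point error)
print(c(crit_z^2, crit_chi2))

# Both decision rules give the same empirical rejection rate, close to alpha
print(c(mean(reject_two_sided), mean(reject_chi2)))
```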

11.3.1 Setup

Under the alternative, we assume an effect size of \(0.4\). We require a type I error rate of \(\alpha \leq 0.05\) and a power of at least \(0.8\).

theta <- get_tau_ZSquared(0.4, 1)
datadist <- ZSquared(two_armed = TRUE)
H_0 <- PointMassPrior(.0, 1)
prior <- PointMassPrior(theta, 1)

alpha <- 0.05
min_power <- 0.8
toer_cnstr <- Power(datadist, H_0) <= alpha
pow_cnstr <- Power(datadist,prior) >= min_power

11.3.2 Objective

We minimize the expected sample size under the alternative.

ess <- ExpectedSampleSize(datadist, prior)

11.3.3 Initial designs

The initial designs for the minimization process are chosen using the built-in function get_initial_design().

tbl_designs <- tibble(
    type    = c("one-stage", "group-sequential", "two-stage"),
    initial = list(
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "one-stage"),
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "group-sequential"),
        get_initial_design(theta, alpha, 1- min_power, 
                           dist = datadist, type_design = "two-stage")))

11.3.4 Optimization

tbl_designs <- tbl_designs %>% 
    mutate(
       optimal = purrr::map(initial, ~minimize(
         
          ess,
          subject_to(
              toer_cnstr,
              pow_cnstr
          ),
          
          initial_design = ., 
          opts           = opts)) )

11.3.5 Test Cases

We check that the maximum number of iterations was not exceeded in any of the three cases.

tbl_designs %>% 
  transmute(
      type, 
      iterations = purrr::map_int(tbl_designs$optimal, 
                                  ~.$nloptr_return$iterations) ) %>%
  {print(.); .} %>% 
  {testthat::expect_true(all(.$iterations < opts$maxeval))}
## # A tibble: 3 × 2
##   type             iterations
##   <chr>                 <int>
## 1 one-stage                24
## 2 group-sequential       2563
## 3 two-stage              2383

We now verify the type I error and power constraints. To gain more insight, we do not simulate the test statistic directly. Instead, we simulate normally distributed outcomes, compute the Z statistic from them, and square it.

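N <- 10^6
# Simulate the rejection probability from normally distributed outcomes.
# `theta` is the mean difference between the two arms (unit variance);
# `dist` is not used inside the function.
simulation <- function(design, theta, dist){
    count <- 0
    n_1 <- n1(design)  # per-group stage-one sample size, identical in every run
    for(i in 1:N){
        treatment <- rnorm(n_1, mean = theta)
        control <- rnorm(n_1)
    
        # squared stage-one Z statistic
        Z_square <- (sqrt(n_1 / 2) * (mean(treatment) - mean(control)))^2
    
        # early stopping for efficacy
        if(Z_square > design@c1e){
            count <- count + 1
        }
    
        # continuation: draw stage-two data and test the squared stage-two statistic
        if(Z_square >= design@c1f && Z_square <= design@c1e){
            n_2 <- n2(design, Z_square)
            treatment2 <- rnorm(n_2, mean = theta)
            control2 <- rnorm(n_2)
    
            Z_final <- (sqrt(n_2 / 2) * (mean(treatment2) - mean(control2)))^2
            if(Z_final > c2(design, Z_square)){
                count <- count + 1
            }
        }
    }
    return(list(
        prob = count / N))
}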


tbl_designs %>% 
  transmute(
      type, 
      toer  = purrr::map(tbl_designs$optimal, 
                         ~simulation(.[[1]], .0, datadist)$prob), 
      power = purrr::map(tbl_designs$optimal, 
                         ~simulation(.[[1]], .4, datadist)$prob) ) %>% 
  unnest(., cols = c(toer, power)) %>% 
  {print(.); .} %>% {
  testthat::expect_true(all(.$toer  <= alpha * (1 + tol)))
  testthat::expect_true(all(.$power >= min_power * (1 - tol))) }
## # A tibble: 3 × 3
##   type               toer power
##   <chr>             <dbl> <dbl>
## 1 one-stage        0.0499 0.799
## 2 group-sequential 0.0501 0.798
## 3 two-stage        0.0499 0.798
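The simulated power of the one-stage design can be cross-checked analytically in base R. The sketch below assumes a per-group sample size of 98 (the value reported for the one-stage design in the expected-sample-size table of this variant), unit variance, and a non-centrality parameter of \(n/2 \cdot 0.4^2\). These are assumptions made for illustration, not output of adoptr.

```r
alpha  <- 0.05
effect <- 0.4  # assumed standardized mean difference (sigma = 1)
n      <- 98   # assumed per-group sample size of the one-stage design

ncp  <- n / 2 * effect^2           # non-centrality parameter of Z^2
crit <- qchisq(1 - alpha, df = 1)  # rejection threshold for Z^2

# Power via the non-central chi-squared distribution
power_chi2 <- pchisq(crit, df = 1, ncp = ncp, lower.tail = FALSE)

# Same quantity via the two tails of the normally distributed Z
shift        <- sqrt(ncp)
power_normal <- pnorm(sqrt(crit), mean = shift, lower.tail = FALSE) +
                pnorm(-sqrt(crit), mean = shift)

print(c(power_chi2, power_normal))
```

Under these assumptions, both values lie slightly below \(0.8\). This is consistent with the simulated power being accepted only within the relative tolerance `tol`.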

The \(n_2\) function should be monotonically decreasing.

testthat::expect_true(
    all(diff(
        # get optimal two-stage design n2 pivots
        tbl_designs %>% filter(type == "two-stage") %>%
           {.[["optimal"]][[1]]$design@n2_pivots} 
        ) < 0) )

Since the degrees of freedom of the three design classes are ordered as ‘two-stage’ > ‘group-sequential’ > ‘one-stage’, the expected sample sizes (under the alternative) should be ordered in reverse (‘two-stage’ smallest). Here, degrees of freedom refers to the flexibility of the design class, not to the \(\chi^2\)-distribution. Additionally, the expected sample sizes under both the null and the alternative are computed via evaluate() and via simulation, and the two results are compared.

ess0 <- ExpectedSampleSize(datadist, H_0)

tbl_designs %>% 
    mutate(
        ess      = map_dbl(optimal,
                           ~evaluate(ess, .$design) ),
        ess_sim  = map_dbl(optimal,
                           ~sim_n(.$design, theta, datadist)$n ),
        ess0     = map_dbl(optimal,
                           ~evaluate(ess0, .$design) ),
        ess0_sim = map_dbl(optimal,
                           ~sim_n(.$design, .0, datadist)$n ) ) %>% 
    {print(.); .} %>% {
    # sim/evaluate same under alternative?
    testthat::expect_equal(.$ess, .$ess_sim, 
                           tolerance = tol_n,
                           scale = 1)
    # sim/evaluate same under null?
    testthat::expect_equal(.$ess0, .$ess0_sim, 
                           tolerance = tol_n,
                           scale = 1)
    # monotonicity with respect to degrees of freedom
    testthat::expect_true(all(diff(.$ess) < 0)) }
## # A tibble: 3 × 7
##   type             initial    optimal          ess ess_sim  ess0 ess0_sim
##   <chr>            <list>     <list>         <dbl>   <dbl> <dbl>    <dbl>
## 1 one-stage        <OnStgDsg> <adptrOpR [3]>  98      98    98       98  
## 2 group-sequential <GrpSqntD> <adptrOpR [3]>  87.2    87.2  82.0     82.0
## 3 two-stage        <TwStgDsg> <adptrOpR [3]>  87.1    87.1  83.2     83.2

The expected sample size under the alternative of the optimized one-stage design should not exceed that of the corresponding initial design.

testthat::expect_lte(
  evaluate(ess, 
             tbl_designs %>% 
                dplyr::pull(optimal) %>% 
                .[[1]]  %>%
                .$design ),
    evaluate(ess, 
             tbl_designs %>% 
                dplyr::pull(initial) %>% 
                .[[1]] ) )