GKM
December 12, 2024
A clinical trial comparing a new blood pressure lowering drug A with standard medication B is planned. It is assumed that the blood pressure reduction is normally distributed with standard deviation \(\sigma = 20\).
The research team assumes that A lowers blood pressure on average by 20 mmHg, whereas B lowers it on average by only 10 mmHg.
Propose a sample size for this study at level \(\alpha = 0.025\) and power 80%.
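A minimal sketch of the calculation (taking \(\alpha = 0.025\) as one-sided, consistent with the designs below, and using the two-sample normal approximation \(n = 2(z_{1-\alpha}+z_{1-\beta})^2\sigma^2/\delta^2\) per group):

```r
# Two-sample z-approximation for the required sample size per group
alpha <- 0.025; power <- 0.80; sigma <- 20; delta <- 10
n_per_group <- 2 * (qnorm(1 - alpha) + qnorm(power))^2 * sigma^2 / delta^2
ceiling(n_per_group)  # approx. 63 per group

# Base R's power.t.test gives a slightly larger n (t distribution instead of z)
power.t.test(delta = delta, sd = sigma, sig.level = alpha,
             power = power, alternative = "one.sided")
```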
The statistician pointedly asks whether the standard deviation might be \(\sigma = 25\), and whether the difference in blood pressure lowering might even be only 5 mmHg (which would still be clinically relevant). So the team might be overoptimistic.
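Under these more pessimistic assumptions, the same rough calculation (again a z-approximation sketch) shows how sensitive the fixed sample size is:

```r
# Same z-approximation under the pessimistic assumptions (sigma = 25, delta = 5)
n_pessimistic <- 2 * (qnorm(1 - 0.025) + qnorm(0.80))^2 * 25^2 / 5^2
ceiling(n_pessimistic)  # approx. 393 per group instead of approx. 63
```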
What can be proposed?
Caution
No possibility to adjust in case of
- Over- or underestimation of effect size
- Over- or underestimation of variability
Caution
No early stopping of the trial
Caution
No possibility to adjust in case of
- Over- or underestimation of effect size
- Over- or underestimation of variability
Possible
Interim looks to assess stopping the trial early either for success, futility or harm
Caveat
Don’t fix the subsequent sample sizes in a “data driven” way. This could lead to a serious inflation of the Type I error rate. The effects of not taking this into account are described in, e.g., Proschan, Follmann and Waclawiw (1992).
Caveat
Furthermore, you have to fix the design parameters (e.g., the shape of the decision boundaries, the test statistic to be used, the hypothesis to be tested) prior to the experiment. These cannot be changed during the course of the trial.
Possible
Possibility to adjust in case of
- Over- or underestimation of effect size
- Over- or underestimation of variability
Possible
Interim looks to assess stopping the trial early either for success, futility or harm
… and much more
Pocock and O’Brien & Fleming designs
Wang and Tsiatis \(\Delta\)-class: boundaries \(u_k \propto k^{\Delta-0.5}\). O’Brien and Fleming: \(\Delta = 0\); Pocock: \(\Delta = 0.5\)
rpact: Sequential analysis with a maximum of 5 looks (group sequential design)
Wang & Tsiatis Delta class design (deltaWT = 0.25), one-sided overall significance level 2.5%, power 80%, undefined endpoint, inflation factor 1.0718, ASN H1 0.7868, ASN H01 0.9982, ASN H0 1.0651.
| Stage | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Planned information rate | 20% | 40% | 60% | 80% | 100% |
| Cumulative alpha spent | 0.0007 | 0.0041 | 0.0098 | 0.0170 | 0.0250 |
| Stage levels (one-sided) | 0.0007 | 0.0036 | 0.0076 | 0.0120 | 0.0163 |
| Efficacy boundary (z-value scale) | 3.194 | 2.686 | 2.427 | 2.259 | 2.136 |
| Cumulative power | 0.0289 | 0.2017 | 0.4447 | 0.6544 | 0.8000 |
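The design summarised above can be reproduced along the following lines (a sketch assuming rpact's `getDesignGroupSequential()` / `getDesignCharacteristics()` interface; argument names and printed values may vary slightly with the package version):

```r
library(rpact)

# 5-look Wang & Tsiatis design, Delta = 0.25, one-sided alpha = 0.025, power 80%
design <- getDesignGroupSequential(kMax = 5, alpha = 0.025, beta = 0.2,
                                   sided = 1, typeOfDesign = "WT",
                                   deltaWT = 0.25)
design$criticalValues            # efficacy boundaries on the z-value scale
design$stageLevels               # one-sided stage levels
getDesignCharacteristics(design) # inflation factor, ASN under H0 / H01 / H1
```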
Many other designs
Also: examples of \(\alpha\)-spending functions.
\(\alpha_1^*\) and \(\alpha_2^*\) approximate O’Brien and Fleming’s and Pocock’s designs, respectively; \(\alpha_3^*(\varrho)\) is plotted for \(\varrho = 1.0\), \(1.5\), and \(2.0\); \(\alpha = 0.05\).
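For reference, a plausible reading of this notation (assuming the standard Lan–DeMets and Kim–DeMets spending functions; the exact definitions in the original figure may differ) is
\[
\alpha_1^*(t) = 2 - 2\,\Phi\!\left(\frac{\Phi^{-1}(1-\alpha/2)}{\sqrt{t}}\right) \quad \text{(O'Brien--Fleming type)},
\]
\[
\alpha_2^*(t) = \alpha \,\ln\bigl(1 + (e-1)\,t\bigr) \quad \text{(Pocock type)},
\qquad
\alpha_3^*(t;\varrho) = \alpha\, t^{\varrho} \quad \text{(Kim--DeMets power family)},
\]
where \(t \in (0,1]\) is the information fraction and each function spends the full level at the end, \(\alpha_j^*(1) = \alpha\).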
“Confirmatory adaptive” means:
Planning of subsequent stages can be based on information observed so far, under control of an overall Type I error rate.
Definition
A confirmatory adaptive design is a multi-stage clinical trial design that uses accumulating data to decide how to modify design aspects without compromising its validity and integrity.
How to construct a test that controls the Type I error?
Proposals
Bauer and Köhne (Biometrics, 1994): Combination of \(p\,\)-values with a specific combination function (Bauer, 1989)
Proschan and Hunsberger (Biometrics, 1995): Specification of a conditional error function.




Stopping boundaries and combination functions have to be laid down a priori!
For the latter three, in general, multiple hypothesis testing applies, and a closed testing procedure can be used in order to control the experimentwise error rate in a strong sense.
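As a concrete illustration of the combination-of-\(p\)-values idea, here is a minimal sketch of Fisher's product criterion as used by Bauer and Köhne (ignoring the early stopping boundaries that a full two-stage design would also prespecify); the function name and example \(p\)-values are made up for illustration:

```r
# Fisher's product combination of two independent stage-wise p-values:
# under H0, -2 * log(p1 * p2) follows a chi-squared distribution with 4 df,
# so H0 is rejected if p1 * p2 <= exp(-0.5 * qchisq(1 - alpha, df = 4)).
fisher_combination <- function(p1, p2, alpha = 0.025) {
  c_alpha <- exp(-0.5 * qchisq(1 - alpha, df = 4))  # approx. 0.0038 for alpha = 0.025
  list(product = p1 * p2, critical_value = c_alpha, reject = p1 * p2 <= c_alpha)
}

# Neither stage is significant on its own at 0.025, but the combination rejects:
fisher_combination(p1 = 0.08, p2 = 0.02)
```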
- Conduct phase II trial as internal part of a combined trial
- Plan phase III trial based on data from phase II part
- Conduct phase III trial as internal part of the same trial
- Demonstrate efficacy with data from phase III + II part
The proposed adaptive procedure fulfills the regulatory requirements for the analysis of adaptive trials as it strongly controls the prespecified multiple Type I error rate (strong control of familywise error rate).
Multiple Type I error rate
Probability to reject at least one true null hypothesis.
(Probability to declare at least one ineffective treatment as effective).
Strong control of multiple Type I error rate
Regardless of the number of true null hypotheses (ineffective treatments): \[\text{Multiple Type I error rate }\le \alpha\]
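To make the closed testing idea concrete, a minimal sketch for two elementary hypotheses, using a Bonferroni test of the intersection (function name and \(p\)-values are illustrative only):

```r
# Closed testing for H1, H2: reject H_i at familywise level alpha only if both
# H_i and the intersection hypothesis H1 ∩ H2 are rejected at local level alpha.
closed_test_two_hypotheses <- function(p1, p2, alpha = 0.025) {
  p12 <- min(2 * min(p1, p2), 1)  # Bonferroni test of the intersection H1 ∩ H2
  list(reject_H1 = p12 <= alpha && p1 <= alpha,
       reject_H2 = p12 <= alpha && p2 <= alpha)
}

# H1 can be rejected at the familywise level, H2 cannot:
closed_test_two_hypotheses(p1 = 0.01, p2 = 0.20)
```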
Multi-Arm Multi-Stage (MAMS) Designs
None of these criticisms is sustainable
Adaptive designs per se seem to be accepted by regulatory agencies as long as a detailed plan is provided, e.g., prospectively written standard operating procedures are required.
Do not use too many interim analyses.
Do not perform interim analyses for efficacy too early.
Guidance documents advise against operational bias, e.g., the treatment effect may be deduced from knowledge of an adaptive decision. The sponsor has to take care of that!
Support study design through comprehensive simulation reports.
Careful application, but acceptance in principle
See also:
P. Bauer, F. Bretz, V. Dragalin, F. König, and G. Wassmer. Twenty-five years of confirmatory adaptive designs: opportunities and pitfalls. Featured Article in Statistics in Medicine 35, 325-347, 2016. http://dx.doi.org/10.1002/sim.6472 (Open Access)
With invited discussion by Hung, Wang and Lawrence; Mehta and Liu; Vollmar; Maurer


