Abstract
Large randomized trials provide the highest level of clinical evidence. However, enrolling large numbers of randomized patients across numerous study sites is expensive and often takes years. There will never be enough conventional clinical trials to address the important questions in medicine. Efficient alternatives to conventional randomized trials that preserve protections against bias and confounding are thus of considerable interest. A common feature of novel trial designs is that they are pragmatic and facilitate enrollment of large numbers of patients at modest cost. This article presents trial designs including cluster designs, real-time automated enrollment, and practitioner-preference approaches; describes various adaptive designs that improve trial efficiency; and, finally, discusses the advantages of embedding randomized trials within registries.
Maintaining the health and well-being of populations is a universal priority. Healthcare is among the largest employers and consumers of resources in our global economy. However, it is also clear that the current growth in healthcare expenditures is unsustainable, already being approximately 10% of the economy in the United Kingdom, Australia, and Canada—and 17.8% in the United States.1 Governments around the globe are therefore seeking greater efficiency and better outcomes.
The highest level of clinical evidence is generally thought to be large randomized trials or a systematic review of several large trials.2 Large trials generally provide reliable results because they limit confounding and, when blinded, also limit measurement bias. Large multicenter trials typically enroll patients in diverse study settings and are therefore also generalizable (fig. 1).3–6 Trial results are more likely to be rapidly implemented when generalizability is high and important patient-centered outcomes are included.6,7
Fig. 1. Trade-offs between explanatory trials (can it work in ideal settings?) and pragmatic trials (does it work in the real world?).
The difficulty is that enrolling large numbers of randomized patients across numerous study sites is expensive and often takes years. There will never be sufficient resources dedicated to conventional clinical trials to address the important questions in medicine. The cost, time, and difficulties of conventional trials have led to questions about their viability and a search for alternative approaches.8,9 Consequently, there are calls for improved efficiencies in medical research10 that have led to growing interest in novel trial designs8,11,12 such as low-cost, large, pragmatic randomized trials.13,14
Perhaps the most obvious alternatives to randomized trials are case-control or cohort studies based on “big data” registries with thousands to tens of millions of patient records. Large registries, combined with sophisticated analysis strategies such as propensity scoring, now provide valuable information, and for rare conditions or outcomes, they may be the only workable approach.15,16 However, even with the best analyses, observational data will always be subject to unknown degrees of bias and confounding that diminish the reliability of the analyses and confidence in the conclusions.
Fig. 1 (panel summary): large number of participants, few exclusions; real-world setting(s); patient-centered outcome(s); practical results, applicable to your own practice.
Efficient alternatives to conventional randomized trials that preserve protections against bias and confounding are thus of considerable interest. Among the most useful are pragmatic trials conducted in “real-world” settings that enhance their external validity and clinical value, sometimes referred to as effectiveness trials. A common feature of novel trial designs is that they are pragmatic and facilitate enrollment of large numbers of patients at modest cost. However, as with all aspects of trial design, there are important trade-offs to alternative designs that may or may not be worthwhile in various circumstances (table 1). Generally, parallel-group randomized and blinded trials should be considered the default, with alternative designs being adopted only when there are compelling reasons.
Cluster Trials
Systematic healthcare-related interventions such as implementation of clinical pathways or electronic records cannot be turned on or off for individual patients, which precludes conventional randomized trial designs. Such interventions are thus often evaluated using before-and-after designs. However, that approach is weak because it suffers from three major sources of unquantifiable error, each of which makes interventions appear beneficial even if they are not.
The first error is time-dependent confounding, which results because healthcare outcomes generally improve over time consequent to multiple small changes, many of which are unrecognized or poorly quantified. For example, concern about surgical site infections may prompt more frequent handwashing, better skin preparation, glove changes at various stages of surgery, restrictions on operating room traffic, air conditioning filter changes, etc. Because many factors change over time, there is no basis for attributing all improvement to a specific intervention. The second error is the Hawthorne effect, which refers to improvements that result from focusing attention on a particular outcome. The third error is regression to the mean, which occurs when an intervention is prompted by a high incidence of the outcome. However, the observed high incidence, say of infection, may simply be a random variation that will return to the mean level with or without intervention. It is consequently unreasonable to attribute observed benefit to a single intervention, for example a change in the warming system, because the improvement may well have occurred without any intervention whatsoever. For additional details about these sources of error, see recent reviews.17–20
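Regression to the mean is easy to demonstrate numerically. The following minimal simulation, added here purely for illustration (the Poisson process and trigger threshold are arbitrary), generates monthly infection counts from a process whose true mean never changes, "intervenes" after unusually bad months, and shows that the following months look better anyway:

```python
import numpy as np

rng = np.random.default_rng(42)

# A stable process: monthly infection counts with a fixed true mean of 10.
counts = rng.poisson(lam=10, size=10_000)

# Suppose an "intervention" is triggered whenever a month looks alarmingly
# bad (16 or more infections)...
trigger = counts[:-1] >= 16

# ...and compare the triggering months to the months that follow them.
print(f"Mean count in triggering months: {counts[:-1][trigger].mean():.1f}")
print(f"Mean count in following months:  {counts[1:][trigger].mean():.1f}")
# The apparent "improvement" is pure regression to the mean: the underlying
# process never changed.
```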
Cluster designs eliminate many of the problems inherent in before-and-after studies. They are defined by exposure being allocated to groups of subjects. Typically, groups are defined by hospital or unit (fig. 2). Cluster approaches are especially useful for systematic changes such as implementation of rapid response teams that cannot be allocated individually or easily reversed. Cluster trials can be randomized or controlled without randomization and some involve crossover periods or sequential exposure. In this section, we will present just a few of many approaches to cluster trials.
Fig. 2. Cluster designs comparing treatment A (yellow) and treatment B (green). The clusters may be randomized and treated in parallel (A) or in a crossover design that is best randomly assigned but could simply be alternated (nonrandomly) according to local logistics and patient numbers (B), or in steps to grow a “wedge” of active sites in which all clusters receive the experimental treatment (C).
Parallel Group Randomized Cluster Trials
Most conventional cluster trials simultaneously randomly assign groups of patients to one of two or more exposures. They thus resemble conventional parallel-group individually randomized trials, except that the randomization is on the basis of units rather than individuals. As with conventional trials, crossover designs can be incorporated into cluster designs. Clusters can be entire hospitals,21 units within a hospital,22 or even patients under the care of single physicians.23 In each case, however, all patients in a particular cluster are given the same treatment (fig. 2A); such trials are thus normally conducted with waived consent.
An advantage of cluster randomized trials without crossover is that they reduce the Hawthorne effect and largely eliminate learning that is a common feature of other designs including conventional parallel-group randomized trials. For example, consider a conventional randomized trial of guided fluid management for prevention of intraoperative hypotension. Individual clinicians may quickly learn from cases in which fluid administration is guided that they normally give too much or too little fluid. To the extent that they adjust their practice in patients not randomized to guided fluid management, the difference between the treatment groups will diminish—perhaps to the point where it is no longer possible to demonstrate a real benefit of guided management. In contrast, there would be little opportunity for clinicians to learn from one treatment and apply the information to the other condition in a cluster trial because each site/clinician would be exposed only to one treatment.
Sample-size estimation for all types of cluster trials is complicated but depends importantly on the number of clusters. Enrolling many patients within a cluster, which is usually relatively easy, does not substitute for having many clusters.24 The major challenge is thus usually establishing a sufficient number of clusters.
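A common way to quantify this is the design effect, which inflates an individually randomized sample size by 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intracluster correlation. The short sketch below (with illustrative values only) shows why adding clusters helps far more than adding patients within clusters:

```python
import math

def cluster_sample_size(n_individual: int, cluster_size: int, icc: float) -> int:
    """Inflate an individually randomized sample size by the design effect,
    1 + (cluster_size - 1) * ICC."""
    design_effect = 1 + (cluster_size - 1) * icc
    return math.ceil(n_individual * design_effect)

# A trial that would need 1,000 individually randomized patients,
# with an intracluster correlation (ICC) of 0.02:
for m in (20, 100, 200):  # patients per cluster
    n = cluster_sample_size(1_000, cluster_size=m, icc=0.02)
    print(f"cluster size {m:>3}: {n:>5} patients in {math.ceil(n / m):>3} clusters")
```

Enlarging clusters from 100 to 200 patients nearly doubles the required number of patients while saving only a few clusters, which is why the number of clusters, rather than patients per cluster, dominates power.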
Randomized Stepped Wedge Trials
The term “stepped wedge” refers to the fact that at the beginning of the trial, the experimental treatment is used at none of the study sites. One randomly selected site then implements the experimental treatment. After a suitable interval, perhaps several months, another randomly selected site converts to the experimental treatment (fig. 2C). The process continues until all study sites are active, thus producing a growing “wedge” of active sites.25–27 An advantage of the approach is that all trial units eventually convert to the experimental treatment (such as a specialized stroke service) that might even be nationally mandated. In contrast, the experimental treatment is only available to half the units during the trial period with a parallel-group cluster approach.
For example, the TRACE (routine posTsuRgical Anesthesia visit to improve patient outComE) trial is a prospective, multicenter, stepped-wedge, cluster-randomized trial being conducted in The Netherlands to evaluate routine postoperative visits by an anesthesiologist to reduce the risk of postoperative complications.25 All hospitals start simultaneously with a control phase in which standard care is provided. Sequentially, in a randomized order, hospitals cross over to the intervention phase (routine visits). The trial is currently underway and is recruiting 5,600 adult patients at high risk of postoperative complications (trial No. NTR5506).
With some loss of rigor, nonrandomized stepped wedge designs can also provide value. These might be used when investigators cannot reliably control when interventions are initiated at various sites. For example, switching to electronic records is a complex process that most hospitals will not initiate at a particular time to please an investigator. However, to the extent that electronic records are implemented on a pseudorandom basis (say based on funding availability), investigators could reasonably analyze their effects in a group of hospitals as they switch over time. Stepped wedge studies require complex statistical methodology, especially with nonrandomized designs that need to be adjusted for potential confounding factors.28
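As a concrete illustration of the allocation logic, the sketch below (our own minimal example; site names and period counts are hypothetical) randomizes the order in which sites cross over and prints the resulting wedge:

```python
import random

def stepped_wedge_schedule(sites, n_periods, seed=7):
    """Randomize the order in which sites cross over to the intervention.

    Returns {site: [0/1 per period]}, where 0 = control and 1 = intervention.
    One site crosses over at each step after a shared baseline period.
    """
    order = list(sites)
    random.Random(seed).shuffle(order)
    return {site: [0] * step + [1] * (n_periods - step)
            for step, site in enumerate(order, start=1)}

for site, row in stepped_wedge_schedule(["A", "B", "C", "D"], n_periods=5).items():
    print(site, row)
```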
Alternating or Randomized Multiple Crossover Cluster Controlled Trials
A recently developed pragmatic approach is an alternating cluster controlled trial. Effectively, these are cluster trials with interventions distributed in time rather than in space. In these trials, an intervention is implemented for a limited period, say 2 weeks, and then removed for a comparable amount of time. That 1-month block might be considered a before-and-after study with all the serious limitations inherent to that approach. However, the key element of alternating cluster trials is to repeat the on/off cycle enough times to wash out time-dependent confounding from background improvements in healthcare and regression to the mean (fig. 2B). The Hawthorne effect is also limited because the observation period includes times with and without intervention, rather than just the period after an intervention. Furthermore, over a typical study period of a year or more, the trial becomes part of the clinical routine rather than a one-off focus on particular exposures and complications.
A defining aspect of alternating intervention cluster trials is that entire hospital units are studied and that the intervention of interest is applied to clusters of all or no patients within a given unit during the designated periods. For example, an alternating cluster trial might include all patients in a set of operating rooms or a particular surgical ward. A corollary is that consent is usually waived because exposure within a given unit is based only on time period rather than patient characteristics or consent.
Alternating cluster trials are controlled because exposure allocation is not determined by patient or physician preference. However, they are not randomized. (The order of the exposure periods could be randomized, although doing so adds little value to the trial.) Unless surgeons specifically schedule patients to particular time periods, exposure will effectively be random. In practice, this approach provides most benefits of randomization but without the cost and difficulty.29
Limitations of the approach include a lack of concealed allocation and lack of blinding. Additional limitations include clinician learning during intervention periods that improves care during nonintervention (or alternate intervention) periods. How likely such “contamination” is depends on the intervention(s) and the extent to which they can be controlled by investigators. Alternating cluster trials are thus best for interventions that truly are considered comparable by clinical teams and where the outcomes are objective and not exposed to detection bias—preferably recorded electronically.
Alternating cluster trials work best when the exposure can be tolerated by nearly all patients because it is impractical to enforce inclusion and exclusion criteria. (Enrolling nearly everyone does not preclude restricting analysis to an a priori defined population, nor does it prevent clinicians from using alternative approaches when an exposure may be suboptimal for a particular patient.) A consequence of broad enrollment is excellent generalizability. It is also possible to perform two or more simultaneous alternating cohort studies in the same unit(s) if the interventions and outcomes do not conflict. Similarly, the approach works perfectly well for factorial trials; for example, see trial No. NCT03657368, which uses an alternating cluster approach to factorially evaluate two different tidal volumes and two different positive end-expiratory pressure settings.
Because alternating cluster trials enroll large numbers of patients, baseline factors are inevitably well balanced. Statistical analysis can thus be as simple as chi-square or t tests. However, because the trials occur over a fairly long period, it is generally prudent to include time in a multivariable model and then estimate the independent effect of the intervention.
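For example, a time-adjusted analysis might look like the following sketch, which simulates an alternating cluster trial with a secular trend and a true treatment benefit, then fits a logistic model with a period term (the simulated data and effect sizes are illustrative only):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated alternating cluster trial: 24 two-week periods, 200 patients each,
# with a slow secular improvement and a true treatment benefit.
period = np.repeat(np.arange(24), 200)
treated = (period % 2).astype(int)           # exposure alternates by period
xb = -2.0 - 0.01 * period - 0.3 * treated    # log-odds of the outcome
outcome = rng.binomial(1, 1 / (1 + np.exp(-xb)))

df = pd.DataFrame({"outcome": outcome, "treated": treated, "period": period})

# Including calendar time prevents the secular trend from being attributed
# to the intervention.
fit = smf.logit("outcome ~ treated + period", data=df).fit(disp=False)
print(fit.summary2().tables[1].loc[["treated"]])
```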
In recent years, alternating cluster designs have been used to compare isoflurane versus sevoflurane on the duration of hospitalization (n = 1,584),30 30% versus 80% intraoperative oxygen on infection and healing-related complications (n = 5,749),31 and normal saline versus balanced salt solutions in emergency departments (n = 13,347),32 operating rooms (NCT02565420, n = 8,600), and critical care units (n = 15,802).33 Each was characterized by rapid enrollment, and the trials were far less expensive than a conventional approach with individual patient randomization.
Multiple crossover cluster trials can also be randomized: that is, instead of simply alternating treatment allocations, they can be assigned randomly. Because patients rarely end up in particular clusters because of trial allocation, random assignment is probably less important than for conventional trials. However, when practical, it is the preferred approach. Perhaps more importantly, allocation should be blinded when possible to avoid selection and measurement biases. Blinding can only be maintained with random allocation, which is a compelling reason to randomize rather than alternate treatment interventions.
For example, consider a cluster trial of supplemental operating room air filtration and sterilization on surgical site infections. The treatment can be blinded by internally deactivating the supplemental filters. Combined with randomized allocation periods in each operating room (the clusters), blinding will prevent measurement bias.
Real-time Automated Enrollment and Randomization
There are conditions in which immediate treatment is necessary, and it is not possible or practical to obtain consent. The most obvious examples are sudden and unpredictable emergencies such as cardiac arrest or major trauma. For a time, concerns about subject autonomy prevented most such research. However, it was obvious that avoiding research in highly lethal emergency situations was in no one’s best interest. Nearly all regulatory systems now therefore permit research in such conditions with more or less scrutiny and oversight. For example, many critical care trials have relied upon a deferred consent model, whereby next-of-kin may be informed of the research (but usually cannot legally consent) and given an opportunity to opt out, and surviving patients are later asked to provide consent for their data to be included in the trial.34–36
There are other, perhaps less obvious, situations that preclude obtaining prior consent. Consider, for example, almost any unexpected intraoperative event such as anaphylaxis (1 event per 677 cases37), severe airway problems, or serious hypotension. Because it would be impractical or impossible to obtain consent from all surgical patients in expectation of randomizing rare qualifying patients, waived consent or modified consent is necessary to study such conditions. Outright waiver is often appropriate when the test interventions are low risk or perhaps likely to be helpful compared with routine care. Alternatively, an institutional review board might request a modified consent, such as providing information in advance with the ability to opt out, or requesting a posteriori consent from qualifying subjects and including their data in analysis only with approval.
A recent example of real-time automated enrollment and randomization was a trial that evaluated an alert for clinicians about intraoperative hypotension.38 More than 14,500 operations were screened in real time to identify 3,955 surgeries during which systolic blood pressure was less than 80 mmHg for 3 consecutive min. Thus, nearly four patients would have been asked for consent with a conventional approach to identify each who had a qualifying hypotensive episode. When episodes were identified, patients were randomized by a decision-support computer to either no alert or to alerts warning of hypotension that were posted to the electronic record screen and to the pagers of the in-room clinician and attending anesthesiologist. There was no required response to the alerts. The primary outcome was the duration of hypotension.
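The enrollment logic of such a trial can be summarized in a few lines. The sketch below is schematic rather than the trial's actual software: `post_alert` is a hypothetical stand-in for the institution's decision-support interface, and blood pressure readings are assumed to arrive once per minute.

```python
import random
from collections import deque

SBP_THRESHOLD = 80   # mmHg
QUALIFYING_MINUTES = 3

def monitor_case(sbp_readings, case_id):
    """Watch one case's per-minute systolic pressures and randomize on the
    first qualifying episode (SBP < 80 mmHg for 3 consecutive minutes)."""
    window = deque(maxlen=QUALIFYING_MINUTES)
    for sbp in sbp_readings:
        window.append(sbp < SBP_THRESHOLD)
        if len(window) == QUALIFYING_MINUTES and all(window):
            arm = random.choice(["alert", "no_alert"])
            if arm == "alert":
                post_alert(case_id)   # hypothetical decision-support call
            return arm
    return None  # never qualified; the patient is not enrolled

def post_alert(case_id):
    print(f"Case {case_id}: hypotension alert posted to record and pagers")

print(monitor_case([95, 88, 78, 76, 74, 90], case_id="OR-12"))
```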
In another example, 36,670 patients were electronically screened in real time to identify 7,569 who experienced triple low events (a combination of mean arterial pressure of 75 mmHg or less, minimum alveolar fraction of less than 0.8, and Bispectral Index of less than 45).39 In this case, nearly five patients would have been asked to consent to identify each who had a qualifying episode, which would have made a conventional randomized trial impossible. Patients who experienced qualifying episodes were enrolled and randomized in real time by a decision-support computer to no alert or to alerts warning: “A triple low (MAP [mean arterial pressure], MAC [minimum alveolar concentration], and BIS [Bispectral Index]) condition has been detected. Consider hemodynamic support.” Again, there was no required response to the alerts. The primary outcome was 90-day mortality.40
Both trials were considered ethical, and consent was waived by the Cleveland Clinic (Cleveland, Ohio) institutional review board because alerts were unlikely to have been harmful, were possibly helpful, and because the trial would be impractical without waived consent. In both cases, the alerts proved to be unhelpful but also harmless. Real-time automated enrollment and randomization is a useful trial design for evaluating responses to relatively uncommon intraoperative events where it would be impractical to obtain consent from enough patients in advance.
Practice Preference Randomization
Variations in “standard” clinical care are widespread. For example, some sites routinely use midazolam to prevent intraoperative awareness during cardiac surgery, whereas others rarely do. Variations are most apparent across countries but can also be found within a single country or city. Natural clinical variation can be harnessed in a novel trial design termed practice-preference randomization.41 Practice-preference trials are a variation of “play-the-winner” rules used in some adaptive designs42,43 but without a dynamic component.44,45 In this case, the “winner” is an unequal allocation favoring existing practice (fig. 3). The approach is being used in a trial evaluating high-dose dexamethasone in cardiac surgery, which includes sites in Australia (that rarely use high-dose dexamethasone) and The Netherlands (where high-dose dexamethasone is used routinely).41
Fig. 3. Practice preference randomized consent trial. First, trial sites or individual physicians are clustered into groups according to their current practice. Next, eligible patients are randomly assigned in an unequal ratio, say 2:1 favoring current practice for each site. Finally, enrolled patients are approached for consent, but only those in the nonstandard care arm at each site. ITT, intention to treat.
There are three steps in practice-preference trials. Initially, trial sites or individual physicians are clustered into groups according to their current practice routine. The next step is to randomly assign eligible patients in an unequal ratio, say 2:1 favoring current practice for each site. Therefore most patients at a given site receive standard treatment for that site. The third step is to enroll patients, obtaining consent only for patients in the nonstandard arm at each site.
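A minimal allocation sketch follows; the site names and the 2:1 ratio are illustrative, mirroring the dexamethasone example above:

```python
import random

ARMS = ["high_dose", "standard"]
SITE_PRACTICE = {"site_NL": "high_dose", "site_AU": "standard"}  # hypothetical

def allocate(site, rng):
    """Assign 2:1 in favor of the site's current practice."""
    preferred = SITE_PRACTICE[site]
    other = next(a for a in ARMS if a != preferred)
    return rng.choices([preferred, other], weights=[2, 1])[0]

rng = random.Random(1)
draws = [allocate("site_NL", rng) for _ in range(9_000)]
print(draws.count("high_dose"), "high dose vs", draws.count("standard"), "standard")
# Only patients allocated to the arm that is NOT standard at their site are
# approached for consent.
```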
These trials are efficient because only a fraction, say a third, of the patients are approached for consent at each site. However, because at least several sites with differing definitions of routine care participate, overall enrollment is roughly balanced across the entire trial. Because clinician preference is respected, trial engagement is likely to be enhanced. A requirement for waived consent in the standard-care group is that trial-related measurements be routine or at least minimally disturbing to participants.
Baseline imbalance and confounding are more likely with practice-preference than conventional randomized designs.46 Selection bias is also possible because patients are approached for consent after random assignment to nonroutine care, whereas in a conventional trial consent is obtained from all patients before randomization. Statistical methods should be used to evaluate and, as necessary, adjust for both types of error.
Adaptive Designs
Because large clinical trials usually take years to complete, it is common for new and relevant information to be published during the conduct of a trial. Similarly, important information may accrue during a trial and be detected at interim analyses. New knowledge sometimes suggests that aspects of an ongoing trial should be modified. For example, it may be necessary or advisable to alter enrollment criteria, drug dose, or which data are collected. It may also be necessary to recalculate the required sample size if the original assumptions prove incorrect.
Conventional clinical trials often include prespecified thresholds that determine whether a trial should be stopped or continued, which is perhaps the simplest type of adaptive design. Better, though, is the ability to systematically review accruing data and alter the protocol as necessary to reduce participant risk and enhance the clinical value of the results. Adaptive trial designs may include preplanned decision rules that permit changes to study population, assignment ratio, sample size, or study drug administration or dose (fig. 4).12 Adaptive designs are likely safer for participants because patients at special risk can be excluded based on new information, and ineffective or excessive doses can be corrected. Furthermore, funding agencies and investigators are more likely to get return on their investments because adaptive trials are likely to achieve meaningful results with fewer patients in a shorter timeframe than conventional approaches.
Fig. 4. Adaptive trial design. In this example, there are four planned treatments being compared (treatments A, B, C, and D) with placebo and potentially with each other. At defined time points there is an interim analysis using accrued data to calculate a conditional probability of effect (efficacy, adverse effects) for each treatment. Any treatment with low probability of benefit or worrying probability of harm leads to a decision to stop that treatment or adjust the dose. If new possible treatments become apparent during the conduct of the trial (e.g., treatment E), they can be added in at a later stage.
Although generally efficient, adaptive designs introduce considerable complexities. For example, statistical analyses of adaptive trials need to account for multiple testing because of the frequent interim analyses and confounding caused by the baseline imbalance consequent to small numbers of participants and temporal trends.47 Statistical simulations are often necessary, and Bayesian approaches (combining prior probability with observed results to estimate effect size) are sometimes used because prior information informs interpretation of sparse accruing data.48 Another important factor is that adaptive options should be preplanned and incorporated into an a priori protocol with appropriate statistical considerations. Unplanned protocol or analysis changes are often necessary, but they should be accurately presented as post hoc decisions rather than masked as “adaptive designs.”
Altering the Study Population
New information from other similar trials or the results of interim analyses often suggest that enrollment criteria should be modified. (A common enrollment change is to broaden criteria when trial enrollment is slower than expected, but this is a trivial case of adaptive design.) For example, new information might identify a subpopulation at especially high risk of toxicity. Simple prudence would suggest that patients at special risk be excluded, or at least be monitored especially carefully if the treatment is likely to be beneficial overall.
Similarly, subpopulations might be identified that apparently receive little or no benefit from treatment or require a higher-than-typical drug dose to achieve comparable efficacy. Optimally designed protocols allow modifications that incorporate such obviously relevant information. It might also be necessary to increase trial size to provide sufficient power within clinically important subgroups. Adaptive designs increase the likelihood of the treatment being shown to be effective overall and of identifying subpopulations with substantively different responses.
Changing Treatment Group Assignment Ratios
Response-adaptive randomization refers to modifying random assignment ratios of multigroup trials when interim analyses suggest differential benefit. Consider, for example, a three-arm trial with an initial assignment ratio of 1:1:1 for treatments X, Y, and Z. At the first interim analysis, X and Y appear to provide more benefit than Z, even if the assessment is hardly definitive given that enrollment is so far modest.
Nonetheless, a reasonable approach, when permitted by the original protocol, is to change the assignment ratio to “play the winner.” In this case a higher proportion of patients are assigned to an apparently more successful treatment. For example, the randomization might be altered to assign X, Y, and Z in a 2:2:1 ratio. Subsequent enrollment will therefore favor X and Y, thus providing extra precision about those treatments. Treatment Z may turn out to be comparable, inferior, or superior, but there is logic in directing resources to the treatments that look best. A similar approach can be used for trials that test multiple doses of a single drug. Random assignment favoring apparently more effective treatments is ethically appealing because it helps maintain equipoise throughout the trial as information accumulates.
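One simple way to implement such a rule is to map interim success rates to integer assignment weights, as in the sketch below. This is a rule-of-thumb illustration rather than a validated response-adaptive algorithm; real trials prespecify the exact updating rule in the protocol.

```python
def updated_ratio(successes, patients, floor=1, scale=4):
    """Map interim success rates to integer assignment weights.

    Arms with higher observed success rates receive proportionally more
    weight; a floor keeps every arm open until evidence is definitive.
    """
    rates = [s / n for s, n in zip(successes, patients)]
    best = max(rates)
    return [max(floor, round(scale * r / best)) for r in rates]

# Interim look: X and Y appear to outperform Z.
print(updated_ratio(successes=[30, 28, 15], patients=[100, 100, 100]))
# -> [4, 4, 2], i.e., the 2:2:1 allocation described above.
```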
Changing Sample Size
Sample size for formal trials is based on estimates of population variance (for continuous outcomes) or baseline incidence (for dichotomous outcomes), along with the treatment effect. Treatment effect, that is the true difference between the treatment groups, is the most important determinant of sample size, with the required number of participants increasing roughly with the inverse square of the treatment effect as it diminishes. The difficulty, of course, is that the purpose of proposed trials is to determine the treatment effect, so investigators perforce do not know what it will be during the design phase.
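The standard two-group formula for a continuous outcome, n per group = 2(z₁₋α/₂ + z_power)²σ²/δ², makes the inverse-square relationship explicit, as the short calculation below illustrates (the values are arbitrary):

```python
import math
from scipy.stats import norm

def n_per_group(delta, sd, alpha=0.05, power=0.9):
    """Standard two-group sample size for a continuous outcome."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (z * sd / delta) ** 2)

for delta in (10, 5, 2.5):
    print(f"treatment effect {delta:>4}: {n_per_group(delta, sd=20):>5} per group")
# Halving the anticipated effect roughly quadruples the required sample size.
```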
Investigators should also consider what treatment effect is likely to be clinically meaningful, and that threshold should be specified in advance. In some cases, such as mortality, most any improvement would be considered meaningful. However, for trials evaluating process measures and mediators, statistically significant differences may not be clinically important. In such cases, statistically significant differences should be considered negative results when their magnitude does not reach a previously specified threshold.
Skilled investigators consider various strategies for increasing treatment effect such as enriching the population, using large exposures (e.g., high drug doses), and selecting continuous or composite outcomes. However, within a given trial design, an almost overwhelming temptation is to anticipate a treatment effect large enough to make the trial practical with respect to available subjects, time constraints, and especially available funding. The difficulty, of course, is that the true treatment effect is determined by biology, not the investigators’ various constraints. It is thus common to complete planned enrollment and find a point estimate for treatment effect that is physiologically important but not quite statistically significant. Arguably, this is one of the worst outcomes for a trial because the investigators cannot claim a statistically significant benefit, but nor can they claim that there is no clinically important effect.
One cause for insufficient power is baseline variance or incidence being smaller than anticipated. That is relatively easy to deal with because sample size can be reestimated without statistical penalty on the basis of variance or incidence across the entire study population.49,50 Prudent investigators thus monitor outcome incidence and variance across the study population (without distinguishing among groups) and reestimate the sample size if necessary.51,52 A more complicated situation arises when trial results are uninterpretable because treatment effect is smaller than anticipated. There are at least two ways to reduce that risk, both being variations of group-sequential designs.
One way to reduce the risk of insufficient power consequent to a smaller-than-anticipated treatment effect is to overpower the trial while including many interim analyses with stopping rules that allow enrollment to stop early if benefit or futility is clearly demonstrated. Typically, results of interim analyses are evaluated by an independent executive committee or data and safety monitoring board, and even then usually blinded to allocation (that is, on a “group A” vs. “group B” basis). If benefit is as anticipated or stronger, the trial will stop early at lower or comparable cost compared with a trial without interim analyses. However, if the treatment effect proves to be smaller than anticipated, the original design allows enrollment to continue, possibly to the point of clearly identifying benefit or futility. A second way to reduce the risk of underpowering trials is to specify in the original protocol that sample size will be recalculated at a given point or at specified intervals based on (usually blinded) predefined outcomes of interest. Reestimation of the maximum sample size based on observed treatment effects is complex, and it can be challenging to avoid both type 1 and type 2 statistical errors.53
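Blinded variance-based reestimation is straightforward in principle: keep the assumed treatment effect fixed and recompute the sample size from the pooled standard deviation observed at the interim look. A minimal sketch, reusing the standard formula above with illustrative numbers:

```python
import math
from scipy.stats import norm

def reestimated_n(pooled_sd, assumed_delta, alpha=0.05, power=0.9):
    """Recompute the per-group size from the blinded pooled SD at an interim
    look, keeping the originally assumed treatment effect fixed."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return math.ceil(2 * (z * pooled_sd / assumed_delta) ** 2)

# The design assumed SD = 20; the blinded interim data show SD = 26.
print("planned:", reestimated_n(20, assumed_delta=5))  # 337 per group
print("revised:", reestimated_n(26, assumed_delta=5))  # 569 per group
```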
Changing Study Interventions: Platform Trials
Evaluation of new drugs is a costly process. Phase II drug development studies, for example, often evaluate several doses in an effort to determine an optimal one for testing in a phase III trial. It is hardly unusual for interim analyses to identify unexpected intolerance or toxicity or differential efficacy in subgroups. Many promising drugs fail early-phase testing because initial studies were unreliable; other drugs are found to be ineffective or harmful only because the wrong study population or dose was used or the sample size provided inadequate statistical power. Adaptive designs reduce risk by introducing flexibility into the evaluation process while maintaining statistical rigor.54
An extension of the adaptive design concept is to establish an overarching platform protocol that permits introduction of additional treatments, for some treatments to be dropped, and for additional therapeutic questions to be tested without a set end date.55,56 The master protocol defines one or more target populations with a common system for patient selection, study procedures, process and outcome templates, and data management.
The approach is efficient because platform trials allow for concurrent or sequential evaluations of multiple treatments within a single drug, condition, or surgical population.55,56 These have a common control arm and many different treatment arms that are included or removed from the trial as futility or efficacy is demonstrated, often according to Bayesian decision rules.1 Platform trials therefore avoid the need to design separate trials for each drug/treatment evaluation and to separately obtain ethical and regulatory approvals. An additional advantage is that accumulated infrastructure, procedures, experience, and knowledge are valuable and can roll over from one treatment to another within an overarching platform trial. The same general approach can be extended to cohort designs.57
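As an illustration of a Bayesian decision rule of the kind such trials use, the sketch below computes the posterior probability that each arm's response rate beats the control's and then applies futility and success thresholds (the priors, thresholds, and data are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def prob_beats_control(s_arm, n_arm, s_ctl, n_ctl, draws=100_000):
    """Posterior probability that an arm's response rate exceeds control's,
    with independent Beta(1, 1) priors on each rate."""
    arm = rng.beta(1 + s_arm, 1 + n_arm - s_arm, draws)
    ctl = rng.beta(1 + s_ctl, 1 + n_ctl - s_ctl, draws)
    return (arm > ctl).mean()

FUTILITY, SUCCESS = 0.10, 0.99
control = (20, 100)  # 20 responses among 100 control patients

for name, (s, n) in {"A": (32, 100), "B": (21, 100), "C": (12, 100)}.items():
    p = prob_beats_control(s, n, *control)
    verdict = "drop" if p < FUTILITY else "graduate" if p > SUCCESS else "continue"
    print(f"treatment {name}: P(better than control) = {p:.3f} -> {verdict}")
```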
Subtypes of the platform design, also using master protocols and currently most often used in oncology, include basket and umbrella trials.58,59 A basket trial evaluates a specific treatment in a variety of diseases (e.g., different types of cancer) defined by a particular molecular marker. An umbrella trial evaluates multiple treatments in one or more diseases, defined by an expectation that each included disease state could have a beneficial and similar response to any particular treatment. Here the set of diseases (e.g., tumor types) is the “umbrella.”
Platform, basket, and umbrella trials share the common characteristic of addressing multiple questions in a single study program, maximizing efficiency by obtaining more information in a shorter time. Adaptive designs in general, and especially platform, basket, and umbrella trials, are statistically challenging and require complex sample-size estimates and analyses. Professional statistical collaboration is essential.
Embedding Randomized Trials within Registries
Randomized trials are rightly considered to provide the most reliable evidence of treatment effects, but trial entry criteria and stringent methodologies can limit generalizability. Pragmatic randomized trials include a broader range of patients, often in diverse healthcare settings, thereby offering greater generalizability. Even so, there are often groups of patients (e.g., those with cognitive impairment or language difficulty) or healthcare settings (e.g., low resource areas) that are not well represented. Large amounts of data kept in electronic medical records and various registries are typically available for research. Clinical information systems and electronic health records can facilitate routine collection of process and outcome data to support low-cost clinical research.60 To the extent that necessary outcome data are available from registries, trials can be conducted simply by controlling exposure allocation.61
Introduction of random assignment into a clinical registry combines the strengths of a large pragmatic trial while costing much less than a typical trial, which requires individual participant evaluation, completion of case-report forms, and subsequent transfer of information to the trial database (fig. 5).62 Registry-based randomized trials represent the epitome of comparative effectiveness research: generalizability of the findings, rapid enrollment, high completion, and often longer-term follow-up. The mechanics of randomization will depend on the context but will most often be similar to conventional randomized trials such as a central web-based system. However, alternative types of randomization or pseudorandomization can also be used when appropriate.
Registry-based randomized trials nonetheless present important challenges including broad capture to avoid exclusion bias, variable quality data in registries, and ethical concerns about lack of patient autonomy (if explicit consent is not obtained) and privacy (keeping research data confidential). For example, data quality in registries virtually never approaches the accuracy level of audited and monitored case-report forms. As in any other study using registry data, the validity of the underlying data is key. Investigators thus need to have a good understanding of the extent to which measurement bias, random error, and missing data might falsify analyses. Registry-based trials are nonetheless a disruptive technology in clinical research63 because they simplify and speed enrollment while greatly reducing cost. An additional advantage of trials conducted within registries is that it is easy to subsequently evaluate the extent to which trial results are adopted into everyday practice.
Alternating cohort and other cluster trials often use data from electronic records to evaluate outcomes,30,31 but it is also possible to conduct otherwise conventional clinical trials in which all or nearly all outcomes are obtained from electronic records rather than case reports. At the Cleveland Clinic, for example, we routinely conduct trials in which all substantive outcomes are obtained from various electronic registries.64 The efficiency of this approach is obvious, and it much reduces the cost of conducting trials.
Conclusions
There will never be enough conventional clinical trials to address even a small fraction of important clinical questions. Novel trial designs, some with waived or modified patient consent, are increasingly being used to answer research questions more efficiently. Modifications to conventional trial designs that introduce flexibility and efficiency are also becoming more common. Many of the enhancements we review are attractive because they speed enrollment and lower the cost of research. They also raise new challenges in terms of planning, conduct, ethical oversight, and statistical analysis.
Research Support
Support was provided solely from institutional and/or departmental sources.
Competing Interests
The authors declare no competing interests.