Intraoperative triple-low events (mean arterial pressure less than 75 mmHg, Bispectral Index less than 45, and minimum alveolar fraction of anesthetic less than 0.8) have been found to be associated with increased risk of mortality
A randomized electronic alert of triple-low events to treating clinicians did not reduce 90-day mortality
The alerts minimally influenced clinician responses, assessed as vasopressor administration or reduction in end-tidal volatile anesthetic partial pressure, and there was no association between response to alerts and mortality
Triple-low events predict mortality but do not appear to be causally related
Triple-low events (mean arterial pressure less than 75 mmHg, Bispectral Index less than 45, and minimum alveolar fraction less than 0.8) are associated with mortality but may not be causal. This study tested the hypothesis that providing triple-low alerts to clinicians reduces 90-day mortality.
Adults having noncardiac surgery with volatile anesthesia and Bispectral Index monitoring were electronically screened for triple-low events. Patients having triple-low events were randomized in real time, with clinicians either receiving an alert, “consider hemodynamic support,” or not. Patients were blinded to treatment. Helpful responses to triple-low events were defined by administration of a vasopressor within 5 min or a 20% reduction in end-tidal volatile anesthetic concentration within 15 min.
Of the qualifying patients, 7,569 of 36,670 (21%) had triple-low events and were randomized. All 7,569 were included in the primary analysis. Ninety-day mortality was 8.3% in the alert group and 7.3% in the nonalert group. The hazard ratio (95% CI) for alert versus nonalert was 1.14 (0.96, 1.35); P = 0.12, crossing a prespecified futility boundary. Clinical responses were helpful in about half the patients in each group, with 51% of alert patients and 47% of nonalert patients receiving vasopressors or having anesthetics lowered after start of triple low (P < 0.001). There was no relationship between the response to triple-low events and adjusted 90-day mortality.
Real-time alerts to triple-low events did not lead to a reduction in 90-day mortality, and there were fewer responses to alerts than expected. However, similar mortality with and without responses suggests that there is no strong relationship between responses to triple-low events and mortality.
Intraoperative hypotension,1,2 low Bispectral Index (BIS),3–5 and low minimum alveolar concentration (MAC) fractions6 have each been associated with mortality. Perhaps unsurprisingly, the combination of any two low values,7,8 and especially the combination of all three, a “triple low,”8,9 is a strong predictor of postoperative mortality, as summarized in a recent meta-analysis.10 A remarkable aspect of triple-low events is that they are defined by thresholds that are individually unremarkable, specifically mean arterial pressure (MAP) of less than 75 mmHg, BIS less than 45, and MAC fraction of less than 0.8.
The potential importance of the individual triple-low components is that they distinguish between the normal physiologic response to volatile anesthetics and patients at risk. For example, low MAC fractions normally provoke high BIS and high MAP. The opposite response (low MAP and low BIS) is unexpected and thus identifies patients who could be described as sensitive to anesthesia—possibly because of underlying fragility or illness.
Mild hypotension (i.e., MAP ≈ 75 mmHg) is usually considered to be harmless in most patients,1,2 and few anesthesiologists would consider such pressures to be alarming. However, just as otherwise-healthy patients can hypoperfuse their brains in the beach-chair position,11 relatively sick patients who are mildly hypotensive may have inadequate cerebral perfusion while supine. Even mildly low MAP may thus be associated with inadequate brain and organ perfusion in some patients. In theory, low MAC should be associated with high BIS. When it is not, brain hypoperfusion is one potential explanation—especially when hypotension is observed—and possibly explains why triple-low states are stronger predictors of death than mild hypotension alone.
As suggestive as the observational results are, causal conclusions regarding the impact of early intervention for triple-low events require a randomized trial design. A challenge is that only about one in five adults having noncardiac surgery experiences a triple-low event. A conventional randomized trial would thus need to enroll many patients for each who experiences a triple-low event, making the study impractical. We thus conducted an innovative comparative-effectiveness trial using real-time randomization based on decision-support technology.
We tested the theory that smart alarms for the triple-low state incorporated into a decision-support system prompt clinicians to intervene earlier in situations that would otherwise provoke little concern and that the alerts reduce 90-day mortality. Specifically, we tested the hypothesis that providing triple-low alerts reduces 90-day mortality. Secondary outcomes included the effects of alerts on 30-day and 1-yr mortality and the duration of hospitalization. We also evaluated the fraction of alerts that generated early clinician responses and consequent resolution of triple-low conditions. Finally, we evaluated the fraction of triple-low events that generated helpful clinician responses, independent of group status, and the relationship between helpful responses and mortality.
Materials and Methods
The trial was registered in October 2009 at ClinicalTrials.gov: NCT00998894. The protocol is available from the investigators. With institutional review board approval and waiver of informed consent, we considered consecutive adults having noncardiac surgery with volatile general anesthesia and BIS monitoring that started within 30 min of induction. There were no restrictions on the type of volatile anesthetic used; concomitant neuraxial anesthesia and nerve blocks were permitted. Patients were enrolled from July 16, 2010, to October 5, 2016, at the Cleveland Clinic Main Campus (Cleveland, Ohio).
Patients meeting these requirements were screened throughout anesthesia at 1-min intervals, with oscillometric pressures carried forward when no new value was available. Triple-low events were identified when MAP was less than 75 mmHg, BIS was less than 45, and MAC fraction was less than 0.80. MAC fractions were calculated based on MAC values of 6.6% for desflurane, 1.17% for isoflurane, and 1.8% for sevoflurane. MAC values were not adjusted for age because a previous unpublished analysis indicated that adjustment did not substantively improve mortality prediction.
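The screening rule described above can be illustrated with a minimal Python sketch. It is not the trial's decision-support code: the `Sample` record layout and function names are hypothetical, and only the thresholds, the unadjusted MAC values, and the carry-forward of the last available pressure come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# MAC values (% end-tidal concentration) as stated in the text; not age-adjusted.
MAC_PERCENT = {"desflurane": 6.6, "isoflurane": 1.17, "sevoflurane": 1.8}


@dataclass
class Sample:
    """One 1-min screening sample (hypothetical record layout)."""
    map_mmhg: Optional[float]  # None when no new pressure reading arrived
    bis: float
    agent: str
    end_tidal_percent: float


def mac_fraction(agent: str, end_tidal_percent: float) -> float:
    """End-tidal concentration expressed as a fraction of the agent's MAC."""
    return end_tidal_percent / MAC_PERCENT[agent]


def triple_low_flags(samples: List[Sample]) -> List[bool]:
    """Return one flag per minute; the last MAP is carried forward when missing."""
    flags = []
    last_map = None
    for s in samples:
        if s.map_mmhg is not None:
            last_map = s.map_mmhg
        mac = mac_fraction(s.agent, s.end_tidal_percent)
        flags.append(
            last_map is not None
            and last_map < 75
            and s.bis < 45
            and mac < 0.80
        )
    return flags
```

In this sketch a minute is flagged only when all three conditions hold simultaneously, so a carried-forward low pressure alone never triggers an event.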
Patients who experienced triple-low events were randomized without stratification in real time using computer-generated codes that were produced by the statistical team using the PLAN procedure in SAS 9.2 (SAS Institute, USA) and were not available to investigators. Allocation was thus completely concealed. In the control group, triple-low events were electronically recorded, but no alert was given; in the remaining 50% of patients, clinician alerts were generated through our clinical decision-support system. Alert conditions were indicated by flashing a “DSS” button on the electronic anesthesia display, with the specific alert being identified when clinicians touched the button. An electronic pager alert was also generated that was sent to the in-room clinician and to the attending anesthesiologist. The text of the alerts read: “A triple-low (MAP, MAC, and BIS) condition has been detected. Consider hemodynamic support.”
If a triple-low event remained uncorrected, an additional alert was generated 10 min after the initial alert was acknowledged. Randomization was on a per-patient basis rather than by event. Consequently, subsequent triple-low events in a given patient were assigned the same randomization.
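Per-patient (rather than per-event) allocation can be sketched as follows. This is an illustration only: the trial used codes pre-generated with the SAS PLAN procedure, whereas this sketch substitutes Python's `random` module, and the class and method names are hypothetical.

```python
import random


class PerPatientAllocator:
    """Assign each patient to one arm at the first triple-low event and reuse it."""

    ARMS = ("alert", "no_alert")

    def __init__(self, seed=None):
        # Stand-in for the pre-generated allocation codes used in the trial.
        self._rng = random.Random(seed)
        self._arm_by_patient = {}

    def arm_for(self, patient_id):
        """Return the patient's arm, randomizing only on the first request."""
        if patient_id not in self._arm_by_patient:
            self._arm_by_patient[patient_id] = self._rng.choice(self.ARMS)
        return self._arm_by_patient[patient_id]
```

Calling `arm_for` again for the same patient returns the stored arm, which mirrors the statement that subsequent events in a given patient keep the same randomization.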
Implementation of the study was preceded by meetings and discussion within the Department of General Anesthesiology, so faculty, residents, and certified registered nurse anesthetists were well aware of the study, its basis, and its purpose. Clinicians were entirely free to act on the alert, ignore the alert, or consider the provided information without acting on it. Furthermore, the suggestion to consider raising MAP did not specify how pressure might be treated; clinicians accepting the suggestion might thus do so by giving a vasopressor, reducing anesthetic administration, augmenting vascular volume, putting the patient into the Trendelenburg position, or using a combination of approaches. Availability of alerts and clinicians’ responses to them therefore reflected real-world conditions rather than being guided by a strict efficacy-type protocol.
Randomization, the anesthesia record, a detailed record of triple-low events, alerts provided, clinician responses, MAP response, and in-hospital mortality were captured in our electronic record and decision-support system. When the study started, mortality (our primary outcome) was readily available from the Social Security Death Index. During the study, access was restricted, so we developed a two-pronged approach to obtaining vital status. First, we searched the Cleveland Clinic electronic records to find evidence of appointments and procedures subsequent to the index surgery, which indicated that the subject remained alive. Second, we queried the Centers for Disease Control National Death Index.
The randomized groups (alert vs. no alert) were descriptively compared for balance on baseline risk variables (demographics, past medical history/comorbidities, surgery type, etc.) using absolute standardized difference, defined as the absolute difference in means, mean ranks, or proportions divided by the pooled SD. Any variable with an absolute standardized difference of at least 0.045 (i.e., $1.96 \times \sqrt{(n_1 + n_2)/(n_1 n_2)}$, where $n_1$ and $n_2$ are the group sizes) was considered imbalanced and adjusted for in all analyses.
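As an illustration of the balance check, the sketch below computes an absolute standardized difference for a binary variable and a sample-size-dependent cutoff. The cutoff formula, 1.96 times the square root of (1/n1 + 1/n2), is a commonly used criterion and is an assumption here; with the group sizes reported for this trial (3,764 and 3,805) it gives approximately 0.045. Function names are hypothetical.

```python
import math


def imbalance_threshold(n1: int, n2: int, z: float = 1.96) -> float:
    """Sample-size-dependent cutoff: z * sqrt(1/n1 + 1/n2) (assumed criterion)."""
    return z * math.sqrt(1.0 / n1 + 1.0 / n2)


def abs_std_diff_proportions(p1: float, p2: float) -> float:
    """Absolute difference in proportions divided by the pooled SD."""
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2.0)
    if pooled_sd == 0:
        return 0.0  # both groups all-0 or all-1: no difference to report
    return abs(p1 - p2) / pooled_sd


threshold = imbalance_threshold(3764, 3805)  # approximately 0.045
```

A variable would be called imbalanced under this rule when its absolute standardized difference is at least `threshold`.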
Randomized groups (alert vs. no alert) were compared on the primary outcome, 90-day mortality, using Kaplan–Meier analysis with a log-rank test. The primary analysis included a Cox proportional hazards model to adjust for any imbalanced baseline variables which were also associated with outcome. Patients who were still alive at 90 days were censored at that time in the analysis.
We further assessed whether the treatment effect depended on key baseline variables including sex, age (greater than 60 yr vs. less than or equal to 60 yr), American Society of Anesthesiologists (ASA) status of I or II versus III or higher, and duration of case (more than 2 h vs. at most 2 h) by assessing the treatment-by-covariate interactions in separate Cox proportional hazards models and displaying a hazard ratio (97.5% CI) for each subgroup in a forest plot. We conducted sensitivity analyses using mortality data only from our hospital versus the primary analysis of also incorporating death information from the Centers for Disease Control National Death Index and found the estimated hazard ratios to be very similar for each interim analysis.
Secondary analyses assessed the effects of the alert on 30-day and 1-yr mortality and the duration of hospitalization using Cox proportional hazards regressions. For patients who died in the hospital (n = 277), duration was designated to be the longest observed hospital stay plus 1 day.
Helpful responses to triple-low events were defined by administration of a vasopressor within 5 min of the triple-low onset or a 20% reduction in end-tidal volatile anesthetic concentration within 15 min. The relationship between a helpful response to triple-low events and 30- or 90-day mortality was evaluated using a multivariable Cox proportional hazards model, adjusting for randomized group and unbalanced baseline variables. A Cox proportional hazards model was used to evaluate the time that elapsed from the initial alert until the triple-low condition resolved. We did not use Bonferroni correction for the analyses of the secondary outcomes: response to triple-low events and effect of responses.
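The definition of a helpful response can be expressed as a small classifier. This is a sketch under assumptions: the argument layout is hypothetical, the reference concentration is taken to be the end-tidal value at triple-low onset, and "a 20% reduction" is read as a value at or below 80% of that reference.

```python
from typing import List, Optional, Tuple


def is_helpful_response(
    vasopressor_min: Optional[float],
    et_at_onset: float,
    et_series: List[Tuple[float, float]],
) -> bool:
    """
    vasopressor_min: minutes from onset to first vasopressor dose, or None.
    et_at_onset: end-tidal volatile concentration at triple-low onset.
    et_series: (minutes_from_onset, end_tidal_concentration) pairs after onset.
    """
    # Vasopressor given within 5 min of onset.
    pressor = vasopressor_min is not None and 0 <= vasopressor_min <= 5
    # End-tidal concentration down by at least 20% within 15 min of onset.
    reduced = any(
        0 < t <= 15 and et <= 0.8 * et_at_onset for t, et in et_series
    )
    return pressor or reduced
```

Either criterion alone is sufficient, matching the "or" in the definition.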
This trial followed a group sequential design in which eight analyses (seven interims and a final) were planned, using the gamma error spending function.12 During the study, three of the interim analyses were inadvertently omitted because of a combination of the speed of enrollment and the “hidden” nature of the database alerts. Results for the final analysis presented here used interim-adjusted CIs incorporating the Z-statistic efficacy boundary of 2.077 (corresponding to P-value criterion of 0.038) for the n = 7,584 patients included. Throughout we refer to them as “adjusted 95% CIs” to indicate that the significance level was controlled at 5% for the primary outcome over the entire study (i.e., across the interim analyses).
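The correspondence between the Z-statistic boundary and the P-value criterion can be checked directly with a two-sided normal tail probability. The function name is illustrative; the check uses only the Python standard library.

```python
from statistics import NormalDist


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p_boundary = two_sided_p(2.077)  # approximately 0.038
```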
Sample Size Considerations.
In our preliminary analysis from an observational study, risk-adjusted 90-day mortality in patients who experienced a triple low without clinician responses was 2.97%. We thus expected about 3% (3.2%) mortality without responses (no response or late response) in both randomized groups (alert and no alert). In contrast, 90-day risk-adjusted mortality was 1.97% in patients who experienced a triple low and were given vasopressors within 5 min. We thus expected a 90-day mortality rate of about 1.8% in patients in whom clinicians responded quickly to the triple low in either randomized group.
On the basis of other (nonrandomized) alerts currently in our system, we expected a large proportion of clinicians would respond effectively to the alert (i.e., increase MAP to at least 75 mmHg). In general, we expected 80% response to the triple low in the alert group and 20% in the nonalert group and 80% of responses to be effective in each group. The aforementioned assumptions implied that 90-day mortality would be 2.1% in patients with alerts and 2.9% in those without alerts, for a relative risk of 0.71. The maximum (across eight potential interim analyses) sample size of 14,443 was therefore based on having 80% power at the 0.05 significance level to detect a difference of 2.1% versus 2.9% in 90-day mortality for the alert and no-alert groups, respectively, for a relative 28% reduction. The incidence estimates were based on retrospective analyses and thus subject to various reporting and confounding biases.
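The planning figures can be reproduced as a weighted mix of the two assumed mortality rates. Note the assumption in this sketch: a plain mix using the 80% and 20% response rates with 1.8% and 3.2% mortality yields the stated 2.1%, 2.9%, and 0.71, so the additional 80% effectiveness factor is not applied here. That reading is an inference from the arithmetic, not something the text states.

```python
def expected_mortality(response_rate, mort_with_response, mort_without_response):
    """Group mortality as a mix of responders and nonresponders."""
    return (
        response_rate * mort_with_response
        + (1.0 - response_rate) * mort_without_response
    )


alert = expected_mortality(0.80, 0.018, 0.032)     # about 0.021
no_alert = expected_mortality(0.20, 0.018, 0.032)  # about 0.029
relative_risk = alert / no_alert                   # about 0.71
```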
Interim analyses were evaluated on a group A versus group B basis and were thus blinded to outcome direction. Clinicians participating in the study were not privy to interim results. At the second interim analysis (in August 2013), the maximum sample size was reassessed based on the observed incidence of 90-day mortality of 7.9% in the worst group. We thus resized the study using an internal pilot study design in which the incidence in the control group, which might be considered a “nuisance parameter,” was updated using the observed study data to that point.13 To combine our initial estimates with the observed incidence at the second interim, we assumed that the true baseline incidence in the worst group had a β distribution. Using that structure, we estimated the true incidence as a function of our original estimate (3%) and the observed 8%, giving 90% weight to the 8% and 10% weight to the initial estimate. This resulted in an estimated incidence of 7.6% for the worst group. To be conservative, we based the new sample size on 7% in the worst group, which also corresponds to the lower limit on a 95% CI on the observed 8%.
In the reassessment, assuming an incidence of 7%, a maximum of 7,060 patients were needed to have 80% power to detect a 25% reduction in 90-day mortality at the 0.05 significance level assuming eight interim analyses (as in original plan) and a gamma spending function with γ parameters of −4 for efficacy and −1 for futility (also as in original plan). We also redefined our sample size to be 7,060 patients in whom we could determine 90-day vital status, which represents our original intent. Because it was unclear in how many patients vital status would be available and because the Centers for Disease Control National Death Index releases data on a yearly basis, we stopped enrollment at the end of 2016. This approach provided a cushion of about 500 extra patients under the assumption that vital status would not be available for some. Statistical analyses were conducted with SAS 9.2 and East 5 (Cytel, Inc., USA) software.
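For orientation, the sketch below computes a fixed-design (no interim analyses) sample size for comparing two proportions under the same inputs: 7% control incidence, 25% relative reduction, 80% power, and two-sided 0.05 significance. It is a simplified normal-approximation formula, not the group sequential calculation performed in East, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def fixed_design_total_n(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Total N (both groups) for a two-sided test of two proportions."""
    p_treat = p_control * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    n_per_group = (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_treat) ** 2
    return 2 * math.ceil(n_per_group)


total_n = fixed_design_total_n(0.07, 0.25)  # roughly 5,900 patients
```

The fixed-design figure is smaller than the 7,060 maximum reported above, which is expected because a group sequential design with efficacy and futility boundaries requires a larger maximum sample size than a single-look design.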
Figure 1 shows the enrollment, exclusions, and patients available for analysis. Of qualifying patients, 21% (7,569 of 36,670) experienced at least one triple-low event and were thus randomized. Of 7,569 randomized patients, 3,764 (49.7%) were assigned to alerts and 3,805 (50.3%) to the nonalert group. In total, 95% (7,215 of 7,569) of patients had an arterial catheter.
Table 1 shows that baseline variables were well balanced between the two groups except for drug abuse (absolute standardized difference = 0.048, which is higher than the criterion of 0.045) and type of surgery (0.094). However, the differences were tiny and not clinically meaningful (for example, the differences in each level of the surgical types were less than 1%). We therefore did not adjust for any baseline characteristics in our analyses.
More than 96% of triple-low alerts (or triple-low measurements in the nonalert group) were accurate. For technical reasons, about 11% of the alerts took more than 2 min to be generated and then displayed or withheld according to randomization. The averages of MAP, BIS, and MAC at the first alert (or would-be alert) did not differ in the two groups, with mean ± SD of 66 ± 7 mmHg for MAP, 38 ± 6 for BIS, and 0.65 ± 0.14 for MAC in the alert group.
The observed incidence of 90-day mortality was 8.3% in the alert group and 7.3% in the nonalert group, a difference that was not statistically significant with a hazard ratio (95% CI) of 1.14 (0.96, 1.35); P = 0.12 (table 2; fig. 2). The prespecified futility boundary (P > 0.038) was crossed (fig. 3). The treatment effect of the alert on the primary outcome of 90-day mortality did not depend on sex (interaction P = 0.46), age more than 60 yr (P = 0.31), ASA status of I or II versus III or higher (P = 0.17), or duration of case more than 2 h versus 2 h or less (P = 0.49; fig. 4).
No difference was found between the groups on 30-day or 1-yr mortality. The observed incidences of 30-day mortality were 4.8% in the alert group and 4.3% in the nonalert group with a hazard ratio (95% CI) of 1.10 (0.88, 1.38); P = 0.36. The observed incidences of 1-yr mortality were 14.9% in the alert group and 15.2% in the nonalert group with a hazard ratio (95% CI) of 0.98 (0.86, 1.10); P = 0.70 (table 2).
The length of hospital stay (discharge alive) did not differ significantly in the alert and no-alert groups, with a hazard ratio (95% CI) of 0.98 (0.94, 1.03); P = 0.50. The observed median (Q1, Q3) length of hospital stay was 7 (4, 11) days in each group (table 3).
Response to Triple-low Events
Helpful response to triple-low events, defined as vasopressor use within 5 min of the alert and/or a 20% decrease in end-tidal volatile anesthetic concentration at any time during the 15 min after the alert, was 51% in the alert group and 47% in the nonalert group, for a relative risk (95% CI) of 1.08 (1.03, 1.14); P < 0.001 (table 3). Although highly statistically significant, the difference between 47 and 51% is not clinically important. The median (25th, 75th percentiles) number of minutes from the first alert to termination of the triple-low event did not differ between groups, with a hazard ratio (95% CI) of 1.04 (0.99, 1.09); P = 0.09.
Further, the alert did not change the proportion of patients with a 20% increase in MAP after either 5 min (P = 0.44) or 15 min (P = 0.40; table 4). In addition, the mean maximum change in MAP within 5 min after alert was not different between the alert and no-alert groups (12 ± 14 vs. 12 ± 14 mmHg), with a mean difference (95% CI) of 0 (−0.2, 1.1); P = 0.17. A sensitivity analysis using a 15-min interval gave similar results.
Relationship between Response to Triple-low Events and Outcomes
No relationship was observed between helpful responses to triple-low events (defined as vasopressor use in 5 min or 20% decrease in anesthetics by 15 min) and 90-day mortality adjusting for randomized group and covariates in table 1. There was also no interaction between the response to triple-low and alert group on 30-day mortality (P = 0.83). The overall response rate was 49%. The observed 30-day mortality was 4.9% in the response group and 4.4% in the nonresponse group, with a covariable-adjusted hazard ratio (95% CI) of 1.08 (0.87, 1.34); P = 0.45. Similarly, we did not find a relationship between the response to triple low and 90-day mortality (hazard ratio = 1.06, 95% CI, 0.90 to 1.25, P = 0.52). Finally, there was no interaction between helpful responses and randomized group on 90-day mortality (P = 0.75). For patients who received an alert, the hazard ratio of 90-day mortality for the response group compared with nonresponse was 1.03 (95% CI, 0.81 to 1.31); for patients who did not receive an alert, the hazard ratio of 90-day mortality for response compared with nonresponse was 1.08 (95% CI, 0.86 to 1.36).
Observational studies indicate that double-7 and triple-low8–10 events are strong predictors of postoperative mortality (with one exception14). Despite adjustment for known confounding factors, much of this association presumably results from selection of high-risk patients. Frailty, for example, is an important predictor of death15 but is not generally formally evaluated or recorded in electronic records. We could not directly assess whether triple-low events cause mortality because all enrolled patients had triple-low events. Instead, our major clinical question was the extent to which alerts and consequent interventions in response to triple-low events might be causally related to mortality; that is, whether intervening to limit mild hypotension, low MAC fraction, and low BIS might reduce mortality. Causality can only be established with reasonable certainty from an interventional trial such as ours.
Broadly speaking, all major outcomes were negative. Electronic alerts for triple-low events did not reduce 90-day mortality (our primary outcome), nor did they reduce 30-day or 1-yr mortality, which were our secondary outcomes. Nonetheless, interpreting our trial results requires some nuance because clinicians largely ignored the alerts. Clinicians responded helpfully (defined as vasopressor use in 5 min or 20% decrease in anesthetics by 15 min) to about half of the triple-low events, with or without alerts, and the duration of triple-low events did not differ in the alert and no-alert groups. The results were similar in a previous trial of alerts for double-low events, in which clinicians also largely ignored the alerts.16
In many respects, our trial therefore failed to adequately test whether helpful interventions for triple-low events improve outcomes. There was no apparent relationship between helpful responses to triple-low events in either group and adjusted 90-day mortality. Overall, our results do not support the hypothesis that alerts for triple-low events reduce mortality.
Normally protocols are fairly tightly controlled in clinical trials to reduce response variability and thereby enhance internal reliability. A reasonable question is thus why the protocols for our current and previous trials of alerts for double- and triple-low events did not mandate specific responses such as vasopressor administration and reducing volatile anesthetic administration (which normally increases blood pressure and the BIS). The answer comes from the trials’ unique designs, both of which used electronic systems to randomize qualifying patients in real time. Triple-low events are relatively rare, occurring in only about one of five surgical patients at the Cleveland Clinic Main Campus. Using a conventional approach, we would thus have had to consent more than 36,000 patients to accrue the 7,569 who were actually randomized, an obviously impractical number. Furthermore, efficacy trials, with their highly selected patients and rigid protocols, generalize poorly to routine clinical practice in broad populations. They are also limited in that mortality and many other serious complications are too rare to study except in the largest conventional randomized trials.
We therefore requested and obtained approval for waived consent from our Institutional Review Board based on national guidelines because (1) obtaining individual consent would be nearly impossible; (2) the provided alert was not currently the standard of care; (3) the recommended intervention (consider hemodynamic support) was low risk and likely to prove beneficial; (4) there was no prohibition against intervention in the control group nor a requirement to respond in the treatment group; and (5) part of the research was to determine acceptance of the decision-support recommendation, which would be impossible if only selected clinicians participated. A consequence of this approach was that we were unable to mandate specific responses in patients randomized to alerts, nor to prohibit responses in the no-alert control group. We expected clinicians to respond more aggressively to alerts than they did; we also expected fewer responses in the no-alert group. In fact, response rates were similar in each group. Being unable to control responses therefore turned out to be the trial’s major limitation.
The randomized patients were relatively sick. About 90% had ASA status of III to IV, and 95% had arterial catheters. Furthermore, 30-day mortality exceeded 4%, which is about twice the national average for noncardiac surgical inpatients. It is thus apparent that patients who experienced triple-low events were especially sick, which is perfectly consistent with such events being strong predictors of postoperative death. That triple-low events are associated with mortality is now well established10 but could not be confirmed in our present study because enrollment and our analysis were restricted to patients who had triple-low events. A limitation of our electronic records is that total fluids are tracked for each case, but timing of administration is not. It is thus possible that some clinicians responded to triple-low events, with or without alerts, by giving fluid boluses.
Our statistical methods were robust, including a group sequential design that controlled the overall type I error at 5% and power at 80% while conducting several interim analyses. A further strength was the inclusion of an internal pilot study at the second interim analysis, in which we reassessed the incidence of the primary outcome in the control group and then resized the maximum sample size for the study. This technique, in which either the planned SD for a continuous outcome or the proportion with the event in one of the groups for a binary outcome (because the variance of a proportion is a function of that proportion) is reassessed during a trial, is a statistically sound and judicious method to adjust an initial sample size calculation.13 It is particularly helpful when, as often is the case, initial estimates of variability are only rough estimates based on existing data. As appropriate, our reassessment was done without observing or taking into account the estimated treatment effect, using only the variability estimate.
Decision-support alerts, even those that might seem obviously beneficial, may not trigger the expected behaviors and may not improve outcomes even when they do. For example, alerts for severe hypotension are not helpful because clinicians respond equally quickly and effectively without alerts.17 Similarly, a recently developed sophisticated decision-support system that provides substantial guidance to clinicians provoked less response than might have been expected and did not significantly improve outcomes.18 In our present study, clinicians largely ignored the alerts; that is, the expected response to the alert was only observed about half the time. These are just three of many reasons why alerts can fail to ultimately provide patient benefit. A corollary is that decision-support systems should be treated just like other devices and be formally validated.19 Failure to require adequate validation of electronic guidance and alerts will surely result in a proliferation of such systems that might actually worsen patient care by distracting clinicians.
In summary, real-time alerts to triple-low events did not reduce 90-day mortality, although there were fewer responses to the alerts than expected. However, similar mortality with and without helpful responses, independent of randomized group, suggests that there is little or no relationship between responses to triple-low events and mortality. Decision-support alerts, even those that might seem obviously beneficial, may not trigger the expected behaviors and may not improve outcomes even when they do.
Supported by Aspect Medical (Norwood, Massachusetts), Covidien (Fridley, Minnesota), Medtronic (Fridley, Minnesota), and the Drown Foundation (Los Angeles, California).
Dr. Sessler is a consultant for Medtronic. All other authors declare no competing interests.