“The call ... is twofold: create readily comprehensible, valid interpretations of anesthesia performance metrics … and develop meaningful metrics that can be tracked and improved.”
“Try – not! Do! Or do not! There is no try!”
—Yoda
Anesthesiology has striven to improve its performance since its inception as a medical specialty, perhaps because the evidence of failure in our specialty’s infancy was often death or grievous injury. We are beyond the era when survival was considered evidence for success and high quality in most anesthesia subspecialties. The focus now is on value, seeking to maximize the ratio of quality over cost. How to measure quality? Especially, how do we measure quality attributable specifically to the anesthesia provider? In this issue of Anesthesiology, Bayman et al.1 take on the quality question, focusing on an outcome that might map cleanly to the anesthesia team’s decision-making (the time from end of surgery to extubation, hereinafter called time to extubation). This outcome is certainly visible to the eyes in the operating room, surgery center, and hospital administration trained upon “Anesthesia.”
The article by Bayman et al. contains several implicit warnings for anesthesiology as a field. First, beware of simplistic statistical approaches to comparing quality. Means, medians, and 95th centiles as well as multiple comparisons abound on dashboards of healthcare quality. Probability dictates that someone will be on the wrong side of the cutoff line, and simplistic analyses without thoughtful follow-up may label clinicians whose practice is normative as outliers.2,3 Anesthesiology as a field can commend itself for bringing rigor to many operational measurements,4–7 where simplistic analyses initially led to inappropriate conclusions. It is important to make these rigorous approaches intellectually accessible to administrators, policy makers, and bureaucrats. The gold standard of simplicity should be the mean, median, or target decile. Observed-to-expected ratios have arguably entered this class of readily understood metrics. Moving Bayesian analyses into the common lexicon of quality and value measurement is an important, unfinished task. Frequent, clear reinforcement is an important pedagogical objective.
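The point that probability alone puts someone on the wrong side of the cutoff can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from Bayman et al.: it assumes every clinician truly practices identically, that each is screened independently, and that each screen has a 5% false-positive rate. The function name and the group sizes are hypothetical.

```python
def prob_any_false_flag(n_clinicians: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n truly identical clinicians is
    flagged as an outlier by chance alone.

    Assumes independent screens, each with false-positive rate alpha.
    """
    return 1.0 - (1.0 - alpha) ** n_clinicians


if __name__ == "__main__":
    # Hypothetical department sizes, chosen only for illustration.
    for n in (10, 50, 100):
        print(f"{n} clinicians: {prob_any_false_flag(n):.3f}")
```

Under these assumptions, the chance of at least one spurious outlier grows quickly with the number of clinicians compared, which is why a flag on a dashboard calls for follow-up rather than a conclusion.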
Second, beware of simplistic indicators of quality. The authors provide explanations why 15-min time to extubation is the limit of acceptability, due to the cost of operating room time and the inconvenience to the surgical team, but so many extubation times exceeded 15 min that this cutoff seems debatable. Moreover, is time to extubation a metric that we want to use for valuing our specialty? In a patient-centered health system, we should attend to the outcomes that patients care about. Moreover, are anesthesiologists so subsidiary to every other part of the perioperative process that time waiting for emergence from anesthesia (arguably a pharmacologic–physiologic process that we indirectly control) is something that we should actively minimize any more than surgeons try to manage the time to accomplish the operation? We certainly should diligently work to minimize unnecessary anesthetic exposure and inefficient resource consumption, but should our perioperative colleagues be led to believe that anesthesia is something we should be able to turn on and off like water from a faucet? If this is true, are we sure there is a basis for a medical specialty built around anesthesia, or can it be turned over to technicians? (This rhetorical question is intended to be provocative.) It seems more likely (or more hopeful) that there are more meaningful outcomes of anesthesia to measure and improve than time to extubation. Bayman et al. deflect this simplistic measure by revealing its minimal discriminant power, but at the risk of demonstrating that yet another potential indicator of anesthesia quality actually has no utility.
A third warning is the unintended but distressing message arising from this work, along with several other recent publications: few of the process-oriented quality or value measures anesthesiologists have focused on can be readily improved. Some elements of performance under the control of the anesthesiologist can be improved (apparently),2,8–12 but more careful statistical analyses indicate that the changes are small at best at the individual provider level, and that there are few, if any, provider outliers,3 or that the observed clinical effects of performance variations are minimal.13 Anesthesiologists run the risk of painting a picture to regulators, policy makers, and payers of not being able or willing to improve anything.
Bayman et al. provide a more subtle message about quality and value improvement in anesthesiology: attending anesthesiologists alone are rarely the main drivers of the anesthetic outcome of interest, in this instance, time to extubation. Here is the important conclusion of this and many other studies of anesthesia process measures: Most cases were performed by at least two providers, the attending anesthesiologist and the in-room clinician, either a resident or a certified registered nurse anesthetist. The in-room provider exerts varying but unmeasured control over decision-making about each case. Programs are training their residents to become autonomous, and certified registered nurse anesthetists enjoy limited, although unquantified, autonomy in the care team model. Therefore, the impact of attending anesthesiologists on such a shared outcome as time to extubation is likely to be muted.
This observation highlights an important constraint in the design of prospective whole-population studies of anesthesia outcomes or efforts to improve anesthesia performance. Because there are almost always two providers (at least in North American sites performing outcomes research), because the influence of each provider on the conduct of the case is substantial (probably larger than faculty like to admit), and because provider pairings change almost every day,* it is problematic to randomize individual clinicians when studying an intervention. The effect of intervention is likely to be modulated (likely downward, but possibly carried across to faculty randomized to the alternate group) by the participation of in-room clinicians who are not randomized and change daily. Similarly, randomizing both groups creates a 2 × 2 table (even before considering hand-offs for shift relief) in which half of the pair combinations are discordant with respect to randomization. Both possible randomization designs quickly expose the study allocations (and hence, probably the hypotheses) to the study participants, further degrading the methodologic strength of randomization.
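The 2 × 2 table described above can be enumerated directly. This is a minimal sketch under stated assumptions: two study arms, equal allocation in both provider roles, and daily pairings that are independent of arm, so that each cell of the table is equally likely. The arm labels and function names are hypothetical.

```python
from itertools import product

# Hypothetical arm labels for a two-arm study.
ARMS = ("intervention", "control")


def pairing_table():
    """List every (attending arm, in-room arm, concordant?) combination
    that arises when both provider roles are randomized."""
    return [(a, r, a == r) for a, r in product(ARMS, repeat=2)]


def discordant_fraction() -> float:
    """Fraction of pair combinations whose two providers sit in
    different arms, assuming all four cells are equally likely."""
    table = pairing_table()
    discordant = sum(1 for _, _, concordant in table if not concordant)
    return discordant / len(table)


if __name__ == "__main__":
    for attending, in_room, concordant in pairing_table():
        label = "concordant" if concordant else "discordant"
        print(f"attending={attending:12s} in-room={in_room:12s} {label}")
    print("discordant fraction:", discordant_fraction())
```

Two of the four cells pair providers from different arms, so under these assumptions half of all cases would be delivered by a discordant pair, before hand-offs for shift relief are even considered.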
What then is the solution? Studies of anesthetic interventions must be larger than single departments. For example, with a few departments, it is possible to construct a modified case-control design in which “case” departments are measured before and after an intervention while other departments serve as controls.11 Even better, with multiple departments or units, separated by enough geography to prevent exchange of personnel, but serving similar populations, it is possible to achieve cluster randomization in pragmatic trials where clinicians are aware of the study hypotheses.14,15 Given that these studies are likely to be very expensive, what should we study to measure and potentially improve anesthesia quality and value?
Recent literature provides some interesting examples: Evaluation of Nitrous Oxide in the Gas Mixture for Anaesthesia (ENIGMA II)16 and Perioperative Ischemic Evaluation (POISE)-1 (ref. 17) and POISE-2 (refs. 18,19) are “classic” drug versus drug designs that led to surprising and interesting conclusions and raised as many questions as they answered. However, they are not designed to assess the impact of anesthesia efficiency or operational interventions. Anesthesiologists, as system-embedded physicians, practice in an environment where most interventions are linked to multiple outcomes, and vice versa, and where multiple actors influence outcomes simultaneously. Given the difficulty of separating even the attending anesthesiologist vis-a-vis in-room clinician dyadic impacts on study outcomes, it seems virtually impossible to design an incisive study of anesthesia interventions on relevant patient outcomes. One final complexity is the imperative to move to holistic functional outcome measures of medical interventions rather than process outcomes.20 This adds to the complexity of teasing out what contribution, if any, anesthesiology performance makes to overall patient outcomes, although there have been recent attempts to do so.21
It is incumbent on anesthesiologists to engage in the performance improvement and value maximization conversation, despite the difficulty in attributing the observed effects. If we do not, bureaucrats who understand simplistic analyses will define and analyze performance metrics for us. The call, then, is twofold: create readily comprehensible, valid interpretations of anesthesia performance metrics that, for better or worse, have gained traction, and develop meaningful metrics that can be tracked and improved.
Funding to find innovative approaches to defining, tracking, and analyzing meaningful quality metrics that map to anesthesia seems on its face to be difficult to find. In an apparent vacuum, departments, anesthesiology groups, payers, and regulators are self-funding grass roots efforts, which is commendable. The Patient-Centered Outcomes Research Institute and the Agency for Healthcare Research and Quality are potential Federal sources, and the American Society of Anesthesiologists itself is another potential source of substantial support. In the current resource-constrained environment, substantial, unrestricted funding is required to support dedicated, scientific efforts in search of important metrics and valid methods that focus on meaningful, patient-centered opportunities to measure performance and improve value. Challenging remains this question of funding, for now.
Acknowledgments
Support was provided solely from institutional and/or departmental sources.
Competing Interests
The author is not supported by, nor maintains any financial interest in, any commercial activity that may be associated with the topic of this article.
* Creating fixed pairings of attending anesthesiologists and in-room clinicians, especially with trainees who must progress through a series of rotations over three Clinical Anesthesia training years, is a potential but impractical solution.