Administrators need simple tools to quickly identify even small changes in the performance of perioperative systems. This applies both to established systems and to impact assessments of deliberate perioperative system design changes.
Statistical process control was originally developed to detect nonrandom variation in manufacturing processes by continuous comparison to previous performance. The authors applied the technique to assess nonoperative time performance between successive cases for the same surgeon following themselves in a redesigned operating room. This operating room specifically implemented a new patient care pathway that improves throughput by reducing nonoperative time. The authors tested how quickly statistical process control detected reductions in nonoperative time. They also tested the ability of statistical process control to detect successively smaller performance changes and investigated its utility for longitudinal process monitoring.
Statistical process control detected a clear reduction in nonoperative time after the new operating room had been used for only 2 days. The method could detect nonoperative time changes of between 5 and 10 min per case for a single operating room within one fiscal quarter. Nonoperative time for the new process was globally stable over the 31 months analyzed, but late in the analysis period, the authors detected small performance decrements, mostly attributable to factors external to the new operating room.
Statistical process control is useful for detecting changes in perioperative system performance, represented in this study by nonoperative time. The technique is able to detect changes quickly and to detect small changes over time.
CURRENT trends in medicine require hospitals to be increasingly efficient in their use of resources. Hospitals, in turn, are renovating their facilities and changing their work processes in attempts to best meet their patients' needs. In the constrained fiscal climate of health care, it becomes important not only to monitor processes for their ongoing effectiveness, but also to have a sensitive (i.e., detects small changes in performance) tool for detecting desired as well as any undesired effects as quickly as possible after new processes have been introduced.
We recently described a project, dubbed the Operating Room of the Future (ORF),1 in which a working induction room, a dedicated early recovery area, and additional personnel were used to establish a new perioperative workflow wherein many activities occur in parallel, rather than following each other sequentially (fig. 1). The ORF processed cases more quickly and achieved greater throughput effectiveness than comparable standard operating rooms (SORs). Underlying this improvement in effectiveness, Nonoperative Time was reduced by approximately 40%.1
Fig. 1. Timeline comparison of the process flow in the standard operating rooms (SORs) and the Operating Room of the Future (ORF). Time is shown to scale in the horizontal dimension. Induction and Emergence are denoted, both in location and in temporal relation to other activities. Intra-Op = intraoperative patient care; OR = operating room; PACU = postanesthesia care unit; P-Op = postoperative patient care; Pre-Op = preoperative patient care.
The analysis leading to this conclusion was conducted by retrospectively comparing the performance of the ORF surgeons working in the ORF with their own performance in the SOR.1 The data for the analysis were collected over the course of 13 months, a seemingly long time to determine whether the ORF achieved higher throughput than SORs. An administrator monitoring such a project might wonder, "Could the result be obtained more quickly, or even concomitantly with the data collection?" In this article, we attempt to answer this question by using statistical process control (SPC) to examine nonoperative times between successive cases by a given surgeon in the new perioperative system.
Statistical process control was initially developed by Shewhart2 in the 1920s at Bell Laboratories. It is a statistically based, easy-to-use quality improvement tool for distinguishing systematic variation from random variation in a process.3 SPC is currently in use throughout a number of industries, including health care (see Benneyan et al.4 for an introduction). It may be applied by itself or as one of several tools adopted in "6-sigma" or other structured process improvement initiatives. In the perioperative arena, SPC identifies nonrandom variation in clinical and operational process outcomes.5–8 Nonrandom variation indicates a systemic change in the process even for small samples and allows these changes to be detected as they are occurring, rather than after a period of data gathering.
We applied SPC retrospectively to the ORF data set to address the following questions:
Had SPC methodology been in place at the inauguration of the project, how quickly could a difference in nonoperative time (which led to increased throughput) of the magnitude produced by the ORF have been detected?
Was there a learning curve, i.e., a gradual improvement in performance over time, associated with the adoption of the more complex patient flow shown in figure 1?
How small a difference in nonoperative time could have been detected within a short period of ORF inauguration, e.g., within one fiscal quarter?
How durable is the improvement in nonoperative performance over the long run? That is, did the improvement in performance in the ORF include an effect due to enhanced team motivation that might have disappeared over a period of months or years?
Previously, we have shown that increased operating room (OR) effectiveness (i.e., reduced time per case and more cases per day) came about by reducing the time required for the nonoperative portion of the perioperative process.1 Furthermore, control of nonoperative time or its constituents is widely believed to contribute to OR throughput.6,7 Therefore, in this report, we use Nonoperative Time (i.e., the sum of all time spent not performing surgery) as a reporter for OR throughput effectiveness.
Materials and Methods
Source of Data
This retrospective study was conducted with the approval of the Massachusetts General Hospital Institutional Review Board (Boston, Massachusetts). Three general surgeons, a gynecologic surgeon, and a urologist used the ORF. All personnel worked in both the ORF and the SOR. We obtained the data for this study by integrating two sources. Our institution uses an internally developed computerized system called the Nursing Perioperative Record for perioperative documentation. The Nursing Perioperative Record includes time stamps for key milestone events. These time stamps have been found to be accurate.1 The definitions of the relevant milestones and the intervals calculated from them are given in table 1. We also use an Anesthesia Information System (Saturn; Draeger Medical, Telford, PA), which replicates some of the milestone time stamps in the Nursing Perioperative Record and provides anesthetic milestones as well as demographic data.
Statistical Process Control Methods
Any process will experience natural variability due to unintended and uncontrollable sources of variation. On the other hand, a process may experience more systematic variability in key performance measures. Such variability often arises from nonrandom “assignable causes,” and a process operating in this state is called “unstable” or “out of control.” A process experiencing only chance variation is said to be in “statistical control” or “stable.” Processes may be stable for long periods. However, there may be either a gradual or sudden shift in performance (in this case, nonoperative time) that requires detection. In this article, we will use SPC to evaluate the impact of the ORF on nonoperative times, separating the effect of systematic process changes from random performance variation.
For all surgeons doing cases in both the ORF and SORs, nonoperative times from both environments were collected for the entire study period. We included all cases performed by each surgeon who had worked in the SOR followed by the ORF environments during the period encompassing September 1, 2001, to February 28, 2005. (These dates were chosen because they encompassed all cases performed by the beginning of data analysis for this report.) A very small subset of these data is reported in a previous analysis of ORF performance.1
We are interested in the ORF's performance in a realistic context, and in the ORF's performance as a hospital unit rather than in the performance of the individual surgeons who work in the room. Hence, we conducted SPC analyses for all of the ORF surgeons taken together as a group, rather than for individuals.
Statistical process control uses charts (frequently plotting performance as a function of time) to detect the out-of-control state of a process. There exist a number of variations of SPC charts (also called control charts) and methods for constructing them.3 For simplicity, we have applied the same SPC methods for all charts in this article. We constructed x̄–R charts (a widely used form of SPC chart for variables) for this study by sequentially tabulating the nonoperative times for all of the cases performed in the ORF for the period of the study. We also tabulated sequential nonoperative times for the ORF surgeons working in the conventional OR environment for the 6 months before ORF inauguration. Sequential nonoperative times were then binned, with bin sizes ranging from three to seven individual nonoperative times, as described below. Next, we computed the mean and range of the nonoperative times in each bin. Finally, the means of the binned nonoperative times were plotted as a function of time to create the SPC chart. Therefore, each data point in the SPC analysis represents the average number of minutes of nonoperative time between the procedures in the bin.
Our hospital uses a block booking system in which a single surgeon controls an OR for the entire day. Therefore, the nonoperative time data are for cases performed back-to-back by the same surgeon. Bins were created by grouping cases sequentially. Hence, the bins of nonoperative times almost always contained data from at least two surgeons and at least two sequential days.
The SPC charts contain a horizontal center line representing the average value of the quantity (in this case, nonoperative time) being studied. The process is said to be "in control" if accumulating data points are randomly distributed around the center line.
Typically, points in the SPC chart are connected by straight lines, allowing the user to more easily appreciate the development of trends over time. The SPC charts also contain additional horizontal lines to assist in determining whether data points are sufficiently close to the center line to be considered in control. These lines are called the upper and lower control limits (UCL and LCL). The center line and control limits are constructed as described below3:
$$\mathrm{UCL} = \bar{\bar{x}} + A_2\,\bar{R}, \qquad \mathrm{Center\ line} = \bar{\bar{x}}, \qquad \mathrm{LCL} = \bar{\bar{x}} - A_2\,\bar{R},$$

where $\bar{\bar{x}}$ is the grand mean of the bin means,

$$\bar{R} = \frac{R_1 + R_2 + \cdots + R_m}{m},$$

and $R_1, R_2, \ldots, R_m$ are the ranges of the m bins. The constant $A_2$ is tabulated in Shewhart tables (available in most SPC texts, e.g., Montgomery3) for various bin sizes. For example, a bin size of five gives $A_2 = 0.577$. Standard texts typically recommend bin sizes in the range of three to five as a good balance between sensitivity and noise in the data.
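To make the construction concrete, the following sketch (in Python, not part of the original study's tooling) computes the center line and control limits from a baseline series of nonoperative times using the formulas above. The `baseline` array is fabricated for illustration.

```python
import numpy as np

# Shewhart A2 constants for bin sizes 3-7 (tabulated in standard SPC texts)
A2 = {3: 1.023, 4: 0.729, 5: 0.577, 6: 0.483, 7: 0.419}

def xbar_r_limits(times, bin_size=5):
    """Center line and control limits from baseline nonoperative times."""
    n = (len(times) // bin_size) * bin_size            # drop incomplete last bin
    bins = np.asarray(times[:n], dtype=float).reshape(-1, bin_size)
    bin_means = bins.mean(axis=1)                      # x-bar for each bin
    bin_ranges = bins.max(axis=1) - bins.min(axis=1)   # R for each bin
    center = bin_means.mean()                          # grand mean = center line
    r_bar = bin_ranges.mean()                          # average range
    return center, center + A2[bin_size] * r_bar, center - A2[bin_size] * r_bar

# Fabricated baseline nonoperative times (minutes), for illustration only:
baseline = [55, 70, 62, 58, 66, 49, 73, 60, 64, 57]
center, ucl, lcl = xbar_r_limits(baseline, bin_size=5)
print(f"CL={center:.1f}, UCL={ucl:.1f}, LCL={lcl:.1f}")
```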
The UCL and the LCL are constructed to lie 3 SDs above and below the center line, respectively. Thus, when a process is in control, 99.7% (i.e., nearly all) of the points of the baseline process being studied (representing, in this case, bins of consecutive nonoperative times) fall between them. A point falling outside the control limits indicates that the process is out of control, and the advent of such a point dictates that the process be examined for the cause of the change in performance. Even when all of the points on an SPC chart are within the control limits, they might still form systematic patterns that indicate the process is out of control. For easier identification of such systematic patterns, SPC charts often contain additional horizontal lines indicating 1- and 2-sigma (where sigma is another term for SD) deviations from the center line.
To systematize the visual identification of systematic performance changes in SPC charts, formal rules have been developed. We used the Western Electric rules3 for analyzing the SPC charts (table 2), seeking to identify nonrandom variation. The Western Electric rules are a series of tests that can be applied visually to an SPC chart, without calculations or formal statistical testing. However, the rules are based on sound statistical reasoning and reliably detect when processes are out of control.9
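For readers who want to automate the rule checks, the sketch below encodes the four standard Western Electric rules (one point beyond 3 sigma; two of three consecutive points beyond 2 sigma on the same side; four of five beyond 1 sigma on the same side; eight consecutive points on one side of the center line). These are the textbook formulations3 and are assumed to correspond to the rules of table 2, which is not reproduced here.

```python
import numpy as np

def western_electric(points, center, sigma):
    """Indices of points breaching any of the four standard Western Electric
    rules, given the center line and sigma (one third of UCL minus center)."""
    z = (np.asarray(points, dtype=float) - center) / sigma
    flagged = set()
    for i in range(len(z)):
        if abs(z[i]) > 3:                                        # rule 1
            flagged.add(i)
        if i >= 2 and (sum(v > 2 for v in z[i-2:i+1]) >= 2 or
                       sum(v < -2 for v in z[i-2:i+1]) >= 2):    # rule 2
            flagged.add(i)
        if i >= 4 and (sum(v > 1 for v in z[i-4:i+1]) >= 4 or
                       sum(v < -1 for v in z[i-4:i+1]) >= 4):    # rule 3
            flagged.add(i)
        if i >= 7 and (all(v > 0 for v in z[i-7:i+1]) or
                       all(v < 0 for v in z[i-7:i+1])):          # rule 4
            flagged.add(i)
    return sorted(flagged)
```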
Bin size strongly impacts the sensitivity of SPC charts.3 Furthermore, a variable bin size (obtained, for example, by using the number of cases performed per day as the bin size) makes for more complicated calculations as well as variable control limits. Therefore, we chose to use fixed bin sizes for this study to keep the method simple and accessible. To assess the influence of bin size on the results for nonoperative time performance in ORs, we conducted multiple analyses with bin sizes of three, four, five, six, and seven consecutive nonoperative times per bin. Based on the results of this first analysis, we chose a constant bin size of five for subsequent analyses, representing a compromise giving adequate sensitivity and simplicity.
Assessment of Raw Data Quality
The SPC methodologies described above assume the data in the analysis are normally distributed, although studies show the technique is very robust to deviations from normality.10,11 To assess normality of the raw data, we plotted histograms and probability plots and performed Shapiro-Wilk tests using the JMP statistical software package (Cary, NC). OR time data are frequently skewed to the right and hence are better described by a log-normal than by a normal distribution.12 Accordingly, we then log transformed the data and repeated the tests. Finally, the transformed data were plotted in control charts, and these were compared with control charts for the untransformed data to assess the functional consequence of deviations from normality on the results.
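The equivalent checks are straightforward to reproduce outside JMP; a minimal sketch in Python with SciPy (the input file name is hypothetical) might be:

```python
import numpy as np
from scipy import stats

times = np.loadtxt("nonoperative_times.txt")   # hypothetical input file

w, p = stats.shapiro(times)                    # Shapiro-Wilk on raw data
print(f"raw: W={w:.3f}, p={p:.4f}")            # p < 0.05 suggests non-normality

w_log, p_log = stats.shapiro(np.log(times))    # repeat on log-transformed data
print(f"log: W={w_log:.3f}, p={p_log:.4f}")
```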
A second condition for using control charts is that the data are not autocorrelated, i.e., each data point represents an independent observation. We tested for autocorrelation for lags of one through five behind each data point. To do this, we produced scatter plots and used the autocorrelation function for time series in the JMP statistical software package.
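A simple stand-in for the JMP autocorrelation function is to compute the Pearson correlation of the series against itself at each lag; a minimal sketch, under the assumption that the series is passed as a plain array:

```python
import numpy as np

def lag_correlations(x, max_lag=5):
    """Pearson correlation of a series with itself at lags 1..max_lag;
    values near zero are consistent with independent observations."""
    x = np.asarray(x, dtype=float)
    return {lag: float(np.corrcoef(x[:-lag], x[lag:])[0, 1])
            for lag in range(1, max_lag + 1)}
```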
After these assessments of the data, we used modifications to basic SPC methodologies to conduct the following three specific studies.
Time to Proven Performance Difference
To answer the first question, "How quickly could the ORF's impact have been detected using SPC techniques?" we conducted the following analysis. We used 6 months of data for the ORF surgeons as a group to construct a baseline for nonoperative performance before ORF opening. For the SPC analyses, we used bin sizes of three, four, five, six, and seven, seeking a reasonable compromise between robustness of the method and loss of sensitivity (see below). Using the pre-ORF baseline data, we calculated the center line, control limits, and 1- and 2-sigma lines. We then added further data points (constant bin size of nonoperative times for same-surgeon, back-to-back cases in the ORF), seeking the minimum elapsed time (in days) after ORF inauguration at which a systematic difference in performance could be confirmed using any one of the Western Electric rules. We repeated this process with bin sizes ranging from three to seven nonoperative times per bin.
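Putting the pieces together, a minimal sketch of this procedure (reusing `xbar_r_limits` and `western_electric` from the sketches above) freezes the limits on the pre-ORF baseline and scans successive post-inauguration bins for the first rule breach:

```python
import numpy as np

def first_detection(baseline, new_times, bin_size=5):
    """Index of the first post-intervention bin breaching a Western Electric
    rule, judged against limits frozen on the baseline (None if no breach)."""
    center, ucl, _ = xbar_r_limits(baseline, bin_size)
    sigma = (ucl - center) / 3                   # 3-sigma limits imply sigma
    n = (len(new_times) // bin_size) * bin_size
    means = np.asarray(new_times[:n], dtype=float).reshape(-1, bin_size).mean(axis=1)
    hits = western_electric(means, center, sigma)
    return hits[0] if hits else None
```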
Sensitivity Analysis
To judge the sensitivity of SPC as a method for analyzing nonoperative times, we manipulated the postintervention nonoperative times to determine the smallest difference that could be detected within 3 months (i.e., one fiscal quarter) of opening an ORF-type project. We did this by successively adding 5-min increments to each ORF nonoperative time, repeating the SPC analysis (bin size of five) on the newly adjusted test data set, and assessing the performance of the method in detecting a difference between the ORF and SOR nonoperative times using any one of the Western Electric rules. We repeated this approach with ORF data sets successively manipulated until the nonoperative performance improvement observed in the actual ORF data was almost completely erased.
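The manipulation itself reduces to a short loop; in the sketch below, `sor_times` and `orf_times` are placeholder names for the pre- and post-inauguration series, not the study's actual data:

```python
for shift in (0, 5, 10, 15, 20):                 # minutes added to each case
    shifted = [t + shift for t in orf_times]     # shrink the apparent gap
    bin_idx = first_detection(sor_times, shifted)
    print(f"+{shift} min: first rule breach at bin {bin_idx}")
```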
Longitudinal Study
To assess the durability of the ORF intervention over time (years), we conducted a slightly different SPC analysis. Using the same methodology as in the “time to proven performance difference” study above, we constructed SPC charts for the entire period of the ORF's operation. The center line, control limits, and sigmas were calculated from the second through the seventh months of ORF operation. Data from the initial month of ORF operation were omitted from these calculations to eliminate the impact of any early learning effects as the ORF team became familiar with the system. Bin size was set at five consecutive nonoperative times per bin. The chart was analyzed by the Western Electric rules, and points indicating that the process was out of control were investigated by reviewing administrative and medical records, as well as personnel deployment records, to determine the source (institutional, patient related, or staff related) of out-of-control periods, if possible.
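A sketch of this longitudinal variant, assuming each case carries a day offset from ORF inauguration (the `day_offsets` input is hypothetical), sets the limits from roughly months 2 through 7 and then scans the whole run:

```python
import numpy as np

def longitudinal_flags(times, day_offsets, bin_size=5):
    """Flag Western Electric breaches over the full series, with limits set
    from cases in roughly months 2-7 (days 30-210 after inauguration).
    Reuses xbar_r_limits and western_electric from the sketches above."""
    reference = [t for t, d in zip(times, day_offsets) if 30 <= d < 210]
    center, ucl, _ = xbar_r_limits(reference, bin_size)
    sigma = (ucl - center) / 3
    n = (len(times) // bin_size) * bin_size
    means = np.asarray(times[:n], dtype=float).reshape(-1, bin_size).mean(axis=1)
    return western_electric(means, center, sigma)
```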
Results
Tests of Normality and Independence
Statistical process control techniques usually assume that the data in control charts are normally distributed and that each observation is independent of others in the sample. Therefore, before using the technique to evaluate a process, it is important to understand whether the data are normally distributed (or whether departures from normality have functional consequences for the results) and independent.
We used scatter plots, probability plots, and the Shapiro-Wilk test to assess the normality of three data sets: (1) "pre-ORF" data from the 6-month period before ORF inauguration, (2) "post-ORF intervention" time data from the 6-month period after ORF inauguration, and (3) a "post-ORF longitudinal" data set comprising all data from the 31 months after ORF inauguration. The Shapiro-Wilk test indicated that the pre-ORF and post-ORF longitudinal data sets were not normally distributed (P < 0.05). Although not normally distributed, scatter plots and probability plots showed the data were not highly skewed. Schilling and Nelson11 studied the effect of nonnormality on the control limits of charts and argue, "when used with nonnormal distributions, reasonably small probabilities of a type I error should suffice for construction of the chart." For the nonnormal distributions studied by Schilling and Nelson,11 the risk of a point falsely falling outside the control limits was shown to be 1.4% or less when plotting samples of four or more points and using the normal assumption.
Operating room time data are sometimes better modeled by a log-normal distribution.12 To assess whether this was true of our data, we log transformed each of the data sets above and repeated the tests for normality. After transformation, the Shapiro-Wilk test indicated that all three data sets could be described by a normal distribution. To assess the functional consequence of deviations from normality on our results, we plotted the transformed data in control charts and compared these with control charts for the untransformed data. The charts for log-transformed and untransformed data revealed no difference in results (data not shown). Because control charts have been shown to be robust to departures from normality and because log transforming the data did not change our results, we decided to present the untransformed data for simplicity and accessibility.
Individual values for OR times are frequently assumed to be independent, but Dexter et al.13 found that successive turnover times could be correlated, thus invalidating statistical approaches that assume independence of sample points. Pooling the sample data is a solution to this problem.13 To test for independence of the sample points in our data, we plotted each point against its immediately preceding point in the data set (i.e., we made a scatter plot of the data vs. lag = 1). We repeated this process for lags of two through five. For each of the data sets described above, each of the lag scatter plots showed no correlation between data points (not shown). Statistical tests for autocorrelation for lags one through five confirmed the lack of correlation. Therefore, we concluded that SPC is a valid technique for use with our OR time data.
Time to Proven Performance Difference
In our previous work, more than a year elapsed between ORF inauguration and the analysis that demonstrated a marked reduction in ORF nonoperative times and the consequent increase in OR throughput.1 In the current investigation, we applied SPC to determine how quickly the systematic change in performance manifested by the ORF could be detected using continuous process monitoring. Figure 2 is a representative nonoperative time-versus-date SPC chart for ORF surgeons encompassing the period around the ORF inauguration. Each data point represents the average number of minutes, in a bin size of five, of nonoperative time between two procedures performed back-to-back by the same surgeon. Using the formulas presented previously in the article, the upper and lower control limits were determined and are shown in the chart.
Fig. 2. Statistical process control chart showing the nonoperative performance of Operating Room of the Future (ORF) surgical teams working in standard operating rooms during the 6 months leading up to ORF inauguration and then continuing for the first 6 months of ORF utilization. Each data point is the average for a bin of five consecutive nonoperative times for which a surgeon followed themselves. The Center Line denotes the average nonoperative time for the 6 months before ORF inauguration. The 1-Sigma, 2-Sigma, and Control Limit lines for the baseline performance are drawn 1, 2, and 3 SDs, respectively, above and below the Center Line.
Control limits should be based on a process that is in control. Therefore, we investigated the few points out of control in the baseline data (fig. 2), seeking common features that might indicate a nonrandom cause for their occurrence. No consistent assignable cause was found for these data points. Considering that a few out-of-control points will not distort the control limits significantly,3 we chose to retain these data and keep the control limits unchanged.
Studying the run of data from June 3, 2002, through March 3, 2003, in figure 2, a singular decrease in nonoperative time appears in September 2002, coinciding with the inauguration of the ORF. The decrease is significant, from a former average of 61 min of nonoperative time between operations (for the period March 4, 2002, to August 27, 2002) to 39 min (for the period September 5, 2002, to March 3, 2003). We used the four Western Electric rules (table 2) for analyzing this SPC chart. Table 3 shows the dates by which SPC analysis first indicated a systematic change in the nonoperative process of the ORF relative to SORs at our institution for bin sizes three through seven and for each of the four rules. For the ORF, a systematic shift in the process outcome (nonoperative time) was apparent by SPC methodologies at day 2 after inauguration. With the benefit of hindsight in such a retrospective study, we naturally know where to look for any anomalies in the SPC chart. However, inspection of figure 2 and application of the Western Electric rules indicates that the magnitude of the decrease in nonoperative time is sufficiently large that the change would have been instantly identified even without knowing that there had been an intervention.
With respect to a learning curve associated with adopting the new patient flow process shown in figure 1, there was only a singular decrease in Nonoperative Time during the first weeks and months of ORF operation. There was no gradual decline in Nonoperative Time that would be consistent with a learning curve (fig. 2).
Sensitivity Analysis
Typically, changes in OR and hospital performance are much less dramatic than that produced by the ORF, but they still arise from (and point to) an intervention or some other event leading to changed performance. Hence, it is important to have a sensitive tool for detecting changes in process efficiency, i.e., a tool for distinguishing systematic variation from random variation. To this end, we investigated how sensitive SPC methods are for detecting smaller changes in ORF process performance. We tried to simulate smaller performance changes by adding time to the measured nonoperative times after the ORF was opened, narrowing the gap between apparent ORF performance and the previous SOR performance.
We conducted the sensitivity analysis on the same data used to determine how quickly the ORF intervention could be detected. We iteratively repeated the SPC analyses, using a bin size of five, modifying the nonoperative times for the ORF to simulate successively smaller improvements in nonoperative performance. The results are shown in figure 3, and the dates by which systematic variation could first be detected for the manipulated data using the Western Electric rules are shown in table 4. In figure 3, the top tracing replots the actual SPC results given in figure 2. In the second tracing down from the top, 5 min has been added to the ORF nonoperative times, thus reducing the gap in nonoperative times before and after the introduction of the ORF by almost 25%. Even a perfunctory visual inspection of figure 3 reveals that the break point denoting ORF inauguration is easily identifiable. By adding 10 min to the post-ORF nonoperative times (fig. 3, middle tracing), the gap is further lessened, but the break is still clearly visible. In the next tracing down, 15 min has been added, thus closing the gap between SOR and ORF nonoperative times by almost 75% of the true performance improvement. At this stage, one can hardly claim that a simple glance at the SPC chart reveals the break point. However, when applying SPC properly, changes are detected by subjecting the continuous measurements to the predefined Western Electric rules for spotting process changes. Even in this case, the change in nonoperative time is quickly detected by a majority of points falling below the center line in a nonrandom fashion.
Fig. 3. Sensitivity analysis of the statistical process control methodology conducted by modeling successively smaller differences between the Operating Room of the Future (ORF) and standard operating rooms. The top tracing is the statistical process control chart of the actual ORF impact relative to historical performance of ORF teams working in standard operating rooms. In the succeeding traces, 5, 10, 15, and finally 20 min have been added to each data point after ORF inauguration. LCL = lower control limit; UCL = upper control limit.
Finally, in the bottom tracing of figure 3, 20 min have been added to the ORF nonoperative times, bringing the simulated ORF average nonoperative time up to 59 min, close to the historic average of 61 min. This represents a difference of only approximately 3.5% between the modeled ORF performance and the real SOR performance. The SPC chart shows no apparent break point. Application of the Western Electric rules does not quickly reveal any systematic improvement in ORF performance relative to the SOR.
Table 4 indicates that the Western Electric rules have different advantages depending on the outcome under study. For example, rule 1 detects big changes the fastest, but when the differences in performance are smaller, rules 2, 3, and 4 perform better. Therefore, the best rule to use may not always be readily apparent. All four rules detect patterns that are statistically highly unlikely for any stable process, i.e., they detect nonrandom behavior.
Longitudinal Study
Next, we present a longitudinal study of ORF nonoperative performance for 31 continuous months. In this longitudinal study, we searched for systematic deviations from typical ORF performance. We also correlated known changes in ORF working conditions that might impact ORF nonoperative performance with the SPC results from the periods when these changes were made.
Figure 4 is a control chart (with a bin size of five) showing nonoperative times for the ORF surgeons for the 3 months before ORF inauguration (for reference) and for the subsequent 31 continuous months. Personnel changes in the ORF are also denoted in the figure. There are several features of figure 4 that are worthy of discussion. First, the enhanced performance of the ORF relative to SORs persists over time. Second, the ORF's performance seems to degrade slightly beginning in June 2004. Third, the ORF performance improvement seems to be robust to personnel changes.
Fig. 4. Nonoperative performance of Operating Room of the Future (ORF) surgical teams working in standard operating rooms during the 3 months leading up to ORF inauguration, and then for 31 consecutive months of ORF utilization. The Center Line shows baseline ORF performance to which subsequent data are compared. Control Limit lines for the baseline performance are drawn 3 SDs above and below the Center Line. Substitutions of attending surgeons, substitutions of anesthesiology residents for the certified registered nurse anesthetist (CRNA) staff, and multiple replacements of attending anesthesiologists on the team are illustrated.
The ORF was initially organized as a special project, with volunteer staff and a more generous staffing ratio than the SORs. Therefore, the improved performance of the ORF should include an effect due to enhanced team motivation that would be expected to disappear over a period of months or years. However, longitudinal review of performance data demonstrates that the improvement in nonoperative performance persists. That is, there is minimal relaxation back toward the nonoperative performance of the SORs. Furthermore, the nonoperative performance of the SORs (ORF surgeons performing back-to-back cases in the SOR) has remained constant (data not shown). Therefore, the reduced nonoperative time required by the ORF relative to the SOR seems to be insensitive to any early "team" effect. Moreover, the process improvement seems to be permanent, at least at a gross level.
However, figure 4 shows a decrement in performance starting in June 2004. Inspection of figure 4 suggests that nonoperative times began to increase coincident with the addition of new anesthesia attending staff. Therefore, we searched the longitudinal data using the Western Electric rules over 1-month epochs, seeking changes in performance that correlated with discrete events such as changes in the ORF personnel team. No such correlated performance changes were found. Indeed, closer examination of figure 4 demonstrates that the first rule exceptions occurred on June 28, 2004, well before the anesthesia attending staff changes began to occur. Therefore, the performance decrement does not seem to be related to personnel changes. We then investigated each case with a nonoperative time greater than the UCL in bins triggering the Western Electric rules (i.e., bins showing systematic variation from in-control performance). Detailed analysis of the exception-triggering cases revealed that in one third of the cases, delay in getting a postanesthesia care unit (PACU) space due to congestion in the PACU was the cause. Because the nonoperative time includes the interval from Surgery Finish to Patient Out of Room, delays in obtaining a space in the PACU would impact nonoperative performance. The next leading causes were difficult regional anesthesia and the surgeon being in another OR. Together, these three potential causes could explain all of the points above the UCL. After removing cases with PACU delays from the data set, the decrement in nonoperative performance largely disappeared (not shown), implying that congestion in the PACU impacted OR performance.
Finally, review of the longitudinal SPC in figure 4indicates that isolated changes in the anesthesia or surgical personnel have not had systematic impacts on ORF nonoperative performance.
Discussion
We have used SPC methodology in the context of the ORF project to investigate four questions germane to evaluating the impact of changing hospital processes. These were: How quickly after an intervention can a change in performance be detected? Is there a learning curve associated with conversion to parallel processing of patient flow? How sensitive is the technique to small changes in highly variable processes? and How durable is this change in performance over time?
Operating room time intervals, including nonoperative times, have been studied using different methods (e.g., generalized linear modeling14 or by studying large samples13). When the goal is to detect changes quickly, SPC methodology has advantages over methods that are more traditionally applied in the medical setting. For example, in our previous analysis of ORF process effectiveness, we used a case-matched, balanced design in which nonoperative times for the same surgeons performing the same case mix in the two different environments (ORF vs. SOR) were compared. However, most of the surgeons working in the ORF performed the bulk of their cases in that room after inauguration. Therefore, we had to wait to perform the analysis until sufficient numbers of matching cases by ORF surgeons working in the SOR had been accumulated, and in some cases, we were unable to find sufficient contemporaneous controls even after 13 months. Furthermore, we had no reasonable estimates of the effect size before opening the ORF, so we were unable to perform a sample size calculation before beginning data collection. SPC methodologies avoid these difficulties. Used prospectively, SPC methods will detect changes in performance quickly by accumulating data continuously and identifying systematic shifts in a process. In the case of the ORF, we were able to identify a systematic shift in the nonoperative process 2 days after project inauguration. The location of all subsequent bins (well below the center line) in figure 2 supports the notion that the intervention reduced nonoperative time.
In the ORF, the patient movement and treatment pattern is more complex than that practiced in our SORs (fig. 1). The temporal spacing as well as the physical locations of induction and emergence differ between environments. In addition, events in the ORF happen on a compressed time scale relative to the SOR. The implementation of parallel processing for nonoperative activities, including the use of an induction room and the transportation of anesthetized patients between rooms, seemed to represent a fundamental change in practice. For the staff, the ORF amounted to a wholesale redesign of their work routine. Therefore, in addition to any process efficiencies that might be gained from the new system, we expected to see a learning curve, i.e., an improvement in performance during the early part of the project as the team began to use the space. Learning curves have been observed in other perioperative systems, such as migration from laparoscopic to robot-assisted prostatectomy.15 Examination of figure 2 (see period just after September 1, 2002) indicates that any early learning effect on the ORF Nonoperative Time was minimal if present at all.
Statistical process control analysis of the ORF for the 31 months following project inauguration indicates that after a singular improvement in nonoperative time (coinciding with room opening), the performance has been grossly stable over time. That is, the improvement in nonoperative performance has the added advantage of being durable. This durability of the effect is important, given the potential impact of “attention” on the performance measurements in the ORF. The design and objectives of the ORF were well known to everyone involved, planned well in advance, and characterized by some level of prestige. Knowing that the results of the ORF would be measured to assess the “return on investment,” it is likely that ORF teams put extra effort into their performance, thus contributing to the reduction in nonoperative time. However, SPC charts of the entire period of ORF use show little deviation from the performance benchmark established during the 7 months after project inauguration (at least when controlling for long OR emergence times that were influenced by PACU congestion). If a “teamwork” effect were contributing significantly to the ORF performance, we would expect more relaxation back to “average” performance as the novelty and prestige of the project wore off. Therefore, either the teamwork effect is prolonged, or the ORF performance improvement is inherently durable. In either case, the result is the same: SPC analysis confirms a sustained improvement in performance relative to the previous baseline.
Nonoperative performance was stable despite personnel changes in the ORF team that might have manifested as learning periods in the nonoperative process (fig. 4). The reasons for the absence of performance changes with the substitution of personnel in the ORF are unclear. Either the process is easier to learn than appears from the workflow diagram in figure 1, or the rest of the ORF team readily compensates during the learning period of a new team member, or both mechanisms are operant. The absence of a distinct learning curve during the early phase of ORF use, when the process was new to all users, suggests that the process is much more comprehensible to users on the ground than the process flow diagrams suggest. Therefore, it seems intuitive that a single-person substitution should be readily absorbed, and our SPC analysis validates this impression.
We have not been able to test SPC as a monitoring tool applied continuously in a prospective mode, but this is a logical continuation of our work. However, we have tried to estimate the power of the technique to detect smaller changes in performance with highly variable data. We did this by gradually decreasing the difference between ORF and SOR nonoperative time by adding 5, 10, 15, and 20 min to the ORF figures. Thus, by creating test data sets with small incremental performance differences, we have to some extent simulated the application of SPC to more subtle process changes. It should be noted that our sensitivity analysis cannot be simply translated to another clinical setting, where the variations in nonoperative times might be much larger (or smaller). However, the sensitivity of the SPC technique could be estimated in a different environment by essentially reversing our sensitivity analysis. That is, one could create an SPC chart using baseline data, define a measurement interval, and then incrementally adjust the baseline data enclosed by the measurement interval, noting the magnitude of the change at which SPC rules began to indicate a systematic change.
The ORF intervention was readily detected in all but the +20 min case, indicating the ability to detect systematic changes of far less magnitude than that produced by the ORF, even for processes with substantial random variation. Failure to detect the +20 min example was expected, given the process's substantial random variability and the deliberately small residual performance difference. In fact, the SPC chart reveals only one instance of nonrandom performance (table 4, rule 3, 1/31/03). If the process were being monitored continuously, this instance would be subject to a search for an assignable cause of the nonrandom behavior. One might perform a t test to check for statistically significant differences between ORF performance and the previous controls. Such a test (comparing the 6 months of nonoperative times before ORF inauguration with the subsequent 6 months of nonoperative time performance) reveals a nonsignificant difference (P > 0.05). A likely conclusion would be that there was no detectable effect of the intervention.
The detection of minor but real changes in OR process variables hinges upon the "rules" used to analyze the SPC charts. These rules represent practical adaptations of statistical considerations. Choice of bin size impacts the control limits of the SPC charts, and thus also the sensitivity of the charts as well as the risk for type I and type II errors. Choice of rules for analyzing control charts also impacts the risk for type I and type II errors. Therefore, deciding on bin size and rules for analyzing the charts are critical decisions that must be made in designing control charts. (See Montgomery3 and references therein10,11 for a full discussion of these considerations.) In our data, we have chosen to consistently apply the Western Electric rules to denote a change in performance. With a suitable set of rules, drift in a process should be quickly detected by whichever rule or rules are breached in a given situation.
Considering the SPC technique itself, there are previous examples of successful application to healthcare settings.4–8 The original application of SPC was to monitor product quality levels in terms of adherence to tolerances for geometric parameters of components. Therefore, it was not obvious that applying it to time parameters in a completely different setting would work as well. This is in part because the variability in perioperative process times is so much larger than the variabilities found in typical SPC applications (i.e., manufacturing). In this study, we have attempted to assess the utility of SPC for the analysis of perioperative process changes. Our results indicate that SPC methods are fully capable of revealing the effects of process changes on perioperative time intervals such as the nonoperative time.
The power of SPC lies in its ability to give an "early warning" that systematic changes are taking place in the process being monitored. This applies to even seemingly insignificant shifts in the average value of the performance variable studied. Such an effect of the ORF (i.e., a gradual improvement from SOR to ORF performance over an extended learning period) might have been more difficult to detect, but experience from other industries has shown that even slow "drift" away from the former average performance is detected by SPC.16 Therefore, it seems clear that SPC can be applied for postintervention assessment of perioperative process changes, whether the performance change is sudden (as in the inauguration of the ORF) or gradual (as in the decrement in ORF performance caused by PACU congestion). The ease with which SPC charts can be applied as a continuous monitoring tool as well as the simplicity of interpreting the SPC charts are other distinct advantages. Thus, SPC charts can provide timely feedback, even on a daily basis, to the people upon whose efforts any meaningful process improvements must depend.
The authors thank Bethany Daily, M.H.A. (Administrative Director, OR Information Systems, Massachusetts General Hospital, Boston, Massachusetts), for access to administrative data.