Although automated closed-loop control systems may improve quality of care, their safety must be proved under extreme control conditions. This study describes a simulation methodology to test automated controllers and its application in a comparison of two published controllers for Bispectral Index (BIS)-guided propofol administration.
A patient simulator was developed to compare controllers. Using input scripts to dictate patient characteristics, target BIS values, and the time course of surgical events, the simulator continuously monitors the infusion pump under control and generates BIS values as a composite of modeled response to drug, perceived stimulation, and random noise. The simulator formats the output stream of BIS data as input to the controller under test to emulate the serial output of the actual BIS monitor. A published model-based controller and a classic proportional integral derivative controller were compared when using the BIS value as a controlled variable. Each controller was tested using a set of 10 virtual patients undergoing a fixed surgical profile that was repeated with BIS targets set at 30, 50, and 70. Controller performance was assessed using median (absolute) performance error, divergence, wobble, and percentage time within BIS target range metrics.
The median performance error was significantly smaller for the proportional integral derivative controller than for the model-based controller. The median absolute performance error was smaller for the model-based controller than for the proportional integral derivative controller for each BIS target, reaching statistical significance for targets 30 and 50.
When simulating closed-loop control of BIS using propofol, the use of a patient-individualized, model-based adaptive closed-loop system with effect site control resulted in better control of BIS compared with a standard proportional integral derivative controller with plasma site control. Even under extreme conditions, the model-based controller exhibited no behavioral problems.
One proposed benefit of automated, closed-loop anesthesia delivery systems is that continuous, responsive control of anesthesia may improve quality of care compared with intermittent control (i.e., standard practice).1 However, one concern is that unsupervised, automated controllers may be unsafe. This article describes a simulation methodology to test automated controllers and presents the results of applying this methodology in a comparison of two published controllers.
Two different closed-loop algorithms using the Bispectral Index (BIS) value as the controlled variable to steer propofol administration have been published recently. Absalom et al.2 developed a closed-loop system using a proportional integral derivative (PID) controller and tested it during orthopedic surgery. Struys et al. developed a patient-individualized, model-based controller and tested it during sedation3 and during major surgery (laparotomy).4 Anesthesia was administered in three phases during clinical trials of both systems. First, propofol was administered in open-loop mode during induction (i.e., the initial set point was a target concentration rather than a target effect). Second, the loop was closed for the surgical phase when BIS reached its set point. Finally, the automated controller was interrupted when surgery was completed, and the anesthesiologist guided the recovery phase using standard practice. In the model-based approach, the relation between predicted propofol effect site concentration and BIS value was determined during the induction phase and was used to construct a patient-specific pharmacodynamic Emax model (or Hill curve) as a component of the controller during the surgical phase.
In an editorial that accompanied the clinical results of the model-based controller, Glass et al.5 questioned whether the controller was safe for a broader range of surgery or clinical interventions because all subsequent adjustments in drug administration were based on a static Hill curve derived during induction, and only a single target BIS value of 50 during adequate analgesia (i.e., spinal blockade3 or a continuous infusion of remifentanil4) had been studied. Therefore, the editorial stated that the closed-loop system had to be tested under extreme circumstances to establish fully the safety, efficacy, reliability, and utility of closed-loop anesthesia before adoption into the clinical setting. It might be considered inappropriate to stress human subjects under target effect settings and surgical stimuli beyond those accepted as good clinical practice. Although animal studies could be used to test extreme or uncommon circumstances, we believed that computer simulation of patients and intraoperative events would enable a more thorough characterization of controller responses to variation in patient types and interventions. Computer simulations are frequently used in various disciplines to evaluate control systems. For example, simulations using real human data are often used in pharmaceutical research and regulatory decision making.6 Risk calculations using realistic computer simulations are considered state of the art in aerospace engineering and testing.7
A number of basic components are required to develop a satisfactory patient simulator for closed-loop testing: (1) It should calculate (simulate) an appropriate effect in response to drug administration, based on an internal combined pharmacokinetic–pharmacodynamic model. Ideally, this model is based on the relation between drug and effect determined from previous clinical studies. (2) It should provide a means to simulate noxious stimuli to trigger closed-loop control actions. (3) It should provide a means to simulate monitoring delay because each monitoring device introduces some delay between drug effect and the monitor’s updated estimate of the effect parameter. Any delay in the control loop may severely influence the behavior and stability of a closed-loop controller. (4) It should provide a means to vary patient model parameters (e.g., to vary the relation between drug and effect) to simulate a patient population for the controller under test. (5) It should be able to simulate effect responses to interventions/events unrelated to drug changes (e.g., add a bias and/or random variation to the expected drug effect to simulate patient movement, responses to minor random stimuli, and others) to verify controller stability in pseudo–steady-state situations. (6) It should enable ease of use by integrating these features into standard input configurations to enhance reproducibility and standardization. (7) Finally, it should interface with existing closed-loop control systems without requiring modifications to the systems.
The aims of this study were (1) to develop a simulation methodology to stress closed-loop anesthesia control systems, and (2) to apply this methodology to compare the performance of two previously published control systems.
Materials and Methods
Software and Hardware Configuration
As shown in figure 1, the complete simulation trial system consists of two computers and an infusion device. The closed-loop system operates on the first computer, whereas the patient simulator operates on the second computer. Incoming BIS data from the patient simulator are used as the controlled variable by the controller to calculate an accurate propofol infusion regimen to maintain a preset target BIS value. The infusion commands are continuously sent from the controller to the infusion pump to infuse propofol. They are captured as well by the patient simulator to adjust the simulated effect. The details of operation are explained below.
First Computer: Closed-loop System.
In this study, two different closed-loop systems were compared. For both systems, the RUGLOOP II application framework was used, as developed by two of the coauthors (T. D. S. and M. S.). This program is a modular application frame with a means of standardized data exchange between modules.
In the model-based group, the closed-loop system was equipped with the controller developed and described in detail by Struys et al.4 It uses incoming BIS data as the controlled variable and is equipped with a patient-individualized model-based controller to steer the propofol administration. In the PID group, the closed-loop system was equipped with the controller developed and described in detail by Absalom et al.2 This control system also uses the BIS as the controlled variable but has a PID algorithm. Both controllers acted at least every 5 s.
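For readers unfamiliar with the PID principle, the following is a minimal textbook sketch of a discrete PID update, not the published controller of Absalom et al. The class name `PID`, the gain values, and the interpretation of the output as an adjustment to a target plasma concentration are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID; gains are hypothetical, not published constants."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, measured_bis: float, target_bis: float, dt: float) -> float:
        # A positive error means BIS is above target (patient too light),
        # so a positive output asks for more drug.
        error = measured_bis - target_bis
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # Assumed meaning of the output: change in target plasma concentration.
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Example: one control action every 5 s, as stated for both controllers.
controller = PID(kp=0.1, ki=0.01, kd=0.0)
adjustment = controller.update(measured_bis=60.0, target_bis=50.0, dt=5.0)
```

A model-based controller differs in that it inverts a patient-specific drug–effect model instead of relying only on the error signal; that logic is not sketched here.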
Second Computer: Patient Simulator.
As shown in figure 1, the patient simulator executes four functions. First, it monitors changes in patient or surgical characteristics as input by the user or a script file. Second, it monitors the pump to assess the volume of drug infused in the virtual patient. Third, it estimates the resultant BIS value as a composite of response to drug, perceived stimulation, and random noise. Finally, it formats the output as input to the controller to emulate the serial output of the actual monitor (e.g., an A2000 BIS® monitor; Aspect Medical Systems Inc., Newton, MA).
The following steps are required for derivation of the final BIS value: (1) estimate the propofol plasma and effect site drug concentrations from the history of drug administration using the pharmacokinetic–pharmacodynamic model of Schnider et al.8,9; (2) produce an initial (drug-specific) BIS estimate by converting the effect site concentration into a BIS value using an Emax pharmacodynamic model; (3) delay the initial BIS estimate over a number of seconds to simulate monitoring delay; (4) add a random, normally distributed noise value (with a mean value of 0 and an SD of 3) to simulate inherent BIS variability; and (5) offset the calculated BIS value with an error signal (i.e., a bias shift in the average BIS value) when simulating response to stimuli or changes in surgical circumstances.
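Steps 2 through 5 above can be sketched as follows. This is an illustrative sketch only: the effect site concentration from step 1 is taken as an input, the Emax parameter values (`e0`, `emax`, `ce50`, `gamma`) are hypothetical rather than taken from table 1, and clamping the result to the 0–100 BIS scale is an added assumption.

```python
import random
from collections import deque


def emax_bis(ce, e0=95.0, emax=95.0, ce50=3.0, gamma=2.0):
    """Sigmoid Emax (Hill) model mapping effect site concentration to BIS.

    All default parameter values are illustrative, not study values.
    """
    if ce <= 0.0:
        return e0
    return e0 - emax * ce**gamma / (ce50**gamma + ce**gamma)


class BisSynthesizer:
    """Turns a stream of effect site concentrations into simulated BIS values."""

    def __init__(self, delay_samples, noise_sd=3.0, baseline=95.0, seed=None):
        # Pre-fill the delay line with the baseline so early outputs are defined.
        self._delay_line = deque([baseline] * delay_samples)
        self._noise_sd = noise_sd
        self._rng = random.Random(seed)

    def step(self, ce, offset=0.0):
        drug_bis = emax_bis(ce)                                  # step 2: Emax model
        self._delay_line.append(drug_bis)                        # step 3: monitoring delay
        delayed = self._delay_line.popleft()
        noisy = delayed + self._rng.gauss(0.0, self._noise_sd)   # step 4: noise, SD 3
        shifted = noisy + offset                                 # step 5: stimulus offset
        return min(100.0, max(0.0, shifted))                     # clamp (assumption)


sim = BisSynthesizer(delay_samples=2, seed=1)
trace = [sim.step(ce=3.0, offset=0.0) for _ in range(5)]
```

The delay is expressed in samples here; with one BIS value per second, `delay_samples` would equal the delay in seconds.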
The delay time, Emax model, error signal, and noise amplitude can be dynamically altered through the user interface of the simulator or via a file containing a script of time-stamped changes in parameter values. Dynamic interaction allows the user to experiment with the simulation to identify conditions that stress the controller. Simulations run via script control ensure that various virtual patients and controllers are tested in a reproducible fashion.
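The idea of a script of time-stamped parameter changes can be illustrated with a small sketch. The actual script format of the simulator is not described in this article, so the tuple layout `(time_s, name, value)`, the parameter names, and the helper `params_at` are hypothetical.

```python
def params_at(script, t, defaults):
    """Return the simulator parameters in force at time t (seconds).

    script: iterable of (time_s, name, value) events; later events override
    earlier ones. The layout is an assumption made for illustration.
    """
    params = dict(defaults)
    for when, name, value in sorted(script, key=lambda event: event[0]):
        if when > t:
            break
        params[name] = value
    return params


# Hypothetical example: a stimulus offset starting at 10 min, more noise at 15 min.
script = [
    (0, "offset", 0.0),
    (600, "offset", 15.0),
    (900, "noise_sd", 5.0),
]
defaults = {"offset": 0.0, "noise_sd": 3.0, "delay_s": 10}
active = params_at(script, 700, defaults)
```

Because the same script can be replayed for every virtual patient and controller, each run sees an identical sequence of events, which is what makes the comparison reproducible.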
We composed a simulation protocol to evaluate controller performance by adjusting parameters within the three components that comprised a study: the virtual patient, a stimulus profile, and the BIS target level.
To generate the virtual patient population, the patient simulator was fed with 10 different pharmacodynamic profiles. We defined a pharmacodynamic profile for a virtual patient as a certain drug effect site concentration-versus-effect relation (i.e., an Emax model) combined with a certain additional delay that could be imposed by certain monitor types. To obtain realistic values, we used the Emax models derived from our previous clinical work as calculated at the end of the induction phase using data points measured during the induction phase.4 The delay in BIS was taken from previous work by Schnider et al.8,9 This resulted in the set of virtual patients used in our study (table 1).
We also required a standard theoretical stimulus profile to apply to a virtual patient during the controller evaluation to emulate the patient arousal reflexes during surgical procedures. Several methods of simulating arousal reflexes were considered. We selected the most straightforward way by translating the stimulus level into an offset imposed on the simulated BIS value. A BIS offset time profile was composed to emulate a typical stimulus trajectory of a surgical case. The total case time is exactly 1 h, including induction and time after skin closure. The BIS offset profile used for all simulations is shown in figure 2.
The simulation trial for each virtual patient was run at three different control targets: target BIS values of 30, 50, and 70.
Evaluation of Controller Performance and Statistics
To compute the percent of time the BIS value was under acceptable control during maintenance (i.e., starting 10 min after the start of induction), acceptable BIS control was defined as maintaining the BIS value within ± 10 BIS units of the target value. The percent of time of acceptable BIS control, as well as percentages of time when the BIS value was above or below the range, were calculated at each target. Significance between controllers was tested using the paired t test (SPSS 10.0; SPSS Inc., Chicago, IL).
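The time-in-range calculation can be sketched as below, assuming equally spaced BIS samples so that the fraction of samples equals the fraction of time. The function name `time_in_range` and the dictionary keys are illustrative choices.

```python
def time_in_range(bis_values, target, band=10.0):
    """Percent of samples within, above, and below target +/- band.

    Assumes equally spaced samples from the maintenance phase only.
    """
    n = len(bis_values)
    if n == 0:
        raise ValueError("no BIS samples supplied")
    above = sum(1 for v in bis_values if v > target + band)
    below = sum(1 for v in bis_values if v < target - band)
    within = n - above - below
    return {
        "within": 100.0 * within / n,
        "above": 100.0 * above / n,
        "below": 100.0 * below / n,
    }


result = time_in_range([50, 61, 39, 45, 60], target=50)
```

Values exactly 10 units from the target are counted as within range in this sketch.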
According to the method of Varvel et al.,10 previously applied by Kansanaho et al.11 for the performance of a closed-loop system for muscle relaxants, the overall performance of both controllers was characterized on the basis of the following parameters for the period when the variable was being controlled. First, using all observations within the period, the performance error (PE) was calculated for the jth sample in the ith subject according to the formula:
$$PE_{ij} = \frac{BIS_{\mathrm{measured},ij} - BIS_{\mathrm{target}}}{BIS_{\mathrm{target}}} \times 100$$
Subsequently, bias (median performance error [MDPE]), inaccuracy (median absolute performance error [MDAPE]), divergence, and wobble were calculated as follows.
The MDPE is a measure of bias and describes whether the measured values are systematically either above or below the target value. MDPE was calculated from the measured samples j:
$$MDPE_i = \mathrm{median}\{PE_{ij},\ j = 1, \ldots, N_i\}$$
where $N_i$ is the number of PE values obtained for the ith subject.
The MDAPE reflects the inaccuracy of the control method in the ith subject:
$$MDAPE_i = \mathrm{median}\{|PE_{ij}|,\ j = 1, \ldots, N_i\}$$
where $N_i$ is the number of PE values obtained for the ith subject. Divergence describes the possible time-related trend of the measured effects in relation to the targeted values. It is defined as the slope of the linear regression equation of the absolute PE against time and is expressed in units of percentage divergence per minute. A positive value indicates progressive widening of the gap between targeted and measured values, whereas a negative value reveals that the measured values converge on the targeted values.
Wobble is another index of the time-related changes in performance and measures the intrasubject variability in PEs. In the ith subject, the percentage of wobble is calculated as follows:
$$Wobble_i = \mathrm{median}\{|PE_{ij} - MDPE_i|,\ j = 1, \ldots, N_i\}$$
For PE, MDPE, MDAPE, divergence, and wobble, the SE was calculated using the two-stage approach as described by Varvel et al.10 Differences between groups were calculated using the paired t test. We also calculated the amount of propofol theoretically used for both controllers at each target level.
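The per-subject metrics can be computed as in the following sketch. It assumes that PE is the measured-minus-target difference expressed as a percentage of the target, and that divergence is the least-squares slope of the absolute PE against time in minutes, following the usual Varvel definitions. The function name `performance_metrics` is illustrative, and the two-stage SE calculation is not shown.

```python
from statistics import median


def performance_metrics(times_min, bis_values, target):
    """MDPE, MDAPE, wobble (all in %) and divergence (% per minute) for one subject."""
    if len(times_min) != len(bis_values) or len(times_min) < 2:
        raise ValueError("need at least two paired samples")
    pe = [100.0 * (b - target) / target for b in bis_values]
    abs_pe = [abs(e) for e in pe]
    mdpe = median(pe)                            # bias
    mdape = median(abs_pe)                       # inaccuracy
    wobble = median(abs(e - mdpe) for e in pe)   # intrasubject variability
    # Divergence: ordinary least-squares slope of |PE| versus time.
    t_mean = sum(times_min) / len(times_min)
    a_mean = sum(abs_pe) / len(abs_pe)
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in zip(times_min, abs_pe))
    divergence = sxy / sxx
    return {"MDPE": mdpe, "MDAPE": mdape, "wobble": wobble, "divergence": divergence}


metrics = performance_metrics(
    times_min=[0, 1, 2, 3, 4],
    bis_values=[40, 45, 45, 50, 50],
    target=50,
)
```

In this toy trace the BIS value starts below target and approaches it, so the bias is negative and the divergence is negative (the errors shrink over time).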
Results
All 60 virtual operations (i.e., 10 patients × 3 target BIS values × 2 controllers) were simulated. However, because the PID controller had no means to control the induction phase, and neither controller could control the recovery phase, we analyzed the controller performance only during the maintenance phase to obtain an equivalent base for comparison.
Table 2 shows the percentage of time during accurate BIS control (i.e., BIS value within ± 10 BIS units of the target value) and the percentages of time when the BIS value was outside of this range (i.e., too high or too low), indicating less accurate control. Results for a typical case (using patient 1) for each controller and for each BIS target are shown in figure 3 for the PID and model-based controllers, respectively. In each plot, the top trend displays the results for a target BIS value of 70, the middle one is the target 50 trend, and the bottom trend shows the results for a target BIS value of 30. This example shows that the model-based controller provided tighter control of BIS values near each of the BIS targets and was more responsive to changes in BIS values due to an increase or withdrawal of perceived stimulation.
The simulation results are shown in table 3. A significantly smaller mean MDPE value was observed for the PID controller compared with the model-based controller. In contrast, the model-based controller showed better MDAPE results than the PID controller for each BIS target, reaching statistical significance for targets 30 and 50.
The PID controller produced better divergence results than the model-based controller, whereas the model-based controller showed an improved wobble. Interpretation in context together with the MDAPE results shows that the relative improvement over time was greater for the PID controller because its performance was much worse initially; therefore, it had a greater opportunity to improve over time.
The intracontroller performance difference over the three targets showed a globally improving behavior at higher BIS values for both controllers, as shown in table 3. This might be caused by the division by the target value within the PE calculations, yielding lower relative errors for higher targets for a given difference between measured and targeted BIS values.
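As a worked illustration of this scaling effect (using an arbitrary deviation of 5 BIS units, chosen only for the example), the same absolute deviation yields a smaller absolute PE as the target increases:

$$|PE| = \frac{5}{30} \times 100 \approx 16.7\%, \qquad |PE| = \frac{5}{50} \times 100 = 10\%, \qquad |PE| = \frac{5}{70} \times 100 \approx 7.1\%$$

for targets of 30, 50, and 70, respectively.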
Figure 4 shows the volume of propofol used for both controllers at each target. No significant differences were found between control systems.
Discussion
The purpose of this study was to test the previously described patient-individualized, model-based adaptive closed-loop system under several usual and extreme circumstances and to compare it with the PID-based control system. Because it was impossible to create these conditions in clinical practice, a patient simulator was developed to simulate virtual patients. The use of a set of 10 virtual patients in our patient simulator enabled a comparison of the overall performance of both controllers. Because a controller monitors recent BIS values to influence subsequent BIS values, it is not possible to directly use, for example, a set of previously recorded BIS trends to evaluate a controller. The only way to use historical data is to craft a model to describe the relation between drug concentration, stimulation profiles, and resultant BIS values. To create realistic simulation trials, real patient pharmacodynamic profiles from our previous closed-loop work were used.4
The clinical performance goal of any closed-loop system is to provide tight control. When defining an adequate level of control as having a BIS value within ± 10 BIS units of the target value, table 2 shows that the percent of time during adequate BIS control was higher with the model-based controller than with the PID controller, reaching significance at targets 30 and 50. The BIS value was not always controlled within 10 points of the target BIS because the surgical profile in this study was designed to test controllers during a number of rapid and extreme changes in patient state. The model-based controller was able to adapt more quickly to these events, thus providing a larger percent of time near the BIS target. For each targeted BIS level, significantly longer periods of BIS levels above the target were recorded using the PID controller compared with the model-based controller. This might lead to a higher risk of awareness when targeting BIS at deeper levels of anesthesia. At target 70, when subjects are expected to be aware, longer periods of BIS values below the target of 70 were observed when using model-based control. There were no significant differences between controllers in the duration of the BIS value being too low when targeting BIS levels of 30 and 50.
Previously, O’Hara et al.12 proposed the goals of automated control in anesthesia. These goals were defined as (1) keeping the average value of the controlled variable within defined limits; (2) minimizing oscillations in the controlled variable within these limits; and (3) guaranteeing stability so that over time the size of the oscillations either becomes smaller or remains constant at an acceptable level, rather than increasing, which would allow the controlled variable to swing wildly. A mathematical interpretation of these criteria can be found in Varvel et al.10 for computer-controlled infusion pumps. As was demonstrated earlier, these criteria can be applied to closed-loop controller performance after minor modifications.11
As stated above, MDPE indicates the bias of the controller. It reveals information neither on dynamic or higher-frequency behavior nor on the amplitude of possible oscillations in control. The MDPE is a signed value and thus represents the direction (overprediction or underprediction) of the PEs rather than the size of the PEs, which is represented by MDAPE.10 Even though MDAPE does not indicate the sign of a possible bias, it describes both the amplitude of possible bias and all other errors that prevent the controller from approaching the control target. In our study, it was observed that MDPE for both controllers is negative, which indicates that both controllers tend to overdose, leading to BIS levels below target. This can be attributed to the fact that both controllers perform, in essence, an asymmetric control operation. They only govern the infusion, not the elimination of drug from the body, which is a slower process. This phenomenon has been observed in our earlier studies as well.4 The overdose could be solved relatively easily, without modifying the control operation or dynamics, by setting an increased virtual target that equals the current target plus the average absolute value of the MDPE currently observed. Because shifting the target would most probably increase the MDAPE, this solution was not retained.
Table 3 shows a better MDAPE at targets 30 and 50 for the model-based controller compared with the PID controller, demonstrating a better performance in approaching the target value and elimination of control errors. As a clinician, one could expect tighter control to the target BIS value from a system with a smaller MDAPE. This may reduce periods of excessive anesthesia at the deep end or reduce risk of awareness at the lighter end. This tighter control can be clearly observed in figure 3, where the model-based system’s faster responsiveness and tighter control result in an overall better stability of the anesthetic depth, even under extreme circumstances. This is also demonstrated by the data shown in table 2.
Divergence and wobble can be related to the oscillation of the controller behavior (wobble) and the tendency of the controller to converge on the target over a longer time (divergence).
A negative divergence number indicates convergence to the target, and a positive one indicates divergence. The absolute value indicates the speed of convergence or divergence. The values of divergence obtained using the two controllers at the three different targets are shown in table 3. This shows that the PID controller produced more negative values for the divergence than the model-based controller. These negative values for divergence mean that the size of the PEs decreased appreciably with time when the PID controller was used. In the early course of control, the errors were larger, and as the controller continued to operate, these errors became smaller. This time-dependent change in PEs was much less when the model-based controller was used. Comparing the wobble for both controllers shows that the model-based controller has much less overall oscillation than does the PID controller. Combining the information on wobble and divergence for the two controllers indicates that the PID controller initially performs worse than the model-based controller. This can be verified in figure 3, in which one can observe the initially large oscillations of control. These oscillations may introduce alternating periods of excessively deep and excessively light anesthesia with the risk of hemodynamic instability and awareness and are thus undesirable in clinical applications of closed-loop systems.
To search for the underlying reasons resulting in the observed controller performance, we might start with the differences in the controllers. First, the PID controller uses constants that were previously tuned for auditory evoked response–guided closed-loop control (target auditory evoked potential index of 35).13 Therefore, it is interesting to note (fig. 3) that when the target BIS value was 30, although there were large deviations from the target, the actual BIS value spent a greater proportion of time closer to the target than it did when the target was 50 or 70. Thus, retuning the constants for the different control variable (BIS value) and set points might result in improved control. Second, the PID controller uses the plasma concentration as an intermediate controller, whereas the model-based controller uses the effect site concentration. Because we preferred to evaluate published controllers, we applied plasma concentration control for the PID controller as previously reported, even though the authors of the PID controller commented that the control performance could be improved by alterations to the gain factors in the PID controller or by using an effect site–targeted, target-controlled infusion propofol system.2 The PID controller will be updated toward effect site steering in further research by one of the coauthors (A. R. A.). One must realize that using effect site instead of plasma concentration control without other modifications creates a faster controller but results in more overshoot. This overshoot can be compensated by using other gain factors in the PID control or, as is our belief, by applying an adaptive, model-based, individualized controller. The results of this study show that combining effect site concentration control with model-based operation can actually result in better control. This effort did not result in more propofol used, as seen in figure 4. Overall, similar amounts of propofol were used by both controllers.
One might question whether a pure feedback controller always results in worse performance than a human controller because it cannot anticipate future stimulating events, as can an anesthesiologist. One might expect to observe more sudden increases in BIS values in response to surgical stimulation (which may relate to possible arousal events) with an automated controller than with a human anesthesiologist. However, it is known that many anesthesiologists in clinical practice tend to bring their patient to a (too) deep level to avoid any risks of arousal, thus causing possible side effects, which could be considered as poor performance as well. To the best of our knowledge, it is currently not known which of the two anesthetic techniques results in the best postoperative outcome and patient satisfaction.
Furthermore, one could wonder whether a simulation study as presented here would in any way be able to predict the results in clinical practice because a simulated patient model can never be as complicated as a real patient. We have tried to simulate real patients by crafting historic data into a model and by introducing population variability through a “set” of patients. We realize that the result of this simulation study is limited by the selected set of patients. For further research, we might consider using alternate simulated patients, increased random noise, randomized target levels, and offset.
It is generally accepted in the literature that the Emax model can be used to accurately describe patient pharmacodynamics. When designing the patient simulator, we decided to use the best-described and best-validated type of pharmacodynamic model for propofol (i.e., the Emax model) in both the simulator and the controller. We admit that using the same type of model (albeit with different parameter values) could generate a study bias in favor of the model-based controller. However, the wide variety of simulated patients, combined with the random noise and the delay, should partially compensate for this. Moreover, we reasoned that using an inferior model might produce poor patient simulations, possibly resulting in worse accuracy of simulating actual, clinical performance of the controllers.
When evaluating the performance of two previously published closed-loop control systems for propofol administration using the BIS value as the controlled variable, it can be concluded that the additional mathematical effort imposed by using a patient-individualized, model-based adaptive closed-loop system with effect site control can result in a better controller compared with a standard PID controller with plasma control. Even under extreme conditions, the model-based controller exhibited no behavioral problems.