In this study, the authors determined the success and failure rates for interns learning bag-and-mask ventilation and orotracheal intubation. Their goal was to determine the amount of experience needed to perform these procedures correctly.

The authors recorded 695 bag-and-mask ventilations and 679 orotracheal intubations performed by 15 inexperienced interns during their 3-month anesthesia rotations. Learning curves for each procedure for each intern were constructed with both the standard and risk-adjusted cumulative sum methods. The average number of procedures required to attain a failure rate of 20% was estimated for each technique.

Fourteen of 15 interns attained acceptable failure rates at bag-and-mask ventilation after 27 +/- 13 procedures, with a median (95% confidence interval) of 25 (15-32) procedures to cross the decision limit when considering all 15 interns. Nine of 15 interns attained acceptable failure rates at orotracheal intubation after 26 +/- 8 procedures, with a median of 29 (22-not estimable) procedures to cross the limit when considering all interns. The proportion of interns who attained acceptable failure rates for mask ventilation was greater than for tracheal intubation (93% vs. 60%, P = 0.025). Overall, our interns achieved a bag-and-mask ventilation failure rate of 20% or better after a median of 25 procedures; approximately 80% of interns achieved the goal after 35 procedures or fewer.

Participating interns developed mask ventilation skills faster than orotracheal intubation skills, and there was more variability in the rate at which intubation skills developed. A median of 29 procedures was required to achieve an 80% orotracheal intubation success rate.

## What We Already Know about This Topic

❖ Learning curves for endotracheal intubation have been examined, but less so for bag-and-mask ventilation and rarely with modern cumulative sum statistical methods

## What This Article Tells Us That Is New

❖ In 15 inexperienced interns in specialties other than anesthesia, 14 achieved an 80% success rate after a median of 25 bag-and-mask ventilation procedures

❖ Only 9 of 15 interns achieved an 80% success rate for endotracheal intubation, and these did so after a median of 35 procedures

MAINTENANCE of the patient airway is a primary responsibility of anesthesiologists and all physicians required to provide patient care in emergency medicine or critical care settings. Interruption of gas exchange, even for a few minutes, can result in catastrophic outcomes including brain damage or death. In the closed-claims analysis, 28% of anesthesia-related deaths or brain damage were because of respiratory-related damage events.1 Commonly used airway management techniques include bag-and-mask ventilation and laryngoscopic orotracheal intubation. Learning curves for various anesthesia procedures have been evaluated using various statistical approaches,2–8 but few studies have investigated learning curves for bag-and-mask ventilation.

Control charts are statistical tools for analyzing data during production or research in industry; the values of the quality characteristics are plotted in sequence. The distribution of the plotted values in relation to the control limits provides statistical information on the process under study. In cumulative sum (cusum) control charts, the cumulative differences of the quality characteristic from a target level are plotted in sequence, leading to a tighter control of a given process and allowing detection of deviations from preestablished standards.9 Cusum charts have been used for surgical skill assessment.10 The control chart is especially sensitive to short runs of failures and identifies them quickly. Hence, the trainee who has difficulty in a particular technique can be detected, and immediate corrective measures can be taken.

The purpose of this study was to construct learning curves for two basic airway management techniques: bag-and-mask ventilation and laryngoscopic orotracheal intubation. Specifically, we sought to determine the amount of experience required for each technique and to determine whether one airway management strategy was significantly easier to learn than the other.

## Materials and Methods

Interns from Tokyo Women's Medical University were recruited as subjects at the beginning of their 3-month anesthesia rotation. Participating interns were committed to a nonanesthesia specialty and presumably had no special interest in airway management. Among the 15 interns included, six specialized in internal medicine, one in family practice, two in general surgery, one in cardiothoracic surgery, one in urology, one in emergency medicine, and three in pediatrics. They were involved in the study during the first 9 months of their internship. With the approval of the Tokyo Women's Medical University Institutional Review Board (Tokyo, Japan) and written informed consent from the interns, we asked the interns to complete a questionnaire to assess their previous experience with training in the performance of mask ventilation and laryngoscopic tracheal intubation. Subjects who had previously performed more than 10 of either procedure were excluded.

### Protocol and Measurements

All interns received a formal lecture on performing mask ventilation and laryngoscopic tracheal intubation. They were permitted to practice these procedures on a mannequin before their first attempt in the clinical setting. The period of mannequin practice was not restricted, and the procedures were performed under an instructor's supervision, with the instructor allowed to comment on the technique. At the beginning of the training period, the interns were instructed about criteria for failure and success for each procedure. The interns completed a data collection form immediately after the procedure that was contemporaneously reviewed and signed by the supervising instructor.

Adult general surgical patients, except for cardiothoracic cases, were included in the study. Patients with anticipated difficult intubation or mask ventilation, with history of previous difficult intubation or mask ventilation, or with significant cervical spine pathology were excluded from the study, as were those in whom awake intubation was indicated for any reason. The Tokyo Women's Medical University Institutional Review Board waived the requirement for patient consent to participate in the study, as it was clearly explained to all patients at admission that trainees were performing the procedures under the supervision of attending physicians in this teaching hospital.

Head position was standardized with a 7-cm-high anesthesia pillow and sniffing position. A head band was not used for mask ventilation. Mask ventilation was assessed after anesthesia induction with or without muscle relaxation. When patients required cricoid pressure and rapid sequence intubation, face mask ventilation was not assessed. Mask ventilation was deemed successful when it produced chest excursion sufficient to maintain capnograph waveforms with plateau formation, and the oxyhemoglobin saturation was maintained at the preinduction level (the value after preoxygenation). Use of an oral airway was allowed. Need for any physical assistance by the supervising instructor was considered a failure.

Laryngoscopic orotracheal intubation was performed with a cuffed tracheal tube under direct laryngoscopy with Macintosh blades. The sizes of the endotracheal tube and laryngoscope blade were chosen by the interns depending on the overall size of the patient. The interns were allowed to ask for external laryngeal pressure to improve the view during laryngoscopy. An intubation attempt was defined by one entry of the blade into the mouth. If one blade was switched for another after the initial attempt, it was considered a second attempt. If the first attempt of intubation failed, the intern was allowed one additional attempt; after two failed attempts, the supervising instructor took over the procedure and secured the airway. Successful intubation was confirmed by chest movement, auscultation, and capnography. If intubation was not successful within two attempts by the participating intern, the effort was considered a failure.

### Data Analysis

We calculated both standard and risk-adjusted cusum charts for each intern and each procedure (mask ventilation and tracheal intubation).

#### Standard Cusum.

Standard cusum charting requires specification of acceptable (p0) and unacceptable (p1) failure rates for the process under study and type I and type II errors (α and β, respectively). A type I error is the probability of crossing the acceptable failure rate limit when the true failure rate is not in the acceptable range, whereas type II error is the probability of failing to cross the acceptable limit when the true failure rate is in the acceptable range. Upper and lower decision limits (h1 and h0, respectively), corresponding to unacceptable and acceptable failure rates, are functions of p0, p1, α, and β and are calculated as follows:
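The formulas themselves are not reproduced in this text. The definitions commonly used for this type of cusum chart are sketched below as an assumption, not as a quotation of the original equations. They agree with S = Q/(P + Q) as used in the next paragraph and with the average sample sizes quoted from table 1. The auxiliary symbols a and b, and the convention of placing h0 below zero, are assumptions introduced here.

```latex
\begin{aligned}
a &= \ln\frac{1-\beta}{\alpha}, &
b &= \ln\frac{1-\alpha}{\beta}, \\
P &= \ln\frac{p_1}{p_0}, &
Q &= \ln\frac{1-p_0}{1-p_1}, \\
S &= \frac{Q}{P+Q}, &
h_0 &= -\frac{b}{P+Q}, \qquad h_1 = \frac{a}{P+Q}.
\end{aligned}
```

With p0 = 0.20, p1 = 0.40, and α = β = 0.1, these give S ≈ 0.29 and decision limits at approximately ±2.24.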

The resulting chart begins at 0, and for each success, the amount S = Q/(P + Q), a function of p0 and p1, is subtracted from the previous cusum value; for each failure, the amount 1 − S is added to the previous cusum value. A cusum chart is formed by connecting the cusum values over time for each participant. When the chart crosses the upper decision limit (h1) from below, the participant's failure rate is deemed significantly greater than the acceptable failure rate. When the chart crosses the lower decision limit (h0) from above, the true failure rate is deemed to be as low or lower than the acceptable failure rate, with false positive probability α. When the cusum line is maintained between the decision limits, no statistical inference can be made.
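The charting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SAS or R code: the function names are hypothetical, the decision limits use the commonly cited forms h1 = a/(P + Q) and h0 = −b/(P + Q), and the chart is not reset after a limit is crossed.

```python
import math

def cusum_parameters(p0, p1, alpha, beta):
    """Return (s, h0, h1) for a standard cusum chart.

    Assumed (commonly used) definitions; h0 is placed below zero
    because the chart starts at 0 and falls with each success.
    """
    a = math.log((1 - beta) / alpha)
    b = math.log((1 - alpha) / beta)
    P = math.log(p1 / p0)
    Q = math.log((1 - p0) / (1 - p1))
    s = Q / (P + Q)
    h0 = -b / (P + Q)  # lower (acceptable) decision limit
    h1 = a / (P + Q)   # upper (unacceptable) decision limit
    return s, h0, h1

def cusum_chart(failures, p0=0.20, p1=0.40, alpha=0.10, beta=0.10):
    """Build a cusum chart from a sequence of booleans (True = failure).

    Returns the list of cusum values and the 1-based attempt numbers at
    which h0 and h1 were first reached (None if never reached).
    """
    s, h0, h1 = cusum_parameters(p0, p1, alpha, beta)
    value = 0.0
    values = []
    cross_h0 = None
    cross_h1 = None
    for attempt, failed in enumerate(failures, start=1):
        # A failure adds 1 - s; a success subtracts s.
        value += (1 - s) if failed else -s
        values.append(value)
        if cross_h0 is None and value <= h0:
            cross_h0 = attempt
        if cross_h1 is None and value >= h1:
            cross_h1 = attempt
    return values, cross_h0, cross_h1
```

Under these assumptions and the study's parameters, a hypothetical trainee with no failures at all would reach the lower limit on the eighth attempt, which shows how much each failure delays crossing.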

The incidence of difficult mask ventilation is reported to be as high as 5%.11,12 But considering that our interns were nonanesthesiologists rotating in anesthesia for only 3 months and that we did not permit physical assistance from instructors to facilitate mask ventilation, we set the acceptable failure rate (p0) for mask ventilation at 20% (*i.e.*, four times the reported failure rate among experienced anesthesiologists). As is customary for cusum analysis, the unacceptable failure rate (p1) was set at twice p0 or 40%. The probabilities of type I (α) and II (β) errors were each set to 0.1.

The incidence of a suboptimal laryngoscopic view for intubation (*e.g.*, Cormack and Lehane grade 3 and 4) has been reported to be as high as 10%.11–14 Again considering the inexperience of our interns, we set the acceptable failure rate (p0) for laryngoscopic tracheal intubation at 20% and unacceptable failure rate (p1) at twice p0 (40%) and type I and II errors at 0.1. These parameters were identical to those used in a similar study by de Oliveira Filho.2

Cusum calculations were performed for mask and intubation procedures using the aforementioned formulas (also given in table 1) and the chosen values of p0, p1, α, and β. In table 1, average sample sizes of runs having acceptable failure rates (p0) of 5%, 10%, and 20% were estimated as 105, 48, and 19 procedures, respectively. For runs having unacceptable failure rates (p1) of 10%, 20%, and 40%, average sample sizes were estimated as 85, 40, and 17 procedures, respectively. Assuming that the interns would perform 40 to 50 procedures each for mask ventilation and intubation during 3 months, the study would provide enough power for an acceptable failure rate (p0) of 20% and an unacceptable failure rate (p1) of 40%.
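The average sample sizes quoted from table 1 can be checked numerically. The sketch below assumes the usual Wald-type approximations for the expected run length of a sequential test; these expressions are not quoted from table 1, which is not reproduced here, and the function name is hypothetical. With α = β = 0.1 and p1 = 2 × p0, they reproduce the stated values.

```python
import math

def average_run_lengths(p0, p1, alpha=0.10, beta=0.10):
    """Approximate average number of procedures needed to reach a decision.

    Returns (n_acceptable, n_unacceptable): the expected run length when
    the true failure rate equals p0 and p1, respectively. The formulas
    are assumed Wald-type approximations, with h0 and h1 used here as
    positive distances from the starting value of 0.
    """
    a = math.log((1 - beta) / alpha)
    b = math.log((1 - alpha) / beta)
    P = math.log(p1 / p0)
    Q = math.log((1 - p0) / (1 - p1))
    s = Q / (P + Q)
    h0 = b / (P + Q)
    h1 = a / (P + Q)
    n_acceptable = (h0 * (1 - alpha) - alpha * h1) / (s - p0)
    n_unacceptable = (h1 * (1 - beta) - beta * h0) / (p1 - s)
    return n_acceptable, n_unacceptable

if __name__ == "__main__":
    for p0 in (0.05, 0.10, 0.20):
        n0, n1 = average_run_lengths(p0, 2 * p0)
        print(f"p0={p0:.2f}, p1={2 * p0:.2f}: {n0:.1f} and {n1:.1f} procedures")
```

Rounded to whole procedures, the three parameter pairs give 105 and 85, 48 and 40, and 19 and 17, matching the figures in the text.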

Time to crossing h0, the lower decision boundary indicating acceptable failure rate, was summarized in a Kaplan-Meier time-to-event plot for each technique for the prespecified rate of p0 and error rates. Success proportions from before and after crossing the lower decision limit were compared by a generalized estimating equation Z-test to adjust for the within-intern correlation over repeated attempts. The proportions of interns eventually crossing the lower decision limits for mask ventilation and tracheal intubation were compared using McNemar's test for paired proportions.

#### Risk-adjusted Cusum.

In addition to the standard cusum chart, we report the learning curves adjusted for varying levels of difficulty presented by each patient. We created risk-adjusted charts using the observed minus expected (O − E) cusum method.15 A risk score for each patient was first calculated as the estimated probability of failure predicted as a function of traditional difficult mask ventilation and intubation risk factors14,16,17 (tables 2 and 3) using logistic regression. The risk-adjusted cusum chart was then formed by adding 1 minus the individual patient risk score to the cumulative score for each failure and subtracting the risk score for each successful attempt. The cusum at time t is, thus, c_{t} = c_{t − 1} + (x_{t} − x_{0}), where c_{t − 1} is the cusum through the previous attempt, x_{t} is 1 for failure and 0 for success (observed), and x_{0} is the estimated risk for the patient being intubated or ventilated at time t (expected). Interns whose accumulated observed rate of failure is generally consistent with the expected values for the patients they intubated will have scores near the zero line, whereas those performing better or worse than expected will have scores below and above the zero line, respectively.
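The observed minus expected update can be illustrated with a short Python sketch. It follows the formula in the paragraph above: each attempt adds the observed outcome (1 for failure, 0 for success) minus the patient's estimated risk. The function name and the example risk scores are hypothetical; in the study the risk scores came from logistic regression models, which are not reproduced here.

```python
def o_minus_e_cusum(failures, risks):
    """Risk-adjusted (observed minus expected) cusum.

    failures: sequence of booleans, True if the attempt failed.
    risks:    estimated probability of failure for each patient,
              in the same order (hypothetical inputs here).
    Returns the running cusum after each attempt.
    """
    if len(failures) != len(risks):
        raise ValueError("failures and risks must have the same length")
    total = 0.0
    chart = []
    for failed, risk in zip(failures, risks):
        observed = 1.0 if failed else 0.0
        total += observed - risk  # failure adds 1 - risk, success subtracts risk
        chart.append(total)
    return chart

if __name__ == "__main__":
    # Three hypothetical attempts: success, failure, success.
    print(o_minus_e_cusum([False, True, False], [0.2, 0.5, 0.1]))
```

A failure on a high-risk patient therefore raises the chart less than a failure on a low-risk patient, and a final value below zero indicates fewer failures than expected for the patients encountered.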

Data are presented as mean ± SD unless otherwise stated. Individual interns are represented by randomly assigned capital letters. SAS statistical software (SAS Institute, Inc., Cary, NC) and the R programming language were used for all analyses.

## Results

A total of 706 patients were involved in mask ventilation and/or intubation. Standard and risk-adjusted cusum lines for mask ventilation and orotracheal intubation are presented in figure 1.

### Mask Ventilation

The 15 interns attempted a total of 695 mask ventilations, with 592 (85%) being successful (table 4). An oral airway was used on 136 occasions, and a muscle relaxant was used in 669 of the attempts. An instructor took over ventilation in 103 (15%) of the cases. The mean number of mask ventilation procedures per intern was 46 ± 15 (range, 27–84). All but one intern (intern “O”) crossed the 20% acceptable failure rate line (h0) for mask ventilation (fig. 1A). The 14 successful interns did so after 27 ± 13 (range, 12–59) procedures. The cusum line for intern O remained between the decision limit lines after 48 procedures. Using a more stringent 10% specification of p0, no interns crossed the mask ventilation acceptable failure decision limit. Considering all 15 interns, the Kaplan–Meier estimated median (95% confidence interval) time to crossing h0 was 25 (15–32) attempts (fig. 2) using the 20% acceptable failure rate. Pooled success rates before and after reaching h0 under the given criteria were 78% and 94%, respectively (Z = 5.8, *P* < 0.001).

Patient risk scores for mask ventilation were calculated from a logistic regression model predicting success from four baseline variables: body mass index more than 30 kg/m^{2}, Mallampati 3 or 4 (*vs.* 1, 2), age more than 57 yr, and mandibular protrusion (severely limited), which have been shown to be associated with difficult mask ventilation. In the risk-adjusted observed − expected learning curves for mask ventilation (fig. 1B), 10 interns (67%) finished their attempts below the zero line and, thus, on average performed better than expected, given the level of difficulty of the patients they encountered. This figure is quite different in shape from the standard cusum in figure 1A for most interns, reflecting the distinct cusum formulae (and interpretations) and the varying levels of patient difficulty experienced across interns.

### Orotracheal Intubation

Tracheal intubation was attempted on 679 patients, and 528 (78%) of these were successful within two attempts (table 4). An instructor took over 151 (22%) procedures. Among all patients, 16 had rapid sequence inductions. For these patients, intubation data were included in the cusum calculation. There were 45 ± 13 (range, 28–72) procedures per intern. Nine of 15 interns (60%) crossed the 20% acceptable failure rate line (h0). These nine did so after 26 ± 8 (range, 15–42) procedures. Cusum lines for interns G, H, and N remained between decision limit lines after 49, 45, and 53 procedures, respectively, whereas lines for interns J, K, and O remained above the 40% unacceptable failure rate limit (h1) after 33, 28, and 48 procedures, respectively (fig. 1C). Considering all 15 interns, the Kaplan–Meier estimated median (95% confidence interval) time to crossing h0 with the 20% acceptable failure rate was 29 (22, upper limit inestimable) attempts (fig. 2). Pooled success rates before and after reaching h0 were 71% and 90%, respectively (Z = 4.3, *P* < 0.001). Under a more stringent specification of p0 = 10%, only one intern crossed the intubation acceptable failure rate decision limit.

Patient risk scores for orotracheal intubation were calculated from a logistic regression model predicting success from seven baseline variables, which have been shown to be associated with difficulty, including the four used for mask ventilation plus mouth opening less than 3 cm, thyromental distance less than 6 cm and neck movement less than 90°. Using the risk-adjusted O − E cusum, nine interns (60%) remained below the zero line at the end of their intubation attempts, indicating better-than-expected performance given the level of difficulty assigned to their patients (fig. 1D).

The proportion of interns who attained an acceptable failure rate for mask ventilation (93%) was greater than that for tracheal intubation (60%), *P* = 0.025 (McNemar's test).
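The paired comparison can be reproduced from the counts reported above. Intern O was the only intern who did not cross h0 for mask ventilation and also did not cross it for intubation, so the discordant pairs are five interns (G, H, J, K, and N) who crossed for mask ventilation only and none who crossed for intubation only. The sketch below assumes the uncorrected, asymptotic form of McNemar's test; the source does not state which variant was used, and this one is assumed only because it yields the reported P value. The function name is hypothetical.

```python
import math

def mcnemar_uncorrected(b, c):
    """Asymptotic McNemar test without continuity correction.

    b: pairs positive on the first measure only.
    c: pairs positive on the second measure only.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

if __name__ == "__main__":
    chi2, p = mcnemar_uncorrected(5, 0)
    print(f"chi2 = {chi2:.1f}, P = {p:.3f}")
```

With five discordant interns all in the same direction, the statistic is 5.0 and the P value rounds to 0.025. An exact binomial version of the test would give a larger P value with so few discordant pairs.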

## Discussion

By using a simple graphical method for detecting the number of successes and failures on a sequential basis, Lawler *et al.*6 suggested that 20 consecutive, successful tracheal intubations might be appropriate for a student to perform solo anesthesia. However, this approach does not allow statistical inference. Three additional studies examined the learning process for tracheal intubation using statistical approaches. Kopacz *et al.*,5 using the pooled cumulative success rate at groups of five attempts, demonstrated that a 91% tracheal intubation success rate was achieved after 45 attempts. Konrad *et al.*,4 using a least-square fit model and Monte Carlo procedures, demonstrated that 90% success rates for tracheal intubation were achieved after 57 attempts. However, these studies did not report the number of attempts at procedures corresponding to certain percentiles of the subjects attaining acceptable failure rate on a time-to-event curve. De Oliveira Filho,2 using the same statistical approach as our study, demonstrated that four of seven participants attained a 20% acceptable failure rate at intubation, and they did so after a mean of 43 attempts.

The cusum method consists of relatively simple calculations, and statistical inference can be made from the observed successes and failures. The method nonetheless requires several considerations. For example, the criteria for success and failure must be clearly defined and represented by a binary variable. Our criteria for successful tracheal intubation (allowing two laryngoscopy attempts and permitting the instructor to provide external laryngeal pressure) were less stringent than those of de Oliveira Filho,2 who required successful intubation after a single laryngoscopy without any physical assistance, or Konrad *et al.*,4 who required a successful procedure within two attempts without physical assistance. However, we consider it appropriate for interns to request external laryngeal pressure to optimize the laryngoscopic view and regard it as a reflection of their understanding of upper airway anatomy. Thus, we consider our criteria of successful intubation to be clinically relevant. Furthermore, even experienced intubators sometimes require a second attempt; we, thus, do not consider a second attempt to be an overall failure to intubate.

The standard cusum method also requires specification of an acceptable failure rate (p0). The acceptable rate may be taken from actual institutional rates, published studies, or expert consensus.3,4,7,18 In this study, the acceptable failure rate for tracheal intubation was set at twice the maximum reported incidence of suboptimal laryngoscopic view for intubation (*i.e.*, Cormack and Lehane grades 3 and 4).11–14 This rather generous acceptable failure rate was chosen because our interns were nonanesthesiologists and presumably would only occasionally be required to manage airways. This acceptable failure rate was also used in a similar earlier study by de Oliveira Filho,2 making it possible to directly compare our interns' performances with anesthesia residents from another site.

As seen in table 1, the recommended number of attempts per individual required to adequately assess an individual's learning curve using the cusum method depends on the specified parameters: namely type I and II errors and the acceptable and unacceptable failure rates. Our study provided acceptable statistical power for our *a priori* chosen 20% limit, but would have been underpowered for a stricter 10% limit.

Using our chosen parameters, an average of 19 attempts per intern would normally be required to reach the acceptable failure rate threshold. We far surpassed this with a mean (range) of 46 (27–84) for mask ventilation and 45 (28–72) for intubation procedures. An acceptable failure rate of 10% would have required an average of 48 attempts per intern, which is close to our average. Many of our interns had fewer than 48 attempts, and we would have been underpowered to assess the 10% rate.

Given the data, we observed that only one intern would have crossed the h0 decision limit under a 10% acceptable failure rate for intubation and none for mask ventilation.

At tracheal intubation, 60% of the interns crossed the acceptable (lower) decision limit after 26 ± 8 procedures. Although we used the same statistical method and same acceptable and unacceptable failure rates, our interns crossed the lower decision limit after a smaller number of procedures than those in the study by de Oliveira Filho.2 This may be explained by the difference in the criteria of successful intubation between the two studies. De Oliveira Filho2 allowed only one attempt at laryngoscopy with no physical assistance for intubation to be considered successful; in contrast, we allowed two laryngoscopy attempts and physical assistance in the form of external laryngeal pressure, which perhaps better reflects how intubation is normally approached.

We did not have an adequate number of interns to directly compare the number of attempts to cross the acceptable failure rate decision limit (*i.e.*, h0) between mask ventilation and intubation. We nonetheless observed a difference in the proportion of interns who crossed the decision limit for the respective methods: with mask ventilation, 93% of our interns crossed the lower decision limit after 27 ± 13 procedures, a proportion significantly greater than the 60% for intubation. Median time to cross h0 with a 20% acceptable failure rate was similar between mask ventilation and intubation, but intubation can be a technique associated with more interindividual variability in initial skill acquisition than mask ventilation. Although a higher proportion of interns crossed the acceptable failure rate decision limit for mask ventilation than for intubation, for those infrequently managing the patient's airway, not only ease of skill acquisition but also ease of skill maintenance is of clinical importance. A previous report showed that occasional performance of intubation does not ensure skill maintenance.19 No comparable reports for mask ventilation are available, and the evaluation of skill maintenance in mask ventilation is warranted. In the current study, many patients were paralyzed, which facilitates mask ventilation. Per protocol, patients who had difficult airways were excluded because they would have been designated for awake intubation. Therefore, caution is needed in applying these data to the urgent floor airway situation where patients are not fasting, bed position is suboptimal, and an efficient respiratory circuit of an anesthesia machine is not available.

The standard cusum method is limited in that it does not allow weighting of the cusum score according to the expected difficulty at each procedure. Therefore, we also calculated risk-adjusted learning curves in the form of observed − expected (O − E) cusum charts with patient variables shown in the literature to be related to difficult mask ventilation and orotracheal intubation. For the O − E cusum method, interns whose performance reflects the average across interns for patients of similar difficulty will have charts with values close to the zero line. This risk adjustment uses the average success rate for interns across the entire learning curve as the “expected” success rate for a patient. Ideally, the expected scores would be derived from data external to the study at hand. Such external data might also make more sophisticated forms of risk adjustment attractive, such as the log-likelihood method,20 where a baseline probability of failure is specified for particular patients or groups of patients, and the curve monitors *a priori* reduction or increase from baseline.

In summary, participating interns achieved 80% mask ventilation success rate faster than 80% orotracheal intubation success rate, and there was more variability in the rate at which intubation skills were developed. The achievement of 80% orotracheal intubation success rate required a median of 29 procedures.