Intergroup comparisons of clinical productivity are important for strategic planning and evaluation of clinical and business operations. However, in a preliminary study, comparisons of two anesthesiology groups using "per full-time equivalent" measurements were confounded by different concurrencies or staffing ratios, whereas measurements based on "per operating room (OR) site," "per case," and "billed American Society of Anesthesiologists (ASA) units per hour of care" permitted meaningful comparisons despite differing concurrencies. The purpose of this study was to determine whether these measurements would allow for meaningful comparisons when applied to multiple groups.
Annual totals of total ASA units (tASA), 15-min time units, and the number of cases billed, as well as the average number of daily anesthetizing sites (OR sites) staffed and the average number of anesthesiologists required to staff those sites, were collected from each group that participated. All anesthesia care billed with ASA units was included, except for obstetric care. Any clinical service not billed using ASA units was excluded. Productivity measurements (concurrency, tASA/OR site, hours billed per OR site per day, hours billed per case, tASA billed per hour of anesthesia care, and base units per case) were calculated. Median and range for all groups and for private-practice and academic groups were determined.
Eleven private-practice and nine academic groups from 12 states participated in the study. Productivity measurements that are influenced by duration of surgery (hours billed per case, tASA billed per hour of anesthesia care) differed significantly between groups, with private-practice groups having shorter duration than academic groups (median hours billed per case, 1.5 vs. 2.6, respectively). Although tASA/OR site measurements were similar in private-practice and academic groups, academic groups billed significantly more hours per OR site per day (median, 6.0 vs. 7.8 h, respectively) to achieve the same level of tASA/OR site. Hourly billing productivity (tASA billed per hour of anesthesia care) correlated highly with surgical duration (hours billed per case).
This study demonstrates a method of comparing departmental clinical productivity between anesthesiology groups. Private-practice groups provided care for cases of shorter duration than academic groups. This difference was evident in several productivity measurements.
Measuring the clinical productivity of medical groups has been used to manage business operations and distribute compensation through comparison with other groups with similar characteristics (i.e., benchmarking).1,2 Although both academic and private-practice anesthesiology groups have an interest in comparing clinical productivity,3–5 most current comparisons use “per full-time equivalent (FTE)” measurements.6–8 Unfortunately, “per FTE” comparisons do not account for differences in staffing ratios (i.e., concurrency) and therefore lead to inaccurate conclusions about the clinical productivity of anesthesiology groups.9 In addition, a variety of anesthesia-independent factors, including speed of surgery, type of surgery, and scheduling efficiency, influence the number of units billed for anesthesia care.10–12
We previously suggested, based on a limited sample of anesthesiology groups, that multiple measurements based on “per operating room (OR) site,” “per case,” and “total American Society of Anesthesiologists (ASA) units per hour of anesthesia care” (tASA/h) may be more useful than “per FTE” measurements in comparing clinical productivity among anesthesiology groups.9 The purpose of this study was to determine whether these measurements would facilitate comparisons between a larger, more diverse sample of practices.
Methods
We collected clinical activity and billing data for 1 yr from private-practice (n = 11) and academic (n = 9) anesthesiology groups (fiscal year 1999–2000). tASA billed for 1 yr, time units billed for 1 yr, number of cases billed (case) for 1 yr, average daily number of anesthesiologists who staffed the operating rooms (OR FTE), and average daily number of anesthetizing sites staffed (OR sites) were obtained from participating groups (table 1). All anesthesia care billed with ASA units was included except for obstetric anesthesia care. Pain management and critical care were excluded because these services are billed with resource-based relative-value system units. Any clinical activity that was not billed (e.g., preoperative outpatient assessment clinic) was also excluded. For groups that provided care in more than one hospital, the group could choose to report their data as a total for the whole group or for each hospital separately. If reported separately, each of these subgroups was considered as one group.
The productivity measurements calculated from the data included staffing ratio (concurrency), total ASA units per OR site (tASA/OR site), billed hours per OR site per day (h/OR/day), cases per OR site (case/OR site), average hours per case (h/case), total ASA units billed per hour of anesthesia care (tASA/h), and base units per case (base/case). The productivity measurements, abbreviations, and formulas are shown in table 2. Median and range of group productivity were determined for all groups and for private-practice and academic groups.
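The calculations described above can be sketched in a few lines of Python. The exact formulas are given in table 2, which is not reproduced in this text, so the definitions below are assumptions inferred from the prose: a time unit is 15 min (4 per hour), base units are total ASA units minus time units, and h/OR/day spreads annual billed hours over an assumed 250 working days per year. The class, function, and field names are hypothetical.

```python
from dataclasses import dataclass

TIME_UNITS_PER_HOUR = 4    # one time unit = 15 min (stated in the text)
WORKDAYS_PER_YEAR = 250    # assumption: the text does not state the divisor


@dataclass
class GroupData:
    """Annual survey data for one anesthesiology group."""
    tasa: float        # total ASA units billed (base + time units)
    time_units: float  # 15-min time units billed
    cases: float       # number of cases billed
    or_sites: float    # average daily anesthetizing sites staffed
    or_fte: float      # average daily anesthesiologists staffing the sites


def productivity(g: GroupData, workdays: int = WORKDAYS_PER_YEAR) -> dict:
    """Return the seven productivity measurements for one group."""
    hours = g.time_units / TIME_UNITS_PER_HOUR   # billed hours of care
    base_units = g.tasa - g.time_units           # assumed: tASA = base + time
    return {
        "concurrency": g.or_sites / g.or_fte,
        "tASA/OR site": g.tasa / g.or_sites,
        "h/OR/day": hours / g.or_sites / workdays,
        "case/OR site": g.cases / g.or_sites,
        "h/case": hours / g.cases,
        "tASA/h": g.tasa / hours,
        "base/case": base_units / g.cases,
    }


# Pooled totals for all 20 groups combined, as reported in Results.
pooled = GroupData(tasa=4_270_683, time_units=2_438_621, cases=289_340,
                   or_sites=324, or_fte=234)
for name, value in productivity(pooled).items():
    print(f"{name}: {value:.2f}")
```

The pooled values printed here are not the group medians reported in table 4; they only illustrate the arithmetic on the study's overall totals.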
As an example of the use of these measurements to compare performance of anesthesiology groups, the overall group median was compared with measurements from 4 of the 20 groups—two academic and two private-practice. Useful comparisons were defined as those that clarified the following questions:
How does our productivity compare to other groups? To quantify overall productivity of a group, a “final output” measurement is used. Cases/OR site does not account for variability of base and time units between cases. Therefore, tASA/OR site was chosen to measure overall productivity.
Do we run our ORs longer? Although tASA/OR site quantifies overall productivity, this one measurement does not address the possible reasons why a group's tASA/OR site is higher or lower than the median of all groups. One possible reason for greater overall output is that a group's average day may be longer. Billed time units per OR site can be used. To make this measurement more clinically relevant, we derived h/OR/day from time units/OR site (table 2). It is important to note that h/OR/day and time units/OR site underestimate the actual time providing care because nonbilled time (preparation and turnover time) is not included.
How does surgical duration affect our productivity? Surgical duration (h/case) is compared. The effect of h/case on overall productivity (tASA/OR site) is reflected in the hourly billing productivity (tASA/h). tASA/OR site is equal to the product of billed hours per OR site (h/OR) and tASA/h. If two groups work the same amount of time (h/OR/day), then the differences in tASA/h will be the result of the differences in billed base units. Two factors determine base units billed: the base/case and the number of cases performed per unit of time. If surgical duration (h/case) is shorter, more cases can be performed and more base units can be billed per hour.
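The relation between these measurements can be written out explicitly. This is a sketch that assumes, as implied by the 15-min time unit, that every billed hour contributes 4 time units and that total ASA units are the sum of base and time units. The symbols B (base units), T (time units), H (billed hours), and N (cases) are introduced here for illustration only.

```latex
\begin{align*}
H &= T/4, \qquad \text{tASA} = B + T \\
\frac{\text{tASA}}{\text{OR site}} &= \frac{H}{\text{OR site}} \times \frac{\text{tASA}}{H} \\
\frac{\text{tASA}}{H} &= \frac{B}{H} + \frac{T}{H}
  = \frac{B/N}{H/N} + 4
  = \frac{\text{base/case}}{\text{h/case}} + 4
\end{align*}
```

Under these assumptions the time-unit term is fixed at 4 units per hour, so differences in tASA/h between groups come entirely from base units billed per hour, which rise when base/case is higher or h/case is shorter.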
Are our cases longer because they are more complex? Although many factors influence the difficulty of providing anesthetic care, measurement of base/case represents the only readily accessible data describing the relative complexity of anesthetic or surgical care.
Statistical Analysis
For each productivity measurement (concurrency, tASA/OR site, h/OR/day, cases/OR site, h/case, tASA/h, and base/case), the difference between private-practice groups and academic groups was assessed using the Wilcoxon two-sample test at the 0.05 level of significance. Spearman rank correlation was used to analyze tASA/h and h/case. The analyses were conducted using SAS version 8 (SAS Institute, Cary, NC) at the Office of Biostatistics, The University of Texas Medical Branch.
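For readers without access to SAS, the two tests can be approximated with the Python standard library. This is an illustrative substitute, not the authors' procedure: the rank-sum test below uses the large-sample normal approximation without tie or continuity correction, which may differ from the SAS output for samples this small. All function names are hypothetical and the data values are invented for illustration; they are not study data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def rank_sum_test(a, b):
    """Wilcoxon two-sample test; returns (rank sum of a, two-sided p)."""
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:len(a)])
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return w, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values for illustration only (NOT study data).
private_h_case = [1.2, 1.4, 1.5, 1.6, 1.9]
academic_h_case = [2.2, 2.5, 2.6, 2.9, 3.2]
tasa_per_h = [9.6, 8.9, 8.1, 8.3, 7.4, 7.0, 6.6, 6.7, 6.1, 5.9]

w, p = rank_sum_test(private_h_case, academic_h_case)
rho = spearman(private_h_case + academic_h_case, tasa_per_h)
print(f"W = {w}, p = {p:.4f}, Spearman rho = {rho:.3f}")
```

With groups this small, an exact rank-sum test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short.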
Results
Twenty groups—11 private practice and 9 academic—from 12 different states (California, Colorado, Hawaii, Illinois, Missouri, New Jersey, New Mexico, North Carolina, Pennsylvania, Texas, Virginia, and West Virginia) participated in the study. These groups billed 4,270,683 total ASA units and 2,438,621 time units for 289,340 anesthetics. The groups accounted for a total of 324 OR sites with 234 anesthesiologists. The median and range for the data for all groups and for the private-practice and academic subsets are shown in table 3. Median and range for productivity measurements are shown in table 4.
Although four private-practice groups were physician-only groups (i.e., concurrency of 1.0), the large variation of staffing ratios resulted in no difference in the median concurrency as compared with academic groups (table 4). Overall productivity (tASA/OR site) for private-practice and academic groups did not differ significantly (table 4). On the other hand, the academic groups on average staffed each anesthetizing site for more hours (h/OR/day) than the private-practice groups to achieve the same tASA/OR site (table 4). Base/case averages were similar between groups. In contrast, surgical duration differed significantly between academic and private-practice groups (fig. 1). The hourly billing productivity (tASA/h) correlated (r = −0.85) with differences in duration of surgery (fig. 2).
In the limited number of groups in this study, a large range of clinical practices, as defined by the number of OR FTEs or the number of OR sites (table 3), suggests that subgroup analyses of the private-practice groups and the academic groups may allow for helpful comparisons in a larger survey. In addition to private-practice versus academic groups, another subgrouping—by number of OR sites—is illustrated in table 5. Because of the small sample size in each group, the data are presented to illustrate how this breakout may be done in an industry-wide survey. No inferential statistical analysis was performed.
Use of Overall Group Data to Compare Groups
To illustrate the use of productivity measurements to evaluate individual groups, four groups (groups T, O, F, G) and their specific productivity measurements were compared with the overall median of the 20 groups (table 6). The four groups were chosen to better illustrate how the measurements could be used to compare groups. Groups T and O were chosen because they have similar overall productivity, but one was a private-practice group (group T) and one was an academic group (group O). Groups F and G were chosen in a similar manner. These productivity numbers are then used to answer the questions posed in Methods.
How Does Our Productivity Compare with That of Other Groups?
Using tASA/OR site as the measurement of overall productivity, all four groups listed produced more than the median of all 20 groups (table 6). Groups T and O were more productive than groups F and G.
Do We Run Our Operating Rooms Longer?
“Hours per OR per day” represents billed time units per OR site for the year that may represent the sum of hours billed per weekday (including evening and night hours) and billed hours for cases performed on weekends. Actual time performing care will always be underestimated by h/OR/day. The median billed h/OR/day of the 20 groups was 7.2. Group O's h/OR/day (9.1 h) was higher than the median of the 20 groups and higher than the other 3 groups listed in table 6. This higher h/OR/day represents either more billed hours for weekday evenings and nights or more billed hours on weekends. Groups T and G had similar h/OR/day to the median of the 20 groups. In contrast, group F had the smallest h/OR/day (approximately 1.3 h less than the median). One of the reasons that group O had higher-than-average tASA/OR site was that this group worked more hours.
How Does Duration of Surgery Affect Our Productivity?
Despite having almost identical tASA/OR site, group G's h/OR/day was almost 2 h more than group F's h/OR/day (table 6). Similarly, group T billed a similar amount of hours per OR site as group G, but billed 25% more tASA/OR site than group G. Groups T and O had similar tASA/OR site, but group O worked almost 20% longer each day per OR site. Therefore, another factor—duration of surgery—influences tASA/OR site.
“Hours billed per case” compares surgical duration. Groups O and G are both academic groups and provided care for surgeries with longer duration than the two private-practice groups. The two academic groups provided care for cases with an average duration of 2.6–3.2 h (h/case). In contrast, the two private-practice groups had average case duration of less than 1.5 h/case. The shorter case duration allowed for more cases to be performed at each anesthetizing site (case/OR site) despite fewer billed hours per anesthetizing site (h/OR/day).
Hourly billing productivity (tASA/h) is influenced by the number of cases done per hour (duration of surgery, i.e., h/case) and the base units per case (base/case). If the base/case is the same, then tASA/h reflects the greater number of cases and therefore the greater number of total base units billed per hour of care. In the case of groups T, F, and G, base/case values were similar and the differences in tASA/h reflected duration of surgery: group G had the lowest tASA/h as compared with groups T and F. Despite group O having longer h/case, it had similar tASA/h to group G because it had higher base/case.
Are Our Cases Longer Because They Are Surgically More Complex?
The four groups listed had higher base/case than the median of the 20 groups (table 6). Groups F and G had similar base/case, but group G had longer case duration. Groups T and F had similar surgical duration but different base/case. Complexity of surgical cases may influence duration of surgery, but base/case did not appear to predict duration of surgery in these sample comparisons.
Discussion
Standardized multiple measurements based on easily accessible data from anesthesiology groups facilitate useful comparisons of group clinical productivity. Although not an industry-wide study, the sample of 20 diverse groups surveyed in this study illustrates how the productivity “per OR site” and “per case” measurements and tASA/h could be determined for industry-wide medians and how these medians could be used by groups to compare and benchmark their clinical productivity. These comparisons examine not only differences in overall productivity (tASA/OR site) but also other possible factors for productivity, including the h/OR/day, h/case and tASA/h, and base/case.
Although the original purpose was to compare anesthesiology groups as a whole and then include subgroup analyses to illustrate specific differences, during the analysis it became clear that private-practice group measurements were strikingly different from those of academic groups because of the differences in surgical duration. Therefore, in addition to overall productivity measurements, median measurements of private-practice groups and academic groups were calculated and compared. For the groups studied, the results demonstrate that analysis of subcategories, such as private-practice and academic groups, can be used to provide more focused comparisons to benchmark productivity. Private-practice groups provide care for private-practice surgeons who generally perform surgical procedures of shorter duration in contrast to academic surgeons. Cases in academic centers may have longer durations because of the inclusion of surgical and anesthesiology house officers who are undergoing training or because more complex cases may be referred to academic medical centers. These clinical impressions are supported by the results in this study. The duration of surgery (h/case) influences the tASA/h (fig. 2). Despite working more billed hours per OR site, academic groups did not bill more tASA/OR site (table 4).
Because the sample size was small, additional subgroup analyses using inferential statistics could not be performed. In table 5, the median values of groups separated by private-practice versus academic group and then by number of anesthetizing sites or OR sites are shown. The purpose of this table is to demonstrate how additional subgroup analyses in a larger survey may provide even more focused comparisons with one's own group and therefore facilitate better benchmarking of a group's productivity. Other potentially useful subgroup analyses that could be performed on an industry-wide study include the number of cases (e.g., <10,000, 10,000–20,000, >20,000), the care model (such as private-practice physician-only vs. private-practice medical direction), and hospital type (ambulatory surgicenter, community, urban, academic medical center).
The design of the study (i.e., the data collections and measurements) is similar to that reported earlier.9 The purpose of this study was to develop methodology that is suitable to compare group clinical productivity. Our purpose was not to examine an exhaustive set of metrics that an individual group might use, depending on the adequacy of the individual group's database, to evaluate group productivity. For example, an individual group might use as internal “key indicators” total charges per day, accounts receivable, net expected revenue, number of rooms with continuing surgery at 3 pm, or number of scheduled cases per day. On the other hand, external comparisons between groups require standardized measurements that cannot be unique for each group. In other words, the study was designed to examine comparisons between groups using “external” measurements and was not designed to evaluate the usefulness of “internal” measurements.
In designing the methodology, several limitations were accepted to increase participation and to use standardized measurements. To increase participation rate and acceptance by groups, the design of the data collection met three requirements: readily available data, limited effort for data submission, and exclusion of confidential information (e.g., compensation per anesthesiologist). As in the previous study that compared two different anesthesiology groups,9 these data have limitations. By focusing on the productivity of OR anesthesia care, we excluded obstetric care, pain management, and critical care. To minimize the effort involved in data submission and to standardize the information from each group, we did not include RVU-billed services performed in the OR (e.g., line placement) or modifiers that only some payers allow.
We used measurements similar to those used in the previous two-group report9 but emphasized “per OR site” measurements and excluded any measurements based on “per FTE,” including total ASA units per FTE (tASA/FTE). Because tASA/FTE increases as concurrency increases,12 those results demonstrated that the tASA/FTE measurement was confounded by differences in concurrency or staffing ratio. For comparing anesthesiology group clinical productivity, it is essential to account for differences in concurrency. “Per FTE” measurements may be a useful internal measurement of group productivity for many groups, but when comparing group productivity (external measurements), “per FTE” measurements do not account for differences in concurrency and may therefore be misleading in comparing productivity between groups.
Despite this evidence, a group may still want to use tASA/FTE as an indicator of productivity. Although tASA/FTE can easily be calculated as the product of median concurrency and median tASA/OR site (table 4), this resulting measurement is misleading as an external benchmark when comparing groups having different concurrencies. On the other hand, as an internal benchmark of group (not individual) performance, a group could multiply its own concurrency by the median tASA/OR site to quantify tASA/FTE uniquely for the group. The group may then choose to follow this as a key indicator, assuming concurrency does not change.
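The confounding effect of concurrency on tASA/FTE can be shown with a short sketch. The identity used is the one stated above (tASA/FTE as the product of concurrency and tASA/OR site); the function name and the two example groups are hypothetical and do not correspond to any group in the study.

```python
def tasa_per_fte(concurrency: float, tasa_per_or_site: float) -> float:
    """tASA/FTE = (OR sites / OR FTE) * (tASA / OR sites)."""
    return concurrency * tasa_per_or_site


# Two hypothetical groups with identical output per OR site (not study data).
same_output = 13_000  # tASA per OR site per year, illustrative value

physician_only = tasa_per_fte(concurrency=1.0, tasa_per_or_site=same_output)
medical_direction = tasa_per_fte(concurrency=2.0, tasa_per_or_site=same_output)

print(f"physician-only:    {physician_only:,.0f} tASA/FTE")
print(f"medical direction: {medical_direction:,.0f} tASA/FTE")
```

Both groups produce the same tASA/OR site, yet the group with the higher staffing ratio appears twice as productive "per FTE", which is why the measurement is unsuitable as an external benchmark.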
This observation that “per FTE” measurements are misleading in comparing group productivity directly contrasts with individual productivity measurements that are by definition “per FTE.”10,13,14 The differences between measurements of individual productivity and group productivity can easily be illustrated by examining productivity measurements used in any team sport. For example, in basketball, individual measurements include many different statistics for each position, such as assists for point guards, rebounds for power forwards, and blocked shots for centers. In contrast, the meaningful external comparison (i.e., team comparison) is simply the win–loss record.
Despite a favorable win–loss record, a team's coach may be unhappy with certain aspects of team or group performance, such as rebounding or field goal percentage. Therefore, the coach does not rely only on the external measurement (win–loss) to evaluate his team's performance. In terms of the present study of anesthesiology groups, we did not address all possible internal measurements that groups can use to follow group productivity, but focused on accessible external measurements. We defined meaningful measurements as those that allowed clinically relevant questions to be answered. Although additional questions, including revenue- and compensation-related productivity comparisons, could be asked, we designed the measurements with the limitations set by the data collection. Therefore, anesthesiology groups should not rely only on group comparisons, but should also collect “internal” measurements specific to their group (e.g., “key indicators,” such as the number of cases scheduled per day, percentage of ORs still running at 3 pm, total charges billed per month, or revenue per month). Each group must choose what key indicators to use for their clinical setting and practice. Further, many of the groups studied noted that they provided anesthesia care at more than one hospital or OR suite. In practice, the methodology presented is applicable for each of these multihospital groups to evaluate and compare each OR suite within their group. In addition, in contrast to an outside survey, a group has compensation and revenue information for each OR suite. Additional financial productivity measurements (e.g., tASA unit/$ compensation, $ revenue/OR site, and $ revenue/h of care) are possible and may provide valuable comparisons. Similarly, for all groups, the measurements presented in the study and these “dollar” productivity measurements may be determined for each surgical service or individual surgeon for whom the group provides care.
For meaningful comparison, we chose tASA/OR site as the measurement of overall departmental productivity. Because cases/OR site would not account for the variations in cases caused by differences in base and time units billed, we eliminated cases/OR site as a measure of overall productivity. Although overall productivity is measured by tASA/OR site, this single measurement does not allow a group to understand the factors contributing to high or low productivity. The number of hours billed for each OR will influence the number of total ASA units billed. Two productivity measurements (h/OR/day and tASA/h) determine tASA/OR site. If tASA/h were the same for all groups, then h/OR/day would determine which group had higher or lower tASA/OR site. But if the groups differ in tASA/h, then the group with a high tASA/h could bill the same number of ASA units but bill fewer hours than a group with a low tASA/h. In other words, cases of shorter duration but similar complexity result in more ASA units billed per hour. Although base/case influences tASA/h, duration of surgery is the primary determinant of tASA/h (fig. 2). 15
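The trade-off between hourly billing productivity and billed hours can be made concrete by inverting the product relation described above. This is a minimal sketch: the function name is hypothetical, the 250 working days per year is an assumption not stated in the text, and the numeric inputs are invented for illustration rather than taken from the study tables.

```python
def hours_per_or_day_needed(target_tasa_per_or_site: float,
                            tasa_per_hour: float,
                            workdays: int = 250) -> float:
    """Billed h/OR/day required to reach a target annual tASA/OR site.

    Rearranges: tASA/OR site = h/OR/day * workdays * tASA/h.
    The default of 250 workdays per year is an assumption.
    """
    return target_tasa_per_or_site / tasa_per_hour / workdays


# Hypothetical inputs (not study data): same annual target, different tASA/h.
target = 13_000
short_cases = hours_per_or_day_needed(target, tasa_per_hour=8.0)
long_cases = hours_per_or_day_needed(target, tasa_per_hour=6.5)

print(f"tASA/h = 8.0 -> {short_cases:.1f} billed h/OR/day")
print(f"tASA/h = 6.5 -> {long_cases:.1f} billed h/OR/day")
```

In this illustration the group with the lower tASA/h must bill more hours per OR site each day to reach the same annual total, which is the pattern the text describes for groups with longer cases.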
Comparisons using h/OR/day may clarify other aspects of OR management. If an anesthesiology group reports low h/OR/day as compared with an industry median, perhaps the overhead cost of surgical services should be reduced by reducing the number of open ORs. Conversely, high h/OR/day may suggest that additional operating rooms should be opened or constructed. An important limitation of this measurement, which combines surgical care during regular working hours with care provided at night and on weekends, is that after-hours care has both higher direct costs (e.g., overtime and shift differential) and higher indirect costs (e.g., time away from home).16,17 For purposes of this study, to limit the effort required by participating groups in reporting data, we did not attempt to separate cases done at night and on weekends from those done during the daily schedule. Hence, a high h/OR/day measurement could represent long workdays or high numbers of cases performed at night and on weekends. Another limitation of the h/OR/day measurement is that it does not include “downtime” in the schedule when anesthesiologists must be available but are not billing time units. If excessive blocks of time during the regular workday are not scheduled, both the anesthesiology group and the hospital may incur unnecessary costs. Despite these limitations, the participating groups considered this measurement to accurately reflect their perception of their practice: groups that had the highest h/OR/day confirmed that surgery continued in many rooms until late in the day, and groups that had the lowest h/OR/day confirmed that surgery was completed early in the day in most rooms.
The results illustrate that the duration of surgery influences total ASA units billed. The median tASA/OR site was similar in private-practice and academic groups, but the private-practice groups billed almost 2 h less per OR per day than the academic groups (table 4). Similar comparisons can be seen when comparing specific groups (table 6). The private-practice groups in this sample provided care for shorter cases (h/case). As a result, the private-practice groups were able to provide care for more cases per hour and therefore bill more base units/h than the academic groups. When comparing tASA/h between groups, the differences in tASA/h are dependent on differences in base units/h billed. The base units/h can differ either because more cases are done per hour (surgical duration is shorter) or because base/case is different. Because base/case did not differ between the academic and private-practice groups, surgical duration accounts for the difference in tASA/h. For academic departments, the implications of the duration of surgery on total billings and hourly billing (tASA/h) are important. If a medical school or hospital were to benchmark the productivity of an academic department against private-practice productivity,4,5 that department would be required to provide care for more hours to have similar tASA/OR site. In fact, the shortcoming of the tASA/OR site measurement is that it is dependent on factors that are not controlled by the anesthesiology group.
Furthermore, if benchmarking clinical productivity is intended to objectively determine the personnel needs of all clinical departments, anesthesiology is disadvantaged. In contrast to other clinical departments, anesthesiology clinical staffing needs cannot be determined directly by productivity measurements. The number of clinical faculty required to provide surgical anesthesia on a daily basis (i.e., OR FTEs) is primarily determined by the number of anesthetizing sites to be staffed and the staffing ratio (i.e., concurrency).18 The staffing ratio for academic groups is generally less than or equal to 2 OR sites:1 anesthesiologist (concurrency ≤ 2.0). This limit on the staffing ratio is for medical direction of residents by faculty and is set in the Program Requirements of the Residency Review Committee of the Accreditation Council for Graduate Medical Education.††
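The dependence of daily staffing on sites and concurrency, rather than on billed output, can be sketched as follows. The function name is hypothetical, rounding up to a whole anesthesiologist is an assumption made here for illustration, and the site counts are invented rather than taken from the study.

```python
import math


def or_fte_required(or_sites: int, concurrency: float) -> int:
    """Anesthesiologists needed per day to staff or_sites at a given
    concurrency (OR sites per anesthesiologist).

    Rounding up to a whole person is an illustrative assumption.
    """
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    return math.ceil(or_sites / concurrency)


# Hypothetical department with 20 anesthetizing sites (not study data).
for ratio in (1.0, 1.5, 2.0):
    print(f"concurrency {ratio}: {or_fte_required(20, ratio)} OR FTEs")
```

Note that the number of units billed does not appear anywhere in this calculation, which is the point made in the paragraph above: staffing follows from sites and staffing ratio, not from productivity.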
Although workload ideally should determine the number of OR sites, nonworkload factors often influence the number of OR sites. The productivity measurements discussed in this study could be used to help determine whether the number of OR sites is consistent with industry means. For an academic department to determine objectively whether its faculty members are “working hard,” the h/OR/day measurement provides a comparison that is more accurate than tASA/FTE or tASA/OR site measurements.
The second major implication of the duration of surgery for academic departments is financial. In this sample, if the proportions of various types of third-party coverage were equal, private-practice groups would generate higher revenue per hour of care than academic groups. To generate equal revenue, academic groups must work longer hours.11 However, if an academic group has a higher proportion of nonpaying or poorly paying patients, as is often the case in teaching hospitals, generating the same revenue as a private group would require an even greater number of h/OR/day. This analysis suggests that academic departments of anesthesiology will often, perhaps usually, confront an adverse relation between the cost of providing care and the revenue generated by that care.
Although we believe that these results can likely be generalized to anesthesiology departments across the United States, this study includes too few groups to be considered industry-wide. However, the purpose of the study was to demonstrate a methodology for comparing clinical productivity of anesthesiology groups. The methodology, which relies on a simple survey to generate productivity measurements based on “per OR site,” “per case,” and tASA/h, facilitates meaningful comparisons. These data and results suggest that a national survey of a more representative sample of anesthesia groups could provide useful benchmarking data for members of the specialty and for national policy and planning.
The authors thank all the participating Anesthesiology groups. Because of confidentiality issues, we cannot name all the groups that participated, but without their assistance, this study would not have been possible. The authors also thank Jordan Kicklighter, B.A. (Editor), and Irela Salinas, A.A. (Editorial Assistant) in the editorial office of the Department of Anesthesiology at The University of Texas Medical Branch, for preparing and editing the manuscript.