Behavioral economics seeks to define how humans respond to incentives, how to maximize desired behavioral change, and how to avoid perverse negative impacts on work effort. Relatively new in their application to physician behavior, behavioral economic principles have primarily been used to construct optimized financial incentives. This review introduces and evaluates the essential components of building successful financial incentive programs for physicians, adhering to the principles of behavioral economics. Referencing conceptual publications, observational studies, and the relatively sparse controlled studies, the authors offer physician leaders, healthcare administrators, and practicing anesthesiologists the issues to consider when designing physician incentive programs to maximize effectiveness and minimize unintended consequences.
Physicians enter the field of medicine for its unique opportunities as well as an inherent desire to deliver compassionate and quality patient care, demonstrate high levels of competency, and earn the respect of their peers. The Hippocratic Oath compels physicians to rise above the pressures of cost controls, insufficient value attributed to patient-centric primary care, administrative burdens, long hours, and declining reimbursements to help and heal.1–3 While physicians struggle to maintain the meaning of their calling in the rapidly changing world of health care, the industry seems focused on how to motivate physicians to adopt desirable behaviors (defined as enhanced productivity and adherence to quality metrics) by focusing on economic gain using financial incentives. The anesthesiology community itself has been trying to address both clinical and academic incentives for more than a decade.4–9
Motivations to meet performance targets can be intrinsic (arising from personal workplace satisfaction and the inherent desire to achieve mastery or serve a greater good) and/or extrinsic (arising outside of oneself).10,11 Financial incentives or penalties are one type of extrinsic motivator. Extrinsic motivators can be nonfinancial such as employee-of-the-month awards, pats on the back, preferential parking, and ranking of performance (relative social ranking). Focusing on performance compared to peers or the norm seems to yield excellent results, even without additional financial incentives.12–14 While this review concentrates on optimal construction of financial incentives (extrinsic), intrinsic motivators may be greater than extrinsic motivators for cognitive workers such as physicians.15,16
Intrinsic motivations—autonomy in the workplace, mastery of a skill, shared common purpose—resonate powerfully with physicians.17 Other important intrinsic motivators include social relatedness (being a valued contributing member of a group) and self-image (consistency with a personal view of oneself as a healer).18–20 Certain extrinsic motivators such as relative social ranking and recognition reinforce intrinsic ones. Relative ranking shows that one is doing a good job providing health care to patients (mastery, self-image), providing a social good by giving efficient and effective care (shared purpose stewarding limited healthcare dollars, self-image), and being a good contributing member of the organization (social relatedness to other doctors and staff at the facility, self-image).
Intrinsic motivation is powerful when dealing with jobs that require cognitive work, such as creativity and problem-solving. Inherent motivation (free!) can be engaged by healthcare leadership by improving the systems that support the work of physicians. This could take the following forms: allowing physicians the autonomy to make decisions (with nudges to do the usually right thing)21; enhancing their work environment so they can employ their mastery using higher-level cognitive skills and relieving them from routine tasks (such as documentation); easing pursuit of mastery by supporting additional education and training10; and providing timely relative ranking of relevant, valid quality metrics for self-action.10,12 This describes the work of physicians, with financial incentives primarily deployed for decisions as to volume of work, not overall quality of work or developing solutions for particular patients.
One state-sponsored study calculated that intrinsic motivation was fourfold greater than extrinsic motivation for cardiac surgeons seeking better cardiac surgery outcomes in Pennsylvania.22 Similar findings were shown in Accountable Care Organizations.23 In large-scale studies of physicians’ quality performance (defined mostly with surrogate process measures) and facility-based health system performance, financial incentive systems have been ineffective in changing important health outcomes such as mortality.24–27 Despite what should be a calculated decision of the proper mix of extrinsic and intrinsic motivators for healthcare providers,28 financial incentive systems are pervasive, often as a first and only choice to influence behaviors. Both physicians and physician leaders need to understand how and when to use incentives to achieve goals that will benefit patients and produce responsible stewardship of healthcare resources.
Creating optimal and successful incentive systems requires an understanding of the principles of behavioral economics (table 1). Behavioral economics has been popularized by notable academicians and social scientists publishing books such as Nudge; Misbehaving: The Making of Behavioral Economics; Predictably Irrational; and Drive (fig. 1). In 2002 and 2017, Nobel Prizes in Economics were awarded to Daniel Kahneman, Ph.D. (Professor of Psychology and Public Affairs Emeritus, Woodrow Wilson School of Public and International Affairs, Princeton University, Princeton, New Jersey; the Eugene Higgins Professor of Psychology Emeritus, Princeton University, Princeton, New Jersey, and a fellow of the Federmann Center for the Study of Rationality at the Hebrew University, Jerusalem, Israel) and Richard Thaler, Ph.D. (Charles R. Walgreen Distinguished Service Professor of Behavioral Science and Economics, University of Chicago Booth School of Business, Chicago, Illinois), respectively, who explained how human behaviors may deviate from profit/outcome maximizing solutions in many instances, taking into account the role of intrinsic and extrinsic motivation.
Behavioral economists have studied how individuals can be prompted (“nudged”) to make desirable choices, such as prescribing generics in a computerized provider order entry system.21 Behavioral economics shows that individuals will respond quite differently depending on how incentives are constructed and presented. While predictable, the responses are often not intuitive. As a result, few of these behavioral economic principles are fully appreciated, and they have not been widely adopted by healthcare organizations, provider networks, or individual physicians, resulting in predictably suboptimal performance.
In the past decade, health care in the United States has undergone dramatic change.29 One major change is the migration to an employed physician model (53% in 2016).30 Salaried employment eliminates the traditional private practice volume-driven financial incentive of fee-for-service medicine. As a result, various health systems and large physician groups employing multiple physicians, including the many national firms of anesthesia providers, have sought to retain the fee-for-service work ethic using incentives.31,32 Studies show that this is effective in many instances in anesthesia and elsewhere, but most studies are observational.7,9,32–35 To our knowledge, there are no controlled trials that demonstrate how best to structure and frame financial incentives to physicians for a sustainable, maximal increase in the volume of work.
In this review, we introduce the essential components of successful financial incentive programs, adhering to the principles of behavioral economics, and referencing conceptual publications, observational studies, and the relatively sparse controlled trial healthcare incentives literature to date. We recognize that this review does not fully explore how to capitalize on intrinsic motivations that promote the feelings of “joy” when a physician heals a patient, creates a work environment that is more efficient and effective by employing his or her own intellectual capital, or sparks understanding within a trainee, or when his or her research uncovers new advances in medicine. In other words, if an exclusively extrinsic plan is the aim, we offer the issues to consider, while also cautioning that ignoring intrinsic motivation is misguided.3,16,17,36,37
An unproven corollary to the successful use of financial incentives to spur productivity improvements is that financial incentives will also be effective in promoting quality care. Whether providing physicians with incentives will yield higher value (quality/cost) health care is especially important to know as government agencies and large payers have begun paying more for quality of care under a rubric known as pay-for-performance or value-based care.
The early results of pay-for-performance are not promising.24–26,38–41 The reasons for this weak result are unclear, but one explanation may be that financial “if/then” incentive systems (if you do this, then you will get that) start with faulty assumptions. The first is that variation in physician performance is due to variation in motivation to do what the organization wants.28,42 Variation in performance is a function of both the individual physician’s effort and the effect of the system on the physician’s ability to perform.43 In anesthesia, for example, on-time performance for room turnover is not solely a function of the anesthesiologist’s work effort or focus because other system and personnel issues are involved.44 Preoperative administration of β blockers to 100% of appropriate patients is not only a function of a physician’s desire to provide this indicated medication, but also depends on routine medication reconciliation by others, availability of the right drug at the right time, and a prompt to do it. Variation in performance as a result of variation in motivation is only routinely true for volume of care, especially that provided after hours,45 and may not extend to effective care.
Another flawed assumption is that even if the system is optimized for desired physician performance, the application of a financial incentive will yield the outcome(s) desired by the organization. The medical and psychology literature show that these assumptions are only true in some instances and with many caveats. Moreover, financial incentives may be counterproductive (fig. 2).
Anesthesiologists, like all physicians, and especially those who practice in academic settings, have multimission performance expectations—volume, quality, efficiency, cost, educational excellence, and research productivity. Since many health systems employ incentives, understanding how incentives work is important to optimizing physician performance, and practitioners should appreciate how financial incentives (which are only extrinsic motivators) potentially may result in unintended negative consequences.9,16,24,25,36,37,39,46,47 Stated differently, it would be wise for those involved in creating and participating in incentive programs to understand how to construct them and when they are effective as a motivational tool.
Essential Components of Financial Incentive Programs
Basics of an Incentive Plan
If instituting an incentive, one must want to change outputs or the workplace environment. This encompasses a change management process that is more complex than simply instituting an incentive. There is an entire literature devoted to this and the communication strategies necessary including Leading Change; Switch: How to Change When Change is Hard; and Made to Stick: Why Some Ideas Survive and Others Die (fig. 1). When building an incentive program it is important to carefully consider the end result. It is wise to include those being incentivized in setting the goals, and, if possible, to use the data repetitively (e.g., reports to hospitals, ranking agencies, peer review) to make the costs of collecting data worthwhile. Keep the goals limited and make them Specific, Measurable, Achievable, Relevant, and within the Timeframe desired (SMART goals) with periodic feedback. A transparent and simple calculation of the incentive is also key to success. Finally, if you pay for anything and everything in the context of an incentive system, you may have variable pay but you will likely not get the focused change you want, only justification of existing behaviors and activities6 (fig. 3).
Given constraints related to Stark and other applicable laws that impact payment of financial incentives, legal review of any incentive plan is essential. In general, incentives should never cause compensation to exceed fair market value, or directly incentivize additional facility volume or withholding of essential care.
The Size of the Incentive Matters
Financial incentives are effective in yielding desired behaviors,28,36 as long as they are considered ethical. The questions are how much it will take, how long the effect will last, whether it is worth the expense, and whether unintended effects will offset the benefit. While no controlled studies have evaluated incentive size thresholds in health care, the range of results in published studies is illustrative.
How Much Is Enough to Incentivize Clinical Productivity and Efficiency?
Even if one applies all the behavioral economics principles in this article, the amount must first be “enough to notice,” and this varies with the value of the item to start.37 For example, one notices a $30 deduction from a $50 toaster (60% off!), but is unmoved by an identical $30 price reduction of a $50,000 car. This principle applies when sizing physician incentives; they must be large enough to notice in relation to a physician’s existing salary. Although not explicitly studied, publications have concluded that many incentive programs’ limited success or failure was due to small payments, insufficiently sized to notice for physicians.14,40,42,47–53 In addition, if operational changes are necessary to implement or sustain the activity, the size of the incentive has to be sufficient to fund that or the necessary investments will not be made and the results will not be sustainable.14,49,52,54
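The "enough to notice" comparison is simple proportional arithmetic. The following minimal Python sketch reproduces the toaster and car figures from the text. The helper name `relative_size` and the $400,000 salary are hypothetical, chosen only for illustration and not drawn from any cited study.

```python
def relative_size(amount: float, base: float) -> float:
    """Return `amount` as a fraction of `base` (hypothetical helper)."""
    if base <= 0:
        raise ValueError("base must be positive")
    return amount / base


# Figures from the text: the same $30 discount on two very different prices.
toaster = relative_size(30, 50)      # fraction of the toaster's price
car = relative_size(30, 50_000)      # fraction of the car's price

# Hypothetical illustration: a $3,000 bonus against an ASSUMED $400,000 salary.
bonus = relative_size(3_000, 400_000)

print(f"toaster: {toaster:.2%}, car: {car:.2%}, bonus: {bonus:.2%}")
```

Under the assumed salary, a $3,000 payment is well under 1% of base pay, which may help explain why small absolute payments are reported as going unnoticed.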
Many institutions have publicly accessible white papers describing their physician incentive methodology.32,34,55,56 However, there are few published results and no randomized controlled studies. In a single department at the University of Pennsylvania (Philadelphia, Pennsylvania), Levin and Gustave33 put 15% of base salary at risk for clinical productivity, with the opportunity to earn more than the previous year with increased productivity. There was a 4.8% increase in the work component of the resource-based relative value units (used for all nonanesthesiology specialties) compared with the previous year, with no decrease in academic output or quality of care. Those latter two activities were incentivized with a 5% withhold in addition to the 15% for work productivity. The paid incentive ranged from 60 to 80% of collections for increased productivity. This amount of incentive was similar to the 60% of collections noted in another academic practice plan.35
Reich et al.9 introduced a complex point system at a single academic anesthesia department, which put 70% of salary at risk for both clinical and academic productivity based on earned “points.” Points were assigned for American Society of Anesthesiologists (ASA) relative value units of clinical work; academic work points were supposedly representative of equivalent time and effort compared to clinical work, but the scale was not validated. Over 3 yr, there was a cumulative 31.5% increase in clinical productivity. Of note, their baseline clinical productivity was well below average with operating room full-time equivalent output of 7,212 ASA relative value units. The fully implemented incentive system pushed operating room full-time equivalent productivity closer to average at 9,480. Factors other than the incentive plan, such as case mix and surgical block utilization, could have caused the noted increase. These were not controlled, nor was productivity adjusted for them. Otherwise, specific results of clinical incentive plans are not well documented in the peer-reviewed literature, despite up to 97% of employed physicians having some sort of incentive plan.32
In a large 2017 Australian survey of general practitioners, the response to incentives to supply after-hours care was assessed.45 There was a weak effect (for each 1% increase in after-hours pay there was an associated 0.12% increase in hours provided), suggesting a larger incentive is necessary to make after-hours care available. If more was paid for both regular daytime work and after-hours care (incentivizing everything), less after-hours care was provided (did not produce desired result of more work). This is another example (see “Basics of an Incentive Plan”) where incentivizing everything is the same as incentivizing nothing. This is equivalent to systems in the United States paying more for all resource-based relative value units versus just incentivizing resource-based relative value units after hours or above benchmark. Of note, sociodemographic features (child care responsibility, practitioner age, sex) significantly affected the response to a financial incentive.45
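The reported response of 0.12% more hours for each 1% increase in after-hours pay can be read as an elasticity. The sketch below applies that figure linearly to show why a large pay increase would be needed for a modest gain in hours. The function names are hypothetical, and assuming the same elasticity holds outside the range observed in the survey is a simplification made only for this example.

```python
# From the survey cited in the text: 0.12% more hours per 1% more after-hours pay.
AFTER_HOURS_ELASTICITY = 0.12


def hours_change_pct(pay_change_pct: float,
                     elasticity: float = AFTER_HOURS_ELASTICITY) -> float:
    """Estimated % change in after-hours work for a given % change in pay.

    ASSUMPTION: a constant, linear response (hypothetical simplification).
    """
    return elasticity * pay_change_pct


def pay_change_needed_pct(target_hours_change_pct: float,
                          elasticity: float = AFTER_HOURS_ELASTICITY) -> float:
    """Estimated % pay increase needed to reach a target % increase in hours."""
    if elasticity <= 0:
        raise ValueError("elasticity must be positive")
    return target_hours_change_pct / elasticity


print(f"{hours_change_pct(25):.1f}% more hours for 25% more pay")
print(f"{pay_change_needed_pct(10):.1f}% more pay for 10% more hours")
```

Under these assumptions, a 25% pay increase buys about 3% more after-hours work, and a 10% increase in hours would require a pay increase of roughly 83%.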
In anesthesia, most academic departments have an incentive structure that pays for an extra shift or extra hours in the $150/h to $175/h range4 (verbal personal communication, Charles Whitten, M.D., Margaret Milam McDermott Distinguished Chair in Anesthesiology and Pain Management, University of Texas Southwestern Medical Center, Dallas, Texas, 2018). Private sector payments in anesthesia are employer/owner-, market-, and time of day-dependent, and the payments reportedly range from $150/h to $250/h (anonymous sources). Miller and Cohen7 describe a per-hour payment for additional clinical work that reduced differences in salary among assistant to full professors with no change in academic productivity. However, there was no report of per-full-time equivalent work productivity or an examination of unintended effects on the academic careers of junior faculty who were picking up many extra shifts.5,7
In terms of driving operating room efficiency, one study placed 5% of anesthesia salary at risk (withholding) for achieving greater first case on-time starts and benchmark turnover times.57 Several shortcomings are present with this study that limit applicability—allowing anesthesiologists to blame others for delays to qualify for salary credit, providing simultaneous relative social ranking, and statistical flaws.58
How Much Is Enough to Size Incentives for Quality?
Most published studies of incentive systems focus on efforts to improve quality of care. Two controlled studies seem to define the lower boundary of payment necessary to incentivize quality performance for a single process metric. In an innovative study on achieving lipid control, Asch et al.49 employed many behavioral economics principles to influence adherence. Physicians were paid up to $1,024/patient, with an average payment per physician of $3,246 (in 2014 dollars). However, there was no statistically significant difference in lipid levels when compared to the control group. In another controlled trial, similarly sized incentives (1.6% of salary, $2,672 in 2009 dollars) were successful in getting physicians to better treat hypertensive patients in the U.S. Veterans Administration Health System.54 Payments less than those above have generally failed or been minimally effective in various compilations of studies.48,53 It is likely that process metrics for matters of low importance to physicians (such as improving documentation) would require greater incentives to deliver results than provided by the two studies above.28
While the two studies above place the lower limit for single-metric incentives at roughly $3,000, it has been suggested that overall practice incentives will require at least 10 to 15% of annual salary.14,33,59,60 However, one of the most established commercial pay-for-performance shared savings global budgeting programs for physicians, the Alternative Quality Contract administered by Blue Cross Blue Shield of Massachusetts, ranges quality incentive bonuses from 2 to 10% for hitting benchmark quality indicators and limiting resource consumption. While there was little proven impact on quality of care and outcomes, the program did increase adherence to the selected measures for Blue Cross Blue Shield patients and contained Blue Cross Blue Shield patient spending growth at 50% of the statewide benchmark.61,62
In England, success followed an increase in family practice salaries by 25% accompanying the implementation of the Quality Outcomes Framework. Starting in 2004, changing documentation and/or medical practice under the Quality Outcomes Framework was required to receive credit on 146 quality metrics, and successful performance on those metrics determined the salary increase. The data were culled from their records, and there was general approval of the system.59,63 Almost all the practitioners hit their benchmarks and received the vast majority of the potential salary increases (average 937 out of 1,000 points).64 In California, an incentive for quality metrics that let physicians earn back 5% of base salary (initially held back as a withhold) was roundly decried and apparently ineffective.63 In the Advocate Health System, while the financial incentive was not isolated from other behavioral economics principles, high attainment (greater than ninetieth percentile, National Committee of Quality Assurance benchmarks) was reported as “consistently maintained” with bonuses to primary care physicians ranging from 10 to 50% of base salary.14 They note their success in contrast to Medicare initiatives, which were too small to notice at only 1 to 2% of salary.14,26
How Much Is Too Little as a Financial Incentive?
Although we were unable to find any published literature in health care studying the potentially perverse impact of sizing incentives too small, behavioral economics experiments have shown that paying too little may devalue the efforts being made to do the baseline work and actually decrease workplace outcomes compared to baseline.37,65,66 Anecdotal reports of very low incentive payments for increased productivity in physician groups (such as $8/resource-based relative value unit above the median benchmark, less than 25% of the Medicare value) have reportedly had such effects.
How Much Is Enough to Incentivize Academic Productivity?
It is unclear what constitutes an effective academic incentive. There are neither large nor controlled studies in this realm. A meta-analysis concluded that there was no effect on teaching when incentivized, and weak, if any, proof of effects on research output.67 This is consistent with the conclusion in other fields that financial incentives do not motivate inherently interesting or creative tasks.36 Levin and Gustave33 instituted a 5% academic incentive and maintained academic productivity in an orthopedic department while driving up clinical productivity about 5%. Reich et al.9 instituted a point system and recovered from a decreased academic output after institution of a clinical and academic productivity scheme by increasing valuation of a single publication to 25% of a full-time equivalent work effort. Sakai et al.68 reported maintaining academic productivity without financial incentives, but by demanding accountability for productive use of time (equivalent approximately to a paper for about 20% of a full-time equivalent effort) to justify the salary of an academic day. No one was actually penalized across many years, but the threat of not earning one’s entire salary was motivating enough and right-sized the requests for academic time.
Miller and Cohen7 and Detsky and Baker35 reported academic faculty productivity remained constant in the face of a clinical productivity scheme without paying for academic output. In general, paying for inherently interesting work is a mistake,36 as it demands continued payment in perpetuity to maintain the same work effort previously provided without additional incentive. Anecdotal evidence suggests that an extrinsic reward system that allocates nonclinical time (instead of money) based on productive use of that time might be as effective as, or more effective than, a monetary incentive.
Relative Social Ranking
Ranking physician performance is a powerful force (e.g., second in the department in administering appropriate antibiotics on time preoperatively is much better than last in the department). Physicians often have a high opinion of their own capability and are naturally ambitious and competitive.31,68,69 Presenting data that show a specific physician’s performance is below average compared to his or her peers, or demonstrating that he or she is not performing to an accepted high standard, is an incredibly self-motivating force to do better. Most people have a high regard for their public and private image.20 This explains the ability of social ranking to work alone or synergistically with financial incentives to move quality performance metrics.12–14,58,70,71 Sharing anonymous rankings of physician performance is much less effective.29
Unlike almost any other field, medicine promotes excellence in patient care as the only acceptable option. Physicians shown to be underperforming as compared to their peers will change behaviors to improve their relative rankings. It is possible that the same level of unintended consequences that attend financial incentives will follow the use of relative social ranking. For example, once cardiac surgical report cards were made public in New York and Pennsylvania, surgeons were less likely to operate on high-risk patients.72 There has been no published healthcare study comparing relative social ranking with financial incentives. It is widely accepted, however, that the utilization of timely performance feedback and relative social ranking enhances performance,12,14,60,70,71,73,74 and its use may be associated with a higher degree of performance than financial incentives.22,28
The production of rankings is made easiest using an integrated system’s electronic health record, as reports can be pushed to the practitioners with little additional investment. However, in large anesthesia group practices with multiple paper and computer interfaces, producing these reports can require a great deal of investment. We are unaware of a study of the true costs of implementation in these instances.
The Impact of Incentive Timing
Although absolute size is critical to the success of an incentive plan, timing can markedly impact the perceived size, making the paid incentive seem subjectively larger or smaller to the recipient. There are five basic concepts for timing of incentives: hyperbolic discounting, saliency, continued effectiveness over time, effect of withdrawal of incentives after a period of time, and the commitment period for an incentive plan.
People are more likely to choose a smaller certain and immediate reward than a larger reward that accrues later. This is a behavioral economics principle known as hyperbolic discounting, present bias, or immediacy.29 It indicates that the uncertainty of payments distant in the future markedly diminishes their value in the present, far beyond the accepted economic equivalence for net present value of a future payment.37 Net present value is the time value of money—for example, what you can earn in a very safe investment over time such as a bank deposit, and which might currently be valued at 5%/yr. Therefore, assuming people discount future rewards at 5%, one would expect that offering to pay an immediate incentive of $10,000 versus an end-of-year amount of $10,500 would be no different to the recipient. However, hyperbolic discounting may cause devaluation of the year-end lump-sum bonus by as much as 50%, which would make $10,000 now equivalent to a year-end $20,000 bonus.37,75 The traditional use of year-end bonuses, especially in constrained fiscal environments that lend uncertainty as to its full payment, can cause the bonus to lose a great deal of value to the incentivized practitioner.
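The contrast between the two valuations can be worked through numerically. The sketch below compares standard exponential discounting at the 5% rate used in the text with the common one-parameter hyperbolic form V = A / (1 + kD). The function names are hypothetical, and the value k = 1 per year is an assumption picked only because it reproduces the 50% devaluation quoted above; it is not an estimate from the cited studies.

```python
def exponential_present_value(amount: float, annual_rate: float,
                              years: float) -> float:
    """Conventional net present value of a single future payment."""
    return amount / (1 + annual_rate) ** years


def hyperbolic_present_value(amount: float, k: float, years: float) -> float:
    """Perceived value under hyperbolic discounting, V = A / (1 + k * D).

    ASSUMPTION: `k` is an illustrative discount parameter, not a measured one.
    """
    return amount / (1 + k * years)


# Conventional economics: $10,500 at year end is worth about $10,000 today at 5%.
npv = exponential_present_value(10_500, annual_rate=0.05, years=1)

# With k = 1 per year (assumed), a $20,000 year-end bonus feels like $10,000 today.
perceived = hyperbolic_present_value(20_000, k=1.0, years=1)

print(f"NPV of $10,500 in 1 yr at 5%: ${npv:,.0f}")
print(f"Perceived value of $20,000 in 1 yr (k=1): ${perceived:,.0f}")
```

Read this way, an organization paying a year-end lump sum may need to offer roughly twice the amount of an immediate payment to achieve the same perceived value, under the stated assumption.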
Most anesthesia departments reward extra shift work relatively soon after performance, which may explain why it is well received and effective.4 This is in contrast to hospital-based value purchasing by the Centers for Medicare and Medicaid Services, which incentivizes health systems on data that are at least 2 yr old. This (along with a size of 1 to 2%) may be one of several reasons that explain its lack of efficacy in impacting desired outcomes.14,24,26,76
Effects Over the Long Term
Performance stagnates over time if incentives are not reinforced or targets are not changed. Once target levels are achieved and incentives earned, the Quality Outcomes Framework demonstrated that further improvements do not occur over time.24 While long-term incentives may continue to produce desired behaviors,36 only a handful of behaviors can occupy a physician’s mental processing at one time. As a result, the number of active nonautomated incentive-driven behaviors is limited.59,60,73,77 Therefore, systematic changes are essential to solidify positive behavioral changes so that decision-making and focus, which are finite, can be directed elsewhere once desired performance thresholds are achieved.
Extinguishing Effects Over Time
Over time, incentives paid routinely become part of the expected base salary, and may or may not continue to effectively motivate behaviors. There is some disagreement on this, as it has not been well studied in health care.28,36 If incentives are withdrawn, and if operational changes (e.g., cognitive aids such as a checklist),78 electronic medical record prompts, or additional clinic support are not implemented to cement performance,52 the withdrawal of incentives is often associated with a return to baseline or even worse performance.36,39,75 Months after withdrawing the incentive for better blood pressure control of veterans,54 physician performance returned to preincentive levels.
At Kaiser Permanente, initial improvement was seen after introducing an incentive for ordering two screening tests at physician visits, but after withdrawal of incentives, performance during the subsequent 4 yr fell, eventually settling at screening levels worse than baseline.39 This is consistent with the behavioral economics literature warning against paying short term for prosocial, inherently interesting, or noble actions.20,36 The Kaiser Permanente study confirms what was previously demonstrated in several classic behavioral economics studies. In one study, students were offered an incentive to draw pictures and did so at the same rate as a second group not being incentivized. Follow-up opportunities to draw showed that the incentivized group had a reduced desire to draw absent the previously paid incentive when compared to the control group. The activity of drawing pictures had lost its inherent value when attached to an incentive that was then withdrawn.17,36
It is interesting to note that anticipation of financial incentives has been shown to activate excitatory regions of the brain (nucleus accumbens) associated with other dopamine reward-driven pleasures such as gambling and exogenous substances.79 This may explain neurophysiologically why withdrawal of the incentive leads to a quick return to baseline. The brain quickly associates the activity with a neurophysiologic reward and will not engage and pursue the activity without the associated reward.
Commitment Period for Incentives
A long-term commitment to an incentive framework is important. Frequent changes in incentive plans cause suboptimal investment and engagement.38 Changing care paradigms and instituting supporting systems require more than a year-long commitment such as seen in the Alternative Quality Contract of Blue Cross Blue Shield Massachusetts and the Quality Outcomes Framework in England.59,62
Incentives versus Penalties: The Impact of the Principle of Loss Aversion
Since originally defined by Tversky and Kahneman,80 two of the founders of behavioral economics, loss aversion has been well studied and quantified. Economists view an equally likely two-sided risk (coin flip up = win $10, coin flip down = lose $10) as no different from zero. However, an equality of potential losses and gains is not seen as equal by humans. A $10 penalty is seen as a greater negative than a $10 reward such that an individual may require a $20 reward to make the outcomes equivalent and therefore worthy of consideration. This has several implications for construction of incentive plans.28,81
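The asymmetry between losses and gains can be made concrete with a small numerical sketch. The loss-aversion coefficient of 2 is an assumption taken from the $10-versus-$20 illustration above, the function names are hypothetical, and the linear weighting is a deliberate simplification of how people actually value outcomes.

```python
LOSS_AVERSION = 2.0  # assumed: a loss weighs about twice as much as an equal gain


def perceived_value(amount, loss_aversion=LOSS_AVERSION):
    """Subjective value of a gain (positive) or a loss (negative), in dollars."""
    return amount if amount >= 0 else loss_aversion * amount


def perceived_gamble(gain, loss, p_gain=0.5, loss_aversion=LOSS_AVERSION):
    """Probability-weighted subjective value of a win/lose gamble."""
    return (p_gain * perceived_value(gain, loss_aversion)
            + (1 - p_gain) * perceived_value(-loss, loss_aversion))


# The economist's "zero" bet: win $10 or lose $10 on a fair coin.
fair_coin = perceived_gamble(gain=10, loss=10)
# The same bet with the reward doubled to offset the heavier-weighted loss.
sweetened = perceived_gamble(gain=20, loss=10)
print(fair_coin, sweetened)
```

Under these assumptions the even bet is felt as a net loss, and only the doubled reward brings the perceived value back to zero.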
First, introducing incentives funded by salary reductions of others (such as a new productivity compensation plan with no additional dollars) is not really a zero sum. The double impact of salary reductions due to loss aversion will make a dollar-neutral incentive plan seem worse than zero to those subject to the plan.28,36 Second, since a penalty of $1,000 is more likely to spur behavioral change than the possibility of gaining $1,000, important medical opinion and theoretical publications have suggested using penalties to make incentive plans more effective.29,38,73 The fact that the Centers for Medicare and Medicaid Services Readmission Reduction Program has been more effective than hospital-based value purchasing in reaching its goals has been attributed to its use of relatively large penalty “payback” effects.59,82 However, when employing individual penalties, large unintended consequences are likely, including more gaming of reported results and negative impacts on morale.28,63,83 Enacting penalties on physicians should be done with great caution.
Status Quo Bias/Endowment Effect
A closely related topic to loss aversion is the endowment effect or status quo bias.84 People routinely place a subjectively larger value on whatever they already possess as compared to other items of similar value. This applies to both goods and services and inhibits the changes that incentive plans are designed to produce.37 The extrapolation of this concept, borne out by experience but not explicitly published in the healthcare literature, is that a new payment scheme equal in value to the current system will be routinely perceived by providers as inferior and less desirable. Therefore, loss aversion and status quo bias work together against acceptance of a new revenue-neutral compensation plan. These various aversions to change and reluctance to adopt new systems are why robust communication and appeals to intrinsic motivations are essential for success when large upside gains cannot be promised.
Group versus Individual Incentives
Questions arise as to whether to pay the physician or the entire care team. (There is literature on the methods/ethics of patient incentive payments that is outside the scope of this article.) While a recent meta-analysis from the broader economy36 suggests that sharing payments among team members, not necessarily equally but according to their level of input and pay, is best, the healthcare literature is equivocal.
An observational report split payments among physicians based 70% on individual performance and 30% on group performance. The results suggested that doing so amplified the impact of the individual incentive and played to the need for social relatedness (the desire to be a well-regarded and contributing member of a group is an intrinsic motivator).14 In one controlled study, contrary to expectations, paying physicians alone was more effective than paying both nurses and physicians, and also more effective than paying the group practice for future individual distribution.54 To our knowledge, there are no published controlled studies in health care showing that group payments are more effective than individual payments. This deserves further controlled trial investigation.
Incentive metrics have to be clearly defined so that all affected are in agreement. Virtually error-free data, shared in a transparent, understandable, and timely manner, can influence performance.47,85 However, with uncommon outcomes, it is hard to differentiate true performance deficits from random chance.38 A statistical technique known as hierarchical modeling can prevent misclassification of average performers as low performers.86 Inability to differentiate below-expected performance on important uncommon outcomes like mortality leads to a focus on outcome surrogates such as process measures.59,82 However, process measures employed in pay-for-performance have been shown to have poor correlation with the real outcomes of interest such as patient mortality when incentivizing facilities (hospital-based value purchasing and the Readmission Reduction Program)25,26,76 or overall patient outcomes when incentivizing physician practices (Quality Outcomes Framework).24 Random variation can also be differentiated from true performance using process control charts, which are an accepted industry evaluation technique.16,47 Valid data and differentiation in performance are critical so physicians do not waste effort to change behaviors due to random noise in their environment.
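One way to picture the control chart approach is a proportion chart (p-chart) that flags only those periods whose event rate falls outside limits set three standard deviations from the overall rate. The sketch below is illustrative only: the three-sigma convention comes from general statistical process control practice rather than from the cited studies, and the monthly counts and function names are hypothetical.

```python
import math


def p_chart_limits(events, cases):
    """Return the overall rate and per-period (lower, upper) 3-sigma limits."""
    center = sum(events) / sum(cases)
    limits = []
    for n in cases:
        sigma = math.sqrt(center * (1 - center) / n)
        lower = max(0.0, center - 3 * sigma)
        upper = min(1.0, center + 3 * sigma)
        limits.append((lower, upper))
    return center, limits


def out_of_control(events, cases):
    """Indices of periods whose rate lies outside the control limits."""
    _, limits = p_chart_limits(events, cases)
    flagged = []
    for i, (e, n) in enumerate(zip(events, cases)):
        lower, upper = limits[i]
        rate = e / n
        if rate < lower or rate > upper:
            flagged.append(i)
    return flagged


# Hypothetical monthly data: missed-process events out of cases performed.
events = [4, 6, 5, 3, 7, 5, 18, 4]
cases = [100, 100, 100, 100, 100, 100, 100, 100]

center, limits = p_chart_limits(events, cases)
flagged = out_of_control(events, cases)
print(center, flagged)
```

In this made-up series, months with 3 to 7 events all sit inside the limits and would be treated as noise; only the month with 18 events signals a change worth investigating.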
A common refrain of physicians avoiding behavioral change is that their patient population is different from those constituting the comparator group (“my patients are sicker”). Doubt as to data validity should be anticipated. This means having not only verifiable data but also evidence that the choice of comparator is appropriate, using various indexes of coexisting illness or other means. In some cases, especially with sociodemographically disadvantaged populations, incentive systems may actually penalize physician practices taking care of patients who have fewer personal and community resources to support a good outcome or ongoing healthy lifestyle.87–89
Individuals tend to view money they receive through alternative mechanisms and at different times as fundamentally different than money bundled into a paycheck.90,91 Although not explicitly studied in health care, a separate check for a bonus is expected to carry more impact.37 If one makes $10,000/month and receives $11,000 in the next paycheck, it is marginally different. Getting the $1,000 bonus check separately is a much more noticeable difference from zero. With electronic direct deposit, this impact is lost. Every effort should be made to cut distinct checks or separate electronic deposits with clear notification.
We treat new money as “extra” and existing salary as sacrosanct and subject to loss aversion principles. Routinely paid incentives, in terms of mental accounting, become part of the expected base salary and subject to loss aversion. For example, there is greater willingness to commit in advance to put a future raise into a retirement plan versus getting the same raise in a salary check and then allotting it to the retirement plan.92 This should inform how we introduce incentive systems in health care. Taking the next raise for physicians and applying it to productivity or pay-for-performance incentives is likely to meet less resistance than providing a 3% raise and instituting a 3% withhold or penalty tied to performance.93 Using this approach over a period of years, one can fully implement meaningful incentive programs with limited negativity.
Forced Functions/Nudging/Choice Architecture/Defaults/Choice Overload
Airline passengers cannot turn on the light in an airplane restroom without locking the door. This is a forced function that assures that you complete the first task before proceeding to the next task. A less strong but associated concept is nudging or choice architecture, which means placing the right choice within easy reach (physically or mentally), but adherence is voluntary, unlike a forced function. Anesthesiologists were ahead of their time applying this principle to control pharmaceutical costs as early as 1997.13 Leaving inexpensive generics in the operating room, while moving expensive items into the core or leaving them at the operating room pharmacy 50 feet away, was instrumental in cutting pharmaceutical costs by 50%. In lean manufacturing, this is known as “mistake proofing.”94
Defaults address cognitive scarcity, which is the limited ability of busy professionals to manage a variety of initiatives requiring choice and focus. As seen in computerized provider order entry, there are thousands of default options for various medications: dosing, choice of medication, and avoidance of drug–drug interactions. These provide physicians with the most likely beneficial choice without constantly accessing limited human daily decision-making capability. Defaults implemented by a “nudge unit” at the University of Pennsylvania (Philadelphia, Pennsylvania) almost doubled appropriate prescribing of generic medications.21 Checklists similarly overcome cognitive scarcity as scores of essential processes are followed by routine.77
The presence of too many options leads to a lack of action known as choice overload. For example, if one has to choose among numerous metrics in a pay-for-performance plan, some may choose not to participate, whereas they might have chosen to participate with fewer choices. This has been proposed as one of the problems in the many federal government–sponsored pay-for-performance plans.51 Choice overload has been demonstrated when there are too many options for workers to choose among for retirement plans.37
Targets versus Variable Performance/Stretching/Wealth Effect
A combination of target and variable performance incentive is associated with highest performance. A target that is too easy or too hard may result in no effort at all.75 If a target is just out of reach and reaching it results in a reward, people will try hard to achieve it.31,75 If it is only a target, with no additional benefit to higher performance, people strive to reach that target, but nothing more, and further effort is minimized after target achievement. This was clearly demonstrated over time in a study of anesthesia academic productivity.68 To encourage excess performance above a target, a variable increase in incentive after target achievement is recommended.24,28
The wealth effect describes the fact that if you have no dollars, $1 is important. If you have $10,000, $1 is much less meaningful. There is an incentive continuum, such that the first dollar of incentive is more important than the second dollar and so on. Ideal incentive construction has an increasing yield so that the next relative value unit you do has more value than the last relative value unit for which you received an incentive payment. Levin and Gustave33 utilized this principle in their incentive design as they increased incentive payments from 60 to 80% of collections as clinical productivity continually increased.
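A target combined with an increasing variable reward can be sketched as a tiered payout. The 60 and 80% shares echo the range cited above, but the intermediate 70% tier, the dollar bands, the target, and the choice to apply the shares marginally to collections above target are all hypothetical assumptions made for illustration, not a description of the published plan.

```python
# Hypothetical tiers: (upper bound of collections above target, payout share).
TIERS = [
    (50_000, 0.60),
    (100_000, 0.70),
    (float("inf"), 0.80),
]


def incentive_payment(collections, target, tiers=TIERS):
    """Pay nothing below target; above it, pay an increasing marginal share."""
    excess = max(0.0, collections - target)
    payment = 0.0
    lower = 0.0
    for upper, share in tiers:
        if excess <= lower:
            break
        # Portion of the excess that falls inside this band.
        band = min(excess, upper) - lower
        payment += band * share
        lower = upper
    return payment


for collections in (400_000, 530_000, 650_000):
    print(collections, incentive_payment(collections, target=500_000))
```

Because each additional band pays a larger share than the one before it, effort beyond the target continues to be rewarded at a rising rate, which is the property intended to offset the wealth effect.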
The Drawbacks of Incentive Systems
Unintended Consequences/Folly of Rewarding A and Hoping for B
Numerous publications suggest we need careful construction of incentives,95 especially pay-for-performance.28,40,41,96–98 In anesthesia, most departments have adopted a time-based extra pay strategy in lieu of an ASA relative value unit–driven individual productivity reward. Anesthesiologists have long realized that rewarding pure ASA relative value unit generation could lead to numerous requests for time in the gastroenterology suite or ophthalmology section, and less interest in the 1:1 care a very sick patient might require. Rewarding pure individual ASA relative value units but hoping for a work ethic that promotes optimal group functionality would be folly in anesthesia.5,6 Current anesthesia reimbursements do not adequately reward high-intensity work or account for billing limitations when working in sites distant from the operating room where coverage of multiple rooms is impossible. Alternate payment schemes have been proposed that might allow individual productivity analyses,99 but given the current reimbursement methodology, individual productivity assessments sometimes pushed by hospital and practice administrators will not yield the desired increase in operating room productivity. In fact, they will have the opposite effect.
In a classic examination of pay-for-performance of pneumonia treatment, incentivizing adherence to the goal of all pneumonia patients getting antibiotics within 4 h led to predictable, but undesired, results. The originators aimed for an expedited diagnosis and quick intervention to minimize progression of disease. However, the imperfect incentive design produced perverse results. The number of patients prescribed antibiotics in the emergency room increased dramatically, with many getting unnecessary medications rather than processes necessarily being expedited.71 A good incentive would have taken into account the percentage diagnosed within 4 h, the percentage of those started on antibiotics, the percentage of patients started on antibiotics within 4 h who did not need them, and the time to discontinue unnecessary antibiotics, and would have set defaults to help with adherence. As Albert Einstein said, “Things should be as simple as possible, but no simpler.” These unexpected care deviations may explain how well-meaning surrogate process measures can lead to worse patient outcomes or no apparent difference in outcomes in large studies.24,25,76
Gaming is the manipulation or exploitation of the rules in an attempt to gain an advantage. In health care, gaming occurs when incentives are tied to metrics that can be influenced by the providers. For example, such incentives may encourage otherwise ethical individuals to justify shading or enhancing documentation to improve their relative social ranking or monetary gain. With incentives at risk, providers may also be pushed to justify exclusions from the metric, which on paper show they are providing higher levels of care, but without necessarily making patients better.83,100 This issue remains a predictable and unintended consequence of pay-for-performance.28,101 Gaming can distort actual performance so that rewards become uncoupled from better health outcomes.102 Behavioral economics studies (nonphysicians) show that minor degrees of cheating are widespread when incentives are at risk and individuals are unlikely to be caught.103
Upcoding to increase patient complexity and thereby qualify for increased compensation is prohibited. However, the key return from investing in a clinical documentation initiative is considered by some a game meant to take advantage of payer incentives that pay more for complex patients with complex inpatient courses. Facilities often try to help physicians code for maximal reimbursement by employing chart reviewers who use leading questions to make sure they document all potential illnesses in the specific wording that will justify payment.102
The “benefits” of enhanced coding also apply to how physician comparison Web sites rank physicians. Process measures with exclusions can be manipulated. Actual patient outcomes cannot. That may explain how improving adherence to meaningful process measures seems uncoupled from actual patient outcomes.24,83 Playing to the rules to the maximum extent possible should not have surprised the industry.
In anesthesia, anecdotal evidence exists of gaming. For example, when cutoff times are proposed after which a late-working clinician need not return to work, the required presence of a clinician after that hour tends to increase. When rounding call pay up to the nearest hour, clinicians may sign off more often shortly after the turn of the hour. While not outright cheating, physicians’ choices may be influenced by the construct of incentives and allow the self-justification required to take advantage of the system. The one study in this arena showed that when payments for working after 4 pm were introduced for anesthesiologists at one practice, markedly prolonged turnover times (more than 1 h) late in the day did not increase in frequency.44 This suggests that anesthesiologists are highly ethical, and/or the minimal dollar rewards of staying later are not worth the damage to their reputations.
It has been shown that tournament incentives, such as “top performers get rewards,” lead to a decrease in cooperative behaviors.103 For example, if a pharmaceutical company offered the top three salespeople a Hawaiian trip, it is unlikely an individual salesperson would share leads outside their territory with other representatives competing for the prize. While motivating the individual, this might diminish the group’s performance overall. A better approach is to set a high bar, providing the reward to anyone exceeding that bar plus a variable component.
Competition may have positive effects in other situations. The pride of belonging to a team that is trying to provide the highest quality of care can spur each of several competing units to do better. In this case, competing well reinforces the intrinsic motivation of pursuit of mastery and one’s self image as part of a high performing unit.
Cognitive Scarcity, Willpower, and Crowding Out
Studies have shown that attention to detail and focus are limited, as is the application of willpower.29,37,42 Decision-making exhausts limited cognitive capabilities.77 Therefore, expecting physicians to constantly direct attention to a host of metrics requiring judgment and action is unlikely to succeed.104 Most studies of qualitative incentive plans in the United States have used or suggest using a limited number of metrics (often four to five).6,28,39,42,52,60
Cognitive scarcity may not be as limiting with intrinsic motivation. For example, seeking to simultaneously address multiple deficiencies in the workplace due to a desire for mastery and autonomy may not run into the same limitations of focus compared to externally set process measures that are being pushed through financial incentives. Empowering physicians to apply system learning to improve multiple processes simultaneously is possible, and this may generate better care in a shorter time frame.47 Even so, there are only so many projects of personal interest that one person can undertake concurrently.
A related concept is that incentives themselves narrow focus, preventing attention to other important variables.29,42 This is termed “crowding out,” and was unequivocally shown in the Quality Outcomes Framework, which focused on 146 specific quality incentives per primary care practice. A follow-up to the Quality Outcomes Framework showed that performance on nonincentivized care worsened or was stagnant, and that mortality was unaffected.24,104
Another form of crowding out occurs when moving social norms to market norms. One problem with paying for prosocial or other expected behaviors is unintentionally moving from social to market norms. Introducing extrinsic motivation can completely crowd out intrinsic motivation to perform at the desired level. In a classic study of Haifa day care centers (Haifa, Israel), half were randomized to institute a penalty for late-arriving parents, who were routinely keeping child care providers late.66 At the centers where the fine was instituted, contrary to expectations, late pickups increased dramatically, almost doubling! The prosocial expected behaviors of responsibility to the teachers had been replaced by an economic decision to pay the small fine and arrive late; in short, it was worth it. Interestingly, after removal of the fine, the number of late pickups remained increased, suggesting that irreversible harm had been done to the social compact between parents and teachers. This is a salient lesson as healthcare leaders consider paying doctors to be nice and considerate of their patients’ needs.
Applying behavioral economics principles to the construction of financial incentives enhances the probability of success and optimizes the impact. Adhering to key characteristics and careful design will avoid many pitfalls. Clinician productivity (volume of work or work hours) is usually responsive to financial incentives of sufficient size. While financial incentives can increase performance on selected quality metrics, the actual effect on important outcomes like mortality is minimal, and incentives can occasionally deliver a perverse impact.
If one implements a financial incentive plan, there are many factors that will increase the probability of success. Relative social ranking (benchmarking to peers and national norms) should accompany or precede a financial incentive, as doing so may yield the results desired without expenditure (for metrics unrelated to additional hours of work). The incentive must be sized sufficiently to attract notice (recommended at least 10 to 15% of salary) and be paid without contingency. Paying too little can yield perverse results by devaluing the work and leading to lower effort overall. The incentive should be paid as close as possible to the work done to avoid significant (hyperbolic) discounting of its dollar value. Withdrawing incentives can result in worse than baseline performance unless changes are hardwired into the workflows.
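The effect of delay on a bonus can be illustrated with the standard hyperbolic discounting form V = A / (1 + kD), where A is the amount, D the delay, and k a discount rate. The value k = 0.1 per month below is an arbitrary assumption chosen only to show the shape of the curve; it is not an estimate from the literature reviewed here, and the function name is hypothetical.

```python
def hyperbolic_value(amount, delay_months, k=0.1):
    """Perceived present value under V = A / (1 + k * D); k is assumed."""
    if delay_months < 0:
        raise ValueError("delay_months must be non-negative")
    return amount / (1 + k * delay_months)


bonus = 1_000
for delay in (0, 1, 3, 12):
    # Perceived value of the same $1,000 bonus paid after each delay.
    print(delay, round(hyperbolic_value(bonus, delay), 2))
```

With this assumed rate, a bonus paid a year after the work is perceived as worth less than half of the same bonus paid immediately, which is why payment close to the work done matters.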
The incentive should include a target that is moderately difficult to achieve, as targets that are too easy or too hard yield no effort. A variable increase in reward above the target threshold should be available, as pure target rewards yield performance only to target, and little more. Ideally, the variable part of the reward should increase as performance increases to offset the wealth effect, whereby the next dollar earned is a little less valuable than the previous dollar since one is now richer. The incentive should be paid to all exceeding a desired threshold. Paying only the top performers (tournament incentives) incites negative noncooperative behaviors that are deleterious for the enterprise.
The number of metrics being incentivized should be limited (three to five is suggested), as the ability to focus (cognitive scarcity) is finite. The finite nature of focus and willpower means that defaults, choice architecture, and nudges, like checklists meant to guide the caregiver to do the right thing, should be employed first before adding financial incentives for non–productivity-related outcomes. This is because the application of incentives signals that variation in performance is due to variation in motivation, which is most often not the case for non–productivity-related (e.g., quality) performance.
Designers of an incentive plan should consider the psychologic impact. Penalties are morale busters. Although loss aversion shows that penalties are twice as powerful as an opportunity for gain, they should be used sparingly if at all. People in general will seek to keep their current compensation system (status quo bias) if the impact of a new plan is merely even to them, with “even” meaning twice as much upside as downside. Therefore, a morale-neutral, new incentive compensation plan must have more than twice the upside as compared to downside or it will not be a global positive. This requires more money, and few health systems or anesthesia departments have additional resources. Taking planned raises and applying those to productivity or other pay-for-performance plans will be accepted more readily than funding these incentives from existing salary.
Gaming of results can and will occur, even among ethical human beings who convince themselves they are “correctly” reporting their performance. Appropriate checks on the system (e.g., audits) may be necessary. Paying for things people want to do or like to do or should do is a bad idea. It converts the social norm of expected performance to a market norm, either devaluing the activity or creating a demand for ever more payment in perpetuity for doing something that was once given willingly.
The incentive must be carefully constructed so that it is unlikely to yield unintended effects, and wishful thinking is no substitute for thoughtful design. Using surrogate measures like operating room utilization and hoping for greater profits, instead of rewarding contribution margin per allocated operating room hour (exactly what is intended), can yield perverse results (e.g., many cases with a poor payer mix, all of which lose money). This is the classic example of “On the folly of rewarding A, while hoping for B.”95 Finally, a meaningful communications plan must accompany financial incentives or any scheme seeking to change physician behaviors, like all change management initiatives.105 That communication plan must also promote and reference intrinsic motivators.14,47,60,70,103
The principles presented in this review suggest there is still much work to be done to optimize physician motivation and to successfully improve healthcare quality. Financial incentives are only one part of a change management strategy. We suggest that cracking the whip may not be the best thing to do, certainly in regard to quality of care. In those cases, fixing systems that support faultless delivery of care and simply appealing to the powerful forces that have guided physicians’ altruism and self-image as a healer might be more effective.
The authors are grateful to Maureen Fitzpatrick, M.S.N., A.R.N.P.-B.C., Department of Anesthesiology, Miller School of Medicine, University of Miami, Miami, Florida, and Rudolph Davis, B.S., Miller School of Medicine, University of Miami, Miami, Florida, for their assistance with a comprehensive literature review.
Support was provided solely from institutional and/or departmental sources.
The authors declare no competing interests.