What We Already Know about This Topic
Patient-centered assessments of medical care and recovery are gaining in use and importance, but a validated assessment tool for patients undergoing surgery with regional anesthesia is not available.
What This Article Tells Us That Is New
In a stepped process involving interviews, statistical approaches, and internal and external validation, a 19-item questionnaire was developed and validated for patient assessment of regional anesthesia during surgery.
Items fell into five categories: Information, Attention, Waiting, Discomfort, and Pain.
Patient-reported outcomes have gained widespread use since their first description 30 years ago.1 These measurements report not only symptoms but also a combination of physical, mental, and social health, including role functioning, cognitive capacity, general perceptions of well-being, and patient satisfaction.2 Reported outcome measures are used for several purposes and serve as an aid in clinicians’ decision-making process.3 Anesthesia and the perioperative period increase the complexity of evaluating patient-based measures because of the short time interval combined with high emotional tension and confusing drug effects. These difficulties may explain the weaknesses of some existing tools: they mainly rely on expert rather than patient views,4 are not metrically sound because their main properties have not been established,5 or make no distinction between regional and general anesthesia.6 In 2005, the Evaluation du Vécu de l’Anesthésie Générale (EVAN-G) questionnaire was developed for use in general anesthesia,7 in accordance with the current methodology.8
Regional anesthesia allows surgery without loss of consciousness and could be thought to improve patient comfort. However, the real experience of patients undergoing regional anesthesia has not yet been examined with a validated tool; rather, a simple visual analog scale (VAS) is commonly used. It gives artificially high ratings,9 which are not very informative for improving quality of care.
The construction of a scale intended to assess patient experience has to follow specific protocols of psychometric questionnaire construction. To construct a preliminary questionnaire integrating most patient concerns in the perioperative period, we performed individual interviews within the 48 h after surgery. This phase of item generation was carried out until no new theme emerged in a real-time content analysis.
This analysis defined the themes to explore and yielded a first set of 55 questions assessing the impact of the perioperative period on patients’ experience, with reference to the theory of expectations,10 which defines satisfaction as the discrepancy between expectations and actual experience.
This 55-item questionnaire was applied to a population of 1,215 patients: 238 patients undergoing regional anesthesia and 977 patients undergoing general anesthesia. Twenty-eight items were selected from this study: 26 items suitable for all types of anesthesia and two items specific to regional anesthesia without loss of consciousness. The 26-item structure represented the final EVAN-G questionnaire, published in 2005.7 The 28-item structure represented the Evaluation du Vécu de l’Anesthésie LocoRégionale (EVAN-LR) pilot version, which showed high psychometric properties on an intermediate statistical analysis (not reported here) but was still too long for everyday practice in regional anesthesia. A short-form questionnaire therefore seemed more suitable for these procedures, which are frequently part of fast-track pathways. The EVAN-LR pilot version underwent a second phase of validation to further reduce the number of items and to report the properties of the resulting scale (fig. 1).
The purpose of this article was to report the main properties of a multidimensional self-reported questionnaire, termed EVAN-LR, specifically assessing the satisfaction of patients undergoing regional anesthesia.
Materials and Methods
EVAN-LR is a self-reported questionnaire comprising 19 items, structured in a global index and five nonweighted dimensions, each addressing several aspects of patient experience: Attention, Information, Discomfort, Waiting, and Pain. By “dimension” we mean a facet of patient experience that could affect the patient’s satisfaction with the healthcare process. As an example, items 18 and 19, which emerged from interviews with patients (see appendix 1), were identified as belonging to the same dimension by factorial analysis, and we called it “Waiting” according to the concept being addressed. EVAN-LR does not need a baseline assessment to be scored.
Items were answered within the 48 h after surgery using a five-point Likert scale, defined from 1 to 5 as “much less than expected,” “less than expected,” “as expected,” “more than expected,” and “much more than expected.” All dimension scores were linearly transformed to a 0–100 scale, with 100 indicating the best possible level of satisfaction and 0 the worst. The global index score was computed as the mean of the dimension scores. No dichotomous cutoff value was defined because the sample population was not large enough to be normative. A steering committee including nine anesthetists, two surgeons, one psychiatrist, and three public health doctors supervised the study.
The criteria for patient inclusion were: age more than 18 years, consent to participate in the study, elective surgery under regional anesthesia (except obstetric), and ability to understand and read French and to fill out a self-reported questionnaire within 48 h after regional anesthesia. The study was naturalistic; there was no impairment of the physician–patient relation. This study meets the requirements of the Declaration of Helsinki11 and was the object of a statement to the French national commission on information technology and human rights (Commission Nationale Informatique et Libertés).12
Population and Data Collection
We included 390 patients collected from three university hospitals in southeastern France undergoing various surgical procedures exclusively under regional anesthesia; 238 of these patients were drawn from the previous set of 1,215 patients and 152 were included after an additional recruitment. This group participated in a specific validation phase for EVAN-LR. The purpose of this phase was to further reduce the number of items despite the high psychometric properties of the pilot version to improve applicability.
The Amsterdam Perioperative Anxiety and Information Scale (APAIS)13 and Spielberger’s State Trait Anxiety Inventory (STAI)14 were completed at consultation with the anesthesiologist. Sociodemographic and other clinical data, as well as the number of previous anesthetics and the duration of the current anesthesia, were collected.
Patients were asked to complete the 28-item self-reported questionnaire within 4–48 h after surgery. VASs covering various themes related to patient discomfort in the perioperative period (anxiety, pain, fear, discomfort, confidence in staff, ability to ask questions, being considered as a person, kindness of staff, quality of explanation, and overall satisfaction) were assessed at the time of EVAN-LR completion.
Item Selection and Validation of EVAN-LR
The primary objectives of the validation phase were to check that the questionnaire actually (“validity”) and accurately (“reliability”) measured the concept it has been designed for, that is, patient satisfaction. Assessment of the validity of an instrument is meant to evaluate the systematic error of measure (drift), whereas assessment of the reliability is meant to evaluate the random error of measure (scatter). Secondary objectives of the validation phase were to further reduce the number of items, to provide a shorter and usable instrument in routine clinical practice.
Item deletion was based on classical criteria including redundancy (interitem correlation), skewness of the distribution (floor and ceiling effect), or low response rate (over 20% missing data).
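The screening criteria above can be sketched in a few lines. This is only an illustration, not the authors' software: the 20% missing-data limit comes from the text, whereas the 0.80 limits used here for floor/ceiling effects and for interitem redundancy are hypothetical placeholders, as are all function names.

```python
from statistics import mean

def missing_rate(responses):
    """Fraction of unanswered (None) responses for one item."""
    return sum(r is None for r in responses) / len(responses)

def floor_ceiling(responses, low=1, high=5):
    """Fractions of answered responses in the lowest and highest category."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return 0.0, 0.0
    n = len(answered)
    return (sum(r == low for r in answered) / n,
            sum(r == high for r in answered) / n)

def pearson_r(x, y):
    """Pearson correlation over respondents who answered both items."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when an item has no variance
    return sxy / (sxx * syy) ** 0.5

def screen_item(responses, max_missing=0.20, max_extreme=0.80):
    """Return the deletion flags raised by one item.

    max_missing follows the 20% criterion in the text;
    max_extreme is an assumed threshold for floor/ceiling effects.
    """
    flags = []
    if missing_rate(responses) > max_missing:
        flags.append("missing")
    floor, ceiling = floor_ceiling(responses)
    if floor > max_extreme:
        flags.append("floor")
    if ceiling > max_extreme:
        flags.append("ceiling")
    return flags

def redundant(x, y, max_r=0.80):
    """Flag two items as redundant (max_r is an assumed threshold)."""
    return abs(pearson_r(x, y)) > max_r
```

An item would be a candidate for deletion when `screen_item` returns any flag or when `redundant` is true for a pair of items; the final decision in the study also rested on the steering committee's judgment of content.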
The questionnaire’s multidimensional structure was identified by principal component factor analyses with Varimax rotation15 and by interitem, item-dimension, and interdimension correlations (Pearson r).
Each item was matched with its dimension, and item-internal consistency was supported if the correlation was over the standard of 0.4 after overlap correction. If an item correlated better with its supposed dimension than with the others, we confirmed its discriminant validity.16 For each potential dimension, internal consistency reliability was assessed by the Cronbach α coefficient. A Cronbach α coefficient of at least 0.7 was expected for each scale.17 Within each dimension, the items whose deletion would lead to an α increase of at least 0.02 were candidates for deletion. The unidimensionality of each dimension was assessed using Rasch analyses.
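As an illustration of the reliability checks described above, the following sketch computes Cronbach α, the item-dimension correlation corrected for overlap (the item is correlated with the sum of the other items of its dimension), and α after deletion of one item. It assumes complete data (no missing answers) and uses sample variances; the function names are hypothetical and this is not the software used in the study.

```python
from statistics import mean, variance

def cronbach_alpha(items):
    """Cronbach alpha for one dimension.

    items: list of k lists, one per item, each holding the ratings of the
    same respondents in the same order (complete cases only).
    """
    k = len(items)
    totals = [sum(row) for row in zip(*items)]      # per-respondent sum score
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(items, index):
    """Correlation of one item with the sum of the OTHER items (overlap correction)."""
    rest = [sum(v for j, v in enumerate(row) if j != index)
            for row in zip(*items)]
    return pearson_r(items[index], rest)

def alpha_if_deleted(items, index):
    """Cronbach alpha of the dimension once the item at `index` is removed."""
    return cronbach_alpha([col for j, col in enumerate(items) if j != index])
```

With these helpers, an item would meet the criteria in the text when `corrected_item_total` exceeds 0.4, and would be a deletion candidate when `alpha_if_deleted` exceeds `cronbach_alpha` by at least 0.02.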
The polytomous Rasch model fits into the broader context of Item Response Theory, in which the evaluated traits or dimensions are linked both to the patients’ responses and to the properties of the items. This mathematical model makes it possible to assess the ability of the items to measure each dimension. In our study, it measures a “trait” (for instance, a dimension of EVAN-LR) through a process in which the responses made to items by patients are scored and in which higher scores are intended to indicate increasing levels of attainment (level of satisfaction, for instance).18 We applied an extension of the Rasch model, the Partial Credit Model, which uses threshold and discrimination parameters.19 This model allows an exact empirical test of the hypothesis that response categories represent increasing levels of a latent attribute or trait, here the dimension. The scalability of each dimension was assessed by the pattern of item goodness-of-fit statistics, which were expected to range between 0.7 and 1.2; this ensures that all items of the scale tend to measure the same concept, the dimension of EVAN-LR.
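To make the role of the thresholds concrete, the sketch below computes the response-category probabilities of a Partial Credit Model item in its usual textbook form, assuming unit discrimination for every item. The parameter values and function names are illustrative only; they are not estimates from this study, and the fit statistics reported here were obtained with dedicated software rather than with code of this kind.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta: latent trait level of the patient (e.g., satisfaction).
    thresholds: the m step thresholds of an item with m + 1 ordered
    categories (4 thresholds for a five-point Likert item, categories 0..4
    standing for ratings 1..5).
    """
    # Cumulative sums of (theta - threshold); the empty sum for category 0 is 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, thresholds):
    """Expected category (0..m) for a patient at trait level theta."""
    return sum(x * p for x, p in enumerate(pcm_probabilities(theta, thresholds)))
```

For example, `pcm_probabilities(0.0, [-1.0, 0.0, 1.0, 2.0])` returns five probabilities that sum to 1, and `expected_score` increases with `theta`, which is the ordering property the model tests.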
We assessed external validity by studying the relationships between potential dimensions of EVAN-LR and validated instruments such as APAIS, STAI, and specific VAS. The underlying assumption was that the dimension scores of EVAN-LR would correlate better with scores of similar dimensions from the other concurrent instruments than with dissimilar ones, thus assessing convergent validity. The discriminant validity of EVAN-LR was determined by comparing dimension mean scores across patient groups that were expected to differ in their sociodemographic (age, sex) or clinical features (American Society of Anesthesiologists physical status, premedication, ambulatory) using ANOVA, Mann–Whitney U test, or Pearson correlation.
An open-ended question at the end of EVAN-LR checked the content validity by requesting patients to point out a missing domain that would have contributed to their experience. Acceptability was assessed by computing the level of missing data, which is an objective measure of acceptance in a real-life questionnaire.15 The validation analysis was not performed on records with more than 25% missing answers, to ensure data quality.
The questionnaire consists of 19 items, structured in five nonweighted dimensions, each exploring one facet of patient experience and synthesized in a global index. The scores of negatively worded items were reversed so that higher scores indicated a higher level of satisfaction. The score of each dimension was obtained by computing the mean of the item ratings of the dimension for each individual. If less than one half of the items were missing, the mean of the nonmissing items was substituted for scoring. All dimension scores were linearly transformed to a 0–100 scale, with 100 indicating the best possible level of satisfaction and 0 the worst. The global index score was computed as the mean of the dimension scores.
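The scoring rules just described can be summarized in a short sketch. It assumes ratings coded 1 to 5, `None` for a missing answer, and reversal of a negatively worded item as 6 minus the rating. The handling of the global index when a dimension cannot be scored is not specified in the text, so returning no value in that case is an assumption, as are the function names.

```python
def dimension_score(answers, reversed_items=()):
    """Score one dimension on a 0-100 scale, or return None if not scorable.

    answers: ratings from 1 to 5 for the items of the dimension, None if missing.
    reversed_items: positions (0-based) of negatively worded items.
    """
    rated = []
    for position, rating in enumerate(answers):
        if rating is None:
            continue
        # Reverse negatively worded items so that higher always means more satisfied.
        rated.append(6 - rating if position in reversed_items else rating)
    missing = len(answers) - len(rated)
    # Scored only if less than one half of the items are missing.
    if missing * 2 >= len(answers):
        return None
    item_mean = sum(rated) / len(rated)
    # Linear transformation of the 1-5 mean onto 0-100.
    return (item_mean - 1) / 4 * 100

def global_index(dimension_scores):
    """Mean of the dimension scores (assumption: None if any dimension is unscored)."""
    if any(score is None for score in dimension_scores):
        return None
    return sum(dimension_scores) / len(dimension_scores)
```

For instance, a four-item dimension answered 3, 3, missing, 3 is scored 50, whereas a two-item dimension with one missing answer is not scored.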
Analyses were performed using WINSTEPS version 3.42 (Computer Programs. Chicago, IL), MAP-R version 1.0, and IBM PASW Statistics version 17.0 software (IBM Corporation. New York, NY).
Only results of the final validation phase of EVAN-LR are reported, to avoid confusion with intermediate development of the questionnaire.
The process of item selection resulted in a final version comprising 19 items structured in five dimensions, depending on their content: Attention (4 items), Information (5 items), Discomfort (4 items), Waiting (2 items), and Pain (4 items) (see appendix 2). The steering committee ensured that the content of every dimension was meaningful and that the five-factor structure dealt with the major domains reported in the patients’ interviews and open-ended comments. This short form of 19 items explained 61.4% of the total variance. A subsample analysis has been made on orthopedics population showing similar characteristics (data not shown).
The 390 patients (table 1) included in the validation analysis underwent various surgical procedures under regional anesthesia: 107 orthopedic (27.4%), 144 hand (36.9%), 32 plastic (8.2%), 41 ophthalmologic (10.5%), 18 spine (4.6%), 11 digestive (2.8%), nine urologic (2.3%), and 28 other (7.2%) including gynecologic. Mean patient age was 53.5 ± 17.2 years; 318 (81.5%) received premedication and 152 were ambulatory patients (39%). Table 1 reports other patient characteristics. APAIS mean global score was 7.2 ± 3.6 and STAI mean global score was 51.6 ± 3.0. Mean duration of anesthesia was 57.3 ± 37 min (median 50, range 0–240). Table 2 reports mean scores by dimension. The mean global index was 79 ± 15. The lowest mean dimension score was found for Information (65 ± 22) and the highest for Discomfort (87 ± 18).
Table 2 reports item and dimension scale characteristics. The overall scalability of EVAN-LR was satisfactory because, within each dimension, items showed a good fit to the Rasch model, with no item showing a goodness-of-fit statistic outside the acceptable range.
Item-internal consistency ranged from 0.30 to 0.75, supporting a high correlation between items and their corresponding dimension. Correlations between items and the other dimensions (item discriminant validity) ranged from 0.01 to 0.35, ensuring high discrimination capacity. Correlations between dimension scores were low to moderate, ranging from 0.17 to 0.51 (P < 0.001). Internal consistency of the five dimensions showed high reliability and construct validity: Cronbach α ranging from 0.60 to 0.88.
Convergent validity was explored by the level of correlation with other concurrent measures. Although several correlations are statistically significant, they are not strong, meaning that EVAN-LR does not assess the same trait as APAIS, STAI, or the domains assessed by VAS. Nevertheless, we can assume the convergent validity of the scale because it is actually related to what it should theoretically be related to. There was a correlation between APAIS (anxiety for anesthesia) and Discomfort (r = −0.316). STAI correlated with the Pain dimension (r = 0.219). EVAN-LR dimensions tended to correlate more with their domain-related VAS: the VAS assessing “be considered as a person” correlated with Attention (r = 0.224) and the VAS assessing “confidence in staff” correlated with Waiting (r = 0.306). Table 3 shows these results.
EVAN-LR poorly correlated with premedication or ambulatory surgery. Differences in EVAN-LR scores among clinical groups met the assumptions expressed by the steering committee, which relied on clinical experience and literature analysis concerning patients’ experience (table 4). Female sex was associated with a significantly lower Information score. Patients with American Society of Anesthesiologists physical status below II had a significantly lower Attention score. Patients older than 55 years showed higher satisfaction scores for all dimensions except Attention. Procedure time and number of previous anesthetics did not correlate with the satisfaction level expressed by patients.
By acceptability we mean that the questionnaire is usable in a real perioperative framework thanks to the reduced number of questions and a well-understood phrasing. Missing values were low, ranging from 2.6 to 6.9% per dimension, confirming the good acceptability of the questionnaire.
EVAN-LR is the first psychometrically validated questionnaire that specifically assesses patient experience of the perioperative period surrounding regional anesthesia. To date there has been no instrument designed to score patient experience in regional anesthesia that relies on expectation theory and demonstrates high metric validity.9 In fact, most studies relied on a qualitative method to score patients’ experience instead of a multidimensional validated questionnaire. This simplified approach usually gives an artificially high level of satisfaction.20 Patients have their own point of view, which often differs from that of physicians. We must recognize the value of involving patients, as healthcare consumers, in determining ways to evaluate health care, and take their experience itself into account.21 The purpose of this study was to develop and validate, by following international guidelines,8 a specific self-reported questionnaire assessing the perioperative satisfaction of patients undergoing regional anesthesia and resting on expectation theory. This approach allows a fairly accurate measure of satisfaction. Ideally, a presurgery measure would also be available, but this would make the evaluation process heavier.
The aim of EVAN-LR is to distinguish between various regional anesthesia procedures and processes and identify those associated with highest patient satisfaction. One strength of the scale is its applicability in a real perioperative framework because it is a short questionnaire of 19 items, with very few missing answers and can be rapidly completed. Thus, patients’ reported outcome may be used as a primary outcome, putting the healthcare consumer in control of the quality process.
A wide range of surgical procedures matching the population of patients undergoing regional anesthesia was selected to reflect clinical practice. As a result, orthopedic (including hand surgery) and ophthalmologic surgery were the most represented.22 This allows us to extrapolate the results of this validation study to the population of patients undergoing regional anesthesia.
The item generation phase of the EVAN-LR questionnaire was contemporary with that of EVAN-G because both studies share the same approach: a focus on patients’ expectations. This framework of individual in-depth free-form interviews, instead of expert opinion or literature review, led to the generation of some specific concerns about the perioperative period surrounding anesthesia, such as attention or waiting. The consequences of staying alert during regional anesthesia were specifically addressed by two items. Regional anesthesia also comes with other specificities justifying a new validation phase and dimensional structuring of patients’ expectations. The questionnaire recently reported by Mui4 assumed that the structure found in general anesthesia could be transposed to regional anesthesia despite not being specific to this kind of procedure. In fact, structuring items into dimensions requires a true exploratory factor analysis for each version of a questionnaire, instead of a simple confirmatory factor analysis that only assumes some transposability of a scale made for another purpose while weakening content validity. Accordingly, the report presented here shows that the dimensional structure of EVAN-LR differs substantially from that of EVAN-G. Those differences would not have been revealed by cross-validation alone. For example, in regional anesthesia the Privacy dimension vanished because of weak internal validity, two items specifically emerged that address consciousness during surgery, nine items were removed because of redundancy, and all item dimensions, except Waiting and Information, were modified.
One limitation of this study is that inclusion of patients was split in two time periods. This is due to the intermediate development of a pilot version of EVAN-LR. We sought to further reduce the number of items despite the high psychometric properties of the pilot version to improve applicability.
Acceptability of the final version was good, with less than 7% missing values for all dimension scores and only 1% refusal to complete the questionnaire, an advantage of having only 19 questions without sacrificing information. Interestingly, our 19-item questionnaire explained 61.4% of the total variance whereas the 30 items of the Patient Satisfaction with Perioperative Anesthetic Care questionnaire explained only 56.6%.4
EVAN-LR was administered up to 48 h after surgery. The perioperative period is specific because of emotional tension, surgical outcomes, and anesthesia drug effects over a short time interval. By restricting the questionnaire period to 48 h, we intended to give more weight to perceptions related to anesthesia than to perceptions related to surgery, but with a risk of recall bias. However, we conducted a test–retest in the EVAN-G study to verify the stability of the satisfaction assessment over 1 month. For that purpose, a subsample of 36 patients was evaluated twice with a 15-day interval, and agreement was measured by means of an intraclass correlation coefficient. The stability of EVAN-G was good, with intraclass correlation coefficients ranging from 0.72 to 0.81.7 Because EVAN-LR has the same construct as EVAN-G, we assume that it has the same reliability properties.
Factors that could influence patients’ experience were explored simultaneously with EVAN-LR validation. Patient anxiety is a major confounding factor that was addressed using the APAIS and STAI scores. Correlations with those scores tend to validate EVAN-LR because anxiety may influence patient experience. However, anxiety, as measured by APAIS or STAI, and satisfaction, as measured by EVAN-LR, are two different constructs, which explains the low r values observed (table 3). The APAIS splits patient anxiety into two dimensions: anxiety for anesthesia and anxiety for surgery; it also evaluates information desire. In our study, anxiety for anesthesia was stronger than anxiety for surgery, and the most anxious patients23 expressed lower scores on the Global Index, Discomfort, Pain, and Waiting dimensions. Moreover, information desire showed exactly the same tendencies, stressing the need to pay particular attention when interviewing patients24 who are more anxious. As in the EVAN-G study, STAI score did not correlate with EVAN-LR and APAIS. One explanation could be that STAI is not specifically designed to assess anxiety during the perioperative period whereas APAIS is. This argument further emphasizes the specificity of this emotionally charged period.
Good medical communication, for example by changing anesthesiologists’ attitude to increase empathy, has already been reported to improve patient satisfaction23 and brings other benefits such as increasing adherence to medical advice.25 However, patient satisfaction is not only related to anesthesiologists’ individual behavior. In our study, confidence in staff probably played a key role in patient satisfaction because its VAS correlated most significantly with all EVAN-LR dimensions and had the strongest impact on Global Index. The “Be considered as a person” VAS also correlated well with all dimensions except Information, suggesting that the patients were able to distinguish between the communication skills and the technical quality of information.26 Surprisingly, Pain VAS did not correlate with the Pain dimension of EVAN-LR. One explanation could be that in regional anesthesia, surgery is only feasible if the nerve block works perfectly. Thus the Pain VAS became a reflection of discomfort; in fact, it correlated only with the Discomfort dimension. This probably reflects a different perception of pain while controlled by regional analgesia, a hypothesis that would need to be tested further.
The purpose of our study was not to explore the link between demographics or clinical status and satisfaction. These data were only necessary to test the hypothesis and EVAN-LR validity. However, our results are consistent with data in the literature.27–29 For example, Information dimension score was lower in women and patients below 55 years whereas patients above 55 years showed higher scores for Discomfort, Pain, and Global Index. Also, American Society of Anesthesiologists physical status below II was associated with a lower satisfaction in the Attention dimension and Global Index.
Our study permitted us to distinguish between ambulatory and inpatient surgery. Despite some published reports30,31 we did not find a change in patients’ experience when premedication or ambulatory course were applied, but EVAN-G did. Widespread use of premedication in regional anesthesia32 may explain why these differences were not significant since only 72 patients (18%) did not receive premedication, leading to a lack of power in the subset analysis. Nevertheless, the link between premedication and patient satisfaction remains unclear and is the subject of an ongoing study.
Outpatients report better scores for Pain and Discomfort after general anesthesia.7 We did not find these differences after regional anesthesia, confirming that these two dimensions did not explore the same domain in addition to being structured differently.
EVAN-LR is a novel tool assessing five domains of patients’ perception of the perioperative period surrounding regional anesthesia. We demonstrated the validity and reliability of the scale. Compared with EVAN-G, the scale formerly validated for general anesthesia, this study showed that patients’ expectations differ substantially between general and regional anesthesia. The discrimination ability of EVAN-LR, which assesses the whole care process from the preoperative visit to the postoperative period after regional anesthesia, makes it a well-specified auditing tool as well as a potential primary endpoint measure for clinical research.