Background

Health-related quality of life is usually reported for specific rather than heterogeneous populations such as those treated in routine anesthesia practice. The 8-item short-form generic health-related quality-of-life questionnaire (SF-8) is a candidate instrument for this setting. The authors evaluated the feasibility, reliability, validity, and responsiveness to change of the Spanish version of SF-8 in a population-based surgical cohort.

Methods

Before surgery, the authors administered the 1-week recall SF-8 to 2,991 patients recruited from a large population-based study of risk factors for pulmonary complications. The patients were undergoing nonobstetric elective or emergency surgery in 59 hospitals, each of which collected data on seven randomly assigned days in 2006. The SF-8 was administered again 3 months later. Reliability was evaluated using the Cronbach alpha coefficient and validity by comparing physical and mental component summary SF-8 scores with clinical variables. Responsiveness after surgery was evaluated using the standardized response mean.

Results

Cronbach alpha for the overall test was 0.92. Physical and mental component summary scores and all individual scores were lower (worse quality of life) in women (P < 0.01) and decreased with age (P < 0.01). Preoperative scores were lower for those in worse clinical condition (higher body mass index, American Society of Anesthesiologists physical status class, or surgical risk scores), with preoperative respiratory symptoms, and in emergency situations (P < 0.01). The standardized response mean ranged from 0.1 to 0.5.

Conclusions

The SF-8 is a feasible, reliable, valid, and responsive instrument for assessing health-related quality of life in a broad-spectrum surgical population.

  • The short-form 8 (SF-8) of the health-related quality-of-life questionnaire has been suggested to follow the impact of medical interventions, but its reliability in a general population of surgical patients has not been tested

  • In approximately 3,000 surgical patients in Catalonia, Spain, the Spanish version of the SF-8 was feasible, reliable, valid, and responsive to assess the impact of surgery on health-related quality of life

The number of surgical procedures has grown remarkably with advances in surgical and anesthetic techniques. At the same time, population aging has led to a higher prevalence of morbidity as patients undergo elective surgery at older ages.1–3 The assessment of functional recovery and health-related quality of life (HRQL) has, therefore, gained importance. HRQL is a multidimensional construct that reflects a patient's perception of the impact of disease and treatment on physical, psychologic, and social function and on overall well-being.4 HRQL measurement furthers our understanding of preoperative patient status, recovery, and impact of the duration of hospital stay,5 with consequent implications for planning resource utilization and predicting healthcare costs.6,7 In surgical settings, routine assessment of HRQL can give both surgeons and anesthesiologists insight into the impact of their practices. In particular, the long-term consequences of anesthesia have begun to receive attention recently.8 Further enquiry in this direction will be substantially facilitated if we ensure that the available tools that assess HRQL dimensions are reliable, valid, and efficient in a wide range of perioperative settings.9

HRQL questionnaires can be categorized as either generic or disease-specific according to their content, scope, and target populations.10 One of the most widely used tools is the 36-item short-form generic HRQL questionnaire (SF-36), first introduced in 1988. Although easier and more user-friendly instruments have been recommended as better for use in population-based studies,11 we considered that the shorter 8-item version of the SF-36, known as the SF-8, would provide a candidate instrument for routine use in anesthesiology. Available in culturally comparable forms for use in 30 countries, this questionnaire has been highly recommended as a screening and monitoring tool for population studies and offers the advantage that scores can be readily compared with results from the 12-item or original 36-item versions.11 Even though short questionnaires have been shown to have significantly less precision than well-constructed multiple-item ones,12 an important strength of the SF-8, and one that can effectively counterbalance the problem of limited precision, is that it is short enough to be easily incorporated within preoperative and postoperative assessments in ordinary clinical settings.

Our aim was to assess whether the Spanish version of the SF-8 questionnaire was feasible, reliable, valid, and responsive to relevant clinical changes in a population-based surgical cohort in Catalonia, Spain. To our knowledge, this is the first report prospectively assessing HRQL with the SF-8 in a broad-spectrum surgical cohort under routine conditions of anesthesia and surgery.

The Clinical Research Ethics Committee of Hospital Germans Trias i Pujol, Badalona, Spain, approved the study, and all patients signed statements of informed consent for follow-up data collection and telephone contact. All patients received routine care; no research-related intervention was introduced.

Study Design

We conducted a prospective, multicenter, random-sample cohort study in patients undergoing a nonobstetric hospital surgical procedure under general or regional (neuraxial or plexus) anesthesia. The study was part of a large project applying population-based study methods to Assess Respiratory rIsk in Surgical patients in Catalonia (ARISCAT).13 The current study included 3,350 eligible inpatients and outpatients undergoing surgery from 59 hospitals in a region of seven million inhabitants in eastern Spain. The participating centers perform 63% of all anesthetic procedures in the region, and the proportions of main types of surgical procedures and anesthetic techniques are similar across centers, according to a cross-sectional survey of anesthetic practices in Catalonia completed in 2003.14 Recruitment was carried out from January 2006 to January 2007, and follow-up data collection ended in April 2007. Patients were randomly selected in two steps, following a previously described recruitment method.15 Each center was assigned seven randomly chosen days of the year, one for each day of the week, on which to recruit patients and gather data prospectively, using a system of stratification that ensured that few days involved collection from more than one center. Thus, on every day of the 12-month study period, data were being collected in at least 1 of the 59 centers. Each participating center considered all the patients who had to undergo a scheduled or emergency surgical procedure as eligible. With this sampling method, we sought to reflect the seasonal, weekly, and daily distribution of patients. The recording of preoperative interview and intraoperative parameters until hospital discharge was carried out by local members of the research team, all of whom were anesthesiologists.

To check vital status, functional status, and HRQL with the SF-8, a structured phone survey was carried out 3 months after surgery, a moment corresponding to the expected average time needed for physical recovery. In addition, the National Health Service Death Register was inspected to confirm dates. Telephone operators were blinded to procedures and outcomes.

A short questionnaire on demographic characteristics and type of surgery (scheduled or emergency) was filled in for those patients who declined to take part in the study to ascertain whether nonresponders were different from participants.

Data Collection

Each local research team was made up of anesthesiologists who were previously trained to fill in a structured preoperative questionnaire. “Hot pursuit” methods (starting from admission) were used to ensure completeness of records. A centralized database and applications for remote data recording were developed, incorporating quality control algorithms to validate online data entry and identify missing data. Hypertext Transfer Protocol Secure software (Internet Society, Reston, VA) was used for data transfer protection. A data manager checked entries and asked local teams to confirm the completeness of records. An expert in using the International Classification of Diseases-9th Clinical Modification coded all diagnoses and procedures at the end of the study.

To evaluate the quality of recruitment and data collection, independent blinded observers audited the medical records of a random sample of 5% of patients. It was found that the eligibility criteria were properly applied in all the audited centers.

Population

All patients at least 18 yr of age on the surgical lists for both emergency and elective procedures at each hospital were invited to participate on the randomly selected days, subject to exclusion of patients who were (1) undergoing an obstetric procedure; (2) undergoing local or peripheral nerve anesthesia administered by a surgeon, even if monitoring and/or sedation were needed; (3) undergoing a diagnostic or therapeutic procedure outside the operating room; (4) requiring further surgery because of an in-hospital postoperative complication that occurred before the randomly selected study day; (5) undergoing a transplant; or (6) already intubated.

SF-8 Health Survey

The SF-8 health survey is the most recent short version of the SF-36 questionnaire. This 8-item instrument measures the dimensions of general health, physical functioning, role-physical, bodily pain, vitality, social functioning, mental health, and role-emotional. SF-8 scales and summary measures call for norm-based scoring methods in which the means, variances, and regression weights come from studies in the U.S. general population. Values more than or less than 50 can be interpreted as better or worse than expected for the general population, and a score difference of 10 points reflects 1 SD. This scoring method facilitates comparisons between the different versions of the short-form questionnaires, all three of which also provide physical component summary (PCS) and mental component summary (MCS) scores.11 Culturally comparable versions of the SF-8 have been developed for use in 30 countries, including Spain, after a multistage process of translation, back-translation, and cultural adaptation,11 and numerous studies have validated the questionnaire in specific patient populations.16–19 Given that this study aimed to validate the SF-8 for assessing the impact of a broad spectrum of surgical interventions on HRQL at 3 months, we considered that the 1-week recall form (rather than 24-h or 4-week recall forms) would be the most appropriate one to use. An example of the English version of the 1-week recall SF-8 questionnaire is included in appendix 2.
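As a rough illustration of the norm-based metric described above, the following Python sketch rescales a raw value so that the reference population has a mean of 50 and an SD of 10. The function name and the reference mean and SD used in the example are hypothetical. The actual SF-8 scoring relies on published U.S. norms and regression weights, which are not reproduced here.

```python
def norm_based_score(raw: float, ref_mean: float, ref_sd: float) -> float:
    """Rescale a raw value to a norm-based metric (reference mean 50, SD 10)."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (raw - ref_mean) / ref_sd  # distance from the reference mean, in SD units
    return 50.0 + 10.0 * z


# Hypothetical reference values, for illustration only (not the SF-8 norms).
example = norm_based_score(raw=60.0, ref_mean=70.0, ref_sd=20.0)
print(example)  # 45.0, i.e., half an SD below the reference mean
```

On this metric, a value of 45 reads directly as half an SD worse than the reference population, which is what makes scores comparable across the short-form versions.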

Variables

Before surgery, we assessed age, gender, body mass index, employment status (active and inactive), type of admission (inpatient and outpatient), type of surgery by scheduling (elective or emergency) and by main surgical group (orthopedic, general, and others), type of anesthesia (general, central neuraxial block, general plus central neuraxial block, and peripheral block administered by an anesthesiologist), American Society of Anesthesiologists (ASA) physical status class, respiratory symptoms (cough, sputum production, wheezing, and dyspnea), arterial oxygen saturation measured by pulse oximetry, and preoperative SF-8 assessment. During surgery, we stratified surgical risk/invasiveness using the Silverman-Holt Aggregate Preoperative Evaluation instrument,20 which is based on the diagnostic codes of the International Classification of Diseases–9th Clinical Modification and assigns five risk categories from lower to higher risk. Postoperative variables recorded were length of hospital stay, in-hospital postoperative complications (respiratory, cardiac, renal, hepatic, hematologic, neurologic, and/or infectious events), and death. During the 3-month follow-up telephone survey, we recorded vital status (alive vs. dead), employment status (active vs. inactive), and quality of life (postoperative SF-8).

Statistical Analysis

Our study is a secondary data analysis related to a large-sample, population-based dataset created for another purpose, the evaluation of surgical risk.13 The statistical power calculated for the main study13 was assumed to be adequate for this SF-8 validation study.

To first assess feasibility, the percentage of patients with at least one missing value per item was calculated. Floor and ceiling effects (percentage of patients with the lowest and the highest possible scores, respectively) were obtained for each of the SF-8 items and for PCS and MCS scores. Following well-established recommendations,21 it was assumed that these effects should be less than 15%. The reliability of the SF-8 was evaluated in terms of internal consistency by the Cronbach α coefficient, computed with preoperative responses. This statistic is a measure of the degree of homogeneity among items in a dimension. An α coefficient of at least 0.7 is recommended as the usual indicator of reliability.22
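The two quantities described above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the SPSS procedures used in the study. The function names and the toy responses are assumptions introduced here, and the alpha formula shown is the usual one based on item variances and the variance of the total score.

```python
from statistics import variance


def floor_ceiling(scores, lowest, highest):
    """Return (floor %, ceiling %): share of scores at the lowest/highest value."""
    n = len(scores)
    floor = 100.0 * sum(s == lowest for s in scores) / n
    ceiling = 100.0 * sum(s == highest for s in scores) / n
    return floor, ceiling


def cronbach_alpha(rows):
    """Cronbach alpha for item responses, one row per respondent."""
    k = len(rows[0])                                 # number of items
    items = list(zip(*rows))                         # one tuple of responses per item
    item_var = sum(variance(col) for col in items)   # sum of item variances
    total_var = variance([sum(r) for r in rows])     # variance of the total score
    return k / (k - 1) * (1 - item_var / total_var)


# Toy responses (5 respondents x 3 items), made up for illustration.
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 5, 5], [5, 5, 4]]
print(cronbach_alpha(toy))
print(floor_ceiling([row[0] for row in toy], lowest=1, highest=5))
```

Under the criteria stated in the text, a floor or ceiling percentage of 15% or more, or an alpha below 0.7, would flag a potential problem with the scale.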

To evaluate the construct validity, we explored the associations between the preoperative SF-8 scores and patients' demographic (age and gender) and clinical data (preoperative arterial oxygen saturation measured by pulse oximetry and ASA physical status), and length of hospital stay. First, we hypothesized that the correlation between clinical variables and SF-8 scores would be low. Second, we expected that SF-8 scores would be worse among more severely ill patients (inpatients, patients undergoing emergency surgery, those with higher ASA scores and at higher surgical risk, and those reporting preoperative respiratory symptoms) and more strongly associated with ASA physical status than with surgical risk. Our reasoning was that ASA measures the patient's overall health status rather than risk that is specific to the intervention (reflected by the invasiveness classification).

Responsiveness to change is the extent to which an instrument can detect changes over time. To measure this property, changes in preoperative and postoperative SF-8 scores were assessed. Patient characteristics and preoperative SF-8 scores were compared between subgroups of patients who did and did not complete follow-up. The responsiveness of the questionnaire was determined using the standardized response mean (SRM), calculated as the mean change in PCS and MCS scores divided by the SD of that change.22 SRMs of 0.2, 0.5, and 0.8 correspond to small, medium, and large changes, respectively, in health status.23 Results from the subgroup of patients having both preoperative and postoperative SF-8 assessments were used to identify variables that could best capture the impact of surgery on HRQL at three moments. The initial moment, at the time the intervention took place, was reflected by the surgical risk/invasiveness category. The early postoperative period was reflected by in-hospital complications (respiratory, cardiac, renal, hepatic, hematologic, neurologic, and/or infectious events). The early period of at-home recovery was reflected by whether a patient had returned to work or not (assessed in patients who were employed before surgery).
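The SRM calculation can be sketched as follows. This is an illustrative Python example, not the study's SPSS analysis. The function names, the label for values below 0.2, and the paired scores are assumptions made here, and the SD in the denominator is taken to be the SD of the paired changes, which is the usual definition of the SRM.

```python
from statistics import mean, stdev


def standardized_response_mean(pre, post):
    """Mean of the paired changes (post - pre) divided by the SD of those changes."""
    changes = [b - a for a, b in zip(pre, post)]
    return mean(changes) / stdev(changes)


def magnitude(srm):
    """Label |SRM| using the 0.2 / 0.5 / 0.8 benchmarks cited in the text."""
    size = abs(srm)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # label chosen here for illustration


# Made-up paired summary scores for five patients (not study data).
pre = [40.0, 45.0, 50.0, 55.0, 38.0]
post = [42.0, 44.0, 56.0, 55.0, 43.0]
srm = standardized_response_mean(pre, post)
print(round(srm, 2), magnitude(srm))
```

Because the SRM is a ratio of a mean to an SD, it does not grow with the number of patients, which is why it complements significance tests in a large cohort.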

Pearson correlation coefficients were used to explore the associations between continuous variables. For categorical variables, we used the analysis of variance or the t test for paired or independent groups. Data are presented as mean (SD) or n (%). All analyses were performed using SPSS version 15.0 (SPSS Inc., Chicago, IL).

Figure 1 shows a flowchart of patient recruitment, loss, and enrollment. Of 3,350 eligible patients, 359 were nonresponders (response rate, 89.3%): 185 (5.5%) declined to participate, 103 (3.1%) had significant problems with language (non-Spanish–speaking immigrants) or cognitive problems, and 71 (2.1%) were excluded for unrecorded reasons (probably related to emergency status). Nonresponders were older (59 ± 21 yr vs. 56 ± 19 yr, P = 0.032) and more likely than participants to have undergone emergency surgery (31% vs. 12%, P < 0.001). Of the 2,991 patients included, 76 (2.5%) did not complete the preoperative SF-8. Cross-sectional validity (including feasibility, reliability, and construct validity) was, thus, assessed in 2,915 patients, 641 (22%) of whom did not complete the postoperative SF-8 questionnaire. Fifty-nine (1.97%) of the included patients died during the follow-up period; 33 (1.10%) of these patients died before discharge and 26 (0.87%) died between discharge and the telephone survey. The median time elapsed from the surgical intervention to death was 47 days (range, 3–97 days). Responsiveness to change was, thus, assessed in the 2,274 surviving enrolled patients.
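The patient flow reported above can be checked with simple arithmetic. The short Python sketch below only restates the counts given in the text. The variable names are introduced here for illustration.

```python
eligible = 3350
nonresponders = 185 + 103 + 71   # declined, language/cognitive problems, unrecorded reasons
included = eligible - nonresponders
no_preop_sf8 = 76                # did not complete the preoperative SF-8
cross_sectional = included - no_preop_sf8
no_postop_sf8 = 641              # did not complete the postoperative SF-8
longitudinal = cross_sectional - no_postop_sf8
response_rate = round(100 * included / eligible, 1)

print(nonresponders, included, cross_sectional, longitudinal, response_rate)
# 359 2991 2915 2274 89.3
```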

Fig. 1. Patient flowchart. SF-8 = 8-item short-form generic health-related quality-of-life questionnaire.

A small proportion of patients (76 [2.5%]) did not complete the preoperative SF-8 (74 nonresponders plus 2 with missing items). These patients tended to be older and to have undergone an emergency procedure, have a higher ASA score, higher level of surgical risk, and not be actively employed, with statistically significant differences between the two groups in these respects (table 1). The 641 patients (22% of those included) who did not have complete 3-month follow-up data (the 59 [1.9%] who died plus the 582 [20%] lost to follow-up) also tended to be older and to have undergone an emergency procedure, have a higher ASA score, higher level of surgical risk, and not be actively employed. In addition, noncompleters at this time tended to be male (table 1). Although the statistical significance of these differences is most likely attributable to the large sample size, selection bias cannot be ruled out, and this must be taken into account when drawing conclusions. Noteworthy patient characteristics in table 1 are that 50.5% of patients in the cohort were men, and that the mean age was 57.8 ± 28.3 yr. The median length of hospital stay for the inpatient cohort was 3 days (range, 1–168 days).

Table 1.  Patient Characteristics

The distribution of SF-8 scores is summarized in table 2. Floor effects were low for all dimension scores, whereas ceiling effects exceeded 15% for most scales. However, both floor and ceiling effects for the summary scores (MCS and PCS) were very low (at 0.2 and 5.9%, respectively). The Cronbach α coefficient for the overall test was 0.92, confirming its internal consistency.

Table 2.  Distribution of Preoperative SF-8 Scores, with Floor and Ceiling Effects

Construct Validity

Preoperative SF-8 summary scores were inversely correlated with age, ASA, and length of hospital stay, but the coefficients were low (r between PCS and each variable, respectively: −0.22, −0.29, and −0.23; r between MCS and each variable: −0.16, −0.25, and −0.18), meaning that lower SF-8 summary scores (worse HRQL) were weakly associated with older age, higher (worse) ASA physical status, and longer hospital stay. The correlation between preoperative arterial oxygen saturation measured by pulse oximetry and SF-8 scores was positive but also low (r = 0.2 for PCS and 0.14 for MCS), such that patients with lower oxygen saturation levels also had lower (worse) HRQL scores.

Men had higher mean preoperative PCS and MCS scores than women (PCS: 45.9 [11.6] vs. 44.1 [11.8], P < 0.01; MCS: 50.8 [10.5] vs. 47.3 [11.3], P < 0.01). PCS and MCS scores decreased linearly with increasing age in the overall cohort (P < 0.01). Gender and age group differences were maintained within all individual SF-8 scales (table 3).

Table 3.  SF-8 Scores According to Preoperative Patient Profile

Patients with higher body mass index and ASA values had lower preoperative scores on all SF-8 scales and on summary scores (P < 0.01). In addition, more severely ill patients (inpatients, patients undergoing emergency surgery, those with higher ASA scores and at higher surgical risk, and those reporting preoperative respiratory symptoms) had significantly lower preoperative SF-8 (P < 0.01) scores (table 3). It is worth noting that, regardless of the variable, MCS scores were always higher than the corresponding PCS scores, with vitality, social functioning, and mental health scales having the highest values (P < 0.01). Only the mental health subscale differences were nonsignificant between scheduled and emergency surgery (P = 0.09).

Responsiveness to Change

The importance of surgical risk was evaluated in the 2,274 (76%) patients with complete preoperative and postoperative SF-8 assessments. A substantial proportion of these patients showed some improvement in HRQL after surgery, 43 and 47% for the PCS and MCS variables, respectively, and a significant association was found between changes in PCS and MCS scores and surgical risk category. Specifically, less improvement was seen in SF-8 scores among patients at lower risk (categories 1 and 2) in comparison with those at higher risk (categories 3–5) (P < 0.01). The SRM was small for the PCS variable and did not increase linearly with risk. For the MCS scores, however, the SRM tended to be moderate, and there were greater gains among patients in higher risk categories (table 4).

Table 4.  SF-8 Mean Changes after Surgery and SRM, According to the Preoperative Profile of Patients Who Completed both Preoperative and Postoperative Assessments

In-hospital postoperative complications (respiratory, cardiac, renal, hepatic, hematologic, neurologic, and/or infectious events) were evaluated in the subgroup of hospitalized patients, 1,856 (62%) inpatients with complete assessments (both preoperative and postoperative SF-8 data). In this group, 193 (10%) patients had at least one postoperative complication. Patients with complications showed significantly lower preoperative PCS and MCS scores than patients without complications (P < 0.01) (data not shown) and tended to have greater changes in the PCS and MCS variables (ΔPCS: 3.0 for patients with complications vs. 1.4 for patients without complications, P > 0.05; ΔMCS: 4.7 vs. 3.5, respectively, P > 0.05), but the differences did not reach statistical significance and the SRM values were small.

Change in employment status was recorded for the 1,075 patients (36%) who were employed before surgery. Those who reported returning to work 3 months after surgery had significantly higher preoperative PCS and MCS scores (P < 0.01) (data not shown) and greater change in those summary scores (ΔPCS: 3.2 vs. −1.4, P < 0.01; and ΔMCS: 2.1 vs. 1.1, P > 0.05, for the active vs. inactive groups, respectively). The SRM was small and always larger in the group returning to work after surgery (table 4).

This report describes the first study using the SF-8 health survey to assess HRQL in a large heterogeneous surgical population in which a broad range of anesthetic techniques were used, representative of routine clinical practice. The results show that use of the SF-8 is feasible in this setting, providing reliable, valid, and responsive measurements.

The low percentage of nonresponse and missing values (2.5%), the low floor effect (i.e., the proportion of respondents with the worst health status according to responses), and the high reliability demonstrated in this study show that patients responded consistently to items in the SF-8. Although the ceiling effect (i.e., the proportion of respondents with the best level of health as measured by the questionnaire) was high for the individual SF-8 dimensions, the component summaries displayed very satisfactory ceiling effects (0.2 and 5.9% for the MCS and PCS, respectively, well below standard recommendations21). The evidence we provide of the construct validity of the SF-8 was found in a broad-ranging surgical population. Age and gender variations were consistent with previous reports assessing HRQL with the SF-36 and SF-12 in the Spanish general population.24,25 As expected, our surgical cohort had mean summary component scores that were slightly lower than those of the Spanish general population derived from the SF-36 (differences of 5.38 points for the PCS and 5.24 points for the MCS for men, and 0.73 and 0.78 points, respectively, for women). Our patients' SF-8 component scores were also lower than SF-12 scores (differences of 4.81 points for the PCS and 4.94 points for the MCS for men, and 1.32 and 1.27 points, respectively, for women). Worse physical health status among patients undergoing surgery compared with the general population probably explains these differences, although the reason why larger differences were observed for men deserves further analysis. Our findings for women, who uniformly had lower component scores than men, and the observation that HRQL scores decreased linearly with increasing age also support the construct validity of the Spanish version of SF-8, as these patterns are also reported in the literature.26,27

We hypothesized that the correlation between SF-8 scores and clinical variables would be low, as they express different health traits. However, construct validity expectations were met, given that the SF-8 clearly distinguished patients according to clinical status (ASA class), surgical risk/invasiveness category, and the presence or absence of respiratory symptoms. More severely ill patients (those requiring hospital admission or emergency surgery, or showing respiratory symptoms) are expected to feel that their health is worse, and their impression should be expressed by lower preoperative SF-8 scores pertaining to both physical and mental components. These observations are consistent with reports of worse HRQL scores in association with higher ASA and surgical risk categories.4,27 It is important to note that, as expected, SF-8 scores decreased more markedly with worsening ASA physical status than with increasing surgical risk.

The responsiveness of the SF-8 to change was assessed by measuring PCS and MCS score differences 3 months after surgery and by calculating the SRM. On average, patients showed some HRQL improvement. Although the mean increase in PCS score was small, greater mean improvement in the MCS score was observed. Patients in higher risk groups showed greater improvement compared with lower risk groups. This phenomenon may be comparable with a pattern known as “reframing” that has been observed in patients treated for cancer.28 Accordingly, more severely ill patients in higher risk classes might have different expectations and be more optimistic about recovery. In a similar manner, our patients with postoperative complications presented lower preoperative SF-8 scores and felt that they had improved more than those without complications, and this was reflected in both PCS and MCS scores after recovery. Postoperative complications included very different events, however, rendering it difficult to develop a single meaningful HRQL-relevant variable that would be applicable for all surgical procedures. Therefore, the association between postoperative HRQL and medical complications could not be evaluated properly. Because these patients also presented lower preoperative SF-8 scores, regression to the mean may also explain these patterns of change. Variables far from the mean on the first assessment will tend to be closer to the mean on the second, in relation to random effects on variance.29

The SRM findings show that the changes in PCS and MCS values can be classified as reflecting small and moderate improvements in health status, respectively, confirming that 3 months after surgery, the patients' health status had changed. Whether this trend would be different if postoperative assessment had taken place at a different time, allowing less or more time for physical recovery, is unknown. However, other studies assessing HRQL after cardiac surgery have been unable to demonstrate differences 1 month after surgery and, consistent with our findings, have succeeded in demonstrating improvement at 3 months.30

Score changes over time were evaluated in the subgroup of patients who were actively employed before surgery, because it was hypothesized that return to work within 3 months would serve as a proxy for patient recovery and HRQL improvement. Returning employed patients indeed reported experiencing improvement in HRQL as reflected in the PCS score. However, the postoperative SF-8 scores in the inactive group after 3 months were even lower than the preoperative scores in the employed patients, suggesting that those patients not returning to work had a worse baseline health status.

Certain limitations should be taken into consideration when interpreting the results of this study. First, the possibility of selection bias cannot be ruled out, given that 11% of the patients in the original population-based sample and 20% in the longitudinal subanalysis (responsiveness to change) could not be evaluated. Similar percentage ranges have been reported in surgical cohorts evaluating elderly patients,31 whose ability to cooperate can become compromised, although lower rates have been reported in studies that exclude high risk population groups.32 Our study used a population-based sample and, understandably, a proportion of older patients or those with more severe disorders was lost to follow-up. Therefore, we surmise that if more optimistic and cooperative patients had participated in our study, they might have given higher SF-8 responses and led us to overestimate improvement. Taking into account the large sample size in our study, however, we judge that the effect on inferences regarding the psychometric properties of the instrument is minimal. A second limitation is that the test was not self-administered, and the length of time needed to respond to questions was not recorded. Although the interviewers may have conditioned patients' responses to some extent,11 it was decided that it would be better to use the same preoperative and postoperative methods of data collection, given that the 3-month survey would be carried out by means of a telephone interview. Conversely, a strength of the study, with regard to postoperative data collection, is that the interviewers were blinded to preoperative survey results. In addition, although interviewer-administered and self-administered responses may vary somewhat,33 our study is likely to reproduce the actual conditions of routine use of the SF-8 in surgical settings. Finally, the main diagnostic category, surgical procedure, and patient comorbidity have not been taken into account to adjust the results. Our reasoning was, however, that to some extent, this aspect had already been accounted for by the ASA physical status scores.

There are also several statistical issues that should be taken into consideration. As noted, this is a secondary data analysis from a large sample size dataset, and the statistical power was calculated for the main study13 and not for the current analyses. We nevertheless consider our sample size to be adequate for the purpose of SF-8 validation. Although we performed multiple comparisons, we did not apply tests to adjust for such multiplicity. Our approach included a more appropriate inference strategy, specifically the use of an effect size statistic, the SRM.22 This statistic provides information that is less influenced by the sample size, and it is, therefore, complementary to the statistical testing of the observed differences.21 In addition, commonly accepted guidelines for interpretation of the magnitude of the SRM are available in the literature.34

In summary, we have evaluated the Spanish version of the SF-8 in a large general surgical population, in conditions reflecting routine practice in anesthesiology. Confirming the original instrument's psychometric properties, we have found that this version of the SF-8 is feasible, reliable, valid, and responsive. Thus, the 1-week recall version of this short tool, which is easy to use in perioperative clinical settings such as preoperative assessment and follow-up evaluations, can be used confidently, giving results that will be comparable with those available in the literature.

The authors thank Mary Ellen Kerans, M.A. (Freelance Editor, Barcelona, Spain), for revising the English language usage in some versions of the manuscript.

1.
Ekstein M, Gavish D, Ezri T, Weinbroum AA: Monitored anaesthesia care in the elderly: Guidelines and recommendations. Drugs Aging 2008; 25:477–500
2.
Silvay G, Castillo JG, Chikwe J, Flynn B, Filsoufi F: Cardiac anesthesia and surgery in geriatric patients. Semin Cardiothorac Vasc Anesth 2008; 12:18–28
3.
Sprung J, Gajic O, Warner DO: Review article: Age related alterations in respiratory function—anesthetic considerations. Can J Anaesth 2006; 53:1244–57
4.
Guyatt GH, Feeny DH, Patrick DL: Measuring health-related quality of life. Ann Intern Med 1993; 118:622–9
5.
Myles PS, Weitkamp B, Jones K, Melick J, Hensen S: Validity and reliability of a postoperative quality of recovery score: The QoR-40. Br J Anaesth 2000; 84:11–5
6.
Wachtel RE, Dexter F, Lubarsky DA: Financial implications of a hospital's specialization in rare physiologically complex surgical procedures. Anesthesiology 2005; 103:161–7
7.
Dexter F, Blake JT, Penning DH, Lubarsky DA: Calculating a potential increase in hospital margin for elective surgery by changing operating room time allocations or increasing nursing staffing to permit completion of more cases: A case study. Anesth Analg 2002; 94:138–42
8.
Sessler DI: Long-term consequences of anesthetic management. Anesthesiology 2009; 111:1–4
9.
Chassany O, Sagnier P, Marquis P, Fullerton S, Aaronson N: Patient-reported outcomes: The example of health-related quality of life—a European guidance document for the improved integration of health-related quality of life assessment in the drug regulatory process. Drug Inf J 2002; 36:209–38
10.
McDowell I, Newell D: Measuring Health: A Guide to Rating Scales and Questionnaires, 2nd edition. New York, Oxford University Press, 1996, pp 10–46
11.
Ware JE, Kosinski M, Dewey J, Gandek B: How to Score and Interpret Single-Item Health Status Measures: A Manual for Users of the SF-8 Health Survey. Boston, QualityMetric Inc., 2001, pp 4–8
12.
McHorney CA, Ware JE Jr, Rogers W, Raczek AE, Lu JF: The validity and relative precision of MOS short- and long-form health status scales and Dartmouth COOP charts. Results from the Medical Outcomes Study. Med Care 1992; 30:253–65
13.
Mazo V, Briones Z, Canet J, Paluzie G, Cobo E: Risk factors for postoperative pulmonary complications in general surgical population in Catalonia, Spain (abstract). Eur J Anaesthesiol 2008; 25(suppl 44):229
14.
Sabate S, Canet J, Gomar C, Castillo J, Villalonga A: Cross-sectional survey of anaesthetic practices in Catalonia, Spain. Ann Fr Anesth Reanim 2008; 27:371–83
15.
Clergue F, Auroy Y, Pequignot F, Jougla E, Lienhart A, Laxenaire MC: French survey of anesthesia in 1996. Anesthesiology 1999; 91:1509–20
16.
Lefante JJ Jr, Harmon GN, Ashby KM, Barnard D, Webber LS: Use of the SF-8 to assess health-related quality of life for a chronically ill, low-income population participating in the Central Louisiana Medication Access Program (CMAP). Qual Life Res 2005; 14:665–73
17.
Turner-Bowker DM, Bayliss MS, Ware JE Jr, Kosinski M: Usefulness of the SF-8 Health Survey for comparing the impact of migraine and other conditions. Qual Life Res 2003; 12:1003–12
18.
Bost JE, Williams BA, Bottegal MT, Dang Q, Rubio DM: The 8-item Short-Form Health Survey and the physical comfort composite score of the quality of recovery 40-item scale provide the most responsive assessments of pain, physical function, and mental function during the first 4 days after ambulatory knee surgery with regional anesthesia. Anesth Analg 2007; 105:1693–700
19.
Sugimoto M, Takegami M, Suzukamo Y, Fukuhara S, Kakehi Y: Health-related quality of life in Japanese men with localized prostate cancer: Assessment with the SF-8. Int J Urol 2008; 15:524–8
20.
Holt NF, Silverman DG: Modeling perioperative risk: Can numbers speak louder than words? Anesthesiol Clin 2006; 24:427–59
21.
McHorney CA, Tarlov AR: Individual-patient monitoring in clinical practice: Are available health status surveys adequate? Qual Life Res 1995; 4:293–307
22.
Hays RD, Anderson R, Revicki D: Psychometric considerations in evaluating health-related quality of life measures. Qual Life Res 1993; 2:441–9
23.
Guyatt G, Walters S, Norman G: Measuring change over time: Assessing the usefulness of evaluative instruments. J Chron Dis 1987; 40:171–8
24.
Alonso J, Regidor E, Barrio G, Prieto L, Rodríguez C, de la Fuente L: Population reference values of the Spanish version of the Health Questionnaire SF-36. Med Clin (Barc) 1998; 111:410–6
25.
Vilagut G, Valderas JM, Ferrer M, Garin O, Lopez-Garcia E, Alonso J: Interpretation of SF-36 and SF-12 questionnaires in Spain: Physical and mental components. Med Clin (Barc) 2008; 130:726–35
26.
Quintana JM, Arostegui I, Oribe V, López de Tejada I, Barrios B, Garay I: Influence of age and gender on quality-of-life outcomes after cholecystectomy. Qual Life Res 2005; 14:815–25
27.
Guallar-Castillón P, Sendino AR, Banegas JR, López-García E, Rodríguez-Artalejo F: Differences in quality of life between women and men in the older population of Spain. Soc Sci Med 2005; 60:1229–40
28.
Anthony T, Jones C, Antoine J, Sivess-Franks S, Turnage R: The effect of treatment for colorectal cancer on long-term health-related quality of life. Ann Surg Oncol 2001; 8:44–9
29.
Bland JM, Altman DG: Some examples of regression towards the mean. BMJ 1994; 309:780
30.
Myles PS, Hunt JO, Fletcher H, Solly R, Woodward D, Kelly S: Relation between quality of recovery in hospital and quality of life at 3 months after cardiac surgery. Anesthesiology 2001; 95:862–7
31.
Moller JT, Cluitmans P, Rasmussen LS, Houx P, Rasmussen H, Canet J, Rabbitt P, Jolles J, Larsen K, Hanning CD, Langeron O, Johnson T, Lauven PM, Kristensen PA, Biedler A, van Beem H, Fraidakis O, Silverstein JH, Beneken JE, Gravenstein JS: Long-term postoperative cognitive dysfunction in the elderly ISPOCD1 study. ISPOCD investigators. International Study of Post-Operative Cognitive Dysfunction. Lancet 1998; 351:857–61
32.
Monk TG, Weldon BC, Garvan CW, Dede DE, van der Aa MT, Heilman KM, Gravenstein JS: Predictors of cognitive dysfunction after major noncardiac surgery. Anesthesiology 2008; 108:18–30
33.
Kaplan RM, Sieber WJ, Ganiats TG: The quality of well-being scale: Comparison of the interviewer-administered version with a self-administered questionnaire. Psychol Health 1997; 12:783–91
34.
Streiner DL, Norman GR: Health Measurement Scales: A Practical Guide to Their Development and Use, 3rd edition. Oxford, Oxford University Press, 2003, p 117

Appendix 1. Participants

Lluís Gallart, M.D., Ph.D., and Jordi Castillo, M.D. (Staff Anesthetists, Department of Anesthesia, Hospital del Mar, Barcelona, Spain); Valentín Mazo, M.D. (Staff Anesthetist, Department of Anesthesia, Hospital Universitari Germans Trias i Pujol, Badalona, Spain); Sergi Sabaté, M.D., Ph.D. (Staff Anesthetist, Department of Anesthesia, Fundació Puigvert, Barcelona, Spain); Juan Manuel Campos, M.D. (Staff Anesthetist, Department of Anesthesia, Hospital de Sant Pau, Barcelona, Spain); Joaquín Sanchis, M.D., Ph.D. (Professor, Department of Pneumology, Hospital de Sant Pau, Barcelona, Spain); Guillem Paluzie, M.D. (Head of Department of Hospital Medical Records, Corporació de Salut del Maresme i La Selva, Calella, Catalonia, Spain); Erik Cobo, Ph.D. (Professor, Department of Statistics and Operational Research, Universitat Politècnica de Catalunya, Barcelona, Spain).

Appendix 2.  Example of an 8-item Short-form Health-related Quality-of-Life Questionnaire (English-language Version)
