Patient education materials produced by national anesthesiology associations could be used to facilitate patient informed consent and promote the discipline of anesthesiology. To achieve these goals, materials must use language that most adults can understand. Health organizations recommend that materials be written at the grade 8 level or less to ensure that they are understood by laypersons. The authors, therefore, investigated the language of educational materials produced by anesthesiology associations.
Educational materials were downloaded from the Web sites of 24 national anesthesiology associations, as available. Materials were divided into eight topics, resulting in 112 separate passages. Linguistic measures were calculated using Coh-Metrix (version 3.0; Memphis, USA) linguistic software. The authors compared the measures to a grade 8 standard and examined the influence of both passage topic and country of origin using multivariate ANOVA.
The authors found that 67% of associations provided online educational materials. None of the passages had all linguistic measures at or below the grade 8 level. Linguistic measures were influenced by both passage topic (F = 3.64; P < 0.0001) and country of origin (F = 7.26; P < 0.0001). Contrasts showed that passages describing the role of anesthesiologists in perioperative care used language that was especially inappropriate.
Those associations that provided materials used words that were long and abstract. The language used was especially inappropriate for topics that are critical to facilitating patient informed consent and promoting the discipline of anesthesiology. Anesthesiology associations should simplify their materials and should consider screening their materials with linguistic software before making them public.
Knowledge of anesthesia among the general public is poor; although professional societies publish materials for patients to facilitate understanding and consent, the effectiveness of such measures is unknown.
Twenty-four national anesthesia society Web sites were explored, and 67% provided online information. All material was above grade 8 level (i.e., too complex for lay comprehension), especially text describing the anesthesiologist’s role. Such materials should be simplified and validated.
The stated goals of national anesthesiology associations include setting standards of practice,1–5 professional education,1,2,4–7 and promoting the field of anesthesiology.1,2,4,6,8 Patient education materials that are produced by these associations are intended to help the public understand the contribution that anesthesiologists make to perioperative care.1,2,4,6,8
These materials might also be useful to facilitate patient informed consent. Studies have consistently found that some patients do not have adequate knowledge of their upcoming procedures or alternative treatments.9–11 This is sometimes due to anesthesiologists who do not disclose adequate information to patients.10,11 The prevalence of day surgeries also leaves less time for patients to consider the information that anesthesiologists provide. Randomized trials have shown that providing patients with written materials increases their recall of information related to their upcoming surgeries.12 Supplementing the consent process with preoperative access to educational materials could help mitigate omissions and reinforce understanding.
Association materials must be accessed and understood by laypersons in order to facilitate informed consent and promote the profession. The proportion of patients and their families who access materials from anesthesiology association Web sites is unknown. While one survey study reported that 4% of patients seek out anesthesia-related information online,13 approximately 234 million major surgical procedures are performed worldwide each year.14 Thus, the absolute number of patients seeking anesthesia-related information is not trivial. Regardless of how often these materials are accessed, associations have a responsibility to offer materials that have the highest potential to inform patients and the public.
In English-speaking countries, national literacy assessments indicate that 42 to 53% of adults have deficient literacy skills.15,16 These skills roughly correspond to a reading level of less than grade 8.17 As a result, influential health organizations18–20 have recommended that documents written for patients should use language between the grade 6 and 8 levels. Educational materials should conform to this guideline in order to achieve their intended aims.
Four previous studies have looked at the language of anesthesiology-related educational materials.21–24 They found the language to be overly complex, but the strength of these conclusions is undermined by study limitations. First, these studies have only investigated quantitative readability,21–24 a simple linguistic measure based only on word and sentence length. Other linguistic measures such as word imageability and familiarity are important predictors for the understanding of a text.25 Second, these studies only analyzed educational materials in their entirety.21–24 A single document often covers multiple topics. Information related to some of these topics is especially critical for patients to understand if the patient education materials are to achieve their intended goals. Arguably, passages that describe the risks of anesthetic treatments are most critical if materials are to facilitate informed consent. Passages that deal with the role of anesthesiologists in patient care are most critical if materials are to promote the profession of anesthesiology. Since linguistic measures may vary by topic, an analysis of documents that mix topics is less useful in assessing whether the materials are conducive to achieving their intended goals. We, therefore, examined the availability and linguistics of topically organized content from online patient educational materials that were generated by anesthesiology associations.
Materials and Methods
A computational linguistic analysis of online educational materials was conducted. This study did not require approval from a research ethics board as it examined documents in the public domain and did not involve human participants, animal participants, or biologic samples.
Data Collection and Classification
In October 2014, two investigators (D.G. and D.P.) accessed and collected all available educational materials from the Web sites of 24 associations in six English-speaking countries (Canada, United States, United Kingdom/Ireland, Australia, New Zealand, and South Africa; appendix 1). These associations were chosen based on several criteria. First, they were all national associations. Second, the associations were membership based, in contrast with anesthesia foundations and institutes. Third, the majority of their members were physician anesthetists. The associations were identified by a combination of Web searching and Web linking from unspecialized national anesthesiology associations.
Educational materials on the Web sites were identified based on subpage titles including “for patients,” “FAQs,” or “patient information” and other related terms. Information directed toward anesthesiologists, such as details of upcoming events/conferences and procedures for joining the association, was not included. Only textual materials were downloaded; educational videos, audio clips, and infographics were excluded. This resulted in multiple documents from each association, with varying amounts of thematic overlap and coverage.
To generate portions of representative text (passages) for each association that were topically organized, a multistep data reduction procedure was required. First, irrelevant text related to Web site navigation, contact information, hyperlinks, and references was removed from all documents. Next, text from each association was combined into a single document, resulting in a master document for each association. Finally, the text in the master document was divided into passages of nonoverlapping topics. To accomplish the final task, a classification scheme was created based on a review of the education materials (table 1). Its creation also involved a series of steps to ensure validity.
The same two investigators reviewed all educational materials and used inductive thematic analysis to create an exhaustive classification scheme. They identified eight distinct topics across all of the materials, with each association addressing a different number of these topics.
This classification scheme was vetted by a more senior investigator (A.V.).
Formal definitions for each category were created to facilitate classification.
Two investigators (D.G. and D.P.) classified 500 sentences to allow assessment of interclassifier reliability.
Once reliability was established to be strong and the irrelevant text was removed, all sentences were classified into one of the eight topics.
Two metrics of availability were examined: the number of associations that provided any educational material and the number that provided educational materials covering all eight identified topics.
Coh-Metrix (version 3.0; Memphis, USA) was used to calculate the linguistic measures. Outcomes included the Flesch–Kincaid quantitative readability grade level, which outputs a U.S. reading grade level using average word and sentence length.26,27 Average word length and average sentence length were also evaluated independently. Content word overlap represented the proportion of content words shared between adjacent sentences,26 word familiarity represented how familiar a word was to the average adult,26 and word imageability represented how easy it was to construct a mental picture of a word.26
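To make the quantitative readability measure concrete, the following is a minimal Python sketch of the standard Flesch–Kincaid grade-level formula. It is not the Coh-Metrix implementation: the syllable counter is a rough vowel-group heuristic, the sentence splitter is naive, and both function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level of a passage of English text."""
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard Flesch-Kincaid grade-level coefficients.
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Illustrative passages written for this sketch, not taken from any association.
simple_text = "The doctor will help you sleep. You will not feel pain."
complex_text = (
    "Anesthesiologists administer pharmacological interventions "
    "facilitating perioperative unconsciousness."
)
simple_grade = flesch_kincaid_grade(simple_text)
complex_grade = flesch_kincaid_grade(complex_text)
```

Short sentences built from short words score far lower than a single sentence of long clinical terms, which is the behavior the grade 8 comparison relies on.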
Coh-Metrix provides norms for various types of texts, including science text.28 These linguistic norms are grouped into grade levels between kindergarten and grade 12.28 These norms were used to develop regression equations for each linguistic measure in order to convert the raw values to grade-level equivalents (appendix 2). This allowed for the discrepancies between the educational materials and the grade 8 standard to be calculated and analyzed. While traditional grade levels reach a maximum of 12, we opted to allow our regression equations to calculate grade levels beyond this. Capping the scale at grade 12 would make it impossible to distinguish writing suitable for grade 12 students from writing suitable for highly educated professionals (such as the authors of the educational materials). It would also artificially truncate the regression equations, since the outcomes themselves are unbounded and the complexity of writing can greatly exceed that seen in grade 12 textbooks. In light of this, we left the regression equations unbounded, ranging from zero to infinity. Based on the regression equations, a grade 8 standard was selected as it meets the recommendations of several influential health organizations.18–20 Furthermore, several health literacy experts hold that the grade 8 reading level demarcates the boundary between low literacy and literacy.17,29
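The conversion from raw values to grade-level equivalents can be sketched as follows. This is an illustration only: the actual equations are given in appendix 2, a simple linear form is assumed here, the norm values are hypothetical placeholders rather than Coh-Metrix norms, and the floor at zero is one reading of "unbounded, ranging from zero to infinity."

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# HYPOTHETICAL norms for one measure (mean sentence length in words),
# paired with the midpoint grade of each norm band. Not real Coh-Metrix data.
norm_sentence_length = [8.0, 12.0, 16.0, 20.0]
norm_grades = [1.0, 4.0, 7.0, 10.0]

intercept, slope = fit_line(norm_sentence_length, norm_grades)


def grade_equivalent(raw_value: float) -> float:
    """Convert a raw measure to a grade level; floored at 0, no upper cap."""
    return max(0.0, intercept + slope * raw_value)


def discrepancy(raw_value: float, standard: float = 8.0) -> float:
    """Positive values mean the passage is written above the standard."""
    return grade_equivalent(raw_value) - standard
```

Because no upper cap is applied, a passage with very long sentences maps to a grade above 12 instead of being clipped, so its distance from the grade 8 standard is preserved.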
Statistical analysis was done using SAS (version 9.2; SAS Institute, USA) and Microsoft Excel (version 14.0.7015.1000; Microsoft Corporation, USA). Interclassifier reliability of the two investigators was assessed using Cohen κ.30 The proportion of educational material passages that had linguistic measures at or below a grade 8 standard was determined.
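For readers unfamiliar with Cohen κ, the following minimal Python sketch shows how chance-corrected agreement between two classifiers is computed. The study itself used SAS; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one label to each item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement equals 1")

    return (observed - expected) / (1 - expected)


# HYPOTHETICAL topic labels for four sentences from two classifiers.
labels_a = ["risks", "risks", "roles", "roles"]
labels_b = ["risks", "roles", "roles", "roles"]
kappa = cohen_kappa(labels_a, labels_b)
```

In this toy example the raters agree on three of four sentences (observed agreement 0.75), but half of that agreement would be expected by chance, so κ is 0.5.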
The main outcome was the proportion of patient education materials with all language outcomes equal to or less than a grade 8 standard. We also conducted a sensitivity analysis to check whether this proportion was similar under a more liberal but less defensible grade 12 standard. The difference between each linguistic measure, expressed as a grade-level score, and the grade 8 standard was calculated. These difference variables were then included in an intercept-only multivariate ANOVA (MANOVA) model. Results from this model indicated whether each of the six individual linguistic measures and the collective set significantly differed from the grade 8 standard (difference greater than 0). Additionally, the influence of the country of origin and topic of the passage on the discrepancy between the linguistic measures and the grade 8 standard was evaluated using MANOVA. Results from this model determined whether each of the six individual linguistic measures and the collective set significantly differed across country and text category. Further, we compared each passage topic against each of the others to produce contrasts. A false discovery rate of 0.05 was set to adjust for multiple comparisons.31 In total, there were 43 P values calculated.
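The false discovery rate adjustment can be illustrated with a short Python sketch. The text does not name the procedure used, so the Benjamini–Hochberg step-up method is assumed here; the function name and the P values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR q."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= q * k / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            largest_rank = rank

    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# HYPOTHETICAL P values, chosen only to illustrate the procedure.
example_p = [0.001, 0.03, 0.04, 0.66]
example_rejected = benjamini_hochberg(example_p, q=0.05)
```

With four tests, the second-smallest P value must be at or below 0.025 to be declared significant, so a nominal P of 0.03 does not survive the adjustment even though it is below 0.05.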
The interrater reliability for document classification was “strong”32 (κ = 0.89; 95% CI, 0.87 to 0.93).
The proportion of associations that provided any educational material was 16/24 (66.7%). The proportion that provided material covering all eight identified topics was 6/24 (25.0%), with associations covering a mean of seven of eight topics (range, 2 to 8).
The proportion of classified passages having linguistic measures equal to or less than the grade 8 standard is shown in figure 1. No passage from any association had all linguistic measures concurrently meeting the grade 8 standard. When the standard was relaxed to a grade 12 level, 5 of 112 (4.5%) passages were considered appropriate. The intercept-only MANOVA showed that quantitative readability, sentence length, word length, word imageability, and content word overlap had grade levels above the grade 8 standard (all P < 0.0003; fig. 2). In contrast, the word familiarity grade level was not significantly different from the grade 8 recommendation (P = 0.09).
Influence of Topic and Country of Origin
The second MANOVA showed that both passage topic (F = 3.64; P < 0.0001) and country of origin (F = 7.26; P < 0.0001) influenced the linguistic measures. Specifically, passage topic influenced all linguistic measures except for sentence length (P = 0.66) and content word overlap (P = 0.03; not significant after false discovery rate adjustment). To show the linguistic measures of passages from different countries (fig. 3) and passages describing different topics (fig. 4) in a parsimonious manner, we averaged the grade levels of all linguistic measures for each passage to generate a composite language variable. Our contrast comparisons revealed significantly worse linguistic measures for passages explaining the roles of perioperative professionals compared to all other passage topics. They also revealed worse linguistic measures for passages explaining risks compared to passages explaining how patients should prepare for surgery and what they should expect to happen on the day of surgery (table 2).
Our study demonstrates that 67% of the investigated national anesthesiology associations provide online patient education materials, and no passage of text from any of the associations had linguistic measures that were all equal to or less than the grade 8 level. Only five passages met the standard when it was loosened to a grade 12 level. Language this complex makes the materials less capable of facilitating the informed consent process and of promoting and clarifying the important role of anesthesiologists in perioperative care. This represents a significant lost opportunity for the profession, patients, and the wider public.
Had we only examined quantitative readability, as did previous studies,21–24 we would have misclassified 15% of the inappropriate passages as being appropriate. We would have also underestimated the inappropriateness of the language. While the materials used language with quantitative readability at a grade 11 level, other linguistic variables were as high as the grade 15 level (fig. 2).
Contrasts revealed that passages dealing with topics that are critical to the intended functions of educational materials contain less appropriate language. The language used in explaining the roles of perioperative professionals was especially poor compared to the language used in the other topics. This is regrettable as many patients misunderstand the role that anesthesiologists play in patient care. Less than 50% of patients realize that anesthesiologists play roles in resuscitation, pain management, and the prevention of intraoperative awareness with recall.33 Furthermore, 30 to 40% of patients do not even realize that anesthesiologists are physicians.33,34 Our findings underscore the missed opportunity to correct misunderstandings and eliminate knowledge gaps that the public has regarding anesthesiologists. The other topic that had significantly less appropriate language than other passage types was “risks of anesthesia.” This is also concerning since understanding the risks of treatments is necessary to provide valid informed consent. This undermines the potential usefulness of the materials to facilitate informed consent.
Our study provides insight into how these educational materials can be improved. The linguistic measures with the highest grade levels were word imageability and word length. Therefore, materials should be altered to use shorter words that are easier for the reader to visualize. Reducing sentence length would also be a simple way to improve materials. Further, passages dealing with the same topics should also maximize content overlap by using terminology that is more consistent. Based on the contrasts, the professional associations should be especially careful in describing the roles of perioperative professionals, as this area was particularly concerning across all linguistic measures as compared to other passage topics.
Several limitations of this study merit discussion. While some associations supplemented their written materials with videos and infographics, we limited our analysis to textual materials amenable to analysis in Coh-Metrix. These aids, however, are primarily used as supplements to the textual materials and therefore cannot completely compensate for deficiencies in textual materials. There are also a number of alternative methods that could have been used to evaluate the appropriateness of educational materials.35 However, these involve subjective ratings of preidentified criteria and do not allow easy comparison to a grade-level recommendation. We also only analyzed six of the many linguistic measures calculated by Coh-Metrix.28 Other variables were excluded because they are designed to measure unparsed texts. While content word overlap is also intended for use on unparsed texts, it was included in this study as our thematic classification would not disadvantage the materials examined. Another limitation is that we do not know how many patients access these Web sites. As mentioned, previous research indicates that only 4% of patients seek information online regarding anesthesia.13 This, however, still represents a large number of patients. In addition, Web analytics indicate that there is a significant demand for anesthesia-related information. Traffic on anesthesiology association Web sites is as high as 340,000 hits per month (appendix 1), and “anesthesia” is used as a Google search term approximately 60,500 times per month.36 Even if a minority of people access these Web sites, associations still have a responsibility to create materials that are the profession’s best effort to represent itself.
Our investigation has several methodological strengths. It uses a more sophisticated set of linguistic measures and therefore is able to identify specifically what aspect of the language is problematic. The parsing of the text by topic also allowed us to find the specific educational topics that are being described using especially complex language. This is also the largest study of its kind, including associations from six countries, which allows for greater generalizability of the results.
In summary, many of the national anesthesiology associations examined did not provide any educational materials and those that did provided materials with inappropriate language. The language used was especially bad for topics that are critical to facilitating patient informed consent and promoting the discipline of anesthesiology. Anesthesiology associations should simplify their materials and should consider screening their materials with linguistic software before publication.
Supported by the Department of Anesthesiology and Perioperative Medicine, University of Manitoba, Winnipeg, Manitoba, Canada.
The authors declare no competing interests.