Pain intensity is commonly reported using a 0–10 Numeric Rating Scale in pain clinical trials. Analysis of the change on the Pain Intensity Numerical Rating Scale as a proportion has most consistently correlated with clinically important differences reported on the patient's global impression of change. Correlating data from patients with breakthrough pain with a Pain Relief Scale and different global outcome measures will extend our understanding of these measures.
Data were obtained from the open titration phase of a multiple crossover, randomized, double-blind clinical trial comparing oral transmucosal fentanyl citrate with immediate-release oral morphine sulfate for the treatment of cancer-related breakthrough pain. Raw and percentage changes in the pain intensity scores from 1,307 episodes of pain in 134 oral transmucosal fentanyl citrate-naïve patients were correlated with the clinically relevant secondary outcomes of the Pain Relief Verbal Response Scale and the Global Medication Performance Scale. The raw and percentage changes were assessed over time and compared with the ordinal Pain Relief Verbal Response Scale and Global Medication Performance Scale.
The interaction between baseline pain intensity and study time was significant for the raw pain intensity difference (P = 0.034) across the four 15-min time periods but not for the percentage pain intensity difference (P = 0.26). We found similar results in comparison with the ordinal Pain Relief Verbal Response Scale (P = 0.0048 and P = 0.36, respectively) and the Global Medication Performance categories (P = 0.048 and P = 0.45, respectively).
The change in pain intensity in breakthrough pain was more consistent over time, and when compared with both the Pain Relief Verbal Response Scale and the Global Medication Performance Scale, when the percentage change was used rather than the raw pain intensity difference.
What We Already Know about This Topic
❖ Percentage change in pain ratings from baseline on a 0–10 verbal scale correlates better to patient perception of benefit with chronic pain treatment than the raw numerical difference
❖ Whether this also applies to treatment of acute breakthrough pain is not known
What This Article Tells Us That Is New
❖ In 134 cancer patients receiving breakthrough pain treatment, percentage change in pain ratings correlated better with global measures of effect than raw numerical difference
Given the inherently subjective nature of the symptom, the measurements of pain rely primarily on the verbal reports of patients.1–4 The multiple dimensions of pain, such as intensity, characteristics, pain relief, and global impressions of change, are considered important additional endpoints for pain clinical trials.5–10 However, for studies of pain-specific therapies, change in pain intensity over time is almost always the primary outcome. The pain intensity 0–100-mm Visual Analog Scale and the pain intensity 0–10 Numeric Rating Scale (PI-NRS) are commonly used metrics. The PI-NRS has become the more common choice because of its ease of use, a broader range of methods of administration, and evidence of consistent results across a wide range of languages and cultures.11,12
For chronic pain studies, a greater consistency between the change in the PI-NRS score and a clinically relevant outcome has been demonstrated using the percentage change compared with the raw change in the analysis of pain intensity data.13 The calculation of the percent change converts the change in the PI-NRS to a proportional measure. The substantial improvement in the association with patients' reports of their global improvement supports the concept that patients use the PI-NRS as a proportional scale to report their change in pain intensity. To our knowledge, this relationship has not been investigated in studies of rapid-onset breakthrough pain (BTP), nor has the way the measurements of pain change over time, given the relatively rapid resolution of this type of pain episode. In addition, no comparisons have been made to more specific global measures such as the patient's assessment of the overall performance of the analgesic medication and the achievement of specific levels of pain relief. Demonstrating the consistency of these chronic pain findings in additional pain syndromes and using different global anchors will provide important information about the relationship of these measures and allow important comparisons across a wider array of pain studies.
Materials and Methods
This study was approved by the Institutional Review Board of the University of Pennsylvania, Philadelphia, Pennsylvania. Each patient in the original clinical trial provided written informed consent before being enrolled.
The data used in this analysis were obtained from a multicenter randomized, blinded, placebo-controlled clinical trial of oral transmucosal fentanyl citrate (OTFC; ACTIQ®, Cephalon Inc, Frazer, PA) compared with oral immediate-release morphine sulfate for the treatment of cancer-related BTP. The study design and methods have been described in detail previously.14 In brief, 134 outpatients enrolled in this trial with chronic cancer-related pain controlled with long-acting opioid drugs and recurrent episodes of BTP adequately treated with immediate-release morphine sulfate. Because all patients were OTFC-naïve, an initial titration phase was used to find the appropriate dose of OTFC for each patient, starting at the lowest available dosage strength of 200 μg per OTFC unit. Subjects were titrated up until a single dose was found that controlled more than two episodes of the patient's target BTP in a row, to a maximum of 1,600 μg per OTFC unit. All 1,307 treated episodes that had adequate data for analysis were included in this study. Given the rapid onset of action of OTFC, acceptable pain relief was expected within 30 min after initiation of the first dose,15 and at 45 min for immediate-release morphine sulfate. If not achieved, a second dose of the OTFC, or a dose of the patient's original immediate-release morphine sulfate, could be taken as an “additional dose of rescue medication” for that episode of BTP.
The endpoints were collected at 15, 30, 45, and 60 min after the start of the administration of the OTFC dose or until the time the patient decided to take an additional dose of rescue medication. The primary endpoint was the change in the PI-NRS, and the secondary endpoint was pain relief on a five-point Verbal Rating Scale (PR-VRS). A global medication performance (GMP) rating was obtained at the end of the treatment for each episode. The PR-VRS Scale (0 [none], 1 [slight], 2 [moderate], 3 [lots], 4 [complete]) and the GMP Scale (0 [poor], 1 [fair], 2 [good], 3 [very good], 4 [excellent]) were used as reported by patients because these were assumed to be 0 at time 0. Data were collected using a paper patient diary. The data for each day were collected on a separate page, so that the previous day's information was not immediately available, but patients were not specifically blinded to their previous answer. The change in pain intensity was calculated as both (1) the raw pain intensity difference (PID = PI-NRS value − PI-NRS baseline) and (2) the percentage PID (%PID = [PID/PI-NRS baseline] × 100).
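The two change scores defined above can be illustrated with a minimal Python sketch. The function names and the example episodes are hypothetical and are not taken from the trial data; the sign convention follows the formula as written in the text (value minus baseline), so a reduction in pain yields a negative number.

```python
def pain_intensity_difference(baseline, value):
    """Raw PID as written in the text: PI-NRS value minus PI-NRS baseline."""
    return value - baseline


def percent_pid(baseline, value):
    """%PID = (PID / PI-NRS baseline) x 100; undefined for a baseline of 0."""
    if baseline == 0:
        raise ValueError("percentage PID is undefined for a baseline of 0")
    return pain_intensity_difference(baseline, value) / baseline * 100


# Hypothetical episodes (not trial data): (baseline PI-NRS, PI-NRS at a later time point)
episodes = [(8, 4), (4, 2), (6, 6)]
for baseline, value in episodes:
    pid = pain_intensity_difference(baseline, value)
    pct = percent_pid(baseline, value)
    print(f"baseline={baseline} value={value} PID={pid} %PID={pct:.1f}")
```

In this illustration, the first two episodes differ in raw PID (−4 vs. −2) but share the same percentage PID (−50%), which is the kind of baseline dependence the analysis examines.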
The first analysis evaluated the effect of baseline factors on the consistency of the patients' pain reports by examining changes over the full 60-min time period. The values for each treatment episode were stratified into groups by a number of the patient characteristics. The mean value of the change in pain intensity was calculated for each patient group at 15, 30, 45, and 60 min. To provide the best evaluation over time, only data from the 1,105 episodes in which the patient recorded outcomes for the full 60 min could be used (i.e., those who did not drop out to take an additional rescue dose). By definition, little change was expected for episodes in which the OTFC did not produce some degree of relief (i.e., episodes that required additional medication), and most of these records were truncated by 45 min. The data imputation methods necessary for the inclusion of these episodes added additional variability without useful data. However, as a sensitivity analysis, we repeated the analysis with all episode data using the last observation carried forward data imputation technique. The outcome variable for these models was the change in pain at the given time point (or percentage change) from baseline. Statistical interactions between patient characteristic groups and study time were tested using linear regression analyses clustered by subject.
The second analysis compared changes in the pain intensity to the change in the PR-VRS and the GMP Categorical Scale, using the values measured at the end of each treated episode. For these analyses, the change in pain intensity was calculated as the difference between baseline and 45 min because this was the last time point recorded for 95% of those patients who went on to take an additional dose of rescue medication, allowing us to use all treated episodes regardless of whether they required extra doses of rescue medication. We tested different patient characteristic groups for an interaction between the average change in pain intensity for each level of the PR-VRS and the GMP, both of which were used as the dependent variable in a linear regression model. The P values were adjusted for lack of independence through clustering by patient.
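The last observation carried forward imputation used in the sensitivity analysis can be sketched as follows. This is only an illustration of the general technique, assuming missing readings are marked with `None`; the function name and the example episode are hypothetical and do not reproduce the study's Stata code.

```python
def locf(scores):
    """Fill missing (None) readings with the most recent observed value.

    Leading missing values stay None because there is nothing to carry forward.
    """
    filled = []
    last_observed = None
    for score in scores:
        if score is not None:
            last_observed = score
        filled.append(last_observed)
    return filled


# Hypothetical episode (not trial data): PI-NRS at 0, 15, 30, 45, and 60 min,
# with the record truncated after 30 min because rescue medication was taken.
episode = [7, 6, 6, None, None]
print(locf(episode))
```

Because the truncated readings simply repeat the last recorded score, episodes completed this way contribute no new information at the later time points, which is consistent with the added variability noted in the text.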
Finally, a linear regression model was also used to test for the interaction between the average values of the PR-VRS (1) compared over study time and (2) separately compared with the GMP Categorical Scale. The same patient characteristics used for the pain intensity comparison were used to define groups, including the initial level of reported pain.
In all analyses, the baseline characteristics considered were the age categories (defined as 18–49, 50–59, 60–69, and 70+ yr), sex, tumor types, final effective therapeutic dose, and Baseline Pain Intensity Scores. The interaction with time and for each stratification factor was tested for both raw changes in pain intensity and percentage change to assess whether they differed according to the grouping of patient characteristic factors.
Analyses were performed using STATA version 8.0 (StataCorp LP, College Station, TX), and the graphs were produced using Excel 2007 (Microsoft, Redmond, WA).
The demographic data from this cohort have been published previously.14 Pertinent data are summarized here for convenience. Of the 134 patients starting the study, 93 achieved adequate analgesia using a single dose of OTFC and 89 agreed to enter the randomization phase. The reasons that patients dropped out have been carefully described elsewhere but were primarily because of standard opioid side effects or cancer-related events. The mean age of the 134 patients was 55 yr; they were 92% white and 47% women; and the primary cancers were the colon, breast, and lung. Of all the baseline patient characteristics used to group the various pain outcomes in this study, only the raw PID grouped by baseline pain intensity resulted in a statistically significant interaction with study time (fig. 1) and in the average value association with both PR-VRS (fig. 2) and GMP (fig. 3) categorical scales. In particular, patients who reported a higher numeric value for baseline pain intensity (e.g., baseline = 9 vs. baseline = 4) demonstrated a larger change in raw pain intensity consistently over time (fig. 1A: P = 0.034), reported a greater level of relief (fig. 2A: P = 0.048), and reported a higher performance level on the GMP Scale (fig. 3A: P = 0.013). For baseline = 9 versus baseline = 4, a change value of 8.75 versus 3.65, respectively, was the average change seen in patients who reported the GMP condition of excellent. Our sensitivity analysis of the pain level over time, using the whole dataset, demonstrated the same separation for raw pain intensity values but with increased variance in the analysis of the 45- and 60-min time points (data not shown).
Calculating the percentage PID resulted in more consistent patterns across the patient characteristic groups over time (fig. 1B), levels of PR-VRS (fig. 2B: P = 0.36), and levels of ordinal GMP categories (fig. 3B: P = 0.45). These comparisons demonstrate that the response seen in the raw PID result was highly dependent on the baseline pain intensity for a full range of values and conditions. In contrast, the profile of the percentage PID was less dependent on baseline pain intensity level over time or when compared with the ordinal categories of the GMP or the PR-VRS.
In considering the PR-VRS as a potential outcome, patient characteristic grouping of the PR-VRS score by the baseline factors showed no statistically significant interaction over time or with the GMP level including when grouped by the baseline pain intensity. In particular, the comparison over time (fig. 4A: P = 0.39) and the comparison with the GMP (fig. 4B: P = 0.52) were similar between patient characteristic groups.
Percentage Change versus Raw Change in Pain Intensity
Our findings support the improved consistency of a relationship between the percentage change in the analysis of the PI-NRS and the clinically important changes measured on the global outcomes, in the context of a clinical trial for a rapid-acting analgesic used to treat a rapid-onset BTP in patients with cancer. Demonstrating this finding in the setting of the use of different global outcome measures supports the consistency of this result across a broader array of study designs. The increased consistency of the percentage change in PI-NRS compared with the global outcome and as measured across study time supports the calculation of percentage as a way to adjust for the baseline pain and potentially as an appropriate primary outcome for such clinical trials. This expands on the previous demonstration of the same relationship in patients from 10 chronic pain studies.13 It raises the possibility that the analysis of the PID raw scores may lead to inconsistent results, especially if the starting pain scores vary between groups or across studies. Calculating the percentage PID generates a result that is more consistent with clinically relevant global measures of the outcome and has the potential to increase comparability across studies.
To put these findings into perspective, we should consider other studies and analysis techniques. For example, our results are consistent with a published study of 700 patients with acute pain treated with multiple doses of medications. In that study, the difference in the average pain intensity results between baseline pain groups was substantially reduced by the calculation of the percentage change.16
The most likely explanation for our findings is that the calculation of the PID as a percentage change brings this measure of pain intensity in line with the patient's global report of improvement, as represented here by the PR-VRS and GMP Scores. This supports the concept that patients use the PI-NRS as a percentage scale. As a result, in situations where baseline pain is variable across patients, failing to calculate treatment effects as a percentage change consistently across the range of baseline values may obscure true differences in the group treatment effects within a study and complicate comparisons across studies. For example, in the published study of chronic pain data (see paragraph above), the average baseline pain intensity in patients ranged from 6.2 in one study to 7.0 in another.13 This difference in baseline value (7.0 − 6.2 = 0.8) at the start of the two studies could potentially result in a 13% variation (0.8/6.2) in the size of the efficacy outcome reported by these two studies based only on the difference in patient-reported baseline pain. Because the range of the baseline pain intensity levels cannot be known before conducting a study, the percentage PID seems to be a more appropriate a priori analysis decision because it provides a more consistent value across all initial pain values.
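The arithmetic behind the 13% figure can be checked with a short Python sketch. The function name is hypothetical, and the 2-point raw change in the second part is an illustrative assumption rather than a value from either study.

```python
def relative_baseline_variation(baseline_low, baseline_high):
    """Difference between two mean baselines as a percentage of the lower one."""
    return (baseline_high - baseline_low) / baseline_low * 100


# Mean baseline pain intensities quoted in the text for the two chronic pain studies.
variation = relative_baseline_variation(6.2, 7.0)
print(f"variation attributable to baseline alone: {variation:.1f}%")

# Illustrative assumption: the same 2-point raw reduction seen from each baseline.
raw_change = 2.0
for baseline in (6.2, 7.0):
    print(f"baseline {baseline}: {raw_change / baseline * 100:.1f}% reduction")
```

The same raw change corresponds to a different proportional change at each baseline, so comparisons based on raw PID alone can differ between studies purely because of where patients started.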
In considering the use of a percentage change as the primary outcome for the analysis of group differences, there are additional statistical issues to consider. Vickers17 has shown that in the analysis of a truly normal dataset, the calculation of a percentage change may not preserve the normality of the data and may have less statistical power. However, this assumes that a constant absolute change is of equal importance across all levels of the scale, which is not generally true for numeric pain scales. The primary finding of our study is that change in pain has a better association with clinically relevant levels of global measures, when it is considered as a proportion. In his article, Vickers states that the statistical power will be more similar if the treatment effect is proportional, as is supported by our data. Therefore, any loss in statistical efficiency for a parametric analysis is not as large for proportional data. In addition, because nonparametric approaches to the analysis of pain data are preferred by several other authors,18,19 additional work will be necessary to establish whether the cost in statistical efficiency of the use of percentage change rather than absolute change is of concern. Even if a small cost in efficiency remains, our data support the clinical relevance of the use of a proportion in the analysis of change in pain intensity, which is an important consideration in making an appropriate choice for the primary analysis of clinical trial data.
Other precedents also exist for examining the percentage changes as a preferred method for defining clinically important differences for symptomatic conditions such as pain.20–23 For example, in the comprehensive evaluation of arthritis clinical trials developed by the Outcome Measures in Rheumatology Clinical Trials group, experts derived several levels of clinically important improvement from treatment, all of which are expressed as percentage change.24 The percentage change is now endorsed by the American Rheumatology Association as the standard criterion and is identified as an acceptable method in Food and Drug Administration guidelines for the development of new arthritis products.25 From the work of Moore et al.,26 using meta-analysis to combine outcomes from smaller clinical trials, a 50% cutoff point for the maximum total pain relief was established as the point representing a clinically important change, reasoning that it “is a simple clinical endpoint . . . easily understood by professionals and patients.” Other investigators have used percentage changes in outcomes as the means for defining important and clinically meaningful treatment differences in analgesic trials.27–29
Relationship of GMP and PR-VRS
The second important finding of our study is that a direct verbal measure of pain relief in short-term pain treatment trials is inherently consistent with the global outcome. The remarkable stability of the PR-VRS Measurement Scale over time and when compared with the GMP Score supports the concept that the GMP and the PR-VRS seem to be measuring a similar patient construct in the cancer BTP setting. In studies in which these endpoints may have been used as a primary outcome for pain treatment efficacy, the consistency of these findings with the percentage PID should allow a better comparison of results for meta-analyses and systematic reviews of acute pain and BTP.5,30–35 However, caution should be taken in extending these findings to chronic pain studies, in which there is evidence that pain relief measures are associated more consistently with change in mood than changes in pain intensity.36
Although our findings are consistent with other studies, the potential limitations of this study must also be considered. First, the findings may not generalize to other types of rapid-onset pain or BTP clinical studies. Cancer-related BTP is generally a self-limiting form of acute pain. It is possible that the response to BTP in patients who do not have cancer may be slightly different. Second, this population comprised mostly whites, and testing in other populations would be appropriate. Third, the global measures used here are different from those used in other studies. Although this is a strength, it is also possible that other global measures may provide slightly different results. Finally, the limited availability of other clinically relevant factors that can affect the perception and report of pain, such as affect, mood, and expectations, which are known to complicate the interpretation of results from clinical trials of pain therapies,37,38 prevents us from examining their effect on the relationship of pain intensity to the global outcome. However, comparisons in our study were among different scales used by the same patients and thus should not be affected by patient factors unless we presume that different measures are differentially affected in the same patient.
It is also important to acknowledge that the calculation of the percentage change requires some practical consideration. Clearly, any formula requiring a division will not have a value when dividing by 0. Although true, this is not a major issue in most pain clinical trials, because patients have to have some pain to participate. In addition, the number of values a percentage change can take gets smaller as the baseline gets smaller. Again, this is usually not a major issue because entry criteria in a pain study usually require enough pain (often ≥4/10) to warrant treatment. Finally, it should be apparent that percentage up and percentage down are different. For example, going from six to four is a decrease of 33%, whereas going from four to six is an increase of 50%. Although this can be handled statistically, by dividing by the maximum of the baseline and final pain value, the appropriateness of this approach has not been adequately tested. Because most studies are conducted to evaluate therapies that improve pain, the majority of patients end up at lower pain levels than baseline. For the comparison of two treatment groups, we would not expect the effect of such calculation issues to be different between treatment groups.
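The asymmetry between percentage increases and decreases, and the adjustment of dividing by the larger of the two values, can be sketched in Python. The function names are hypothetical, and returning 0 when both scores are 0 is an assumption made here for illustration; the text itself notes that the appropriateness of the adjusted approach has not been adequately tested.

```python
def percent_change(baseline, final):
    """Conventional percentage change relative to baseline; undefined at baseline 0."""
    if baseline == 0:
        raise ValueError("percentage change is undefined for a baseline of 0")
    return (final - baseline) / baseline * 100


def symmetric_percent_change(baseline, final):
    """Percentage change divided by the larger of the baseline and final values.

    Assumption for this sketch: if both scores are 0, the change is reported as 0.
    """
    denominator = max(baseline, final)
    if denominator == 0:
        return 0.0
    return (final - baseline) / denominator * 100


# The example from the text: six to four versus four to six.
for baseline, final in [(6, 4), (4, 6)]:
    conventional = percent_change(baseline, final)
    symmetric = symmetric_percent_change(baseline, final)
    print(f"{baseline} -> {final}: conventional {conventional:.1f}%, symmetric {symmetric:.1f}%")
```

With the conventional formula the two moves give a 33% decrease and a 50% increase, whereas dividing by the larger value gives changes of equal magnitude and opposite sign.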
Ideally, the analytical techniques presented here will need to be tested using datasets from other populations of patients using different analgesics to verify that percentage changes in pain intensity remain most consistent in other situations as well. However, because our findings are consistent with our previous analysis of a sample of patients with five different chronic pain syndromes,13 we have more confidence that replicating our procedures in data from other studies will demonstrate similar results.
We have reanalyzed data from the multiepisode titration phase of a clinical trial of BTP in patients with cancer, observing that a percentage change in pain intensity is better associated with PR-VRS and GMP and may provide a more consistent representation of the patient's response to treatment over time than the raw PID. Although the choice of outcome measures and analyses for future clinical trials will depend on the study question and study design, the use of a percentage change may provide a more standardized approach to the evaluation and interpretation of clinical trials for pain therapies. A more standard approach may improve our ability to compare the results across trials and to evaluate differences in the pain experiences between different populations, such as the report of pain by male and female patients and in different cultures.39–42 The consistency of the reported changes in the percentage pain intensity with the reported level of clinical benefit that was demonstrated across all demographic factors suggests a way to handle the interperson differences in numeric pain measures to provide more consistent results across clinical studies.