To the Editor: The editorial by Nuttall and Houle [1] on the article by Vincent et al. [2] is long on method but short on biology. Nonstatisticians, the majority of readers, will be trying to place the article in clinical context. The editorial does not help them do this, and its pejorative title gets it off to a bad start. Nuttall and Houle [1] give a useful assessment of propensity scoring (in general) but barely mention the data (in this study), and so risk giving the reader the impression that the content should be given only limited credence. An editorial that gave more prominence to the biology would have collated the evidence and achieved a broader perspective.

The article of Vincent et al. [2] is a hypothesis-generating study that questions the current consensus on erythrocyte transfusion therapy, in a similar manner to the findings of Connors et al. [3] (with respect to pulmonary artery catheterization) and, more recently, Karkouti et al. [4] and Mangano et al. [5] (with respect to aprotinin in cardiac surgery). The conclusions of Vincent et al. [2] may be disturbing, but to summarize the study with the truism “interpret with caution,” on the grounds of methodology, is an incomplete response that serves nobody. The key question is: Do the article’s findings reflect flawed methods, or do they suggest a problem with generalizability (e.g., might previous data derived from randomized controlled trials (RCTs) be driving current practice inappropriately)? The editorialists omit the latter possibility altogether, which is unfortunate because it may be the most important lesson from the article.

In looking at two studies with disparate results, such as those of Vincent et al. [2] and the landmark Transfusion in Critical Care (TRICC) study [6], the most useful initial response is to try to understand how they can be reconciled, or how what was apparently true before might not be true now. Transfusion practice has changed as a result of the TRICC trial, and transfusion of leukodepleted erythrocytes is now widespread. If these changes are truly beneficial, we would expect the impact of transfusion decisions to change also, with a reduction in “harmful transfusion.” If the changes in practice had resulted in overly conservative decision making, we might observe an increase in harm from “harmful nontransfusion.” Successful RCTs that are followed by evidence of a “downside” are not novel; the Randomized Aldactone Evaluation Study [7], which showed improved survival in patients receiving spironolactone, was followed by observational data suggesting an increase in morbidity and mortality from hyperkalemia [8]. Although the “harm” component in the TRICC trial seemed to stem from liberal transfusion in younger, healthier patients, later analysis suggested possible harm also from not transfusing in TRICC participants with known coronary artery disease [9].

The editorialists are right to address study methodology, but they should not leave the reader with an indictment of propensity scores and, by extension, observational studies. When discussing methodologic issues, we should keep in mind the suggestion that recent high-quality observational studies and RCTs often arrive at similar conclusions [10], the fact that highly cited randomized trials may produce incorrect or exaggerated results [11], and the suggestion that the durability of medical knowledge is unrelated to methodologic quality [12].

Even the best observational study is limited by an inability to draw causal inferences and by the presence of confounders. RCT design takes causality as a given and puts its trust in an ability to minimize (though of course not eliminate) confounders by randomization. But the problem of “unknown unknowns” remains, and the greater the number of unknown confounders that exist, the greater the likelihood of an imbalance. This problem is common to RCTs and observational studies alike and is probably most likely in small studies where our understanding of disease pathogenesis is limited. In a study with total n ≈ 1,600, where five independent confounders exist, each with an incidence of 20%, the probability of an imbalance for at least one confounder is almost 25% [13]. So studies A and B might disagree because A has greater balance of unknown confounders than B, and thus a better balance of confounders in a large observational study might “trump” randomization in a small RCT. This does not upgrade the status of observational studies, but it does explain why well-designed observational studies often arrive at conclusions similar to those of RCTs, and why some of the time they will correctly contradict previous RCT data. The controversial articles by Karkouti et al. [4] and Mangano et al. [5] may exemplify this, as suggested by the results of the recent Blood Conservation Using Antifibrinolytics in a Randomized Trial [14].
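
The arithmetic behind a figure of this size can be sketched under a simplifying assumption that is not taken from the cited analysis: if each of k independent confounders has the same probability p of ending up imbalanced between groups, the chance that at least one is imbalanced is 1 - (1 - p)^k. The function name and the value p = 0.05 below are hypothetical choices for illustration only; with five confounders they give roughly 23%, which is in the neighborhood of the “almost 25%” quoted in the preceding paragraph rather than a reproduction of it.

```python
def prob_any_imbalance(per_confounder_prob: float, n_confounders: int) -> float:
    """Chance that at least one of n independent confounders is imbalanced.

    Assumes every confounder has the same, independent probability of
    imbalance (an illustrative assumption, not the cited authors' model).
    """
    if not 0.0 <= per_confounder_prob <= 1.0:
        raise ValueError("per_confounder_prob must lie between 0 and 1")
    if n_confounders < 0:
        raise ValueError("n_confounders must be non-negative")
    # Complement rule: 1 minus the chance that every confounder is balanced.
    return 1.0 - (1.0 - per_confounder_prob) ** n_confounders


if __name__ == "__main__":
    # p = 0.05 is a hypothetical per-confounder imbalance probability.
    for k in (1, 5, 10, 20):
        print(f"{k:>2} confounders: {prob_any_imbalance(0.05, k):.3f}")
    # With k = 5 this prints 0.226.
```

The point of the sketch is the shape of the relationship, not the particular numbers: the probability of at least one imbalance climbs quickly as the number of unknown confounders grows, whatever the per-confounder value happens to be.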

The article of Vincent et al. [2] discusses whether leukoreduction might account for the findings but provides no data; the editorial does not mention it [1]. Neither the original article nor the editorial provides any convincing explanation (i.e., biologic basis) for the reported effect. We wonder whether additional analysis of the data in the article of Vincent et al. [2] might shed light on whether leukoreduction may be responsible for the apparently altered impact of transfusion, as has been suggested previously [15,16].

The data of Vincent et al. [2] and the recent TRICC reanalysis by Deans et al. [9] suggest that outcome is changing over time and that the interpretation of the TRICC trial is more complex than we thought. It will be some time before we get a clearer picture, but in the meantime, we should not treat propensity scoring as a straw man. Reading the article of Vincent et al. [2], we experience the judgment under uncertainty that pervades clinical life. Decisions to transfuse, and not to transfuse, are not made lightly, so it is a truism that these data should be viewed with caution. The function of the article, however, is to make us view with caution things that we think we know.

*St. Vincent’s University Hospital, Dublin, Ireland.


1. Nuttall GA, Houle TT: Liars, damn liars, and propensity scores. Anesthesiology 2008; 108:3–4
2. Vincent JL, Sakr Y, Sprung C, Harboe S, Damas P, on behalf of the Sepsis Occurrence in Acutely Ill Patients (SOAP) Investigators: Are blood transfusions associated with greater mortality rates? Results of the Sepsis Occurrence in Acutely Ill Patients Study. Anesthesiology 2008; 108:31–9
3. Connors AF Jr, Speroff T, Dawson NV, Thomas C, Harrell FE Jr, Wagner D, Desbiens N, Goldman L, Wu AW, Califf RM, Fulkerson WJ Jr, Vidaillet H, Broste S, Bellamy P, Lynn J, Knaus WA: The effectiveness of right heart catheterization in the initial care of critically ill patients. SUPPORT Investigators. JAMA 1996; 276:889–97
4. Karkouti K, Beattie WS, Dattilo KM, McCluskey SA, Ghannam M, Hamdy A, Wijeysundera DN, Fedorko L, Yau TM: A propensity score case-control comparison of aprotinin and tranexamic acid in high-transfusion-risk cardiac surgery. Transfusion 2006; 46:327–38
5. Mangano DT, Tudor IC, Dietzel C: The risk associated with aprotinin in cardiac surgery. N Engl J Med 2006; 354:353–65
6. Hébert PC, Wells G, Blajchman MA, Marshall J, Martin C, Pagliarello G, Tweeddale M, Schweitzer I, Yetisir E: A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. Transfusion Requirements in Critical Care Investigators, Canadian Critical Care Trials Group. N Engl J Med 1999; 340:409–17
7. Pitt B, Zannad F, Remme WJ, Cody R, Castaigne A, Perez A, Palensky J, Wittes J: The effect of spironolactone on morbidity and mortality in patients with severe heart failure. Randomized Aldactone Evaluation Study Investigators. N Engl J Med 1999; 341:709–17
8. Juurlink DN, Mamdani MM, Lee DS, Kopp A, Austin PC, Laupacis A, Redelmeier DA: Rates of hyperkalemia after publication of the Randomized Aldactone Evaluation Study. N Engl J Med 2004; 351:543–51
9. Deans KJ, Minneci PC, Suffredini AF, Danner RL, Hoffman WD, Ciu X, Klein HG, Schechter AN, Banks SM, Eichacker PQ, Natanson C: Randomization in clinical trials of titrated therapies: Unintended consequences of using fixed treatment protocols. Crit Care Med 2007; 35:1509–16
10. Benson K, Hartz AJ: A comparison of observational studies and randomized, controlled trials. N Engl J Med 2000; 342:1878–86
11. Ioannidis JP: Contradicted and initially stronger effects in highly cited clinical research. JAMA 2005; 294:218–28
12. Poynard T, Munteanu M, Ratziu V, Benhamou Y, Di Martino V, Taieb J, Opolon P: Truth survival in clinical research: An evidence-based requiem? Ann Intern Med 2002; 136:888–95
13. Shrier I, Platt RW, Steele RJ: Mega-trials versus meta-analysis: Precision versus heterogeneity? Contemporary Clin Trials 2007; 28:324–8
14. Fergusson DA, Hébert PC, Mazer CD, Fremes S, MacAdams C, Murkin JM, Teoh K, Duke PC, Arellano R, Blajchman MA, Bussières JS, Côté D, Karski J, Martineau R, Robblee JA, Rodger M, Wells G, Clinch J, Pretorius R, for the BART Investigators: A comparison of aprotinin and lysine analogues in high-risk cardiac surgery. N Engl J Med 2008; 358:2319–31
15. Hébert PC, Fergusson D, Blajchman MA, Wells GA, Kmetic A, Coyle D, Heddle N, Germain M, Goldman M, Toye B, Schweitzer I, vanWalraven C, Devine D, Sher GD: Clinical outcomes following institution of the Canadian universal leukoreduction program for red blood cell transfusions. JAMA 2003; 289:1941–9
16. Fergusson D, Hébert PC, Lee SK, Walker CR, Barrington KJ, Joseph L, Blajchman MA, Shapiro S: Clinical outcomes following institution of universal leukoreduction of blood transfusions for premature infants. JAMA 2003; 289:1950–6