The first rule of discovery is to have brains and good luck; the second is to sit tight and wait until you get a bright idea.¹

ANESTHESIOLOGISTS know well that high-quality research involves both generating and testing hypotheses. Generating high-quality hypotheses is becoming more important as funding decreases and scrutiny increases, and methods for developing hypotheses have been published.² One fruitful approach to hypothesis generation uses previously collected data to reveal undiscovered causal relations. Consequently, retrospective research, such as the article by Exadaktylos et al.³ in this issue of Anesthesiology, can be viewed as the hunt for questions rather than answers.

Exadaktylos et al.³ performed a retrospective cohort study of whether adding paravertebral blockade to general anesthesia alters recurrence and metastasis of breast cancer after mastectomy. Recurrence- and metastasis-free survival was 94% (95% confidence interval, 87–100%) and 82% (74–91%) at 24 months and 94% (87–100%) and 77% (68–87%) at 36 months in the groups that did and did not receive paravertebral blockade, respectively (P = 0.012). It is tempting to dismiss these findings outright for many reasons, including the inherent bias of retrospective studies, biologic implausibility, and the small cohort size; all of these issues are addressed below. However, we strongly agree with the authors’ intent of proposing their findings as a possible benefit in need of prospective verification. Such a two-phased approach should be the model for discovering moderate treatment effects that would require large-scale trials to prove, because large prospective trials should be undertaken only when there is significant support for the hypothesis.

Bias in retrospective studies is a serious threat to the validity of the findings. Bias comes in myriad forms, and it may be impossible for statistical analysis to detect and eliminate it.⁴ Propensity scores are increasingly being used to help determine whether observed outcome differences are true effects of the treatment or merely a sign that the risk factors for the outcome were not evenly distributed between the groups. Propensity scores differ from regression analyses because they take into account the variable’s influence on the likelihood that the subject receives treatment, the variable’s impact on outcome, and the variable’s impact on the relationship between treatment (or nontreatment) and outcome.⁵ Such analyses increase in importance in studies where randomization is impossible or impractical. However, it remains controversial whether anything other than randomization of more than 50 patients (per center) can ever equally distribute unknowns so that they do not confound the relationship between treatment and outcome.
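To make the idea of balancing risk factors concrete, the following is a minimal sketch of one common propensity-based adjustment, inverse probability weighting. It is an illustration only and not the analysis used by Exadaktylos et al. The cohort is synthetic, the single binary risk factor (`high_risk`) and all function names are hypothetical, and the propensity score is estimated as the simple fraction of treated patients within each risk stratum rather than from a fitted model.

```python
from collections import defaultdict


def build_cohort():
    """Synthetic cohort of (high_risk, treated, recurred) tuples.

    Hypothetical data: high-risk patients are less likely to be treated
    and more likely to recur, but within each risk stratum the treatment
    has no effect on recurrence.
    """
    cohort = []

    def add(n, high_risk, treated, n_recurred):
        for i in range(n):
            cohort.append((high_risk, treated, i < n_recurred))

    add(40, False, True, 4)    # low risk, treated: 10% recur
    add(10, False, False, 1)   # low risk, untreated: 10% recur
    add(10, True, True, 4)     # high risk, treated: 40% recur
    add(40, True, False, 16)   # high risk, untreated: 40% recur
    return cohort


def propensity_by_stratum(cohort):
    """Estimate P(treated | high_risk) as the treated fraction per stratum."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for high_risk, is_treated, _ in cohort:
        total[high_risk] += 1
        treated[high_risk] += int(is_treated)
    return {stratum: treated[stratum] / total[stratum] for stratum in total}


def crude_difference(cohort):
    """Unadjusted recurrence rate, treated minus untreated."""
    rates = {}
    for arm in (True, False):
        outcomes = [rec for _, trt, rec in cohort if trt == arm]
        rates[arm] = sum(outcomes) / len(outcomes)
    return rates[True] - rates[False]


def ipw_difference(cohort, propensity):
    """Recurrence rate difference after inverse probability weighting.

    Treated patients are weighted by 1 / e(x) and untreated patients by
    1 / (1 - e(x)), where e(x) is the propensity score of their stratum.
    """
    weighted_events = {True: 0.0, False: 0.0}
    weight_total = {True: 0.0, False: 0.0}
    for high_risk, is_treated, recurred in cohort:
        e = propensity[high_risk]
        weight = 1 / e if is_treated else 1 / (1 - e)
        weight_total[is_treated] += weight
        weighted_events[is_treated] += weight * int(recurred)
    treated_rate = weighted_events[True] / weight_total[True]
    untreated_rate = weighted_events[False] / weight_total[False]
    return treated_rate - untreated_rate


if __name__ == "__main__":
    cohort = build_cohort()
    scores = propensity_by_stratum(cohort)
    print("crude difference:", round(crude_difference(cohort), 3))
    print("weighted difference:", round(ipw_difference(cohort, scores), 3))
```

In this constructed cohort the crude comparison suggests a benefit that disappears after weighting, because the apparent effect came entirely from the uneven distribution of the risk factor. The sketch can only balance the factor it measures, which is the limitation the paragraph above raises about unknown confounders.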

Assuming that chance, bias, and confounding are unlikely explanations of the association of paravertebral blockade with improved recurrence-free survival, we must judge the likelihood of a cause–effect relation in the manner proposed by Sir Austin Bradford Hill, a preeminent British biostatistician, by examining several factors, including biologic plausibility.⁶ Biologic plausibility demands that a possible association fit existing biologic or medical knowledge. This is a potentially double-edged guide, because one is only likely to look for what one expects or is prompted to see, but at the same time, it helps to rein in the overuse of exploratory analyses from areas of erroneous association.

Exadaktylos et al.³ have proposed a bold link between paravertebral blockade and reduced cancer recurrence. The preservation of immunologic function in the paravertebral blockade group is cited as the cause for decreased recurrence and metastasis. Although there is some basic science research that frames their exciting theory, thoracic paravertebral block with 0.5% bupivacaine (1.5 mg/kg) has been shown to block somatosensory evoked potential transmission reliably only at the dermatome of injection.⁷ Further, 0.25% bupivacaine has been shown to produce far less inhibition of somatosensory evoked potentials than 0.5% bupivacaine.⁸ Consequently, the 0.2 ml/kg of 0.25% levobupivacaine bolus (plus infusion) used by Exadaktylos et al.³ seems insufficient to inhibit the stress response enough to allow unfettered immunologic function.

Randomization of subjects in large prospective studies seems to avoid the possible biases of retrospective and unrandomized studies. The difficulty is deciding which questions should be answered, because as we search for smaller effects, we require dramatically larger groups of subjects. Therefore, we face the ethical issues of allocating limited resources to specific questions and of avoiding the unethical use of subjects in studies too small to produce a reliable answer. The article by Exadaktylos et al.³ can be seen as an example of using current and past clinical care databases to produce preliminary data, which help to focus future research by highlighting a possible benefit and by estimating an effect size that allows for a more precise and thus economical enrollment target. Consequently, we are in favor of publishing high-quality retrospective studies, but we demand strict guidelines on quality (discussed below), and we strongly caution against a bias toward not publishing high-quality retrospective studies in which no relation between predictors and outcome has been demonstrated. Such publication bias could lead to subjects’ being enrolled in studies for which there should have been published reasons to believe that no effect would be seen.
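The relation between effect size and enrollment can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions; it is not a substitute for a formal survival-analysis power calculation. The 36-month survival figures of 77% and 94% come from the study discussed above, whereas the two-sided significance level of 0.05, the power of 80%, and the function name are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect p_treated vs p_control.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2,
    a normal approximation for two independent proportions.
    """
    if p_control == p_treated:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    difference = p_treated - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)


if __name__ == "__main__":
    # Effect size as reported at 36 months (77% vs 94% survival).
    reported = patients_per_group(0.77, 0.94)
    # Hypothetical scenario: the true benefit is only half as large.
    halved = patients_per_group(0.77, 0.855)
    print("per group, reported effect:", reported)
    print("per group, half the effect:", halved)
```

Because the required number grows with the inverse square of the difference, halving the assumed benefit multiplies the enrollment target severalfold, which is why a preliminary estimate of effect size matters for planning.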

Guidelines have been proposed for data collection from medical records for use in retrospective analyses to avoid biases from using data collected for patient care.⁹ These guidelines indicate two major risk categories: poor information flow from the patient to the medical record and poor information flow from the medical record to the research database. Poor information flow from the patient requires verification from original source documents and multiple physicians’ notes. When a key element, exposure, or risk factor is sought, a random sample of patients should be surveyed, with the survey being the accepted standard against which the abstracted data are judged.

Poor information flow from the medical record to the research database encompasses both the availability of the needed data in the medical record and the ability to extract the data in an unbiased fashion. Reviewing peer-reviewed studies with a focus on the method sections, discussions with expert clinicians, and input from biostatisticians and epidemiologists are critical to determining what data are needed both to answer the question and to minimize confounding. The next step is a pilot study to determine whether the necessary data are available from the medical records. Bias is further reduced by using blinded data collectors who have an appropriate paramedical education and who are armed with case report forms with clearly adjudicated definitions of disease, exposure, and confounders. The work of the data collectors should be compared to control interobserver variability, and random checks of the case report forms against source documents will further help to control error.
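One widely used way to quantify agreement between two data collectors is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The guidelines cited above are not quoted here as prescribing this statistic, so its use is an assumption, and the function name and example labels below are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who coded the same records.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each rater's
    marginal label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must code the same non-empty set of records")
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(freq_a) | set(freq_b)
    )
    if p_expected == 1:
        # Both raters used one identical label throughout; agreement is complete.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical abstraction of one binary item (1 = exposure recorded).
    collector_1 = [1, 1, 0, 0]
    collector_2 = [1, 0, 0, 0]
    print("kappa:", cohens_kappa(collector_1, collector_2))
```

A value near 1 indicates that the case report form definitions are being applied consistently, whereas a low value suggests that the definitions or the training of the collectors should be revisited before full data abstraction.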

The aging population will require increased perioperative care during a time of shrinking federal and private funding for research. The heightened scrutiny of the federal, private, and public sectors for improved outcomes will put anesthesiologists, as leaders in safety and quality improvement, in an ideal position to direct and lead trials to improve outcomes of our patients. Ethics, reality, and economy direct us to take advantage of accumulated clinical cases to help develop and guide future trials. It is therefore incumbent on anesthesiologists to use the tools, guidelines, and expertise of epidemiologists and statisticians to produce the most informative trials possible. It is also incumbent on journal reviewers and editors to balance enforcing best scientific literature practice with the need for thought-provoking articles that stimulate new areas of research and heated discussion; this intellectual tension can only improve academic anesthesiology.

*Department of Anesthesiology and Critical Care, University of Pennsylvania, Philadelphia, Pennsylvania.

1. Polya G: How to Solve It: A New Aspect of Mathematical Method. Princeton, Princeton University Press, 2004
2. McGuire WJ: Creative hypothesis generating in psychology: Some useful heuristics. Annu Rev Psychol 1997; 48:1–30
3. Exadaktylos AK, Buggy DJ, Moriarty DC, Mascha E, Sessler DI: Can anesthetic technique for primary breast surgery affect recurrence or metastasis? Anesthesiology 2006; 105:660–4
4. Christenfeld NJ, Sloan RP, Carroll D, Greenland S: Risk factors, confounding, and the illusion of statistical control. Psychosom Med 2004; 66:868–75
5. Weitzen S, Lapane KL, Toledano AY, Hume AL, Mor V: Principles for modeling propensity scores in medical research: A systematic literature review. Pharmacoepidemiol Drug Saf 2004; 13:841–53
6. Hill AB: Principles of Medical Statistics, 8th edition. London, Lancet, 1966
7. Richardson J, Jones J, Atkinson R: The effect of thoracic paravertebral blockade on intercostal somatosensory evoked potentials. Anesth Analg 1998; 87:373–6
8. Lund C, Hansen OB, Kehlet H: Effect of epidural 0.25% bupivacaine on somatosensory evoked potentials to dermatomal stimulation. Reg Anesth 1989; 14:72–7
9. Jansen AC, van Aalst-Cohen ES, Hutten BA, Buller HR, Kastelein JJ, Prins MH: Guidelines were developed for data collection from medical records for use in retrospective analyses. J Clin Epidemiol 2005; 58:269–74