The elucidation of the human genome and proteome offers the clinical researcher the opportunity to test thousands of hypotheses in a single study. 1 Clinical researchers in established areas such as oncology, diabetes, and cardiovascular disease have seized this opportunity. For example, PubMed searches of the disease name with the terms polymorphism and human yield more than 10,000 citations for cancer and 2,000 for diabetes or hypertension. Although there are many studies examining the genetic risk factors for variations in macroscopic structural disease that may trigger pain (e.g., coronary stenosis, rheumatoid arthritis, or lumbar disc herniation), clinical pain researchers have made little use of these genomic riches to study variations in pain processing, given a uniform initiating injury. The plausibility of clinical pain genetic studies is supported by the recent findings of major differences between inbred mouse strains in the behavioral response to more than 20 different acute and chronic pain conditions, including thermal and chemical stimulation of the skin and viscera and nerve injury. 2–4 These results suggest that genetic variants affecting pain processing are common and conserved in mammalian populations.
Perhaps pain researchers have neglected clinical genetics because, apart from a few rarities, 5,6 they have not noticed obvious familial inheritance of pain syndromes. However, familial inheritance only becomes obvious for alleles conferring a relative risk (RR) of 50 or more. Association studies, in which the frequencies of common allelic variants are compared in cases and controls, may detect relatively small increases in RR. Such studies have detected alterations in RR in Alzheimer disease, Crohn disease, venous thrombosis, diabetes, schizophrenia, osteoporotic fractures, and other common medical disorders. 7
Allele-based association studies differ from locus-based family genetic studies in several ways. In family studies, in which the subjects share whole chromosomes or large portions thereof, several hundred genetic markers spread through the chromosomes are sufficient to search the entire genome for a susceptibility locus. Several markers will be on the same preserved chromosome fragment as the disease susceptibility gene. Association studies are generally performed in unrelated individuals in whom only short segments of DNA are shared, so the density of markers studied over any length of DNA must be up to 1,000 times greater than in family studies. Conversely, two advantages of association studies are that they have greater power than family linkage studies to detect genetic effects of slight to modest size, 8 and that one has broad latitude in selecting unrelated subjects in a way that optimizes the clinical phenotype and standardizes environmental exposures and measurement methods.
Genetic association studies lend themselves to the study of the most perplexing problem in pain research: after apparently identical structural injuries to a variety of tissues, pain resolves rapidly in most patients and persists in others. Well-studied examples include shingles, diabetic neuropathy, spinal degeneration, limb amputation, mastectomy, thoracotomy, and whiplash injury. Only a small part of the variance in pain persistence has been explained by age, severity of the injury, personality traits, social support, or economic status. 9,10 Experience with analgesic clinical trials 11,12 suggests that one can best detect pharmacologic effects of moderate size in conditions where the measurable effects of the injury overwhelm the other environmental sources of variance. In contrast, clinical trials in many idiopathic pain conditions without characteristic structural lesions (e.g., fibromyalgia, chronic tension-type headache, irritable bowel disease, nonspecific low back pain) often yield inconsistent results even with large sample sizes. This poor signal-to-noise ratio may be due to the diverse environmental factors that prompt a small proportion of affected subjects to seek medical attention, and it might lower the sensitivity of genetic studies.
One could theoretically maximize one’s chances for success in the candidate gene “lottery” by testing every gene. Technological advances over the next 5–10 yr will probably make it feasible to correlate a trait with multiple markers in every human gene, 1,13 but at current genotyping costs of 20 cents an assay, it would cost $60,000 per patient to complete a 300,000–single nucleotide polymorphism panel. Moreover, sample sizes in the thousands would be required to overcome the statistical correction for this many multiple comparisons.
With current technology, association studies must restrict their focus to a limited set of candidate genes. The purpose of this article is to propose a systematic approach to improving the odds of success in examining any clinical phenotype in its early stages of genetic analysis. In particular, we will suggest a method to prioritize the choice of candidate polymorphisms and describe the relation between the sample size and the number of candidate loci one can examine simultaneously with adequate power to detect a given RR. Association studies of pain candidate genes have already shown promise 14 and may help to prioritize the hundreds of potential molecular targets for analgesic development, lead to diagnostic tests for risk of chronic pain, or identify novel pain mediators. The strategies outlined below will probably have to be modified in several years based on the actual results of the initial group of human pain candidate gene studies and technical advances in genotyping and bioinformatics.
Materials and Methods
We devised a method for prioritizing candidate genes and polymorphisms for chronic pain studies by rating each polymorphism in a candidate gene according to three criteria: (1) strength of evidence supporting involvement of the gene in pain processing, (2) frequency of the specific variant, and (3) likelihood that the polymorphism alters function. We assigned each polymorphism zero to three points in each of these categories, with a maximum score of 9.
1. Involvement in Pain Processing: We searched recent textbook chapters and reviews 15–17 and the Society for Neuroscience abstracts from 2000 and 2001 to compile a list of approximately 200 molecules (appendix 1) that basic scientists have reported to be involved in pain processing. We assigned one point for a single laboratory reporting involvement, two points for reports from multiple groups, and three points if there were multiple reports specifically describing involvement in animal models of neuropathic pain, the focus of our human genetic studies. Molecules without reported involvement in pain processing were excluded from our final priority list, even if they had maximum scores for the two other criteria.
2. Frequency: Two authors (I. B., M. B. M.) performed a PubMed search for each of the 200 molecules using the search query [molecule name] AND human AND polymorphism and read pertinent abstracts and articles. We assigned zero points if the population frequency (proportion of all chromosomes) of the variant was less than 3%, one point for 3–10%, two points for 10–30%, and three points for 30–50%.
3. Function: We examined articles resulting from the PubMed search for evidence of functional consequences of polymorphisms of the 200 candidate genes. We assigned one point if the variant changed an amino acid; two points for a single report that the variant changed the amount of message or protein expression or function, or was associated with a different clinical outcome from the common allele for a clinical phenotype; and three points for independent replication of any of these types of evidence.
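The three-criterion scoring described above can be sketched in a few lines of Python. This is only an illustration, not the procedure the authors actually ran: the function and class names are hypothetical, the pain-processing and function points are supplied by hand (they come from reading the literature), and the handling of the exact boundary values (3%, 10%, 30%) is an assumption, because the text gives the bands without saying which side each boundary falls on.

```python
from dataclasses import dataclass


def frequency_points(minor_allele_freq: float) -> int:
    """Criterion 2: points from the population frequency of the variant.

    Assumption: each lower bound is inclusive (e.g., exactly 10% scores 2).
    """
    if minor_allele_freq < 0.03:
        return 0
    if minor_allele_freq < 0.10:
        return 1
    if minor_allele_freq < 0.30:
        return 2
    return 3  # 30-50%


@dataclass
class Candidate:
    name: str                 # hypothetical label for the polymorphism
    pain_points: int          # criterion 1, 0-3, assigned from the literature
    minor_allele_freq: float  # proportion of all chromosomes
    function_points: int      # criterion 3, 0-3, assigned from the literature

    def score(self) -> int:
        """Total priority score, maximum 9."""
        return (self.pain_points
                + frequency_points(self.minor_allele_freq)
                + self.function_points)


def prioritize(candidates):
    """Drop candidates with no reported pain involvement, then rank by score."""
    eligible = [c for c in candidates if c.pain_points > 0]
    return sorted(eligible, key=lambda c: c.score(), reverse=True)


if __name__ == "__main__":
    # Entirely made-up candidates, for illustration only.
    demo = [
        Candidate("variant_X", pain_points=3, minor_allele_freq=0.35, function_points=3),
        Candidate("variant_Y", pain_points=0, minor_allele_freq=0.45, function_points=3),
        Candidate("variant_Z", pain_points=2, minor_allele_freq=0.05, function_points=1),
    ]
    for c in prioritize(demo):
        print(c.name, c.score())
```

Note that `variant_Y` is excluded despite high frequency and function points, mirroring the rule that molecules without reported involvement in pain processing are left off the priority list.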
Testing Individual Polymorphisms versus Haplotypes
Most published association studies focus on individual polymorphisms, but the current approach of many laboratories is to type many regularly spaced markers on the candidate gene to determine haplotype blocks, which are combinations of common alleles that occur together over 10- to 100-kilobase lengths of DNA. Over each of these DNA segments, approximately 90% of individuals have one of the two to five most common haplotypes. When loci are present in haplotype blocks, their information can be combined and haplotype can be used as genotype. If approximately six loci are tested per block, there is little loss of power to detect the effect of a moderately abundant but unknown functional locus between the tested markers for that block, compared with testing that locus specifically. 18 In the discussion that follows, we will usually refer to individual polymorphisms, but the same considerations and methods can be applied using the haplotype block as the unit.
We assume the investigator is studying a cohort of patients exposed to the same injury or disease and genotyping all of the patients, regardless of whether they develop persistent pain. (If patients are plentiful and inexpensive to screen, genotyping only those with clinical outcomes at either extreme of the range may be more statistically informative and cost efficient. 19)
Table 1 shows the range of experimental variables we included in the sample size calculations. For ease of calculation, we assume that the outcome is dichotomous; e.g., at a certain time after a uniform injury or disease, patients have either pain or no pain. Somewhat more information would be preserved if pain were analyzed as a lengthier ordinal scale or a continuous measure, yielding slightly greater power.
We assumed an autosomal dominant model of inheritance, i.e., one copy of the minor allele confers the maximal difference in phenotype from the homozygote for the major allele. However, depending on the relation between the phenotype and the amount and function of the protein coded by the gene, the appropriate model may be recessive (two copies of the minor allele change the phenotype) or codominant (two copies of the minor allele change the phenotype more than one copy). A dominant model will give more optimistic sample size estimates than a recessive model, but one can readily interconvert the two estimates by a method that will be illustrated. A codominant model 14 may offer greater power than the dominant model presented, by providing richer information from the range of zero, one, or two copies of the polymorphic allele.
Sample sizes are separately estimated for 10, 20, or 40% incidences of pain or other outcome of interest. RR is the incidence rate of pain in the group “exposed” to one or two variant alleles (P1), divided by the incidence rate in the “unexposed” group (P2) homozygous for the common allele. The association between the candidate genes and pain was assessed by the formulation of the hypothesis H0: P1 − P2 = 0 versus HA: P1 − P2 > 0, where P1 − P2 is the mean of the observed proportions p1 − p2 in the exposed and nonexposed groups. The test is based on the two-sample binomial test.
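As a rough illustration of the quantities just defined, the sketch below computes the observed RR and a one-sided test of H0: P1 − P2 = 0 against HA: P1 − P2 > 0. It uses the pooled normal approximation to the two-sample binomial test without a continuity correction; that specific form is an assumption made here for brevity, and the function names and the counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def relative_risk(pain_exposed, n_exposed, pain_unexposed, n_unexposed):
    """RR = incidence in carriers (p1) divided by incidence in non-carriers (p2)."""
    p1 = pain_exposed / n_exposed
    p2 = pain_unexposed / n_unexposed
    return p1 / p2


def one_sided_z_test(pain_exposed, n_exposed, pain_unexposed, n_unexposed):
    """Return (z, p_value) for HA: P1 - P2 > 0 using the pooled variance."""
    p1 = pain_exposed / n_exposed
    p2 = pain_unexposed / n_unexposed
    pooled = (pain_exposed + pain_unexposed) / (n_exposed + n_unexposed)
    se = sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # upper tail only: one-sided alternative
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 40 of 100 carriers and 40 of 200 non-carriers have pain.
    print(relative_risk(40, 100, 40, 200))
    print(one_sided_z_test(40, 100, 40, 200))
```

When many loci are tested, the p-value would be compared with the Bonferroni-adjusted threshold rather than with the nominal α.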
We assume a biallelic model; the candidate genes are in Hardy-Weinberg equilibrium with susceptibility (minor) allele A and normal (major) allele a, and allele A has frequency p in the population. Thus, in a recruited population, the expected sample size ratio of the nonexposed group (aa) to the exposed group (AA + Aa) is r = (1 − p)²/(p² + 2p(1 − p)). We assume a set of k candidate loci (k = 1 to 5,000) will be investigated. Candidate loci are assumed to be independent of each other. Multiple testing adjustment is performed using the usual Bonferroni correction (α* = α/k). Because this correction treats the tests as independent, the sample size could be overestimated if there is linkage between the candidate genes. Sample sizes were determined assuming RRs of 1.5, 2.0, or 2.5 and minor allele or haplotype frequency ranging from 5 to 30%.
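A short sketch of the two quantities defined in this paragraph, the Hardy-Weinberg ratio r of unexposed to exposed subjects and the Bonferroni-adjusted significance level. The function names are hypothetical, and the nominal α of 0.05 in the example is an assumption, since the value actually used is given in table 1 and not in the text.

```python
def exposure_ratio(p: float) -> float:
    """Ratio r of non-carriers (aa) to carriers (AA + Aa) under Hardy-Weinberg.

    p is the minor allele frequency.
    """
    unexposed = (1 - p) ** 2            # genotype aa
    exposed = p ** 2 + 2 * p * (1 - p)  # genotypes AA and Aa
    return unexposed / exposed


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test significance level when k independent loci are tested."""
    return alpha / k


if __name__ == "__main__":
    for maf in (0.05, 0.10, 0.20, 0.30):
        print(maf, round(exposure_ratio(maf), 3))
    print(bonferroni_alpha(0.05, 100))  # alpha = 0.05 is an assumed nominal level
```

At a minor allele frequency of 5%, carriers make up 9.75% of subjects, so roughly nine unexposed subjects are recruited for every exposed one; at 30% the two groups are nearly equal in size.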
The exposed and unexposed groups generally have unequal sample sizes caused by the disparate frequencies of the major and minor allele for the candidate genes. The sample size for the exposed group (nE) is estimated by (1) and (2) in appendix 2. 20 The sample size for the nonexposed group (nC) is then nC = r · nE. The total sample size needed is given as N = (r + 1) · nE.
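The calculation outlined above can be sketched as follows. Because the appendix equations are not reproduced in the text, this sketch substitutes the standard normal-approximation sample size formula for comparing two proportions with unequal group sizes, followed by the usual continuity correction; whether this matches the authors' equations (1) and (2) term for term is an assumption. The function name, the nominal α of 0.05, and the power of 80% are also assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(p_unexposed, rr, maf, k, alpha=0.05, power=0.80):
    """Return (n_exposed, n_unexposed, total) for a dominant model.

    p_unexposed: incidence of pain in subjects homozygous for the major allele (P2)
    rr:          relative risk conferred by carrying the minor allele (P1 / P2)
    maf:         minor allele frequency p
    k:           number of independent loci tested (Bonferroni: alpha / k)
    """
    p2 = p_unexposed
    p1 = rr * p2
    if not 0 < p1 <= 1:
        raise ValueError("rr * p_unexposed must lie in (0, 1]")

    # Ratio of unexposed to exposed subjects under Hardy-Weinberg equilibrium.
    r = (1 - maf) ** 2 / (maf ** 2 + 2 * maf * (1 - maf))

    z_alpha = NormalDist().inv_cdf(1 - alpha / k)  # one-sided, Bonferroni-adjusted
    z_beta = NormalDist().inv_cdf(power)

    p_bar = (p1 + r * p2) / (r + 1)
    q_bar = 1 - p_bar
    delta = abs(p1 - p2)

    # Uncorrected size of the exposed group.
    root = (z_alpha * sqrt((r + 1) * p_bar * q_bar)
            + z_beta * sqrt(r * p1 * (1 - p1) + p2 * (1 - p2)))
    n_raw = root ** 2 / (r * delta ** 2)

    # Continuity-corrected size of the exposed group.
    n_exposed = ceil(n_raw / 4 * (1 + sqrt(1 + 2 * (r + 1) / (n_raw * r * delta))) ** 2)
    n_unexposed = ceil(r * n_exposed)
    return n_exposed, n_unexposed, n_exposed + n_unexposed


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        print(k, required_sample_size(0.20, 2.0, 0.20, k))
```

Running the loop for increasing k gives a quick feel for how slowly the total grows once many loci are already being tested, which is the qualitative behavior the figures are described as showing.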
Prioritization of Candidate Polymorphisms
Table 2 shows the highest ranked candidate polymorphisms for chronic neuropathic pain studies. 21–40 Even at this early stage of genome research, many candidates ranked high by all our criteria based on replicated peer-reviewed articles. The largest single group codes for cytokines that have been implicated in peripheral and central nervous system mechanisms in many studies of neuropathic and inflammatory pain: interleukin (IL)-6, tumor necrosis factor (TNF)-α, IL-1β, IL-10, and IL-13. Polymorphisms of other inflammatory mediators are also represented, such as neuronal and inducible nitric oxide synthase and the B1 and B2 bradykinin receptors. Other polymorphisms affect genes for neurotransmitters thought to transmit or inhibit pain, their receptors, transporters, and metabolic enzymes: the serotonin transporter, prodynorphin, μ-opioid receptor, α2A-adrenergic receptor, kainate-3 receptor, catechol-O-methyltransferase, and tyrosine hydroxylase. Another group consists of nerve growth factors and their receptors, such as glial cell line–derived neurotrophic factor, its receptor RET, and brain-derived neurotrophic factor.
Figures 1–3 show total sample size plotted against the number of independent candidate polymorphisms tested if pain incidence is 10, 20, or 40% in the group without a pain-causing minor allele. (The calculations also apply to searches for pain-preventing alleles.) Within each of the three figures, the four panels represent the cases where the minor alleles have population frequency of 5, 10, 20, or 30%. Within each panel, the three curves represent an RR conferred by the minor allele of 1.5, 2.0, or 2.5. Figure 4 shows similar sample size curves for a case in which one tests up to 5,000 independent polymorphisms.
Figures 1–4 show that the RR is the main factor driving sample size. Although N increases considerably as one increases the number of candidate genes from 1 to 10 (figs. 1–3), only modest additional increases in N are needed to test hundreds or thousands of additional loci (fig. 4). As one increases the incidence of the less common phenotype (fig. 1 vs. fig. 2 vs. fig. 3) or population frequency of the minor allele (four panels within each figure), one can decrease N almost reciprocally.
For common minor alleles, one can approximate the required sample size for a recessive model from the curves in figures 1–3 that assume a dominant model. For example, consider a recessive model for a study of candidate genes with minor allele frequency of 30%. Nine percent of individuals will be homozygous, so sample sizes will be slightly greater than those illustrated for the dominant model curves in the upper left panels of figures 1–3 for minor allele frequency of 5%, in which case one would expect 9.75% of individuals to have at least one copy of the allele. For less common minor alleles, one may calculate the proportion of homozygotes, p² (where p is the minor allele frequency), and approximate the sample size from the curves in the upper left of figures 1–3 using the formula N (recessive) = (N from figure) × 0.0975/p². (The exact number will be slightly lower because with rare minor alleles, the large number of unexposed patients allows a small decrease in the number of those “exposed” to the homozygous recessive condition.) For a codominant model, one has a three-group study design, and one would need to make further assumptions regarding the RR pattern before deriving the necessary sample sizes.
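The dominant-to-recessive conversion described above amounts to a single scaling step, sketched here. The function name is hypothetical, and the starting value of N = 1,000 in the example is a made-up figure reading rather than a number taken from the figures.

```python
def recessive_sample_size(n_dominant_at_5pct: float, maf: float) -> float:
    """Approximate N for a recessive model from the dominant-model curve at 5% MAF.

    n_dominant_at_5pct: N read from the upper left panel (minor allele frequency 5%)
    maf:                minor allele frequency p of the recessive allele of interest
    """
    carriers_at_5pct = 1 - 0.95 ** 2  # 0.0975: at least one copy when MAF is 5%
    homozygotes = maf ** 2            # proportion "exposed" under a recessive model
    return n_dominant_at_5pct * carriers_at_5pct / homozygotes


if __name__ == "__main__":
    # Hypothetical reading of N = 1,000 from a dominant-model curve.
    print(recessive_sample_size(1000, 0.30))  # 9% homozygous: slightly above 1,000
    print(recessive_sample_size(1000, 0.10))  # 1% homozygous: nearly tenfold larger
```

As the text notes, the exact requirement for rare alleles will be somewhat lower than this approximation.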
Our search of the published pain and human genetics literature identified many attractive candidate polymorphisms, several of which have had preliminary confirmation in the published literature. 14,41 We do not claim that our scoring system is the optimal one or that our priority list includes all of the best candidate genes for neuropathic pain. We merely wish to illustrate how one might systematically approach the pain and clinical genetics literature to design one’s own study.
The prominence of cytokines and other inflammatory mediators on the list may reflect the adaptive value of immune gene mutations to maintain diverse responses to infectious agents. 42 Many of the polymorphisms in table 2 have minor alleles with population frequencies greater than 20%, increasing the power to detect dominant, codominant, or recessive effects, or interactions with common polymorphisms at other loci. Research groups differ on which specific site in many candidate genes is responsible for altered protein expression and disease risk; e.g., there are proponents of multiple rival TNF-α 43 and IL-6 promoter polymorphisms. 44 In such cases, if not in all, multiple regularly spaced markers across the gene should be typed.
We have illustrated this prioritization process with a search of the published literature, which was adequate to identify 20 common polymorphisms that have been known long enough to accumulate replicate evidence for altered function. However, dbSNP, Celera, or other specialized genetic databases are essential for prioritizing the much larger number of polymorphisms recently catalogued by the Human Genome Project or for selecting markers for a haplotype study of any candidate gene.
Predicting Functional Effects of Polymorphisms from Gene and Protein Databases
Our current information about the functional consequences of genetic variants lags far behind our knowledge of their location and frequency. In the absence of direct evidence about biochemical function in model systems or clinical phenotype, there are several potential methods for predicting functional impact, which differ according to whether the polymorphism is in a protein-coding region, a promoter region, or an intron. If the polymorphism is in a coding region, one can predict from the triplet code whether the polymorphism leaves the amino acid sequence unchanged, changes an amino acid, or more grossly disrupts translation. If the amino acid changes and the structure of the protein or a homolog is known, one may assign a score reflecting whether the amino acid change is likely to change structure or binding affinity in a functionally important region of the protein.
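The first step described above, reading the consequence of a coding-region polymorphism from the triplet code, can be sketched with the standard genetic code. This is a minimal illustration that assumes the reference and variant codons are already known and in frame; the function name and the example codons are hypothetical, and the sketch does not attempt the later structural scoring.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_variant(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"  # amino acid sequence unchanged
    if alt_aa == "*":
        return "nonsense"    # premature stop: grossly disrupts translation
    if ref_aa == "*":
        return "stop-loss"   # the normal stop codon is abolished
    return f"missense ({ref_aa}->{alt_aa})"


if __name__ == "__main__":
    print(classify_coding_variant("CTG", "CTA"))  # Leu -> Leu
    print(classify_coding_variant("GTG", "ATG"))  # Val -> Met
    print(classify_coding_variant("TGG", "TGA"))  # Trp -> stop
```

Under the scoring scheme in the Methods, a missense result alone would earn one function point; higher scores depend on experimental or clinical evidence that code of this kind cannot supply.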
If the protein lacks structural homologs in the Protein Data Bank, one may turn to new protein structure–modeling tools that identify secondary (α helices, β strands, and coils) and tertiary structures. There are several public online methods available for secondary structure predictions, including PSIPRED, PHD, and PROF.
For three-dimensional structure prediction, homology (also known as comparative) modeling or fold recognition methods are used. In homology modeling, the structure of the query sequence is derived from that of a related sequence whose three-dimensional structure has been solved biophysically and deposited in the Protein Data Bank. Homology modeling is not appropriate for proteins that do not have related structural homologs in the three-dimensional data banks. However, many proteins with little sequence similarity tend to fold in a somewhat similar fashion. Several relatively new fold recognition methods detect fold similarities to known three-dimensional structures by evaluating how well the amino acid sequence of an unknown protein fits into a fold of one of the known three-dimensional structures. 45 Current structural genomics initiatives are rapidly expanding the available catalog of three-dimensional structures of proteins.
Polymorphisms in promoters, which account for most of the high priority pain candidates in table 2, have been shown to affect gene function by changing the three-dimensional structure of the promoter and altering the binding of transcription factors or RNA polymerase. Several bioinformatics resources such as TRANSFAC, Eukaryotic Promoter Database, Data Base of Transcriptional Start Sites, and rSNP Guide 46 provide instant access to all known promoter sequences and transcription factor binding sites. Novel, as yet unknown promoter sequences can be identified by motif search using MEME or AlignACE (Aligns Nucleic Acid Conserved Elements) tools, but the development of tools to predict the effect of promoter polymorphisms on function is in its early stages. 47 Polymorphisms within introns may affect gene function by affecting regulatory motifs within introns or RNA splicing mechanisms, 48 but as with promoter polymorphisms, tools to predict these effects from the DNA sequence are not yet available.
The sample size calculations emphasize two main points. As the number of patients rises linearly, the number of tests possible goes up approximately exponentially (fig. 4). 49 Because large-scale genotyping costs are expected to decrease rapidly in the next 5 yr, it makes sense to collect samples that will permit studies of hundreds or thousands of candidate alleles or haplotype blocks. 18 Associations of common variants with many diseases 7 have already been replicated even though only a small proportion of human genes have been examined. If common variants affect function enough to cause these diseases, it is plausible that these or other variants may be discovered to affect the risk of persistent pain. The chances of finding such a link will be greater if many or all genes can be studied simultaneously.
Unlike power for additional genetic tests, which can be bought cheaply with a few more patients, one needs large increases in sample size to detect smaller increases in RR (figs. 1–4). The key question, which will only be answered by multiple studies, is the magnitude of RR conferred by pain-related candidate polymorphisms. If the chronic pain phenotype proves analogous to Crohn disease 50 or late-onset Alzheimer disease, 51 where single copies of the NOD2 or ApoE4 allele impart an RR of approximately 3, one can see from figure 4 that collection of several hundred patients will allow thousands of genes to be tested. However, most replicated common variant/common disease associations show RRs between 1.2 and 2.0. 7 RR values of 1.5 or less will require thousands of patients (fig. 4) to sensitively search the genome.
Pain researchers should not be discouraged by the latter estimate because RR imparted by a polymorphism can be increased by thoughtful definition of the phenotype. For example, the apparent RRs for breast cancer caused by BRCA1 and 2 mutations or those for some candidate genes for early-onset neurodegenerative diseases were increased to readily detectable levels by excluding older patients, in whom most cases were caused by factors other than the allele of interest.
Experience in animal models of pain and randomized trials has shown that the biologic signal-to-noise ratio may be amplified greatly in experimental designs in which there is a relatively severe and uniform injury, pain is assessed at multiple standardized time points to avoid recall bias, 52 and as many relevant covariates as possible are measured and accounted for. The causal links between pain and key covariates, including depression, anxiety, and alcohol and drug abuse, have received little examination in longitudinal studies. The ability to explain these portions of variance would improve the statistical power of genetic studies.
Many polymorphisms differ in frequency among various ethnic groups. Table 2 shows allele frequencies derived from studies of various white populations, but investigators should ascertain the frequencies of polymorphisms of interest in the populations they are considering. If the study population includes more than one ethnic group that differs in prevalence of both the polymorphism of interest and the disease phenotype, the study may be vulnerable to “population stratification” bias, illustrated by the following example. Consider a back pain genetic study performed in a region whose residents belong to the prosperous ethnic group A or the poor immigrants of ethnic group B. Group B subjects are more likely to have chronic back pain because more of them work at hard labor that causes back pain and have additional psychosocial stressors that tend to increase reported pain intensity. If a polymorphism at gene M has nothing to do with spinal degeneration or pain processing but has an allele 1 that is much more frequent in ethnic group B than in group A, an analysis of the whole group (A + B) may show a spurious association between allele 1 and back pain because of the asymmetric economic and occupational stratification of the mixed ethnicity population. Methods for detecting and correcting for population stratification are rapidly evolving 53 and include subanalyses that take into account the confounding variables; the use of family-based designs such as the transmission-disequilibrium test; and new methods such as genomic control, in which one types a large set of genetic markers spaced through the genome to detect and correct for more subtle ancestral subgroups than can be identified by conventional ethnic labels. Reviewers of genetic grant applications and papers often scrutinize the methods for detecting population stratification, so investigators should consult local experts about the most current approaches.
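The back pain example can be made concrete with a small calculation on expected counts. All numbers here are invented for illustration: group A is given a 10% carrier frequency for allele 1 and 10% pain prevalence, group B a 50% carrier frequency and 30% pain prevalence, and within each group pain is set to be independent of genotype, so the true RR is 1. The function name is hypothetical.

```python
def pooled_risk_ratio(strata):
    """Risk ratio (carriers vs. non-carriers) after pooling the given strata.

    strata: iterable of (n_subjects, carrier_frequency, pain_prevalence).
    Within each stratum, pain is assumed independent of genotype.
    """
    carriers = sum(n * f for n, f, _ in strata)
    noncarriers = sum(n * (1 - f) for n, f, _ in strata)
    carriers_with_pain = sum(n * f * rate for n, f, rate in strata)
    noncarriers_with_pain = sum(n * (1 - f) * rate for n, f, rate in strata)
    return (carriers_with_pain / carriers) / (noncarriers_with_pain / noncarriers)


if __name__ == "__main__":
    group_a = (1000, 0.10, 0.10)  # made-up values for ethnic group A
    group_b = (1000, 0.50, 0.30)  # made-up values for ethnic group B
    print(pooled_risk_ratio([group_a]))           # no association within A
    print(pooled_risk_ratio([group_b]))           # no association within B
    print(pooled_risk_ratio([group_a, group_b]))  # spurious association when pooled
```

With these inputs the pooled analysis gives an apparent RR of about 1.56 even though the allele has no effect in either group, which is why a subanalysis stratified by group removes the artifact.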
Potential Value of Genetic Studies of Human Pain
Association studies powered to examine many polymorphisms may improve pain diagnosis and therapy. Using current candidate gene technology, pharmaceutical firms could use human data to prioritize among the dozens of potential molecular targets addressed by drugs in their libraries. Such studies would be unable to assess candidate genes containing no common functional variants, but at least one quarter of human genes have common variants changing amino acid sequence in coding regions, 54 and others may cause functionally relevant changes in regulatory regions. As dense whole genome methods become available, human studies may either reveal totally novel therapeutic targets or provide information to help basic scientists to prioritize research on the hundreds of molecules up-regulated or down-regulated by painful injuries. 55–57
Most of our discussion has emphasized the potential of large studies to search many polymorphisms, but some clinical researchers may contemplate adding the assays of several polymorphisms as a secondary aim in smaller studies of pain treatment or physiology. Our analysis suggests that this may only be worthwhile if the polymorphisms are common and have substantial functional effects. In large or small studies, investigators might modify the criteria and weightings that we used in our candidate prioritization, but we suggest that they plan the research program systematically at the start, rather than merely test for any polymorphism whose assay happens to be available. The risk of the latter approach is that were the researcher lucky enough to hit on an important variant, the statistical correction for multiple tests might make it difficult to persuade a reviewer this was more than a chance result. An alternative to our approach of skimming the most attractive candidates from all categories of pain mediators might be to choose a group of candidate genes all involved with the same aspect of pain processing, even if major effects of the polymorphisms on function have not yet been proven. In this case, collection of outcome measures would be intensively focused on that aspect of pain.
The design of future genetic studies of pain will be shaped by future insights into fundamental questions about pain, such as whether subtypes of musculoskeletal, neuropathic, and visceral pain are processed by mostly similar or differing mechanisms.
Genetic methods may be among the most powerful tools available to answer these questions. We hope that clinical pain researchers will take full advantage of the new genomic resources to make human pain studies the equal of animal research as a source of fundamental discoveries.
The authors thank Raymond A. Dionne, D.D.S., Ph.D. (Chief, Pain and Neurosensory Mechanisms Branch, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, Maryland), and Michael J. Iadarola, Ph.D. (Senior Investigator, Pain and Neurosensory Mechanisms Branch, National Institute of Dental and Craniofacial Research, National Institutes of Health), for helpful discussions, and Brendan O’Donnell, M.D., Suzan Khoromi, M.D., and Hyung-Suk Kim, D.D.S., Ph.D. (all Clinical Fellows, Pain and Neurosensory Mechanisms Branch, National Institute of Dental and Craniofacial Research, National Institutes of Health), for reviewing the manuscript.
Appendix 1: List of Putative Pain-related Molecules Used in Prioritization
Neurotransmitters, Receptors, Transporters, and Metabolic Enzymes
Opioid receptors (μ, δ, and κ)
N-methyl-D-aspartate receptor: NMDA R1 subunit, R2A–D subunit
γ-Aminobutyric acid, GABAA, GABAB receptors and subtypes
Peripheral benzodiazepine receptor
Bradykinin receptors (BK1, BK2)
Vanilloid receptor, vanilloid receptor–like protein (VRP)
Pain-related cation-channel receptor (P2X3)
Calcitonin gene–related peptide and its receptor
Galanin and receptor
Cholecystokinin A and B receptors and precholecystokinin
Imidazoline receptor (I2)
Neurotensin and its receptors
Nicotinic cholinergic receptors
Muscarinic cholinergic M1 and M2 receptors
Vasoactive intestinal polypeptide and receptor
Serotonin receptors (5HT1A, B/D, 5HT2, 5HT3)
Nonopioid σ1 and 2 receptors
Somatostatin 2A receptor
Prostaglandin receptors (EP1–4)
Neuronal nitric oxide synthase (NOS1)
Inducible nitric oxide synthase (NOS2A)
Glutamate carboxypeptidase II
Adenosine kinase; adenosine 1 and 2A receptors
Equilibrative nucleoside transporter (ENT)
Glycine receptor and transporter
Cannabinoid receptor; anandamide
Endothelin-1 and ET-A receptor
α1- and α2A-adrenergic receptors
Na: Voltage-gated Na+ channels, α and β subunits
Tetrodotoxin-resistant sodium channel (SNS)
Sensory neuron–specific sodium channel (SNS-1)
Epithelial sodium channel/degenerin (DEG/ENaC)
Amiloride-sensitive epithelial sodium channel (BNaC2)
Potassium channels (GIRKs)
Calcium: N type, α1B subunit
Inflammatory Mediators and Their Receptors
Interleukin 1α, β, and γ receptors
Interleukin 2 receptor β
Tumor necrosis factor α and receptors (TNFR I, II)
Cyclooxygenase 1, 2
Leukemia inhibitory factor (LIF)
Phospholipase type 2
Growth Factors and Their Receptors
Nerve growth factor and neurotrophin receptor (Trk1)
Brain-derived neurotrophic factor (BDNF)
Neurotrophin receptors (NT 4/5, Trk B, NT3, Trk C)
Glial cell line–derived neurotrophic factor (GDNF)
GDNF family receptor alpha 1 (GFRα1)
Protooncogene (tyrosine kinase)
Low-affinity neurotrophin receptor (P 75 receptor)
Phospholipase C (γ1 and β)
GDNF family receptor α3 (GFRα3)
Extracellular signal–regulated protein kinase 1, 2; p38 mitogen-activated protein kinase (p38 MAPK)
Calcium calmodulin kinase II and IIα
Phospholipase C β4
Phospholipase C γ and ε
Phosphorylated (activated) cyclic AMP response element binding protein
Regulator of G-protein signaling (RGS3)
Protein kinase A
G protein–coupled receptor kinase 2
Nuclear factor κB
Protein kinase B (Akt)
Protein tyrosine kinase (Src)
The statistic used to compare proportions can be written as
[equation not recovered from the source]
where p̄ = (p1 + rp2)/(r + 1) and q̄ = 1 − p̄. The needed sample size for the control group is given as nC = r · nE. The continuity correction is given by the formula
[equation not recovered from the source]
for the uncorrected version derived by
[equation not recovered from the source]
where Q1 = 1 − P1, Q2 = 1 − P2, P̄ = (P1 + rP2)/(r + 1), and Q̄ = 1 − P̄.
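For orientation, the definitions above are consistent with the standard normal-approximation formulas for comparing two proportions when the control group is r times the size of the exposed group. The block below writes out that standard form as a hedged reconstruction: it is an assumption that the article's displayed equations match it term for term, and the symbols n′ (uncorrected exposed-group size), z with subscript α* (one-sided critical value at the Bonferroni-adjusted level), and z with subscript β (critical value for the chosen power) are introduced here for illustration.

```latex
% Test statistic, with n_C = r n_E and pooled proportion \bar{p}:
z = \frac{p_1 - p_2}
         {\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_E} + \dfrac{1}{n_C}\right)}},
\qquad \bar{p} = \frac{p_1 + r p_2}{r + 1}, \quad \bar{q} = 1 - \bar{p}

% Uncorrected sample size for the exposed group:
n' = \frac{\left[\, z_{\alpha^*}\sqrt{(r+1)\,\bar{P}\bar{Q}}
          + z_{\beta}\sqrt{r P_1 Q_1 + P_2 Q_2} \,\right]^2}
          {r\,(P_1 - P_2)^2}

% Continuity-corrected sample size for the exposed group:
n_E = \frac{n'}{4}\left[\, 1 + \sqrt{1 + \frac{2(r+1)}{n'\, r\, |P_1 - P_2|}} \,\right]^2
```

With r = 1 the last expression reduces to the familiar equal-group correction, which provides a quick consistency check on the unequal-group form.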