Critical Appraisal of the Literature
William F. Miser, MD, MA, Department of Family Medicine, The Ohio State University, Columbus, Ohio.
[J Am Board Fam Pract 12(4):315-333, 1999. © 1999 American Board of Family Practice]
Case 1
A 47-year-old perimenopausal woman, in your office for a well-woman examination, has a newspaper clipping given to her by a friend. The clipping reviews a recent article published in a well-known national medical journal that warns against the use of hormonal replacement therapy (HRT) because of an increased risk of breast cancer.[1] Although she is at low risk for this cancer and the findings of her breast examination are normal, she resists your recommendation to begin HRT. When you discuss with her the results of an article showing that postmenopausal use of estrogen reduces the risk of severe coronary heart disease,[2] she counters with another article from the same issue that concludes that cardiovascular mortality is increased in estrogen users.[3] As you review these studies, you fail to recognize that all have serious flaws. Nor do you have at hand the more methodologically sound articles that show the overwhelming benefit of HRT[4-6] with no increased risk of breast cancer.[7-9] She leaves your office triumphantly, without a prescription, and you feel confused about the overall benefit of HRT.
After you make a mental note to read more about HRT, you see your next patient, a 28-year-old man with allergic rhinitis. He hands you a study he obtained from the Internet, which concludes that the latest antihistamine is far superior in efficacy to all of the other antihistamines currently available on the market. He asks you for this new prescription, realizing that his health insurance company will not approve it unless you justify to them why he should take this particular antihistamine. You promise to review the article and call him later in the week with his prescription.
The mother of your next patient, a 12-year-old boy, requests a test that you have never heard of. She hands you yet another article, which suggests that physicians who do not offer this test are guilty of negligence. As you review this study, you wish that you remembered more about how to assess an article critically, and you hope that the rest of the day goes better.
The above scenarios are occurring more frequently as patients are increasingly gaining access to medical information and then looking to their physicians for its interpretation. Gone are the days when what the physician says goes unchallenged by a naive patient. The public is inundated with medical advice and contrary views from the newspaper, radio, television, popular lay journals, and the Internet, and physicians are faced with the task of damage control.
Physicians also encounter constantly changing recommendations for clinical practice and an information jungle.[10,11] With 6 million medical articles published each year, the amount of information available is overwhelming.[12] If clinicians, trying to keep up with all of the literature, were to read two articles per day, in just 1 year, they would fall 82 centuries behind in their reading!
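As a quick check of that arithmetic, the following sketch (in Python, using the figures from the text: roughly 6 million articles per year against a reading pace of two articles per day) reproduces the estimate.

```python
# Back-of-the-envelope check of the "82 centuries" figure,
# assuming roughly 6 million articles published per year
# and a reading pace of 2 articles per day (values from the text).
published_per_year = 6_000_000
read_per_year = 2 * 365                      # 730 articles read per year

backlog_after_one_year = published_per_year - read_per_year
years_to_clear_backlog = backlog_after_one_year / read_per_year

print(f"Unread articles after one year: {backlog_after_one_year:,}")
print(f"Years needed to clear that backlog: {years_to_clear_backlog:,.0f}")
print(f"Centuries behind: {years_to_clear_backlog / 100:.0f}")   # roughly 82
```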
Despite this gargantuan volume of medical literature, less than 15 percent of all articles published on a particular topic are useful.[13] Most articles are not peer-reviewed, are sponsored by those with commercial interests, or arrive free in the mail. Even articles published in the most prestigious journals are far from perfect. Analyses of clinical trials published in a wide variety of journals have described large deficiencies in design, analysis, and reporting; although quality has improved with time, the average quality score of clinical trials during the past two decades is less than 50 percent.[14-16] As a result, many diagnostic tests and therapies are not rigorously evaluated before becoming established as a routine part of practice, which leads to the widespread use of tests with uncertain efficacy and treatments that are either ineffective or that may do more harm than good.[17] Readers must thus take personal responsibility for judging the validity and clinical importance of the medical literature.
The challenge to physicians is to provide up-to-date medical care incorporating valid new information. Our ultimate goal as clinicians should be to help patients live long, functional, satisfying, pain- and symptom-free lives. To do so requires us to balance compassion with competence. One of the essential skills needed to maintain competence, to provide patients with the best possible care, and to do more good than harm is the ability to critically appraise the literature. We must be able to find potentially relevant information, filter out the best from the much larger volume of less credible information, and then judge whether to believe the information that remains.[12]
The two major types of studies (Figure 1) reported in the medical literature are (1) those that report original research (analytic, primary studies), and (2) those that summarize or draw conclusions from original research (integrative, secondary studies). Primary studies can be either experimental (an intervention is made) or observational (no intervention is made). The purpose of this article is to provide an overview of a systematic, efficient, and effective approach to the critical review of original research. This information is pertinent to physicians no matter what their setting, be it an academic medical center or a rural solo practice. Because of space limitations, this article cannot address everything in exhaustive detail, and the reader is encouraged to refer to the suggested readings at the end for further assistance.
Figure 1. Major types of studies found in the medical literature.
It is important that clinicians master the skills of critical appraisal of the literature if they are to apply evidence-based medicine to the daily clinical problems they encounter. Most busy clinicians do not have hours to spend critiquing an article, however; they need a brief and efficient screening method that allows them to know whether the information is valid and applicable to their practice. By applying the techniques offered here, it is possible to approach the literature confidently and base clinical decisions on "evidence rather than hope."[18]
This approach is modified and adapted from several excellent sources. In 1981 the Department of Clinical Epidemiology and Biostatistics at McMaster University published a series of useful guides to help the busy clinician critically read clinical articles about diagnosis, prognosis, etiology, and therapy.[19-23] These guides have subsequently been updated and expanded to focus more on the practical issues of first finding pertinent articles and then validating (believing) and applying the information to patient care.[18,24-43] The recommendations from these users' guides form the foundation of the approach presented here, to which techniques developed by Slawson, Shaughnessy, and Bennett[10,11] have been added and adapted.
With an article in hand, the process involves three steps: (1) screen for initial validity and relevance, (2) determine the intent of the article, and (3) evaluate the validity of the article based on its intent. This paper focuses on the type of study most germane to clinical practice: a therapeutic intervention. To make the most of this exercise, it would be helpful for the reader to obtain a copy of the article mentioned in case 2, and to follow the steps outlined below. The users' guides and other resources listed at the end of this paper are helpful in learning how to appraise other types of articles.
Case 2
Croup season is approaching, and you have a rather large pediatric population in your practice. Since you finished your residency, you have been treating croup with mist therapy but have been dissatisfied with its results. As you talk to a colleague about this problem, she hands you the following article, published in 1998 in the Journal of the American Medical Association: "Nebulized Budesonide and Oral Dexamethasone for Treatment of Croup -- A Randomized Controlled Trial."[44] You were taught that the use of corticosteroids for croup is controversial and should be reserved for those in the hospital. You have a few minutes before seeing your next patient but are unsure whether you have the time to read this article.
Step 1 - Screen for Initial Validity and Relevance
The first step when looking at an article is to ask whether the article is worth taking the time to review in depth. This question can be answered within a few seconds by asking six simple questions (Table 1). A stop or pause answer to any of these questions should prompt you to seriously consider whether you should spend the time to critically review the study. The article mentioned in case 2 will be used to illustrate these points.
Articles published in the major peer-reviewed journals have already undergone an extensive process to weed out flawed studies and to improve the quality of those subsequently accepted for publication. When an investigator submits a manuscript to a peer-reviewed journal, the editor typically first establishes whether the manuscript is suitable for that journal and, if so, sends it to several reviewers for analysis. Peer reviewers are not part of the editorial staff but usually are volunteers who have expertise in both the subject matter and research design. The purpose of peer review is to act as a sieve, detecting studies that are flawed by poor design, are trivial, or are uninterpretable. This process, along with the subsequent revisions and editing, improves the quality of the paper and its statistical analyses.[46-49] The Annals of Internal Medicine, for example, receives more than 1200 original research manuscripts each year. The editorial staff rejects one half after an internal review, and the remaining half are sent to at least two peer reviewers. Of the original 1200 submissions, only 15 percent are subsequently published.[50]
Because of these strengths, peer review has become the accepted method for improving the quality of the science reported in the medical literature.[51] This mechanism, however, is far from perfect, and it does not guarantee that a published article is without flaw or bias.[13] Other publication biases persist despite adequate peer review. Studies with statistically significant (positive) results and larger sample sizes are more likely to be written up and submitted by authors, and subsequently accepted and published, than are nonsignificant (negative) studies.[52-55] Also, the speed of publication depends on the direction and strength of the trial results; trials with negative results take twice as long to be published as positive trials.[56] Finally, no matter how good the peer-review system, fraudulent research, although rare, is extremely hard to recognize.[57]
The article you are assessing is published in the Journal of the American Medical Association (JAMA). You are almost certain that this journal is peer-reviewed, which is confirmed in their Instructions for Authors ("JAMA is an international, peer-reviewed, general medical journal..."). You answer "yes" to this question.
In the article you are assessing, you notice at the bottom of the first page that the study was performed in two university hospitals in Canada. There is no reason to believe the children with croup for whom you provide care differ from those seen in Canada, but you begin to wonder whether a study done in tertiary care centers is applicable to your practice. You decide to continue critiquing this article but make a mental note to reconsider this issue later.
To determine who sponsored the study, you review the information about the authors and look at the end of the article, where funding sources are acknowledged. You find that support came from several foundations and none from a company with a commercial interest in the drugs used in the study.
The answers to the next three questions dealing with clinical relevance to your practice can be obtained by reading the conclusion and selected portions of the abstract. Clinical relevance is important not only to physicians but also to their patients. Rarely is it worthwhile to read an article about an uncommon condition you have never encountered in your practice, or about a treatment or diagnostic test that is not and never will be available to you. Reading these types of articles might satisfy your intellectual curiosity but will not impact your practice. Slawson and his colleagues have emphasized that for a busy clinician, articles concerned with patient-oriented evidence that matters (POEMs) are far more useful than those articles that report disease-oriented evidence (DOE).[10,45] So, given a choice between reading an article that describes the sensitivity and specificity of a screening test in detecting cancer (a DOE) and one that shows that those who undergo this screening enjoy an improved quality and length of life (a POEM), you would probably want to choose the latter.
In only a few seconds, you have quickly answered six pertinent questions that allow you to decide whether you want to take the time to critically review this article. This weeding tool allows you to recycle those articles that are not relevant to your practice, thus allowing more time to examine the validity of those few articles that might have an impact on the care of your patients.
Step 2 - Determine the Intent of the Article
If you decide to continue with the article after completing step 1, your next task is to determine why the study was performed and what clinical question(s) the investigators were addressing.[60] The four major clinical categories found in articles of primary (original) research are (1) therapy, (2) diagnosis and screening, (3) causation, and (4) prognosis (Table 2). The intent of the article can usually be found by reading the abstract and, if needed, by skimming the introduction; the purpose of the study is usually stated in the introduction's last paragraph.
For the article mentioned in case 2, the investigators address a therapeutic intervention (the use of oral dexamethasone in treating mild-to-moderate croup). Because you are seriously considering including this therapeutic intervention in your practice, you decide you need to spend the time to validate critically the conclusions of the study.
Step 3 - Evaluate the Validity of the Article Based on Its Intent
After an article has successfully passed the first two steps, it is time to assess critically its validity and applicability to your practice setting. Each of the four clinical categories found in Table 2 (and illustrated in Figures 2 through 5) has a preferred study design and critical items to ensure its validity. The Users' Guides published by the Department of Clinical Epidemiology and Biostatistics at McMaster University provide a useful list of questions to help you with this assessment. Modifications of these lists of questions are found in Tables 3 through 6.
Figure 2. Randomized controlled trial, considered the reference standard for studies dealing with treatment or other interventions.
Figure 3. Cross-sectional (prevalence) study. This design is most often used in studies on diagnostic or screening tests.
Figure 4. Prospective and retrospective cohort study. These types of studies are often used for determining causation or prognosis. Data are typically analyzed using relative risk.
Figure 5. Case-control study, a retrospective study in which the investigator selects a group with disease (cases) and one without disease (controls) and looks back in time at exposure to potential risk factors to determine causation. Data are typically analyzed using the odds ratio.
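As a reminder of how those two measures are calculated, here is a minimal sketch (in Python, with purely illustrative counts not drawn from any study discussed here) computing the relative risk and the odds ratio from a 2 x 2 table of exposure and outcome.

```python
# Minimal sketch: relative risk (cohort designs) and odds ratio (case-control
# designs) from a 2 x 2 table of exposure vs outcome. The counts below are
# purely illustrative.
#                 outcome present   outcome absent
# exposed                a                 b
# not exposed            c                 d
a, b = 30, 70          # exposed:   30 with the outcome, 70 without
c, d = 10, 90          # unexposed: 10 with the outcome, 90 without

risk_exposed = a / (a + b)                      # 0.30
risk_unexposed = c / (c + d)                    # 0.10
relative_risk = risk_exposed / risk_unexposed   # 3.0

odds_ratio = (a * d) / (b * c)                  # (30*90)/(70*10) = 3.86

print(f"Relative risk: {relative_risk:.2f}")
print(f"Odds ratio:    {odds_ratio:.2f}")
```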
To get started on this step, read the entire abstract; survey the boldface headings; review the tables, graphs, and illustrations; and then skim the first sentence of each paragraph to grasp quickly the organization of the article. You then need to focus on the methods section, answering a specific list of questions based on the intent of the article. Because the article from case 2 deals with a therapeutic intervention, you begin reading the methods section of the article and address the questions listed in Table 3.
Randomization diminishes the potential for investigators selecting participants in a way that would unfairly bias one treatment group over another (selection bias). It is important to determine how the investigators actually performed the randomization. Although infrequently reported in the past, most journals now require a standard format that provides this information.[15] Various techniques can be used for randomization.[61] Investigators can use simple randomization; each participant has an equal chance of being assigned to one group or another without regard to previous assignments of other participants. Sometimes this type of randomization will result in one treatment group being larger than another, or by chance, one group having important baseline differences that might affect the study. To avoid these problems, investigators can use blocked randomization (groups are equal in size) or stratified randomization (subjects are randomized within groups based on potential confounding factors such as age or sex).
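The sketch below illustrates the difference between these schemes. It is a simplified illustration only; the group labels, block size, and strata are assumptions made for the example, not the procedure of any particular trial.

```python
# Simplified sketch of simple, blocked, and stratified randomization.
# Group labels, block size, and strata are illustrative assumptions.
import random

random.seed(42)
GROUPS = ["treatment", "control"]

def simple_randomization(n):
    """Each participant assigned independently; group sizes may drift apart."""
    return [random.choice(GROUPS) for _ in range(n)]

def blocked_randomization(n, block_size=4):
    """Within each block, half go to each group, keeping group sizes balanced."""
    assignments = []
    while len(assignments) < n:
        block = GROUPS * (block_size // 2)
        random.shuffle(block)
        assignments.extend(block)
    return assignments[:n]

def stratified_randomization(participants, stratum_key, block_size=4):
    """Run blocked randomization separately within each stratum (eg, sex or age band)."""
    strata, assignments = {}, {}
    for p in participants:
        strata.setdefault(stratum_key(p), []).append(p)
    for members in strata.values():
        labels = blocked_randomization(len(members), block_size)
        for person, label in zip(members, labels):
            assignments[person] = label
    return assignments

print(simple_randomization(10))
print(blocked_randomization(10))
people = [("pt1", "F"), ("pt2", "M"), ("pt3", "F"), ("pt4", "M"), ("pt5", "F"), ("pt6", "M")]
print(stratified_randomization(people, stratum_key=lambda p: p[1]))
```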
To determine the assignment of participants, investigators should use a table of random numbers or a computer that produces a random sequence. The final allocation of participants to the study should be concealed from both investigators and participants. If investigators responsible for assigning participants are aware of the allocation, they might unwittingly (or otherwise) assign those with a better prognosis to the treatment group and those with a worse prognosis to the control group. RCTs that have inadequate allocation concealment will yield an inflated treatment effect that is up to 30 percent better than those trials with proper concealment.[62,63]
In the article you are assessing, you find in the second paragraph of the methods section that the study design was an RCT, and that participants were randomized to one of three groups: nebulized budesonide and oral placebo, nebulized placebo and oral dexamethasone, or nebulized budesonide and oral dexamethasone. A central pharmacy randomized the patients into these groups using computer-generated random numbers in random blocks of 6 or 9 to help ensure equal distribution among the groups, stratified by study site. The randomization list was kept in the central pharmacy to ensure allocation concealment. You answer "yes" to this question and proceed with your assessment.
At the conclusion of the study, participants should be analyzed in the group in which they were originally randomized, even if they were noncompliant or switched groups (intention-to-treat analysis). For example, a study is designed to determine the best treatment approach to carotid stenosis, and patients are randomized to either carotid endarterectomy or medical management. Because it would be unethical to perform sham surgery, investigators and patients cannot be blinded to their treatment group. If, during the initial evaluation, participants randomized to endarterectomy were found to be poor surgical candidates, they might be treated medically. At the conclusion of the study, however, their outcomes (stroke, death) should be included in the surgical group, even if they did not have surgery; to do otherwise would unfairly inflate the benefit of the surgical approach.
Most journals now require a specific format for reporting RCTs that includes a chart allowing you to easily follow the flow of participants through the study.[15] In the article you are assessing, you notice in the chart that all but 1 of 198 participants were observed to study completion, which is an outstanding follow-up. You also notice in the methods section that the "primary analysis was based on the intention-to-treat principle." You answer "yes" to this question.
In the article you are assessing, you find that the dexamethasone syrup and placebo syrup were identical in taste and appearance. Since budesonide was slightly opaque and the nebulized placebo was clear saline, the investigators took the extra precaution of packaging the solutions in brown syringes. The investigators went further by asking the research assistants and participants to guess which intervention the patients received; their responses were no better than chance, indicating that the blinding was successful. Assured that this study was properly conducted and double-blinded, you answer "yes" to this question.
In the article you are assessing, you find the groups to be similar, but not exact, in sex, age, history, croup score, and vital signs. Those in the dexamethasone-treated group had a slightly higher percentage of preceding upper respiratory tract infections than did those in the budesonide-treated group (67 percent vs 54 percent). The investigators do not include an analysis on whether this difference is statistically significant, but it is unlikely that this small difference would be clinically significant. It is in areas such as these that you must use your clinical experience and judgment to determine whether small differences are likely to influence outcomes. You are satisfied that the groups are similar enough, and answer "yes" to this question.
The choice of statistical test depends on the study design, the types of data analyzed, and whether the groups are independent or paired. The three main types of data are categorical (nominal), ordinal, and continuous (interval). Observations made on different participants or groups are independent (eg, measuring serum cholesterol in two separate groups of participants), whereas repeated observations made on the same participant are paired (eg, measuring serum cholesterol in a participant before and after treatment). Based on this information, one can then select an appropriate statistical test (Table 7). Be suspicious of a study that has a standard set of data collected in a standard way but is analyzed by a test that has an unpronounceable name and is not listed in a standard statistical textbook; the investigators might be attempting to prove something statistically significant that truly has no significance.[65]
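To make the pairing of data type and test more concrete, the following sketch runs a few of the tests listed in Table 7 on simulated data using the scipy.stats library; all of the numbers are invented for the illustration.

```python
# Illustrative sketch: matching common tests from Table 7 to data type and
# pairing, using scipy.stats. All data below are simulated, not from any study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Continuous outcome, two independent samples -> Student t test
chol_a = rng.normal(200, 30, 50)          # eg, serum cholesterol, group A
chol_b = rng.normal(190, 30, 50)          # group B
t, p = stats.ttest_ind(chol_a, chol_b)
print(f"Independent t test: p = {p:.3f}")

# Continuous outcome, paired observations -> paired t test
before = rng.normal(200, 30, 50)
after = before - rng.normal(5, 10, 50)    # same participants re-measured
t, p = stats.ttest_rel(before, after)
print(f"Paired t test:      p = {p:.3f}")

# Ordinal outcome, two independent samples -> Mann-Whitney U
scores_a = rng.integers(1, 6, 40)         # eg, 1-5 well-being scale
scores_b = rng.integers(2, 7, 40)
u, p = stats.mannwhitneyu(scores_a, scores_b)
print(f"Mann-Whitney U:     p = {p:.3f}")

# Categorical outcome, two independent samples -> chi-square
table = np.array([[30, 70],               # cured vs not cured, drug A
                  [45, 55]])              # cured vs not cured, drug B
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"Chi-square:         p = {p:.3f}")
```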
There are two types of errors that can potentially occur when comparing the results of a study with reality (Figure 6). A type I error occurs when the study finds a difference between groups when, in reality, there is no difference. This type of error is similar to a jury finding an innocent person guilty of a crime. The investigators usually indicate the maximum acceptable risk (the α level) they are willing to tolerate in reaching this false-positive conclusion. Usually, the α level is arbitrarily set at 0.05 (or lower), which means the investigators are willing to take a 5 percent risk that any differences found were due to chance. At the completion of the study, the investigators then calculate the probability (known as the P value) that a type I error has occurred. When the P value is less than the α level (eg, < 0.05), the investigators conclude that the results are statistically significant.
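A small simulation makes the meaning of the α level concrete: if two groups are repeatedly drawn from the same population (so that no true difference exists), roughly 5 percent of the comparisons will still reach P < 0.05. The sample sizes and number of repetitions below are arbitrary choices for the illustration.

```python
# Simulation sketch: when two groups are drawn from the SAME population,
# about 5% of comparisons still show p < 0.05 -- the type I error rate.
# Sample sizes and the number of simulated trials are arbitrary choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_trials = 10_000

false_positives = 0
for _ in range(n_trials):
    group_a = rng.normal(120, 15, 30)     # both groups from the same distribution
    group_b = rng.normal(120, 15, 30)     # (ie, no true difference exists)
    _, p = stats.ttest_ind(group_a, group_b)
    if p < alpha:
        false_positives += 1

print(f"Proportion of 'significant' results: {false_positives / n_trials:.3f}")  # ~0.05
```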
Statistical significance does not always correlate with clinical significance. In a large study, very small differences can be statistically significant. For example, a study comparing two antihypertensive drugs in more than 1000 participants might find a statistically significant difference in mean blood pressures of only 3 mmHg, which in the clinical realm is trivial. A P value of < 0.0001 is no more clinically significant than a P value of < 0.05. The smaller P value only means there is less risk of drawing a false-positive conclusion (less than 1 in 1000). When analyzing an article, beware of being seduced by statistical significance in lieu of clinical significance; both must be considered.
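The blood pressure example can be reproduced with simulated data: with more than 1000 participants per group, a 3 mmHg difference in means is highly statistically significant even though it is clinically trivial. The standard deviation and sample size used here are assumptions made for the illustration.

```python
# Sketch of the blood pressure example: with large samples, a clinically
# trivial 3 mmHg difference becomes highly statistically significant.
# The 12 mmHg standard deviation and the sample size are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
drug_a = rng.normal(140, 12, 1200)   # mean systolic BP ~140 mmHg on drug A
drug_b = rng.normal(137, 12, 1200)   # mean systolic BP ~137 mmHg on drug B

t, p = stats.ttest_ind(drug_a, drug_b)
print(f"Mean difference: {drug_a.mean() - drug_b.mean():.1f} mmHg")
print(f"P value: {p:.2e}")           # tiny P value despite a trivial difference
```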
Instead of using P values, investigators are increasingly using confidence intervals (CIs) to determine the significance of a difference. The problem with P values is that they convey no information about the size of the differences or associations found in the study.[66] Also, P values provide a dichotomous answer -- the results are either significant or not significant. In contrast, the confidence interval provides a range that will, with high probability, contain the true value, thereby conveying more information than a P value alone.[67-69] The larger the sample size, the narrower and more precise the confidence interval. A standard method is the 95 percent confidence interval, which provides the boundaries within which we can be 95 percent certain that the true value falls. For example, a randomized clinical trial shows that 50 percent of patients treated with drug A are cured compared with 45 percent of those treated with drug B. Statistical analysis of this 5 percent difference shows a P value of < 0.001 and a 95 percent confidence interval of 0 percent to 10 percent. The investigators conclude this improvement is statistically significant based on the P value. As a reader, however, you decide that a potential range of 0 percent to 10 percent is not clinically significant based on the 95 percent confidence interval.
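The sketch below shows how such a confidence interval for a difference in cure rates can be computed with a normal approximation; the counts are illustrative assumptions and are not meant to reproduce the exact numbers of the hypothetical trial above.

```python
# Minimal sketch: 95% confidence interval for a difference in proportions,
# using a normal approximation. The counts are illustrative assumptions only.
from math import sqrt

cured_a, n_a = 200, 400        # drug A: 50% cured
cured_b, n_b = 180, 400        # drug B: 45% cured

p_a, p_b = cured_a / n_a, cured_b / n_b
diff = p_a - p_b

se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z = 1.96                        # critical value for a 95% confidence interval
lower, upper = diff - z * se, diff + z * se

print(f"Difference in cure rates: {diff:.1%}")
print(f"95% CI: {lower:.1%} to {upper:.1%}")
# A confidence interval that includes (or nearly reaches) zero suggests the
# observed difference may not be clinically meaningful.
```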
In the article you are assessing, no statistically significant difference was found among the groups in the change in croup score from baseline to final study assessment, time spent in the emergency department, hospitalization rates, or use of supplemental glucocorticoids. This trial is therefore considered negative (no differences found). As such, you go on to the next question, which addresses these types of studies.
Before the start of a study, the investigators should perform a power analysis to determine how many participants should be included. Unfortunately, this step is often not done. Only 32 percent of the RCTs with negative results published between 1975 and 1990 in JAMA, Lancet, and New England Journal of Medicine reported sample size calculations; on review, the vast majority of these trials had too few patients, resulting in insufficient statistical power to detect a 25 percent or 50 percent difference.[72] Other studies have shown similar deficiencies in other journals and disciplines.[14,48,73,74] Whenever you read an article reporting a negative result, ask whether the sample size was large enough to permit investigators to draw such a conclusion. If a power analysis was done, check to find out whether the study had the required number of participants. If a power analysis was not done, view the conclusions with skepticism -- it might be that the sample size was not large enough to detect a difference.
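For readers who want to see what such a calculation looks like, the following sketch applies the standard normal-approximation formula for comparing two proportions; the baseline rate, expected difference, α, and power are illustrative assumptions rather than values from any study discussed here.

```python
# Sketch of a standard sample size calculation for comparing two proportions
# (normal-approximation formula). Baseline rate, expected difference, alpha,
# and power below are illustrative assumptions, not values from the article.
from math import ceil, sqrt
from scipy.stats import norm

p1 = 0.40                # expected event rate in the control group
p2 = 0.30                # event rate in the treatment group (a 25% relative reduction)
alpha, power = 0.05, 0.80

z_alpha = norm.ppf(1 - alpha / 2)     # 1.96 for a two-sided test
z_beta = norm.ppf(power)              # 0.84 for 80% power

pooled = (p1 + p2) / 2
n_per_group = ((z_alpha * sqrt(2 * pooled * (1 - pooled))
                + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
               / (p1 - p2) ** 2)

print(f"Required sample size per group: {ceil(n_per_group)}")   # roughly 360 per group
```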
In the article you are assessing, you find that the investigators did perform a power analysis, which, using the criteria established above, required a minimum sample size of 62 participants per group. You notice that in the final analysis, each group had more than this number. You are assured that this study had adequate statistical power (that is, a low risk of a type II error), and answer "yes" to this question.
In the article you are assessing, the investigators treated each of the groups equally (except for the intervention drugs). They also examined factors that could have affected the outcome, such as earlier upper respiratory tract infections and episodes of croup. Since you can think of no other factors that might have influenced the results, you answer "no" to this question.
The investigators addressed this issue in the results section. Because the therapeutic interventions were equally effective, they recommended oral dexamethasone as the preferred therapy, since it is less expensive and easier to administer.
Conclusion of Case 2
After a thorough assessment of this article, you conclude it is well designed with valid results. You feel confident that oral dexamethasone should be stocked in your office during croup season and that you will institute this treatment as a standard within your practice. As you apply this therapy, you also make a commitment to monitor its benefits and risks to your patients and to scan the literature for future articles that might offer additional information about croup therapy. Consistency of the results in your practice, as well as across multiple published studies, is one characteristic of the scientific process that leads to acceptance and implementation.
With some practice and the use of the worksheets, one can quickly (within a few minutes) perform a critical assessment of an article. While performing this appraisal, it is important to keep in mind that few articles will be perfect. A critical assessment is rarely black-and-white, but often comes in shades of gray.[24] Only you can answer for yourself the exact shade of gray that you are willing to accept when deciding to apply the results of the study to your practice. By applying the knowledge, principles, and techniques described in this paper, however, you can more confidently recognize the various shades of gray and reject those articles that are seriously flawed.
Address reprint requests to William F. Miser, MD, MA, Department of Family Medicine, The Ohio State University, 456 West 10th Ave, Columbus, OH 43210.
Table 1. Is this article worth taking the time to review in depth?
A "stop" or "pause" answer to any of the following should prompt you to question seriously whether you should spend the time to review the article critically.
1. Is the article from a peer-reviewed journal? Articles published in a peer-reviewed journal have already gone through an extensive review and editing process. Yes (go on) / No (stop)
2. Is the location of the study similar to mine, so that the results, if valid, would apply to my practice? Yes (go on) / No (pause)
3. Is the study sponsored by an organization that might influence the study design or results? Yes (pause) / No (go on)
Read the conclusion of the abstract to determine the relevance of the following:
4. Will this information, if true, have a direct impact on the health of my patients, and is it something they will care about? Yes (go on) / No (stop)
5. Is the problem addressed one that is common to my practice, and is the intervention or test feasible and available to me? Yes (go on) / No (stop)
6. Will this information, if true, require me to change my current practice? Yes (go on) / No (stop)
Note: Questions 4 through 6 were adapted from Slawson and his Information Mastery Working Group.[45]
Table 2. Clinical category, description, and preferred study design
- Therapy: tests the effectiveness of a treatment, such as a drug, surgical procedure, or other intervention. Preferred design: randomized, double-blinded, placebo-controlled trial (Figure 2).
- Diagnosis and screening: measures the validity (is it dependable?) and reliability (will the same results be obtained every time?) of a diagnostic test, or evaluates the effectiveness of a test in detecting disease at a presymptomatic stage when applied to a large population. Preferred design: cross-sectional survey (comparing the new test with a reference standard) (Figure 3).
- Causation: assesses whether a substance is related to the development of an illness or condition. Preferred design: cohort or case-control study (Figures 4 and 5).
- Prognosis: determines the outcome of a disease. Preferred design: longitudinal cohort study (Figure 4).
Adapted from Greenhalgh.[60]
Table 3. Evaluating an article about therapy
If the article passes the initial screening in Table 1, proceed with the following critical assessment by reading the methods section. A "stop" answer to any of the following should prompt you to question seriously whether the results of the study are valid and whether you should use this therapeutic intervention.
1. Is the study a randomized controlled trial? Yes (go on) / No (stop)
   a. How were patients selected for the trial?
   b. Were they properly randomized into groups using concealed assignment?
2. Are the patients in the study similar to mine? Yes (go on) / No (stop)
3. Are all participants who entered the trial properly accounted for at its conclusion? Yes (go on) / No (stop)
   a. Was follow-up complete, and were few lost to follow-up compared with the number of bad outcomes?
   b. Were patients analyzed in the groups to which they were initially randomized (intention-to-treat analysis)?
4. Was everyone involved in the study (participants and investigators) "blind" to treatment? Yes / No
5. Were the intervention and control groups similar at the start of the trial? (check the study's Table 1) Yes / No
6. Were the groups treated equally (aside from the experimental intervention)? Yes / No
7. Are the results clinically as well as statistically significant? Yes / No
   a. Were the outcomes measured clinically important?
8. If a negative trial, was a power analysis done? Yes / No
9. Were there other factors that might have affected the outcome? Yes / No
10. Are the treatment benefits worth the potential harms and costs? Yes / No
Adapted from material developed by the Department of Clinical Epidemiology and Biostatistics at McMaster University[25] and by the Information Mastery Working Group.[10]
Table 4. Evaluating an article about diagnosis or screening
If the article passes the initial screening in Table 1, proceed with the following critical assessment by reading the methods section. A "stop" answer to any of the following should prompt you to question seriously whether the results of the study are valid and whether you should use this diagnostic test.
1. What is the disease being addressed, and what is the diagnostic test? ___________________________
2. Was the new test compared with an acceptable reference standard test, and were both tests applied in a uniformly blind manner? Yes (go on) / No (stop)
3. Did the patient sample include an appropriate spectrum of patients to whom the diagnostic test will be applied in clinical practice? Yes (go on) / No (stop)
4. Is the new test reasonable? What are its limitations? Explain: ___________________________
5. In terms of prevalence of disease, are the study participants similar to my patients? (Varying prevalences will affect the predictive value of the test in my practice.) Yes / No
6. Will my patients be better off as a result of this test? Yes / No
7. What are the sensitivity, specificity, and predictive values of the test? (A worked sketch follows this table.)
   Sensitivity = a/(a + c) = _______
   Specificity = d/(b + d) = _______
   Positive predictive value = a/(a + b) = _______
   Negative predictive value = d/(c + d) = _______
2 x 2 table (rows are the test result; columns are the reference standard result):
   Test positive: a (reference standard positive), b (reference standard negative)
   Test negative: c (reference standard positive), d (reference standard negative)
Adapted from material developed by the Department of Clinical Epidemiology and Biostatistics at McMaster University[27] and by the Information Mastery Working Group.[10]
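As a worked sketch of the calculations in Table 4, the following uses the same a, b, c, d cells; the counts are illustrative assumptions only.

```python
# Minimal sketch of the Table 4 calculations, using the same a, b, c, d cells
# (rows = test result, columns = reference standard). Counts are illustrative.
a, b = 90, 40      # test positive: 90 true positives, 40 false positives
c, d = 10, 860     # test negative: 10 false negatives, 860 true negatives

sensitivity = a / (a + c)                 # 0.90
specificity = d / (b + d)                 # 0.96
ppv = a / (a + b)                         # 0.69
npv = d / (c + d)                         # 0.99

print(f"Sensitivity: {sensitivity:.2f}")
print(f"Specificity: {specificity:.2f}")
print(f"Positive predictive value: {ppv:.2f}")
print(f"Negative predictive value: {npv:.2f}")

# Predictive values shift with prevalence (question 5): applied in a population
# where the disease is rarer, the same test yields a lower positive predictive
# value even though sensitivity and specificity are unchanged.
```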
Table 5. Evaluating an article about causation
If the article passes the initial screening in Table 1, proceed with the following critical assessment by reading the methods section. A "stop" answer to any of the following should prompt you to question seriously whether the results of the study are valid and whether the item in question is really a causative factor.
1. Was there a clearly defined comparison group of those at risk for, or having, the outcome of interest? Yes (go on) / No (stop)
2. Were the outcomes and exposures measured in the same way in the groups being compared? Yes (go on) / No (stop)
3. Were the observers blinded to the exposure when assessing the outcome, and to the outcome when assessing exposure? Yes (go on) / No (stop)
4. Was follow-up sufficiently long and complete? Yes (go on) / No (stop)
5. Is the temporal relation correct? Does exposure to the agent precede the outcome? Yes / No
6. Is there a dose-response gradient? As the quantity or duration of exposure to the agent increases, does the risk of the outcome likewise increase? Yes / No
7. How strong is the association between exposure and outcome? Is the relative risk (RR) or odds ratio (OR) large? Yes / No
Adapted from material developed by The Department of Clinical Epidemiology and Biostatistics at McMaster University.[29]
Table 6. Evaluating an article about prognosis
If the article passes the initial screening in Table 1, proceed with the following critical assessment by reading the methods section. A "stop" answer to any of the following should prompt you to question seriously whether the results of the study are valid.
1. Was an inception cohort assembled? Did the investigators select a specific group of people initially free of the outcome of interest and observe them over time? Yes (go on) / No (stop)
2. Were the criteria for entry into the study objective, reasonable, and unbiased? Yes (go on) / No (stop)
3. Was follow-up of participants adequate (at least 70 to 80 percent)? Yes (go on) / No (stop)
4. Were the patients similar to mine in terms of age, sex, race, severity of disease, and other factors that might influence the course of the disease? Yes (go on) / No (stop)
5. Where did the participants come from? Was the referral pattern specified? Yes / No
6. Were outcomes assessed objectively and blindly? Yes / No
Adapted from material developed by the Department of Clinical Epidemiology and Biostatistics at McMaster University[30] and by the Information Mastery Working Group.[10]
Table 7. Common statistical tests*
Independent samples:
- Categorical vs categorical (2 samples): chi-square; Fisher exact test
- Categorical vs categorical (3 or more samples): chi-square (r x r)
- Ordinal vs categorical (2 samples): Mann-Whitney U; Wilcoxon rank sum
- Ordinal vs categorical (3 or more samples): Kruskal-Wallis one-way analysis of variance (ANOVA)
- Ordinal vs ordinal: Spearman r; Kendall tau
- Continuous vs categorical (2 samples): Student t test
- Continuous vs categorical (3 or more samples): ANOVA
- Continuous vs ordinal: Kendall tau; Spearman r; ANOVA
- Continuous vs continuous: Pearson correlation; linear regression; multiple regression
Paired observations:
- Categorical (2 samples): McNemar test
- Categorical (3 or more samples): Cochran Q
- Ordinal: Wilcoxon signed rank; Friedman two-way ANOVA
- Continuous: paired t test
* The test chosen depends on study design, types of variables analyzed, and whether observations are independent or paired. Categorical (nominal) data can be grouped, but not ordered (eg, eye color, sex, race, religion, etc). Ordinal data can be grouped and ordered (eg, sense of well-being: excellent, very good, fair, poor). Continuous data have order and magnitude (eg, age, blood pressure, cholesterol, weight, etc).
Suggested Readings
A superb article that addresses the concepts of POEMs and DOEs.
An excellent article that reviews how to manage one's way through the medical information jungle without getting lost or eaten alive.
Provides useful techniques for reading a review article.
Original McMaster series -- despite being published in 1981, this series still has some great information.
A good series on approaches to keeping up with the medical literature.
The ultimate series, written from the perspective of a busy clinician who wants to provide effective medical care but is sharply restricted in time for reading.
A great series that complements the Users' Guides.