By Richard Bracken II, DO
Critical appraisal of the latest in evidence-based practices is important to many emergency medicine physicians, but how can we quantify whether the conclusion of a study is a real result or simply chance? In the past, researchers have relied heavily on p-values to assess statistical significance, but all that may be about to change. In July 2016, PulmCrit blogger Josh Farkas published an article applying the “fragility index” to the NINDS trial, in which researchers concluded that tPA improves outcomes for patients with ischemic stroke. Farkas’ article generated significant buzz in the FOAM world and could completely reframe our understanding of the trial’s outcomes and reproducibility.
In its most basic form, the fragility index is the number of outcomes that would have to turn out differently for the study results to lose statistical significance. The smaller the fragility index, the more “fragile” the study’s outcome. Consider a hypothetical trial in which 100 patients presenting with an abscess are randomized equally to I&D or I&D + antibiotics, with a primary outcome of treatment failure requiring hospitalization. If 13 patients in the I&D group were hospitalized compared to only 3 patients in the I&D + antibiotics group, the results of this trial would be statistically significant (p ≈ .01 by a two-sided Fisher’s exact test). However, if just two more patients in the I&D + antibiotics group had required hospitalization (5 rather than 3), the results would no longer be statistically significant (p = .07). The fragility index is therefore two outcomes or, since our example uses human subjects, two patients.
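The abscess example can be checked with a short script. This is only a minimal sketch under stated assumptions: the article does not say which statistical test to use, so the code assumes a two-sided Fisher's exact test, a significance threshold of .05, and the common convention of converting non-events to events one patient at a time in the arm with fewer events. The function names `fisher_exact_p` and `fragility_index` are hypothetical, chosen for illustration.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every possible table that is at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Patients who must switch from non-event to event before p >= alpha.

    Assumption: switches happen in the arm with fewer events.
    Returns None if the trial is not significant to begin with.
    """
    # Order the arms so that "lo" is the one with fewer events.
    (e_lo, n_lo), (e_hi, n_hi) = sorted([(events_1, n_1), (events_2, n_2)])

    def p_value(e):
        return fisher_exact_p(e, n_lo - e, e_hi, n_hi - e_hi)

    if p_value(e_lo) >= alpha:
        return None
    flips = 0
    while e_lo < n_lo:
        e_lo += 1  # one more patient in the low-event arm has the outcome
        flips += 1
        if p_value(e_lo) >= alpha:
            return flips
    return None


# Hypothetical abscess trial: 13/50 hospitalized with I&D alone,
# 3/50 with I&D + antibiotics.
print(fragility_index(13, 50, 3, 50))  # → 2
```

Under these assumptions the script reports a fragility index of 2 for the hypothetical trial. A different test (for example, chi-square) or a different threshold could shift the result, which is why the choice of test should be stated alongside any reported fragility index.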
Application of the fragility index to recent trials, including the NINDS trial discussed in Farkas’ article, illustrates its utility. The majority of the stroke literature suggests that stroke patients either experience no benefit from tPA or are harmed by it. In fact, of the 12 studies that evaluated tPA for ischemic stroke, only two showed a statistically significant benefit from systemic thrombolytics: ECASS III and NINDS. While the results of both the ECASS III and NINDS trials were statistically significant with p-values less than .05, the fragility index for ECASS III was only one patient and for NINDS was only three patients. The fragility index quantifies what many have been saying about both studies for years: while their results may be technically statistically significant, they are far from being significant in practical terms.
Different providers have different evidentiary thresholds for changing their clinical practice, but when it comes to administering medications as risky as tPA, that threshold is always heightened. Like most things in medicine, p-values should not be considered in isolation. The fragility index is one tool that adds context to the way we appraise medical literature, and its use in the FOAM world is likely to keep growing.
- Fragility index = the number of outcomes that would have to turn out differently for the study results to lose statistical significance.
- The smaller the fragility index, the more “fragile” and less reproducible a study’s outcome is.
- ECASS III and NINDS had fragility indices of one and three, respectively.
To read more FOAM:
Feinstein AR. The unit fragility index: an additional appraisal of “statistical significance” for a contrast of two proportions. J Clin Epidemiol. 1990;43(2):201-9.
Hacke W, Kaste M, Bluhmki E, et al. Thrombolysis with alteplase 3 to 4.5 hours after acute ischemic stroke. N Engl J Med. 2008;359(13):1317-29.
The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. Tissue plasminogen activator for acute ischemic stroke. N Engl J Med. 1995;333(24):1581-7.