Week 16 – MELD

“A Model to Predict Survival in Patients With End-Stage Liver Disease”

Hepatology. 2001 Feb;33(2):464-70. [free full text]

Prior to the adoption of the Model for End-Stage Liver Disease (MELD) score for the allocation of liver transplants, the determination of medical urgency was dependent on the Child-Pugh score. The Child-Pugh score was limited by the inclusion of two subjective variables (severity of ascites and severity of encephalopathy), limited discriminatory ability, and a ceiling effect of laboratory abnormalities. Stakeholders sought an objective, continuous, generalizable index that more accurately and reliably represented disease severity. The MELD score had originally been developed in 2000 to estimate the survival of patients undergoing transjugular intrahepatic portosystemic shunt (TIPS) placement. The authors of this 2001 study hypothesized that the MELD score would accurately estimate short-term survival in a wide range of severities and etiologies of liver dysfunction and thus serve as a suitable replacement measure for the Child-Pugh score in the determination of medical urgency in transplant allocation.

This study reported a series of four retrospective validation cohorts for the use of MELD in prediction of mortality in advanced liver disease. The index MELD score was calculated for each patient. Death during follow-up was assessed by chart review.

MELD score = 3.8*ln([bilirubin])+11.2*ln(INR)+9.6*ln([Cr])+6.4*(etiology: 0 if cholestatic or alcoholic, 1 otherwise)
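As a worked illustration of the formula above, here is a minimal Python sketch. It is not the authors' code: the function and parameter names are hypothetical, and it assumes bilirubin and creatinine are entered in mg/dL. It implements only the 2001 formula as written here, not the modified version later adopted by UNOS.

```python
import math


def meld_2001(bilirubin_mg_dl, inr, creatinine_mg_dl, cholestatic_or_alcoholic):
    """Return the 2001 MELD score exactly as given in the formula above.

    Assumption: bilirubin and creatinine are in mg/dL.
    The etiology term is 0 for cholestatic or alcoholic disease, 1 otherwise.
    """
    etiology = 0 if cholestatic_or_alcoholic else 1
    return (
        3.8 * math.log(bilirubin_mg_dl)      # ln(bilirubin)
        + 11.2 * math.log(inr)               # ln(INR)
        + 9.6 * math.log(creatinine_mg_dl)   # ln(creatinine)
        + 6.4 * etiology
    )


# Hypothetical patient: bilirubin 2.0, INR 1.5, Cr 1.2, viral etiology
print(round(meld_2001(2.0, 1.5, 1.2, cholestatic_or_alcoholic=False), 1))  # 15.3
```

Because each lab value enters through a natural logarithm, a value of 1.0 contributes nothing, and doubling a value always adds the same number of points regardless of the starting level.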

The primary study outcome was the concordance (c-statistic) between MELD score and 3-month survival. The c-statistic is equivalent to the area under the receiver operating characteristic curve (AUROC). Per the authors, “a c-statistic between 0.8 and 0.9 indicates excellent diagnostic accuracy and a c-statistic greater than 0.7 is generally considered as a useful test.” (See page 465 for further explanation.) There was no reliable comparison statistic (e.g. c-statistic of MELD vs. that of Child-Pugh in all groups).
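To make the c-statistic concrete, the following Python sketch computes it for a binary outcome (died vs. survived by 3 months). It is an illustration under simplifying assumptions, not the study's method: the function name and the toy data are hypothetical, and censoring and follow-up time are ignored.

```python
def c_statistic(scores, died):
    """Concordance for a binary outcome.

    Among all (died, survived) patient pairs, return the fraction in which
    the patient who died had the higher score. Ties count as half.
    For a binary outcome this equals the AUROC.
    """
    concordant = 0.0
    pairs = 0
    for score_dead, outcome_a in zip(scores, died):
        if not outcome_a:
            continue
        for score_alive, outcome_b in zip(scores, died):
            if outcome_b:
                continue
            pairs += 1
            if score_dead > score_alive:
                concordant += 1.0
            elif score_dead == score_alive:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one death and one survivor")
    return concordant / pairs


# Toy data: four patients, two of whom died
print(c_statistic([30, 22, 15, 10], [True, False, True, False]))  # 0.75
```

A value of 0.5 means the score ranks patients no better than chance, and 1.0 means every patient who died had a higher score than every survivor.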

C-statistic for 3-month survival in the four cohorts ranged from 0.78 to 0.87 (no 95% CIs exceeded 1.0). There was minimal improvement in the c-statistics for 3-month survival with the individual addition of spontaneous bacterial peritonitis, variceal bleed, ascites, and encephalopathy to the MELD score (see Table 4, highest increase in c-statistic was 0.03). When the etiology of liver disease was excluded from the MELD score, there was minimal change in the c-statistics (see Table 5, all paired CIs overlap). C-statistics for 1-week mortality ranged from 0.80 to 0.95.

In conclusion, the MELD score is an excellent predictor of short-term mortality in patients with end-stage liver disease of diverse etiology and severity. Despite the retrospective nature of this study, the MELD score represented a significant improvement upon the Child-Pugh score in determining medical urgency in patients who require liver transplant. In 2002, the United Network for Organ Sharing (UNOS) adopted a modified version of the MELD score for the prioritization of deceased-donor liver transplants in cirrhosis. Concurrent with the 2001 publication of this study, Wiesner et al. performed a prospective validation of the use of MELD in the allocation of liver transplantation. When published in 2003, it demonstrated that MELD score accurately predicted 3-month mortality among patients with chronic liver disease on the waitlist. The MELD score has also been validated in other conditions such as alcoholic hepatitis, hepatorenal syndrome, and acute liver failure (see UpToDate). The MELD score has since been modified several times. In 2006, the MELD Exception Guidelines offered extra points for severe comorbidities (e.g. HCC, hepatopulmonary syndrome). In January 2016, the MELDNa score was adopted and is now used for liver transplant prioritization.

References and Further Reading:
1. “A model to predict poor survival in patients undergoing transjugular intrahepatic portosystemic shunts” (2000)
2. MDCalc “MELD Score”
3. Wiesner et al. “Model for end-stage liver disease (MELD) and allocation of donor livers” (2003)
4. Freeman Jr. et al. “MELD exception guidelines” (2006)
5. 2 Minute Medicine
6. UpToDate “Model for End-stage Liver Disease (MELD)”

Image Credit: Ed Uthman, CC-BY-2.0, via Wikimedia Commons

Week 15 – CHADS2

“Validation of Clinical Classification Schemes for Predicting Stroke”

JAMA. 2001 June 13;285(22):2864-70. [free full text]

Atrial fibrillation is the most common cardiac arrhythmia and affects 1-2% of the overall population, with increasing prevalence as people age. Atrial fibrillation also carries substantial morbidity and mortality due to the risk of stroke and thromboembolism, although the risk of embolic phenomena varies widely across subpopulations. In 2001, the only oral antithrombotic options available were warfarin and aspirin, which had relative risk reductions of 62% and 22%, respectively, consistent across these subpopulations. Clinicians felt that high-risk patients should be anticoagulated, but the two common classification schemes, AFI and SPAF, were flawed. Patients were often classified as low risk in one scheme and high risk in the other. The schemes were derived retrospectively and were clinically ambiguous. Therefore, in 2001, a group of investigators combined the two existing schemes to create the CHADS2 scheme and applied it to a new data set.

Population (NRAF cohort): Hospitalized Medicare patients ages 65-95 with non-valvular AF not prescribed warfarin at hospital discharge.

Intervention: Determination of CHADS2 score (1 point each for recent CHF, hypertension, age ≥ 75, and DM; 2 points for a history of stroke or TIA)
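The scoring rule above is simple enough to sketch in a few lines of Python. The function and parameter names are hypothetical and chosen only for illustration.

```python
def chads2(recent_chf, hypertension, age, diabetes, prior_stroke_or_tia):
    """CHADS2 score as defined above: 1 point each for recent CHF,
    hypertension, age >= 75, and diabetes; 2 points for prior stroke or TIA."""
    return (
        int(bool(recent_chf))
        + int(bool(hypertension))
        + int(age >= 75)
        + int(bool(diabetes))
        + 2 * int(bool(prior_stroke_or_tia))
    )


# Hypothetical 78-year-old with hypertension and a prior TIA
print(chads2(False, True, 78, False, True))  # 4
```

The possible range is 0 to 6, with prior stroke or TIA being the only doubly weighted item (the "2" in the name).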

Comparison: AFI and SPAF risk schemes

Measured Outcome: Hospitalization rates for ischemic stroke (per ICD-9 codes from Medicare claims), stratified by CHADS2 / AFI / SPAF scores.

Calculated Outcome: performance of the various schemes, based on c statistic (a measure of predictive accuracy in a binary logistic regression model)

Results:
1733 patients were identified in the NRAF cohort. When compared to the AFI and SPAF trials, these patients tended to be older (81 years in NRAF vs. 69 in AFI vs. 69 in SPAF), to have a higher burden of CHF (56% vs. 22% vs. 21%), to be female (58% vs. 34% vs. 28%), and to have a history of DM (23% vs. 15% vs. 15%) or prior stroke/TIA (25% vs. 17% vs. 8%). The stroke rate was lowest in the group with a CHADS2 = 0 (1.9 per 100 patient-years, adjusting for the assumption that aspirin was not taken). The stroke rate increased by a factor of approximately 1.5 for each 1-point increase in the CHADS2 score.

CHADS2 score    NRAF adjusted stroke rate (per 100 patient-years)
0               1.9
1               2.8
2               4.0
3               5.9
4               8.5
5               12.5
6               18.2
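The "factor of approximately 1.5 per point" can be checked directly against the table. The short Python sketch below (variable names are hypothetical) divides each rate by the one before it and also computes the average multiplicative step across the whole range.

```python
# Adjusted stroke rates per 100 patient-years for CHADS2 scores 0 through 6,
# copied from the table above
rates = [1.9, 2.8, 4.0, 5.9, 8.5, 12.5, 18.2]

# Ratio of each rate to the rate one point lower
step_ratios = [high / low for low, high in zip(rates, rates[1:])]

# Geometric mean of the six steps: the constant factor that would carry
# the score-0 rate to the score-6 rate in six equal multiplicative steps
geometric_mean = (rates[-1] / rates[0]) ** (1 / (len(rates) - 1))

for score, ratio in enumerate(step_ratios, start=1):
    print(f"score {score - 1} -> {score}: x{ratio:.2f}")
print(f"average step: x{geometric_mean:.2f}")
```

Each step falls between roughly 1.43 and 1.48, and the average step is about 1.46, which is consistent with the rounded figure of 1.5 quoted in the text.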

The CHADS2 scheme had a c statistic of 0.82 compared to 0.68 for the AFI scheme and 0.74 for the SPAF scheme.

Implication/Discussion
The CHADS2 scheme provides clinicians with a scoring system to help guide decision making for anticoagulation in patients with non-valvular AF.

The authors note that the application of the CHADS2 score could be useful in several clinical scenarios. First, it easily identifies patients at low risk of stroke (CHADS2 = 0) for whom anticoagulation with warfarin would probably not provide significant benefit. The authors argue that these patients should merely be offered aspirin. Second, the CHADS2 score could facilitate medication selection based on a patient-specific risk of stroke. Third, the CHADS2 score could help clinicians make decisions regarding anticoagulation in the perioperative setting by evaluating the risk of stroke against the hemorrhagic risk of the procedure. Although the CHADS2 is no longer the preferred risk-stratification scheme, the same concepts are still applicable to the more commonly used CHA2DS2-VASc.

This study had several strengths. First, the cohort was from seven states that represented all geographic regions of the United States. Second, CHADS2 was pre-specified based on previous studies and validated using the NRAF data set. Third, the NRAF data set was obtained from actual patient chart review as opposed to purely from an administrative database. Finally, the NRAF patients were older and sicker than those of the AFI and SPAF cohorts, and thus the CHADS2 appears to be generalizable to the very large demographic of frail, elderly Medicare patients.

As CHADS2 became widely used clinically in the early 2000s, its application to other cohorts generated a large intermediate-risk group (CHADS2 = 1), which was sometimes > 60% of the cohort (though in the NRAF cohort, CHADS2 = 1 accounted for 27% of the cohort). In clinical practice, this intermediate-risk group was to be offered either warfarin or aspirin. Clearly, a clinical-risk predictor that does not provide clear guidance in over 50% of patients needs to be improved. As a result, the CHA2DS2-VASc scoring system was developed from the Birmingham 2009 scheme. When compared head-to-head in registry data, CHA2DS2-VASc more effectively discriminated stroke risk among patients with a baseline CHADS2 score of 0 to 1. Because of this, CHA2DS2-VASc is the recommended risk stratification scheme in the AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation. In modern practice, anticoagulation is unnecessary when CHA2DS2-VASc score = 0, should be considered (vs. antiplatelet or no treatment) when score = 1, and is recommended when score ≥ 2.

Further Reading:
1. AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation
2. CHA2DS2-VASc (2010)
3. 2 Minute Medicine

Summary by Ryan Commins, MD

Image Credit: Alisa Machalek, NIGMS/NIH – National Institute of General Medical Sciences, Public Domain

Week 14 – CURB-65

“Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study”

Thorax. 2003 May;58(5):377-82. [free full text]

Community-acquired pneumonia (CAP) is frequently encountered by the admitting medicine team. Ideally, the patient’s severity at presentation and risk for further decompensation should determine the appropriate setting for further care, whether as an outpatient, on an inpatient ward, or in the ICU. At the time of this 2003 study, the predominant decision aid was the 20-variable Pneumonia Severity Index. The authors of this study sought to develop a simpler decision aid for determining the appropriate level of care at presentation.

The study examined the 30-day mortality rates of adults admitted for CAP via the ED at three non-US academic medical centers (data from three previous CAP cohort studies). 80% of the dataset was analyzed as a derivation cohort – meaning it was used to identify statistically significant, clinically relevant prognostic factors that allowed for mortality risk stratification. The resulting model was applied to the remaining 20% of the dataset (the validation cohort) in order to assess the accuracy of its predictive ability.

The following variables were integrated into the final model (CURB-65):

  1. Confusion
  2. Urea: BUN > 19 mg/dL (urea > 7 mmol/L)
  3. Respiratory rate ≥ 30 breaths/min
  4. low Blood pressure (systolic BP < 90 mmHg or diastolic BP ≤ 60 mmHg)
  5. age ≥ 65
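The five items above translate into a simple additive score. The Python sketch below is illustrative only; the function and parameter names are hypothetical, and low blood pressure is passed in as a yes/no value (as defined in item 4) rather than recomputed from raw readings.

```python
def curb65(confusion, bun_mg_dl, respiratory_rate, low_blood_pressure, age):
    """CURB-65 score: 1 point for each of the five criteria listed above.

    Assumptions: BUN is given in mg/dL, and low_blood_pressure is a
    boolean already judged against the thresholds in item 4.
    """
    return (
        int(bool(confusion))
        + int(bun_mg_dl > 19)
        + int(respiratory_rate >= 30)
        + int(bool(low_blood_pressure))
        + int(age >= 65)
    )


# Hypothetical 72-year-old, not confused, BUN 24 mg/dL, RR 22, normotensive
print(curb65(False, 24, 22, False, 72))  # 2
```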

1068 patients were analyzed. 821 (77%) were in the derivation cohort. 86% of patients received IV antibiotics, 5% were admitted to the ICU, and 4% were intubated. 30-day mortality was 9%. 9 of 11 clinical features examined in univariate analysis were statistically significant (see Table 2).

Ultimately, using the above-described CURB-65 model, in which 1 point is assigned for each clinical characteristic, patients with a CURB-65 score of 0 or 1 had 1.5% mortality, patients with a score of 2 had 9.2% mortality, and patients with a score of 3 or more had 22% mortality. Similar values were demonstrated in the validation cohort. Table 5 summarizes the sensitivity, specificity, PPVs, and NPVs of each CURB-65 score for 30-day mortality in both cohorts. As we would expect from a good predictive model, the sensitivity starts out very high and decreases with increasing score, while the specificity starts out very low and increases with increasing score. For the clinical application of their model, the authors selected the cut points of 1, 2, and 3 (see Figure 2).

In conclusion, CURB-65 is a simple 5-variable decision aid that is helpful in the initial stratification of mortality risk in patients with CAP.

The wide range of specificities and sensitivities at different values of the CURB-65 score makes it a robust tool for risk stratification. The authors felt that patients with a score of 0-1 were “likely suitable for home treatment,” patients with a score of 2 should have “hospital-supervised treatment,” and patients with a score ≥ 3 had “severe pneumonia” and should be admitted (with consideration of ICU admission if the score was 4 or 5).
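The authors' three management groups, together with the derivation-cohort mortality figures quoted earlier, can be laid out as a small lookup. This Python sketch is only a restatement of the summary above; the function name and the returned tuple layout are hypothetical.

```python
def curb65_group(score):
    """Map a CURB-65 score (0-5) to the authors' suggested disposition and
    the 30-day mortality reported for that group in the derivation cohort."""
    if not 0 <= score <= 5:
        raise ValueError("CURB-65 score must be between 0 and 5")
    if score <= 1:
        return ("likely suitable for home treatment", 1.5)
    if score == 2:
        return ("hospital-supervised treatment", 9.2)
    disposition = "severe pneumonia: admit"
    if score >= 4:
        disposition += ", consider ICU"
    return (disposition, 22.0)


for s in range(6):
    disposition, mortality_pct = curb65_group(s)
    print(s, disposition, f"{mortality_pct}%")
```

Note that this encodes the original authors' cut points, which differ from the more conservative threshold preferred by the UpToDate authors discussed below.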

Following the publication of the CURB-65 Score, the author of the Pneumonia Severity Index (PSI) published a prospective cohort study of CAP that examined the discriminatory power (area under the receiver operating characteristic curve) of the PSI vs. CURB-65. His study found that the PSI “has a higher discriminatory power for short-term mortality, defines a greater proportion of patients at low risk, and is slightly more accurate in identifying patients at low risk” than the CURB-65 score.

Expert opinion at UpToDate prefers the PSI over the CURB-65 score based on its more robust base of confirmatory evidence. Of note, the author of the PSI is one of the authors of the relevant UpToDate article. In an important contrast from the CURB-65 authors, these experts suggest that patients with a CURB-65 score of 0 be managed as outpatients, while patients with a score of 1 and above “should generally be admitted.”

Further Reading/References:
1. Original publication of the PSI, NEJM (1997)
2. PSI vs. CURB-65 (2005)
3. Wiki Journal Club
4. 2 Minute Medicine
5. UpToDate, “CAP in adults: assessing severity and determining the appropriate level of care”

Summary by Duncan F. Moore, MD

Image Credit: by Christaras A, CC BY-SA 3.0

Week 13 – Sepsis-3

“The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3)”

JAMA. 2016 Feb 23;315(8):801-10. [free full text]

In practice, we recognize sepsis as a potentially life-threatening condition that arises secondary to infection. Because the SIRS criteria were of limited sensitivity and specificity in identifying sepsis and because our understanding of the pathophysiology of sepsis had purportedly advanced significantly during the interval since the last sepsis definition, an international task force of 19 experts was convened to define and prognosticate sepsis more effectively. The resulting 2016 Sepsis-3 definition was the subject of immediate and sustained controversy.

In the words of Sepsis-3, sepsis simply “is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection.” The paper further defines organ dysfunction in terms of a threshold change in the SOFA score by 2+ points. However, the authors state that “the SOFA score is not intended to be used as a tool for patient management but as a means to clinically characterize a septic patient.” The authors note that qSOFA, an easier tool introduced in this paper, can identify promptly at the bedside patients “with suspected infection who are likely to have a prolonged ICU stay or die in the hospital.” A positive screen on qSOFA is identified as 2+ of the following: altered mental status (AMS), SBP ≤ 100 mmHg, or respiratory rate ≥ 22 breaths/min. At the time of this endorsement of qSOFA, the tool had not been validated prospectively. Finally, septic shock was defined as sepsis with persistent hypotension requiring vasopressors to maintain MAP ≥ 65 mmHg and with a serum lactate > 2 mmol/L despite adequate volume resuscitation.
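The qSOFA screen described above is a three-item count with a threshold of 2. A minimal Python sketch follows; the function and parameter names are hypothetical, and it is meant to illustrate the rule rather than serve as a clinical tool.

```python
def qsofa(altered_mental_status, systolic_bp, respiratory_rate):
    """Return (score, positive) for the qSOFA screen described above.

    One point each for altered mental status, SBP <= 100, and
    respiratory rate >= 22. The screen is positive at 2 or more points.
    """
    score = (
        int(bool(altered_mental_status))
        + int(systolic_bp <= 100)
        + int(respiratory_rate >= 22)
    )
    return score, score >= 2


# Hypothetical patient: alert, SBP 96, RR 24
print(qsofa(False, 96, 24))  # (2, True)
```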

As noted contemporaneously in the excellent PulmCrit blog post “Top ten problems with the new sepsis definition,” Sepsis-3 was not endorsed by the American College of Chest Physicians, the IDSA, any emergency medicine society, or any hospital medicine society. On behalf of the American College of Chest Physicians, Dr. Simpson published a scathing rejection of Sepsis-3 in Chest in May 2016. He noted “there is still no known precise pathophysiological feature that defines sepsis.” He went on to state “it is not clear to us that readjusting the sepsis criteria to be more specific for mortality is an exercise that benefits patients,” and said “to abandon one system of recognizing sepsis [SIRS] because it is imperfect and not yet in universal use for another system that is used even less seems unwise without prospective validation of that new system’s utility.”

In fact, the later validation of qSOFA demonstrated that the SIRS criteria had superior sensitivity for predicting in-hospital mortality while qSOFA had higher specificity. See the following posts at PulmCrit for further discussion: [https://emcrit.org/isepsis/isepsis-sepsis-3-0-much-nothing/] [https://emcrit.org/isepsis/isepsis-sepsis-3-0-flogging-dead-horse/].

At UpToDate, authors note that “data of the value of qSOFA is conflicting,” and because of this, “we believe that further studies that demonstrate improved clinically meaningful outcomes due to the use of qSOFA compared to clinical judgement are warranted before it can be routinely used to predict those at risk of death from sepsis.”

Additional Reading:
1. PulmCCM, “Simple qSOFA score predicts sepsis as well as anything else”
2. 2 Minute Medicine

Summary by Duncan F. Moore, MD

Image Credit: By Mark Oniffrey – Own work, CC BY-SA 4.0

Week 50 – Sepsis-3

“The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3)”

JAMA. 2016 Feb 23;315(8):801-10. [free full text]

In practice, we recognize sepsis as a potentially life-threatening condition that arises secondary to infection.  Because the SIRS criteria were of limited sensitivity and specificity in identifying sepsis and because our understanding of the pathophysiology of sepsis had purportedly advanced significantly during the interval since the last sepsis definition, an international task force of 19 experts was convened to define and prognosticate sepsis more effectively. The resulting 2016 Sepsis-3 definition was the subject of immediate and sustained controversy.

In the words of Sepsis-3, sepsis simply “is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection.” The paper further defines organ dysfunction in terms of a threshold change in the SOFA score by 2+ points. However, the authors state that “the SOFA score is not intended to be used as a tool for patient management but as a means to clinically characterize a septic patient.” The authors note that qSOFA, an easier tool introduced in this paper, can identify promptly at the bedside patients “with suspected infection who are likely to have a prolonged ICU stay or die in the hospital.” A positive screen on qSOFA is identified as 2+ of the following: AMS, SBP ≤ 100, or respiratory rate ≥ 22. At the time of this endorsement of qSOFA, the tool had not been validated prospectively. Finally, septic shock was defined as sepsis with persistent hypotension requiring vasopressors to maintain MAP ≥ 65 and with a serum lactate > 2 despite adequate volume resuscitation.

As noted contemporaneously in the excellent PulmCrit blog post “Top ten problems with the new sepsis definition,” Sepsis-3 was not endorsed by the American College of Chest Physicians, the IDSA, any emergency medicine society, or any hospital medicine society. On behalf of the American College of Chest Physicians, Dr. Simpson published a scathing rejection of Sepsis-3 in Chest in May 2016. He noted “there is still no known precise pathophysiological feature that defines sepsis.” He went on to state “it is not clear to us that readjusting the sepsis criteria to be more specific for mortality is an exercise that benefits patients,” and said “to abandon one system of recognizing sepsis [SIRS] because it is imperfect and not yet in universal use for another system that is used even less seems unwise without prospective validation of that new system’s utility.”

In fact, the later validation of qSOFA demonstrated that the SIRS criteria had superior sensitivity for predicting in-hospital mortality while qSOFA had higher specificity. See the following posts at PulmCrit for further discussion: [https://emcrit.org/isepsis/isepsis-sepsis-3-0-much-nothing/] [https://emcrit.org/isepsis/isepsis-sepsis-3-0-flogging-dead-horse/].

At UpToDate, authors note that “data of the value of qSOFA is conflicting,” and because of this, “we believe that further studies that demonstrate improved clinically meaningful outcomes due to the use of qSOFA compared to clinical judgement are warranted before it can be routinely used to predict those at risk of death from sepsis.”

Additional Reading:
1. PulmCCM, “Simple qSOFA score predicts sepsis as well as anything else”
2. 2 Minute Medicine

Summary by Duncan F. Moore, MD

Week 48 – HAS-BLED

“A Novel User-Friendly Score (HAS-BLED) To Assess 1-Year Risk of Major Bleeding in Patients with Atrial Fibrillation”

Chest. 2010 Nov;138(5):1093-100. [free full text]

Atrial fibrillation (AF) is a well-known risk factor for ischemic stroke. Stroke risk is further increased by individual comorbidities such as CHF, HTN, and DM and can be stratified with scores such as CHADS2 and CHA2DS2-VASc. The recommendation for patients with intermediate stroke risk is treatment with oral anticoagulation (OAC). However, stroke risk is often closely related to bleeding risk, and the benefits of anticoagulation for stroke need to be weighed against the added risk of bleeding. At the time of this study, there were no validated and user-friendly bleeding risk-stratification schemes. This study aimed to develop a practical risk score to estimate the 1-year risk of major bleeding (as defined in the study) in a contemporary, real-world cohort of patients with AF.

Population: adults with EKG or Holter-proven diagnosis of AF
Exclusion criteria: mitral valve stenosis, valvular surgery

(Patients were identified from the prospectively developed database of the multi-center Euro Heart Survey on AF. Among 5,272 patients with AF, 3,456 were free of mitral valve stenosis or valve surgery and completed their 1-year follow-up assessment.)

No experiment was performed in this retrospective cohort study.

In a derivation cohort, the authors retrospectively performed univariate analyses to identify a range of clinical features associated with major bleeding (p < 0.10). Based on systematic reviews, they added additional risk factors for major bleeding. Ultimately, the result was a list of comprehensive risk factors that make up the acronym HAS-BLED:

H – Hypertension (> 160 mmHg systolic)
A – Abnormal renal (HD, transplant, Cr > 2.26 mg/dL) and liver function (cirrhosis, bilirubin >2x normal w/ AST/ALT/ALP > 3x normal) – 1 pt each for abnormal renal or liver function
S – Stroke

B – Bleeding (prior major bleed or predisposition to bleed)
L – Labile INRs (time in therapeutic range < 60%)
E – Elderly (age > 65)
D – Drugs (e.g. ASA, clopidogrel, NSAIDs) or alcohol use (> 8 units per week) concomitantly – 1 pt each for use of either

Each risk factor is worth one point. The HAS-BLED score was then compared to the HEMORR2HAGES scheme, a previously developed tool for estimating bleeding risk.
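Because the "A" and "D" items can each contribute up to two points, the maximum HAS-BLED score is 9 rather than 7. The Python sketch below makes that counting explicit. The function and parameter names are hypothetical, and each yes/no input is assumed to have been judged against the definitions listed above.

```python
def has_bled(*, hypertension, abnormal_renal, abnormal_liver, stroke,
             bleeding, labile_inr, age, drugs, alcohol):
    """HAS-BLED score: one point per risk factor, as listed above.

    All arguments except age are booleans judged against the definitions
    in the acronym list (e.g. hypertension means systolic BP > 160 mmHg).
    Renal and liver dysfunction score separately, as do drugs and alcohol.
    """
    components = [
        hypertension,      # H
        abnormal_renal,    # A (renal)
        abnormal_liver,    # A (liver)
        stroke,            # S
        bleeding,          # B
        labile_inr,        # L
        age > 65,          # E
        drugs,             # D (antiplatelets or NSAIDs)
        alcohol,           # D (alcohol)
    ]
    return sum(bool(c) for c in components)


# Hypothetical 70-year-old on aspirin with a prior major bleed
print(has_bled(hypertension=False, abnormal_renal=False, abnormal_liver=False,
               stroke=False, bleeding=True, labile_inr=False, age=70,
               drugs=True, alcohol=False))  # 3
```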

Outcomes:

  • incidence of major bleeding within 1 year
  • bleeds per 100 patient-years, stratified by HAS-BLED score
  • c-statistic for the HAS-BLED score in predicting the risk of bleeding

Definitions:

  • major bleeding: bleeding causing hospitalization, Hgb drop >2 g/L, or bleeding requiring blood transfusion (excluded hemorrhagic stroke)
  • hemorrhagic stroke: focal neurologic deficit of sudden onset that is diagnosed by a neurologist, lasting > 24h, and caused by bleeding

Results:
3,456 AF patients (without mitral valve stenosis or valve surgery) who completed their 1-year follow-up were analyzed retrospectively. 64.8% (2242) of these patients were on OAC (with 12.8% (286) of this subset on concurrent antiplatelet therapy), 24% (828) were on antiplatelet therapy alone, and 10.2% (352) received no antithrombotic therapy. 1.5% (53) of patients experienced a major bleed during the first year. 17% (9) of these patients sustained intracerebral hemorrhage.

HAS-BLED score    Bleeds per 100 patient-years
0                 1.13
1                 1.02
2                 1.88
3                 3.74
4                 8.70
5                 12.50
6*                0.0
*n = 2 patients at risk, neither bled

Patients were given a HAS-BLED score and a HEMORR2HAGES score. C-statistics were then used to determine the predictive accuracy of each model overall as well as within patient subgroups (OAC alone, OAC + antiplatelet, antiplatelet alone, and no antithrombotic therapy).

C statistics for HAS-BLED:
For overall cohort, 0.72 (95% CI 0.65-0.79); for OAC alone, 0.69 (95% CI 0.59-0.80); for OAC + antiplatelet, 0.78 (95% CI 0.65-0.91); for antiplatelet alone, 0.91 (95% CI 0.83-1.00); and for those on no antithrombotic therapy, 0.85 (95% CI 0.00-1.00).

C statistics for HEMORR2HAGES:
For overall cohort, 0.66 (95% CI 0.57-0.74); for OAC alone, 0.64 (95% CI 0.53-0.75); for OAC + antiplatelet, 0.83 (95% CI 0.74-0.91); for antiplatelet alone, 0.83 (95% CI 0.68-0.98); and for those on no antithrombotic therapy, 0.81 (95% CI 0.00-1.00).

Implication/Discussion:
This study helped to establish a practical and user-friendly assessment of bleeding risk in AF. HAS-BLED is superior to its predecessor HEMORR2HAGES because the acronym is easier to remember, the assessment is quicker and simpler to perform, and all risk factors are readily available from the clinical history or routine testing. Both stratification tools had (grossly) similar c-statistics for the overall cohort – 0.72 for HAS-BLED versus 0.66 for HEMORR2HAGES. However, HAS-BLED was particularly useful when looking at antiplatelet therapy alone or no antithrombotic therapy at all (0.91 and 0.85, respectively).

This study is useful because it provides evidence-based, easily calculable, and actionable risk stratification in the assessment of bleeding risk in AF. In prior studies, such as ACTIVE-A (ASA + clopidogrel versus ASA alone for patients with AF deemed unsuitable for OAC), almost half of all patients (n= ~3500) were given a classification of “unsuitable for OAC,” which was based solely on physicians’ clinical judgement without a predefined objective scoring. Now, physicians have an objective way to assess bleed risk rather than “gut feeling” or wanting to avoid iatrogenic insult.

The RE-LY trial used the HAS-BLED score to decide which patients with AF should get the standard dabigatran dose (150mg BID) rather than a lower dose (110mg BID) for anticoagulation. This risk-stratified dosing resulted in a significant reduction in major bleeding compared with warfarin but maintained a similar reduction in stroke risk.

Furthermore, the HAS-BLED score could allow the physician to be more confident when deciding which patients may be appropriate for referral for a left atrial appendage occlusion device (e.g. Watchman).

Limitations:
The study had a limited number of major bleeds and a short follow-up period, and thus it is possible that other important risk factors for bleeding were not identified. Also, there were large numbers of patients lost to 1-year follow-up. These patients likely had more comorbidities and may have transferred to nursing homes or even died. Their loss to follow-up and thus exclusion from this retrospective study may have led to an underestimate of true bleeding rates. Furthermore, generalizability is limited by the modest number of very elderly patients (i.e. 75-84 and ≥85), who likely represent the greatest bleeding risk. Finally, this study did not specify what proportion of its patients were on warfarin for their OAC, but given that dabigatran, rivaroxaban, and apixaban were not yet approved for use in Europe (2008, 2008, and 2011, respectively) for the majority of the study, we can assume most patients were on warfarin. Thus the generalizability of HAS-BLED risk stratification to the DOACs is limited.

Bottom Line:
HAS-BLED provides an easy, practical tool to assess the individual bleeding risk of patients with AF. Oral anticoagulation should be considered for scores of 3 or less. If HAS-BLED scores are ≥4, it is reasonable to think about alternatives to oral anticoagulation.

Further Reading/References:
1. 2 Minute Medicine
2. ACTIVE-A trial
3. RE-LY trial
4. RE-LY @ Wiki Journal Club
5. HAS-BLED Calculator
6. HEMORR2HAGES Calculator
7. Watchman (for Healthcare Professionals)

Summary by Patrick Miller, MD

Week 20 – CHADS2

“Validation of Clinical Classification Schemes for Predicting Stroke”

JAMA. 2001 June 13;285(22):2864-70. [free full text]

Atrial fibrillation is the most common cardiac arrhythmia and affects 1-2% of the overall population, with increasing prevalence as people age. Atrial fibrillation also carries substantial morbidity and mortality due to the risk of stroke and thromboembolism, although the risk of embolic phenomena varies widely across subpopulations. In 2001, the only oral antithrombotic options available were warfarin and aspirin, which had relative risk reductions of 62% and 22%, respectively, consistent across these subpopulations. Clinicians felt that high-risk patients should be anticoagulated, but the two common classification schemes, AFI and SPAF, were flawed. Patients were often classified as low risk in one scheme and high risk in the other. The schemes were derived retrospectively and were clinically ambiguous. Therefore, in 2001, a group of investigators combined the two existing schemes to create the CHADS2 scheme and applied it to a new data set.

Population (NRAF cohort): Hospitalized Medicare patients ages 65-95 with non-valvular AF not prescribed warfarin at hospital discharge. Patient records were manually abstracted by five quality improvement organizations in seven US states (California, Connecticut, Louisiana, Maine, Missouri, New Hampshire, and Vermont).

Intervention: Determination of CHADS2 score (1 point each for recent CHF, hypertension, age ≥ 75, and DM; 2 points for a history of stroke or TIA)

Comparison: AFI and SPAF risk schemes

Measured Outcome: Hospitalization rates for ischemic stroke (per ICD-9 codes from Medicare claims), stratified by CHADS2 / AFI / SPAF scores.

Calculated Outcome: performance of the various schemes, based on c statistic (a measure of predictive accuracy in a binary logistic regression model)

Results:
1733 patients were identified in the NRAF cohort. When compared to the AFI and SPAF trials, these patients tended to be older (81 years in NRAF vs. 69 in AFI vs. 69 in SPAF), to have a higher burden of CHF (56% vs. 22% vs. 21%), to be female (58% vs. 34% vs. 28%), and to have a history of DM (23% vs. 15% vs. 15%) and prior stroke or TIA (25% vs. 17% vs. 8%). The stroke rate was lowest in the group with a CHADS2 = 0 (1.9 per 100 patient-years, adjusting for the assumption that aspirin was not taken). The stroke rate increased by a factor of approximately 1.5 for each 1-point increase in the CHADS2 score.

CHADS2 score            NRAF Adjusted Stroke Rate per 100 Patient-Years
0                                      1.9
1                                      2.8
2                                      4.0
3                                      5.9
4                                      8.5
5                                      12.5
6                                      18.2
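
The "factor of approximately 1.5 per point" can be checked against the table above with a few lines of arithmetic. This is only an illustrative sketch using the adjusted rates as listed; the variable names are assumptions.

```python
# Adjusted stroke rates per 100 patient-years, copied from the table above
rates = {0: 1.9, 1: 2.8, 2: 4.0, 3: 5.9, 4: 8.5, 5: 12.5, 6: 18.2}

# Ratio of each rate to the rate one CHADS2 point lower
step_ratios = [rates[k + 1] / rates[k] for k in range(6)]

# Average multiplicative increase per point (geometric mean over the six steps)
mean_ratio = (rates[6] / rates[0]) ** (1 / 6)

for k, ratio in enumerate(step_ratios):
    print(f"CHADS2 {k} -> {k + 1}: rate ratio {ratio:.2f}")
print(f"Geometric mean ratio per point: {mean_ratio:.2f}")
```

Each step ratio falls between 1.4 and 1.5, consistent with the stated per-point increase.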

The CHADS2 scheme had a c statistic of 0.82 compared to 0.68 for the AFI scheme and 0.74 for the SPAF scheme.

Implication/Discussion
The CHADS2 scheme provides clinicians with a scoring system to help guide decision making for anticoagulation in patients with non-valvular AF.

The authors note that the application of the CHADS2 score could be useful in several clinical scenarios. First, it easily identifies patients at low risk of stroke (CHADS2 = 0) for whom anticoagulation with warfarin would probably not provide significant benefit. The authors argue that these patients should merely be offered aspirin. Second, the CHADS2 score could facilitate medication selection based on a patient-specific risk of stroke. Third, the CHADS2 score could help clinicians make decisions regarding anticoagulation in the perioperative setting by evaluating the risk of stroke against the hemorrhagic risk of the procedure. Although the CHADS2 is no longer the preferred risk-stratification scheme, the same concepts are still applicable to the more commonly used CHA2DS2-VASc.

This study had several strengths. First, the cohort was from seven states that represented all geographic regions of the United States. Second, CHADS2 was pre-specified based on previous studies and validated using the NRAF data set. Third, the NRAF data set was obtained from actual patient chart review as opposed to purely from an administrative database. Finally, the NRAF patients were older and sicker than those of the AFI and SPAF cohorts; thus, CHADS2 appears to be generalizable to the very large demographic of frail, elderly Medicare patients.

As CHADS2 became widely used clinically in the early 2000s, its application to other cohorts generated a large intermediate-risk group (CHADS2 = 1), which was sometimes > 60% of the cohort (though in the NRAF cohort, CHADS2 = 1 accounted for 27% of the cohort). In clinical practice, this intermediate-risk group was to be offered either warfarin or aspirin. Clearly, a clinical-risk predictor that does not provide clear guidance in over 50% of patients needs to be improved. As a result, the CHA2DS2-VASc scoring system was developed from the Birmingham 2009 scheme. When compared head-to-head in registry data, CHA2DS2-VASc more effectively discriminated stroke risk among patients with a baseline CHADS2 score of 0 to 1. Because of this, CHA2DS2-VASc is the recommended risk stratification scheme in the AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation. In modern practice, anticoagulation is unnecessary when CHA2DS2-VASc score = 0, should be considered (vs. antiplatelet or no treatment) when score = 1, and is recommended when score ≥ 2.
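
The three-tier guidance in the last sentence above can be written as a small lookup. This is a sketch of that sentence only, not of the guideline itself; the function name is hypothetical, and the upper bound of 9 reflects the maximum possible CHA2DS2-VASc score.

```python
def anticoagulation_guidance(cha2ds2_vasc_score: int) -> str:
    """Map a CHA2DS2-VASc score to the guidance summarized above."""
    # CHA2DS2-VASc ranges from 0 to 9; reject anything outside that range
    if not 0 <= cha2ds2_vasc_score <= 9:
        raise ValueError("CHA2DS2-VASc score must be between 0 and 9")
    if cha2ds2_vasc_score == 0:
        return "anticoagulation unnecessary"
    if cha2ds2_vasc_score == 1:
        return "consider anticoagulation (vs. antiplatelet or no treatment)"
    return "anticoagulation recommended"
```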

Further Reading:
1. AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation
2. CHA2DS2-VASc (2010)
3. 2 Minute Medicine

Summary by Ryan Commins, MD

Week 13 – CURB-65

“Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study”

Thorax. 2003 May;58(5):377-82. [free full text]

Community-acquired pneumonia (CAP) is frequently encountered by the admitting medicine team. Ideally, the patient’s severity at presentation and risk for further decompensation should determine the appropriate setting for further care, whether as an outpatient, on an inpatient ward, or in the ICU. At the time of this 2003 study, the predominant decision aid was the 20-variable Pneumonia Severity Index. The authors of this study sought to develop a simpler decision aid for determining the appropriate level of care at presentation.

Population: adults admitted for CAP via the ED at three non-US academic medical centers

Intervention/Comparison: none

Outcome: 30-day mortality

Additional details about methodology: This study analyzed the aggregate data from three previous CAP cohort studies. 80% of the dataset was analyzed as a derivation cohort – meaning it was used to identify statistically significant, clinically relevant prognostic factors that allowed for mortality risk stratification. The resulting model was applied to the remaining 20% of the dataset (the validation cohort) in order to assess the accuracy of its predictive ability.

The following variables were integrated into the final model (CURB-65):

  1. Confusion
  2. Urea > 19 mg/dL (7 mmol/L)
  3. Respiratory rate ≥ 30 breaths/min
  4. low Blood pressure (systolic BP < 90 mmHg or diastolic BP < 60 mmHg)
  5. age ≥ 65
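
The five criteria above translate into a one-point-per-item score. The sketch below follows the thresholds exactly as listed in this summary; the function and parameter names are assumptions, and urea is taken in mmol/L.

```python
def curb65_score(confusion: bool, urea_mmol_per_l: float,
                 respiratory_rate: int, systolic_bp: int,
                 diastolic_bp: int, age: int) -> int:
    """Return the CURB-65 score (0-5), one point per criterion met."""
    criteria = [
        confusion,                               # C: confusion
        urea_mmol_per_l > 7,                     # U: urea > 7 mmol/L
        respiratory_rate >= 30,                  # R: respiratory rate >= 30/min
        systolic_bp < 90 or diastolic_bp < 60,   # B: low blood pressure, as listed above
        age >= 65,                               # 65: age >= 65
    ]
    return sum(criteria)


# Example: a confused 72-year-old with urea 9 mmol/L, RR 24, BP 110/70 scores 3
example_score = curb65_score(True, 9.0, 24, 110, 70, 72)
```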

Results:
1068 patients were analyzed. 821 (77%) were in the derivation cohort. 86% of patients received IV antibiotics, 5% were admitted to the ICU, and 4% were intubated. 30-day mortality was 9%. 9 of 11 clinical features examined in univariate analysis were statistically significant (see Table 2).

Ultimately, using the above-described CURB-65 model, in which 1 point is assigned for each clinical characteristic, patients with a CURB-65 score of 0 or 1 had 1.5% mortality, patients with a score of 2 had 9.2% mortality, and patients with a score of 3 or more had 22% mortality. Similar values were demonstrated in the validation cohort. Table 5 summarizes the sensitivity, specificity, PPVs, and NPVs of each CURB-65 score for 30-day mortality in both cohorts. As we would expect from a good predictive model, the sensitivity starts out very high and decreases with increasing score, while the specificity starts out very low and increases with increasing score. For the clinical application of their model, the authors selected the cut points of 1, 2, and 3 (see Figure 2).
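
To illustrate why sensitivity falls and specificity rises as the cut point increases, the following sketch computes both at several cut points. The ten patients are invented toy data, not values from the study, and the function name is an assumption.

```python
def sens_spec_at_cutoff(scores, died, cutoff):
    """Treat score >= cutoff as a positive test for 30-day mortality."""
    tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, died) if d and s < cutoff)
    tn = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
    fp = sum(1 for s, d in zip(scores, died) if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort (NOT study data): CURB-65 scores and 30-day deaths
toy_scores = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
toy_died = [False, False, False, False, True, False, True, False, True, True]

results = {c: sens_spec_at_cutoff(toy_scores, toy_died, c) for c in (1, 2, 3)}
for cutoff, (sens, spec) in results.items():
    print(f"score >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the cut point can only move patients from "test positive" to "test negative", so sensitivity can never increase and specificity can never decrease.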


Implication/Discussion:
CURB-65 is a simple 5-variable decision aid that is helpful in the initial stratification of mortality risk in patients with CAP.

The wide range of specificities and sensitivities at different values of the CURB-65 score makes it a robust tool for risk stratification. The authors felt that patients with a score of 0-1 were “likely suitable for home treatment,” patients with a score of 2 should have “hospital-supervised treatment,” and patients with a score of ≥ 3 had “severe pneumonia” and should be admitted (with consideration of ICU admission if score of 4 or 5).
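
The authors' three risk groups can be summarized as a lookup from score to derivation-cohort mortality and suggested management. This is a sketch of the figures quoted in this summary; the function name and the group labels "low", "intermediate", and "high" are illustrative assumptions.

```python
def curb65_risk_group(score: int):
    """Return (group label, derivation-cohort 30-day mortality %, authors' suggestion)."""
    if not 0 <= score <= 5:
        raise ValueError("CURB-65 score must be between 0 and 5")
    if score <= 1:
        return ("low", 1.5, "likely suitable for home treatment")
    if score == 2:
        return ("intermediate", 9.2, "hospital-supervised treatment")
    # Scores of 3 to 5: severe pneumonia; the authors suggest considering ICU for 4 or 5
    return ("high", 22.0, "severe pneumonia: admit, consider ICU if score is 4 or 5")
```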

Following the publication of the CURB-65 Score, the author of the Pneumonia Severity Index (PSI) published a prospective cohort study of CAP that examined the discriminatory power (area under the receiver operating characteristic curve) of the PSI vs. CURB-65. His study found that the PSI “has a higher discriminatory power for short-term mortality, defines a greater proportion of patients at low risk, and is slightly more accurate in identifying patients at low risk” than the CURB-65 score.

Expert opinion at UpToDate prefers the PSI over the CURB-65 score based on its more robust base of confirmatory evidence. Of note, the author of the PSI is one of the authors of the relevant UpToDate article. In an important contrast to the CURB-65 authors, these experts suggest that patients with a CURB-65 score of 0 be managed as outpatients, while patients with a score of 1 and above “should generally be admitted.”

Further Reading/References:
1. Original publication of the PSI, NEJM (1997)
2. PSI vs. CURB-65 (2005)
3. Wiki Journal Club
4. 2 Minute Medicine
5. UpToDate, “CAP in adults: assessing severity and determining the appropriate level of care”

Summary by Duncan F. Moore, MD

Week 10 – MELD

“A Model to Predict Survival in Patients With End-Stage Liver Disease”

Hepatology. 2001 Feb;33(2):464-70. [free full text]

Prior to the adoption of the Model for End-Stage Liver Disease (MELD) score for the allocation of liver transplants, determination of medical urgency was dependent on the Child-Pugh score. The Child-Pugh score was limited by the inclusion of two subjective variables (severity of ascites and severity of encephalopathy), limited discriminatory ability, and a ceiling effect of laboratory abnormalities. Stakeholders sought an objective, continuous, generalizable index that more accurately and reliably represented disease severity. The MELD score had originally been developed in 2000 to estimate the survival of patients undergoing TIPS. The authors of this 2001 study hypothesized that the MELD score would accurately estimate short-term survival in a wide range of severities and etiologies of liver dysfunction and thus serve as a suitable replacement measure for the Child-Pugh score in the determination of medical urgency in transplant allocation.

This study reported a series of retrospective validation cohorts for the use of MELD in prediction of mortality in advanced liver disease.

Methods:

Populations:

  1. cirrhotic inpatients, Mayo Clinic, 1994-1999, n = 282 (see exclusion criteria)
  2. ambulatory patients with noncholestatic cirrhosis, newly-diagnosed, single-center in Italy, 1981-1984, n = 491 consecutive patients
  3. ambulatory patients with primary biliary cirrhosis, Mayo Clinic, 1973-1984, n = 326 (92 lacked all necessary variables for calculation of MELD)
  4. cirrhotic patients, Mayo Clinic, 1984-1988, n = 1179 patients with sufficient follow-up (≥ 3 months) and laboratory data

Index MELD score was calculated for each patient. Death during follow-up was assessed by chart review.

MELD score = 3.8*ln([bilirubin]) + 11.2*ln(INR) + 9.6*ln([Cr])+6.4*(etiology: 0 if cholestatic or alcoholic, 1 otherwise)
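
As a worked illustration, the formula above can be evaluated directly. This sketch implements only the expression as written in this summary, not the modified version later used for transplant allocation. The function name is hypothetical, and the units (bilirubin and creatinine in mg/dL) are an assumption, since the summary does not state them.

```python
import math


def meld_2001(bilirubin: float, inr: float, creatinine: float,
              cholestatic_or_alcoholic: bool) -> float:
    """Evaluate the MELD expression as written above.

    Assumes bilirubin and creatinine are in mg/dL (not stated in the summary).
    The etiology term is 0 for cholestatic or alcoholic disease, 1 otherwise.
    """
    etiology = 0 if cholestatic_or_alcoholic else 1
    return (3.8 * math.log(bilirubin)
            + 11.2 * math.log(inr)
            + 9.6 * math.log(creatinine)
            + 6.4 * etiology)


# Example: bilirubin 2.0, INR 1.5, creatinine 1.0, non-cholestatic, non-alcoholic
# 3.8*ln(2.0) + 11.2*ln(1.5) + 9.6*ln(1.0) + 6.4 is approximately 13.6
example_meld = meld_2001(2.0, 1.5, 1.0, cholestatic_or_alcoholic=False)
```

Because each laboratory value enters through a natural logarithm, a value of exactly 1 contributes nothing and values below 1 contribute negatively in this unmodified form.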

The primary study outcome was the concordance c-statistic between MELD score and 3-month survival. The c-statistic is equivalent to the area under the receiver operating characteristic curve (AUROC). Per the authors, “a c-statistic between 0.8 and 0.9 indicates excellent diagnostic accuracy and a c-statistic greater than 0.7 is generally considered as a useful test.” (See page 455 for further explanation.)

There was no reliable comparison statistic (e.g. c-statistic of MELD vs. Child-Pugh in all groups).
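
For readers unfamiliar with the c-statistic described above, the following sketch computes it for a binary outcome as the proportion of (died, survived) patient pairs in which the patient who died had the higher score, counting ties as half. The six patients are invented toy data, not study data, and this is not necessarily the exact procedure the authors used.

```python
def c_statistic(scores, died):
    """Concordance for a binary outcome (equivalent to the AUROC)."""
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for s_dead in dead:
        for s_alive in alive:
            if s_dead > s_alive:
                concordant += 1.0   # higher score in the patient who died
            elif s_dead == s_alive:
                concordant += 0.5   # ties count as half
    return concordant / (len(dead) * len(alive))


# Hypothetical toy cohort (NOT study data): MELD scores and 3-month deaths
toy_meld = [6, 10, 14, 18, 22, 30]
toy_died = [False, False, True, False, True, True]
toy_c = c_statistic(toy_meld, toy_died)  # 8 of 9 pairs concordant
```

A value of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every patient who died above every survivor.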

Results:

Primary:

  • hospitalized Mayo patients (late 1990s): c-statistic for prediction of 3-month survival = 0.87 (95% CI 0.82-0.92)
  • ambulatory, non-cholestatic Italian patients: c-statistic for 3-month survival = 0.80 (95% CI 0.69-0.90)
  • ambulatory PBC patients at Mayo: c-statistic for 3-month survival = 0.87 (95% CI 0.83-0.99)
  • cirrhotic patients at Mayo (1980s): c-statistic for 3-month survival = 0.78 (95% CI 0.74-0.81)

Secondary:

  • There was minimal improvement in the c-statistics for 3-month survival with the individual addition of spontaneous bacterial peritonitis, variceal bleed, ascites, and encephalopathy to the MELD score (see Table 4, highest increase in c-statistic was 0.03).
  • When the etiology of liver disease was excluded from the MELD score, there was minimal change in the c-statistics (see Table 5, all paired CIs overlap).
  • C-statistics for 1-week mortality ranged from 0.80 to 0.95.

Implication/Discussion:
The MELD score is an excellent predictor of short-term mortality in patients with end-stage liver disease of diverse etiology and severity.

Despite its retrospective nature, this study represented a significant improvement upon the Child-Pugh score in determining medical urgency in patients who require liver transplant.

In 2002, the United Network for Organ Sharing (UNOS) adopted a modified version of the MELD score for the prioritization of deceased-donor liver transplants in cirrhosis.

Concurrent with the 2001 publication of this study, Wiesner et al. performed a prospective validation of the use of MELD in the allocation of liver transplantation. When published in 2003, it demonstrated that MELD score accurately predicted 3-month mortality among patients with chronic liver disease on the waitlist.

The MELD score has also been validated in other conditions such as alcoholic hepatitis, hepatorenal syndrome, and acute liver failure (see UpToDate).

The MELD score has been modified several times since. In 2006, the MELD Exception Guidelines offered extra points for severe comorbidities (e.g. HCC, hepatopulmonary syndrome). In January 2016, the MELDNa score was adopted and is now used for liver transplant prioritization.

References and Further Reading:
1. “A model to predict poor survival in patients undergoing transjugular intrahepatic portosystemic shunts” (2000)
2. MDCalc “MELD Score”
3. Wiesner et al. “Model for end-stage liver disease (MELD) and allocation of donor livers” (2003)
4. Freeman Jr. et al. “MELD exception guidelines” (2006) 
5. 2 Minute Medicine
6. UpToDate “Model for End-stage Liver Disease (MELD)”

Summary by Duncan F. Moore, MD