Week 31 – Symptom-Triggered Benzodiazepines in Alcohol Withdrawal

“Symptom-Triggered vs Fixed-Schedule Doses of Benzodiazepine for Alcohol Withdrawal”

Arch Intern Med. 2002 May 27;162(10):1117-21. [free full text]

Treatment of alcohol withdrawal with benzodiazepines has been the standard of care for decades. However, in the 1990s, benzodiazepine therapy for alcohol withdrawal was generally given via fixed doses. In 1994, a double-blind RCT by Saitz et al. demonstrated that symptom-triggered therapy based on responses to the CIWA-Ar scale reduced treatment duration and the amount of benzodiazepine used relative to a fixed-schedule regimen. This trial had little immediate impact on the treatment of alcohol withdrawal. The authors of this 2002 double-blind RCT sought to confirm the findings from 1994 in a larger population that did not exclude patients with a history of seizures or severe alcohol withdrawal.

Population: consecutive patients admitted to the inpatient alcohol treatment units at two European universities

Notable exclusion criteria: “major cognitive, psychiatric, or medical comorbidity”

Intervention: placebo (30mg q6hrs x4, followed by 15mg q6hrs x8), with additional oxazepam 15mg for CIWA score 8-15 and 30mg for CIWA score > 15

Comparison: scheduled oxazepam (30mg q6hrs x4, followed by 15mg q6hrs x8), with additional oxazepam 15mg for CIWA score 8-15 and 30mg for CIWA score > 15
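The as-needed dosing rule shared by both arms can be written as a small function. This is a minimal sketch of the protocol as summarized above, for illustration only; the function name is hypothetical and does not come from the paper.

```python
def prn_oxazepam_mg(ciwa_score: int) -> int:
    """Additional oxazepam dose (mg) triggered by a CIWA-Ar score,
    following the thresholds quoted in the summary above."""
    if ciwa_score < 0:
        raise ValueError("CIWA-Ar score cannot be negative")
    if ciwa_score > 15:
        return 30
    if ciwa_score >= 8:
        return 15
    return 0  # scores below 8 did not trigger an additional dose


for score in (5, 8, 15, 16):
    print(score, prn_oxazepam_mg(score))
```

In the symptom-triggered arm this rule was the only source of active drug, since the scheduled doses were placebo.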



Primary outcomes:

  • cumulative oxazepam dose at 72hrs
  • oxazepam treatment duration


Secondary outcomes:

  • incidence of seizures, hallucinations, and delirium tremens at 72hrs
  • subjective scales of “health concerns,” anxiety, depression, energy level, physical functioning, and vitality over the preceding 3 days, assessed at 72hrs

Subgroup analysis: exclusion of symptomatic patients who did not require any oxazepam

117 patients completed the trial. 56 had been randomized to the symptom-triggered group, and 61 had been randomized to the fixed-schedule group. The groups were similar in all baseline characteristics except that the fixed-schedule group had on average a 5-hour longer interval since last drink prior to admission. Only 39% of the symptom-triggered group actually received oxazepam, while 100% of the fixed-schedule group did (p < 0.001).

Patients in the symptom-triggered group received a mean cumulative dose of 37.5mg versus 231.4mg in the fixed-schedule group (p < 0.001). The mean duration of oxazepam treatment was 20.0 hours in the symptom-triggered group versus 62.7 hours in the fixed-schedule group.

The group difference in total oxazepam dose persisted even when patients who did not receive any oxazepam were excluded. Among patients who did receive oxazepam, patients in the symptom-triggered group received 95.4 ± 107.7mg versus 231.4 ± 29.4mg in the fixed-dose group (p < 0.001).

Only one patient in the symptom-triggered group sustained a seizure. There were no seizures, hallucinations, or episodes of delirium tremens in any of the other 116 patients. The two treatment groups had similar quality-of-life and symptom scores aside from slightly higher physical functioning in the symptom-triggered group (p < 0.01). See Table 2.

Symptom-triggered administration of benzodiazepines in alcohol withdrawal led to a six-fold reduction in cumulative benzodiazepine use and a much shorter duration of pharmacotherapy than fixed-schedule administration. This more restrictive and responsive strategy did not increase the risk of major adverse outcomes such as seizure or DTs, and also did not result in increased patient discomfort.

Overall, this study confirmed the findings of the landmark study by Saitz et al. from eight years prior. Additionally, this trial was larger and did not exclude patients with a prior history of withdrawal seizures or severe withdrawal. The fact that both studies took place in inpatient specialty psychiatry units limits their generalizability to our inpatient general medicine populations.

Why the initial 1994 study did not gain clinical traction remains unclear. Both studies have been well-cited over the ensuing decades, and the paradigm has shifted firmly toward symptom-triggered benzodiazepine regimens using the CIWA scale. A 2010 Cochrane review cites the 1994 study only, while Wiki Journal Club and 2 Minute Medicine have entries on this 2002 study but not on the equally impressive 1994 study.

Further Reading/References:
1. “Individualized treatment for alcohol withdrawal. A randomized double-blind controlled trial.” JAMA. 1994.
2. Clinical Institute Withdrawal Assessment of Alcohol Scale, Revised (CIWA-Ar)
3. Wiki Journal Club
4. 2 Minute Medicine
5. “Benzodiazepines for alcohol withdrawal.” Cochrane Database Syst Rev. 2010.

Summary by Duncan F. Moore, MD

Week 29 – ALLHAT

“Major Outcomes in High-Risk Hypertensive Patients Randomized to Angiotensin-Converting Enzyme Inhibitor or Calcium Channel Blocker vs. Diuretic”

The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT)

JAMA. 2002 Dec 18;288(23):2981-97. [free full text]

Hypertension is a ubiquitous disease, and the cardiovascular and mortality benefits of BP control have been well described. However, as the number of available antihypertensive classes proliferated in the past several decades, a head-to-head comparison of different antihypertensive regimens was necessary to determine the optimal first-step therapy. The 2002 ALLHAT trial was a landmark trial in this effort.

Population: 33,357 patients aged 55 years or older with hypertension and at least one other coronary heart disease (CHD) risk factor (previous MI or stroke, LVH by ECG or echo, T2DM, current cigarette smoking, HDL < 35 mg/dL, or documentation of other atherosclerotic cardiovascular disease [CVD]). Notable exclusion criteria: history of hospitalization for CHF, history of treated symptomatic CHF, or known LVEF < 35%.

Prior antihypertensives were discontinued upon initiation of the study drug. Patients were randomized to one of three study drugs in a double-blind fashion. Study drugs and additional drugs were added in a step-wise fashion to achieve a goal BP <140/90 mmHg.

Step 1: titrate assigned study drug

  • chlorthalidone: 12.5 –> (sham titration) –> 25 mg/day
  • amlodipine: 2.5 –> 5 –> 10 mg/day
  • lisinopril: 10 –> 20 –> 40 mg/day

Step 2: add open-label agents at treating physician’s discretion (atenolol, clonidine, or reserpine)

  • atenolol: 25 to 100 mg/day
  • reserpine: 0.05 to 0.2 mg/day
  • clonidine: 0.1 to 0.3 mg BID

Step 3: add hydralazine 25 to 100 mg BID

Comparison: chlorthalidone was compared pairwise with amlodipine and with lisinopril with respect to each outcome. A doxazosin arm existed initially, but it was terminated early due to an excess of CV events, primarily driven by CHF.


Primary – combined fatal CAD or nonfatal MI

Secondary –

  • all-cause mortality
  • fatal and nonfatal stroke
  • combined CHD (primary outcome, PCI, or hospitalized angina)
  • combined CVD (CHD, stroke, non-hospitalized treated angina, CHF [fatal, hospitalized, or treated non-hospitalized], and PAD)

Over a mean follow-up period of 4.9 years, there was no difference between the groups in either the primary outcome or all-cause mortality.

When compared with chlorthalidone at 5 years, the amlodipine and lisinopril groups had significantly higher systolic blood pressures (by 0.8 mmHg and 2 mmHg, respectively). The amlodipine group had a lower diastolic blood pressure when compared to the chlorthalidone group (0.8 mmHg).

When comparing amlodipine to chlorthalidone for the pre-specified secondary outcomes, amlodipine was associated with an increased risk of heart failure (RR 1.38; 95% CI 1.25-1.52).

When comparing lisinopril to chlorthalidone for the pre-specified secondary outcomes, lisinopril was associated with an increased risk of stroke (RR 1.15; 95% CI 1.02-1.30), combined CVD (RR 1.10; 95% CI 1.05-1.16), and heart failure (RR 1.20; 95% CI 1.09-1.34). The increased risk of stroke was mostly driven by 3 subgroups: women (RR 1.22; 95% CI 1.01-1.46), blacks (RR 1.40; 95% CI 1.17-1.68), and non-diabetics (RR 1.23; 95% CI 1.05-1.44). The increased risk of CVD was statistically significant in all subgroups except in patients aged less than 65. The increased risk of heart failure was statistically significant in all subgroups.

In patients with hypertension and at least one additional CHD risk factor, chlorthalidone, lisinopril, and amlodipine performed similarly in reducing the risks of fatal CAD and nonfatal MI.

The study has several strengths: a large and diverse study population, a randomized, double-blind structure, and the rigorous evaluation of three of the most commonly prescribed “newer” classes of antihypertensives. Unfortunately, neither an ARB nor an aldosterone antagonist was included in the study. Additionally, the step-up therapies were not reflective of contemporary practice. (Instead, patients would likely be prescribed one or more of the primary study drugs.)

The ALLHAT study is one of the hallmark studies of hypertension and has played an important role in hypertension guidelines since it was published. Following the publication of ALLHAT, thiazide diuretics became widely used as first-line drugs in the treatment of hypertension. The low cost of thiazides and their limited side-effect profile are particularly attractive class features. While ALLHAT looked specifically at chlorthalidone, in practice the positive findings were attributed to HCTZ, which has been more often prescribed. The authors of ALLHAT argued that the superiority of thiazides was likely a class effect, but according to the analysis at Wiki Journal Club, “there is little direct evidence that HCTZ specifically reduces the incidence of CVD among hypertensive individuals.” Furthermore, a 2006 study noted that HCTZ has worse 24-hour BP control than chlorthalidone due to a shorter half-life. The ALLHAT authors note that “since a large proportion of participants required more than 1 drug to control their BP, it is reasonable to infer that a diuretic be included in all multi-drug regimens, if possible.” The 2017 ACC/AHA High Blood Pressure Guidelines state that, of the four thiazide diuretics on the market, chlorthalidone is preferred because of a prolonged half-life and trial-proven reduction of CVD (via the ALLHAT study).

Further Reading / References:
1. 2017 ACC Hypertension Guidelines
2. Wiki Journal Club
3. 2 Minute Medicine
4. Ernst et al, “Comparative antihypertensive effects of hydrochlorothiazide and chlorthalidone on ambulatory and office blood pressure.” (2006)
5. Gillis Pharmaceuticals: https://www.youtube.com/watch?v=HOxuAtehumc
6. Concepts in Hypertension, Volume 2 Issue 6

Summary by Ryan Commins, MD

Week 20 – CHADS2

“Validation of Clinical Classification Schemes for Predicting Stroke”

JAMA. 2001 June 13;285(22):2864-70. [free full text]

Atrial fibrillation is the most common cardiac arrhythmia and affects 1-2% of the overall population, with increasing prevalence as people age. Atrial fibrillation also carries substantial morbidity and mortality due to the risk of stroke and thromboembolism, although the risk of embolic phenomena varies widely across various subpopulations. In 2001, the only oral anticoagulation options available were warfarin and aspirin, which had relative risk reductions of 62% and 22%, respectively, consistent across these subpopulations. Clinicians felt that high-risk patients should be anticoagulated, but the two common classification schemes, AFI and SPAF, were flawed. Patients were often classified as low risk in one scheme and high risk in the other. The schemes were derived retrospectively and were clinically ambiguous. Therefore, in 2001 a group of investigators combined the two existing schemes to create the CHADS2 scheme and applied it to a new data set.

Population (NRAF cohort): Hospitalized Medicare patients ages 65-95 with non-valvular AF not prescribed warfarin at hospital discharge. Patient records were manually abstracted by five quality improvement organizations in seven US states (California, Connecticut, Louisiana, Maine, Missouri, New Hampshire, and Vermont).

Intervention: Determination of CHADS2 score (1 point for recent CHF, hypertension, age ≥ 75, and DM; 2 points for a history of stroke or TIA)
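The point assignment above translates directly into code. The following is a minimal sketch assuming each risk factor has already been abstracted as a yes/no value; the function and parameter names are hypothetical.

```python
def chads2_score(recent_chf: bool, hypertension: bool, age: int,
                 diabetes: bool, prior_stroke_or_tia: bool) -> int:
    """CHADS2 score: 1 point each for recent CHF, hypertension,
    age >= 75, and diabetes; 2 points for prior stroke or TIA."""
    score = int(recent_chf) + int(hypertension) + int(age >= 75) + int(diabetes)
    score += 2 * int(prior_stroke_or_tia)
    return score


# Hypothetical patient: 78 years old, hypertensive, prior TIA.
print(chads2_score(False, True, 78, False, True))
```

The score ranges from 0 to 6, matching the rows of the stroke-rate table in the results.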

Comparison: AFI and SPAF risk schemes

Measured Outcome: Hospitalization rates for ischemic stroke (per ICD-9 codes from Medicare claims), stratified by CHADS2 / AFI / SPAF scores.

Calculated Outcome: performance of the various schemes, based on c statistic (a measure of predictive accuracy in a binary logistic regression model)

1733 patients were identified in the NRAF cohort. When compared to the AFI and SPAF trials, these patients tended to be older (81 in NRAF vs. 69 in AFI vs. 69 in SPAF) and were more likely to have CHF (56% vs. 22% vs. 21%), to be female (58% vs. 34% vs. 28%), and to have a history of DM (23% vs. 15% vs. 15%) or of prior stroke or TIA (25% vs. 17% vs. 8%). The stroke rate was lowest in the group with a CHADS2 = 0 (1.9 per 100 patient years, adjusting for the assumption that aspirin was not taken). The stroke rate increased by a factor of approximately 1.5 for each 1-point increase in the CHADS2 score.

CHADS2 score            NRAF Adjusted Stroke Rate per 100 Patient-Years
0                                      1.9
1                                      2.8
2                                      4.0
3                                      5.9
4                                      8.5
5                                      12.5
6                                      18.2
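The "factor of approximately 1.5 per point" can be checked against the table itself. This short sketch uses only the seven rates listed above; the variable names are illustrative.

```python
# Adjusted stroke rates per 100 patient-years, copied from the table above.
nraf_rates = {0: 1.9, 1: 2.8, 2: 4.0, 3: 5.9, 4: 8.5, 5: 12.5, 6: 18.2}

# Ratio of each rate to the rate one CHADS2 point lower.
step_ratios = [nraf_rates[s + 1] / nraf_rates[s] for s in range(6)]

# Average multiplicative step across the six increments (geometric mean).
mean_step = (nraf_rates[6] / nraf_rates[0]) ** (1 / 6)

for score, ratio in enumerate(step_ratios, start=1):
    print(f"CHADS2 {score - 1} -> {score}: x{ratio:.2f}")
print(f"average step: x{mean_step:.2f}")
```

Every step falls between 1.4 and 1.5, which is consistent with the rounded factor quoted in the text.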

The CHADS2 scheme had a c statistic of 0.82 compared to 0.68 for the AFI scheme and 0.74 for the SPAF scheme.

The CHADS2 scheme provides clinicians with a scoring system to help guide decision making for anticoagulation in patients with non-valvular AF.

The authors note that the application of the CHADS2 score could be useful in several clinical scenarios. First, it easily identifies patients at low risk of stroke (CHADS2 = 0) for whom anticoagulation with warfarin would probably not provide significant benefit. The authors argue that these patients should merely be offered aspirin. Second, the CHADS2 score could facilitate medication selection based on a patient-specific risk of stroke. Third, the CHADS2 score could help clinicians make decisions regarding anticoagulation in the perioperative setting by evaluating the risk of stroke against the hemorrhagic risk of the procedure. Although the CHADS2 is no longer the preferred risk-stratification scheme, the same concepts are still applicable to the more commonly used CHA2DS2-VASc.

This study had several strengths. First, the cohort was from seven states that represented all geographic regions of the United States. Second, CHADS2 was pre-specified based on previous studies and validated using the NRAF data set. Third, the NRAF data set was obtained from actual patient chart review as opposed to purely from an administrative database. Finally, the NRAF patients were older and sicker than those of the AFI and SPAF cohorts, thus the CHADS2 appears to be generalizable to the very large demographic of frail, elderly Medicare patients.

As CHADS2 became widely used clinically in the early 2000s, its application to other cohorts generated a large intermediate-risk group (CHADS2 = 1), which was sometimes > 60% of the cohort (though in the NRAF cohort, CHADS2 = 1 accounted for 27% of the cohort). In clinical practice, this intermediate-risk group was to be offered either warfarin or aspirin. Clearly, a clinical-risk predictor that does not provide clear guidance in over 50% of patients needs to be improved. As a result, the CHA2DS2-VASc scoring system was developed from the Birmingham 2009 scheme. When compared head-to-head in registry data, CHA2DS2-VASc more effectively discriminated stroke risk among patients with a baseline CHADS2 score of 0 to 1. Because of this, CHA2DS2-VASc is the recommended risk stratification scheme in the AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation. In modern practice, anticoagulation is unnecessary when CHA2DS2-VASc score = 0, should be considered (vs. antiplatelet or no treatment) when score = 1, and is recommended when score ≥ 2.

Further Reading:
1. AHA/ACC/HRS 2014 Practice Guideline for Atrial Fibrillation
2. CHA2DS2-VASc (2010)
3. 2 Minute Medicine

Summary by Ryan Commins, MD

Week 18 – VERT

“Effects of Risedronate Treatment on Vertebral and Nonvertebral Fractures in Women With Postmenopausal Osteoporosis”

by the Vertebral Efficacy with Risedronate Therapy (VERT) Study Group

JAMA. 1999 Oct 13;282(14):1344-52. [free full text]

Bisphosphonates are a highly effective and relatively safe class of medications for the prevention of fractures in patients with osteoporosis. The VERT trial published in 1999 was a landmark trial that demonstrated this protective effect with the daily oral bisphosphonate risedronate.

Population: post-menopausal women with either 2 or more vertebral fractures per radiography or 1 vertebral fracture with decreased lumbar spine bone mineral density

Intervention: risedronate 2.5mg PO daily or risedronate 5mg PO daily

Comparison: placebo PO daily

1. prevalence of new vertebral fracture at 3 years follow-up, per annual imaging
2. prevalence of new non-vertebral fracture at 3 years follow-up, per annual imaging
3. change in bone mineral density, per DEXA q6 months

2458 patients were randomized. During the course of the study, “data from other trials indicated that the 2.5mg risedronate dose was less effective than the 5mg dose,” and thus the authors discontinued further data collection on the 2.5mg treatment arm at 1 year into the study. All treatment groups had similar baseline characteristics. 55% of the placebo group and 60% of the 5mg risedronate group completed 3 years of treatment. The prevalence of new vertebral fracture within 3 years was 11.3% in the risedronate group and 16.3% in the placebo group (RR 0.59, 95% CI 0.43-0.82, p = 0.003; NNT = 20). The prevalence of new non-vertebral fractures at 3 years was 5.2% in the treatment arm and 8.4% in the placebo arm (RR 0.6, 95% CI 0.39-0.94, p = 0.02; NNT = 31). Regarding bone mineral density (BMD), see Figure 4 for a visual depiction of the changes in BMD by treatment group at the various 6-month timepoints. Notably, change from baseline BMD of the lumbar spine and femoral neck was significantly higher (and positive) in the risedronate 5mg group at all follow-up timepoints relative to the placebo group and at all timepoints except 6 months for the femoral trochanter measurements. Regarding adverse events, there was no difference in the incidence of upper GI adverse events between the two groups. GI complaints “were the most common adverse events associated with study discontinuance,” and GI events led to 42% of placebo withdrawals but only 36% of the 5mg risedronate withdrawals.
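The NNT values quoted above follow from the absolute risk reduction (control event rate minus treated event rate). A minimal sketch of that arithmetic, using the 3-year percentages from this summary and a hypothetical helper name:

```python
def number_needed_to_treat(control_rate: float, treated_rate: float) -> float:
    """NNT = 1 / absolute risk reduction, with rates given as fractions."""
    absolute_risk_reduction = control_rate - treated_rate
    if absolute_risk_reduction <= 0:
        raise ValueError("treated rate must be lower than control rate")
    return 1 / absolute_risk_reduction


# 3-year fracture rates quoted above (placebo vs. risedronate 5mg).
nnt_vertebral = number_needed_to_treat(0.163, 0.113)
nnt_nonvertebral = number_needed_to_treat(0.084, 0.052)

print(round(nnt_vertebral), round(nnt_nonvertebral))
```

Rounded to whole patients, these reproduce the NNTs of 20 and 31 reported in the results.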

Oral risedronate reduces the risk of vertebral and non-vertebral fractures in patients with osteoporosis while increasing bone mineral density.

Overall, this was a large, well-designed RCT that demonstrated a concrete treatment benefit. As a result, oral bisphosphonate therapy has become the standard of care both for treatment and prevention of osteoporosis. This study, as well as others, demonstrated that such therapies are well-tolerated with relatively few side effects.

A notable strength of this study is that it did not exclude patients with GI comorbidities. One weakness is the modification of the trial protocol to eliminate the risedronate 2.5mg treatment arm after 1 year of study. Although this arm demonstrated a reduction in vertebral fracture at 1 year relative to placebo (p = 0.02), its elimination raises suspicion that the pre-specified analyses were not yielding the anticipated results during the interim analysis and thus the less-impressive treatment arm was discarded.

Further Reading/References:
1. Weekly alendronate vs. weekly risedronate
2. Comparative effectiveness of pharmacologic treatments to prevent fractures: an updated systematic review (2014)

Summary by Duncan F. Moore, MD

Week 13 – CURB-65

“Defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study”

Thorax. 2003 May;58(5):377-82. [free full text]

Community-acquired pneumonia (CAP) is frequently encountered by the admitting medicine team. Ideally, the patient’s severity at presentation and risk for further decompensation should determine the appropriate setting for further care, whether as an outpatient, on an inpatient ward, or in the ICU. At the time of this 2003 study, the predominant decision aid was the 20-variable Pneumonia Severity Index. The authors of this study sought to develop a simpler decision aid for determining the appropriate level of care at presentation.

Population: adults admitted for CAP via the ED at three non-US academic medical centers

Intervention/Comparison: none

Outcome: 30-day mortality

Additional details about methodology: This study analyzed the aggregate data from three previous CAP cohort studies. 80% of the dataset was analyzed as a derivation cohort – meaning it was used to identify statistically significant, clinically relevant prognostic factors that allowed for mortality risk stratification. The resulting model was applied to the remaining 20% of the dataset (the validation cohort) in order to assess the accuracy of its predictive ability.

The following variables were integrated into the final model (CURB-65):

  1. Confusion
  2. Urea > 19mg/dL (7 mmol/L)
  3. Respiratory rate ≥ 30 breaths/min
  4. low Blood pressure (systolic BP < 90 mmHg or diastolic BP < 60 mmHg)
  5. age ≥ 65
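The five criteria above can be tallied with a short function. This is a sketch that applies the thresholds exactly as listed in this summary, with urea taken in mmol/L; the function and parameter names are hypothetical, and it is not a validated clinical calculator.

```python
def curb65_score(confusion: bool, urea_mmol_per_l: float, respiratory_rate: int,
                 systolic_bp: int, diastolic_bp: int, age: int) -> int:
    """One point for each CURB-65 criterion that is met."""
    criteria = [
        bool(confusion),
        urea_mmol_per_l > 7,                     # 7 mmol/L, about 19 mg/dL as listed above
        respiratory_rate >= 30,
        systolic_bp < 90 or diastolic_bp < 60,   # thresholds as listed above
        age >= 65,
    ]
    return sum(criteria)


# Hypothetical patient: 72 years old, alert, urea 9 mmol/L, RR 24, BP 110/70.
print(curb65_score(False, 9.0, 24, 110, 70, 72))
```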

1068 patients were analyzed. 821 (77%) were in the derivation cohort. 86% of patients received IV antibiotics, 5% were admitted to the ICU, and 4% were intubated. 30-day mortality was 9%. 9 of 11 clinical features examined in univariate analysis were statistically significant (see Table 2).

Ultimately, using the above-described CURB-65 model, in which 1 point is assigned for each clinical characteristic, patients with a CURB-65 score of 0 or 1 had 1.5% mortality, patients with a score of 2 had 9.2% mortality, and patients with a score of 3 or more had 22% mortality. Similar values were demonstrated in the validation cohort. Table 5 summarizes the sensitivity, specificity, PPVs, and NPVs of each CURB-65 score for 30-day mortality in both cohorts. As we would expect from a good predictive model, the sensitivity starts out very high and decreases with increasing score, while the specificity starts out very low and increases with increasing score. For the clinical application of their model, the authors selected the cut points of 1, 2, and 3 (see Figure 2).

CURB-65 is a simple 5-variable decision aid that is helpful in the initial stratification of mortality risk in patients with CAP.

The wide range of specificities and sensitivities at different values of the CURB-65 score makes it a robust tool for risk stratification. The authors felt that patients with a score of 0-1 were “likely suitable for home treatment,” patients with a score of 2 should have “hospital-supervised treatment,” and patients with a score of ≥ 3 had “severe pneumonia” and should be admitted (with consideration of ICU admission if score of 4 or 5).
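The derivation-cohort mortality figures and the authors' suggested settings above amount to a three-band lookup. The sketch below encodes only the numbers and wording quoted in this summary; the names are illustrative, and it is not a substitute for clinical judgment.

```python
# (scores in band, derivation-cohort 30-day mortality, authors' suggestion)
CURB65_BANDS = [
    (range(0, 2), 0.015, "likely suitable for home treatment"),
    (range(2, 3), 0.092, "hospital-supervised treatment"),
    (range(3, 6), 0.22, "severe pneumonia; admit, consider ICU if score is 4 or 5"),
]


def curb65_band(score: int):
    """Return (mortality, suggestion) for a CURB-65 score from 0 to 5."""
    for scores, mortality, suggestion in CURB65_BANDS:
        if score in scores:
            return mortality, suggestion
    raise ValueError("CURB-65 score must be an integer from 0 to 5")


for s in range(6):
    mortality, suggestion = curb65_band(s)
    print(f"score {s}: {mortality:.1%} mortality, {suggestion}")
```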

Following the publication of the CURB-65 Score, the author of the Pneumonia Severity Index (PSI) published a prospective cohort study of CAP that examined the discriminatory power (area under the receiver operating characteristic curve) of the PSI vs. CURB-65. His study found that the PSI “has a higher discriminatory power for short-term mortality, defines a greater proportion of patients at low risk, and is slightly more accurate in identifying patients at low risk” than the CURB-65 score.

Expert opinion at UpToDate prefers the PSI over the CURB-65 score based on its more robust base of confirmatory evidence. Of note, the author of the PSI is one of the authors of the relevant UpToDate article. In an important contrast from the CURB-65 authors, these experts suggest that patients with a CURB-65 score of 0 be managed as outpatients, while patients with a score of 1 and above “should generally be admitted.”

Further Reading/References:
1. Original publication of the PSI, NEJM (1997)
2. PSI vs. CURB-65 (2005)
3. Wiki Journal Club
4. 2 Minute Medicine
5. UpToDate, “CAP in adults: assessing severity and determining the appropriate level of care”

Summary by Duncan F. Moore, MD

Week 10 – MELD

“A Model to Predict Survival in Patients With End-Stage Liver Disease”

Hepatology. 2001 Feb;33(2):464-70. [free full text]

Prior to the adoption of the Model for End-Stage Liver Disease (MELD) score for the allocation of liver transplants, determination of medical urgency was dependent on the Child-Pugh score. The Child-Pugh score was limited by the inclusion of two subjective variables (severity of ascites and severity of encephalopathy), limited discriminatory ability, and a ceiling effect of laboratory abnormalities. Stakeholders sought an objective, continuous, generalizable index that more accurately and reliably represented disease severity. The MELD score had originally been developed in 2000 to estimate the survival of patients undergoing TIPS. The authors of this 2001 study hypothesized that the MELD score would accurately estimate short-term survival in a wide range of severities and etiologies of liver dysfunction and thus serve as a suitable replacement measure for the Child-Pugh score in the determination of medical urgency in transplant allocation.

This study reported a series of retrospective validation cohorts for the use of MELD in prediction of mortality in advanced liver disease.



  1. cirrhotic inpatients, Mayo Clinic, 1994-1999, n = 282 (see exclusion criteria)
  2. ambulatory patients with noncholestatic cirrhosis, newly-diagnosed, single-center in Italy, 1981-1984, n = 491 consecutive patients
  3. ambulatory patients with primary biliary cirrhosis, Mayo Clinic, 1973-1984, n = 326 (92 lacked all necessary variables for calculation of MELD)
  4. cirrhotic patients, Mayo Clinic, 1984-1988, n = 1179 patients with sufficient follow-up (≥ 3 months) and laboratory data

Index MELD score was calculated for each patient. Death during follow-up was assessed by chart review.

MELD score = 3.8*ln([bilirubin]) + 11.2*ln(INR) + 9.6*ln([Cr]) + 6.4*(etiology: 0 if cholestatic or alcoholic, 1 otherwise)
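The formula above can be evaluated directly. This is a minimal sketch of the formula as written in this summary, assuming bilirubin and creatinine are in mg/dL (units are not stated above); the function name is hypothetical, and later modifications to the score are not applied.

```python
import math


def meld_2001(bilirubin_mg_dl: float, inr: float, creatinine_mg_dl: float,
              cholestatic_or_alcoholic: bool) -> float:
    """MELD score per the formula above (natural logarithms)."""
    if min(bilirubin_mg_dl, inr, creatinine_mg_dl) <= 0:
        raise ValueError("laboratory values must be positive")
    etiology = 0 if cholestatic_or_alcoholic else 1
    return (3.8 * math.log(bilirubin_mg_dl)
            + 11.2 * math.log(inr)
            + 9.6 * math.log(creatinine_mg_dl)
            + 6.4 * etiology)


# Hypothetical patient: bilirubin 3.0 mg/dL, INR 1.8, creatinine 1.5 mg/dL, viral etiology.
print(round(meld_2001(3.0, 1.8, 1.5, cholestatic_or_alcoholic=False), 1))
```

Because every laboratory term is a logarithm, a value of 1.0 contributes nothing, and values below 1.0 lower the score.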

The primary study outcome was the concordance c-statistic between MELD score and 3-month survival. The c-statistic is equivalent to the area under the receiver operating characteristic curve (AUROC). Per the authors, “a c-statistic between 0.8 and 0.9 indicates excellent diagnostic accuracy and a c-statistic greater than 0.7 is generally considered as a useful test.” (See page 455 for further explanation.)
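For a binary outcome such as death within 3 months, the c-statistic can be read as the probability that a randomly chosen patient who died had a higher score than a randomly chosen patient who survived. The sketch below computes it by brute force over all such pairs. The function name and the six example patients are hypothetical, and the sketch ignores censoring, which a real survival analysis must handle.

```python
from itertools import product


def c_statistic(scores, died):
    """Share of (died, survived) pairs in which the patient who died
    had the higher score; tied scores count as half."""
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    concordant = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(dead, alive)
    )
    return concordant / (len(dead) * len(alive))


# Hypothetical scores and 3-month outcomes for six patients.
example_scores = [6, 10, 14, 18, 22, 30]
example_died = [False, False, True, False, True, True]
print(round(c_statistic(example_scores, example_died), 2))
```

A value of 0.5 means the score ranks patients no better than chance, and 1.0 means every patient who died scored higher than every survivor.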

There was no reliable comparison statistic (e.g. c-statistic of MELD vs. Child-Pugh in all groups).



  • hospitalized Mayo patients (late 1990s): c-statistic for prediction of 3-month survival = 0.87 (95% CI 0.82-0.92)
  • ambulatory, non-cholestatic Italian patients: c-statistic for 3-month survival = 0.80 (95% CI 0.69-0.90)
  • ambulatory PBC patients at Mayo: c-statistic for 3-month survival = 0.87 (95% CI 0.83-0.99)
  • cirrhotic patients at Mayo (1980s): c-statistic for 3-month survival = 0.78 (95% CI 0.74-0.81)


  • There was minimal improvement in the c-statistics for 3-month survival with the individual addition of SBP, variceal bleed, ascites, and encephalopathy to the MELD score (see Table 4, highest increase in c-statistic was 0.03).
  • When the etiology of liver disease was excluded from the MELD score, there was minimal change in the c-statistics (see Table 5, all paired CIs overlap).
  • C-statistics for 1-week mortality ranged from 0.80 to 0.95.

The MELD score is an excellent predictor of short-term mortality in patients with end-stage liver disease of diverse etiology and severity.

Despite the retrospective nature of this study, this study represented a significant improvement upon the Child-Pugh score in determining medical urgency in patients who require liver transplant.

In 2002, the United Network for Organ Sharing (UNOS) adopted a modified version of the MELD score for the prioritization of deceased-donor liver transplants in cirrhosis.

Concurrent with the 2001 publication of this study, Wiesner et al. performed a prospective validation of the use of MELD in the allocation of liver transplantation. When published in 2003, it demonstrated that MELD score accurately predicted 3-month mortality among patients with chronic liver disease on the waitlist.

The MELD score has also been validated in other conditions such as alcoholic hepatitis, hepatorenal syndrome, and acute liver failure (see UpToDate).

Subsequent additions to the MELD score have come out over the years. In 2006, the MELD Exception Guidelines offered extra points for severe comorbidities (e.g. HCC, hepatopulmonary syndrome). In January 2016, the MELDNa score was adopted and is now used for liver transplant prioritization.

References and Further Reading:
1. “A model to predict poor survival in patients undergoing transjugular intrahepatic portosystemic shunts” (2000)
2. MDCalc “MELD Score”
3. Wiesner et al. “Model for end-stage liver disease (MELD) and allocation of donor livers” (2003)
4. Freeman Jr. et al. “MELD exception guidelines” (2006) 
5. 2 Minute Medicine
6. UpToDate “Model for End-stage Liver Disease (MELD)”

Summary by Duncan F. Moore, MD

Week 9 – Bicarbonate supplementation in CKD

“Bicarbonate Supplementation Slows Progression of CKD and Improves Nutritional Status”

J Am Soc Nephrol. 2009 Sep;20(9):2075-84. [free full text]

Metabolic acidosis is a common complication of advanced CKD. Some animal models of CKD have suggested that worsening metabolic acidosis is associated with worsening proteinuria, tubulointerstitial fibrosis, and acceleration of decline of renal function. Short-term human studies have demonstrated that bicarbonate administration reduces protein catabolism and that metabolic acidosis is an independent risk factor for acceleration of decline of renal function. However, until the 2009 study by de Brito-Ashurst et al., there were no long-term studies demonstrating the beneficial effects of oral bicarbonate administration on CKD progression and nutritional status.

Population: CKD patients with CrCl 15-30ml/min and plasma bicarbonate 16-20 mEq/L

Intervention: sodium bicarbonate 600mg PO TID with protocolized uptitration to achieve plasma HCO3 ≥ 23 mEq/L, for 2 years

Comparison: routine care

1) decline in CrCl at 2 years
2) “rapid progression of renal failure” (defined as decline of CrCl > 3 ml/min per year)
3) development of ESRD requiring dialysis

1) change in dietary protein intake
2) change in normalized protein nitrogen appearance (nPNA)
3) change in serum albumin
4) change in mid-arm muscle circumference

134 patients were randomized, and baseline characteristics were similar between the two groups. Serum bicarbonate levels increased significantly in the treatment arm (see Figure 2). At two years, CrCl decline was 1.88 ml/min in the treatment group vs. 5.93 ml/min in the control group (p<0.01); rapid progression of renal failure was noted in 9% of the intervention group vs. 45% of the control group (RR 0.15, 95% CI 0.06–0.40, p<0.0001, NNT = 2.8); and ESRD developed in 6.5% of the intervention group vs. 33% of the control group (RR 0.13, 95% CI 0.04–0.40, p<0.001; NNT = 3.8). Regarding nutritional status: dietary protein intake increased in the treatment group relative to the control group (p<0.007), normalized protein nitrogen appearance decreased in the treatment group and increased in the control group (p<0.002), serum albumin increased in the treatment group but was unchanged in the control group, and mean mid-arm muscle circumference increased by 1.5 cm in the intervention group vs. no change in the control group (p<0.03).

Oral bicarbonate supplementation in CKD patients with metabolic acidosis reduces the rate of CrCl decline and progression to ESRD and improves nutritional status.

Primarily on the basis of this study, the KDIGO 2012 guidelines for the management of CKD recommend oral bicarbonate supplementation to maintain serum bicarbonate within the normal range (23-29 mEq/L).

This is a remarkably cheap and effective intervention. Importantly, the rates of adverse events, particularly worsening hypertension and increasing edema, did not differ between the two groups. Of note, sodium bicarbonate induces much less volume expansion than a comparable sodium load of sodium chloride.

In their discussion, the authors suggest that their results support the hypothesis of Nath et al. (1985) that “compensatory changes [in the setting of metabolic acidosis] such as increased ammonia production and the resultant complement cascade activation in remnant tubules in the declining renal mass [are] injurious to the tubulointerstitium.”

The hypercatabolic state of advanced CKD appears to be mitigated by bicarbonate supplementation. The authors note that “an optimum nutritional status has positive implications on the clinical outcomes of dialysis patients, whereas [protein-energy wasting] is associated with increased morbidity and mortality.”

Limitations of this trial include its open-label design without a placebo control. Also, the applicable population is limited by the study’s exclusion criteria of morbid obesity, overt CHF, and uncontrolled HTN.

Further Reading:
1. Nath et al. “Pathophysiology of chronic tubulo-interstitial disease in rats: Interactions of dietary acid load, ammonia, and complement component-C3” (1985)
2. KDIGO 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease (see page 89)
3. UpToDate

Summary by Duncan F. Moore, MD

Week 7 – FUO

“Fever of Unexplained Origin: Report on 100 Cases”

Medicine (Baltimore). 1961 Feb;40:1-30. [free full text]

In our modern usage, fever of unknown origin (FUO) refers to a persistent unexplained fever despite an adequate medical workup. The most commonly used criteria for this diagnosis stem from the 1961 series by Petersdorf and Beeson.

This study analyzed a prospective cohort of patients evaluated at Yale’s hospital for FUO between 1952 and 1957. The authors’ FUO criteria were: 1) illness of more than three weeks’ duration, 2) fever higher than 101 °F on several occasions, 3) diagnosis uncertain after one week of study in hospital. After 126 cases had been noted, retrospective investigation was undertaken to determine the ultimate etiologies of the fevers. The authors winnowed this group to 100 cases based on availability of follow-up data and the exclusion of cases that “represented combinations of such common entities as urinary tract infection and thrombophlebitis.”

126 cases were reviewed as noted above, and ultimately 100 were selected for analysis. In 93 cases “a reasonably certain diagnosis was eventually possible.” Six of the seven undiagnosed patients ultimately made a full recovery. Underlying etiology (see Table 1 on page 3): infectious 36% (including TB in 11%), neoplastic diseases 19%, collagen disease (e.g. SLE) 13%, pulmonary embolism 3%, benign non-specific pericarditis 2%, sarcoidosis 2%, hypersensitivity reaction 4%, cranial arteritis 2%, periodic disease 5%, miscellaneous disease 4%, factitious fever 3%, no diagnosis made 7%.

Clearly, diagnostic modalities have improved markedly since this 1961 study. However, the core etiologies of infection, malignancy, and connective tissue disease / non-infectious inflammatory disease remain most prominent, while the percentage of patients with no ultimate diagnosis has been increasing (for example, see PMIDs 9413425, 12742800, and 17220753). Modifications to the 1961 criteria have been proposed (e.g. 1 week duration of hospital stay not required if certain diagnostic measures have been performed) and implemented in recent FUO studies. One modern definition of FUO: fever ≥ 38.3 °C, lasting at least 2-3 weeks, with no identified cause after three days of hospital evaluation or three outpatient visits.

Per UpToDate, the following minimum diagnostic workup is recommended in suspected FUO: blood cultures, ESR or CRP, LDH, HIV, RF, heterophile antibody test, CK, ANA, TB testing, SPEP, CT of abdomen and chest.

Further Reading:
1. “Fever of unknown origin (FUO). I. A prospective multicenter study of 167 patients with FUO, using fixed epidemiologic entry criteria. The Netherlands FUO Study Group.” Medicine (Baltimore). 1997 Nov;76(6):392-400.
2. “From prolonged febrile illness to fever of unknown origin: the challenge continues.” Arch Intern Med. 2003 May 12;163(9):1033-41.
3. “A prospective multicenter study on fever of unknown origin: the yield of a structured diagnostic protocol.” Medicine (Baltimore). 2007 Jan;86(1):26-38.
4. UpToDate, “Approach to the Adult with Fever of Unknown Origin”
5. “Robert Petersdorf, 80, Major Force in U.S. Medicine, Dies” The New York Times, 2006

Summary by Duncan F. Moore, MD