Curious Now

Sentiment in Clinical Notes: A Predictor for Length of Stay?

Health & Medicine · Mind & Behavior

Key takeaway

A study found that the sentiment expressed in hospital patients' clinical notes may help predict how long they stay, which could improve hospital efficiency.

Quick Explainer

The study explored whether analyzing the sentiment expressed in clinical admission notes could usefully predict a patient's length of stay (LOS) in the hospital. The key idea was that unstructured narrative notes might contain latent information about disease complexity and diagnostic uncertainty that could complement the structured data typically used in LOS forecasting models. The researchers evaluated several natural language processing models that generate sentiment scores from clinical notes, as well as a large language model prompted to estimate LOS directly. The sentiment-based approaches showed only modest correlations; direct LOS estimation by the language model outperformed them, suggesting it better captured the clinical complexity reflected in the narrative documentation.

Deep Dive

Technical Deep Dive: Sentiment in Clinical Notes as a Predictor for Length of Stay

Overview

This study explores whether sentiment analysis of clinical admission notes can provide useful predictions of patient length of stay (LOS) in the hospital. Unstructured clinical narratives may contain latent information about disease complexity and diagnostic uncertainty that could supplement typical structured data models for LOS forecasting.

Methodology

  • The researchers conducted a retrospective study of 4,503 adult patients admitted with community-acquired pneumonia between 2013 and 2023.
  • They extracted physician-written admission history and physical (H&P) notes and evaluated four natural language processing models (VADER, TextBlob, Longformer, GPT-oss-20B) to generate zero-shot sentiment scores from the notes.
  • They also prompted the GPT-oss-20B model to estimate LOS directly.
  • The model outputs were correlated with the actual observed LOS using linear regression and Pearson correlation coefficients.
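The evaluation step described above can be sketched in plain Python. The sentiment scores and LOS values below are made up for illustration; they are not the study's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative, made-up data: per-note sentiment scores (negative = more
# concerning tone) and the observed length of stay in days.
sentiment = [-0.4, -0.1, 0.2, -0.6, 0.1, -0.3]
los_days = [7, 4, 3, 9, 5, 6]

r = pearson_r(sentiment, los_days)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
```

In the study's setup, each model's sentiment scores would stand in for `sentiment`, and the resulting r and R² are the metrics reported in the results below.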

Results

  • Sentiment models showed statistically significant but weak correlations with actual LOS. The Longformer model achieved the highest variance explained (R^2 = 0.019, i.e., under 2% of the variance in LOS).
  • Direct LOS estimation by the GPT-oss-20B language model outperformed the sentiment-based approaches, demonstrating the strongest correlation with actual hospital duration (r = -0.218, p < 0.001).
  • However, overall model agreement was poor (ICC = 0.059), and computational time varied drastically between models (2.6 seconds per 100 notes for TextBlob vs over 370 seconds for GPT-oss-20B).
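Because the study used simple linear regression with a single predictor, R² is just the square of Pearson's r, so the two reported metrics can be put on the same scale:

```python
import math

# Figures reported in the paper.
longformer_r2 = 0.019   # best sentiment model, variance explained
llm_r = -0.218          # direct LOS estimation by GPT-oss-20B

longformer_abs_r = math.sqrt(longformer_r2)
llm_r2 = llm_r ** 2

print(f"Longformer |r| ~ {longformer_abs_r:.3f}")  # ~ 0.138
print(f"GPT-oss-20B R^2 ~ {llm_r2:.3f}")           # ~ 0.048
```

Even the LLM's "strongest" correlation thus explains under 5% of LOS variance, which keeps the overall predictive power in perspective.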

Interpretation

  • The authors suggest that the limited predictive power of sentiment analysis stems from the objective, non-evaluative nature of typical clinical documentation. Physicians' notes focus on factual observations rather than subjective assessments.
  • In contrast, the direct LOS estimation by the large language model appears to better capture the latent complexity and uncertainty information in the clinical narratives.
  • The authors recommend that future predictive systems integrate computationally efficient NLP models alongside standard structured data to leverage the complementary strengths of both approaches.

Limitations & Uncertainties

  • The study was limited to a single disease (community-acquired pneumonia) at a single institution. The generalizability to other clinical conditions and settings is unclear.
  • The provenance of the open-source GPT-oss-20B model is not specified, and its performance relative to commercial or proprietary LLMs is unknown.
  • The paper does not provide details on the specific prompting approach used to elicit direct LOS estimates from the LLM.

What Comes Next

The authors propose several avenues for future research:

  • Exploring hybrid models that combine sentiment features, direct outcome estimation, and structured clinical data
  • Investigating computationally efficient NLP architectures that can capture latent clinical complexity
  • Validating the findings across a broader range of patient populations and clinical conditions
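The first proposed direction can be sketched as a tiny ordinary least-squares model that combines a note-level sentiment score with one structured feature. Patient age is an assumed, illustrative choice of structured feature, and all values below are made up.

```python
def ols_fit(X, y):
    """Solve (X^T X) beta = X^T y by Gaussian elimination (tiny design matrices)."""
    n, k = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    A = [row[:] + [Xty[a]] for a, row in enumerate(XtX)]
    for col in range(k):                       # forward elimination with pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    beta = [0.0] * k                           # back substitution
    for r in range(k - 1, -1, -1):
        beta[r] = (A[r][k] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta

# Columns: intercept, sentiment score, age (all made-up values).
X = [[1, -0.4, 71], [1, -0.1, 54], [1, 0.2, 49],
     [1, -0.6, 80], [1, 0.1, 62], [1, -0.3, 67]]
y = [7, 4, 3, 9, 5, 6]  # observed LOS in days

beta = ols_fit(X, y)
predicted = sum(b * f for b, f in zip(beta, [1, -0.2, 60]))
print("coefficients:", [round(b, 3) for b in beta])
print("predicted LOS for sentiment=-0.2, age=60:", round(predicted, 1))
```

A production hybrid would of course use many more structured covariates and a regularized or nonlinear learner; the point of the sketch is only that sentiment enters as one additional feature column.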
