Story
Discordance in pleural mesothelioma response classification and modelling of impact on clinical trials
Key takeaway
Radiologists often disagree on whether cancer treatment is working in pleural mesothelioma patients, which could impact the accuracy of clinical trials testing new treatments.
Quick Explainer
This study examined the inconsistency among radiologists in classifying treatment response for pleural mesothelioma patients using the modified RECIST criteria. By modeling the impact of this discordance on clinical trial endpoints, the researchers found that even moderate levels of disagreement substantially reduce the statistical power and precision for evaluating new therapies. The discordance appears to stem from inherent challenges in applying the response criteria to this disease, rather than differences in reader expertise or tumor characteristics. This highlights the need for more reliable methods to assess treatment response in pleural mesothelioma clinical trials, which is crucial for accurately evaluating the efficacy of new therapies for this devastating disease.
Deep Dive
Technical Deep Dive: Discordance in Pleural Mesothelioma Response Classification
Overview
This study investigated the consistency of radiologist interpretation of treatment response in pleural mesothelioma (PM) patients, and modeled the impact of this inconsistency on the statistical power and precision of clinical trial endpoints.
Problem & Context
- Accurate assessment of treatment response is critical for evaluating new therapies in PM clinical trials.
- However, prior research has suggested poor agreement between radiologists in classifying PM response using the modified RECIST (mRECIST) criteria.
- The impact of this discordance on trial outcomes has not been well quantified.
Methodology
- The study took a mixed-methods approach:
  - Retrospective cohort analysis:
    - Obtained CT scans and data from 4 UK centers for chemotherapy-treated PM patients
    - Expert radiologists classified response using mRECIST v1.1, and the discordance rate and inter-rater agreement were calculated
  - In silico modeling:
    - Simulated two-arm clinical trials designed for 80% power, with 95% confidence intervals
    - Varied the mRECIST misclassification rate (a proxy for discordance) from 0% to 100%
    - Measured the impact on power and endpoint precision for 4 trial endpoints: objective response rate (ORR), disease control rate (DCR), progression-free survival (PFS), and overall survival (OS)
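The in silico step can be illustrated with a minimal sketch. It is not the study's simulation code: it covers only a binary endpoint such as ORR, and it assumes that each patient's true responder status is flipped independently with a fixed misclassification probability. The function names, the 20% versus 40% response rates, and the 82 patients per arm are all hypothetical choices made for illustration, so the output is not expected to match the figures reported in the study.

```python
import random
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def observed_responders(n, true_rate, misclass_rate, rng):
    """Count responders as a reader would record them.

    Assumption: each patient's true binary status is flipped
    independently with probability misclass_rate.
    """
    count = 0
    for _ in range(n):
        responder = rng.random() < true_rate
        if rng.random() < misclass_rate:
            responder = not responder
        count += int(responder)
    return count


def simulated_power(n_per_arm, rate_control, rate_treatment, misclass_rate,
                    n_trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated two-arm trials that reach significance."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_trials):
        x_c = observed_responders(n_per_arm, rate_control, misclass_rate, rng)
        x_t = observed_responders(n_per_arm, rate_treatment, misclass_rate, rng)
        if two_proportion_p_value(x_t, n_per_arm, x_c, n_per_arm) < alpha:
            significant += 1
    return significant / n_trials


if __name__ == "__main__":
    # Hypothetical design: 20% vs 40% true ORR, 82 patients per arm
    # (roughly 80% power when classification is perfect).
    for m in (0.0, 0.17, 0.35):
        print(m, simulated_power(82, 0.20, 0.40, m))
```

Running the script prints the estimated power at each misclassification rate. Time-to-event endpoints such as PFS and OS would need a survival model and are left out of this sketch.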
Results
- 172 cases were included in the cohort analysis
- The discordance rate was 35%, with an inter-rater kappa of 0.456
- In silico modeling showed:
  - At 17% misclassification (equivalent to 35% discordance):
    - Power dropped from 80% to 55% for ORR, 53% for DCR, 65% for PFS, and 66% for OS
    - Coverage of the endpoint 95% confidence intervals fell to 88%, 89%, 92%, and 92%, respectively
  - Higher discordance rates led to further reductions in power and precision
- 83% of discordances were attributed to interpretation or measurement differences intrinsic to mRECIST rather than to tumor volume
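For readers unfamiliar with the two agreement statistics reported above, the sketch below shows how a discordance rate and a kappa value can be computed from paired readings. It assumes two readers and Cohen's kappa; the source does not say which kappa variant or how many readers were used, and the response labels in the usage example are invented for illustration, not taken from the cohort.

```python
from collections import Counter


def discordance_rate(reader_a, reader_b):
    """Fraction of cases where the two readers assign different categories."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must classify the same non-empty set of cases")
    disagreements = sum(a != b for a, b in zip(reader_a, reader_b))
    return disagreements / len(reader_a)


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(reader_a)
    observed = 1 - discordance_rate(reader_a, reader_b)
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    # Chance agreement: probability both readers pick the same category
    # if each assigned categories independently at their own marginal rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both readers used a single identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical readings for ten cases (PR = partial response,
    # SD = stable disease, PD = progressive disease).
    a = ["PR", "PR", "SD", "SD", "PD", "PD", "SD", "PR", "SD", "PD"]
    b = ["PR", "SD", "SD", "SD", "PD", "SD", "SD", "PR", "PD", "PD"]
    print(discordance_rate(a, b), cohens_kappa(a, b))
```

Kappa is lower than raw agreement because it discounts the matches two readers would produce by chance alone.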
Interpretation
- Inconsistent response classification is common in pleural mesothelioma, with a 35% discordance rate in this cohort.
- This high level of discordance substantially reduces the statistical power and precision of clinical trial endpoints, which has major implications for evaluating new therapies.
- The discordance appears to be driven by challenges inherent in applying the mRECIST criteria to this disease, rather than differences in reader expertise or tumor characteristics.
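A short derivation may help show why misclassification erodes power. It rests on assumptions that are not stated in the source: a binary endpoint such as ORR, and a symmetric misclassification rate $m$ that is identical in both arms and independent of true status. Here $p$ is the true response rate, $p_{\text{obs}}$ is the rate a reader would record, and the subscripts $T$ and $C$ denote the treatment and control arms.

$$
p_{\text{obs}} = p(1-m) + (1-p)\,m = m + (1-2m)\,p
$$

$$
p_{\text{obs},T} - p_{\text{obs},C} = (1-2m)\,(p_T - p_C)
$$

Under these assumptions, $m = 0.17$ shrinks the observed between-arm difference to $1 - 2(0.17) = 0.66$ of the true difference, while the sample size was chosen to detect the full difference. The study's own simulation may use a different misclassification model, so this is an illustration of the mechanism rather than a reconstruction of its results.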
Limitations & Uncertainties
- The study examined only chemotherapy-treated patients; discordance may differ in trials of other treatment modalities.
- The in silico modeling made simplifying assumptions and did not account for all real-world trial complexities.
- The sample size, while substantial for this rare disease, may not capture the full range of discordance across all PM populations.
What Comes Next
- Efforts to improve the reliability of response assessment in PM trials, such as developing more robust response criteria or using quantitative image analysis.
- Further investigation of the sources of discordance to identify potential mitigations.
- Broader examination of response interpretation challenges across other oncology indications.
