Applying Natural Language Processing to Patient Messages to Identify Depression Concerns in Cancer Patients

Authors: Marieke van Buchem, Anne de Hond, Claudio Fanconi, Vaibhavi Shah, Max Schuessler, Ilse Kant, Ewout W Steyerberg, Tina Hernandez-Boussard

Affiliations: Stanford University and collaborating clinical research institutions, as listed in the publication

Venue: Journal of the American Medical Informatics Association

Abstract

Objective: This study explored and developed tools for early identification of depression concerns among patients with cancer by leveraging patient messages sent through a secure portal.

Materials and Methods: We trained classifiers based on logistic regression (LR), support vector machines (SVMs), and two Bidirectional Encoder Representations from Transformers (BERT) models (original and Reddit-pretrained) on 6,600 patient messages sent to a cancer centre between 2009 and 2022 and annotated by healthcare professionals. Performance was compared using the area under the receiver operating characteristic curve (AUROC), and fairness and explainability were evaluated. We also examined associations with depression diagnosis and treatment.
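
As an illustration of the evaluation described above, the following minimal sketch computes AUROC overall and per subgroup using only the Python standard library. It is not the study's code: the function names `auroc` and `auroc_by_group`, the group labels, and the toy data are all assumptions made for this example, and the gap between the best and worst subgroup is only one of several possible ways to summarise a performance disparity.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive message
    scores higher than a randomly chosen negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately for each subgroup (hypothetical helper)."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auroc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data invented for illustration only; these are not study results.
labels = [1, 0, 1, 0, 1, 0, 1, 0]                   # 1 = annotated as a depression concern
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.7, 0.3, 0.1]   # classifier output per message
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]   # hypothetical subgroup labels

overall = auroc(labels, scores)                      # 0.78125
per_group = auroc_by_group(labels, scores, groups)   # {"A": 1.0, "B": 0.5}
gap = max(per_group.values()) - min(per_group.values())  # 0.5

print(overall, per_group, gap)
```

The pairwise definition is quadratic in the number of messages, which is acceptable for a sketch; a rank-based implementation would be preferable on a dataset of several thousand messages.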

Results: BERT and RedditBERT achieved AUROCs of 0.88 and 0.86, respectively, compared with 0.79 for LR and 0.83 for SVM. BERT showed larger performance differences across sex, race, and ethnicity than RedditBERT. Patients whose messages were classified as concerning were more likely to receive a depression diagnosis, antidepressant prescription, or psycho-oncology referral. Explanations from BERT and RedditBERT differed, with no clear annotator preference.

Discussion: BERT and RedditBERT show potential for identifying depression concerns in patient messages. However, subgroup performance disparities indicate the need for careful bias assessment and responsible deployment in clinical workflows.

Conclusion: This work represents a meaningful methodological step towards early identification of depression concerns in oncology care, with potential to reduce clinical burden and improve patient support.