A LangChain-based pipeline for one-shot synthetic text generation using generative pre-trained transformers in palliative care research

Date
2025-10-15
Authors
Ronan, Isabel
Crowley, Patrice
Rombouts, Eva
Cornally, Nicola
Saab, Mohamad M.
Murphy, David
Tabirca, Sabin
Publisher
Elsevier Inc.
Abstract
Objective: As the world’s population ages, nursing homes are of increasing importance. Caring for a growing number of older adults will require intelligent technologies, and artificial intelligence can be used to enhance palliative care in nursing homes. However, the data needed to train artificially intelligent agents is lacking within this sensitive domain due to privacy issues, which makes it difficult for researchers to develop technological solutions. Large language models, such as ChatGPT, make new text generation methods possible using limited data. In this pilot study, we investigate the use of large language models to generate synthetic data.

Methods: We investigate the feasibility of using GPT-3.5 and GPT-4o models with one-shot prompting to produce synthetic nurse notes that faithfully describe nursing home residents with met or unmet palliative care needs. We use LangChain to create a repeatable pipeline that can be adapted to different use cases. We also compare the performance of both models using a set of qualitative and quantitative evaluations to determine which set of notes is more suitable for subsequent research.

Results: GPT-3.5 performed slightly better than GPT-4o in our qualitative healthcare professional analysis. Quantitative analysis revealed appropriately heterogeneous results across contextual similarity, lexical overlap, sentiment, and readability scores.

Conclusion: Our work is the first investigation of such a generation method in the nursing home palliative care domain. Further refinement and validation of such data is needed to ensure the safe use of our approach.
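The abstract does not reproduce the prompt wording or the exact metric definitions, so the following is only a minimal sketch of the two ideas it names: a one-shot prompt (a single worked example placed in the prompt) and a lexical overlap score. The template text, the function names `build_one_shot_prompt` and `jaccard_overlap`, and the choice of the Jaccard index as the overlap measure are all assumptions made for illustration, not the authors' method.

```python
import re

# Hypothetical one-shot template; the study's real prompt wording is not
# given in the abstract.
ONE_SHOT_TEMPLATE = (
    "You are writing a synthetic nursing home nurse note.\n"
    "Example note:\n{example}\n\n"
    "Write one new note about a different resident whose palliative "
    "care needs are {status}."
)


def build_one_shot_prompt(example_note: str, needs_met: bool) -> str:
    """Fill the template with a single worked example (the 'one shot')."""
    status = "met" if needs_met else "unmet"
    return ONE_SHOT_TEMPLATE.format(example=example_note.strip(), status=status)


def jaccard_overlap(text_a: str, text_b: str) -> float:
    """Lexical overlap as the Jaccard index over lowercase word sets.

    This is an assumed stand-in for the paper's lexical overlap score.
    """
    words_a = set(re.findall(r"[a-z']+", text_a.lower()))
    words_b = set(re.findall(r"[a-z']+", text_b.lower()))
    if not words_a and not words_b:
        return 0.0  # two empty texts: define overlap as zero
    return len(words_a & words_b) / len(words_a | words_b)
```

The string returned by `build_one_shot_prompt` could then be sent to whichever chat model client the pipeline wraps, and `jaccard_overlap` could be applied between the seed note and each generated note to check that outputs are neither copies of the example nor unrelated to it.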
Keywords
Synthetic, Palliative, Dataset, GenAI, LLM, Prompts, LangChain
Citation
Ronan, I., Crowley, P., Rombouts, E., Cornally, N., Saab, M. M., Murphy, D. and Tabirca, S. (2025) 'A LangChain-based pipeline for one-shot synthetic text generation using generative pre-trained transformers in palliative care research', Journal of Biomedical Informatics, 171, 104936 (12pp). https://doi.org/10.1016/j.jbi.2025.104936