UCCNLP@SMM4H’22:Label distribution aware long-tailed learning with post-hoc posterior calibration applied to text classification

Files
2022.smm4h-1.26.pdf (230.18 KB)
Published Version
Date
2022-10
Authors
Trust, Paul
Kadusabe, Provia
Zahran, Ahmed
Minghim, Rosane
Omala, Kizito
Publisher
Association for Computational Linguistics
Abstract
This paper describes our submissions to the Social Media Mining for Health (SMM4H) 2022 workshop shared tasks. We participated in two tasks: (1) classification of adverse drug event (ADE) mentions in English tweets (Task 1a) and (2) classification of self-reported intimate partner violence (IPV) on Twitter (Task 7). We proposed an approach that uses RoBERTa (A Robustly Optimized BERT Pretraining Approach) fine-tuned with a label-distribution-aware margin loss function and post-hoc posterior calibration, so that inference is robust to class imbalance. Compared with the traditional fine-tuning strategy using unweighted cross-entropy loss, we achieved performance increases of 4% on IPV and 1% on ADE.
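The abstract names two techniques without giving formulas, so the following is only a rough sketch of how they might look, not the authors' implementation. It assumes the commonly used label-distribution-aware margin (LDAM) formulation, in which each class margin is proportional to n_j^(-1/4), and it stands in a simple prior-based logit adjustment for the post-hoc calibration step. All function names, the `max_margin`, `scale`, and `tau` parameters, and the class counts are hypothetical.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    # Assumed LDAM-style margins: proportional to n_j^(-1/4), rescaled so
    # that the rarest class receives exactly max_margin.
    raw = [n ** -0.25 for n in class_counts]
    factor = max_margin / max(raw)
    return [r * factor for r in raw]


def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def ldam_loss(logits, label, margins, scale=30.0):
    # Subtract the class margin from the true-class logit only, then apply
    # a scaled cross-entropy. With all margins at zero this reduces to
    # ordinary (scaled) cross-entropy.
    adjusted = list(logits)
    adjusted[label] -= margins[label]
    return -log_softmax([scale * z for z in adjusted])[label]


def calibrated_prediction(logits, class_counts, tau=1.0):
    # Assumed post-hoc step: subtract tau * log(class prior) from each logit
    # before taking the argmax, which shifts decisions toward rare classes.
    total = sum(class_counts)
    adjusted = [z - tau * math.log(n / total)
                for z, n in zip(logits, class_counts)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)


# Hypothetical imbalanced binary task: 900 negatives, 100 positives.
counts = [900, 100]
margins = ldam_margins(counts)
loss = ldam_loss([0.2, 0.1], label=1, margins=margins)
prediction = calibrated_prediction([0.6, 0.4], counts)
```

In this sketch the rare class gets the larger margin, so training pushes its decision boundary further from its examples, while the calibration step only changes inference and needs no retraining. In practice the logits would come from a fine-tuned RoBERTa classifier rather than hand-written lists.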
Keywords
Social Media Mining for Health
Citation
Trust, P., Kadusabe, P., Zahran, A., Minghim, R. and Omala, K. (2022) 'UCCNLP@SMM4H’22:Label distribution aware long-tailed learning with post-hoc posterior calibration applied to text classification', Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, Gyeongju, Republic of Korea, 12-17 October, pp. 90-94. Available at: https://aclanthology.org/2022.smm4h-1.26.pdf (Accessed: 9 November 2022)