On the Evaluations of ChatGPT and Emotion-enhanced Prompting for Mental Health Analysis
Automated mental health analysis shows great potential for enhancing the efficiency and accessibility of mental health care, whereas the recent dominant methods utilized pre-trained language models (PLMs) as the backbone and incorporated emotional information. The latest large language models (LLMs), such as ChatGPT, exhibit dramatic capabilities on diverse natural language processing tasks. However, existing studies on ChatGPT's zero-shot performance for mental health analysis have limitations in inadequate evaluation, utilization of emotional information, and explainability of methods. In this work, we comprehensively evaluate the mental health analysis and emotional reasoning ability of ChatGPT on 11 datasets across 5 tasks, including binary and multi-class mental health condition detection, cause/factor detection of mental health conditions, emotion recognition in conversations, and causal emotion entailment. We empirically analyze the impact of different prompting strategies with emotional cues on ChatGPT's mental health analysis ability and explainability. Experimental results show that ChatGPT outperforms traditional neural network methods but still has a significant gap with advanced task-specific methods. The qualitative analysis shows its potential in explainability compared with advanced black-box methods but also limitations on robustness and inaccurate reasoning. Prompt engineering with emotional cues is found to be effective in improving its performance on mental health analysis but requires the proper way of emotion infusion.
READ FULL TEXT