A Reinforcement Learning-Based Approach for Promoting Mental Health Using Multimodal Emotion Recognition

Authors

  • Amod Pathirana, University of Colombo School of Computing (UCSC)
  • Dumidu Kasun Rajakaruna, University of Colombo School of Computing (UCSC)
  • Dharshana Kasthurirathna, Sri Lanka Institute of Information Technology
  • Ajantha Atukorale, University of Colombo School of Computing (UCSC)
  • Rekha Aththidiye, District Health Board Bay of Plenty
  • Maheshi Yatipansalawa, University of Colombo School of Computing (UCSC)

DOI:

https://doi.org/10.62411/faith.2024-22

Keywords:

Anxiety and depression symptoms, Cognitive behavioral therapy, DASS-21 questionnaire, Multimodal emotion prediction, Reinforcement learning

Abstract

This research aims to enhance mental well-being by addressing symptoms of anxiety and depression through a personalized, culturally specific multimodal emotion prediction system. It employs an emotionally aware Reinforcement Learning (RL) agent to suggest tailored Cognitive Behavioral Therapy (CBT) activities. The study focuses on developing precise, individualized emotion prediction models using facial expressions, vocal tones, and text, and integrates these models with the RL agent for emotionally aware CBT recommendations. The mHealth approach combines deep learning models with RL, achieving accuracies of 72% for facial expressions, 73% for vocal tones, and 86% for text, all fine-tuned for the Sri Lankan context. Validation through real-world use and user feedback consistently demonstrated that each model exceeds 70% accuracy, fulfilling the objective of precise emotion prediction. A weighted algorithm was introduced to refine the emotion prediction experience and personalize forecasts across the three modalities to enhance mental well-being. The RL-enabled agent suggests CBT activities approved by mental health professionals, tailored based on predicted emotions, and delivered through the same mHealth application. The effectiveness of these interventions was assessed using the DASS-21 questionnaire, revealing significant reductions in depression scores (from 21.08 to 13.54) and anxiety scores (from 19.85 to 10.46) in the study group compared to the control group. The study concludes that integrating multimodal emotion prediction models with RL-based CBT suggestions positively impacts mental well-being and contributes to personalized mental health interventions.
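The abstract mentions a weighted algorithm that combines the three modality-level predictions and an RL agent that selects CBT activities from predicted emotions. The full text is not reproduced on this page, so the following is only an illustrative sketch: the emotion labels, the activity names, the idea of seeding fusion weights from the reported per-modality accuracies (0.72, 0.73, 0.86), and the use of a plain tabular Q-learning update are all assumptions, not the paper's actual implementation.

```python
import numpy as np

# Assumed emotion label set and CBT activity set (not taken from the paper).
EMOTIONS = ["happy", "sad", "angry", "neutral"]
ACTIVITIES = ["breathing_exercise", "journaling", "behavioral_activation"]

def fuse_predictions(face_p, voice_p, text_p, weights=(0.72, 0.73, 0.86)):
    """Weighted late fusion of three per-modality emotion distributions.

    Weights are normalized to sum to 1, so the fused vector is a convex
    combination of the inputs and remains a valid probability distribution.
    Seeding the weights from the reported accuracies is an assumption.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                  # normalize to sum to 1
    stacked = np.vstack([face_p, voice_p, text_p])   # shape (3, n_emotions)
    fused = w @ stacked                              # weighted average
    return EMOTIONS[int(np.argmax(fused))], fused

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step over (emotion state, CBT activity) pairs:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Reward could come from user feedback or DASS-21 score changes."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```

For example, if the text model strongly predicts "sad" while the face and voice models lean "happy", the (larger) text weight can tip the fused prediction toward "sad", after which the agent picks the highest-Q activity for that state and updates Q from the observed feedback.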


Author Biographies

Amod Pathirana, University of Colombo School of Computing (UCSC)

University of Colombo School of Computing (UCSC), Colombo, Sri Lanka

Dumidu Kasun Rajakaruna, University of Colombo School of Computing (UCSC)

University of Colombo School of Computing (UCSC), Colombo, Sri Lanka

Dharshana Kasthurirathna, Sri Lanka Institute of Information Technology

Sri Lanka Institute of Information Technology, Sri Lanka

Ajantha Atukorale, University of Colombo School of Computing (UCSC)

University of Colombo School of Computing (UCSC), Colombo, Sri Lanka

Rekha Aththidiye, District Health Board Bay of Plenty

District Health Board Bay of Plenty, New Zealand

Maheshi Yatipansalawa, University of Colombo School of Computing (UCSC)

University of Colombo School of Computing (UCSC), Colombo, Sri Lanka


Published

2024-09-17

How to Cite

[1] A. Pathirana, D. K. Rajakaruna, D. Kasthurirathna, A. Atukorale, R. Aththidiye, and M. Yatipansalawa, "A Reinforcement Learning-Based Approach for Promoting Mental Health Using Multimodal Emotion Recognition," J. Fut. Artif. Intell. Tech., vol. 1, no. 2, pp. 124–142, Sep. 2024.
