
direction. The coherence scores for the unigram, bigram, and
trigram models are positive, so the direction is upward.
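To make the three n-gram settings concrete, the following is a minimal sketch of how unigram, bigram, and trigram tokens could be derived from a preprocessed post before topic modeling. The helper name `ngrams`, the underscore joining convention, and the sample tokens are assumptions made for illustration; they are not taken from the study's pipeline or dataset.

```python
from typing import List


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous n-grams of a token list, joined with underscores."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the tokens; a list shorter than n yields [].
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical, already-preprocessed post (not from the study's data).
tokens = ["feel", "anxious", "every", "day"]

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(unigrams)
print(bigrams)
print(trigrams)
```

Each resulting token list can then serve as the vocabulary for a separate topic model, which is one way the three n-gram variants could be compared.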
7. Conclusion
Mental health disorders affect individuals worldwide. Addressing
them before they drive a person to suicide is of utmost
importance, especially because talking about such problems and
consulting a therapist are still considered taboo. With the
advent of social media, it is easy for people to share their
issues with others, be they friends, healthcare professionals,
or experts. In this work, we proposed a novel topic modeling
technique to find the inherent groups in mental health disorder
datasets. The aim of the study is to identify the word usage of
users on online social media platforms in order to understand
how individuals perceive and share their experiences of mental
health disorders. We applied a topic modeling technique,
Latent Dirichlet Allocation (LDA), to understand these
disorders. Combining NLP techniques with social media texts can
therefore help researchers gain better insights into the
problems of individuals who cannot share them with their
healthcare professionals.
8. Limitations and Future Work
This work has several limitations that we can address in the
future. Some of these are:
• For evaluation, we have considered only the CV coherence
score, although other variants of the coherence score are
available.
• We have presented only the top words in each topic. We have
not analyzed which subreddits fall into which class.
• We have compared only different n-gram models using LDA.
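As an illustration of what another coherence variant might look like, the sketch below computes the UMass coherence of a single topic from document co-occurrence counts. This is not the CV measure used in this work. The function name `umass_coherence`, the toy documents, and the example topics are hypothetical and serve only to show the calculation.

```python
import math
from typing import List, Sequence, Set


def umass_coherence(top_words: Sequence[str], documents: List[Set[str]]) -> float:
    """UMass coherence of one topic.

    top_words must be ordered from most to least probable. For every pair
    (w_i, w_j) with i < j, the score adds
    log((D(w_i, w_j) + 1) / D(w_i)), where D counts the documents that
    contain all of the given words.
    """
    def doc_freq(*words: str) -> int:
        return sum(1 for doc in documents if all(w in doc for w in words))

    score = 0.0
    for j in range(1, len(top_words)):
        for i in range(j):
            d_i = doc_freq(top_words[i])
            if d_i == 0:
                raise ValueError(f"'{top_words[i]}' does not occur in any document")
            score += math.log((doc_freq(top_words[i], top_words[j]) + 1) / d_i)
    return score


# Hypothetical toy corpus: each document is the set of its tokens.
documents = [
    {"anxiety", "panic", "attack"},
    {"anxiety", "panic"},
    {"sleep", "tired"},
    {"anxiety", "sleep"},
]

coherent = umass_coherence(["anxiety", "panic"], documents)
incoherent = umass_coherence(["anxiety", "tired"], documents)
print(coherent, incoherent)
```

In this toy corpus the pair that co-occurs in documents scores higher than the pair that never co-occurs, which is the behavior a coherence measure is expected to show.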
In the future, we can also focus on improving this work. Some
possible directions are:
• We have worked with only one coherence score, CV. We are
working on other available variants of the coherence score.
• We have focused only on the Reddit platform. This work can
also be performed on other social media platforms such as
Twitter, where users are vocal about their issues and
problems. It can also be implemented with deep learning
models to classify the different subreddits into the given
seven classes.
• We can also compare this work with other topic modeling
techniques such as Latent Semantic Indexing (LSI).
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the Creative
Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US
Engineering World
DOI:10.37394/232025.2022.4.11