Detection cyberbullying using AI and sentiment analysis to examine psychological impacts on vulnerable groups

dc.contributor.author: Fashakh, Abdulnaser M.
dc.contributor.author: Çevik, Mesut
dc.contributor.author: Kocakoyun Aydoğan, Şenay
dc.contributor.author: İbrahim, Abdullahi Abdu
dc.date.accessioned: 2025-12-18T12:18:33Z
dc.date.available: 2025-12-18T12:18:33Z
dc.date.issued: 2025
dc.department: Vocational School, Gedik Vocational School, Information Security Technology Program
dc.description.abstract: This study assesses the effectiveness of machine learning and deep learning models in detecting cyberbullying and evaluating its psychological impact on vulnerable groups using textual and emotional features. The models assessed include traditional classifiers (Logistic Regression, Decision Tree, and Random Forest) and deep learning models (MLP, CNN, RNN, and LSTM networks). TF-IDF was used for text vectorization and TextBlob for sentiment analysis. Despite TF-IDF's shortcomings, its simplicity enabled quick prototyping and insightful results. The dataset contained 58,000 tweets, with 46,000 obtained from Kaggle and 12,000 collected via the Twitter API. Tweets were labeled by cyberbullying_type (gender, age, religion, and ethnicity) and by subcategory: gender (male, female, LGBT, other), age (adult, teenager, other), religion (Muslim, Christian, Jewish, other), and ethnicity (ethical, unethical, other). Keyword-based classification was used for subcategory assignment. The emotional score derived from the text served as a proxy for psychological impact. We emphasize that this study is observational and does not rely on clinical psychological evaluation. Results showed that female and LGBT users experienced the highest levels of cyberbullying among the gender subcategories. Teenagers were the most affected by age-based bullying. Unethical content dominated ethnicity-based attacks, and Muslims faced the highest frequency of cyberbullying and negative sentiment among the religion subcategories. Sentiment analysis helped identify emotional patterns associated with online abuse. Among all models, RNN and LSTM achieved the highest accuracy (0.98). Among the traditional models, Random Forest performed best, while Logistic Regression performed worst. Including sentiment features significantly improved classification accuracy, particularly for LSTM. A multi-output LSTM model was created to predict cyberbullying_type, sub_category, and sentiment simultaneously, providing an end-to-end detection system. This framework enables proactive monitoring of online harm and supports timely interventions.
dc.identifier.doi: 10.1016/j.eij.2025.100856
dc.identifier.issn: 1110-8665
dc.identifier.scopus: 2-s2.0-105024488186
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1016/j.eij.2025.100856
dc.identifier.uri: https://hdl.handle.net/11501/2556
dc.identifier.volume: 32
dc.indekslendigikaynak: Scopus
dc.institutionauthor: Kocakoyun Aydoğan, Şenay
dc.institutionauthorid: 0000-0002-3405-6497
dc.language.iso: en
dc.publisher: Elsevier B.V.
dc.relation.ispartof: Egyptian Informatics Journal
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Cyberbullying
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Psychological Impact
dc.subject: Sentiment Analysis
dc.subject: Text Classification
dc.subject: Vulnerable Groups
dc.title: Detection cyberbullying using AI and sentiment analysis to examine psychological impacts on vulnerable groups
dc.type: Article
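
The abstract states that subcategories were assigned by keyword-based classification but does not list the keywords or the matching rule. The following is a minimal sketch of one way such an assignment could work, not the authors' implementation. The keyword sets, the `SUBCATEGORY_KEYWORDS` table, and the `assign_subcategory` function are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword lists (assumption): the record does not publish
# the actual keywords used in the study.
SUBCATEGORY_KEYWORDS = {
    "gender": {
        "female": {"woman", "women", "girl", "girls"},
        "male": {"man", "men", "boy", "boys"},
    },
    "age": {
        "teenager": {"teen", "teens", "teenager", "school"},
        "adult": {"adult", "adults"},
    },
}


def assign_subcategory(text, bullying_type):
    """Return the subcategory whose keywords match the most tokens.

    Falls back to "other" when nothing matches or when the
    cyberbullying type has no keyword table. On a tie, the
    subcategory listed first in the table wins.
    """
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    best_label, best_hits = "other", 0
    for label, keywords in SUBCATEGORY_KEYWORDS.get(bullying_type, {}).items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label


if __name__ == "__main__":
    print(assign_subcategory("Those girls are so dumb", "gender"))
    print(assign_subcategory("nothing relevant here", "gender"))
```

A rule of this kind is transparent and fast to prototype, but it cannot handle negation, sarcasm, or keywords that appear in non-abusive contexts, so its labels are best treated as noisy.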
