Detection cyberbullying using AI and sentiment analysis to examine psychological impacts on vulnerable groups
Abstract
This study assesses the effectiveness of machine learning and deep learning models in detecting cyberbullying and evaluating its psychological impact on vulnerable groups using textual and emotional features. The models assessed include traditional classifiers (Logistic Regression, Decision Tree, and Random Forest) and deep learning models (MLP, CNN, RNN, and LSTM networks). TF-IDF was used for text vectorization and TextBlob for sentiment analysis. Despite TF-IDF's shortcomings, its simplicity enabled quick prototyping and insightful results. The dataset contained 58,000 tweets, with 46,000 obtained from Kaggle and 12,000 collected via the Twitter API. Tweets were labeled by cyberbullying_type (gender, age, religion, and ethnicity) and by subcategory: gender (male, female, LGBT, other), age (adult, teenager, other), religion (Muslim, Christian, Jewish, other), and ethnicity (ethical, unethical, other). Keyword-based classification was used for subcategory assignment. The emotional score derived from the text served as a proxy for psychological impact. We emphasize that this study is observational and does not rely on clinical psychological evaluation. Results showed that female and LGBT users experienced the highest levels of cyberbullying among the gender subcategories. Teenagers were the most affected by age-based bullying. Unethical content dominated ethnicity-based attacks, and Muslims faced the highest frequency of cyberbullying and negative sentiment among the religion subcategories. Sentiment analysis helped identify emotional patterns in online abuse. Among all models, RNN and LSTM achieved the highest accuracy (0.98). Among the traditional models, Random Forest performed best, while Logistic Regression performed worst. Including sentiment features significantly improved classification accuracy, particularly for the LSTM.
A multi-output LSTM model was created to predict cyberbullying_type, sub_category, and sentiment simultaneously, providing an end-to-end detection system. This framework enables proactive monitoring of online harm and supports timely interventions.
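
The keyword-based subcategory assignment mentioned in the abstract could look roughly like the sketch below. The abstract does not give the study's keyword lists or matching rules, so the `SUBCATEGORY_KEYWORDS` table, the `assign_subcategory` function, and the "most keyword hits wins" rule are all hypothetical and shown for illustration only. The ethnicity type is left out because the abstract gives no hint of how its labels were derived, so any type not in the table falls back to "other".

```python
import re

# Hypothetical keyword lists, for illustration only. The study's actual
# lists are not given in the abstract.
SUBCATEGORY_KEYWORDS = {
    "gender": {
        "female": {"she", "her", "woman", "women", "girl", "girls"},
        "male": {"he", "him", "man", "men", "boy", "boys"},
        "LGBT": {"gay", "lesbian", "trans", "lgbt", "queer"},
    },
    "age": {
        "teenager": {"teen", "teens", "teenager", "school", "kid", "kids"},
        "adult": {"adult", "adults", "grown"},
    },
    "religion": {
        "Muslim": {"muslim", "muslims", "islam", "mosque"},
        "Christian": {"christian", "christians", "church", "bible"},
        "Jewish": {"jewish", "jew", "jews", "synagogue"},
    },
}


def assign_subcategory(text, bullying_type):
    """Return the subcategory whose keywords occur most often in `text`.

    Falls back to "other" when no keyword matches or when the
    cyberbullying type has no keyword table (assumed behavior).
    """
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    keyword_map = SUBCATEGORY_KEYWORDS.get(bullying_type, {})

    # Count keyword hits per candidate subcategory.
    counts = {
        label: sum(token in keywords for token in tokens)
        for label, keywords in keyword_map.items()
    }

    # Ties resolve to the label listed first in the table.
    best = max(counts, key=counts.get, default=None)
    if best is None or counts[best] == 0:
        return "other"
    return best
```

For example, `assign_subcategory("stupid teens at school", "age")` matches two teenager keywords and no adult keywords, so it returns `"teenager"`. A tweet with no keyword hits is assigned `"other"`.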











