A Text Classification Approach for Detecting Cyberbullying Risk on
Twitter Using Support Vector Machine with Naive Bayes and Random
Forest Comparison
Dublin Core
Title
A Text Classification Approach for Detecting Cyberbullying Risk on
Twitter Using Support Vector Machine with Naive Bayes and Random
Forest Comparison
Subject
Cyberbullying Detection, Twitter, Support Vector Machine, TF-IDF, Text Classification, Machine Learning
Description
The rapid growth of social media as a channel for digital interaction has also brought serious challenges, notably the spread of negative
content such as cyberbullying. Cyberbullying is a form of verbal violence committed online that has a significant impact on mental health,
especially among adolescents. This research aims to develop a text classification model that detects cyberbullying risk using the Support Vector
Machine (SVM) algorithm. The data come from a collection of cyberbullying-themed tweets. The research stages include text preprocessing
(normalization, cleaning, tokenization, stopword removal, and stemming), feature extraction using Term Frequency-Inverse Document Frequency
(TF-IDF), division of the data into training and testing sets, and model training using an SVM with a linear kernel. The model was evaluated using
accuracy, precision, recall, and F1-score metrics. The results show that this approach identifies risky comments with reasonable accuracy, with the best
performance obtained using the linear kernel. This research contributes to the development of automated detection systems that foster a safer and
healthier digital ecosystem, and it supports preventive efforts to mitigate online cyberbullying.
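The preprocessing, TF-IDF, and evaluation stages named in the description can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the stopword list, the regular expressions, the particular TF-IDF variant (term count divided by document length, times ln(N/df)), and the function names are all assumptions made for illustration, and stemming and the SVM training step are left out.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for illustration; the article's list is not given here.
STOPWORDS = {"a", "an", "the", "is", "are", "you", "so"}


def preprocess(text):
    """Lowercase, strip URLs and @mentions, tokenize on letters, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In a full pipeline the TF-IDF vectors would be passed to a linear-kernel SVM trained on the training split, and `evaluate` would be applied to its predictions on the testing split.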
Creator
Sri Yarsasi, Angga Iskoko
Source
https://ijiis.org/index.php/IJIIS/article/view/290/173
Publisher
Amikom Purwokerto University
Date
December 2025
Contributor
Fajar Bagus W
Format
PDF
Language
English
Type
Text
Files
Collection
Citation
Sri Yarsasi and Angga Iskoko, “A Text Classification Approach for Detecting Cyberbullying Risk on
Twitter Using Support Vector Machine with Naive Bayes and Random
Forest Comparison,” Repository Horizon University Indonesia, accessed January 1, 2026, https://repository.horizon.ac.id/items/show/9737.