TELKOMNIKA Telecommunication, Computing, Electronics and Control
Natural language processing and machine learning based cyberbullying detection for Bangla and Romanized Bangla texts

Dublin Core

Title

TELKOMNIKA Telecommunication, Computing, Electronics and Control
Natural language processing and machine learning based cyberbullying detection for Bangla and Romanized Bangla texts

Subject

Cyberbullying, Machine learning, Natural language processing, YouTube comments

Description

The popularity of social media has increased tremendously in recent times, and cyberbullying has increased at an alarming rate along with it. Many cyberbullying texts can be found in the comment sections of YouTube videos by well-known Bangladeshi social media personalities. Cyberbullying has the potential to cause severe emotional and psychological distress. Therefore, texts containing cyberbullying should be detected at the earliest stage and prevented from being displayed. In this study, we use natural language processing (NLP) techniques and various machine learning classifiers to present a model for cyberbullying detection in Bangla and Romanized Bangla texts obtained from YouTube video comments. We developed our own datasets using the YouTube application programming interface (API) version 3.0. We collected 5000 Bangla comments and 7000 Romanized Bangla comments from videos of different well-known social media personalities. These two datasets, together with a third dataset of 12000 texts formed by combining the first two, were preprocessed using NLP techniques and then used to train the machine learning classifiers. With an accuracy score of 76%, support vector machine (SVM) outperformed the other classifiers on the first dataset. The highest accuracy scores for the second and third datasets were 84% and 80%, respectively, both achieved by multinomial naive Bayes.
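The abstract names multinomial naive Bayes as the best-performing classifier on two of the three datasets. As a rough illustration of how such a classifier scores a comment, the sketch below implements a bag-of-words multinomial naive Bayes with add-one smoothing using only the Python standard library. It is not the authors' implementation: this record does not state their features, preprocessing steps, or tooling, so the whitespace tokenizer, the class name `MultinomialNB`, the labels, and the four toy Romanized Bangla comments are all assumptions made up for the example. The SVM classifier is not sketched here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real NLP preprocessing)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior: fraction of training documents in this class.
            score = math.log(n_docs / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                # Smoothed log likelihood of the word given the class.
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for illustration; not from the study's datasets.
train_texts = [
    "tumi khub bhalo",   # roughly "you are very good"
    "video ta sundor",   # roughly "the video is nice"
    "tumi ekta boka",    # roughly "you are a fool"
    "tui ekta pagol",    # roughly "you are crazy"
]
train_labels = ["neutral", "neutral", "bullying", "bullying"]

model = MultinomialNB().fit(train_texts, train_labels)
print(model.predict("tui boka"))
print(model.predict("khub sundor video"))
```

With only four training comments the predictions say nothing about real-world accuracy; the point is the scoring rule, where each class receives a log prior plus a smoothed log likelihood per known word and the highest-scoring class wins.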

Creator

Md. Tofael Ahmed, Maqsudur Rahman, Shafayet Nur, Abu Zafor Muhammad Touhidul Islam, Dipankar Das

Source

DOI: 10.12928/TELKOMNIKA.v20i1.18630

Publisher

Universitas Ahmad Dahlan

Date

February 2022

Contributor

Sri Wahyuni

Rights

ISSN: 1693-6930

Relation

http://journal.uad.ac.id/index.php/TELKOMNIKA

Format

PDF

Language

English

Type

Text

Coverage

TELKOMNIKA Telecommunication, Computing, Electronics and Control

Tags

Repository, Repository Horizon University Indonesia, Repository Universitas Horizon Indonesia, Horizon.ac.id, Horizon University Indonesia, Universitas Horizon Indonesia, HorizonU, Repo Horizon

Citation

Md. Tofael Ahmed, Maqsudur Rahman, Shafayet Nur, Abu Zafor Muhammad Touhidul Islam, Dipankar Das, “TELKOMNIKA Telecommunication, Computing, Electronics and Control
Natural language processing and machine learning based cyberbullying detection for Bangla and Romanized Bangla texts,” Repository Horizon University Indonesia, accessed November 21, 2024, https://repository.horizon.ac.id/items/show/4250.