TELKOMNIKA Telecommunication, Computing, Electronics and Control
A hybrid naïve Bayes based on similarity measure to optimize the mixed-data classification

Dublin Core

Title

TELKOMNIKA Telecommunication, Computing, Electronics and Control
A hybrid naïve Bayes based on similarity measure to optimize the mixed-data classification

Subject

CSBS
Mixed data
Multi-classification
Naïve Bayes
Short text
Similarity-based

Description

In this paper, a hybrid method is introduced to improve the
classification performance of naïve Bayes (NB) on mixed datasets and
multi-class problems. The proposed method relies on a similarity measure
that is applied to the portions of the data that NB does not classify
correctly. Because the data contains multi-valued short text with rare
words that limit NB performance, we employ an adapted selective classifier
based on similarities (CSBS) to overcome the NB limitations and include the
rare words in the computation. This is achieved by transforming the formula
from the product of the probabilities of the categorical variable to their
sum, weighted by the numerical variable. The proposed algorithm was
evaluated on card payment transaction data that contains the label of
transactions: the multi-valued short text and the transaction amount. Based on
K-fold cross-validation, the evaluation results confirm that the proposed
method achieves better precision, recall, and F-score than the NB and CSBS
classifiers used separately. In addition, converting the product form to a
sum gives rare words a greater chance to contribute to the text
classification, which is another advantage of the proposed method.
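
The product-to-sum transformation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the function names, the toy data, the omission of class priors, and in particular the amount-based weight (inverse distance to the class mean amount) are assumptions made only for illustration. The sketch shows why a single rare or unseen word zeroes a product of probabilities but not a weighted sum.

```python
from collections import Counter, defaultdict
from math import prod


def fit(samples):
    """Collect per-class word probabilities and the mean amount.

    samples: list of (tokens, amount, label) tuples (hypothetical format).
    """
    word_counts = defaultdict(Counter)
    amounts = defaultdict(list)
    for tokens, amount, label in samples:
        word_counts[label].update(tokens)
        amounts[label].append(amount)
    model = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        model[label] = {
            # Unsmoothed relative frequencies, so unseen words get probability 0.
            "p_word": {w: c / total for w, c in counts.items()},
            "mean_amount": sum(amounts[label]) / len(amounts[label]),
        }
    return model


def product_score(model, tokens, label):
    """NB-style product of word probabilities (class prior omitted)."""
    p_word = model[label]["p_word"]
    return prod(p_word.get(w, 0.0) for w in tokens)


def amount_weight(model, amount, label):
    """Assumed weight: closer to the class mean amount gives a larger weight."""
    return 1.0 / (1.0 + abs(amount - model[label]["mean_amount"]))


def sum_score(model, tokens, amount, label):
    """Sum of word probabilities, weighted by the numerical variable."""
    p_word = model[label]["p_word"]
    return amount_weight(model, amount, label) * sum(p_word.get(w, 0.0) for w in tokens)


def predict(model, tokens, amount):
    """Pick the class with the highest weighted-sum score."""
    return max(model, key=lambda label: sum_score(model, tokens, amount, label))


# Toy card-payment labels (invented for this sketch, not from the paper).
samples = [
    (["carrefour", "market"], 50.0, "groceries"),
    (["carrefour", "city"], 40.0, "groceries"),
    (["shell", "station"], 60.0, "fuel"),
    (["total", "station"], 70.0, "fuel"),
]
model = fit(samples)
query_tokens, query_amount = ["carrefour", "express"], 45.0
predicted = predict(model, query_tokens, query_amount)
```

In this toy case the word "express" never appears in training, so the product score is zero for every class and cannot separate them, while the weighted sum still favors the class that shares "carrefour" and has a similar amount.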

Creator

Fatima El Barakaz, Omar Boutkhoum, Abdelmajid El Moutaouakkil

Source

http://journal.uad.ac.id/index.php/TELKOMNIKA

Date

Sep 16, 2020

Contributor

peri irawan

Format

pdf

Language

English

Type

text

Tags

Repository, Repository Horizon University Indonesia, Repository Universitas Horizon Indonesia, Horizon.ac.id, Horizon University Indonesia, Universitas Horizon Indonesia, HorizonU, Repo Horizon

Citation

Fatima El Barakaz, Omar Boutkhoum, Abdelmajid El Moutaouakkil, “TELKOMNIKA Telecommunication, Computing, Electronics and Control: A hybrid naïve Bayes based on similarity measure to optimize the mixed-data classification,” Repository Horizon University Indonesia, accessed September 20, 2024, https://repository.horizon.ac.id/items/show/3643.