Journal of ICT Research and Applications ITB Bandung Vol. 15 No. 2 2021
A New Term Frequency with Gaussian Technique for Text Classification and Sentiment Analysis
Dublin Core
Title
Journal of ICT Research and Applications ITB Bandung Vol. 15 No. 2 2021
A New Term Frequency with Gaussian Technique for Text Classification and Sentiment Analysis
Subject
customer reviews; clinical note; machine learning; natural language
processing; suicide risk classification; tweets of travelers.
Description
Abstract. This paper proposes a new term frequency with a Gaussian technique (TF-G) to classify the risk of suicide from Thai clinical notes and to perform sentiment analysis on Thai customer reviews and on English tweets by travelers who use US airline services. The study compared TF-G with term weighting techniques used in previous research on Thai text classification: bag-of-words (BoW), term frequency (TF), term frequency-inverse document frequency (TF-IDF), and term frequency-inverse corpus document frequency (TF-ICF). Suicide risk classification and sentiment analysis were performed with decision tree (DT), naïve Bayes (NB), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) classifiers. The experimental results showed that TF-G is appropriate for feature extraction to classify the risk of suicide and to analyze the sentiments of customer reviews and tweets of travelers. As a term weighting technique, TF-G was more accurate than BoW, TF, TF-IDF, and TF-ICF in Thai suicide risk classification, in sentiment analysis of Thai customer reviews for Burger King, Pizza Hut, and Sizzler restaurants, and in sentiment analysis of English tweets of travelers using US airline services.
Creator
Vuttichai Vichianchai & Sumonta Kasemvilas
Source
DOI: 10.5614/itbj.ict.res.appl.2021.15.2.4
Publisher
IRCS-ITB
Date
07 July 2021
Contributor
Sri Wahyuni
Rights
ISSN: 2337-5787
Format
PDF
Language
English
Type
Text
Coverage
Journal of ICT Research and Applications ITB Bandung Vol. 15 No. 2 2021
Citation
Vuttichai Vichianchai & Sumonta Kasemvilas, “Journal of ICT Research and Applications ITB Bandung Vol. 15 No. 2 2021
A New Term Frequency with Gaussian Technique for Text Classification and Sentiment Analysis,” Repository Horizon University Indonesia, accessed November 21, 2024, https://repository.horizon.ac.id/items/show/3420.