Classification of Economic Activities in Indonesia Using IndoBERT Language Model
Dublin Core
Title
Classification of Economic Activities in Indonesia Using IndoBERT Language Model
Subject
Multiclass Classification, IndoBERT, DistilBERT, CatBoost, Economic Activity
Description
Classifying economic activities is vital to understanding, analyzing, and managing complex economic processes in a society or country. It facilitates economic analysis, data collection, policy formulation, and informed decision-making. In Indonesia, economic activities are classified according to the Indonesian Standard Industrial Classification (KBLI). This classification requires in-depth knowledge of KBLI and is still performed manually, which makes it time-consuming. To address this challenge, this paper proposes using IndoBERT, a transformer-based language model pretrained on a large Indonesian corpus, to better capture the contextual meaning of text and thereby improve the accuracy of automatic economic activity classification. Our results show that the fine-tuned IndoBERT-LARGE model achieves the best results, with an F1 score of 96.82% and a balanced accuracy of 96.10%, outperforming other recent methods used for similar tasks, namely CatBoost and DistilBERT.
Creator
Muhammad Rizki Syazal, Evi Yulianti
Source
DOI: http://dx.doi.org/10.21609/jiki.v1
Publisher
Faculty of Computer Science UI
Date
2025-06-26
Contributor
Sri Wahyuni
Rights
ISSN: 2502-9274
Format
PDF
Language
English
Type
Text
Citation
Muhammad Rizki Syazal, Evi Yulianti, “Classification of Economic Activities in Indonesia Using IndoBERT Language Model,” Repository Horizon University Indonesia, accessed January 11, 2026, https://repository.horizon.ac.id/items/show/9885.