TELKOMNIKA Telecommunication, Computing, Electronics and Control
RoBERTa: language modelling in building Indonesian question-answering systems

Dublin Core

Title

TELKOMNIKA Telecommunication, Computing, Electronics and Control
RoBERTa: language modelling in building Indonesian question-answering systems

Subject

ALBERT, ELECTRA, Indonesian QAS, Language modelling, RoBERTa

Description

This research evaluated the performance of three models, A Lite BERT (ALBERT), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), and the Robustly Optimized BERT Pretraining Approach (RoBERTa), to support the development of an Indonesian question-answering system. The evaluation used Indonesian, Malay, and Esperanto. Esperanto served as a point of comparison for Indonesian because it is an international language that does not belong to any person or country, which makes it neutral. Compared to other foreign languages, the structure and construction of Esperanto are relatively simple. The dataset was obtained by crawling Wikipedia for Indonesian and from the Open Super-large Crawled ALMAnaCH coRpus (OSCAR) for Esperanto. The token dictionary used in the tests contained approximately 30,000 subword tokens for both the SentencePiece and byte-level byte pair encoding (ByteLevelBPE) methods. The tests were carried out with learning rates of 1e-5 and 5e-5 for both languages, following the values referenced in the bidirectional encoder representations from transformers (BERT) paper. In the final results of this study, the ALBERT and RoBERTa models in Esperanto produced loss values that were not much different. This showed that the RoBERTa model was the better choice for implementing an Indonesian question-answering system.

Creator

Wiwin Suwarningsih, Raka Aditya Pratama, Fadhil Yusuf Rahadika, Mochamad Havid Albar Purnomo

Source

DOI: 10.12928/TELKOMNIKA.v20i6.24248

Publisher

Universitas Ahmad Dahlan

Date

December 2022

Contributor

Sri Wahyuni

Rights

ISSN: 1693-6930

Relation

http://journal.uad.ac.id/index.php/TELKOMNIKA

Format

PDF

Language

English

Type

Text

Coverage

TELKOMNIKA Telecommunication, Computing, Electronics and Control

Tags

Repository, Repository Horizon University Indonesia, Repository Universitas Horizon Indonesia, Horizon.ac.id, Horizon University Indonesia, Universitas Horizon Indonesia, HorizonU, Repo Horizon

Citation

Wiwin Suwarningsih, Raka Aditya Pratama, Fadhil Yusuf Rahadika, Mochamad Havid Albar Purnomo, “TELKOMNIKA Telecommunication, Computing, Electronics and Control RoBERTa: language modelling in building Indonesian question-answering systems,” Repository Horizon University Indonesia, accessed April 4, 2025, https://repository.horizon.ac.id/items/show/4483.