Indonesian continuous speech recognition optimization with convolution bidirectional long short-term memory architecture

Dublin Core

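Title

Indonesian continuous speech recognition optimization with convolution bidirectional long short-term memory architecture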

Subject

Bidirectional long short-term memory
Continuous speech
Convolution bidirectional long short-term memory
Indonesian speech recognition
Speech recognition

Description

Speech recognition converts voice signals into text, or sequences of words, using algorithms implemented in computer programs. There are several types of speech recognition, including recognition of isolated words, continuous speech, spontaneous speech, and conversational speech. Research on continuous speech recognition, particularly for Indonesian, has used both stochastic methods such as the hidden Markov model (HMM) and deep learning methods. Currently, deep learning approaches are more widely used in speech recognition applications. This research optimizes Indonesian speech recognition by adding convolution layers to the bidirectional long short-term memory (Bi-LSTM) architecture. The goal is to find the architecture that yields the best Indonesian continuous speech recognition results. The dataset used in this research was created by the intelligent systems research group in the Department of Informatics at Universitas Diponegoro. The speakers in this dataset come from five ethnic groups in Indonesia and represent the dialects of their respective groups. The results show that adding a convolution layer to the Bi-LSTM architecture improves speech recognition performance significantly, with an average word error rate (WER) reduction of 15.56% compared with the Bi-LSTM architecture alone.
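
The word error rate (WER) cited above is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal Python sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the Indonesian sample sentences are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper: whitespace tokenization is an assumption, and
    the paper may normalize its transcripts differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of four reference words is dropped.
example_wer = word_error_rate("saya pergi ke pasar", "saya pergi pasar")
```

In this made-up example one of the four reference words is deleted, so the WER is 0.25. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.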

Creator

Sukmawati Nur Endah, Rismiyati, Priyo Sidik Sasongko, Anwar Petrus F. Naiborhu

Source

Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA

Date

Mar 11, 2025

Contributor

PERI IRAWAN

Format

PDF

Language

ENGLISH

Type

TEXT

Files

24994-70207-1-PB.pdf

Collection

VOL. 23, NO.3 2025

Citation

Sukmawati Nur Endah, Rismiyati, Priyo Sidik Sasongko, Anwar Petrus F. Naiborhu, “Indonesian continuous speech recognition optimization with convolution bidirectional long short-term memory architecture,” Repository Horizon University Indonesia, accessed April 26, 2026, https://repository.horizon.ac.id/items/show/10035.
