Speaker Identification Using a Convolutional Neural Network

Dublin Core

Title

Speaker Identification Using a Convolutional Neural Network

Subject

speaker identification, CNN, spectrogram, feature extraction

Description

Speech, a mode of communication between humans and machines, has various applications, including biometric systems for
identifying people who have access to secure systems. Feature extraction is an important factor in achieving high-accuracy
speech recognition. Therefore, we used a spectrogram, which is a pictorial representation of speech in terms of raw features, to
identify speakers. These features were fed into a convolutional neural network (CNN), and a CNN-visual geometry group
(CNN-VGG) architecture was used to recognize the speakers. We used 780 primary data samples from 78 speakers, and each speaker
uttered a number in Bahasa Indonesia. The proposed architecture, CNN-VGG-f, was trained with a learning rate of 0.001, a batch
size of 256, and 100 epochs. The results indicate that this architecture can generate a suitable model for speaker identification. A
spectrogram was used to determine the best features for identifying the speakers. The proposed method exhibited an accuracy
of 98.78%, which is significantly higher than the accuracies of the method involving Mel-frequency cepstral coefficients
(MFCCs; 34.62%) and the combination of MFCCs and deltas (26.92%). Overall, CNN-VGG-f with the spectrogram can
identify 77 speakers from the samples, validating the usefulness of the combination of spectrograms and CNN in speech
recognition applications.
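
As an illustration of the spectrogram representation mentioned in the description, the following is a minimal sketch of a log-magnitude spectrogram computed with a short-time Fourier transform. It is not the authors' implementation: the function name `spectrogram`, the frame length (256), the hop size (128), the Hann window, the 8 kHz sample rate, and the synthetic 440 Hz tone are all assumptions chosen only for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T  # (freq_bins, time_frames)

# A synthetic 1-second, 440 Hz tone stands in for a recorded utterance.
sample_rate = 8000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
spec = spectrogram(tone)
```

The resulting 2-D array can be treated as a single-channel image, which is the kind of input a CNN such as a VGG-style network consumes.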

Creator

Suci Dwijayanti, Alvio Yunita Putri, Bhakti Yudho Suprapto

Publisher

Universitas Sriwijaya

Date

27-02-2022

Contributor

Fajar Bagus W

Format

PDF

Language

Indonesia

Type

Text

Citation

Suci Dwijayanti, Alvio Yunita Putri, Bhakti Yudho Suprapto, “Speaker Identification Using a Convolutional Neural Network,” Repository Horizon University Indonesia, accessed May 30, 2025, https://repository.horizon.ac.id/items/show/9116.