Overview of the progression of state-of-the-art language models
Dublin Core
Title
Overview of the progression of state-of-the-art language models
Subject
Artificial intelligence
BERT model
Generative pre-trained transformer
Machine learning
Question-answering
Description
This review provides a concise overview of key transformer-based language models, including bidirectional encoder representations from transformers (BERT), generative pre-trained transformer 3 (GPT-3), the robustly optimized BERT pretraining approach (RoBERTa), A Lite BERT (ALBERT), the text-to-text transfer transformer (T5), generative pre-trained transformer 4 (GPT-4), and XLNet, which builds on Transformer-XL. These models have significantly advanced natural language processing (NLP) capabilities, each bringing unique contributions to the field. We delve into BERT's bidirectional context understanding, GPT-3's versatility with 175 billion parameters, and RoBERTa's optimization of BERT's pretraining procedure. ALBERT emphasizes parameter efficiency, T5 recasts every task in a unified text-to-text framework, and GPT-4, whose parameter count has not been officially disclosed, excels in multimodal tasks. Safety considerations are highlighted, especially for GPT-4. Additionally, XLNet's permutation-based training achieves bidirectional context understanding within an autoregressive framework. The motivations, advancements, and challenges of these models are explored, offering insights into the evolving landscape of large-scale language models.
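As a brief illustration of the masked, bidirectional prediction that the abstract attributes to BERT, the following minimal Python sketch fills in a masked token using context from both sides. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, neither of which is named in this record:

# Minimal sketch of BERT-style masked-language-model inference.
# Assumes: pip install transformers torch (not specified by the source).
from transformers import pipeline

# The "fill-mask" pipeline predicts the [MASK] token from both the left and
# right context, which is what "bidirectional context understanding" refers to.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))

By contrast, a GPT-style model predicts each token from left context only; XLNet recovers bidirectional context while remaining autoregressive by training over permuted factorization orders.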
Creator
Asmae Briouya, Hasnae Briouya, Ali Choukri
Source
Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA
Date
Jan 15, 2024
Contributor
PERI IRAWAN
Format
PDF
Language
English
Type
Text
Citation
Asmae Briouya, Hasnae Briouya, Ali Choukri, “Overview of the progression of state-of-the-art language models,” Repository Horizon University Indonesia, accessed February 3, 2026, https://repository.horizon.ac.id/items/show/10238.