TELKOMNIKA Telecommunication, Computing, Electronics and Control
A stochastic algorithm for solving the posterior inference problem in topic models
Dublin Core
Title
TELKOMNIKA Telecommunication, Computing, Electronics and Control
A stochastic algorithm for solving the posterior inference problem in topic models
Subject
Latent Dirichlet allocation, Posterior inference, Stochastic optimization, Topic models
Description
Latent Dirichlet allocation (LDA) is an important probabilistic generative model that is widely used in domains such as text mining, information retrieval, and natural language processing. Posterior inference is the key problem in determining the quality of an LDA model, but it is non-deterministic polynomial (NP)-hard and often intractable, especially in the worst case. For individual texts, methods such as variational Bayes (VB), collapsed variational Bayes (CVB), collapsed Gibbs sampling (CGS), and online maximum a posteriori estimation (OPE) have been proposed to avoid solving this problem directly, but apart from the variants of OPE they usually offer no guarantee on the convergence rate or on the quality of the learned model. Building on OPE and combining it with the Bernoulli distribution, we design an algorithm named general online maximum a posteriori estimation using two stochastic bounds (GOPE2) for solving the posterior inference problem in the LDA model, which is itself an NP-hard non-convex optimization problem. Through theoretical analysis and experiments on large datasets, we show that GOPE2 provides an efficient method for learning topic models from big text collections, especially massive/streaming texts, and is more efficient than previous methods.
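The abstract does not spell out the GOPE2 procedure; as a rough illustration of the family of methods it describes, the following sketch performs OPE-style MAP inference of a document's topic proportions with a Frank-Wolfe update, where each iteration draws a Bernoulli variable to pick between the log-likelihood term and the log-prior term as the stochastic objective. The function name, parameters, and exact update rule are assumptions based on the general OPE literature, not the authors' algorithm.

```python
import numpy as np

def gope2_like_infer(doc_counts, beta, alpha=0.01, p=0.5, n_iters=100, seed=0):
    """Sketch of OPE-style MAP inference of topic proportions theta
    for one document, with a Bernoulli(p) choice between the
    log-likelihood and log-prior terms at each iteration.

    doc_counts: word counts of the document, shape (V,)
    beta:       topic-word distributions, shape (K, V)
    """
    rng = np.random.default_rng(seed)
    K = beta.shape[0]
    theta = np.full(K, 1.0 / K)      # start at the simplex center
    picks = np.zeros(2)              # how often each term was selected
    for t in range(1, n_iters + 1):
        # Bernoulli(p): choose the likelihood term (0) or the prior term (1)
        picks[0 if rng.random() < p else 1] += 1
        w1, w2 = picks / t           # weights of the averaged stochastic objective
        # gradient of w1 * log-likelihood + w2 * log-prior at current theta
        word_probs = beta.T @ theta  # P(word | theta), shape (V,)
        grad = w1 * (beta @ (doc_counts / np.maximum(word_probs, 1e-12))) \
             + w2 * (alpha - 1.0) / np.maximum(theta, 1e-12)
        # Frank-Wolfe step: move toward the best simplex vertex
        vertex = np.argmax(grad)
        step = 1.0 / t
        theta *= (1.0 - step)
        theta[vertex] += step
    return theta
```

Because every update is a convex combination with a simplex vertex, `theta` remains a valid probability vector throughout, which is the property that makes Frank-Wolfe-type schemes convenient for this constrained non-convex problem.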
Creator
Hoang Quang Trung, Xuan Bui
Source
DOI: 10.12928/TELKOMNIKA.v20i5.23764
Publisher
Universitas Ahmad Dahlan
Date
October 2022
Contributor
Sri Wahyuni
Rights
ISSN: 1693-6930
Relation
http://journal.uad.ac.id/index.php/TELKOMNIKA
Format
PDF
Language
English
Type
Text
Coverage
TELKOMNIKA Telecommunication, Computing, Electronics and Control
Files
Collection
Citation
Hoang Quang Trung, Xuan Bui, “TELKOMNIKA Telecommunication, Computing, Electronics and Control
A stochastic algorithm for solving the posterior inference problem in topic models,” Repository Horizon University Indonesia, accessed April 3, 2025, https://repository.horizon.ac.id/items/show/4427.