Revisiting probabilistic latent semantic analysis: extensions, challenges and insights

Figuera, Pau; García Bringas, Pablo

dc.contributor.author	Figuera, Pau
dc.contributor.author	García Bringas, Pablo
dc.date.accessioned	2025-05-21T07:34:06Z
dc.date.available	2025-05-21T07:34:06Z
dc.date.issued	2024-01-03
dc.date.updated	2025-05-21T07:34:06Z
dc.description.abstract	This manuscript provides a comprehensive exploration of Probabilistic latent semantic analysis (PLSA), highlighting its strengths, drawbacks, and challenges. The PLSA, originally a tool for information retrieval, provides a probabilistic sense for a table of co-occurrences as a mixture of multinomial distributions spanned over a latent class variable and adjusted with the expectation–maximization algorithm. The distributional assumptions and the iterative nature lead to a rigid model, dividing enthusiasts and detractors. Those drawbacks have led to several reformulations: the extension of the method to normal data distributions and a non-parametric formulation obtained with the help of Non-negative matrix factorization (NMF) techniques. Furthermore, the combination of theoretical studies and programming techniques alleviates the computational problem, thus making the potential of the method explicit: its relation with the Singular value decomposition (SVD), which means that PLSA can be used to satisfactorily support other techniques, such as the construction of Fisher kernels, the probabilistic interpretation of Principal component analysis (PCA), Transfer learning (TL), and the training of neural networks, among others. We also present open questions as a practical and theoretical research window.	en
dc.identifier.citation	Figuera, P., & García Bringas, P. (2024). Revisiting probabilistic latent semantic analysis: extensions, challenges and insights. Technologies, 12(1). Multidisciplinary Digital Publishing Institute (MDPI). https://doi.org/10.3390/TECHNOLOGIES12010005
dc.identifier.doi	10.3390/TECHNOLOGIES12010005
dc.identifier.eissn	2227-7080
dc.identifier.uri	http://hdl.handle.net/20.500.14454/2795
dc.language.iso	eng
dc.publisher	Multidisciplinary Digital Publishing Institute (MDPI)
dc.rights	© 2024 by the authors
dc.subject.other	Nonnegative matrix factorization
dc.subject.other	Probabilistic latent semantic analysis
dc.subject.other	Probabilistic semantic indexing
dc.subject.other	Singular value decomposition
dc.title	Revisiting probabilistic latent semantic analysis: extensions, challenges and insights	en
dc.type	review
dcterms.accessRights	open access
oaire.citation.issue	1
oaire.citation.title	Technologies
oaire.citation.volume	12
oaire.licenseCondition	https://creativecommons.org/licenses/by/4.0/
oaire.version	VoR

Files

Original bundle

Name: figuera_revisiting_2024.pdf
Size: 776.83 KB
Format: Adobe Portable Document Format

Collections

Articles