Clustering validation inference

dc.contributor.author: Figuera, Pau
dc.contributor.author: Cuzzocrea, Alfredo
dc.contributor.author: García Bringas, Pablo
dc.date.accessioned: 2025-03-06T12:03:06Z
dc.date.available: 2025-03-06T12:03:06Z
dc.date.issued: 2024-08
dc.date.updated: 2025-03-06T12:03:06Z
dc.description.abstract: Clustering validation is applied to evaluate the quality of classifications. This step is crucial for unsupervised machine learning. A plethora of methods exist for this purpose; however, a common drawback is that statistical inference is not possible. In this study, we construct a density function for the cluster number. For this purpose, we use smooth techniques. Then, we apply non-negative matrix factorization using the Kullback–Leibler divergence. Employing a unique linearly independent uncorrelated observational variable hypothesis, we construct a sequence by varying the dimension of the span space of the factorization only using analytical techniques. The expectation of the limit of this sequence follows a gamma probability density function. Then, identifying the dimension of the factorization of the space span with clusters, we transform the estimation of the suitable dimension of the factorization into a probabilistic estimate of the number of clusters. This approach is an internal validation method that is suitable for numerical and categorical multivariate data and independent of the clustering technique. Our main achievement is a predictive clustering validation model with graphical abilities. It provides results in terms of credibility, thus making it possible to compare results such as expert judgment on a quantitative basis. (en)
dc.identifier.citation: Figuera, P., Cuzzocrea, A., & García Bringas, P. (2024). Clustering Validation Inference. Mathematics, 12(15). https://doi.org/10.3390/MATH12152349
dc.identifier.doi: 10.3390/MATH12152349
dc.identifier.eissn: 2227-7390
dc.identifier.uri: http://hdl.handle.net/20.500.14454/2470
dc.language.iso: eng
dc.publisher: Multidisciplinary Digital Publishing Institute (MDPI)
dc.rights: © 2024 by the authors
dc.subject.other: Clustering validation
dc.subject.other: Inferential clustering validation
dc.subject.other: Non-negative matrix factorization
dc.subject.other: Trace sequence limit
dc.title: Clustering validation inference (en)
dc.type: journal article
dcterms.accessRights: open access
oaire.citation.issue: 15
oaire.citation.title: Mathematics
oaire.citation.volume: 12
oaire.licenseCondition: https://creativecommons.org/licenses/by/4.0/
oaire.version: VoR
Files
Name: figuera_clustering_2024.pdf
Size: 721.52 KB
Format: Adobe Portable Document Format