Greedy learning of latent tree models for multidimensional clustering
Published in: Machine Learning, 2015-01, Vol. 98 (1-2), p. 301-330
Format: Article
Language: English
Summary: Real-world data are often multifaceted and can be meaningfully clustered in more than one way. There is growing interest in obtaining multiple partitions of data. In previous work we learnt from data a latent tree model (LTM) that contains multiple latent variables (Chen et al. 2012). Each latent variable represents a soft partition of the data, so multiple partitions are obtained. Through model selection, the LTM approach can automatically determine how many partitions there should be, which attributes define each partition, and how many clusters each partition should contain. It has been shown to yield rich and meaningful clustering results.

Our previous algorithm for learning LTMs, EAST, is only efficient enough to handle data sets with dozens of attributes. This paper proposes an algorithm called BI that can deal with data sets with hundreds of attributes. We empirically compare BI with EAST and other, more efficient LTM learning algorithms, and show that BI outperforms its competitors on data sets with hundreds of attributes. In terms of clustering results, BI compares favorably with alternative methods that are not based on LTMs.
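The idea that each latent variable induces its own soft partition can be sketched with a toy model. The example below is a minimal illustration only, not the EAST or BI algorithm: it assumes two independent binary latent variables (a real LTM connects its latent variables in a tree), each with two binary child attributes, and all parameters are hypothetical values chosen for demonstration. The posterior over each latent variable given a data case is that case's soft cluster membership in the corresponding partition.

```python
import numpy as np

# Toy model: latent Y1 governs attributes A1, A2; latent Y2 governs A3, A4.
# All probabilities are hypothetical, for illustration only.
prior_Y1 = np.array([0.6, 0.4])          # P(Y1)
prior_Y2 = np.array([0.5, 0.5])          # P(Y2)

# P(A_i = 1 | Y): one row per latent state, one column per child attribute.
p_A_given_Y1 = np.array([[0.9, 0.8],     # Y1 = 0
                         [0.2, 0.1]])    # Y1 = 1
p_A_given_Y2 = np.array([[0.7, 0.6],     # Y2 = 0
                         [0.1, 0.3]])    # Y2 = 1

def soft_partition(prior, p_a_given_y, attrs):
    """Posterior P(Y | attrs): the soft cluster assignment of one data
    case under the partition defined by one latent variable."""
    attrs = np.asarray(attrs)
    # Bernoulli likelihood of the observed child attributes per latent state.
    lik = np.prod(np.where(attrs == 1, p_a_given_y, 1 - p_a_given_y), axis=1)
    post = prior * lik
    return post / post.sum()

x = [1, 1, 0, 0]                          # one data case over A1..A4
part1 = soft_partition(prior_Y1, p_A_given_Y1, x[:2])
part2 = soft_partition(prior_Y2, p_A_given_Y2, x[2:])
print(part1, part2)  # two different soft partitions of the same case
```

The same data case lands confidently in state 0 of the first partition but in state 1 of the second, showing how one model can cluster the data along two dimensions at once.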
ISSN: 0885-6125, 1573-0565
DOI: 10.1007/s10994-013-5393-0