computational complexity, which severely restricts the usage of subspace clustering. The problem gets even worse with the
increase of the data’s dimensionality. In this paper, we propose to summarize the set of subspace clusters into k representative clusters to alleviate the problem. Typically, subspace clusters can be clustered further into k groups, and the set of representative clusters can be selected from each group. In such a way, only the most representative
subspace clusters will be returned to user. Unfortunately, when the size of the set of representative clusters is specified,
the problem of finding the optimal set is NP-hard. To solve this problem efficiently, we present two approximate methods:
PCoC and HCoC. The greatest advantage of our methods is that we only need a subset of subspace clusters as the input instead
of the complete set of subspace clusters. Precisely, only the clusters in low-dimensional subspaces are computed and assembled
into representative clusters in high-dimensional subspaces. The approximate results can be found in polynomial time. Our performance
study shows both the effectiveness and efficiency of these methods.
- Content Type Journal Article
- Pages 1-9
- DOI 10.1007/s00500-010-0552-8
- Authors
- Guanhua Chen, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Xiuli Ma, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Dongqing Yang, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Shiwei Tang, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Meng Shuai, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Kunqing Xie, Peking University School of Electronics Engineering and Computer Science Beijing 100871 China
- Journal Soft Computing – A Fusion of Foundations, Methodologies and Applications
- Online ISSN 1433-7479
- Print ISSN 1432-7643