Search results
Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology can produce many measurements at once, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions ...
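To illustrate why text data becomes high-dimensional, here is a minimal sketch of building word-frequency vectors, in which every distinct word contributes one dimension. The function name `word_frequency_vectors`, the whitespace tokenization, and the two sample documents are assumptions made for illustration and do not come from the snippet above.

```python
from collections import Counter


def word_frequency_vectors(documents):
    """Return (vocabulary, vectors) with one count vector per document.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # Every distinct word in the corpus becomes one dimension.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Illustrative documents only, not real data.
docs = [
    "clustering finds groups in data",
    "subspace clustering finds groups in subspaces of data",
]
vocabulary, vectors = word_frequency_vectors(docs)
print(len(vocabulary), "dimensions for", len(docs), "documents")
```

Even these two short sentences already need one dimension per distinct word, and most entries of a realistic document vector would be zero.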
The focus of the first release was on subspace clustering and correlation clustering algorithms. [12] Version 0.2 (July 2009) added functionality for time series analysis, in particular distance functions for time series. [13] Version 0.3 (March 2010) extended the choice of anomaly detection algorithms and visualization modules. [14]
SUBCLU is an algorithm for clustering high-dimensional data by Karin Kailing, Hans-Peter Kriegel and Peer Kröger. [1] It is a subspace clustering algorithm that builds on the density-based clustering algorithm DBSCAN. SUBCLU can find clusters in axis-parallel subspaces, and uses a bottom-up, greedy strategy to remain efficient.
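The bottom-up strategy can be sketched as an Apriori-style candidate generation step. This is a hedged illustration, not the authors' implementation: it assumes the monotonicity property that a subspace can contain density-based clusters only if all of its lower-dimensional projections do. The function name `generate_candidates` and the representation of a subspace as a sorted tuple of attribute indices are hypothetical choices. The DBSCAN run that would decide which candidates actually contain clusters is left out.

```python
from itertools import combinations


def generate_candidates(subspaces_k):
    """Build (k+1)-dimensional candidate subspaces from k-dimensional ones.

    Each subspace is a tuple of attribute indices. Two subspaces are joined
    when they agree on their first k-1 attributes; a candidate is kept only
    if every k-dimensional projection of it is in the input set.
    """
    current = {tuple(sorted(s)) for s in subspaces_k}
    candidates = set()
    for a, b in combinations(sorted(current), 2):
        if a[:-1] == b[:-1]:
            candidate = a + (b[-1],)
            k = len(a)
            # Prune: all k-dimensional subsets must have contained clusters.
            if all(sub in current for sub in combinations(candidate, k)):
                candidates.add(candidate)
    return sorted(candidates)


# Assume clusters were found in these 2-D subspaces (illustrative input).
with_clusters = [(0, 1), (0, 2), (1, 2), (1, 3)]
print(generate_candidates(with_clusters))
```

Here `(1, 2, 3)` is pruned because its projection `(2, 3)` is missing from the input, so only `(0, 1, 2)` would need to be examined further.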
This led to new clustering algorithms for high-dimensional data that focus on subspace clustering (where only some attributes are used, and cluster models include the relevant attributes for the cluster) and correlation clustering, which also looks for arbitrarily rotated ("correlated") subspace clusters that can be modeled by giving a correlation ...
Various extensions to the DBSCAN algorithm have been proposed, including methods for parallelization, parameter estimation, and support for uncertain data. The basic idea has been extended to hierarchical clustering by the OPTICS algorithm. DBSCAN is also used as part of subspace clustering algorithms like PreDeCon and SUBCLU.
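For readers unfamiliar with the basic idea that these extensions build on, the following is a minimal sketch of DBSCAN in plain Python. It uses a brute-force neighborhood search and Euclidean distance; the names `region_query`, `dbscan`, `eps`, and `min_pts`, as well as the sample points, are assumptions for illustration rather than any particular library's API.

```python
import math


def region_query(points, index, eps):
    """Indices of all points within eps of points[index] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[index], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Illustrative points: two dense groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The brute-force `region_query` makes this sketch quadratic in the number of points; practical implementations typically use a spatial index instead.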
Different Gaussian model-based clustering methods have been developed with an eye to handling high-dimensional data. These include the pgmm method, [11] which is based on the mixture of factor analyzers model, and the HDclassif method, based on the idea of subspace clustering. [12]
HiSC [7] is a hierarchical subspace clustering (axis-parallel) method based on OPTICS. HiCO [8] is a hierarchical correlation clustering algorithm based on OPTICS. DiSH [9] is an improvement over HiSC that can find more complex hierarchies. FOPTICS [10] is a faster implementation using random projections.
Biclustering, block clustering, [1] [2] co-clustering or two-mode clustering [3] [4] [5] is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix. The term was first introduced by Boris Mirkin [6] to name a technique introduced many years earlier, [6] in 1972, by John A. Hartigan.