文档详情

扔休矣.PDF

发布:2017-06-14约7.76万字共19页下载文档
文本预览下载声明
Polynomial Time Approximation Schemes for Geometric k-Clustering  y z Rafail Ostrovsky Yuval Rabani July 1, 2001 Abstract The Johnson-Lindenstrauss lemma states that n p oints in a high dimensional Hilb ert space can b e emb edded with small distortion of the distances into an O (log n ) dimensional space by applying a random linear transformation. We show that similar (though weaker) prop erties hold for certain random linear transformations over the Hamming cub e. We use these transformations to solve NP- hard clustering problems in the cub e as well as in geometric settings. More sp eci cally, we address the following clustering problem. Given n p oints in a larger set (for example, Rd ) endowed with a distance function (for example, L2 distance), we would like to partition the data set into k disjoint clusters, each with a \cluster center, so as to minimize the sum over all data p oints of the distance b etween the p oint and the center of the cluster containing the p oint. The problem is provably NP-hard in some high dimensional geometric settings, even for k = 2. We give p olynomial d time approximation schemes for this problem in several settings, including the binary cub e f0 ; 1g with Hamming distance, and Rd either with L 1 distance, or with L2 distance, or with the square of L2 distance. In all these settings, the b est p
显示全部
相似文档