J4 ›› 2014, Vol. 41 ›› Issue (3): 123-130. doi: 10.3969/j.issn.1001-2400.2014.03.018

• Original Articles •

New nearest neighbor affinity similarity function based on separation and compactness between samples

LI Juan1,2; WANG Yuping1

  1. School of Computer Science and Technology, Xidian Univ., Xi'an 710071, China;
  2. School of Distance Education, Shaanxi Normal Univ., Xi'an 710062, China
  • Received: 2013-03-13  Online: 2014-06-20  Published: 2014-07-10
  • Contact: LI Juan  E-mail: ally_2004@126.com

Abstract:

Traditional distance and similarity measures do not take into account the influence of an individual sample on the whole sample set. To address this issue, this paper proposes a new similarity improvement strategy for the k-nearest neighbor (KNN) algorithm. First, a new affinity distance function is introduced, which focuses on the separation and compactness between each individual sample and the whole sample set. Second, a new similarity function based on this affinity distance function is proposed and used as the similarity measure in KNN. Third, the proposed affinity similarity function is compared with classical distance and similarity functions through theoretical analysis and through experiments on eighteen numerical UCI (University of California, Irvine) datasets using 5-fold cross-validation. The classification results indicate that the proposed affinity similarity function is not only an effective similarity strategy for classification but, combined with efficient indexing algorithms, can also reduce the classification time for large-scale datasets.
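
The evaluation setup described in the abstract (KNN with a pluggable similarity function, scored by 5-fold cross-validation) can be sketched roughly as follows. The abstract does not give the affinity distance or affinity similarity formulas, so this sketch does not attempt to reproduce them: `placeholder_similarity` is a hypothetical stand-in based on plain Euclidean distance, and the function names, the fold assignment, and the toy data are all assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def placeholder_similarity(a, b):
    """Hypothetical stand-in: 1 / (1 + Euclidean distance).

    This is NOT the paper's affinity similarity function, whose formula
    is not given in the abstract.
    """
    return 1.0 / (1.0 + math.dist(a, b))


def knn_predict(train_X, train_y, query, k, similarity):
    """Label a query by majority vote among its k most similar training samples."""
    ranked = sorted(
        range(len(train_X)),
        key=lambda i: similarity(train_X[i], query),
        reverse=True,  # most similar first
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def five_fold_accuracy(X, y, k=3, similarity=placeholder_similarity, folds=5, seed=0):
    """Mean classification accuracy over a shuffled partition into `folds` folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    accuracies = []
    for f in range(folds):
        test_idx = indices[f::folds]  # every folds-th shuffled index forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k, similarity) == y[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


if __name__ == "__main__":
    # Toy two-class data (assumed for illustration; not one of the UCI datasets).
    rng = random.Random(1)
    X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(25)]
    X += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(25)]
    y = [0] * 25 + [1] * 25
    print(five_fold_accuracy(X, y, k=3))
```

Because the similarity function is passed in as an argument, a different measure could be compared against the classical ones by swapping that argument while keeping the same fold partition.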

Key words: machine learning, nearest neighbors, affinity similarity, separation, compactness