Nearest_Neighbor_(Pattern_Recognition) Nearest_Neighbor_(Pattern_Recognition)

Nearest Neighbor (Pattern Recognition) - Definition and Overview

The nearest neighbor algorithm in pattern recognition is a method for classifying phenomena based upon observable features.

In the algorithm, each feature is assigned a dimension to form a multidimensional feature space. A training set of objects with apriori known class are processed by feature extraction and plotted within the multi-dimensional feature space. The offsets in each dimension are referred to as the feature vector. This is the training or learning stage. Because the engine can be retrained to classify various phenomena, pattern recognition is part of machine learning.

The testing phase begins with phenomena to be classified (the class not being known apriori) and extracts the same set of features. The geometric distance is computed between the new feature vector and each apriori feature vector from the training set. The shortest distance thus computed is to the nearest neighbor. The apriori class of the nearest neighbor is now assigned to the phenomena to be classified.

Obviously, this algorithm will be more computationally intensive as the size of the training set grows. Many optimizations have been given over the years, these generally seek to reduce the number of distances actually computed. Some optimizations involve partitioning the feature space, and only computing distances within specific nearby volumes.

Other variations of the algorithm include the N Nearest Neighbor algorithm where several of the nearest feature vectors are computed, and the classification is made with the highest confidence only if all of the nearest neighbors are of the same class.

The nearest neighbor algorithm is a heuristic algorithm that is not guaranteed to produce a correct result in most cases.

Bayesian pattern recognition involves a similar multi-dimensional feature space, but computes probabilities of the phenomena being in various classes, rather than just distance from known samples. In general, Bayesian classification will produce somewhat better results at a higher computational cost.

See also

Copyright 2009 WordIQ.com - Privacy Policy  :: Terms of Use  :: Contact Us  :: About Us
This article is licensed under the GNU Free Documentation License. It uses material from the this Wikipedia article.