The solutions of practical decision problems are frequently obtained using sets of reference data. For example, an automobile insurance company may estimate the risk of a new customer based on existing records of drivers of similar age, gender, and address. By limiting the scope of the data that is consulted, a more reliable decision can be obtained.
Applying this paradigm to pattern classification yields a family of algorithms that assign class labels to input patterns using a reference sample of correctly classified patterns. For each input pattern, the reference sample is consulted using a weighting function that emphasizes the importance of classified patterns that are close to the input. Popular examples include nearest neighbor and kernel based classifiers. Unlike neural network classifiers, for example, these local classifiers do not explicitly construct a parametric model of the reference data, and are thus called "nonparametric."
In this talk we first describe some classic and recent results of nearest neighbor algorithms (in a probabilistic framework) that illuminate how the accuracies of nonparametric classifiers may depend upon the number of labeled reference patterns. Then we describe work in progress towards a pattern classifier that approximates class decision boundaries using local polynomial models of classified reference data. The development of a local polynomial classifier is motivated by the widely acclaimed success of local polynomial regression in statistics, which we shall also summarize.
Some of the results we obtained in collaboration with Santosh S. Venkatesh and Ron Meir.