Wolfram Language & System Documentation Center

"NearestNeighbors" (Machine Learning Method)

Method for Classify and Predict.
Infers the class or value of a new example by analyzing its nearest neighbors in the feature space.

Details & Suboptions

Nearest neighbors is a type of instance-based learning. In its simplest form, it picks the most common class (for classification) or averages the values (for prediction) among the k nearest neighbors.
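
The simplest form described above can be sketched directly with Nearest. The helper names knnClassify and knnPredict are hypothetical and chosen only for illustration; this is a minimal sketch of the idea, not the implementation used by Classify or Predict:

Wolfram Language code:

(* hypothetical helpers: majority vote and average over the k nearest training examples *)
knnClassify[train_, k_Integer][x_] := First[Commonest[Nearest[train, x, k]]]
knnPredict[train_, k_Integer][x_] := Mean[Nearest[train, x, k]]

Wolfram Language code: knnClassify[{1, 2, 3, 4} -> {1, 1, 2, 2}, 3][1.3]

With k = 3, the neighbors of 1.3 are the examples 1, 2 and 3, whose classes are 1, 1 and 2, so the majority vote gives class 1.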
The following options can be given:

"NeighborsNumber"	Automatic	the number of neighbors to consider (k)
"DistributionSmoothing"	0.5	regularization parameter
"NearestMethod"	Automatic	the method to use for computing the k-nearest examples

Possible settings for "NearestMethod" include:

	"KDtree"	uses a k‐d tree data structure for storing the data
	"Octree"	uses an octree data structure for storing the data
	"Scan"	exhaustive search over the entire dataset

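As a rough illustration of what an exhaustive "Scan" does, the k nearest examples can be found by computing the distance from the query to every stored point. The helper name scanNearest is hypothetical, and Euclidean distance is an assumption made for this sketch; the tree-based settings aim to find the same neighbors without visiting every point:

Wolfram Language code:

(* hypothetical helper: brute-force search over all points, assuming Euclidean distance *)
scanNearest[points_List, x_, k_Integer] := TakeSmallestBy[points, Norm[# - x] &, k]
points = RandomReal[{-5, 5}, {1000, 2}];
(* compare against the built-in Nearest; sorting makes the comparison independent of ordering *)
Sort[scanNearest[points, {0, 0}, 3]] == Sort[Nearest[points, {0, 0}, 3]]
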
Examples


Basic Examples (2)

Train a classifier function on labeled examples:

Wolfram Language code: c = Classify[{1, 2, 3, 4} -> {1, 1, 2, 2}, Method -> "NearestNeighbors"]

Obtain information about the classifier:

Wolfram Language code: Information[c]

Classify a new example:

Wolfram Language code: c[1.3]

Generate some data and visualize it:

Wolfram Language code:

data = Table[x -> x + RandomVariate[NormalDistribution[0, 2]], {x, RandomReal[{-10, 10}, 40]}];
ListPlot[List@@@data]

Train a predictor function on it:

Wolfram Language code: p = Predict[data, Method -> "NearestNeighbors"]

Compare the data with the predicted values and look at the standard deviation:

Wolfram Language code:

Show[
	Plot[{p[x], p[x] + StandardDeviation[p[x, "Distribution"]], p[x] - StandardDeviation[p[x, "Distribution"]]}, 
		{x, -2, 6}, 
		PlotStyle -> {Blue, Gray, Gray}, 
		Filling -> {2 -> {3}}, 
		Exclusions -> False, 
		PerformanceGoal -> "Speed", 
		PlotLegends -> {"Prediction", "Confidence Interval"}], 
	ListPlot[List@@@data, PlotStyle -> Red, PlotLegends -> {"Data"}]]

Options (6)

"DistributionSmoothing" (2)

Train a classifier using the "DistributionSmoothing" suboption:

Wolfram Language code:

Classify[{1.98, 3.83, 1.69, 0.04, 2.48, 1.66} -> {"a", "a", "b", "b", "a", "b"}, Method -> {"NearestNeighbors", "DistributionSmoothing" -> 2}]

Train two classifiers on an imbalanced dataset by varying the value of "DistributionSmoothing":

Wolfram Language code: data = {1 -> True, 2 -> True, 3 -> True, 4 -> True, 5 -> False, 6 -> True};

Wolfram Language code: c1 = Classify[data, Method -> {"NearestNeighbors", "DistributionSmoothing" -> .1}]

Wolfram Language code: c2 = Classify[data, Method -> {"NearestNeighbors", "DistributionSmoothing" -> 10}]

Look at the probabilities for the two classifiers:

Wolfram Language code: c1[5, "Probabilities"]

Wolfram Language code: c2[5, "Probabilities"]
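
One common way a smoothing parameter of this kind can act is additive smoothing of the class counts among the neighbors. The sketch below assumes that form; the helper name smoothedProbabilities and the counts used are hypothetical, and the exact formula used by Classify is not stated here:

Wolfram Language code:

(* hypothetical helper: add s to every class count, then normalize to probabilities *)
smoothedProbabilities[counts_Association, s_] := (counts + s)/Total[counts + s]
(* assumed neighbor counts: 4 examples of True and 1 of False *)
smoothedProbabilities[<|True -> 4, False -> 1|>, #]& /@ {0.1, 10}

Under this assumption, a small value keeps the probabilities close to the raw neighbor frequencies, while a large value pulls them toward a uniform distribution.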

"NearestMethod" (2)

Train a classifier using a specific "NearestMethod":

Wolfram Language code:

Classify[{1.98, 3.83, 1.69, 0.04, 2.48, 1.66} -> {"a", "a", "b", "b", "a", "b"}, Method -> {"NearestNeighbors", "NearestMethod" -> "Scan"}]

Generate a large dataset and visualize it:

Wolfram Language code:

gaussian[μ_, σ_, n_] := RandomVariate[MultinormalDistribution[μ, {{σ, 0}, {0, σ}}], n];
positions = {{4, 2}, {-2, 2}, {0, -3}, {3, 0}};
sizes = {2, 1, 5, 0.5};
colors = {RGBColor[1, 0, 0], RGBColor[0, 0, 1], RGBColor[0, 1, 0], RGBColor[1., 0.77, 0.]};
nums = {10000, 10000, 50000, 20000};

Wolfram Language code:

clusters = MapThread[gaussian, {positions, sizes, nums}];
trainingset = AssociationThread[colors, clusters];
plot = ListPlot[clusters, PlotStyle -> Darker[colors, 0.1], ImageSize -> 200, PlotRange -> {{-5, 5}, {-5, 5}}, Frame -> True, AspectRatio -> 1, PlotLabel -> "data"]

Train several classifiers using the different methods:

Wolfram Language code:

classifiers = Classify[trainingset, Method -> {"NearestNeighbors", "NearestMethod" -> #}]& /@ {"Octree", "KDtree", "Scan"};

Compare the corresponding training times:

Wolfram Language code: Information[#, "TrainingTime"]& /@ classifiers
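
Training time is only part of the cost, since the neighbor search happens when an example is classified. As a sketch, the time each classifier takes on a single query can be measured with AbsoluteTiming; the query point {0, 0} is an arbitrary choice, and the timings depend on the machine:

Wolfram Language code: First[AbsoluteTiming[#[{0, 0}]]]& /@ classifiers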

"NeighborsNumber" (2)

Train a predictor function using a specific "NeighborsNumber":

Wolfram Language code:

Predict[{1.98, 3.83, 1.69, 0.04, 2.48, 1.66} -> {-1.41, -0.71, -0.701, -0.4, -1.91, -1.6}, Method -> {"NearestNeighbors", "NeighborsNumber" -> 2}]

Generate a labeled training set and visualize it:

Wolfram Language code:

trainingset = Table[x -> Sin[4x] + RandomReal[.4], {x, RandomReal[{0, 6}, 30]}];
ListPlot[List@@@trainingset]

Train a predictor using a small "NeighborsNumber":

Wolfram Language code: p2 = Predict[trainingset, Method -> {"NearestNeighbors", "NeighborsNumber" -> 2}];

Train a predictor using a large "NeighborsNumber":

Wolfram Language code: p10 = Predict[trainingset, Method -> {"NearestNeighbors", "NeighborsNumber" -> 10}];

Compare the two predictors:

Wolfram Language code: Plot[{p2[x], p10[x]}, {x, 0, 6}]
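
The visual comparison can be complemented by a numerical one on held-out data. This sketch assumes a test set drawn from the same process as the training set; the name testset is introduced here for illustration:

Wolfram Language code:

(* held-out examples generated the same way as trainingset *)
testset = Table[x -> Sin[4x] + RandomReal[.4], {x, RandomReal[{0, 6}, 200]}];
(* root-mean-square residual of each predictor on the test set *)
PredictorMeasurements[#, testset, "StandardDeviation"]& /@ {p2, p10}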
