Wolfram Language & System Documentation Center

"KernelDensityEstimation" (Machine Learning Method)

Method for LearnDistribution.
Models probability density with a mixture of simple distributions.

Details & Suboptions

"KernelDensityEstimation" is a nonparametric method that models the probability density of a numeric space with a mixture of simple distributions (called kernels) centered around each training example, as in KernelMixtureDistribution.
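The probability density function for a vector $x$ is given by $p(x) = \frac{1}{m} \sum_{i=1}^{m} w_h(x - x_i)$ for a kernel function $w_h$, kernel size $h$ and a number of training examples $m$, where $x_i$ denotes the $i$-th training example.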
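The mixture density described above can be sketched by hand for one-dimensional data. This is a minimal illustration, not the internal implementation: the helper name gaussianKDE and the kernel size 0.5 are assumptions made for the example, and because LearnDistribution preprocesses its input, the values it returns will generally differ from this direct computation.

```wolfram
(* Hypothetical helper, not part of LearnDistribution: average of Gaussian
   kernels of standard deviation kernelSize centered on each training example *)
gaussianKDE[points_List, kernelSize_?Positive][x_?NumericQ] :=
  Mean[PDF[NormalDistribution[#, kernelSize], x] & /@ points]

points = {1.2, 2.1, 3.5, 4.3};
size = 0.5; (* kernel size chosen arbitrarily for illustration *)

(* Density of the hand-built mixture at a new point *)
gaussianKDE[points, size][1.3]

(* For comparison: the built-in kernel mixture with the same bandwidth *)
PDF[KernelMixtureDistribution[points, size, "Gaussian"], 1.3]
```

Each training example contributes one kernel with weight 1/m, so the result is itself a normalized density.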
The following options can be given:

Method	"Fixed"	kernel size method
"KernelSize"	Automatic	size of the kernels when Method -> "Fixed"
"KernelType"	"Gaussian"	type of kernel used
"NeighborsNumber"	Automatic	kernel size expressed as a number of neighbors

Possible settings for "KernelType" include:

"Gaussian"	each kernel is a Gaussian distribution
"Ball"	each kernel is a uniform distribution on a ball

Possible settings for Method include:

"Adaptive"	kernel sizes can differ from each other
"Fixed"	all kernels have the same size

When "KernelType""Gaussian", each kernel is a spherical Gaussian (product of independent normal distributions ), and "KernelSize" h refers to the standard deviation of the normal distribution.
When "KernelType""Ball", each kernel is a uniform distribution inside a sphere, and "KernelSize" refers to the radius of the sphere.
The value of "NeighborsNumber"k is converted into kernel size(s), so that a kernel centered around a training example typically "contains" k other training examples. If "KernelType""Ball", "contains" refers to examples that are inside the ball. If "KernelType""Gaussian", "contains" refers to examples that are inside a ball of radius h where n is the dimension of the data.
When Method"Fixed" and "NeighborsNumber"k, a unique kernel size is found such that training examples contain on average k other examples.
When Method"Adaptive" and "NeighborsNumber"k, each training example adapts its kernel size such that it contains about k other examples.
Because of preprocessing, the "NeighborsNumber" option is typically a more convenient way to control kernel sizes than "KernelSize". When Method -> "Fixed", the value of "KernelSize" supersedes the value of "NeighborsNumber".
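A quick way to observe this precedence is to give both options and inspect what was retained. The option values below are arbitrary choices for illustration, and no particular output is assumed.

```wolfram
iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

(* Both size controls are given; with Method -> "Fixed", "KernelSize" is
   described as taking precedence over "NeighborsNumber" *)
ld = LearnDistribution[iris,
   Method -> {"KernelDensityEstimation", Method -> "Fixed",
     "KernelSize" -> 0.2, "NeighborsNumber" -> 10}];

(* Inspect the option values actually used *)
Information[ld, "MethodOption"]
```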
Information[LearnedDistribution[…],"MethodOption"] can be used to extract the values of options chosen by the automation system.
LearnDistribution[…,FeatureExtractor"Minimal"] can be used to remove most preprocessing and directly access the method.

Examples

Basic Examples (3)

Train a "KernelDensityEstimation" distribution on a numeric dataset:

Wolfram Language code: ld = LearnDistribution[{1.2, 2.1, 3.5, 4.3}, Method -> "KernelDensityEstimation"]

Look at the distribution Information:

Wolfram Language code: Information[ld]

Obtain options information:

Wolfram Language code: Information[ld, "MethodOption"]

Obtain an option value directly:

Wolfram Language code: Information[ld, "KernelType"]

Compute the probability density for a new example:

Wolfram Language code: PDF[ld, 1.3]

Plot the PDF along with the training data:

Wolfram Language code:

Show[Plot[PDF[ld, x], {x, -6, 12}, Filling -> Bottom], NumberLinePlot[{1.2, 2.1, 3.5, 4.3}, Spacings -> 0, PlotStyle -> Red]]

Generate and visualize new samples:

Wolfram Language code: Histogram[RandomVariate[ld, 10000], 50]

Train a "KernelDensityEstimation" distribution on a two-dimensional dataset:

Wolfram Language code: iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

Wolfram Language code: ld = LearnDistribution[iris, Method -> "KernelDensityEstimation"]

Plot the PDF along with the training data:

Wolfram Language code:

Show[ContourPlot[PDF[ld, {x, y}], {x, 4, 8}, {y, 1, 7}, PlotRange -> All, Contours -> 10], ListPlot[iris, PlotStyle -> Red]]

Use SynthesizeMissingValues to impute missing values using the learned distribution:

Wolfram Language code: SynthesizeMissingValues[ld, {5.5, Missing[]}]

Wolfram Language code: Histogram[Table[Last@SynthesizeMissingValues[ld, {5.5, Missing[]}], 1000]]

Train a "KernelDensityEstimation" distribution on a nominal dataset:

Wolfram Language code: ld = LearnDistribution[{"A", "A", "B", "B", "B"}, Method -> "KernelDensityEstimation"]

Because of the necessary preprocessing, the PDF computation is not exact:

Wolfram Language code: PDF[ld, "A"]

Use ComputeUncertainty to obtain the uncertainty on the result:

Wolfram Language code: PDF[ld, "A", ComputeUncertainty -> True]

Increase MaxIterations to improve the estimation precision:

Wolfram Language code: PDF[ld, "A", MaxIterations -> 1000, ComputeUncertainty -> True]

Options (4)

"KernelSize" (1)

Train a kernel mixture distribution with a kernel size of 0.2:

Wolfram Language code: iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

Wolfram Language code: ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "KernelSize" -> 0.2}]

Evaluate the PDF of the distribution at a specific point:

Wolfram Language code: PDF[ld, {6, 4}]

Visualize the PDF obtained after training a kernel mixture distribution with various kernel sizes:

Wolfram Language code:

plots = (
	ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "KernelSize" -> #}, FeatureExtractor -> "Minimal"];
	Show[ContourPlot[PDF[ld, {x, y}], {x, 2, 10}, {y, 0, 8}, PlotRange -> All, Contours -> 10], ListPlot[iris, PlotStyle -> Red], ImageSize -> 250, PlotLabel -> "h = " <> ToString[#]]
	) & /@ {0.05, 0.1, 0.2, 0.35, 0.5, 1};

Wolfram Language code: Grid[Partition[plots, 2]]

"KernelType" (1)

Train a "KernelDensityEstimation" distribution with a "Ball" kernel:

Wolfram Language code: iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

Wolfram Language code: ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "KernelType" -> "Ball"}]

Evaluate the PDF of the distribution at a specific point:

Wolfram Language code: PDF[ld, {6, 4}]

Visualize the PDF obtained after training a kernel mixture distribution with a "Ball" and a "Gaussian" kernel:

Wolfram Language code:

plots = (ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "KernelType" -> #}, FeatureExtractor -> "Minimal"];
	Show[ContourPlot[PDF[ld, {x, y}], {x, 2, 10}, {y, 0, 8}, PlotRange -> All, Contours -> 10, PlotLabel -> #], ListPlot[iris, PlotStyle -> Red], ImageSize -> 250]) &  /@ {"Ball", "Gaussian"};

Wolfram Language code: Row[plots]

Method (1)

Train a "KernelDensityEstimation" distribution with the "Adaptive" method:

Wolfram Language code: iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

Wolfram Language code: ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", Method -> "Adaptive"}]

Evaluate the PDF of the distribution at a specific point:

Wolfram Language code: PDF[ld, {6, 4}]

Visualize the PDF obtained after training a kernel mixture distribution with the "Fixed" and "Adaptive" methods:

Wolfram Language code:

plots = (ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", Method -> #}, FeatureExtractor -> "Minimal"];
	Show[ContourPlot[PDF[ld, {x, y}], {x, 2, 10}, {y, 0, 8}, PlotRange -> All, Contours -> 10, PlotLabel -> #], ListPlot[iris, PlotStyle -> Red], ImageSize -> 250]) & /@ {"Fixed", "Adaptive"};

Wolfram Language code: Row[plots]

"NeighborsNumber" (1)

Train a kernel mixture distribution with a kernel size of about 10 neighbors:

Wolfram Language code: iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];

Wolfram Language code: ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "NeighborsNumber" -> 10}]

Evaluate the PDF of the distribution at a specific point:

Wolfram Language code: PDF[ld, {6, 4}]

Visualize the PDF obtained after training a kernel mixture distribution with various kernel sizes expressed as numbers of neighbors:

Wolfram Language code:

plots = (
	ld = LearnDistribution[iris, Method -> {"KernelDensityEstimation", "NeighborsNumber" -> #}, FeatureExtractor -> "Minimal"];
	Show[ContourPlot[PDF[ld, {x, y}], {x, 2, 10}, {y, 0, 8}, PlotRange -> All, Contours -> 10], ListPlot[iris, PlotStyle -> Red], ImageSize -> 250, PlotLabel -> "k = " <> ToString[#]]
	) & /@ {1, 2, 5, 10, 20, 50};

Wolfram Language code: Grid[Partition[plots, 2]]
