"ContingencyTable" (Machine Learning Method)
- Method for LearnDistribution.
- Use a table to store the probabilities of a nominal vector for each possible outcome.
Details & Suboptions
- A contingency table models the probability distribution of a nominal vector space by storing a probability value for each possible outcome.
- If the data is unidimensional, the distribution corresponds to a categorical distribution.
- The following options can be given:
"AdditiveSmoothing"    Automatic    value to be added to each count
- If the data contains numerical values, they are discretized. The resulting distribution is still a valid distribution in the original space.
- Information[LearnedDistribution[…],"MethodOption"] can be used to extract the values of options chosen by the automation system.
- LearnDistribution[…,FeatureExtractor -> "Minimal"] can be used to remove most preprocessing and directly access the method.
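To make the "value to be added to each count" concrete, the stored probability can be written out under the assumption that "AdditiveSmoothing" follows standard additive (Laplace) smoothing; this page does not state the exact formula. The symbols below are notation introduced here for illustration: c_x is the number of training examples equal to outcome x, N is the total number of examples, K is the number of possible outcomes, and α is the smoothing value.

```latex
P(x) = \frac{c_x + \alpha}{N + \alpha K}
```

Under this assumption, α = 0 gives the plain relative frequency c_x / N, and larger values of α pull every outcome toward the uniform probability 1/K.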
Examples
Basic Examples (3)
Train a contingency-table distribution on a nominal dataset:
ld = LearnDistribution[{"A", "A", "B", "B", "B"}, Method -> "ContingencyTable"]
Look at the distribution Information:
Information[ld]
Information[ld, "MethodOption"]
Obtain an option value directly:
Information[ld, "AdditiveSmoothing"]
Compute the probabilities for the values "A" and "B":
PDF[ld, "A"]
PDF[ld, "B"]
Generate new samples:
RandomVariate[ld, 10]
Train a contingency-table distribution on a numeric dataset:
ld = LearnDistribution[{1.2, 2.1, 3.5, 4.3}, Method -> "ContingencyTable"]
Look at the distribution Information:
Information[ld]
Compute the probability density for a new example:
PDF[ld, 1.3]
Plot the PDF along with the training data:
Show[Plot[PDF[ld, x], {x, -2, 8}, Filling -> Bottom], NumberLinePlot[{1.2, 2.1, 3.5, 4.3}, Spacings -> 0, PlotStyle -> Red]]
Generate and visualize new samples:
Histogram[RandomVariate[ld, 10000], 50]
Train a contingency-table distribution on a two-dimensional dataset:
iris = ExampleData[{"MachineLearning", "FisherIris"}, "Data"][[All, 1, {1, 3}]];
ld = LearnDistribution[iris, Method -> "ContingencyTable"]
Plot the PDF along with the training data:
Show[ContourPlot[PDF[ld, {x, y}], {x, 4, 8}, {y, 1, 7}, PlotRange -> All, Contours -> 10], ListPlot[iris, PlotStyle -> Red]]
Use SynthesizeMissingValues to impute missing values using the learned distribution:
SynthesizeMissingValues[ld, {5.5, Missing[]}]
Histogram[Table[Last@SynthesizeMissingValues[ld, {5.5, Missing[]}], 1000]]
Options (1)
"AdditiveSmoothing" (1)
Train a contingency-table distribution on a nominal dataset without any smoothing:
ld = LearnDistribution[{"A", "A", "B", "B", "B"}, Method -> {"ContingencyTable", "AdditiveSmoothing" -> 0}]
Compute the probabilities for the values "A" and "B":
PDF[ld, {"A", "B"}]
Compare with the probabilities obtained after adding 1 and 10 counts to each outcome:
ld = LearnDistribution[{"A", "A", "B", "B", "B"}, Method -> {"ContingencyTable", "AdditiveSmoothing" -> 1}]
PDF[ld, {"A", "B"}]
ld = LearnDistribution[{"A", "A", "B", "B", "B"}, Method -> {"ContingencyTable", "AdditiveSmoothing" -> 10}]
PDF[ld, {"A", "B"}]
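As a cross-check on the comparison above, the same table can be built by hand. This is only a sketch that assumes the method applies plain additive smoothing to raw counts. It is not the internal implementation, and the names data, counts and smoothedTable are hypothetical helpers introduced here.

```wolfram
(* Hand-built contingency table; assumes plain additive smoothing of raw counts. *)
(* This is an illustration, not the code LearnDistribution runs internally. *)
data = {"A", "A", "B", "B", "B"};

(* Raw counts per outcome: <|"A" -> 2, "B" -> 3|> *)
counts = Counts[data];

(* Hypothetical helper: add alpha to every count, then normalize so the values sum to 1 *)
smoothedTable[alpha_] := (counts + alpha)/Total[counts + alpha];

(* Smoothing values used in the examples above *)
smoothedTable[0]   (* <|"A" -> 2/5, "B" -> 3/5|> *)
smoothedTable[1]   (* <|"A" -> 3/7, "B" -> 4/7|> *)
smoothedTable[10]  (* <|"A" -> 12/25, "B" -> 13/25|> *)
```

If the assumption holds, the values returned by PDF[ld, {"A", "B"}] should agree numerically with these fractions. Any difference would indicate that the automation system applies additional preprocessing or a different smoothing rule.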