FindDistribution
FindDistribution[data]
finds a simple functional form to fit the distribution of data.
FindDistribution[data,n]
finds up to n best distributions.
FindDistribution[data,n,prop]
returns up to n best distributions associated with property prop.
FindDistribution[data,n,{prop1,prop2,…}]
returns up to n best distributions associated with properties prop1, prop2, etc.
Details and Options
- The data must be a list of possible outcomes from a univariate distribution.
- FindDistribution[data,n,All] creates a Dataset object with all possible properties.
- Properties supported include:
  "BIC"               Bayesian information criterion
  "AIC"               Akaike information criterion
  "HQIC"              Hannan–Quinn information criterion
  "Score"             internal score
  "Complexity"        complexity of the distribution
  "LogLikelihood"     LogLikelihood value
  "PearsonChiSquare"  PearsonChiSquareTest p-value
  "CramerVonMises"    CramerVonMisesTest p-value
  All                 all the previous properties
- The following options can be given:
  MaxItems         Infinity   maximum number of distributions in a mixture distribution
  PerformanceGoal  Automatic  aspect of performance to optimize
  RandomSeeding    Automatic  what seeding of pseudorandom generators should be done internally
  TargetFunctions  Automatic  types of distributions to consider
- Possible settings for PerformanceGoal include:
  "Speed"    minimize the time spent to find distributions
  "Quality"  try to find better distributions
- Possible settings for TargetFunctions include:
  Automatic                     automatically chosen distributions
  All                           all built-in distributions
  "Continuous"                  all continuous distributions
  "Discrete"                    all discrete distributions
  {dist1,dist2,…}               distributions disti
  {{w1,w2,…}->{dist1,dist2,…}}  distributions disti using weights wi
- Possible settings for RandomSeeding include:
  Automatic  automatically reseed every time the function is called
  Inherited  use externally seeded random numbers
  seed       use an explicit integer or string as a seed
- Possible continuous distributions for TargetFunctions are: BetaDistribution, CauchyDistribution, ChiDistribution, ChiSquareDistribution, ExponentialDistribution, ExtremeValueDistribution, FrechetDistribution, GammaDistribution, GumbelDistribution, HalfNormalDistribution, InverseGaussianDistribution, LaplaceDistribution, LevyDistribution, LogisticDistribution, LogNormalDistribution, MaxwellDistribution, NormalDistribution, ParetoDistribution, RayleighDistribution, StudentTDistribution, UniformDistribution, WeibullDistribution, HistogramDistribution.
- Possible discrete distributions for TargetFunctions are: BenfordDistribution, BinomialDistribution, BorelTannerDistribution, DiscreteUniformDistribution, GeometricDistribution, LogSeriesDistribution, NegativeBinomialDistribution, PascalDistribution, PoissonDistribution, WaringYuleDistribution, ZipfDistribution, HistogramDistribution, EmpiricalDistribution.
- The internal information criterion uses a Bayesian information criterion together with priors over TargetFunctions.
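The "AIC", "BIC" and "HQIC" properties refer to standard information criteria. As a minimal sketch, the textbook definitions can be computed by hand for a single fitted distribution. The names fitted, n, k, logL, aic, bic and hqic are illustrative only, and it is an assumption that these formulas match the scaling or normalization FindDistribution uses for its property values, which may differ:

```wolfram
(* Illustrative sketch: textbook information criteria computed by hand. *)
(* Assumption: FindDistribution may scale or normalize its "AIC", "BIC" *)
(* and "HQIC" property values differently from these formulas. *)
SeedRandom[1];
data = RandomVariate[ExponentialDistribution[1], 1000];

(* Fit a fixed one-parameter family by maximum likelihood *)
fitted = EstimatedDistribution[data, ExponentialDistribution[λ]];

n = Length[data];                     (* sample size *)
k = 1;                                (* number of free parameters *)
logL = LogLikelihood[fitted, data];   (* maximized log-likelihood *)

aic = 2 k - 2 logL;                   (* Akaike information criterion *)
bic = k Log[n] - 2 logL;              (* Bayesian information criterion *)
hqic = 2 k Log[Log[n]] - 2 logL;      (* Hannan–Quinn information criterion *)

{aic, bic, hqic}
```

All three criteria combine the same log-likelihood with a different penalty per parameter, so for a sample of this size the values are ordered aic < hqic < bic.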
Examples
Basic Examples (2)
Create a list of uniformly distributed random integers:
RandomInteger[10, 100]
Find the underlying distribution from the data:
FindDistribution[%]

Generate data sampled from an exponential distribution:
𝒟 = ExponentialDistribution[1];
data = RandomVariate[𝒟, 1000];
Find the best distribution from the data:
estimated𝒟 = FindDistribution[data]
Compare the PDFs for the original and estimated distributions:
Plot[{PDF[𝒟, x], PDF[estimated𝒟, x]}, {x, 0, 10}, PlotLegends -> {"𝒟", "e𝒟"}]
Return the best three distributions:
FindDistribution[data, 3]
Compare their Bayesian information criterion and Akaike information criterion values:
FindDistribution[data, 3, {"BIC", "AIC"}]
Scope (3)
Generate data sampled from a mixture distribution:
𝒟 = MixtureDistribution[{1, 1}, {ExponentialDistribution[1], NormalDistribution[5, 0.8]}];
data = RandomVariate[𝒟, 1000];
Estimate the best distribution from this data:
e𝒟 = FindDistribution[data]
Compare the PDFs for the original and estimated distributions:
Plot[{PDF[𝒟, x], PDF[e𝒟, x]}, {x, -8, 10}, PlotLegends -> {"𝒟", "e𝒟"}, PlotRange -> All]

Estimate parameters for a particular distribution:
𝒟 = WeibullDistribution[1, 2];
data = RandomVariate[𝒟, 1000];
By default, FindDistribution returns a simpler distribution:
e𝒟 = FindDistribution[data]
Specify the type of distribution to look for:
e𝒟 = FindDistribution[data, TargetFunctions -> {WeibullDistribution}]

Generate data sampled from an exponential distribution:
𝒟 = ExponentialDistribution[1];
data = RandomVariate[𝒟, 1000];
Generate a Dataset object containing all properties for the top 2 distributions:
report = FindDistribution[data, 2, All]
Options (5)
TargetFunctions (3)
Generate data samples from a mixture distribution:
𝒟 = MixtureDistribution[{1, 1}, {ExponentialDistribution[1], NormalDistribution[5, 0.8]}];
data = RandomVariate[𝒟, 1000];
Estimate parameters for specific distributions:
e𝒟 = FindDistribution[data, TargetFunctions -> {NormalDistribution, GammaDistribution}]
Compare the PDFs for the original and estimated distributions:
Plot[{PDF[𝒟, x], PDF[e𝒟, x]}, {x, -8, 10}, PlotLegends -> {"𝒟", "e𝒟"}, PlotRange -> All]

Time between geyser eruptions:
waiting = ExampleData[{"Statistics", "OldFaithful"}][[All, 2]];
Estimate the distribution of the data:
e𝒟1 = FindDistribution[waiting]
Estimate the distribution of the data when treated as continuous:
e𝒟2 = FindDistribution[waiting, TargetFunctions -> "Continuous"]
Estimate the distribution of the data when treated as continuous using GammaDistribution:
e𝒟3 = FindDistribution[waiting, TargetFunctions -> {GammaDistribution}]
Compare the histogram of the data to the PDF of the estimated distributions:
legend = SwatchLegend[{Red, ColorData[97, 1], ColorData[97, 2]}, {"e𝒟1", "e𝒟2", "e𝒟3"}];
Show[Histogram[waiting, 20, "ProbabilityDensity"],
DiscretePlot[{PDF[e𝒟1, x]}, {x, 0, 100}, PlotStyle -> {PointSize[.02], Red}],
Plot[{PDF[e𝒟2, x], PDF[e𝒟3, x]}, {x, 0, 100}, PlotLegends -> legend]]

Estimate parameters for specific distributions, assuming priors over them:
magnitudes = Select[ExampleData[{"Statistics", "USEarthquakes"}], #[[1]] ≥ 1935 &][[All, 7]];
The magnitudes of earthquakes in the United States in the years 1935–1989 have two modes:
h = Histogram[magnitudes, 20, "ProbabilityDensity"]
Estimate the best fit without using TargetFunctions:
e𝒟1 = FindDistribution[magnitudes]
Estimate the best fit using priors over distributions:
e𝒟2 = FindDistribution[magnitudes, TargetFunctions -> {{10, 2} -> {CauchyDistribution, GammaDistribution}}]
Compare the histogram to the PDFs of the estimated distributions:
Show[h, Plot[{PDF[e𝒟1, x], PDF[e𝒟2, x]}, {x, 0, 10}, PlotStyle -> Thick, PlotRange -> All]]
PerformanceGoal (1)
Generate data samples from a mixture distribution:
𝒟 = MixtureDistribution[{1, 2}, {ChiDistribution[0.6], GammaDistribution[20, 1]}];
data = RandomVariate[𝒟, 10000];
Estimate the best fit for a big dataset and compare the AbsoluteTiming for different settings of PerformanceGoal:
AbsoluteTiming[e𝒟1 = FindDistribution[data, PerformanceGoal -> "Speed"]]
AbsoluteTiming[e𝒟2 = FindDistribution[data, PerformanceGoal -> "Quality"]]
Compare the LogLikelihood of the solutions:
LogLikelihood[#, data] & /@ {e𝒟1, e𝒟2}
RandomSeeding (1)
Generate data samples from a mixture distribution:
𝒟 = MixtureDistribution[{1, 2}, {NormalDistribution[-6, 1], GammaDistribution[20, 1]}];
data = RandomVariate[𝒟, 1000];
Compare different rounds of FindDistribution and notice how they differ:
Table[FindDistribution[data], 3]
Use the option RandomSeeding to get the same result on every call:
Table[FindDistribution[data, RandomSeeding -> 1], 3]
Applications (5)
Lengths of Words Beginning with a Particular Letter (1)
Lengths of all English words in a dictionary that begin with different vowels:
letters = {"a", "e", "i", "o", "u", "y"};
worddata = Table[StringLength /@ DictionaryLookup[l ~~ ___], {l, letters}];
Estimate the distribution for different vowels:
e𝒟 = Table[FindDistribution[i, MaxItems -> 1], {i, worddata}]
Compare the histograms of the original data to the PDFs of the estimated distributions:
Partition[Table[Show[Histogram[worddata[[i]], {Range[25] - 1 / 2}, "ProbabilityDensity", PlotLabel -> letters[[i]]], DiscretePlot[PDF[e𝒟[[i]], x], {x, 0, 25}, PlotRange -> All, PlotStyle -> PointSize[.025]], ImageSize -> Small], {i, Length[letters]}], 3] // Grid
Text Frequency (1)
Count the number of occurrences of words in the Declaration of Independence:
text = ExampleData[{"Text", "DeclarationOfIndependence"}, "Words"];
wordCount = Tally[text][[All, 2]];
Estimate the distribution of the word count:
e𝒟 = FindDistribution[wordCount, MaxItems -> 1]
Compare the histogram of the original data to the PDF of the estimated distribution:
Show[Histogram[wordCount, {0.5, 9.5, 1}, "ProbabilityDensity"], DiscretePlot[PDF[e𝒟, x], {x, 1, 10}, PlotStyle -> PointSize[Medium], PlotRange -> All]]
Melanoma in Denmark (1)
Age of patients affected by melanoma:
melanomaAge = ExampleData[{"Statistics", "DenmarkMelanoma"}][[All, 4]];
Estimate the distribution of the data:
e𝒟 = FindDistribution[melanomaAge]
Compare the histogram of the data to the PDF of the estimated distribution:
Show[Histogram[melanomaAge, {4, 95, 4}, "ProbabilityDensity"], Plot[PDF[e𝒟, x], {x, 4, 95}, PlotStyle -> Thick, PlotRange -> All]]
Infection Time for AIDS (1)
Infection time for AIDS in years:
aids = ExampleData[{"Statistics", "TimeToAIDS"}][[All, 1]];
Estimate the distribution of the data:
e𝒟 = FindDistribution[aids]
Compare the histogram of the data to the PDF of the estimated distribution:
Show[Histogram[aids, {0, 8, 0.65}, "ProbabilityDensity"], Plot[PDF[e𝒟, x], {x, 0, 8}, PlotStyle -> Thick, PlotRange -> All]]
Time to Kidney Infection after Catheter Replacement (1)
Time to kidney infection in months:
KidneyInfection = ExampleData[{"Statistics", "KidneyInfection"}][[All, 1]];
Estimate the distribution of the data:
e𝒟 = FindDistribution[KidneyInfection]
Compare the histogram of the data to the PDF of the estimated distribution:
Show[Histogram[KidneyInfection, {0, 28, 2}, "ProbabilityDensity"], Plot[PDF[e𝒟, x], {x, 0, 28}, PlotStyle -> Thick, PlotRange -> All]]
Properties & Relations (1)
By default, FindDistributionParameters uses maximum likelihood to estimate distribution parameters for a fixed distribution. FindDistribution uses a full Bayesian approach by combining the Bayesian information criterion with priors over distributions to select both the best distribution and the best parameters for it.
Generate data sampled from a StudentTDistribution:
SeedRandom[5]
data = RandomVariate[StudentTDistribution[1, 1, 4], 800];
Histogram[data, {-10, 10, Automatic}, "ProbabilityDensity"]
Use FindDistribution to estimate the best distribution that fits the data:
dist1 = FindDistribution[data, RandomSeeding -> 1]
Use FindDistributionParameters to estimate the best parameters, assuming a StudentTDistribution:
distribution = StudentTDistribution[α, β, ν];
estimatedParameters = FindDistributionParameters[data, distribution];
dist2 = distribution /. estimatedParameters
Even though the StudentTDistribution attains the higher log-likelihood, the LogisticDistribution has a larger prior and a smaller complexity.
Compare the corresponding LogLikelihood:
LogLikelihood[#, data] & /@ {dist1, dist2}
The option TargetFunctions can be used to find roughly the same parameters as FindDistributionParameters:
dist3 = FindDistribution[data, TargetFunctions -> {StudentTDistribution}]
Wolfram Research (2015), FindDistribution, Wolfram Language function, https://reference.wolfram.com/language/ref/FindDistribution.html (updated 2017).