EGDF (Entropic Global Distribution Function)
The EGDF is Machine Gnostics' primary distribution estimation method. It constructs a smooth, continuous cumulative distribution function directly from data using entropy-based algebraic optimization, without assuming any parametric form such as normal or Weibull.
Why It Matters
Every statistical quality calculation — capability indices, control limits, tolerance intervals, process comparisons — depends on knowing the distribution of your data. Traditional methods force you to choose a distribution family first (normal, lognormal, Weibull, etc.), then fit parameters. If you choose wrong, every downstream calculation is biased.
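As a concrete picture of that traditional workflow, here is a minimal SciPy sketch: pick a family, fit its parameters, validate with a goodness-of-fit test. The measurement values are invented for illustration.

```python
# Traditional distribution fitting: choose a family, fit, then test.
import numpy as np
from scipy import stats

data = np.array([10.2, 9.8, 10.5, 11.1, 9.9, 10.7, 10.3, 10.0,
                 10.9, 10.4, 9.7, 10.6, 10.1, 10.8, 10.2])

for dist_name in ("norm", "lognorm", "weibull_min"):
    dist = getattr(stats, dist_name)
    params = dist.fit(data)                      # maximum-likelihood parameter fit
    ks = stats.kstest(data, dist_name, args=params)
    print(f"{dist_name:12s} D={ks.statistic:.3f} p={ks.pvalue:.3f}")

# Note: reusing the same data for fitting and testing makes these
# p-values optimistic. At typical sample sizes, several families can
# pass, yet each implies different tail percentiles downstream.
```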
The EGDF eliminates the distribution selection step entirely. It learns the distribution shape from the data itself, producing a continuous CDF that can be used anywhere a parametric CDF would be used — but without the risk of model misspecification.
This is particularly important for automated quality systems where an engineer is not manually verifying each dataset's distribution. An API call to EGDF produces a reliable distribution estimate regardless of what shape the data takes.
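To make that concrete, here is a runnable sketch of such a pipeline step. The `smooth_cdf` helper is a generic kernel-smoothed CDF standing in for the EGDF; it is not the EntropyStat algorithm, only an illustration of how a data-driven CDF slots into downstream calculations.

```python
# Sketch of an automated quality-pipeline step using a data-driven CDF.
# `smooth_cdf` is a generic stand-in for the EGDF, NOT EntropyStat's method.
import numpy as np
from scipy.stats import norm

def smooth_cdf(data, scale=0.5):
    """Return a continuous CDF built directly from the data."""
    data = np.asarray(data, dtype=float)
    return lambda x: norm.cdf((np.asarray(x)[..., None] - data) / scale).mean(axis=-1)

def fraction_nonconforming(measurements, lsl, usl):
    cdf = smooth_cdf(measurements)
    # Probability mass outside the spec limits, from the estimated CDF.
    return float(cdf(lsl) + (1.0 - cdf(usl)))

print(fraction_nonconforming([10.2, 9.8, 10.5, 11.1, 9.9, 10.7], lsl=9.0, usl=12.0))
```

Once a distribution estimate exists, any downstream metric that consumes a CDF can be computed the same way.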
The EntropyStat Perspective
The EGDF is the core of EntropyStat's analytical engine. Built on over 40 years of mathematical research at the Czech Academy of Sciences, it uses gnostic algebra — a deterministic optimization framework based on entropy principles and error geometry — to construct distribution functions.
Three properties distinguish the EGDF from parametric fitting and kernel density estimation. It is deterministic: the same data always produce the same result, with no random seeds. It is inherently robust to outliers, because it uses supremum-based optimization rather than least squares. And it works reliably with as few as 5–8 data points, because there are no parametric distribution parameters to estimate.
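To see why a supremum criterion resists outliers where a least-squares criterion does not, here is a generic minimum-Kolmogorov-distance sketch. It illustrates only the principle; it is not the gnostic algebra, and the reference scale of 0.2 is an arbitrary assumption.

```python
# Generic illustration: a least-squares location estimate (the mean) vs.
# one that minimizes a supremum (Kolmogorov) distance to a reference shape.
import numpy as np
from scipy import stats, optimize

data = np.array([10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 25.0])  # one gross outlier

ls_estimate = data.mean()  # minimizer of sum((x - mu)^2), dragged by the outlier

def ks_distance(mu):
    # Largest vertical gap between the sample's ECDF and N(mu, 0.2).
    return stats.kstest(data, "norm", args=(mu, 0.2)).statistic

sup_estimate = optimize.minimize_scalar(ks_distance, bounds=(9.0, 11.0),
                                        method="bounded").x
print(f"least squares: {ls_estimate:.2f}, sup-distance: {sup_estimate:.2f}")
```

The mean is pulled far toward the outlier; the sup-distance estimate stays with the bulk of the data.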
The EGDF supports both additive form (for data spanning positive and negative values) and multiplicative form (for strictly positive data with proportional variation). The Scale parameter controls smoothness and is auto-optimized using the Kolmogorov-Smirnov test. The result is a continuous CDF with well-defined bounds, from which EntropyStat derives all downstream metrics: percentiles, capability indices, tolerance intervals, and control limits.
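The sketch below illustrates the idea of K-S-guided scale selection using a generic kernel-smoothed CDF. EntropyStat's actual optimizer differs, and the acceptance threshold shown is an assumption made for illustration.

```python
# Idea sketch: widen a kernel-smoothed CDF until it would fail a K-S
# comparison against the sample, keeping the largest scale that still passes.
import numpy as np
from scipy import stats

def smooth_cdf(data, scale):
    data = np.asarray(data, dtype=float)
    return lambda x: stats.norm.cdf((np.asarray(x)[..., None] - data) / scale).mean(axis=-1)

def auto_scale(data, candidates=np.linspace(0.05, 2.0, 40), alpha=0.05):
    best = candidates[0]
    for s in candidates:
        # Heuristic: the p-value is optimistic because the CDF was built
        # from the same sample, but it works as a relative criterion.
        if stats.kstest(data, smooth_cdf(data, s)).pvalue > alpha:
            best = s
    return best

data = np.array([10.2, 9.8, 10.5, 11.1, 9.9, 10.7, 10.3, 10.0])
print("selected scale:", auto_scale(data))
```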
Related Terms
ELDF (Entropic Local Distribution Function)
The ELDF is Machine Gnostics' local distribution analysis method. While the EGDF provides a global view of the entire distribution, the ELDF focuses on local structure — revealing peaks, clusters, and multimodal features hidden within the data.
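As a generic illustration of what local analysis can reveal (this is ordinary kernel density estimation, not the ELDF algorithm), consider two hidden clusters in synthetic data:

```python
# Two subpopulations that a single global mean/sigma summary would blur.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(9.8, 0.1, 40),    # e.g., cavity A
                       rng.normal(10.4, 0.1, 40)])  # e.g., cavity B

kde = gaussian_kde(data)
grid = np.linspace(data.min(), data.max(), 400)
density = kde(grid)

# Local maxima of the density correspond to cluster centers.
peaks = grid[1:-1][(density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])]
print("density peaks near:", np.round(peaks, 2))
```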
Entropy in Statistics
Entropy, a concept that originated in thermodynamics and was later formalized in information theory, quantifies the uncertainty or disorder in a system. In statistics, entropy-based methods use this principle to build distribution estimates that make the fewest unwarranted assumptions about the data.
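A quick numerical illustration of the concept (not specific to EntropyStat):

```python
# Shannon entropy H = -sum(p * log(p)): uniform distributions maximize
# uncertainty, concentrated ones minimize it.
from scipy.stats import entropy

print(entropy([0.25, 0.25, 0.25, 0.25]))  # uniform: maximum entropy (log 4)
print(entropy([0.97, 0.01, 0.01, 0.01]))  # concentrated: near-zero entropy
```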
Distribution Fitting
Distribution fitting is the process of finding a probability distribution that best describes a dataset. Traditional methods involve selecting a parametric family (normal, Weibull, lognormal) and estimating its parameters, then validating the fit with a goodness-of-fit test.
Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov (K-S) test is a nonparametric goodness-of-fit test that measures the maximum distance between an empirical cumulative distribution function and a reference distribution. It determines whether a sample plausibly comes from a specified distribution.
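A minimal example with SciPy; the reference distribution N(10.3, 0.4) is arbitrary:

```python
# K-S statistic: the largest vertical gap between the sample's empirical
# CDF and the reference CDF.
from scipy import stats

sample = [10.2, 9.8, 10.5, 11.1, 9.9, 10.7, 10.3, 10.0]
result = stats.kstest(sample, "norm", args=(10.3, 0.4))
print(result.statistic, result.pvalue)  # small D / large p: plausible fit
```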
Small Sample Statistics
Small sample statistics deals with drawing reliable conclusions from limited data — typically fewer than 30 observations. Traditional methods lose reliability with small samples because parametric distribution estimates become unstable, and the Central Limit Theorem provides weaker guarantees.
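A small simulation makes that instability tangible; the process values here are synthetic:

```python
# Refit the same model to repeated samples of 8 points and watch the
# estimated sigma spread out around the true value.
import numpy as np

rng = np.random.default_rng(1)
sigmas = [rng.normal(10.0, 0.5, size=8).std(ddof=1) for _ in range(1000)]
print(f"true sigma 0.50, middle 95% of estimates: "
      f"{np.percentile(sigmas, 2.5):.2f}-{np.percentile(sigmas, 97.5):.2f}")
```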
Related Articles
The Distribution Fitting Trap: Weibull, Lognormal, or None of the Above?
Distribution fitting replaces the normality assumption with a different guess. At typical sample sizes, Weibull, lognormal, and gamma can all pass goodness-of-fit tests while yielding different Cpk values. The distribution fitting step that was supposed to fix your analysis becomes its own error source.
Mar 13, 2026
Hidden Clusters in Your Process Data — and Why Cpk Hides Them
Hidden clusters from multi-cavity molds, shift changes, and material lots produce aggregate Cpk that looks capable — while one subpopulation ships defects. ELDF detects what Cpk can’t see.
Mar 11, 2026
EntropyStat vs. Minitab: What Distribution-Free Analysis Actually Means
Minitab offers non-normal options. EntropyStat is distribution-free. Those aren’t the same thing. Offering a menu of distributions to choose from is distribution-flexible — not distribution-free. Here’s why that distinction determines whether your Cpk is correct.
Mar 10, 2026
See Entropy-Powered Analysis in Action
Upload your data and compare traditional SPC with entropy-based methods. Free demo — no credit card required.