You have a small sample — 10 parts from a prototype run. Your customer wants Cpk. What do you report — and how much should anyone trust it?
This is the question nobody answers honestly. The textbook says "collect at least 30 measurements." Your production timeline says "we have 10 and the PPAP deadline is Friday." So you compute Cpk from 10 parts, report 1.38, and hope for the best.
Here's what that number actually means: with 10 measurements, your process capability estimate carries a 95% confidence interval close to a full Cpk unit wide. The true value behind your reported 1.38 could plausibly sit below the 1.0 capability threshold or comfortably above 1.7. You don't know which.
Why Small-Sample Cpk Falls Apart Below n = 25
The standard formula computes Cpk = min((USL - μ) / 3σ, (μ - LSL) / 3σ): the distance from the process mean to the nearer specification limit, in units of 3σ. Both μ and σ are estimated from your sample. With large samples, those estimates are precise. With small samples, they carry substantial uncertainty, and σ is the bigger problem.
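For reference, here is the textbook calculation in a few lines of Python. The specification limits and measurements are hypothetical; the example exists only to pin down exactly what the rest of this piece critiques:

```python
import numpy as np

def cpk(data, lsl, usl):
    """Textbook Cpk: distance from the mean to the nearer spec limit, in 3-sigma units."""
    mu = np.mean(data)
    sigma = np.std(data, ddof=1)  # sample standard deviation: the small-n trouble spot
    return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))

# Hypothetical prototype run: 10 parts against spec limits of 9.0 to 11.0
parts = [10.02, 9.87, 10.11, 9.95, 10.05, 9.91, 10.08, 9.99, 10.03, 9.94]
print(f"Cpk = {cpk(parts, lsl=9.0, usl=11.0):.2f}")
```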
Standard deviation estimation from small samples is biased and asymmetric. With n = 10, the sample s underestimates the true σ slightly more than half the time, because the sampling distribution of s is right-skewed: typical values sit below σ, while the occasional overestimate in the right tail is dramatic. On average, s is also biased low (E[s] ≈ 0.973σ at n = 10, the familiar c4 correction factor). That asymmetry means your Cpk isn't just uncertain; it's systematically optimistic more often than it should be.
Put differently: when you compute Cpk from 10 parts, the number you get is more likely to land above the true Cpk than below it. That's not conservative. That's dangerous.
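A quick simulation makes the asymmetry concrete. Everything below (the true process, the spec limits) is arbitrary; the percentages are the point:

```python
import numpy as np

rng = np.random.default_rng(42)
n, trials = 10, 100_000
sigma_true = 1.0

# Many samples of size 10 from a perfectly normal, perfectly stable process
samples = rng.normal(loc=0.0, scale=sigma_true, size=(trials, n))
s = samples.std(axis=1, ddof=1)

print(f"P(s underestimates sigma): {(s < sigma_true).mean():.1%}")  # about 56%, not 50%
print(f"average s:                 {s.mean():.3f}")                 # about 0.973 (the c4 bias)

# Spec limits chosen so the true Cpk is exactly 4/3
usl, lsl = 4.0, -4.0
xbar = samples.mean(axis=1)
cpk_hat = np.minimum((usl - xbar) / (3 * s), (xbar - lsl) / (3 * s))
print(f"P(estimated Cpk > true Cpk): {(cpk_hat > 4/3).mean():.1%}")  # optimistic more often than not
```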
The Confidence Interval Nobody Reports
Ask a quality engineer for Cpk and they give you one number. Ask for the confidence interval and you'll usually get a blank look.
Here's what the intervals look like for Cpk = 1.33 at different sample sizes:
n = 10: 95% CI = [0.89, 1.77] — useless for decisions
n = 25: 95% CI = [1.06, 1.60] — still wide
n = 50: 95% CI = [1.15, 1.51] — starting to be useful
n = 100: 95% CI = [1.21, 1.45] — decision-grade precision
At n = 10, your "Cpk = 1.33" is compatible with a process that's genuinely incapable (0.89) and one that's excellent (1.77). Reporting the point estimate without the interval is statistically irresponsible. But it's standard practice because most SPC tools don't even display the confidence interval by default.
Your customer sees "1.33" and thinks that's a fact. It's a guess with enormous error bars.
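If you want intervals like these for your own numbers, one widely used closed-form approximation (due to Bissell, 1990) takes a few lines. Several interval constructions exist for Cpk and they differ somewhat in width, so treat both the figures above and the output below as illustrative rather than exact:

```python
import math

def cpk_ci(cpk_hat, n, z=1.96):
    """Approximate 95% CI for Cpk using Bissell's normal approximation."""
    se = math.sqrt(1.0 / (9.0 * n) + cpk_hat**2 / (2.0 * (n - 1)))
    return cpk_hat - z * se, cpk_hat + z * se

for n in (10, 25, 50, 100):
    lo, hi = cpk_ci(1.33, n)
    print(f"n = {n:3d}: 95% CI approx [{lo:.2f}, {hi:.2f}]")
```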
The Parametric Trap
The problem compounds because small-sample Cpk depends on estimating σ, and σ estimation depends on the normality assumption. If your 10 measurements aren't normally distributed — and with 10 points, you can't reliably test — the σ estimate has additional, unquantified error.
Run a Shapiro-Wilk test on 10 measurements. The test has almost no power at that sample size. It will "pass" nearly anything — normal, skewed, bimodal, uniform. A normality test that passes everything isn't a normality test. It's a rubber stamp.
So with small samples, you can't verify the assumption that makes the formula valid. You use the formula anyway. The result is a number built on an assumption you can't check, with uncertainty you don't report.
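You can verify the rubber-stamp behavior yourself. The distributions below are deliberately non-normal, and with scipy's standard Shapiro-Wilk test most draws still pass at n = 10:

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)
n, trials = 10, 2_000

# Clearly non-normal processes that a capability study might actually face
generators = {
    "uniform":   lambda: rng.uniform(0.0, 1.0, n),
    "lognormal": lambda: rng.lognormal(0.0, 0.5, n),
    "bimodal":   lambda: rng.normal(rng.choice([-2.0, 2.0], n), 0.5),
}

for name, gen in generators.items():
    passes = sum(shapiro(gen()).pvalue > 0.05 for _ in range(trials))
    print(f"{name:>9}: passes the normality test {passes / trials:.0%} of the time at n = 10")
```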
What Entropy-Based Estimation Does Differently
The EGDF approach to capability doesn't estimate μ and σ. It doesn't need to. Instead of fitting parameters to an assumed distribution, it constructs the distribution function directly from the data.
Why does this matter for small samples? Three reasons:
No σ estimation. The primary source of small-sample instability is σ. EGDF bypasses it entirely. The distribution function is built from data ranks and entropy weighting, not from moment estimation. Remove σ and you remove the biggest uncertainty driver.
No distributional assumption. With 10 parts, you can't distinguish normal from lognormal from Weibull. EGDF doesn't try. It builds a function that is faithful to whatever shape 10 measurements reveal — without naming it.
Faster convergence. Parametric Cpk converges to the true value as n grows because μ and σ estimates improve. EGDF converges faster because it's estimating a function, not parameters. The difference is most pronounced exactly where it matters most — at n = 5 to 25.
The result: entropy-based Cpk from 10 measurements has a narrower effective confidence band than parametric Cpk from the same 10 measurements. Not because it invents precision — because it eliminates a layer of estimation error.
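EGDF's actual construction is EntropyStat's own and is not reproduced here. But the parameter-free idea can be sketched with a simpler stand-in: the percentile-based capability definition (in the spirit of ISO 22514), which replaces μ ± 3σ with quantiles of a distribution function built from the data. The version below uses raw empirical quantiles; EGDF supplies a smoother, entropy-weighted distribution function in their place:

```python
import numpy as np

def quantile_cpk(data, lsl, usl):
    """Percentile-based capability: no mu, no sigma, no named distribution.
    The 0.135% / 50% / 99.865% quantiles stand in for mu - 3sigma, mu, mu + 3sigma."""
    q_lo, q_med, q_hi = np.quantile(data, [0.00135, 0.5, 0.99865])
    return min((usl - q_med) / (q_hi - q_med), (q_med - lsl) / (q_med - q_lo))
```

With only 10 points, the raw empirical quantiles at 0.135% and 99.865% collapse to the sample minimum and maximum, which is itself unstable. Filling exactly that gap, with a distribution function estimated from the whole sample rather than its extremes, is the job the entropy-based construction is claimed to do.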
When 10 Parts Is Enough (and When It Isn't)
No method makes 10 parts as reliable as 100. But the question isn't "is n = 10 perfect?" The question is "what's the best estimate I can get from n = 10?"
Prototype qualification (n = 5–15). You need a directional answer: is this process likely capable? Entropy-based Cpk gives a defensible estimate where traditional Cpk gives a guess. Report it with appropriate caveats.
PPAP submission (n = 25–50). Most automotive OEMs accept n = 30 for initial capability studies. At this sample size, entropy-based and traditional methods begin to converge for normal data. For non-normal data, the entropy estimate is still more reliable.
Process monitoring (n > 50 cumulative). Once you have production data flowing, sample size stops being a constraint. But the initial qualification — the gatekeeper decision — happens at small n.
The critical insight: small-sample capability isn't about lowering your standards. It's about using a method that extracts maximum information from limited data instead of one that wastes half the information on assumption-dependent parameter estimation.
What to Report When the Sample Is Small
Four practices that make small-sample capability honest:
Report the method. "Capability computed using entropy-based distribution-free methods" is one sentence. It tells the reader that the number accounts for potential non-normality — a material improvement over "Cpk = 1.38" with no context.
Report the sample size. Always. Every capability number should state n. "Cpk = 1.38 (n = 10, entropy-based)" is far more informative than "Cpk = 1.38."
Report both estimates. When traditional and entropy-based Cpk agree, your confidence increases. When they disagree, you've discovered that the distributional assumption matters for this dataset — and you know which number to trust.
Plan for confirmation. A small-sample estimate is a first answer, not a final answer. Build a plan to confirm with production data. The initial estimate tells you whether to proceed or investigate — which is exactly the decision you need to make at n = 10.
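If it helps to make the reporting habit mechanical, the four practices above fit in a few lines; the layout here is just one possible convention:

```python
def capability_report(cpk_value, n, method, ci=None):
    """Format a capability claim that carries its own context: value, n, method, interval."""
    line = f"Cpk = {cpk_value:.2f} (n = {n}, {method})"
    if ci is not None:
        line += f", 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]"
    return line

print(capability_report(1.38, n=10, method="entropy-based"))
print(capability_report(1.38, n=10, method="traditional", ci=(0.71, 2.05)))  # interval from the Bissell approximation above
```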
The Prototype Dilemma Has an Answer
You have 10 parts. Your customer wants Cpk. You can report a number built on an assumption you can't verify, with uncertainty you don't disclose. Or you can report a number that makes no assumption, extracts maximum information from your data, and comes with honest uncertainty bounds.
Robust statistics for small samples isn't about making small samples behave like large ones. It's about using methods designed for the reality of limited data instead of methods designed for textbook scenarios that don't exist on the shop floor.
Your 10 parts contain real information about your process. The question is whether your method can extract it.
Upload your small-batch data and see how entropy-based Cpk compares to traditional estimates — same data, honest uncertainty, no normality assumption. Try EntropyStat free →
Your PPAP got rejected — not for bad parts, but for bad statistics. OEM auditors now scrutinize whether your Cpk method matches your data. Build a PPAP capability evidence chain that withstands the toughest audits.
Minitab offers non-normal options. EntropyStat is distribution-free. Those aren’t the same thing. Offering a menu of distributions to choose from is distribution-flexible — not distribution-free. Here’s why that distinction determines whether your Cpk is correct.