SPC & Quality Analytics Glossary
Key terms in statistical process control and quality engineering — with EntropyStat's entropy perspective
Process Capability
Process Capability (Cpk/Ppk)
Process capability indices (Cpk and Ppk) quantify how well a manufacturing process can produce parts within specification limits. Cpk measures short-term capability using within-subgroup variation, while Ppk measures long-term performance using overall variation.
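As a minimal sketch (not EntropyStat's implementation), both indices can be computed directly from subgroup data and specification limits. The measurements and limits below are hypothetical, and the within-subgroup sigma is estimated as a pooled standard deviation rather than the R̄/d₂ or s̄/c₄ estimators many SPC packages use.

```python
import numpy as np

# Hypothetical data: 5 subgroups of 5 measurements each (e.g., shaft diameter, mm)
subgroups = np.array([
    [10.02, 10.01,  9.99, 10.03, 10.00],
    [10.01,  9.98, 10.02, 10.00, 10.01],
    [ 9.99, 10.00, 10.01, 10.02,  9.98],
    [10.03, 10.01, 10.00,  9.99, 10.02],
    [10.00, 10.02,  9.99, 10.01, 10.00],
])
LSL, USL = 9.90, 10.10          # hypothetical specification limits

x = subgroups.ravel()
mean = x.mean()

# Cpk: short-term capability, within-subgroup sigma (pooled standard deviation)
sigma_within = np.sqrt(np.mean(subgroups.var(axis=1, ddof=1)))
cpk = min(USL - mean, mean - LSL) / (3 * sigma_within)

# Ppk: long-term performance, overall sigma across all data
sigma_overall = x.std(ddof=1)
ppk = min(USL - mean, mean - LSL) / (3 * sigma_overall)

print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```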
Tolerance Intervals
Tolerance intervals define a range expected to contain a specified proportion of the population with a given confidence level. Unlike confidence intervals (which estimate a parameter) or prediction intervals (which bound the next observation), tolerance intervals bound a percentage of all future production.
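Assuming normally distributed data, a one-sided tolerance bound can be built from a noncentral-t factor; this is a generic textbook sketch with simulated data, and two-sided intervals require a different (approximate) factor.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 30 measurements
rng = np.random.default_rng(1)
x = rng.normal(loc=50.0, scale=2.0, size=30)
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

p, gamma = 0.99, 0.95            # cover 99% of the population with 95% confidence
zp = stats.norm.ppf(p)

# One-sided upper tolerance bound for normal data (noncentral-t factor)
k = stats.nct.ppf(gamma, df=n - 1, nc=zp * np.sqrt(n)) / np.sqrt(n)
upper_bound = xbar + k * s
print(f"Upper tolerance bound: {upper_bound:.2f} (k = {k:.3f})")
```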
DPMO (Defects Per Million Opportunities)
DPMO measures the number of defects expected per million opportunities for a defect to occur. It normalizes defect rates across products with different complexity levels, enabling fair comparison between a simple stamped bracket (few opportunities) and a complex PCB assembly (thousands of opportunities).
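The calculation itself is simple arithmetic; the counts below are hypothetical.

```python
defects = 38            # defects found across the inspected units
units = 1200            # units inspected
opportunities = 15      # defect opportunities per unit

dpmo = defects / (units * opportunities) * 1_000_000
print(f"DPMO = {dpmo:.0f}")
```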
Yield Analysis
Yield analysis measures the proportion of products that pass all quality checks without rework or rejection. It includes first pass yield (FPY), rolled throughput yield (RTY), and final yield — each capturing different aspects of process quality across single or multiple manufacturing steps.
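A minimal sketch with hypothetical step counts shows how per-step FPY values multiply into rolled throughput yield.

```python
# Hypothetical three-step process: units entering each step and units
# passing that step on the first attempt (no rework)
steps = {
    "stamping": {"in": 1000, "first_pass": 980},
    "plating":  {"in": 980,  "first_pass": 950},
    "assembly": {"in": 950,  "first_pass": 930},
}

fpy = {name: s["first_pass"] / s["in"] for name, s in steps.items()}

rty = 1.0
for y in fpy.values():
    rty *= y                      # RTY is the product of the step yields

for name, y in fpy.items():
    print(f"FPY {name}: {y:.1%}")
print(f"Rolled throughput yield: {rty:.1%}")
```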
Sigma Level
Sigma level expresses process capability as the number of standard deviations between the process mean and the nearest specification limit. A higher sigma level indicates fewer defects: 3 sigma ≈ 66,807 DPMO, 4 sigma ≈ 6,210 DPMO, 6 sigma ≈ 3.4 DPMO (with the 1.5σ shift).
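The DPMO values quoted above can be converted back to a sigma level with the inverse normal CDF; this sketch applies the conventional 1.5-sigma shift and uses SciPy.

```python
from scipy import stats

def sigma_level(dpmo, shift=1.5):
    """Short-term sigma level from long-term DPMO, using the conventional 1.5-sigma shift."""
    return stats.norm.ppf(1 - dpmo / 1_000_000) + shift

for dpmo in (66_807, 6_210, 3.4):
    print(f"{dpmo:>9} DPMO -> {sigma_level(dpmo):.2f} sigma")
```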
Defects Per Unit (DPU)
DPU measures the average number of defects found per unit produced, regardless of how many opportunities for defects exist on each unit. Unlike DPMO, DPU does not normalize for product complexity — it simply counts total defects divided by total units inspected.
First Pass Yield (FPY)
First pass yield measures the percentage of units that pass all quality checks on the first attempt without rework, repair, or rejection. It quantifies the true process quality by excluding the hidden factory of rework loops that inflate final yield numbers.
Control Charts
Statistical Process Control (SPC)
Statistical Process Control is a methodology for monitoring and controlling a manufacturing process using statistical techniques. SPC distinguishes between common-cause variation (inherent to the process) and special-cause variation (assignable to specific events).
Control Charts
Control charts are time-ordered plots of a process measurement with statistically derived upper and lower control limits. They visually separate normal process variation from signals that indicate the process has shifted or become unstable.
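As an illustrative example of one common chart type, an individuals/moving-range chart derives its limits from the average moving range; the measurements below are hypothetical.

```python
import numpy as np

# Hypothetical individual measurements in time order
x = np.array([5.02, 5.05, 4.98, 5.01, 5.03, 4.97, 5.00, 5.04, 4.99, 5.02])

# Individuals (I) chart limits from the average moving range; 2.66 = 3/d2
# with d2 = 1.128 for moving ranges of size 2
mr = np.abs(np.diff(x))
center = x.mean()
ucl = center + 2.66 * mr.mean()
lcl = center - 2.66 * mr.mean()

print(f"CL = {center:.3f}, UCL = {ucl:.3f}, LCL = {lcl:.3f}")
print("Points beyond limits:", x[(x > ucl) | (x < lcl)])
```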
Alarm Fatigue in Quality
Alarm fatigue occurs when operators and engineers become desensitized to frequent quality alerts, leading them to ignore or dismiss genuine signals. It is typically caused by excessive false alarms from control charts with inappropriate statistical limits.
Process Drift Detection
Process drift is a gradual shift in the central tendency or variation of a manufacturing process over time. Drift detection identifies these slow changes before they cause out-of-specification production, using statistical methods to distinguish drift from normal random variation.
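One common approach is an EWMA chart, which accumulates small shifts that a Shewhart chart would miss; this is a generic sketch on simulated data, not EntropyStat's drift detector.

```python
import numpy as np

def ewma_chart(x, target, sigma, lam=0.2, L=3.0):
    """EWMA statistic and control limits for detecting small, gradual shifts.

    lam: smoothing weight (smaller = more sensitive to slow drift)
    L:   control limit width in sigma units
    """
    z = np.zeros(len(x))
    prev = target
    for i, xi in enumerate(x):
        prev = lam * xi + (1 - lam) * prev
        z[i] = prev
    i = np.arange(1, len(x) + 1)
    half_width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
    return z, target + half_width, target - half_width

# Simulated process with a slow upward drift starting at sample 10
rng = np.random.default_rng(0)
drift = np.concatenate([np.zeros(10), np.linspace(0.2, 2.0, 10)])
x = rng.normal(100, 1, 20) + drift

z, ucl, lcl = ewma_chart(x, target=100, sigma=1)
print("Samples beyond EWMA limits:", np.flatnonzero((z > ucl) | (z < lcl)))
```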
Real-Time Process Monitoring
Real-time process monitoring is the continuous tracking of manufacturing process parameters and quality measurements as production occurs. It combines data acquisition from sensors and gauges with statistical analytics to provide immediate visibility into process health and trigger alerts when intervention is needed.
Histogram
A histogram is a bar chart that displays the frequency distribution of continuous data by grouping measurements into equal-width intervals (bins). It provides a visual summary of data shape, center, spread, and any unusual features like skewness, bimodality, or outliers.
Run Charts
A run chart plots individual measurements in time order against a centerline (typically the median). It is a simpler alternative to control charts that does not require statistical control limits, making it useful for identifying trends, shifts, cycles, and other non-random patterns in process data.
Pareto Analysis
Pareto analysis ranks defect types or quality problems by frequency or impact, identifying the vital few causes that account for the majority of issues. Based on the 80/20 principle, it prioritizes improvement efforts on the problems that will yield the greatest quality and cost benefit.
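A minimal sketch with hypothetical defect counts: rank categories and report cumulative percentages to find the vital few.

```python
# Hypothetical defect counts by category
defects = {"scratch": 48, "dent": 21, "misalignment": 13, "porosity": 9,
           "discoloration": 5, "other": 4}

ranked = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)
total = sum(defects.values())

cumulative = 0
for category, count in ranked:
    cumulative += count
    print(f"{category:<15} {count:>3}  {count/total:6.1%}  cumulative {cumulative/total:6.1%}")
```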
Process Stability
Process stability means a process is operating in statistical control — only common-cause variation is present, and the process distribution is consistent over time. A stable process is predictable: its mean, spread, and shape do not change from sample to sample.
Distribution Analysis
Normal Distribution
The normal (Gaussian) distribution is a symmetric, bell-shaped probability distribution fully described by its mean and standard deviation. It is the foundational assumption behind most classical statistical quality methods, including Cpk, Shewhart charts, and Six Sigma calculations.
Distribution Fitting
Distribution fitting is the process of finding a probability distribution that best describes a dataset. Traditional methods involve selecting a parametric family (normal, Weibull, lognormal) and estimating its parameters, then validating the fit with a goodness-of-fit test.
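A typical traditional workflow, sketched with SciPy on simulated data: fit several candidate families and compare goodness of fit. Note that feeding parameters estimated from the same data into the K-S test makes its p-values optimistic; the loop is shown only to illustrate the procedure.

```python
import numpy as np
from scipy import stats

# Simulated right-skewed measurement data
rng = np.random.default_rng(42)
data = rng.lognormal(mean=1.0, sigma=0.4, size=200)

# Fit candidate parametric families and check each with a K-S goodness-of-fit test
for dist in (stats.norm, stats.lognorm, stats.weibull_min):
    params = dist.fit(data)
    ks_stat, p_value = stats.kstest(data, dist.name, args=params)
    print(f"{dist.name:<12} KS = {ks_stat:.3f}, p = {p_value:.3f}")
```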
Non-Normal Data
Non-normal data is process data whose distribution does not follow the Gaussian (bell curve) pattern. Common non-normal patterns in manufacturing include skewed distributions, bimodal distributions, truncated distributions, and heavy-tailed distributions.
Weibull Distribution
The Weibull distribution is a versatile probability distribution widely used in reliability engineering and failure analysis. Its shape parameter allows it to model increasing failure rates (wear-out), constant failure rates (random failures), or decreasing failure rates (early mortality).
Lognormal Distribution
The lognormal distribution describes data whose logarithm follows a normal distribution. It is right-skewed, bounded below by zero, and commonly arises in manufacturing processes involving multiplicative effects — such as particle sizes, surface roughness, and chemical concentrations.
Exponential Distribution
The exponential distribution models the time between independent events occurring at a constant rate. In quality engineering, it describes time between random failures, wait times, and any process where events occur independently with a constant hazard rate.
Uniform Distribution
The uniform distribution describes data where every value within a range is equally likely. In manufacturing, it appears when measurement resolution is coarse relative to process variation, or when a process is controlled within tight bounds but has no tendency toward a central value.
Entropy Methods
EGDF (Entropic Global Distribution Function)
The EGDF is EntropyStat's primary distribution estimation method. It constructs a smooth, continuous cumulative distribution function directly from data using entropy-based algebraic optimization, without assuming any parametric form such as normal or Weibull.
ELDF (Entropic Local Distribution Function)
The ELDF is EntropyStat's local distribution analysis method. While the EGDF provides a global view of the entire distribution, the ELDF focuses on local structure — revealing peaks, clusters, and multimodal features hidden within the data.
Entropy in Statistics
Entropy, originally from thermodynamics and information theory, quantifies the uncertainty or disorder in a system. In statistics, entropy-based methods use this principle to build distribution estimates that make the fewest unwarranted assumptions about the data.
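For intuition only (this is plain Shannon entropy of a binned sample, not the EGDF/ELDF machinery): a concentrated dataset carries less entropy than a widely spread one over the same range.

```python
import numpy as np

def shannon_entropy(data, bins=10, data_range=(-3, 3)):
    """Shannon entropy (in bits) of a dataset's binned frequency distribution."""
    counts, _ = np.histogram(data, bins=bins, range=data_range)
    p = counts / counts.sum()
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(7)
narrow = rng.normal(0, 0.5, 1000)     # concentrated -> lower entropy
wide = rng.uniform(-3, 3, 1000)       # spread out   -> higher entropy
print(shannon_entropy(narrow), shannon_entropy(wide))
```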
Homogeneity Testing
Homogeneity testing determines whether a dataset comes from a single statistical population or contains multiple subpopulations. In manufacturing, non-homogeneous data indicates that the process was not operating in a single stable mode during data collection.
Cluster Detection
Cluster detection in quality analytics identifies distinct subgroups (modes) within process data. Unlike outlier detection, which flags individual extreme points, cluster detection finds coherent subpopulations that may have different means, variances, or distribution shapes.
Assumption-Free Statistics
Assumption-free statistics are methods that do not require data to follow a specific probability distribution (like normal, Weibull, or exponential). They derive results directly from the data structure using algebraic and geometric principles rather than probabilistic models with parametric assumptions.
Quality Standards
Quality 4.0
Quality 4.0 is the application of Industry 4.0 technologies — digital connectivity, AI, cloud computing, and advanced analytics — to quality management. It shifts quality from reactive inspection to predictive and prescriptive analytics driven by real-time data.
Six Sigma
Six Sigma is a data-driven quality methodology that aims to reduce defects to 3.4 per million opportunities. It uses the DMAIC framework (Define, Measure, Analyze, Improve, Control) and relies heavily on statistical tools to identify and eliminate sources of variation.
IATF 16949
IATF 16949 is the international quality management system standard for the automotive industry. It integrates ISO 9001 requirements with automotive-specific requirements for defect prevention, variation reduction, and supply chain quality management.
Measurement System Analysis (MSA)
MSA evaluates the quality of a measurement system — including the instrument, operator, environment, and procedure — to quantify how much of the observed variation is due to the measurement process itself rather than actual part-to-part differences.
Data-Driven Manufacturing
Data-driven manufacturing uses real-time data collection and statistical analytics to guide production decisions, replacing experience-based rules of thumb with evidence-based process control. It encompasses SPC, predictive maintenance, automated quality monitoring, and closed-loop process optimization.
Gage R&R (Repeatability & Reproducibility)
Gage R&R is a measurement system analysis technique that quantifies how much of observed process variation comes from the measurement system itself — split into repeatability (same operator, same part, same gage) and reproducibility (different operators measuring the same part).
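A simplified sketch of the final arithmetic, starting from variance components that a real study would estimate with a two-way ANOVA per the AIAG MSA manual; the numbers are hypothetical.

```python
import numpy as np

# Hypothetical variance components from a crossed Gage R&R study
var_repeatability   = 0.0004   # equipment variation (same operator, part, gage)
var_reproducibility = 0.0002   # operator-to-operator variation
var_part            = 0.0094   # true part-to-part variation

var_grr   = var_repeatability + var_reproducibility
var_total = var_grr + var_part

pct_grr = 100 * np.sqrt(var_grr / var_total)
print(f"%GRR = {pct_grr:.1f}% of total variation")
# Common rule of thumb: <10% acceptable, 10-30% marginal, >30% unacceptable
```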
PPAP (Production Part Approval Process)
PPAP is a standardized process in the automotive industry that demonstrates a supplier can consistently manufacture parts meeting all customer engineering design specifications. It requires documented evidence including process capability studies, measurement system analysis, and control plans.
APQP (Advanced Product Quality Planning)
APQP is a structured framework for developing and launching new products in the automotive industry. It defines five phases from planning through production validation, with quality tools (FMEA, control plans, MSA, capability studies) integrated at specific gates.
FMEA (Failure Mode and Effects Analysis)
FMEA is a systematic risk assessment method that identifies potential failure modes in a product or process, evaluates their severity, occurrence likelihood, and detectability, and prioritizes corrective actions. It produces a Risk Priority Number (RPN) or Action Priority (AP) for each failure mode.
8D Problem Solving
8D is a structured eight-discipline problem-solving methodology used in manufacturing to identify root causes, implement corrective actions, and prevent recurrence. It is widely required by automotive OEMs for formal customer complaint responses.
Statistical Concepts
Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov (K-S) test is a nonparametric goodness-of-fit test that measures the maximum distance between an empirical cumulative distribution function and a reference distribution. It determines whether a sample plausibly comes from a specified distribution.
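A minimal SciPy example against a fully specified reference distribution; the sample is simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=25.0, scale=0.1, size=60)   # hypothetical gap measurements

# Test against a reference distribution with known mean and sigma
stat, p = stats.kstest(sample, "norm", args=(25.0, 0.1))
print(f"K-S statistic = {stat:.3f}, p = {p:.3f}")
# A small p suggests the sample is unlikely to come from the reference distribution
```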
Outlier Detection
Outlier detection identifies data points that deviate significantly from the expected pattern of a dataset. In manufacturing, outliers may indicate measurement errors, tooling failures, material defects, or genuine process excursions that require investigation.
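Two common screening rules, sketched on hypothetical data: the Tukey IQR fence and a robust z-score based on the median absolute deviation.

```python
import numpy as np

x = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 13.5, 10.1, 9.9])

# Tukey IQR rule: flag points beyond 1.5 * IQR from the quartiles
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
iqr_outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]

# Robust z-score using the median absolute deviation (MAD)
med = np.median(x)
mad = np.median(np.abs(x - med))
robust_z = 0.6745 * (x - med) / mad
mad_outliers = x[np.abs(robust_z) > 3.5]

print("IQR rule:", iqr_outliers, " MAD rule:", mad_outliers)
```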
Small Sample Statistics
Small sample statistics deals with drawing reliable conclusions from limited data — typically fewer than 30 observations. Traditional methods lose reliability with small samples because parametric distribution estimates become unstable, and the Central Limit Theorem provides weaker guarantees.
Robust Statistics
Robust statistics are methods that remain reliable even when data contains outliers, contamination, or deviations from assumed distributions. They provide stable estimates where classical methods (like the mean and standard deviation) would be significantly distorted.
Anderson-Darling Test
The Anderson-Darling test is a statistical goodness-of-fit test that measures how well data follows a specified distribution. It gives extra weight to the tails of the distribution, making it more sensitive than the Kolmogorov-Smirnov test for detecting departures from normality.
Shapiro-Wilk Test
The Shapiro-Wilk test is a statistical test for normality that compares ordered sample values against their expected values under a normal distribution. It is widely considered the most powerful normality test for small to moderate sample sizes (n < 50).
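Both this test and the Anderson-Darling test are available in SciPy; a minimal example on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
sample = rng.normal(100, 2, 40)   # hypothetical measurements

w, p_sw = stats.shapiro(sample)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p_sw:.3f}")

ad = stats.anderson(sample, dist="norm")
print(f"Anderson-Darling: A^2 = {ad.statistic:.3f}")
print("Critical values (15/10/5/2.5/1%):", ad.critical_values)
# Checking normality matters before applying Cpk or other normal-theory methods
```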
Chi-Square Test
The chi-square test is a statistical test used for two purposes in quality engineering: testing goodness-of-fit (does observed data match an expected distribution?) and testing independence (are two categorical variables related?). It compares observed frequencies to expected frequencies across categories.
Student's t-Test
The t-test is a statistical test that compares means between two groups (two-sample t-test) or against a reference value (one-sample t-test). It determines whether observed differences are statistically significant or likely due to random sampling variation.
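A minimal SciPy sketch on simulated data, using Welch's version (which does not assume equal variances) plus a one-sample test against a nominal value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
machine_a = rng.normal(50.2, 0.5, 25)   # hypothetical dimensions from machine A
machine_b = rng.normal(50.0, 0.5, 25)   # and from machine B

# Welch's two-sample t-test
t, p = stats.ttest_ind(machine_a, machine_b, equal_var=False)
print(f"A vs B: t = {t:.2f}, p = {p:.4f}")

# One-sample test against the 50.0 mm nominal
t1, p1 = stats.ttest_1samp(machine_a, 50.0)
print(f"A vs nominal: t = {t1:.2f}, p = {p1:.4f}")
```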
ANOVA (Analysis of Variance)
ANOVA is a statistical method that tests whether the means of three or more groups differ significantly. It partitions total variation into between-group and within-group components, determining if observed group differences exceed what random variation alone would produce.
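A minimal one-way ANOVA sketch on simulated data from three hypothetical production lines:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
# Hypothetical coating thickness from three production lines
line1 = rng.normal(12.0, 0.3, 20)
line2 = rng.normal(12.1, 0.3, 20)
line3 = rng.normal(12.5, 0.3, 20)

f, p = stats.f_oneway(line1, line2, line3)
print(f"F = {f:.2f}, p = {p:.4f}")
# A small p indicates at least one line mean differs; follow up with a
# post-hoc comparison (e.g., Tukey HSD) to find which one
```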
Acceptance Sampling
Acceptance sampling is a statistical quality control method where a random sample is inspected from a lot to decide whether to accept or reject the entire lot. It balances inspection cost against the risk of accepting defective lots or rejecting good ones.
AQL (Acceptable Quality Level)
AQL is the maximum percentage of defective items in a lot that is considered acceptable for ongoing production. It serves as the primary index for acceptance sampling plans, defining the quality level at which lots will be accepted most of the time (typically 95%).
OC Curves (Operating Characteristic)
An OC curve plots the probability of accepting a lot as a function of the lot's true quality level (fraction defective). It characterizes a sampling plan's ability to discriminate between good and bad lots, showing both producer's risk (rejecting good lots) and consumer's risk (accepting bad lots).
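For a single-sampling plan (inspect n items, accept if at most c are defective), the OC curve is simply the binomial CDF evaluated at c; the plan below is hypothetical.

```python
from scipy import stats

n, c = 80, 2   # hypothetical plan: inspect 80 items, accept the lot if <= 2 defectives

# Probability of acceptance as a function of the lot's true fraction defective
for p_defective in (0.005, 0.01, 0.02, 0.04, 0.08):
    p_accept = stats.binom.cdf(c, n, p_defective)
    print(f"p = {p_defective:.3f}  ->  P(accept) = {p_accept:.3f}")
```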
Sample Size Determination
Sample size determination is the process of calculating the minimum number of measurements needed to achieve a desired level of statistical confidence and precision. It depends on the expected variability, the required precision (margin of error), and the acceptable error rates (Type I and Type II).
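A common textbook formula for estimating a mean, n = (z·σ/E)², sketched here with hypothetical inputs:

```python
import math
from scipy import stats

sigma = 0.8        # expected process standard deviation (from prior data)
E = 0.2            # required margin of error
confidence = 0.95

z = stats.norm.ppf(1 - (1 - confidence) / 2)
n = math.ceil((z * sigma / E) ** 2)
print(f"Required sample size: n = {n}")
```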
Confidence Intervals
A confidence interval is a range of values that, with a specified probability (typically 95%), contains the true population parameter. In quality engineering, confidence intervals quantify the uncertainty in estimates like process mean, standard deviation, and capability indices.
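A minimal t-based interval for a process mean, on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(4.05, 0.12, 20)   # hypothetical wall-thickness measurements

n, xbar, s = len(x), x.mean(), x.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)              # 95% two-sided
half_width = t * s / np.sqrt(n)
print(f"Mean = {xbar:.3f}, 95% CI = ({xbar - half_width:.3f}, {xbar + half_width:.3f})")
```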
Type I and Type II Errors
A Type I error (false positive, alpha risk) occurs when a statistical test incorrectly rejects a true null hypothesis. A Type II error (false negative, beta risk) occurs when a test fails to reject a false null hypothesis. In quality engineering, these map to false alarms and missed signals.
Subgroup Analysis
Subgroup analysis divides process data into rational subgroups — small groups of measurements collected under similar conditions (same machine, operator, material lot, time window). Variation within subgroups estimates short-term process noise, while variation between subgroups reveals shifts and trends.
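A minimal sketch with hypothetical subgroups: comparing the pooled within-subgroup sigma to the overall sigma exposes between-subgroup shifts.

```python
import numpy as np

# Hypothetical rational subgroups: 6 hourly samples of 4 parts each
subgroups = np.array([
    [20.1, 20.0, 20.2, 19.9],
    [20.0, 20.1, 19.9, 20.0],
    [20.2, 20.1, 20.3, 20.2],
    [20.4, 20.3, 20.5, 20.4],   # later subgroups shifted upward
    [20.5, 20.4, 20.6, 20.5],
    [20.6, 20.5, 20.7, 20.6],
])

# Short-term noise: pooled within-subgroup standard deviation
sigma_within = np.sqrt(np.mean(subgroups.var(axis=1, ddof=1)))
# Long-term spread: standard deviation of all data combined
sigma_overall = subgroups.ravel().std(ddof=1)

print(f"within = {sigma_within:.3f}, overall = {sigma_overall:.3f}")
# overall >> within indicates between-subgroup shifts or drift
```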
See Entropy-Powered Analysis in Action
Upload your data and compare traditional SPC with entropy-based methods. Free demo — no credit card required.