MEGR 3171 — Week 2: Statistical Analysis & Error Characterization

Learning Objectives

Distinguish systematic (bias) error from random (precision) error and identify sources of each.
Compute mean, variance, standard deviation, and coefficient of variation from raw data.
Apply the normal (Gaussian) distribution and Student's t-distribution to measurement data.
Construct confidence intervals for population means using small samples.
Perform a two-sample t-test to compare the calibrations of two sensors and interpret the result.

1. Types of Error

Systematic (Bias) Error

Consistent, repeatable error that shifts all readings in the same direction. Cannot be reduced by averaging. Causes: poor calibration, zero offset, steady electromagnetic interference (EMI), loading effects. Corrected by calibration.

Random (Precision) Error

Unpredictable scatter about the mean. Can be reduced by averaging more readings: the scatter of the mean falls as 1/√N. Causes: electrical noise, turbulence, vibration, operator variability. Quantified by standard deviation.
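The contrast between the two error types can be illustrated with a small simulation. This is a minimal sketch, not a model of any particular sensor: the true value, bias, and noise level are hypothetical numbers chosen for illustration, and the noise is assumed to be Gaussian.

```python
import random
import statistics

def simulate_readings(true_value, bias, noise_sd, n, seed=0):
    """Return n simulated readings: true value + fixed bias + Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

# Hypothetical sensor: true input 100.0, bias +0.5, noise standard deviation 0.2
TRUE_VALUE, BIAS, NOISE_SD = 100.0, 0.5, 0.2

for n in (1, 10, 100, 10000):
    readings = simulate_readings(TRUE_VALUE, BIAS, NOISE_SD, n)
    error_of_mean = statistics.mean(readings) - TRUE_VALUE
    print(f"N = {n:5d}: error of the mean = {error_of_mean:+.4f}")
```

As N grows, the error of the mean should settle near the bias (+0.5) rather than near zero: averaging suppresses the random part only.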

Drift

A slow systematic change in sensor output over time, often temperature-related. Cannot be reduced by averaging — requires recalibration or temperature compensation.

Repeatability Error

The failure of a sensor to return the same output when the same input is applied repeatedly under identical conditions. A subset of precision error, measured under controlled, short-interval conditions.

2. Descriptive Statistics

For a dataset of N measurements x₁, x₂, ..., x_N:

Key Formulas

Sample Mean:       x̄ = (1/N) ∑ x_i

Sample Variance:   s² = (1/(N-1)) ∑(x_i - x̄)²

Std Deviation:     s = √s²

Coeff of Variation: CV = (s / x̄) × 100%     (dimensionless precision measure)

Why N-1 (not N)? Dividing by N-1 rather than N gives an unbiased estimator of the population variance. Because x̄ is computed from the same dataset, the N deviations (x_i - x̄) must sum to zero, so only N-1 of them are independent: one degree of freedom is lost. This is Bessel's correction.
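The key formulas translate directly into code. The sketch below is one possible implementation; the function name describe and the five readings are hypothetical, chosen only to keep the arithmetic easy to follow. It also cross-checks the result against Python's standard statistics module.

```python
import math
import statistics

def describe(data):
    """Return (mean, sample variance, sample std dev, CV in percent)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two readings to estimate variance")
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)  # Bessel's correction
    std_dev = math.sqrt(variance)
    cv_percent = std_dev / mean * 100.0  # undefined if the mean is zero
    return mean, variance, std_dev, cv_percent

# Hypothetical readings, not taken from any real sensor
readings = [10.0, 12.0, 11.0, 13.0, 14.0]
mean, variance, std_dev, cv = describe(readings)
print(f"mean = {mean:.3f}, s^2 = {variance:.3f}, s = {std_dev:.3f}, CV = {cv:.2f}%")

# Cross-check: statistics.variance() divides by N-1,
# statistics.pvariance() divides by N.
assert math.isclose(variance, statistics.variance(readings))
print("divide by N  :", statistics.pvariance(readings))
print("divide by N-1:", statistics.variance(readings))
```

For these readings the sum of squared deviations is 10, so dividing by N gives 2.0 and dividing by N-1 gives 2.5; the N-1 form is the one used throughout these notes.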

3. Probability Distributions

Normal (Gaussian) Distribution

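Random measurement errors are well approximated by the normal distribution when many independent noise sources are present (Central Limit Theorem). Key intervals for normally distributed data, with x̄ and s standing in for the population mean μ and standard deviation σ: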

x̄ ± 1s contains approximately 68.3% of observations
x̄ ± 2s contains approximately 95.4% of observations
x̄ ± 3s contains approximately 99.7% of observations
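These percentages follow from the cumulative distribution function of the standard normal. A minimal sketch using NormalDist from Python's standard statistics module (the helper name coverage is illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} sigma: {coverage(k) * 100:.1f}%")
```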

Student's t-Distribution

When the population standard deviation is unknown and must be estimated by s, the t-distribution is used in place of the standard normal. The difference matters most for small samples (N < 30 is a common rule of thumb). The t-distribution has heavier tails, reflecting the added uncertainty in s, and approaches the normal distribution as N increases.

The t-value depends on both the confidence level and the degrees of freedom (df = N − 1). Critical values are found in t-tables or computed via software.
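To see how the critical value shrinks toward the normal limit as the sample grows, the sketch below compares a few two-tailed 95% critical values with the corresponding normal value. The t-values are copied by hand from a standard t-table (Python's standard library has no t-distribution), and the dictionary name T_CRIT_95 is illustrative.

```python
from statistics import NormalDist

# Two-tailed 95% critical values (alpha/2 = 0.025) from a standard t-table,
# keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             7: 2.365, 10: 2.228, 20: 2.086, 30: 2.042}

z_crit = NormalDist().inv_cdf(0.975)  # normal-distribution limit, about 1.960

for df, t_crit in T_CRIT_95.items():
    n = df + 1  # df = N - 1
    print(f"N = {n:2d} (df = {df:2d}): t = {t_crit:6.3f}, t / z = {t_crit / z_crit:.2f}")
print(f"normal limit: z = {z_crit:.3f}")
```

With software that provides the t-distribution, the table lookup would be replaced by an inverse-CDF call; that step is left out here to keep the example dependency-free.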

Confidence Intervals for the Mean

Confidence Interval

CI: x̄ ± t_(α/2, N-1) · (s / √N)

where t_(α/2, N-1) is the two-tailed critical t-value
for confidence level (1-α) and df = N-1.

A 95% confidence interval means: if this experiment were repeated many times and a CI computed each time, approximately 95% of those intervals would contain the true population mean.
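The repeated-experiment interpretation can be checked numerically. The sketch below is a simulation under stated assumptions: a hypothetical normal population (mean 50.0, standard deviation 2.0), N = 8 readings per experiment, and the tabulated t-value 2.365 for df = 7. The function names are illustrative.

```python
import math
import random
import statistics

def confidence_interval(data, t_crit):
    """Return (lower, upper) for the mean: x_bar ± t_crit * s / sqrt(N)."""
    n = len(data)
    mean = statistics.mean(data)
    half_width = t_crit * statistics.stdev(data) / math.sqrt(n)
    return mean - half_width, mean + half_width

def coverage_fraction(true_mean, noise_sd, n, t_crit, trials, seed=1):
    """Fraction of simulated experiments whose CI contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, noise_sd) for _ in range(n)]
        lower, upper = confidence_interval(sample, t_crit)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials

# Hypothetical population; t_crit = 2.365 is the two-tailed 95% value for df = 7.
fraction = coverage_fraction(true_mean=50.0, noise_sd=2.0, n=8,
                             t_crit=2.365, trials=5000)
print(f"fraction of intervals containing the true mean: {fraction:.3f}")
```

The printed fraction should land close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the long-run behaviour of the procedure.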

4. Two-Sample t-Test for Sensor Comparison

To determine whether two sensors have statistically different mean outputs for the same measurand, use the two-sample (Welch's) t-test, which does not assume that the two sensors have equal variances.

Two-Sample t-Statistic

t = (x̄_1 - x̄_2) / √(s_1²/N_1 + s_2²/N_2)

Degrees of freedom (Welch–Satterthwaite):
df ≈ (s_1²/N_1 + s_2²/N_2)² / [ (s_1²/N_1)²/(N_1-1) + (s_2²/N_2)²/(N_2-1) ]
Simplified: df ≈ min(N_1, N_2) - 1  (conservative)

Decision: Reject H_0 (no difference) if |t| > t_critical = t_(α/2, df)

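The null hypothesis H₀ is that the two sensors measure the same mean value. Rejecting H₀ at the 5% significance level means that, if the sensors truly had the same mean, a difference at least this large would arise from random scatter alone less than 5% of the time. It does not mean there is a 95% probability that the difference is real.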
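The test can be sketched in a few lines using only Python's standard library. The sensor readings below are hypothetical, the function name welch_t_test is illustrative, and the critical value is taken by hand from a t-table for the conservative df.

```python
import math
import statistics

def welch_t_test(sample_1, sample_2):
    """Return (t, welch_df, conservative_df) for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = statistics.variance(sample_1) / n1  # s_1^2 / N_1
    v2 = statistics.variance(sample_2) / n2  # s_2^2 / N_2
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / math.sqrt(v1 + v2)
    # Welch–Satterthwaite approximation for the degrees of freedom
    welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    conservative_df = min(n1, n2) - 1
    return t, welch_df, conservative_df

# Hypothetical readings (same units) from two sensors measuring the same input
sensor_a = [20.1, 20.3, 19.9, 20.2, 20.0, 20.4]
sensor_b = [20.6, 20.5, 20.9, 20.7, 20.4, 20.8]

t, welch_df, conservative_df = welch_t_test(sensor_a, sensor_b)
T_CRITICAL = 2.571  # two-tailed 95% t-table value for df = 5 (conservative df)

print(f"t = {t:.3f}, Welch df = {welch_df:.1f}, conservative df = {conservative_df}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```

Using the conservative df gives a larger critical value than the Welch df would, so it is slightly less likely to reject H₀; that is the sense in which it is conservative.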

Practice Problems

Problem 1 — Descriptive Statistics A pressure sensor is tested 8 times at a reference pressure of 200 kPa. Readings (kPa): 198.5, 199.2, 197.8, 200.1, 199.5, 198.9, 200.3, 199.0. Compute the mean, sample standard deviation, and coefficient of variation.

Sum = 198.5+199.2+197.8+200.1+199.5+198.9+200.3+199.0 = 1593.3

Mean x̄ = 1593.3 / 8 = 199.163 kPa

Squared deviations (x_i - x̄)², using x̄ = 199.1625: 0.439, 0.001, 1.856, 0.879, 0.114, 0.069, 1.294, 0.026. Sum = 4.679 kPa² (from unrounded values)

s² = 4.679 / 7 = 0.668 kPa²

s = √0.668 = 0.818 kPa

CV = (0.818 / 199.163) × 100% = 0.41%

x̄ = 199.16 kPa, s = 0.82 kPa, CV = 0.41%

Problem 2 — Confidence Interval Using the data from Problem 1, construct a 95% confidence interval for the true mean pressure. (Use t_{0.025, 7} = 2.365.)

CI = x̄ ± t · (s / √N) = 199.163 ± 2.365 · (0.818 / √8)

= 199.163 ± 2.365 · 0.2892 = 199.163 ± 0.684

95% CI: [198.48 kPa, 199.85 kPa]
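The hand calculations in Problems 1 and 2 can be checked with a short script. The sketch below uses only Python's standard library and the t-value given in Problem 2; the variable names are illustrative.

```python
import math
import statistics

readings_kpa = [198.5, 199.2, 197.8, 200.1, 199.5, 198.9, 200.3, 199.0]
T_CRIT = 2.365  # t_{0.025, 7}, as given in Problem 2

n = len(readings_kpa)
mean = statistics.mean(readings_kpa)
s = statistics.stdev(readings_kpa)  # sample standard deviation (divides by N-1)
cv_percent = s / mean * 100.0
half_width = T_CRIT * s / math.sqrt(n)

print(f"mean = {mean:.3f} kPa")
print(f"s    = {s:.3f} kPa")
print(f"CV   = {cv_percent:.2f}%")
print(f"95% CI = [{mean - half_width:.2f}, {mean + half_width:.2f}] kPa")
```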

Problem 3 — Error Type Classification A thermocouple reads consistently 3°C below the true temperature in a warm lab. After the lab cools overnight, it reads correctly. Classify this error and recommend a correction strategy.

This is a systematic (bias) error with a drift component linked to ambient temperature. The thermocouple's reference junction compensation circuit is likely affected by the elevated room temperature, introducing a consistent offset.

Systematic/drift error. Correction: cold-junction compensation, isothermal block, or recalibration at operating temperature.