Statistical Outlier Detection in Geological Data

Written by

in

Share the knowledge

Finding Hidden Data Quality Issues Before They Become Costly Problems

Geological and geotechnical databases often contain millions of records collected over many years by multiple organizations, drilling contractors, geologists, laboratory technicians, and field personnel. While validation rules can identify obvious errors such as negative depths, overlapping intervals, or missing coordinates, many data quality issues are far more subtle.

A recovery value of 150% is clearly invalid and can be detected using a simple rule. However, what about a recovery value of 98% in highly weathered rock where nearby intervals average only 40%? Or an assay result that is technically within an acceptable range but significantly different from surrounding samples?

These situations are known as statistical outliers.

Outlier detection is an advanced quality control technique that identifies unusual values that deviate significantly from expected patterns. Modern geological database systems increasingly use statistical methods, trend analysis, and artificial intelligence to detect anomalies automatically and help organizations improve data quality.

This article explores statistical outlier detection techniques in geological data, including Z-score validation, trend analysis, AI-assisted anomaly detection, managing false positives, and confidence scoring.

What Is an Outlier?

An outlier is a value that differs significantly from other observations within a dataset.

Outliers may represent:

Data entry errors
Measurement errors
Unit conversion mistakes
Laboratory issues
Geological anomalies
Exceptional but valid conditions

The challenge is determining which category applies.

For example:

Recovery (%)
75
78
80
82
77
79
81
35

Depth	RQD
10 m	82
11 m	79
12 m	81
13 m	84
14 m	8
15 m	80

Validation Type	Result
Negative depth	Error
Overlapping intervals	Error
Statistical anomaly	Warning

Score	Interpretation
0–25	Low concern
26–50	Moderate concern
51–75	High concern
76–100	Very high concern

Statistical Outlier Detection in Geological Data

Finding Hidden Data Quality Issues Before They Become Costly Problems

What Is an Outlier?

Why Outlier Detection Matters

Z-Score Validation

Example: Recovery Data

Example: Laboratory Results

Trend Analysis

Depth-Based Trends

Spatial Trends

Temporal Trends

AI-Assisted Anomaly Detection

Beyond Single Values

Pattern Recognition

Multi-Dataset Validation

False Positives

Geological Data Is Naturally Variable

Warnings Instead of Errors

Reducing False Positives

Contextual Validation

Depth Windows

Project-Specific Thresholds

Multi-Factor Analysis

Confidence Scoring

What Is Confidence Scoring?

Example

Scenario 1

Scenario 2

Workflow Integration

Best Practices for Implementing Outlier Detection

Use Multiple Detection Methods

Treat Outliers as Review Candidates

Provide Clear Explanations

Track Resolution History

Conclusion

More posts

Core Logging QA/QC in Mineral Exploration

Borehole Data Governance for Transportation Projects

Geotechnical QA/QC for Infrastructure Projects

SPT and N-Value QA/QC Procedures