Quality

Confidence labels

What 'low', 'medium', and 'high' confidence mean on an aggregation, and where each level is used.

How we assign confidence

Confidence is a categorical label attached to every aggregation, based on three factors: sample size, source diversity, and time-window completeness. We use it as a UI dim/highlight signal: low-confidence aggregates are still computed and shown but are visually de-emphasized.
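The three factors could be derived from raw observations along the following lines. This is a minimal sketch, not the production logic: the `Observation` type, the `summarize` function, and the choice to treat the window start and end as boundary points when measuring gaps are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical record: one data point and the source that reported it."""
    day: date
    source: str


def summarize(observations, window_start, window_end):
    """Return (sample_size, distinct_sources, largest_gap_days) for a window.

    Assumption: a gap is the number of days between consecutive observation
    dates, with the window start and end treated as boundary points.
    """
    in_window = [o for o in observations if window_start <= o.day <= window_end]
    sample_size = len(in_window)
    distinct_sources = len({o.source for o in in_window})

    # Boundary points plus every observed day, in chronological order.
    points = [window_start] + sorted(o.day for o in in_window) + [window_end]
    largest_gap_days = max((b - a).days for a, b in zip(points, points[1:]))
    return sample_size, distinct_sources, largest_gap_days
```

With no observations in the window, the largest gap under this assumption is the full window length, so an empty window fails the completeness factor.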

High confidence

All three conditions hold: the sample has ≥ 100 observations, at least 4 distinct data sources contribute, and the time window is fully covered with no gap > 14 days. High-confidence aggregates are used in indexes, time series, comparison tables, and CAGR calculations.

Medium confidence

Any of the following holds: the sample is ≥ 30 but < 100, only 2-3 sources contribute, or the time window has a gap of 14-30 days. Medium-confidence aggregates appear in variant statistics but are excluded from index construction.

Low confidence

Any of the following holds: the sample is < 30, the data comes from a single source, or the time window has a gap > 30 days. We still compute and display the aggregate but explicitly flag it. We never include low-confidence aggregates in index values or in CAGR calculations.
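The three tiers can be sketched as a single classification function. This is an illustrative reading of the thresholds, not the actual implementation: the function names are hypothetical, and two points the text leaves open are resolved by assumption. First, when an aggregate matches both a medium and a low condition, the low label wins. Second, a gap of exactly 14 days is treated as high, since the high rule only excludes gaps greater than 14 days.

```python
def classify_confidence(sample_size, distinct_sources, largest_gap_days):
    """Return 'high', 'medium', or 'low' for one aggregate (illustrative)."""
    # Low: any single weak factor is enough (assumed to take precedence).
    if sample_size < 30 or distinct_sources < 2 or largest_gap_days > 30:
        return "low"
    # High: all three factors must clear the stricter thresholds.
    # Assumption: a gap of exactly 14 days still qualifies as high.
    if sample_size >= 100 and distinct_sources >= 4 and largest_gap_days <= 14:
        return "high"
    # Everything else misses a high threshold without triggering a low one.
    return "medium"


def allowed_in_index(label):
    """Only high-confidence aggregates feed index construction."""
    return label == "high"
```

For example, an aggregate with 150 observations from 3 sources and a largest gap of 7 days would be labeled medium under this sketch, because it misses the 4-source threshold, and would therefore be excluded from index construction.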
