Reference Class Forecasting (RCF), popularized by Bent Flyvbjerg, is used to counter optimism bias in project cost and schedule estimation. The logic is simple: compare your project to similar completed projects and adjust expectations based on their actual outcomes. It is one of the most influential forecasting innovations in project governance. But one foundational question deserves more attention: how is similarity defined in the first place?
In practice, reference classes are often formed using broad administrative categories such as “light rail,” “metro,” or “heavy rail.” Yet these similarity criteria are frequently underspecified, even though they determine the empirical distribution from which percentile uplifts are calculated. Before any statistical adjustment occurs, a methodological decision has already shaped the forecast.
In my recent study using a dataset of U.S. mass transit projects, I examined how sensitive RCF is to the way the reference class is formed. Rather than relying on predefined categories, I applied unsupervised clustering techniques to construct alternative reference classes based on structural attributes such as track length, number of stations, underground proportion, and rolling stock. Cost and schedule outcomes were excluded from the clustering inputs to preserve the outside-view logic.
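To make the idea concrete, here is a minimal sketch of forming reference classes from structural attributes alone. It is an illustration, not the procedure used in the study: the choice of k-means, the farthest-point initialization, the three attributes, and every project value below are assumptions made up for the example.

```python
import math

# Hypothetical projects: (track_km, stations, underground_share).
# All values are invented for illustration. Cost and schedule outcomes
# are deliberately absent from the feature set.
projects = {
    "A": (10.0, 8, 0.00),
    "B": (12.0, 10, 0.10),
    "C": (11.0, 9, 0.05),
    "D": (30.0, 25, 0.80),
    "E": (28.0, 22, 0.90),
    "F": (32.0, 27, 0.85),
}


def standardize(rows):
    """Z-score each column so that no single attribute dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column has zero variance to avoid dividing by zero.
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


features = standardize(list(projects.values()))
labels = dict(zip(projects, kmeans(features, k=2)))

for name, cluster in labels.items():
    print(f"project {name} -> candidate reference class {cluster}")
```

The resulting cluster labels are candidate reference classes. Changing the attribute set, the scaling, or the number of clusters yields a different grouping, which is exactly the sensitivity at issue.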
The results showed that changing the similarity groupings altered the distributions of cost and schedule outcomes. Percentile-based uplifts shifted, dispersion patterns changed, and the implied contingency requirements varied across alternative clusters. In other words, RCF outcomes proved structurally contingent on how similarity was operationalized, not merely on statistical adjustment.
This does not undermine RCF. The behavioral foundations established by scholars such as Daniel Kahneman and Amos Tversky remain essential. The outside view is still one of the strongest correctives to optimism bias in major projects. However, the findings suggest that the credibility of RCF depends not only on selecting the appropriate percentile (P50, P80, etc.), but also on transparently justifying how the reference class was formed.
For practitioners and governance bodies, this has important implications. Reference class formation should be treated as a methodological decision requiring documentation and testing. Specifically:
- Similarity criteria should be explicitly defined and justified.
- Alternative class constructions should be tested for sensitivity.
- Forecast reports should disclose how grouping decisions influence uplift outcomes.
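
The recommendations above can be sketched as a small sensitivity check. This is an illustrative example only: the overrun figures, the project labels, and the two class constructions are made up, and the linear-interpolation percentile is one common convention among several.

```python
def percentile(values, p):
    """Linear-interpolation percentile for p in [0, 100]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty reference class is undefined")
    rank = (len(xs) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)


# Made-up cost overruns (actual / forecast - 1) for completed projects.
overrun = {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40, "E": 0.60, "F": 0.80}

# Two hypothetical ways of forming the reference class for the same new project.
reference_classes = {
    "administrative category": ["A", "B", "D"],
    "structural cluster": ["A", "B", "C"],
}


def uplift_by_class(classes, outcomes, p):
    """Percentile uplift implied by each alternative class construction."""
    return {
        name: percentile([outcomes[m] for m in members], p)
        for name, members in classes.items()
    }


p80 = uplift_by_class(reference_classes, overrun, 80)
spread = max(p80.values()) - min(p80.values())

# Disclose the uplift under each construction and the spread between them.
for name, value in p80.items():
    print(f"{name}: P80 uplift = {value:.0%}")
print(f"Spread across class constructions: {spread:.0%}")
```

Reporting the spread alongside the chosen uplift shows reviewers how much of the contingency figure comes from the grouping decision rather than from the percentile choice.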
Posted on: May 04, 2026 08:00 AM