Project Management

Statistical Misuse of Ordinal Scales: The Mathematical and Ethical Flaws of Averaging Planning Poker Metrics

From The Agile Enterprise Blog
This blog will explore agility at the enterprise level, examining how agile principles can be implemented throughout the organization—and in departments other than IT.

Categories: Agile, Estimating, Ethics


Introduction
In Agile software development, Planning Poker story points are widely used to estimate the size and complexity of work items. These estimates sit on an ordinal scale: a ranking in which the relative order of items matters but the size of the differences between them is undefined. Despite this, it is common practice to calculate averages, run regressions, and otherwise apply standard arithmetic to such data. This statistical misuse is not just a technical mistake; it has real-world consequences for decision-making and can cross into ethical misrepresentation. This post examines the nature of ordinal data, why treating it as interval data is problematic, and the ethical implications for teams and organizations. It closes with guidance for avoiding these pitfalls and a question inviting readers to reflect on their own experience.

Understanding Ordinal Scales in Agile Contexts
What Is an Ordinal Scale?
An ordinal scale ranks items or outcomes according to some criterion without specifying the degree of difference between them. Restaurant ratings (poor, fair, good, excellent) and pain scales (mild, moderate, severe) are ordinal. In Agile, Planning Poker uses a sequence of numbers (often Fibonacci: 1, 2, 3, 5, 8, 13, etc.) to estimate effort, but the gaps between these numbers do not represent consistent, measurable differences in effort.
Why Do Teams Use Ordinal Scales?
Ordinal scales like Planning Poker sequences are practical for group estimation, helping to drive consensus and discussion. They acknowledge the uncertainty and subjectivity inherent in software estimation, allowing teams to quickly rank work items from smallest to largest without worrying about precise measurement.

Statistical Misuse: Averages and Regressions on Ordinal Data
The Mathematics of Ordinal Data
Ordinal data tells us only the order of items, not the magnitude of the differences between them. For example, the difference in effort between a 2-point and a 3-point story is not necessarily the same as the difference between a 5-point and an 8-point story, even relative to their size. Treating these numbers as if they were evenly spaced marks on a ruler assumes a property that ordinal data does not have.
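One way to make this concrete: if only the order of the cards carries meaning, then replacing each card with its rank (1st, 2nd, 3rd, ...) keeps all of the information. The following is a minimal Python sketch under that assumption. The sprint data, the RANK relabeling and the compare helper are hypothetical and exist only for illustration. It shows that a comparison of means can flip under such a relabeling, while a comparison of medians does not.

```python
from statistics import mean, median_low

# Cards on a common Planning Poker deck, smallest to largest.
CARDS = [1, 2, 3, 5, 8, 13]

# An order-preserving relabeling: each card is replaced by its rank (1..6).
# If only the order carries meaning, these labels are as valid as the originals.
RANK = {card: position for position, card in enumerate(CARDS, start=1)}

# Hypothetical estimates for two sprints (illustrative data only).
sprint_a = [1, 1, 13]
sprint_b = [3, 5, 5]


def compare(statistic, left, right):
    """Return '<', '>' or '=' for statistic(left) versus statistic(right)."""
    a, b = statistic(left), statistic(right)
    return "<" if a < b else ">" if a > b else "="


ranks_a = [RANK[card] for card in sprint_a]
ranks_b = [RANK[card] for card in sprint_b]

# The same question asked four ways: which sprint had the "bigger" stories?
mean_on_cards = compare(mean, sprint_a, sprint_b)
mean_on_ranks = compare(mean, ranks_a, ranks_b)
median_on_cards = compare(median_low, sprint_a, sprint_b)
median_on_ranks = compare(median_low, ranks_a, ranks_b)

print(f"mean:   A {mean_on_cards} B on card labels, A {mean_on_ranks} B on rank labels")
print(f"median: A {median_on_cards} B on card labels, A {median_on_ranks} B on rank labels")
```

With this data the mean ranks sprint A above sprint B on the card labels and below it on the rank labels, so the conclusion depends on an arbitrary choice of labels. The median gives the same answer either way because it uses only the order.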
The Flaws of Mathematical Averages
Despite this, many teams and organizations calculate the average story point value for a sprint, or the average velocity across sprints. They may even run regressions to forecast future delivery. However, calculating averages or running arithmetic operations on ordinal data is mathematically unsound because:
  • The intervals between points are not consistent or meaningful.
  • The results can be misleading, producing averages that do not correspond to any real scenario (e.g., an average story size of 4.2 points).
  • It gives a false sense of precision and objectivity.
Regression and Advanced Analytics
Some organizations take it further, applying regression analysis or more complex statistical models to ordinal data. These methods assume interval or ratio-level data, where arithmetic operations are valid. Applied to ordinal metrics, they produce results that are spurious at best and that drive misguided decisions at worst.
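The same sensitivity shows up in a trend line. The sketch below fits an ordinary least-squares line to per-sprint "velocity" twice: once summing the card values and once summing order-preserving rank labels. It is an illustration under stated assumptions, not a recommended forecasting method. The sprint data, the RANK mapping and the least_squares_slope helper are all hypothetical.

```python
# Cards on a common Planning Poker deck and an order-preserving relabeling.
CARDS = [1, 2, 3, 5, 8, 13]
RANK = {card: position for position, card in enumerate(CARDS, start=1)}

# Hypothetical stories completed in three consecutive sprints.
sprints = [[13], [5, 5], [3, 3, 3]]


def least_squares_slope(ys):
    """Slope of the least-squares line through (1, ys[0]), (2, ys[1]), ..."""
    xs = range(1, len(ys) + 1)
    x_bar = sum(xs) / len(ys)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# "Velocity" per sprint under each labeling of the same estimates.
velocity_cards = [sum(stories) for stories in sprints]
velocity_ranks = [sum(RANK[card] for card in stories) for stories in sprints]

slope_cards = least_squares_slope(velocity_cards)
slope_ranks = least_squares_slope(velocity_ranks)

print("velocity on card labels:", velocity_cards, "slope:", slope_cards)
print("velocity on rank labels:", velocity_ranks, "slope:", slope_ranks)
```

On the card labels the fitted trend is downward; on the rank labels it is upward. Neither labeling is more correct if the data is ordinal, which is why a forecast built on either one is not trustworthy.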

Real-World Consequences of Statistical Misuse
Poor Decision-Making
Relying on mathematically flawed averages or projections leads to poor planning, unrealistic commitments, and ultimately, failed projects. Teams may be pushed to deliver "average" story sizes that are not grounded in reality or pressured to meet forecasted velocities that have no statistical validity.
Erosion of Trust
When stakeholders realize that the numbers don’t add up—or worse, when projects fail due to flawed metrics—trust in the estimation process and in leadership breaks down.
Ethical Implications
Misrepresenting ordinal metrics as if they were interval or ratio data is more than just a technical error; it’s an ethical lapse. It can:
  • Deceive stakeholders about team performance or project predictability.
  • Lead to unfair evaluations of teams or individuals based on invalid data.
  • Undermine psychological safety, as teams feel pressured to "hit the numbers."
Ethical reporting requires honesty about what metrics can and cannot tell us. Using the wrong statistical tools is, in effect, a form of data manipulation, even if unintentional.

Best Practices: Using Ordinal Metrics Responsibly
  1. Recognize the Limits: Treat story points and other ordinal metrics as relative rankings, not precise measurements.
  2. Avoid Arithmetic Operations: Don’t calculate averages or run regressions on ordinal data. Instead, look at frequency counts, medians, or modes.
  3. Educate Stakeholders: Ensure that everyone understands what ordinal metrics mean and how they should (and should not) be used.
  4. Report with Integrity: Be transparent about the limitations of your data and the methods used to analyze it.
  5. Focus on Conversation: Use ordinal metrics to drive discussion and consensus, not to produce misleading statistics.
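The order-based summaries mentioned in practice 2 can be sketched in a few lines of Python using only the standard library. The estimates list is hypothetical. Using median_low and median_high instead of median is a choice made for this sketch: with an even number of estimates, statistics.median averages the two middle values, which reintroduces arithmetic on the labels.

```python
from collections import Counter
from statistics import median_high, median_low, mode

# Hypothetical story estimates from one sprint (illustrative data only).
estimates = [2, 3, 3, 5, 8, 13]

# Frequency count: how many stories landed on each card.
frequency = Counter(estimates)

# Low and high medians are always actual cards from the data;
# no averaging of labels takes place.
low = median_low(estimates)
high = median_high(estimates)

# The mode is the most frequently chosen card.
most_common = mode(estimates)

for card in sorted(frequency):
    print(f"{card:>2}: {'#' * frequency[card]}")
print(f"median lies between card {low} and card {high}")
print(f"most common card: {most_common}")
```

Every value reported here is a card that someone actually played, so the summary can be read back to the team without implying a precision the scale does not have.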
The Bottom Line
Ordinal metrics like Planning Poker story points have value when used as intended—to facilitate team discussion and consensus. But applying standard mathematical operations to these numbers is both mathematically invalid and ethically questionable. By respecting the true nature of ordinal data and reporting it with integrity, teams and organizations can avoid misleading themselves and their stakeholders, making better decisions and building greater trust.

Question for Readers:
Have you encountered situations where averages or advanced analytics were applied to ordinal metrics like story points or Planning Poker estimates? How did it affect planning, transparency, or trust in your teams?
Share your experiences and insights below.
Posted on: June 15, 2026 01:21 AM

Comments (1)

Luis Branco, CEO | Business Insight, Consultores de Gestão, Ldª, Carcavelos, Lisboa, Portugal
Excellent article.

Perhaps the greatest risk is not the statistical error itself, but the illusion of certainty it creates. When ordinal estimates are treated as precise measurements rather than contextual indicators, organizations can gradually become more confident in their forecasts while becoming less connected to reality.

In that sense, metric integrity is about more than mathematics. It is about preserving the quality of judgment, decision-making and organizational learning.

A valuable reminder that precision should never be mistaken for understanding.
