<aside> 📏 Discipline: Measure what matters, not what is easy to count.
</aside>
1. Actionable: Does this metric tell you what to do differently?
2. Timely: Can you get this data fast enough to act on it?
3. Granular: Does this capture variation across segments, not just aggregates?
4. Comparable: Can you compare this to a baseline or benchmark?
5. Resistant to gaming: Is it hard to improve the metric without improving the outcome?
AI optimizes what you measure. Full stop.
If you measure response time when the objective is first-contact resolution, AI produces fast but incomplete answers. If you measure throughput when the objective is accuracy, AI processes more items with more errors.
The solution: Measure the proxy and the objective simultaneously. When the proxy improves and the objective does not, stop optimizing the proxy.
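The proxy/objective check above can be sketched as a small divergence monitor. This is a minimal illustration, not a prescribed implementation; the class, metric names, and the 5% threshold are all hypothetical:

```python
# Minimal sketch of proxy/objective divergence monitoring.
# All names (MetricPair, the baselines, the threshold) are illustrative.

from dataclasses import dataclass


@dataclass
class MetricPair:
    """Tracks a proxy metric alongside the objective it stands in for."""
    name: str
    proxy_baseline: float
    objective_baseline: float

    def diverging(self, proxy_now: float, objective_now: float,
                  min_delta: float = 0.05) -> bool:
        """True when the proxy improved but the objective did not.

        Both metrics are assumed "higher is better"; transform
        (e.g. negate latency) before passing if lower is better.
        """
        proxy_gain = (proxy_now - self.proxy_baseline) / abs(self.proxy_baseline)
        objective_gain = (objective_now - self.objective_baseline) / abs(self.objective_baseline)
        return proxy_gain >= min_delta and objective_gain < min_delta


# Proxy: share of answers sent within one minute.
# Objective: first-contact resolution rate.
pair = MetricPair("support_speed", proxy_baseline=0.60, objective_baseline=0.70)

# Proxy up 25%, objective flat: the signal to stop optimizing the proxy.
print(pair.diverging(proxy_now=0.75, objective_now=0.70))  # True
```

In practice the baselines would come from a pre-optimization window and both series would be tracked per segment, per the granularity criterion above.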
<aside> ⚠️ Goodhart’s Law in practice: When a measure becomes a target, it ceases to be a good measure. For every metric you optimize, identify how it could be gamed. Then measure the game.
</aside>
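The aside's advice, identify the game and then measure it, can be operationalized as a pairing table: every metric you optimize ships with the counter-metric that would reveal its most likely gaming strategy. A minimal sketch; the metric names are illustrative assumptions, not from the source:

```python
# Hypothetical pairing of optimized metrics with gaming detectors.
GAMING_PAIRS = {
    # target metric      -> counter-metric that would reveal gaming
    "ticket_close_rate":    "ticket_reopen_rate",      # game: close prematurely
    "avg_response_time":    "first_contact_resolution", # game: fast, incomplete answers
    "items_processed":      "error_rate",               # game: volume over accuracy
}


def dashboard_metrics(targets: list[str]) -> list[tuple[str, str]]:
    """Report every target metric alongside its gaming detector."""
    return [(t, GAMING_PAIRS[t]) for t in targets]


print(dashboard_metrics(["avg_response_time"]))
# [('avg_response_time', 'first_contact_resolution')]
```

A dashboard that refuses to show a target without its counter-metric makes Goodhart drift visible by construction.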