There is a quiet line item growing on enterprise AI budgets, and almost no one is watching it. It's the cost of using a frontier model to do work a far smaller, cheaper model would have handled just as well. Classifying a support ticket. Extracting a date from a form. Routing an email. Tasks that don't need a PhD in a box, but get billed as though they do.

The instinct is understandable. The biggest model is the safest choice in a demo, because it rarely embarrasses you. But provisioning premium compute for routine work is how AI programs become expensive without becoming valuable.

Capability is a spectrum, and so is cost

Modern AI isn't one tool. Think of it as a ladder. At the bottom sit small, fast, inexpensive models that excel at narrow, well-defined tasks. Above them are mid-tier models, and at the top are the frontier systems that handle open-ended reasoning. Cost rises sharply as you climb, so the real skill is knowing how high you actually need to go.

Many enterprise workloads cluster at the bottom and middle of that ladder. A surprising share of "AI" in production is classification, extraction, and routing, which is exactly the work a tuned small model does cheaply, quickly, and often more reliably than a giant one.

You don't hire a senior partner to file the paperwork. The same economics apply to models.

A framework for matching model to task

Before choosing a model, we ask four questions about the task itself:

  1. How open-ended is it? Narrow, well-specified tasks reward small models. Open-ended reasoning is where frontier models earn their cost.
  2. How high are the stakes of an error? A wrong product recommendation and a wrong clinical summary do not warrant the same spend or the same guardrails.
  3. What's the volume? At scale, a few cents of difference per call becomes a budget line: at a million calls a month, a two-cent gap is $20,000 a month. High-volume tasks are where right-sizing pays off most.
  4. Can a smaller model get there with structure? Often a modest model plus good retrieval, prompting, or a private fine-tune beats a frontier model used naively, at a fraction of the cost.

The goal isn't the cheapest model or the smartest one. It's the right one, chosen on purpose.
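To make the four questions concrete, here is a minimal sketch of how they might be encoded as a first-pass rule. Everything in it is an assumption made for illustration: the `Task` fields, the three tier names, the `HIGH_VOLUME` threshold, and the rule that high stakes bump a task up one tier. Treat it as a starting point for discussion, not a scoring method to adopt as-is.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SMALL = 1
    MID = 2
    FRONTIER = 3


@dataclass
class Task:
    name: str
    open_ended: bool            # Question 1: how open-ended is it?
    high_stakes: bool           # Question 2: how costly is an error?
    calls_per_month: int        # Question 3: what's the volume?
    works_with_structure: bool  # Question 4: is retrieval, prompting, or a fine-tune enough?


# Hypothetical threshold; pick one that matches your own budget.
HIGH_VOLUME = 100_000


def pick_tier(task: Task) -> Tier:
    """Return a starting tier for a task (illustrative rules only)."""
    if not task.open_ended:
        tier = Tier.SMALL        # narrow, well-specified work starts at the bottom
    elif task.works_with_structure:
        tier = Tier.MID          # open-ended, but structure closes the gap
    else:
        tier = Tier.FRONTIER     # open-ended reasoning with no shortcut
    # Assumption: high stakes justify one extra rung of spend.
    if task.high_stakes and tier is not Tier.FRONTIER:
        tier = Tier(tier.value + 1)
    return tier


def worth_right_sizing(task: Task) -> bool:
    """Flag high-volume tasks above the bottom tier as candidates for a cheaper model."""
    return task.calls_per_month >= HIGH_VOLUME and pick_tier(task) is not Tier.SMALL


ticket = Task("classify support ticket", False, False, 500_000, True)
summary = Task("clinical summary", True, True, 2_000, False)
print(ticket.name, "->", pick_tier(ticket).name)
print(summary.name, "->", pick_tier(summary).name)
```

The output of a rule like this is a place to start testing, not a verdict. The real answer comes from measuring each candidate tier against the task's own quality bar.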

Private and small models change the math

The rise of capable small and open models, including private, purpose-trained ones you run in your own environment, shifts the economics further. For sensitive data or high-volume tasks, running a private small language model can be cheaper, faster, and easier to govern than calling a frontier API for every request. That's not a reason to abandon frontier models. It's a reason to build a portfolio and route each job to the right tier.
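A quick back-of-the-envelope comparison shows why the portfolio matters. The prices and the traffic split below are invented for illustration and do not reflect any vendor's actual rates. Substitute your own numbers.

```python
# Hypothetical prices in US dollars per 1,000 calls, for illustration only.
PRICE_PER_1K_CALLS = {"small": 1, "mid": 10, "frontier": 50}


def monthly_cost(calls_by_tier: dict) -> float:
    """Sum the monthly spend for a given number of calls at each tier."""
    return sum(PRICE_PER_1K_CALLS[tier] * calls / 1000
               for tier, calls in calls_by_tier.items())


# Assumed workload: one million calls a month.
frontier_only = monthly_cost({"frontier": 1_000_000})

# Assumed split once easy work is routed down the ladder.
portfolio = monthly_cost({"small": 700_000, "mid": 200_000, "frontier": 100_000})

print(f"Frontier for everything: ${frontier_only:,.0f} per month")
print(f"Routed portfolio:        ${portfolio:,.0f} per month")
```

Under these assumed numbers the routed portfolio costs a fraction of the frontier-only bill. This sketch leaves out the cost of hosting and maintaining a private model, which belongs in any real comparison.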

This is, in the end, an engineering discipline rather than a procurement one. Orchestrating models, sending the easy work down the ladder and reserving the expensive reasoning for where it matters, is how AI gets both better and cheaper at once. The organizations that master it won't just spend less. They'll ship more, because their budget stretches across more use cases.
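One common way to send easy work down the ladder is a cascade: try the cheapest tier first and escalate only when it isn't confident. The sketch below uses stand-in functions in place of real models, and it assumes each model can return a usable confidence score, which in practice has to be calibrated and checked. The names `route`, `small_model`, and `frontier_model` are hypothetical.

```python
from typing import Callable, List, Tuple

# A model takes a prompt and returns (answer, confidence between 0 and 1).
Model = Callable[[str], Tuple[str, float]]


def small_model(prompt: str) -> Tuple[str, float]:
    """Stand-in for a tuned small classifier: confident only on a known pattern."""
    if "refund" in prompt.lower():
        return "billing", 0.95
    return "unknown", 0.30


def frontier_model(prompt: str) -> Tuple[str, float]:
    """Stand-in for a frontier model: always answers with high confidence."""
    return "detailed answer", 0.90


def route(prompt: str, ladder: List[Tuple[str, Model]],
          min_confidence: float = 0.8) -> Tuple[str, str]:
    """Walk the ladder from cheapest to most expensive; return (tier_name, answer)."""
    if not ladder:
        raise ValueError("ladder must contain at least one model")
    for name, model in ladder:
        answer, confidence = model(prompt)
        if confidence >= min_confidence:
            return name, answer
    # No tier was confident enough: fall back to the top tier's answer.
    return name, answer


ladder = [("small", small_model), ("frontier", frontier_model)]
print(route("I want a refund for my order", ladder))
print(route("Compare these two contracts", ladder))
```

Note the trade-off: an escalated request pays for both calls, so a cascade only saves money when most traffic is resolved at the lower tiers.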