AI is no longer a line item buried in R&D. According to the FinOps Foundation's 2026 State of FinOps report, 98% of respondent organisations now manage AI spend — up from just 31% two years ago. What was an experimental budget has become the fastest-growing, least-governed cost centre in enterprise technology.
Yet most organisations still lack the economic discipline to answer a basic question: Is our AI creating enterprise value — or just enterprise spend?
The answer lies in understanding token economics — the financial mechanics of how large language models consume, price, and bill for the compute that powers every AI interaction.
A token is the fundamental unit of measurement in LLM pricing. It is not a word or a character, but a subword unit the model uses to process text. On average, one token corresponds to roughly four English characters or about 0.75 words, though the exact ratio depends on the tokenizer and the language. A 1,000-word document typically consumes 1,300–1,500 tokens.
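The rule of thumb above can be turned into a quick budgeting estimate. The following is a minimal sketch, assuming the four-characters and 0.75-words averages quoted in the text; the function and constant names are illustrative, and real counts should come from the provider's own tokenizer.

```python
import math

# Averages quoted in the text for English prose. These are assumptions,
# not exact values: real ratios vary by tokenizer and by language.
CHARS_PER_TOKEN = 4.0
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate from character count (rounded up)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token estimate from word count (rounded up)."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


if __name__ == "__main__":
    # A 1,000-word document lands inside the 1,300-1,500 range cited above.
    print(estimate_tokens_from_words(1000))  # 1334
```

An estimate like this is only suitable for early sizing; billing is based on the provider's actual token counts.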
Every major AI provider — OpenAI, Anthropic, Google, Mistral — charges for API usage based on two distinct categories:
- Input tokens (prompt tokens): the text you send to the model, including system prompts, conversation history, and retrieved documents.
- Output tokens (completion tokens): the text the model generates in response.
Output tokens typically cost 3–8× more than input tokens. Input tokens can be processed in parallel, but generation is sequential: the model must run a full probability calculation for every single output token. This asymmetry is the single most important factor in enterprise cost modelling.
Applications that send long documents for summarisation have an input-dominated cost profile. Applications that generate long-form content have an output-dominated profile. Knowing your ratio before choosing a model is essential.
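To make the input/output asymmetry concrete, here is a small sketch that prices two workloads and reports how much of each bill comes from output tokens. The prices are hypothetical placeholders (a 5× output premium, within the 3–8× range mentioned above), not any provider's actual rates, and the `Pricing` class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens (hypothetical)
    output_per_million: float  # USD per 1M output tokens (hypothetical)


def request_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of one request."""
    return (input_tokens * p.input_per_million
            + output_tokens * p.output_per_million) / 1_000_000


def output_cost_share(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Fraction of the request cost attributable to output tokens."""
    total = request_cost(p, input_tokens, output_tokens)
    if total == 0:
        return 0.0
    return (output_tokens * p.output_per_million / 1_000_000) / total


# Hypothetical price point with a 5x output premium.
EXAMPLE = Pricing(input_per_million=2.0, output_per_million=10.0)

if __name__ == "__main__":
    workloads = {
        "summarisation": (20_000, 500),   # long document in, short summary out
        "generation": (500, 4_000),       # short brief in, long-form content out
    }
    for name, (tokens_in, tokens_out) in workloads.items():
        cost = request_cost(EXAMPLE, tokens_in, tokens_out)
        share = output_cost_share(EXAMPLE, tokens_in, tokens_out)
        print(f"{name}: ${cost:.3f} per request, {share:.0%} from output")
```

Under these assumed prices, the summarisation workload is dominated by input cost and the generation workload by output cost, even though both cost a similar amount per request. Running the same calculation against each candidate model's published rates shows which one suits a given ratio.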
Enterprise AI costs do not scale like traditional software. They are driven by five interconnected factors that compound at scale:
The most common mistake enterprises make is measuring AI spend only at the token level. Tokens are a unit of consumption, not a unit of value. The right KPI is cost per successful business outcome.
| Business Function | Better KPI |
|---|---|
| Customer Support | Cost per resolved ticket |
| Finance | Cost per invoice processed |
| Insurance | Cost per claim adjudicated |
| Sales | Cost per qualified proposal |
| Healthcare | Cost per documented clinical summary |
| Legal | Cost per contract reviewed |
This shift from consumption metrics to outcome metrics is what separates organisations that control AI economics from those that simply watch their bills grow.
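As an illustration of an outcome metric, the sketch below computes cost per resolved ticket from a list of interactions. The `Interaction` record, its fields, and the sample figures are hypothetical; in practice the cost would come from billing data and the success flag from the business system of record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    cost_usd: float    # total AI spend attributed to this interaction
    successful: bool   # e.g. ticket resolved without human escalation


def cost_per_interaction(interactions: list[Interaction]) -> float:
    """Consumption-style metric: average spend per interaction."""
    if not interactions:
        raise ValueError("no interactions to measure")
    return sum(i.cost_usd for i in interactions) / len(interactions)


def cost_per_successful_outcome(interactions: list[Interaction]) -> float:
    """Outcome metric: all spend, including failures, divided by successes."""
    successes = sum(1 for i in interactions if i.successful)
    if successes == 0:
        raise ValueError("no successful outcomes; metric is undefined")
    return sum(i.cost_usd for i in interactions) / successes


if __name__ == "__main__":
    # Hypothetical sample: four support conversations, one unresolved.
    tickets = [
        Interaction(0.04, True),
        Interaction(0.06, False),
        Interaction(0.05, True),
        Interaction(0.05, True),
    ]
    print(f"per interaction:     ${cost_per_interaction(tickets):.4f}")
    print(f"per resolved ticket: ${cost_per_successful_outcome(tickets):.4f}")
```

The numerator deliberately includes the cost of failed attempts, so the outcome metric rises when quality drops even if per-interaction spend stays flat.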
Based on our work with enterprise clients, we have identified four pillars of token cost optimisation that consistently deliver 40–60% cost reductions without sacrificing output quality.
McKinsey's analysis of over $3 billion in cloud spend found that most organisations have 10–20% in untapped savings. They estimate $120 billion in global value could be unlocked by embedding cost logic directly into engineering workflows.
The FinOps Foundation's 2025 framework redefined the discipline from "cloud cost management" to "advancing people who manage the value of technology." AI is now explicitly recognised as a primary FinOps scope.
The absolute cost of tokens will continue to fall. But the relative cost of wasteful token usage remains constant. The winners will be organisations that treat tokens as a governed financial asset — not an invisible technical metric.