AI Model Compute Costs High / Rising + Inference Costs Per Token Falling = Performance Converging + Developer Usage Rising

To understand the surge in AI developer activity, it’s instructive to look at the extraordinary drop in inference costs and the growing accessibility of capable models. Between 2022 and 2024, the cost-per-token to run language models fell by an estimated 99.7% – a decline driven by massive improvements in both hardware and algorithmic efficiency. What was once prohibitively expensive for all but the largest companies is now within reach of solo developers, independent app builders, researchers on a laptop, and mom-and-pop shop employees. The cost collapse has made experimentation cheap, iteration fast, and productization feasible for virtually anyone with an idea.

At the same time, performance convergence is shifting the calculus on model selection. The gap between the top-performing frontier models and smaller, more efficient alternatives is narrowing. For many use cases – summarization, classification, extraction, or routing – the difference in real-world performance is negligible. Developers are discovering they no longer need to pay a premium for a top-tier model to get reliable outputs. Instead, they can run cheaper models locally or via lower-cost API providers and achieve functionally similar results, especially when fine-tuned on task-specific data. This shift is weakening the pricing leverage of model incumbents and leveling the playing field for AI development…
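As a rough sense-check of what a ~99.7% decline in cost-per-token implies, the sketch below works through the arithmetic with a hypothetical baseline price (the dollar figures are placeholders for illustration, not numbers from the report):

```python
# Illustrative arithmetic only: the baseline price is a hypothetical
# placeholder; the ~99.7% decline is the 2022-2024 estimate cited above.
baseline_usd_per_million_tokens = 20.00   # hypothetical 2022-era price
decline = 0.997                           # estimated drop in cost per token

current_usd_per_million_tokens = baseline_usd_per_million_tokens * (1 - decline)

# Cost of a one-billion-token workload before and after the decline
workload_tokens = 1_000_000_000
cost_before = baseline_usd_per_million_tokens * workload_tokens / 1_000_000
cost_after = current_usd_per_million_tokens * workload_tokens / 1_000_000

print(f"Per million tokens: ${baseline_usd_per_million_tokens:.2f} -> "
      f"${current_usd_per_million_tokens:.2f}")
print(f"1B-token workload:  ${cost_before:,.0f} -> ${cost_after:,.2f}")
```

Under these placeholder assumptions, a workload that once cost $20,000 in inference would cost about $60 today, which is the kind of collapse that moves experimentation from budget-line item to rounding error.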
