Field note / Mar 3, 2026 / Sam Gaddis

The Orchestration Thesis

Perplexity just launched a $200/month product that coordinates 19 AI models. If the orchestration layer captures the value, what are the model makers actually selling?

Fig / Classical still life with orchestral instruments

Last week Perplexity launched a product called "Computer." It coordinates 19 different AI models to complete complex workflows. Claude handles one subtask. Gemini handles another. Grok picks up a third. The system decides which model gets which job.

It costs $200 a month. The Verge described it as sitting "somewhere between OpenClaw and Claude Cowork."

I want to walk through why I think this matters more than most people realize.

The layer question

Every technology stack has a layer that captures most of the economic value. In cloud computing it was infrastructure (AWS). In mobile it was the app store (Apple, Google). In social it was the feed algorithm (Facebook, TikTok).

In AI, the consensus bet has been on the model layer. OpenAI, Anthropic, and Google have raised a combined $50+ billion on the premise that building the best model wins the market.

Perplexity is betting that's wrong. Their CEO Aravind Srinivas put it directly: "When models specialize, they just become tools similar to the file system, CLI tools, connectors, browser, search."

If he's right, the model makers are building commodities. The orchestrator captures the margin.

What we're seeing in practice

I wrote about the AI Automation Spectrum last fall. The framework runs from Level 0 (manual prompting) to Level 2 (autonomous agents that decide how to accomplish a goal). Perplexity Computer is a Level 2 system available to anyone with a credit card.

Six months ago this was a Series A pitch deck. Now it's a consumer product.

At Runpoint we've been building orchestration layers for clients since mid-2025. A few patterns keep showing up:

The model matters less than the routing. Clients who obsess over which model to use (GPT-5.2 vs. Claude vs. Gemini) are asking the wrong question. The right question is: which model for which subtask, and how do you move between them without losing context?
Specialization is accelerating. Claude is better at code. Gemini is better at multimodal. Grok is faster for simple queries. Six months ago the performance gaps were small. They're widening. This favors orchestration over loyalty to a single provider.
The pricing is moving in one direction. Model inference costs have dropped roughly 10x in the last year. When your raw input costs 90% less than it did twelve months ago, the value shifts to what you do with the output.
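
A minimal Python sketch of the first pattern, routing by subtask without losing context. This is an illustration, not how Perplexity or any client system is built. The `Orchestrator` class, the subtask labels, and the stub models are all hypothetical. A real system would swap the stubs for provider SDK calls and would probably summarize context instead of passing every prior output along.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "model" is any callable that takes a prompt and returns text.
ModelFn = Callable[[str], str]


def make_stub(name: str) -> ModelFn:
    """Build a placeholder model that reports how much context it received."""
    def call(prompt: str) -> str:
        return f"{name} saw {len(prompt.splitlines())} line(s)"
    return call


@dataclass
class Orchestrator:
    routes: Dict[str, ModelFn]   # subtask kind -> model to use
    default: ModelFn             # fallback for unrecognized subtasks
    history: List[str] = field(default_factory=list)

    def run(self, kind: str, instruction: str) -> str:
        model = self.routes.get(kind, self.default)
        # Prior outputs travel with the prompt, so context survives
        # a switch from one model to another.
        prompt = "\n".join(self.history + [instruction])
        output = model(prompt)
        self.history.append(output)
        return output


orch = Orchestrator(
    routes={"code": make_stub("code-model"), "vision": make_stub("vision-model")},
    default=make_stub("fast-model"),
)
first = orch.run("code", "Write a parser")
second = orch.run("vision", "Describe the chart")
third = orch.run("smalltalk", "Say hi")
print(first, second, third, sep="\n")
```

The routing table is the cheap part. The `history` handling is where the real design work sits: what to carry forward, what to drop, and how to keep it small enough for the next model.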

The uncomfortable question for model makers

David Sacks wrote in August that the leading models are "clustering around similar performance benchmarks" and "leapfrogging each other with their latest versions." Tyler Cowen called this "Goldilocks" for the industry.

I see it differently. If benchmarks are clustering, the models are becoming interchangeable for most tasks. If the models keep leapfrogging each other, no single one holds a durable advantage. Both are very good for orchestrators and very bad for model makers trying to justify their valuations.

OpenAI is valued at $300 billion. Anthropic at $61 billion. Those valuations assume the model layer captures most of the value in AI. If Perplexity is right that models are just tools in someone else's orchestration layer, those numbers need to come down.

Maybe they won't. Maybe OpenAI's distribution advantage (200M+ users) protects them regardless of what happens at the model layer. Maybe Anthropic's enterprise relationships create enough switching costs. I'm not sure. But the Perplexity launch is the first time a well-funded company has made the orchestration thesis its entire product strategy, and the early reviews suggest it works.

What this means for companies adopting AI

If you're building an AI strategy around a single model provider, you should reconsider. The companies I work with that are getting the most value from AI are the ones treating models as interchangeable components in a larger system.

Practically, that means:

Abstract your model calls. Don't hardcode OpenAI or Anthropic into your stack. Use a routing layer that can swap models without changing your application code.
Benchmark by task, not by model. Test Claude on your code tasks. Test Gemini on your document tasks. Test GPT on your customer-facing tasks. Use what works where it works.
Watch orchestration costs, not model costs. Model inference is getting cheaper. The expensive part is building and maintaining the logic that connects models to your actual business processes.
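
The first two recommendations can be sketched together in Python. Treat this as a rough illustration under stated assumptions: the names `score`, `pick_per_task`, and `complete`, the stub providers, and the toy test cases are all hypothetical, and substring matching stands in for whatever evaluation actually fits your tasks.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]
Case = Tuple[str, str]  # (prompt, substring expected in the answer)

# Stub providers. In practice these would wrap real provider SDK calls.
MODELS: Dict[str, ModelFn] = {
    "provider_a": lambda prompt: prompt.upper(),
    "provider_b": lambda prompt: prompt[::-1],
}

# One small test suite per task, not one global leaderboard.
SUITES: Dict[str, List[Case]] = {
    "shout": [("hi", "HI"), ("ok", "OK")],
    "reverse": [("abc", "cba"), ("xyz", "zyx")],
}


def score(model: ModelFn, cases: List[Case]) -> float:
    """Fraction of cases where the expected text appears in the output."""
    if not cases:
        return 0.0
    hits = sum(1 for prompt, expected in cases if expected in model(prompt))
    return hits / len(cases)


def pick_per_task(
    models: Dict[str, ModelFn], suites: Dict[str, List[Case]]
) -> Dict[str, str]:
    """Choose the best-scoring model name for each task."""
    routing: Dict[str, str] = {}
    for task, cases in suites.items():
        routing[task] = max(models, key=lambda name: score(models[name], cases))
    return routing


ROUTING = pick_per_task(MODELS, SUITES)


def complete(task: str, prompt: str) -> str:
    """The only call application code makes. No provider is named here."""
    return MODELS[ROUTING[task]](prompt)


print(ROUTING)
print(complete("reverse", "abc"))
```

Application code only ever calls `complete`. Swapping a provider means rerunning the benchmark and regenerating `ROUTING`, with no change to the callers.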

The model layer is important. I don't want to overstate this. But the assumption that bigger models automatically mean bigger moats is starting to break down. Perplexity just made that visible to everyone.

The Runpoint Letter
