Together AI Raises $800M at $8.3B Valuation
Together AI raised $800M at an $8.3B valuation on July 1, 2026. What the neocloud round means for open-model AI infrastructure.
On July 1, 2026, Together AI said it had raised $800 million at an $8.3 billion valuation, a round confirmed by Reuters, TechCrunch, and the company's own BusinessWire release. TechCrunch filed it under "neocloud," and that's the useful frame. This is not a model lab raising to train a bigger foundation model. It's an infrastructure company raising to sell GPUs and inference for models it mostly didn't build.
The headline number isn't the interesting part. The trajectory is. Together AI was reported to have raised its Series B at roughly a $3.3 billion valuation in early 2025. Getting to $8.3 billion in about 16 months, a 2.5x step-up, says investors believe demand for serving open-weight models is compounding faster than most people modeling this market assumed a year ago.
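The step-up arithmetic is easy to check. The sketch below uses the two reported valuations and the approximate 16-month gap from above. The annualized figure is an illustrative addition that assumes smooth compounding between the two rounds, which real valuations do not follow.

```python
def step_up(prev_valuation: float, new_valuation: float, months: float) -> tuple[float, float]:
    """Return (multiple, annualized growth rate) between two valuations.

    The annualized rate assumes smooth compounding over the period,
    which is a simplification used here for illustration only.
    """
    multiple = new_valuation / prev_valuation
    annualized = multiple ** (12.0 / months) - 1.0
    return multiple, annualized


# Reported figures: ~$3.3B (early 2025) to $8.3B (July 2026), about 16 months apart.
multiple, annualized = step_up(3.3e9, 8.3e9, 16)
print(f"step-up: {multiple:.2f}x, annualized growth: {annualized:.0%}")
```

Under those assumptions the multiple lands just above 2.5x, which works out to the valuation roughly doubling per year.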
What Together AI actually does
Together AI runs a GPU cloud built specifically for open and open-weight models. If you want to serve DeepSeek, Meta's Llama, Z.ai's GLM, MiniMax, Qwen, or a dozen other open models at production scale, behind an API that looks like everyone else's API, Together is one of a handful of companies you'd call. The product surface is roughly three things: hosted inference endpoints, fine-tuning and training on rented clusters, and dedicated GPU capacity for teams that have outgrown pay-per-token.
The company's technical credibility is worth noting because it's unusual for an infra provider. Chief scientist Tri Dao is a co-author of FlashAttention and the Mamba architecture, work that sits underneath a large fraction of how modern models are trained and served efficiently. That research lineage is part of why Together can plausibly claim its inference stack squeezes more tokens per dollar out of the same silicon than a naive deployment would. The BusinessWire release framed the raise around a mission line (making "frontier AI accessible to all"), which is marketing, but it points at the real wedge: taking open models that anyone can download and turning them into something an enterprise can actually run without hiring an infra team.
Why the money is chasing "neoclouds" right now
Together isn't alone, and the round only makes sense in the context of its peers. A cluster of GPU-cloud companies (CoreWeave, Lambda, Crusoe, Nebius) has raised or gone public over the past 18 months on the same basic thesis: the hyperscalers can't build capacity fast enough, and there's a durable business in renting specialized AI compute to everyone who doesn't want to wait in Azure's or GCP's queue. CoreWeave's public-market debut turned that thesis into a scoreboard number.
What sets Together apart from a pure capacity reseller is the software layer. Anyone can rack H100s or GB200s. Fewer companies can operate a serving stack that keeps those chips busy at high utilization while exposing a clean API and a catalog of pre-optimized open models. That's the part that justifies a software-ish multiple rather than a data-center-REIT multiple, and it's probably what got this round to $8.3 billion instead of something more modest.
My read: the valuation jump is a bet that open models keep closing the gap with closed frontier models, because that's the only world where a business built on serving open weights compounds this fast.
The open-vs-closed dynamic underneath the round
Here's the part that isn't in the press release but explains the whole thing. Together AI's business is a leveraged bet on open models staying competitive. Every time an open release lands within striking distance of the closed frontier (and 2026 has produced a steady stream of them, from DeepSeek's aggressive price-cut releases to GLM topping coding benchmarks to strong Qwen and Llama iterations), the addressable market for "serve this open model in production" grows.
The strategic argument for enterprises choosing open models on a neocloud usually comes down to a few things:
- Cost control. Open-weight inference on rented GPUs can undercut per-token pricing from closed APIs at scale, especially for high-volume, well-defined workloads.
- No lock-in. The weights are portable. If Together's pricing drifts, you can move the same model to another provider or on-prem. You can't do that with a closed API.
- Data governance. Dedicated deployments give regulated industries a cleaner story about where prompts and outputs live.
- Customization. Fine-tuning open weights on proprietary data is a first-class operation, not a black box.
None of that matters if open models are two generations behind. All of it matters if they're one release behind and a fraction of the price. Investors putting Together at $8.3 billion are, implicitly, in the second camp.
What this means for developers and enterprises
Concretely, more capital in a company like Together tends to translate into more GPU capacity contracted, more models onboarded and optimized quickly after release, and continued downward pressure on inference pricing. When a hot open model drops, the race among neoclouds to offer a fast, cheap, day-one endpoint is a large part of what makes that model actually usable for the rest of us. Funding that arms race is good for anyone building on open weights.
The honest caveat: I don't have Together's utilization numbers, gross margins, or customer concentration, and neither does anyone outside the cap table. Neocloud economics are capital-intensive and depend heavily on keeping expensive GPUs busy. A raise this size buys runway and capacity; it doesn't by itself prove the unit economics work at scale. The public reporting confirms the round, not the margins.
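To see why utilization dominates these economics, here is a minimal sketch of how idle time inflates the effective cost of serving tokens. Every input (the hourly GPU cost, the peak throughput, the utilization levels) is a hypothetical placeholder chosen for illustration, not a figure from Together or any other provider.

```python
def cost_per_million_tokens(gpu_hourly_cost: float,
                            peak_tokens_per_second: float,
                            utilization: float) -> float:
    """Effective cost (in dollars) to serve one million tokens on one GPU.

    utilization is the fraction of each hour the GPU spends doing
    billable work, between 0 (exclusive) and 1 (inclusive).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * 3600 * utilization
    return gpu_hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical inputs: $2.50 per GPU-hour, 2,000 tokens/s at full load.
for utilization in (1.0, 0.5, 0.25):
    cost = cost_per_million_tokens(2.50, 2000, utilization)
    print(f"utilization {utilization:.0%}: ${cost:.2f} per million tokens")
```

The relationship is inverse: halving utilization doubles the cost per token on the same hardware, which is why a large raise on its own says little about margins.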
How it compares to the closed labs raising in parallel
| Company | What they're selling | What the capital buys |
|---|---|---|
| Together AI | Infra + inference for open models | GPU capacity, serving-stack R&D |
| Closed frontier labs | Proprietary models via API | Training runs, compute, talent |
| Pure GPU resellers | Raw rented compute | Data centers, hardware |
Together sits in the middle of that spectrum: more software than a raw reseller, but not carrying the cost or the risk of training a frontier model from scratch. In a year when closed labs are raising staggering sums to fund training runs, the neocloud pitch is almost the inverse: let everyone else spend billions creating the models, and build the toll road they run on. It's a lower-variance position, and this round suggests the market is willing to pay a real premium for it.
The bottom line
Together AI's $800 million at $8.3 billion is a vote of confidence in a specific future: one where open-weight models stay close enough to the frontier that serving them is a very large business, and where enterprises increasingly want the cost, portability, and control that open models on dedicated infrastructure provide. If open models stall relative to closed ones, this valuation looks aggressive in hindsight. If they keep pace, which is the way 2026 has trended so far, the neoclouds serving them are exactly where you'd expect the money to go. Watch what Together does with the capacity, and watch how quickly the next big open release shows up as a cheap, fast endpoint. That cadence, more than any funding headline, is the real signal.