Laguna XS.2
Open-weight 33B MoE coding model from Poolside AI with only 3B active parameters. Hits 68.2% on SWE-bench Verified and runs locally on a single GPU or Mac with 36GB RAM.
Overview
Laguna XS.2 is the first public open-weight model from Poolside AI, a company that has been quietly building coding-specific foundation models since 2023. Released on April 28, 2026, it uses a Mixture-of-Experts (MoE) architecture — 33 billion total parameters but only 3 billion active at inference time. That efficiency means you can run it locally on a single GPU with 36GB VRAM or even a MacBook Pro with 36GB unified memory, which is remarkable for a model at this benchmark level.
The headline number is 68.2% on SWE-bench Verified, which places Laguna XS.2 in the same ballpark as models several times its size. Poolside achieved this by training exclusively on code-related data — not general web text — and designing the architecture specifically for agentic workflows where the model needs to plan, edit files, run tests, and iterate across multi-step coding tasks. This isn't a general-purpose LLM that happens to code; it's a coding model through and through.
The weights are released under Apache 2.0 with no usage restrictions, and there's free API access through both Poolside's own endpoint and OpenRouter during a limited preview period. For teams that want to self-host a coding model without per-token API costs — or developers who want local inference for privacy reasons — Laguna XS.2 is immediately one of the strongest options available. The main caveat: it's narrowly focused on code, so don't expect strong performance on general knowledge or non-programming tasks.
Key features
Open Weights
Full model weights released under Apache 2.0. Download, fine-tune, and deploy commercially with zero restrictions. Available on Hugging Face.
Agentic Coding
Purpose-built for multi-step coding workflows — file editing, test execution, debugging loops, and long-horizon planning. Designed to work inside agentic harnesses like SWE-agent.
Local Inference
33B MoE with 3B active parameters means it runs on a single GPU (36GB VRAM) or Apple Silicon Mac with 36GB unified memory. No cloud dependency required.
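The 36GB figure follows from simple arithmetic: weight storage dominates, and 33B parameters quantized to 8 bits occupy roughly 33GB, leaving a few gigabytes of headroom for the KV cache and runtime buffers. A back-of-envelope sketch (the quantization levels shown are illustrative assumptions, not Poolside's published deployment configuration):

```python
def model_memory_gb(total_params_b: float, bits_per_param: int) -> float:
    """Approximate weight-only memory in decimal GB for a dense or MoE model.
    All experts must be resident, so MoE sizing uses TOTAL params, not active."""
    return total_params_b * 1e9 * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    # 16-bit: 66.0 GB (too big), 8-bit: 33.0 GB (fits 36GB), 4-bit: 16.5 GB
    print(f"{bits}-bit weights: ~{model_memory_gb(33, bits):.1f} GB")
```

Note that the 3B active parameters buy speed, not memory savings: every expert must be loaded, so the full 33B sets the footprint, while only 3B participate in each forward pass.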
Pricing
Free tier: Fully open-weight under Apache 2.0. Free API access available during limited preview period.
| Plan | Price | What's included |
|---|---|---|
| Open Weights | Free | Apache 2.0 license, download from Hugging Face, self-host anywhere |
| Poolside API | Free (limited time) | Hosted inference via Poolside's API during preview period |
| OpenRouter | Free (limited time) | Access via OpenRouter API with standard OpenAI-compatible endpoints |
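Since the hosted options expose standard OpenAI-compatible endpoints, calling the model takes nothing beyond the standard library. A minimal sketch: the OpenRouter base URL is real, but the model slug `poolside/laguna-xs-2` is an assumption here, so check the provider's model listing for the exact identifier.

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "poolside/laguna-xs-2") -> dict:
    """Assemble a standard chat-completions payload."""
    return {
        "model": model,  # assumed slug -- verify against the provider's docs
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # low temperature suits deterministic code edits
    }

def call_model(prompt: str, api_key: str,
               base_url: str = "https://openrouter.ai/api/v1") -> str:
    """POST a chat-completions request and return the assistant's reply."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the endpoint shape is OpenAI-compatible, the same payload works against Poolside's own API or a self-hosted server by swapping `base_url`.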
Pros & cons
Pros
- ✓ 68.2% SWE-bench Verified — matches models with 10× the active parameter count
- ✓ Runs locally on a single consumer GPU or 36GB Mac — no cloud required
- ✓ Apache 2.0 with no restrictions — fine-tune and deploy commercially
- ✓ MoE architecture keeps inference fast and memory-efficient
Cons
- × Narrowly focused on code — weak on general knowledge and non-programming tasks
- × Brand new release with limited community tooling and integrations so far
- × Free API access is explicitly temporary — long-term hosted pricing unclear
- × Requires 36GB RAM minimum — won't run on budget laptops or smaller GPUs


