Coding · Free (Apache 2.0)

Laguna XS.2

Open-weight 33B MoE coding model from Poolside AI with only 3B active parameters. Hits 68.2% on SWE-bench Verified and runs locally on a single GPU or Mac with 36GB RAM.

AI Score: 8.5 / 10

Overview

Laguna XS.2 is the first public open-weight model from Poolside AI, a company that has been quietly building coding-specific foundation models since 2023. Released on April 28, 2026, it uses a Mixture-of-Experts (MoE) architecture — 33 billion total parameters but only 3 billion active at inference time. That efficiency means you can run it locally on a single GPU with 36GB VRAM or even a MacBook Pro with 36GB unified memory, which is remarkable for a model at this benchmark level.

The headline number is 68.2% on SWE-bench Verified, which places Laguna XS.2 in the same ballpark as models several times its size. Poolside achieved this by training exclusively on code-related data — not general web text — and designing the architecture specifically for agentic workflows where the model needs to plan, edit files, run tests, and iterate across multi-step coding tasks. This isn't a general-purpose LLM that happens to code; it's a coding model through and through.

The weights are released under Apache 2.0 with no usage restrictions, and there's free API access through both Poolside's own endpoint and OpenRouter during a limited preview period. For teams that want to self-host a coding model without per-token API costs — or developers who want local inference for privacy reasons — Laguna XS.2 is immediately one of the strongest options available. The main caveat: it's narrowly focused on code, so don't expect strong performance on general knowledge or non-programming tasks.

Key features

Open Weights

Full model weights released under Apache 2.0. Download, fine-tune, and deploy commercially with zero restrictions. Available on Hugging Face.

Agentic Coding

Purpose-built for multi-step coding workflows — file editing, test execution, debugging loops, and long-horizon planning. Designed to work inside agentic harnesses like SWE-agent.
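The loop described above (plan, edit files, run tests, iterate on failures) can be sketched generically. Everything below is an illustrative stub of that pattern, not Poolside's harness or the SWE-agent API:

```python
# Minimal sketch of an agentic coding loop: propose an edit, run the
# tests, feed failures back into the prompt, repeat. All names here
# are illustrative stubs, not a real harness API.

def agent_loop(task: str, model, run_tests, max_steps: int = 5) -> bool:
    """Iterate model edits until the test suite passes or we give up."""
    feedback = ""
    for _ in range(max_steps):
        patch = model(task, feedback)   # model proposes a file edit
        passed, log = run_tests(patch)  # apply the patch, run the suite
        if passed:
            return True
        feedback = log                  # failure log goes back to the model
    return False

# Toy stand-ins so the loop is runnable end to end:
def toy_model(task, feedback):
    return "fixed" if "AssertionError" in feedback else "first attempt"

def toy_tests(patch):
    return (True, "") if patch == "fixed" else (False, "AssertionError: ...")

print(agent_loop("fix the bug", toy_model, toy_tests))  # → True
```

The point of training for "long-horizon" tasks is exactly this shape: the model's output at step N depends on test output from step N-1, not just the original task description.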

Local Inference

33B MoE with 3B active parameters means it runs on a single GPU (36GB VRAM) or Apple Silicon Mac with 36GB unified memory. No cloud dependency required.
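A rough back-of-envelope check (our arithmetic, not Poolside's published numbers) of why 33B parameters fit in 36GB: at 8-bit quantization each parameter takes one byte, so the weights alone need about 33GB, leaving a few GB for the KV cache and runtime overhead.

```python
# Back-of-envelope memory estimate for a 33B-parameter MoE model.
# Illustrative arithmetic only; real memory use depends on the runtime,
# quantization scheme, and context length.

TOTAL_PARAMS = 33e9   # total parameters (all experts resident in memory)
ACTIVE_PARAMS = 3e9   # parameters active per token (per-token compute cost)

def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights at a given precision."""
    return params * bits_per_param / 8 / 1e9

fp16 = weight_memory_gb(TOTAL_PARAMS, 16)  # ~66 GB: too big for 36 GB
int8 = weight_memory_gb(TOTAL_PARAMS, 8)   # ~33 GB: fits, barely
int4 = weight_memory_gb(TOTAL_PARAMS, 4)   # ~16.5 GB: comfortable headroom

print(f"fp16: {fp16:.1f} GB, int8: {int8:.1f} GB, int4: {int4:.1f} GB")
```

Note the asymmetry: all 33B weights must be resident in memory, but per-token compute scales with the 3B active parameters, which is why decoding stays fast on modest hardware.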

Pricing

Free tier: Fully open-weight under Apache 2.0. Free API access available during limited preview period.

Open Weights: Free

Apache 2.0 license, download from Hugging Face, self-host anywhere

Poolside API: Free (limited time)

Hosted inference via Poolside's API during preview period

OpenRouter: Free (limited time)

Access via OpenRouter API with standard OpenAI-compatible endpoints
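Since the endpoint is OpenAI-compatible, a plain chat-completions POST is all you need. A minimal sketch using only the standard library; the model slug below is a guess, so check OpenRouter's model listing for the real identifier:

```python
# Minimal chat-completions request against OpenRouter's OpenAI-compatible
# API, using only the standard library.
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "poolside/laguna-xs-2"  # hypothetical slug; verify on OpenRouter

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user",
         "content": "Write a Python function that reverses a linked list."}
    ],
}

def send(payload: dict) -> dict:
    """POST the chat payload and return the decoded JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# To actually call the API (requires OPENROUTER_API_KEY in the environment):
# reply = send(payload)
# print(reply["choices"][0]["message"]["content"])
```

The same request shape works against Poolside's own endpoint by swapping the base URL, assuming it follows the same OpenAI-compatible convention.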

Pros & cons

Pros

  • 68.2% SWE-bench Verified — matches models 10x its active parameter count
  • Runs locally on a single consumer GPU or 36GB Mac — no cloud required
  • Apache 2.0 with no restrictions — fine-tune and deploy commercially
  • MoE architecture keeps inference fast and memory-efficient

Cons

  • Narrowly focused on code — weak on general knowledge and non-programming tasks
  • Brand new release with limited community tooling and integrations so far
  • Free API access is explicitly temporary — long-term hosted pricing unclear
  • Requires 36GB RAM minimum — won't run on budget laptops or smaller GPUs
