Coding Free API tier + open weights (Apache 2.0)

Command A+

Cohere's open-source 218B Mixture-of-Experts LLM built for agentic coding workflows, multilingual tasks, and document processing — runs on as few as 2 H100s.

Updated 2026-05-26

8.5

AI Score / 10

Visit Command A+

Overview

Command A+ is Cohere's flagship open-weight model, a 218-billion parameter Mixture-of-Experts architecture released under Apache 2.0 in May 2026. The MoE design activates only a subset of parameters per inference pass, which is how Cohere gets a model this large running on just two H100 GPUs — a significant hardware efficiency advantage over similarly-sized dense models. It's built from the ground up for agentic workflows: multi-step tool use, code generation, retrieval-augmented generation, and structured document processing.

What sets Command A+ apart from generic LLMs in the coding category is Cohere's focus on enterprise agentic use cases. The model is designed to chain tool calls, parse complex documents with mixed modalities (text, tables, charts), and operate across 23+ languages. For teams building AI agents that need to reason over codebases, process documentation, or orchestrate multi-step workflows, this is purpose-built rather than adapted after the fact.

The open-weight angle is the real differentiator. While GPT-4 and Claude are API-only, Command A+ can be self-hosted — critical for regulated industries or sovereign AI deployments. Quantized versions on Hugging Face make it accessible on smaller GPU setups. The trade-off: Cohere's ecosystem is thinner than OpenAI's or Anthropic's, and the model's raw conversational ability doesn't match the polish of ChatGPT or Claude for general chat use.

Key features

Agentic Coding

Purpose-built for multi-step agentic workflows: tool chaining, code generation, debugging, and structured output. Benchmarks show strong performance on agentic coding tasks compared to models in its class.

MoE Architecture

218B total parameters using Mixture-of-Experts, activating only a fraction per forward pass. This enables frontier-class capability while running on as few as 2 H100 GPUs — dramatically lower hardware requirements than dense models of similar quality.

Multilingual

Supports 23+ languages with strong performance across them, making it suitable for global enterprise deployments and sovereign AI initiatives where local language support is non-negotiable.

Document Processing

Multimodal document understanding handles mixed content including text, tables, charts, and structured data. Designed for RAG pipelines and enterprise knowledge extraction workflows.

Pricing

Free tier: Free API tier with rate limits for development; open weights downloadable from Hugging Face under Apache 2.0

Plan	Price	What's included
Cohere API — Free	Free	Rate-limited access for prototyping and evaluation
Cohere API — Production	Check website for current pricing	Higher rate limits, production SLAs, enterprise support available
Self-hosted	Free (Apache 2.0)	Open weights on Hugging Face. Quantized versions available. Minimum 2x H100 GPUs for full model

Cohere API — Free Free

Rate-limited access for prototyping and evaluation

Cohere API — Production Check website for current pricing

Higher rate limits, production SLAs, enterprise support available

Self-hosted Free (Apache 2.0)

Open weights on Hugging Face. Quantized versions available. Minimum 2x H100 GPUs for full model

Pros & cons

Pros

✓Open-source Apache 2.0 license allows self-hosting, fine-tuning, and full data sovereignty
✓218B MoE runs on just 2 H100s — exceptional hardware efficiency for a model of this capability
✓Strong agentic and tool-use benchmarks make it a serious option for AI agent builders
✓23+ language support positions it well for global and sovereign AI deployments

Cons

×Cohere's developer ecosystem is much smaller than OpenAI's or Anthropic's — fewer integrations and community resources
×General chat and creative writing quality trails behind ChatGPT and Claude
×Self-hosting still requires H100-class hardware — not accessible to hobbyists or small teams without cloud GPU budgets
×Production API pricing is not clearly published — requires contacting sales or checking the console

How it compares

Tool	Best for	Pricing	Score
Command A+	—	Free API tier + open weights (Apache 2.0)	8.5/10
Cursor vs Cursor →	—	Freemium	9.5/10
GPT-5.5 vs GPT-5.5 →	—	API: $5/$30 per 1M tokens (in/out). ChatGPT Plus $20/mo, Pro $200/mo	9.4/10
Windsurf	—	Freemium	9.1/10