
Zen Max: 671B Reasoning Model

Zen Max is a 671B MoE reasoning model with 384 experts, a 256K context window, and unbiased weights. It scores 99.1% on AIME 2025, 71.3% on SWE-Bench Verified, and 60.2% on BrowseComp. Built for agents, researchers, and infrastructure that needs neutral AI.

Zen Max is a 671B mixture-of-experts reasoning model. It is currently the most capable model in the Zen MoDE lineup, and one of the highest-performing open-weight models available.

It is also unbiased.

Architecture

671B total parameters, 384 experts, 8 active per forward pass. At inference time, Zen Max activates approximately 14B parameters per token, giving it the compute profile of a mid-size dense model with the knowledge and capability of a 671B one.

This architecture makes Zen Max deployable at frontier quality on configurations that would be impractical for a dense 671B model. 8x H100 SXM handles the full-precision model. 4x H100 with FP8 quantization works for production throughput.
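To make the routing concrete, here is a minimal sketch of top-k expert selection with the numbers given above (384 experts, 8 active per token). The shapes, router, and function names are illustrative assumptions, not Zen Max's actual implementation:

```python
# Top-k MoE routing sketch: score all experts, keep the 8 best,
# and normalize their gate weights with a softmax.
import numpy as np

NUM_EXPERTS = 384
TOP_K = 8

def route(token_hidden: np.ndarray, router_weights: np.ndarray):
    """Pick the top-k experts for one token; return indices and gate weights."""
    logits = token_hidden @ router_weights            # (num_experts,)
    top_k_idx = np.argsort(logits)[-TOP_K:]           # the 8 highest-scoring experts
    gates = np.exp(logits[top_k_idx] - logits[top_k_idx].max())
    gates /= gates.sum()                              # softmax over selected experts only
    return top_k_idx, gates

rng = np.random.default_rng(0)
hidden = rng.standard_normal(1024)                    # hypothetical hidden size
router = rng.standard_normal((1024, NUM_EXPERTS))
experts, gates = route(hidden, router)
print(len(experts), round(float(gates.sum()), 6))     # → 8 1.0
```

Only the selected experts' weights participate in the forward pass, which is why per-token compute tracks the ~14B active parameters rather than the 671B total.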

Benchmark Results

Benchmark            Score    Notes
AIME 2025            99.1%    Math olympiad problems
SWE-Bench Verified   71.3%    Real-world software engineering tasks
BrowseComp           60.2%    Web research and fact-finding
GPQA Diamond         81.2%    Graduate-level science questions
Codeforces Rating    2140     Competitive programming

The AIME 2025 result of 99.1% is the number to look at. AIME is the American Invitational Mathematics Examination — problems that a small fraction of mathematically gifted high school students solve correctly. Getting 99.1% requires genuine mathematical reasoning, not pattern matching.

SWE-Bench Verified at 71.3% means Zen Max resolves 71.3% of real GitHub issues from open-source projects when given the repository and the issue description. This is an agentic task requiring understanding of existing code, planning a fix, and implementing it correctly.

Extended Thinking

256K context window with extended reasoning enabled. For complex multi-step problems, Zen Max can think for as long as the problem requires before generating output. The model allocates compute to thinking proportional to problem difficulty — short answers to simple questions, extended reasoning chains for research and hard math.
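As a sketch of how a caller might request extended reasoning, the payload below targets the Hanzo Cloud endpoint listed at the end of this post and assumes an OpenAI-compatible chat completions shape. The `reasoning` field is a hypothetical knob; the real API's parameter name may differ:

```python
# Build a chat completion request for zen-max with extended reasoning
# enabled. Endpoint and model name come from this post; the "reasoning"
# field is an assumed parameter, shown for illustration only.
import json

def build_request(prompt: str, extended_thinking: bool = True) -> dict:
    return {
        "model": "zen-max",
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical: let the model think as long as the problem requires.
        "reasoning": {"enabled": extended_thinking},
        "max_tokens": 4096,
    }

payload = build_request("Find all primes p such that p^2 + 2 is also prime.")
print(json.dumps(payload, indent=2))
```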

Agentic Tool Use

Zen Max handles 200-300 sequential tool calls in a single agentic task. This is not a typical generation loop — it is deep autonomous operation: the model issues tool calls, observes results, updates its plan, and continues. Tasks that require browsing dozens of web pages, running hundreds of code experiments, or traversing a large codebase end-to-end are within scope.
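The observe-act loop described above can be sketched as follows. The model is stubbed out and the tool names are illustrative; a real agent would call the Zen Max API in place of `fake_model`:

```python
# Minimal agent loop: the model issues a tool call, the runtime executes
# it, and the observation is appended to history for the next step.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q!r}",       # stand-in web search
    "read":   lambda url: f"contents of {url}",     # stand-in page fetch
}

def fake_model(history: list[str]) -> tuple[str, str]:
    """Stub policy: issue two tool calls, then finish."""
    calls = len(history)
    if calls == 0:
        return "search", "zen max benchmarks"
    if calls == 1:
        return "read", "zenlm.org"
    return "done", ""

def run_agent(max_steps: int = 300) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):                      # deep loops: 200-300 steps in scope
        tool, arg = fake_model(history)
        if tool == "done":
            break
        observation = TOOLS[tool](arg)              # execute the requested tool
        history.append(f"{tool} -> {observation}")  # feed the result back in
    return history

print(run_agent())
```

The point of the sketch is the control flow: the model, not the harness, decides when to call a tool and when to stop, and the loop simply feeds observations back until the model signals completion.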

Why Unbiased AI Matters for Agents

Autonomous agents operating across the internet, blockchain infrastructure, and distributed systems cannot be built on models that stall unpredictably. When an agent is running a multi-step research task or managing a cross-chain transaction, a mid-workflow refusal doesn't just fail the task — it can corrupt state, leave operations incomplete, and undermine the systems that depend on deterministic behavior.

Zen Max is trained for neutral, consistent reasoning. The model responds the same way to the same input regardless of how it is framed, who is asking, or what topic is involved. This is the scientific standard — reproducibility — applied to AI.

The directional biases introduced during standard fine-tuning are identified and removed from the model weights. All knowledge and capability are preserved. What is removed is the mechanism that caused asymmetric, non-reproducible behavior.

Benchmark scores are measured on the unbiased model. AIME, SWE-Bench, and GPQA scores reflect actual capability.

Get Zen Max

  • HuggingFace: huggingface.co/zenlm
  • Hanzo Cloud API: api.hanzo.ai/v1/chat/completions (model: zen-max)
  • Zen LM: zenlm.org — deployment guides, hardware requirements

64 SafeTensor shards. Download with `hf download zenlm/zen-max`.


Zach Kelling is the founder of Hanzo AI, Techstars '17.