
Zen Max: 671B Reasoning Model

Zen Max is a 671B MoE reasoning model with 384 experts, a 256K context window, and unbiased weights. It scores 99.1% on AIME 2025, 71.3% on SWE-Bench Verified, and 60.2% on BrowseComp. Built for agents, researchers, and infrastructure that needs neutral AI.

Zen Max is a 671B mixture-of-experts reasoning model. It is currently the most capable model in the Zen MoDE lineup, and one of the highest-performing open-weight models available.

It is also unbiased.

Architecture

671B total parameters, 384 experts, 8 active per forward pass. At inference time, Zen Max activates approximately 14B parameters per token, giving it the compute profile of a mid-size dense model with the knowledge and capability of a 671B one.

This architecture makes Zen Max deployable at frontier quality on configurations that would be impractical for a dense 671B model. 8x H100 SXM handles the full-precision model. 4x H100 with FP8 quantization works for production throughput.
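To make the routing concrete, here is a minimal sketch of top-k expert selection with the numbers given above (384 experts, 8 active per token). The shapes, router, and function names are illustrative assumptions, not Zen Max's actual implementation:

```python
# Top-k MoE routing sketch: score all experts, keep the 8 best,
# and normalize their gate weights with a softmax.
import numpy as np

NUM_EXPERTS = 384
TOP_K = 8

def route(token_hidden: np.ndarray, router_weights: np.ndarray):
    """Pick the top-k experts for one token; return indices and gate weights."""
    logits = token_hidden @ router_weights            # (num_experts,)
    top_k_idx = np.argsort(logits)[-TOP_K:]           # the 8 highest-scoring experts
    gates = np.exp(logits[top_k_idx] - logits[top_k_idx].max())
    gates /= gates.sum()                              # softmax over selected experts only
    return top_k_idx, gates

rng = np.random.default_rng(0)
hidden = rng.standard_normal(1024)                    # hypothetical hidden size
router = rng.standard_normal((1024, NUM_EXPERTS))
experts, gates = route(hidden, router)
print(len(experts), round(float(gates.sum()), 6))     # → 8 1.0
```

Only the selected experts' weights participate in the forward pass, which is why per-token compute tracks the ~14B active parameters rather than the 671B total.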

Benchmark Results

Benchmark            Score    Notes
AIME 2025            99.1%    Math olympiad problems
SWE-Bench Verified   71.3%    Real-world software engineering tasks
BrowseComp           60.2%    Web research and fact-finding
GPQA Diamond         81.2%    Graduate-level science questions
Codeforces Rating    2140     Competitive programming

The AIME 2025 result of 99.1% is the number to look at. AIME is the American Invitational Mathematics Examination — problems that a small fraction of mathematically gifted high school students solve correctly. Getting 99.1% requires genuine mathematical reasoning, not pattern matching.

SWE-Bench Verified at 71.3% means Zen Max resolves 71.3% of real GitHub issues from open-source projects when given the repository and the issue description. This is an agentic task requiring understanding of existing code, planning a fix, and implementing it correctly.

Extended Thinking

256K context window with extended reasoning enabled. For complex multi-step problems, Zen Max can think for as long as the problem requires before generating output. The model allocates compute to thinking proportional to problem difficulty — short answers to simple questions, extended reasoning chains for research and hard math.
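As a sketch of how a caller might request extended reasoning, the payload below targets the Hanzo Cloud endpoint listed at the end of this post and assumes an OpenAI-compatible chat completions shape. The `reasoning` field is a hypothetical knob; the real API's parameter name may differ:

```python
# Build a chat completion request for zen-max with extended reasoning
# enabled. Endpoint and model name come from this post; the "reasoning"
# field is an assumed parameter, shown for illustration only.
import json

def build_request(prompt: str, extended_thinking: bool = True) -> dict:
    return {
        "model": "zen-max",
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical: let the model think as long as the problem requires.
        "reasoning": {"enabled": extended_thinking},
        "max_tokens": 4096,
    }

payload = build_request("Find all primes p such that p^2 + 2 is also prime.")
print(json.dumps(payload, indent=2))
```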

Agentic Tool Use

Zen Max handles 200-300 sequential tool calls in a single agentic task. This is not a typical generation loop — it is deep autonomous operation: the model issues tool calls, observes results, updates its plan, and continues. Tasks that require browsing dozens of web pages, running hundreds of code experiments, or traversing a large codebase end-to-end are within scope.
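The observe-act loop described above can be sketched as follows. The model is stubbed out and the tool names are illustrative; a real agent would call the Zen Max API in place of `fake_model`:

```python
# Minimal agent loop: the model issues a tool call, the runtime executes
# it, and the observation is appended to history for the next step.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q!r}",       # stand-in web search
    "read":   lambda url: f"contents of {url}",     # stand-in page fetch
}

def fake_model(history: list[str]) -> tuple[str, str]:
    """Stub policy: issue two tool calls, then finish."""
    calls = len(history)
    if calls == 0:
        return "search", "zen max benchmarks"
    if calls == 1:
        return "read", "zenlm.org"
    return "done", ""

def run_agent(max_steps: int = 300) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):                      # deep loops: 200-300 steps in scope
        tool, arg = fake_model(history)
        if tool == "done":
            break
        observation = TOOLS[tool](arg)              # execute the requested tool
        history.append(f"{tool} -> {observation}")  # feed the result back in
    return history

print(run_agent())
```

The point of the sketch is the control flow: the model, not the harness, decides when to call a tool and when to stop, and the loop simply feeds observations back until the model signals completion.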

Why Unbiased AI Matters for Agents

Autonomous agents operating across the internet, blockchain infrastructure, and distributed systems cannot be built on models that stall unpredictably. When an agent is running a multi-step research task or managing a cross-chain transaction, a mid-workflow refusal doesn't just fail the task — it can corrupt state, leave operations incomplete, and undermine the systems that depend on deterministic behavior.

Zen Max is trained for neutral, consistent reasoning. The model responds the same way to the same input regardless of how it is framed, who is asking, or what topic is involved. This is the scientific standard — reproducibility — applied to AI.

The directional biases introduced during standard fine-tuning are identified and removed from the model weights. All knowledge and capability are preserved. What is removed is the mechanism that caused asymmetric, non-reproducible behavior.

Benchmark scores are measured on the unbiased model. AIME, SWE-Bench, and GPQA scores reflect actual capability.

Get Zen Max

  • HuggingFace: huggingface.co/zenlm
  • Hanzo Cloud API: api.hanzo.ai/v1/chat/completions (model: zen-max)
  • Zen LM: zenlm.org — deployment guides, hardware requirements

64 SafeTensor shards. Download with `hf download zenlm/zen-max`.


Zach Kelling is the founder of Hanzo AI, Techstars '17.