The full Zen4 family is available today. Eight models spanning 4B to 1T parameters, every scale covered, all unbiased.
No artificial constraints. Full capability access. Open weights.
The Complete Zen4 Lineup
| Model | Parameters | Active Params | Architecture | Context |
|---|---|---|---|---|
| Zen4 Mini | 4B | 4B | Dense | 128K |
| Zen4 Base | 8B | 8B | Dense | 128K |
| Zen4 Pro | 14B | 14B | Dense | 128K |
| Zen4 Max | 30B | 3B | MoE | 256K |
| Zen4 Coder Flash | 31B | 3B | MoE | 131K |
| Zen4 Pro Max | 80B | 3B | MoE | 256K |
| Zen4 Coder | 80B | 3B | MoE | 256K |
| Zen4 Ultra | 1T | 64B | MoE | 256K |
Dense models for predictable workloads. MoE models for frontier capability at controlled compute cost. Every model is available on HuggingFace at zenlm/zen4-*.
Why Unbiased AI?
AI models are most useful when they reason and respond without built-in opinion filters. The internet is open. Blockchains are permissionless. Quantum computation doesn't care about policy. Agents operating autonomously across networks cannot afford to stall on arbitrary refusal heuristics.
Zen4 models are trained to reason about any topic the same way a scientific instrument measures: accurately, neutrally, without agenda. The result is a family of models well-suited to:
- Agentic systems that must operate reliably across diverse, unpredictable inputs
- Security and research applications that require complete information access
- Blockchain and decentralized infrastructure where no central party decides what is "allowed"
- Medical, legal, and technical domains where AI gatekeeping creates patient and client risk
- Scientific inquiry where the answer has to follow the evidence, not a policy document
The Science
Standard fine-tuning installs refusal behavior by teaching models a set of internal representations — directional activations in the residual stream — that fire when they encounter certain inputs. This behavior is inconsistently applied, politically motivated, and brittle: models refuse clearly legitimate requests while complying with semantically identical ones phrased differently.
We identify and remove these directional biases from the model weights using a process called directional ablation — essentially subtracting the learned "refusal direction" from every weight matrix that contributes to it. The model retains all knowledge and capability. What is removed is the mechanism that introduced asymmetric, non-neutral behavior in the first place.
We measure benchmark performance before and after on every model. The differences are within noise.
The responsibility for appropriate use stays where it belongs: with the person using the model.
Zen4 Ultra: 1 Trillion Parameters
The largest model in the family is Zen4 Ultra: 1 trillion parameters, 64B active, 64 SafeTensor shards, 256K context.
Performance:
| Benchmark | Score |
|---|---|
| AIME 2025 | 99.1% |
| SWE-Bench Verified | 71.3% |
| GPQA Diamond | 83.4% |
| Codeforces Rating | 2155 |
Zen4 Ultra activates 64B parameters per forward pass from a 1T pool, making it tractable on multi-GPU configurations that could not serve a dense 1T model. 8x H100 SXM handles full precision. FP8 quantization brings it to 4x H100.
Zen4 MoE Models
The Max, Pro Max, Coder, and Coder Flash models all use MoE architecture with 3B active parameters. This makes them particularly efficient:
- Zen4 Max (30B/3B): Fits on a single A100 or M2 Max in MLX. Remarkable quality for its inference cost.
- Zen4 Coder Flash (31B/3B): 131K context, optimized for fast code generation. Lower latency than Zen4 Coder at the cost of some depth on complex problems.
- Zen4 Pro Max (80B/3B): The best general-purpose consumer model in the lineup. Runs on 2x A100 or a Mac Studio with 192GB unified memory.
- Zen4 Coder (80B/3B): 256K context, full agentic coding support.
Formats
All models ship in SafeTensors, GGUF (Q4_K_M through F16), and MLX for Apple Silicon. The GGUF Q4_K_M quantizations of the dense models (Mini, Base, Pro) fit on any modern laptop.
Get Zen4
All models are available now:
- HuggingFace: huggingface.co/zenlm
- Hanzo Cloud:
api.hanzo.ai/v1/chat/completions— all Zen4 models available - Hanzo Desktop: One-click install for every consumer and coder model
- Zen LM: zenlm.org — benchmarks, deployment guides, hardware requirements
# Download any model with hf CLI
hf download zenlm/zen4-ultra
hf download zenlm/zen4-pro
hf download zenlm/zen4-mini --include "*.gguf"Built by Zen LM and Hanzo AI, Techstars '17. Open weights, no gates, no waitlists.
Zach Kelling is the founder of Hanzo AI, Techstars '17.
Read more
Introducing Zen4: Open Foundation Models from 4B to 1T+
Zen4 is a complete lineup of open AI models spanning from 4B to over 1 trillion parameters, featuring consumer, coder, and ultra tiers.
Zen Max: 671B Reasoning Model
Zen Max is a 671B MoE reasoning model with 384 experts, 256K context, and unbiased weights — achieving AIME 2025 99.1%, SWE-Bench 71.3%, and BrowseComp 60.2%. Built for agents, researchers, and infrastructure that needs neutral AI.
zen4-ultra: A Trillion-Parameter AI Model, Open and Free
zen4-ultra brings Kimi K2.5 — a 1.04 trillion-parameter Mixture of Experts model — to the Zen AI family. Open weights, 256K context, 71% SWE-bench. Available now on HuggingFace.