
Introducing Zen4: Open Foundation Models from 4B to 1T+

Zen4 is a complete lineup of open AI models spanning from 4B to over 1 trillion parameters, featuring consumer, coder, and ultra tiers.

Zen4 is here: a complete lineup of open-weight AI models spanning edge devices to cloud-scale infrastructure, built by Zen LM and Hanzo AI (Techstars '17).

From a 4B model that runs on your phone to a trillion-parameter powerhouse for the most demanding workloads, Zen4 delivers frontier-class performance at every scale. Every model ships with open weights. No gates, no waitlists.

The Zen4 Consumer Line

The consumer line covers every deployment target from mobile to high-end workstations. All consumer models are available in both Instruct and Thinking variants.

| Model | Parameters | Architecture | Base | Context | Target |
|---|---|---|---|---|---|
| Zen4 Mini | 4B | Dense | Qwen3-4B | Standard | Edge and mobile devices |
| Zen4 | 8B | Dense | Qwen3-8B | Standard | Standard desktop inference |
| Zen4 Pro | 14B | Dense | Qwen3-14B | Standard | Professional workloads |
| Zen4 Max | 30B (3B active) | MoE | Qwen3-30B-A3B | 256K | Flagship efficiency |
| Zen4 Max Pro | 80B (3B active) | MoE | Qwen3-Next-80B-A3B | 256K | The best consumer AI |

Zen4 Mini fits comfortably on edge hardware and mobile SoCs. Zen4 and Zen4 Pro handle standard and professional workloads on commodity GPUs. The MoE models -- Zen4 Max and Zen4 Max Pro -- deliver outsized capability with only 3B active parameters per forward pass, making them remarkably efficient for their total parameter count. Both support 256K context windows.
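A rough back-of-the-envelope sketch of why the MoE design pays off: total parameters set the memory footprint, but only the active parameters drive per-token compute. The parameter counts come from the table above; the 2-bytes-per-weight (16-bit) assumption is ours, not from this post.

```python
# Rough memory-vs-compute estimate for the Zen4 MoE consumer models.
# Totals/actives are from the table above; 16-bit weights are assumed.

BYTES_PER_WEIGHT = 2  # fp16/bf16, our assumption

models = {
    "Zen4 Max":     {"total": 30e9, "active": 3e9},
    "Zen4 Max Pro": {"total": 80e9, "active": 3e9},
}

for name, p in models.items():
    weight_gb = p["total"] * BYTES_PER_WEIGHT / 1e9
    # A decoder forward pass costs roughly 2 FLOPs per active parameter
    # per token, so both models compute like a small 3B dense model.
    gflops_per_token = 2 * p["active"] / 1e9
    print(f"{name}: ~{weight_gb:.0f} GB of 16-bit weights, "
          f"~{gflops_per_token:.0f} GFLOPs/token")
```

Both models do the per-token work of a 3B dense model, which is why the post describes them as efficient relative to their total size.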

The Zen4 Coder Line

Purpose-built for software engineering. The coder line is optimized for code generation, completion, refactoring, and agentic coding workflows.

| Model | Parameters | Architecture | Base | Context | Target |
|---|---|---|---|---|---|
| Zen4 Coder Flash | 31B (3B active) | MoE | GLM-4.7-Flash | 131K | Fast code generation |
| Zen4 Coder | 80B (3B active) | MoE | Qwen3-Coder-Next | 256K | Flagship agentic coding |

Zen4 Coder Flash is built for speed -- rapid completions, inline suggestions, and fast iteration loops. Zen4 Coder is the flagship: 256K context, full agentic coding support, and deep understanding of complex codebases. Both ship in Instruct and Thinking variants.

The Zen4 Ultra Line (Cloud)

For workloads that demand maximum capability, the Ultra line brings trillion-scale models to cloud deployments.

| Model | Parameters | Architecture | Base | Status |
|---|---|---|---|---|
| Zen4 Ultra | 1.04T (32B active) | MoE | Kimi K2.5 Thinking | Available now |
| Zen4 Ultra Max | TBA | TBA | DeepSeek V4 | Coming soon |

Zen4 Ultra activates 32B parameters from a 1.04 trillion parameter mixture-of-experts model. It represents the current ceiling of open-weight model performance. Zen4 Ultra Max, based on DeepSeek V4, is in development.

Instruct and Thinking Variants

Every Zen4 model ships in two variants:

  • Instruct -- Optimized for direct instruction following, chat, and task completion. Low latency, deterministic output.
  • Thinking -- Extended reasoning with chain-of-thought. Better performance on complex multi-step problems, math, and code analysis.

Choose Instruct for production APIs and interactive applications. Choose Thinking when accuracy on hard problems matters more than speed.

Available Formats

All Zen4 models are distributed in multiple formats to fit your deployment stack:

| Format | Description | Use Case |
|---|---|---|
| SafeTensors | Native PyTorch-compatible weights | GPU inference, fine-tuning |
| GGUF Q4_K_M | 4-bit quantized | CPU and edge deployment |
| GGUF Q5_K_M | 5-bit quantized | Balanced quality and size |
| GGUF Q6_K | 6-bit quantized | Higher quality, moderate size |
| GGUF Q8_0 | 8-bit quantized | Near-lossless, larger footprint |
| GGUF F16 | 16-bit float | Unquantized GGUF baseline |
| MLX | Apple MLX format | Native Apple Silicon acceleration |
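To make the quantization trade-off concrete, here is a rough on-disk size estimate for the 8B dense Zen4 model in each GGUF variant. The bits-per-weight figures are approximate community estimates for these quant types (K-quants carry some per-block metadata overhead), not official numbers from this post.

```python
# Approximate GGUF file sizes for an 8B-parameter model (Zen4, dense).
# Bits-per-weight values are rough community estimates, not official.

PARAMS = 8e9  # Zen4 (dense 8B)

bits_per_weight = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K":   6.6,
    "Q8_0":   8.5,
    "F16":   16.0,
}

for fmt, bpw in bits_per_weight.items():
    size_gb = PARAMS * bpw / 8 / 1e9  # bits -> bytes -> GB
    print(f"{fmt:7s} ~{size_gb:.1f} GB")
```

The spread runs from roughly 5 GB at Q4_K_M to 16 GB at F16, which is the practical reason the 4-bit and 5-bit quants are the usual picks for CPU and edge deployment.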

Runs Locally on Apple Silicon

Every consumer and coder model in the Zen4 lineup fits on a 64GB M-series Mac. The MoE architectures are particularly well-suited to unified memory -- with only 3B active parameters per forward pass, inference is fast and responsive even on laptop hardware.

The MLX format provides native Apple Silicon acceleration with no external dependencies. Load a model and start generating.
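A quick sanity check of the 64GB claim under stated assumptions: at roughly 4.8 bits per weight (Q4_K_M-class quantization, our assumption), even the 80B-total models leave room for the OS, KV cache, and activations in unified memory. Parameter counts come from the tables above.

```python
# Does each consumer/coder model fit in 64 GB of unified memory at
# ~4.8 bits/weight (assumed Q4_K_M-class quantization)? We reserve
# ~12 GB of headroom for the OS, KV cache, and activations.

UNIFIED_MEMORY_GB = 64
BPW = 4.8           # assumed bits per weight
HEADROOM_GB = 12    # assumed OS + KV cache + activation budget

lineup = {  # total parameters from the tables above
    "Zen4 Mini": 4e9, "Zen4": 8e9, "Zen4 Pro": 14e9,
    "Zen4 Max": 30e9, "Zen4 Max Pro": 80e9,
    "Zen4 Coder Flash": 31e9, "Zen4 Coder": 80e9,
}

for name, params in lineup.items():
    weight_gb = params * BPW / 8 / 1e9
    fits = weight_gb + HEADROOM_GB <= UNIFIED_MEMORY_GB
    print(f"{name:17s} ~{weight_gb:5.1f} GB  {'fits' if fits else 'tight'}")
```

The largest entries (80B total) come to roughly 48 GB of quantized weights, which fits with headroom to spare; the 3B-active forward pass is what keeps generation responsive on laptop hardware.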

Get Zen4

Zen4 models are available now:

  • HuggingFace: huggingface.co/zenlm -- all models, all formats, all variants
  • Zen LM: zenlm.org -- documentation, benchmarks, and guides
  • Hanzo Desktop: Zen4 models are integrated directly into the Hanzo Desktop app for one-click local inference
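For fetching weights programmatically, a minimal sketch using `huggingface_hub` is below. The repo naming scheme (`zenlm/zen4-max-instruct` and friends) is our assumption for illustration; check huggingface.co/zenlm for the actual repo ids.

```python
# Sketch of pulling a Zen4 model from the zenlm HuggingFace org.
# The repo id scheme below is hypothetical -- verify the real names
# at huggingface.co/zenlm before use.

def zen4_repo_id(model: str, variant: str = "instruct") -> str:
    """Build a hypothetical HuggingFace repo id for a Zen4 model."""
    slug = model.lower().replace(" ", "-")
    return f"zenlm/{slug}-{variant}"

print(zen4_repo_id("Zen4 Max"))               # zenlm/zen4-max-instruct
print(zen4_repo_id("Zen4 Coder", "thinking")) # zenlm/zen4-coder-thinking

# Given a repo id, huggingface_hub's snapshot_download fetches the weights:
#   from huggingface_hub import snapshot_download
#   local_dir = snapshot_download(zen4_repo_id("Zen4 Max"))
```

The `snapshot_download` call pulls every file in the repo, so for GGUF-only use you would typically pass an `allow_patterns` filter to grab just the quantization you need.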

Built by Zen LM and Hanzo AI

Zen4 is the product of Zen LM and Hanzo AI. We build open foundation models because open weights accelerate the entire field. The best models should be available to everyone -- researchers, engineers, startups, and enterprises alike.

Hanzo AI is Techstars '17. We have been building AI infrastructure since before it was fashionable.

Download Zen4 today. Build something remarkable.