zoo/ blog

Zen4 Coder: 480B Parameters, 92% Fewer Active — The MoE Code Model

Hanzo AI launches Zen4 Coder, a 480B-parameter Mixture of Experts code model that activates only 35B parameters per token — delivering frontier code intelligence at a fraction of the compute cost.

Dense models run every parameter on every token, even when most of that capacity has nothing to do with the token at hand. For code, where knowledge is highly structured across syntax, semantics, type systems, and natural language, this waste is especially acute.

Today we're launching Zen4 Coder, a 480-billion parameter Mixture of Experts model that activates only 35B parameters per token. It knows as much as a 480B model. Per token, it computes like a 35B model.
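To make the arithmetic behind that claim concrete, here is a small sketch. The two parameter counts come from this announcement. The FLOPs estimate is a common rule of thumb (about two operations per active parameter per token, ignoring attention), not a measured figure for Zen4 Coder.

```python
# Parameter counts as stated in the announcement.
TOTAL_PARAMS = 480e9
ACTIVE_PARAMS = 35e9


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that participate in one token."""
    return active / total


def approx_forward_flops_per_token(active: float) -> float:
    """Rule-of-thumb estimate: one multiply and one add per active parameter.

    This is an assumption for illustration; it ignores attention and
    routing overhead.
    """
    return 2.0 * active


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"active share:   {frac:.1%}")
print(f"inactive share: {1 - frac:.1%}")
print(f"approx forward FLOPs per token: {approx_forward_flops_per_token(ACTIVE_PARAMS):.1e}")
```

The active share works out to roughly 7.3%, so a little under 93% of the parameters sit idle for any given token. Note that this is a statement about compute: all 480B parameters still have to be stored and loaded to serve the model.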

Three Models, Every Code Workflow

Zen4 Coder — The general-purpose code model.

  • 480B total / 35B active (MoE)
  • 163K context window
  • All major programming languages
  • Agentic tool use and multi-step reasoning

Zen4 Coder Pro — Maximum accuracy.

  • 480B parameters, full BF16 precision (Dense)
  • 131K context window
  • For complex multi-file refactoring and large codebase navigation
  • When you need the best answer, not the fastest

Zen4 Coder Flash — Real-time code assistance.

  • 30B total / 3B active (MoE)
  • 262K context window
  • Inline completions, tab-complete, and code chat
  • Sub-100ms latency for interactive workflows

Why MoE Architecture Matters for Code

The Mixture of Distilled Experts (MoDE) architecture routes each token to specialized expert subnetworks. In Zen4 Coder, this creates dedicated pathways for:

  • Syntax experts — Language-specific grammar and structure
  • Semantic experts — Meaning, intent, and program logic
  • Type inference experts — Type systems, generics, and constraints
  • Documentation experts — Natural language in comments, docstrings, and explanations

When generating Python, the Python syntax experts activate alongside the semantic reasoning experts. The Java type system experts stay dormant. This selective activation is what makes a 480B model run like a 35B model.
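The selective activation described above can be sketched with generic top-k gating, the routing scheme used by many MoE models. This post does not describe Zen4 Coder's actual router, so treat everything here as an assumption: the toy experts, the gate scores, and the choice of k = 2 are hypothetical, and real experts are learned subnetworks rather than hand-written functions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Pick the k highest-scoring experts and renormalise their weights to sum to 1."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = 0.0
    for index, weight in route_top_k(gate_logits, k):
        out += weight * experts[index](x)
    return out


# Hypothetical toy experts standing in for learned subnetworks.
experts = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
]
# Hypothetical gate scores for one token; a real router computes these from the token's hidden state.
gate_logits = [2.0, 0.1, -1.0, 1.5]

chosen = route_top_k(gate_logits, k=2)
print("selected experts:", [index for index, _ in chosen])
print("mixed output:", moe_forward(3.0, experts, gate_logits, k=2))
```

Only the two selected experts are ever called; the other two cost nothing for this token. That is the mechanism by which total parameter count and per-token compute come apart.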

Context Windows Built for Real Codebases

The 163K context window on Zen4 Coder fits:

  • A typical microservice codebase (50-100 files)
  • An entire Go module or Rust crate
  • A full React application with components, hooks, and state management

The 262K window on Coder Flash handles even larger workloads for code search and navigation.

The 131K window on Coder Pro is optimized for deep analysis — trading context length for maximum accuracy on complex reasoning tasks.
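A quick way to sanity-check whether a codebase is likely to fit is to estimate its token count before sending it. The sketch below rests on several assumptions: it uses a rough four-characters-per-token heuristic rather than the model's real tokenizer, it reads "163K" and friends as round thousands, and the output reserve is an arbitrary placeholder.

```python
import math

# Window sizes from this post, read as round thousands (an assumption).
CONTEXT_WINDOWS = {
    "Zen4 Coder": 163_000,
    "Zen4 Coder Pro": 131_000,
    "Zen4 Coder Flash": 262_000,
}

CHARS_PER_TOKEN = 4  # rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(files: dict, window: int, reserve_for_output: int = 4_000) -> bool:
    """True if all files plus a reserve for the model's reply fit in the window."""
    prompt_tokens = sum(estimate_tokens(source) for source in files.values())
    return prompt_tokens + reserve_for_output <= window


# Hypothetical microservice: 80 files of about 7,000 characters each.
codebase = {f"file_{n}.py": "x" * 7_000 for n in range(80)}

for model, window in CONTEXT_WINDOWS.items():
    print(model, "fits" if fits_in_window(codebase, window) else "does not fit")
```

For a real decision, swap the heuristic for an actual tokenizer count, since code with long identifiers or heavy whitespace can tokenize quite differently.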

Available Now

All three Zen4 Coder models are available through the Hanzo AI Gateway at hanzo.ai. Use the standard OpenAI-compatible API:

curl https://llm.hanzo.ai/v1/chat/completions \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen4-coder",
    "messages": [{"role": "user", "content": "Refactor this function to use async/await..."}]
  }'

Same API, same key, same billing as every other model on the platform.
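For anyone calling the gateway from code rather than the shell, the curl request translates to Python's standard library roughly as follows. The URL and the zen4-coder model name come from the example above. The helper names are hypothetical, and the response parsing assumes the gateway returns the usual OpenAI-style chat completion shape, which this post implies but does not show.

```python
import json
import urllib.request

API_URL = "https://llm.hanzo.ai/v1/chat/completions"  # from the curl example


def build_request(api_key: str, prompt: str, model: str = "zen4-coder") -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def extract_text(completion: dict) -> str:
    """Pull the reply text out of an OpenAI-style chat completion (assumed shape)."""
    return completion["choices"][0]["message"]["content"]
```

To use it, read the key from the HANZO_API_KEY environment variable, pass it to build_request along with a prompt, hand the result to send, and call extract_text on the returned dictionary.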