Building Block

Moreh vLLM for Tenstorrent

A production-grade LLM serving engine for Tenstorrent Galaxy systems. Optimized MoE LLM execution, vLLM API compatibility, and the serving fundamentals required for data-center deployments.

Capabilities

Built for Production LLM Serving

Kernel/Library-Level MoE Optimization

Custom operation kernels and communication libraries engineered for efficient MoE execution on Galaxy — supporting the latest LLMs including GPT-OSS, Qwen, GLM, and DeepSeek.

vLLM-Compatible API

Drop-in compatibility with the latest vLLM — OpenAI-compatible serving endpoints, Prometheus metrics format, and KV event stream all match vLLM. Reuse existing clients, dashboards, and routers unchanged.

Production Serving Fundamentals

Paged attention, variable-length batching, chunked prefill, and automatic prefix caching — the in-engine techniques required to run modern LLMs at high throughput.

Prefill-Decode Disaggregation

Run prefill and decode on separate workers to scale each phase independently — improving utilization and latency for high-throughput serving.

Performance

GPU-Class Performance on Cost-Effective Hardware

Run the same modern LLMs your applications need on Tenstorrent Galaxy with Moreh vLLM — at the throughput production serving demands, on silicon that is fundamentally more cost-efficient than flagship GPU systems. Reference numbers below compare Wormhole Galaxy against 8x A100; Blackhole Galaxy is comparable to more recent GPU generations.

ModelHigh-throughput decode (tok/s)Interactive decode, b=32 (tok/s)Long-context prefill (tok/s)
Wormhole Galaxy8x A100Wormhole Galaxy8x A100Wormhole Galaxy8x A100
GPT-OSS 120B16,258.1211,806.451,141.611,795.2537,055.3438,656.68
Qwen3 235B6,992.676,470.91577.82647.1513,220.9416,037.79
Models

Supported Models

Continuously expanding support across the latest open-source LLMs.

GPT-OSSGPT-OSSQwenQwenGLMGLMDeepSeekDeepSeekLlamaLlama
Hardware

Supported Hardware

Deployed as part of Moreh's turnkey Tenstorrent appliances — hardware, networking, and software delivered together.

Wormhole GalaxyBlackhole Galaxy