Moreh vLLM for Tenstorrent
A production-grade LLM serving engine for Tenstorrent Galaxy systems. Optimized MoE LLM execution, vLLM API compatibility, and the serving fundamentals required for data-center deployments.
Built for Production LLM Serving
Kernel/Library-Level MoE Optimization
Custom operation kernels and communication libraries engineered for efficient MoE execution on Galaxy — supporting the latest LLMs including GPT-OSS, Qwen, GLM, and DeepSeek.
vLLM-Compatible API
Drop-in compatibility with the latest vLLM — OpenAI-compatible serving endpoints, Prometheus metrics format, and KV event stream all match vLLM. Reuse existing clients, dashboards, and routers unchanged.
Production Serving Fundamentals
Paged attention, variable-length batching, chunked prefill, and automatic prefix caching — the in-engine techniques required to run modern LLMs at high throughput.
Prefill-Decode Disaggregation
Run prefill and decode on separate workers to scale each phase independently — improving utilization and latency for high-throughput serving.
GPU-Class Performance on Cost-Effective Hardware
Run the same modern LLMs your applications need on Tenstorrent Galaxy with Moreh vLLM — at the throughput production serving demands, on silicon that is fundamentally more cost-efficient than flagship GPU systems. Reference numbers below compare Wormhole Galaxy against 8x A100; Blackhole Galaxy is comparable to more recent GPU generations.
| Model | High-throughput decode (tok/s) | Interactive decode, b=32 (tok/s) | Long-context prefill (tok/s) | |||
|---|---|---|---|---|---|---|
| Wormhole Galaxy | 8x A100 | Wormhole Galaxy | 8x A100 | Wormhole Galaxy | 8x A100 | |
| GPT-OSS 120B | 16,258.12 | 11,806.45 | 1,141.61 | 1,795.25 | 37,055.34 | 38,656.68 |
| Qwen3 235B | 6,992.67 | 6,470.91 | 577.82 | 647.15 | 13,220.94 | 16,037.79 |
Supported Models
Continuously expanding support across the latest open-source LLMs.
Supported Hardware
Deployed as part of Moreh's turnkey Tenstorrent appliances — hardware, networking, and software delivered together.