Blog

  • Optimizing Long-Context Prefill on Multiple (Older-Generation) GPU Nodes

    December 26, 2025

    SLOPE Engine improves long-context prefill performance by applying context parallelism across multiple GPU servers. This also helps efficiently utilize older-generation GPUs.

  • Runtime Draft Model Training: Adapting Speculative Decoding to Real-World Workloads

    November 10, 2025

    TIDE provides a method to optimize inference computation on newer GPUs by utilizing older or idle GPUs for runtime draft model training, resulting in better overall cost-performance at the system level.

  • Distributed Inference on Heterogeneous Accelerators Including GPUs, Rubin CPX, and AI Accelerators

    September 23, 2025

    MoAI Inference Framework supports automatic and efficient distributed inference on heterogeneous accelerators such as AMD MI300X + MI308X and NVIDIA Rubin CPX + GPU.

  • DeepSeek V3 and R1 on MoAI: 1. Fine-Tuning on AMD GPU Clusters

    February 20, 2025

MoAI provides a PyTorch-compatible environment that makes fine-tuning LLMs on hundreds of AMD GPUs straightforward, including the 671B-parameter DeepSeek MoE model.

  • Introducing Motif: A High-Performance Open-Source Korean LLM by Moreh

    December 2, 2024

Moreh announces the release of Motif, a high-performance 102B-parameter Korean large language model (LLM), which will be made available as an open-source model.

  • Fine-tuning Llama 3.1 405B on AMD GPUs

    September 3, 2024

There are no barriers to fine-tuning Llama 3.1 405B on the MoAI platform. The Moreh team has demonstrated fine-tuning the model on 192 AMD GPUs.

  • GPU Virtualization in the MoAI Platform

    August 19, 2024

    The MoAI platform provides comprehensive GPU virtualization including fine-grained resource allocation, multi-GPU scaling, and heterogeneous GPU support.

  • Training 221B Parameter Korean LLM on 1,200 AMD MI250 GPU Cluster

    August 14, 2023

Moreh trained the largest-ever Korean LLM, with 221B parameters, on top of the MoAI platform and a 1,200-GPU AMD MI250 cluster system.

  • KT’s Success Stories in AI Cloud Service and Large AI Model Training on AMD Instinct MI250 and Moreh AI Platform

    November 11, 2022

    KT has collaborated with Moreh and AMD to overcome the challenges in public cloud services and in-house AI model development.

Moreh, Inc.

© 2026 Moreh, Inc. All rights reserved.
