How llm-d splits up AI model serving on Ubuntu

Juniya Sankara

May 15, 2026

Understanding disaggregated GenAI model serving with llm-d | Ubuntu

Ubuntu Is Getting a Better Way to Run Faster AI Systems
New Tool Lets Large AI Systems Work Faster Without Needing Super-Expensive Hardware
Open Source System Splits AI Work Across Many Computers

What this is about

The llm-d project is a new system for serving AI built on large language models, the technology behind chatbots such as ChatGPT and Gemini. It aims to make these applications respond faster, reducing the lag between sending a prompt and getting an answer. It works by splitting model serving into separate components, for example one stage that processes the prompt and another that generates the response, and running each component on dedicated computers arranged in a cluster. Kubernetes orchestrates the components, provisioning and scaling them automatically. The system is open source, meaning the code is public and can be freely used and built upon.
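For readers who think in code, here is a minimal toy sketch of the idea of splitting serving across dedicated machines. It is not llm-d's actual API. It assumes, for illustration, that the work is divided into a prompt-processing stage ("prefill") and a response-generating stage ("decode"), and the names Worker, Router, and node-a to node-c are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Worker:
    """A hypothetical machine in the cluster dedicated to one stage."""
    name: str
    stage: str  # "prefill" or "decode" (assumed stage names)

    def run(self, payload: str) -> str:
        # A real worker would run the model; this toy version only records
        # which machine handled which stage.
        return f"{payload} -> {self.stage}@{self.name}"


class Router:
    """Sends each request through a prefill worker, then a decode worker."""

    def __init__(self, workers):
        # Round-robin over the machines assigned to each stage.
        self._prefill = cycle([w for w in workers if w.stage == "prefill"])
        self._decode = cycle([w for w in workers if w.stage == "decode"])

    def serve(self, prompt: str) -> str:
        after_prefill = next(self._prefill).run(prompt)
        return next(self._decode).run(after_prefill)


workers = [
    Worker("node-a", "prefill"),
    Worker("node-b", "decode"),
    Worker("node-c", "decode"),
]
router = Router(workers)
first = router.serve("hello")
second = router.serve("world")
print(first)
print(second)
```

Because each stage has its own pool of machines, one pool can be enlarged without touching the other. In a real deployment that scaling would be handled by an orchestrator such as Kubernetes rather than by a hand-written router like this one.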

Why it matters

This matters to companies running large AI models and to organisations that work with AI regularly. A system that helps run these models faster could cut response times and reduce the need for very expensive hardware. Because llm-d runs on Ubuntu, anyone using the popular Linux OS should be able to install the software with relative ease using the Juju platform.

Check out the open source code in the GitHub repository and help build on it. Share your experiences, setbacks, or development work in the comments.

Read the original source.
