Docker Model Runner General Availability

Posted on: September 18, 2025

We’re excited to share that Docker Model Runner is now generally available (GA)! In April 2025, Docker introduced the first Beta release of Docker Model Runner, making it easy to manage, run, and distribute local AI models (specifically LLMs). Though only a short time has passed since then, the product has evolved rapidly, with continuous enhancements driving it to a reliable level of maturity and stability.

This blog post takes a look back at the most important and widely appreciated capabilities Docker Model Runner brings to developers, and looks ahead to share what they can expect in the near future.

What is Docker Model Runner?

Docker Model Runner (DMR) is built for developers first, making it easy to pull, run, and distribute large language models (LLMs) directly from Docker Hub (in an OCI-compliant format) or from HuggingFace (for models available in the GGUF format, which the HuggingFace backend packages as OCI Artifacts on the fly).
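
For example, pulling and running a model from Docker Hub takes just two commands (shown here with the publicly available ai/smollm2 model; any model from the catalog works the same way):

```bash
# Pull a model packaged as an OCI artifact from Docker Hub
docker model pull ai/smollm2

# Run a one-shot prompt; omit the prompt to start an interactive chat session
docker model run ai/smollm2 "Explain what an OCI artifact is in one sentence."
```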

Tightly integrated with Docker Desktop and Docker Engine, DMR lets you serve models through OpenAI-compatible APIs, package GGUF files as OCI artifacts, and interact with them using either the command line, a graphical interface, or developer-friendly (REST) APIs.
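
Because the APIs are OpenAI-compatible, any OpenAI client or a plain HTTP call works against the local endpoint. A minimal sketch, assuming host-side TCP access is enabled on the default port 12434 and the ai/smollm2 model pulled above:

```bash
# Chat completion against the local OpenAI-compatible endpoint
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/smollm2",
        "messages": [{"role": "user", "content": "Say hello in five words."}]
      }'
```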

Whether you’re creating generative AI applications, experimenting with machine learning workflows, or embedding AI into your software development lifecycle, Docker Model Runner delivers a consistent, secure, and efficient way to work with AI models locally.

Check the official documentation to learn more about Docker Model Runner and its capabilities.

Why Docker Model Runner?

Docker Model Runner makes it easier for developers to experiment and build AI applications, including agentic apps, using the same Docker commands and workflows they already use every day. No need to learn a new tool!

Unlike many new AI tools that introduce complexity or require additional approvals, Docker Model Runner fits cleanly into existing enterprise infrastructure. It runs within your current security and compliance boundaries, so teams don’t have to jump through hoops to adopt it.

Model Runner supports OCI-packaged models, allowing you to store and distribute models through any OCI-compatible registry, including Docker Hub. And for teams using Docker Hub, enterprise features like Registry Access Management (RAM) provide policy-based access controls to help enforce guardrails at scale.
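
Distributing a model through your own registry mirrors the familiar image workflow. A sketch (registry.example.com and myorg/my-model are hypothetical placeholders):

```bash
# Tag a local model for a private OCI-compatible registry, then push it
docker model tag ai/smollm2 registry.example.com/myorg/my-model:v1
docker model push registry.example.com/myorg/my-model:v1
```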

11 Docker Model Runner Features Developers Love Most

Below are the features that stand out the most and have been highly valued by the community.

1. Powered by llama.cpp 

Currently, DMR is built on top of llama.cpp, which we plan to continue supporting. At the same time, DMR is designed with flexibility in mind, and support for additional inference engines (such as MLX or vLLM) is under consideration for future releases.

2. GPU acceleration across macOS and Windows platforms 

Harness the full power of your hardware with GPU support: Apple Silicon on macOS, NVIDIA GPUs on Windows, and even ARM/Qualcomm acceleration — all seamlessly managed through Docker Desktop.

3. Native Linux support 

Run DMR on Linux with Docker CE, making it ideal for automation, CI/CD pipelines, and production workflows.
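
On apt-based distributions, for example, installation is a single package away (a sketch assuming the docker-model-plugin package documented for Docker CE):

```bash
sudo apt-get update
sudo apt-get install docker-model-plugin

# Verify the plugin is available
docker model version
```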

4. CLI and UI experience 

Use DMR from the Docker CLI (on both Docker Desktop and Docker CE) or through Docker Desktop’s UI. The UI provides guided onboarding to help even first-time AI developers start serving models smoothly, with automatic handling of available resources (RAM, GPU, etc.).
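
A few everyday commands illustrate the CLI experience:

```bash
docker model status   # check whether the Model Runner backend is running
docker model ls       # list models pulled locally
docker model ps       # show models currently loaded in memory
```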


Figure 1: Docker Model Runner works both in Docker Desktop and the CLI, letting you run models locally with the same familiar Docker commands and workflows you already know

5. Flexible model distribution 

Pull and push models from Docker Hub in OCI format, or pull directly from HuggingFace repositories hosting models in GGUF format for maximum flexibility in sourcing and sharing models.
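
For example, GGUF models hosted on HuggingFace can be pulled with the hf.co/ prefix (the repository below is just an illustration; any GGUF repository works):

```bash
docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```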

6. Open Source and free 

DMR is fully open source and free for everyone, lowering the barrier to entry for developers experimenting with or building on AI.

7. Secure and controlled 

DMR runs in an isolated, sandboxed environment that doesn’t interfere with the main system or user data. Developers and IT admins can fine-tune security and availability by enabling/disabling DMR or configuring options like host-side TCP support and CORS.

8. Configurable inference settings 

Developers can customize context length and llama.cpp runtime flags to fit their use cases, with more configuration options coming soon.
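
As a sketch of what this looks like in a Compose file today (the context_size and runtime_flags attributes follow the documented Compose models syntax; the values are illustrative):

```bash
cat > compose.yaml <<'EOF'
models:
  smollm2:
    model: ai/smollm2
    context_size: 8192        # larger context window
    runtime_flags:            # passed through to the llama.cpp engine
      - "--temp"
      - "0.2"
EOF
```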

9. Debugging support 

Built-in request/response tracing and inspect capabilities make it easier to understand token usage and framework/library behaviors, helping developers debug and optimize their applications.
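
The same information is a command away in the CLI (exact output varies by version):

```bash
docker model logs                 # stream Model Runner engine logs
docker model inspect ai/smollm2   # show model metadata such as format and quantization
```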


Figure 2: Built-in tracing and inspect tools in Docker Desktop make debugging easier, giving developers clear visibility into token usage and framework behavior

10. Integrated with the Docker ecosystem 

DMR works out of the box with Docker Compose and is fully integrated with other Docker products, such as Docker Offload (cloud offload service) and Testcontainers, extending its reach into both local and distributed workflows.
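
For example, a Compose file can declare a model alongside a service, and Compose injects the connection details into the service’s environment. A sketch (my-app:latest is a hypothetical image; the endpoint_var/model_var long syntax follows the Compose models documentation):

```bash
cat > compose.yaml <<'EOF'
services:
  app:
    image: my-app:latest                  # hypothetical application image
    models:
      smollm2:
        endpoint_var: MODEL_RUNNER_URL    # env var receiving the OpenAI-compatible endpoint
        model_var: MODEL_RUNNER_MODEL     # env var receiving the model identifier
models:
  smollm2:
    model: ai/smollm2
EOF

docker compose up
```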

11. Up-to-date model catalog 

Access a curated catalog of the most popular and powerful AI models on Docker Hub. These models can be pulled for free and used across development, pipelines, staging, or even production environments.


Figure 3: Curated model catalog on Docker Hub, packaged as OCI Artifacts and ready to run

The road ahead

The future is bright for Docker Model Runner, and the recent GA release is only the first milestone. Below are some of the enhancements you can expect in upcoming releases.

Streamlined User Experience 

Our goal is to make DMR simple and intuitive for developers to use and debug. This includes richer response rendering in the chat-like interface within Docker Desktop and the CLI, multimodal support in the UI (already available through the API), integration with MCP tools, and enhanced debugging features, alongside expanded configuration options for greater flexibility. Last but not least, we aim to provide smoother and more seamless integration with third-party tools and solutions across the AI ecosystem.

Performance and Execution Enhancements 

We remain focused on continuously improving DMR’s performance and flexibility for running local models. Upcoming enhancements include support for the most widely used inference libraries and engines, advanced configuration options at the engine and model level, and the ability to deploy Model Runner independently from Docker Engine for production-grade use cases, along with many more improvements on the horizon.

Frictionless Onboarding 

We want first-time AI developers to start building their applications right away, and to do so with the right foundations. To achieve this, we plan to make onboarding into DMR even more seamless. This will include a guided, step-by-step experience to help developers get started quickly, paired with a set of sample applications built on DMR. These samples will highlight real-world use cases and best practices, providing a smooth entry point for experimenting with and adopting DMR in everyday workflows.

Staying on Top of Model Launches 

As we continue to enhance inference capabilities, we remain committed to maintaining a first-class catalog of AI models directly in Docker Hub, the leading registry for OCI artifacts, including models. Our goal is to ensure that new, relevant models are available in Docker Hub and runnable through DMR as soon as they are publicly released.

Conclusion

Docker Model Runner has come a long way in a short time, evolving from its Beta release into a mature and stable inference engine that’s now generally available. At its core, the mission has always been clear: make it simple, consistent, and secure for developers to pull, run, and serve AI models locally, using familiar Docker CLI commands and tools they already love!

Now is the perfect time to get started. If you haven’t already, install Docker Desktop and try out Docker Model Runner today. Follow the official documentation to explore its capabilities and see for yourself how DMR can accelerate your journey into building AI-powered applications.
