Running large language models (LLMs) on your local machine is one of the most exciting frontiers in AI development. At Docker, our goal is to make this process as simple and accessible as possible. That’s why we built Docker Model Runner, a tool to help you download and run LLMs with a single command.
Until now, Model Runner inferencing was limited to CPUs, NVIDIA GPUs (via CUDA), and Apple Silicon (via Metal). Today, we’re thrilled to announce a major step forward in democratizing local AI: Docker Model Runner now supports Vulkan!
This means you can now leverage hardware acceleration for LLM inferencing on a much wider range of GPUs, including integrated GPUs and those from AMD, Intel, and other vendors that support the Vulkan API.
Why Vulkan Matters: AI for Everyone’s GPU
So, what’s the big deal about Vulkan?
Vulkan is a modern, cross-platform graphics and compute API. Unlike CUDA, which is specific to NVIDIA GPUs, or Metal, which is for Apple hardware, Vulkan is an open standard that works across a huge range of graphics cards. This means if you have a modern GPU from AMD, Intel, or even an integrated GPU on your laptop, you can now get a massive performance boost for your local AI workloads.
By integrating Vulkan (thanks to our underlying llama.cpp engine), we’re unlocking GPU-accelerated inferencing for a much broader community of developers and enthusiasts. More hardware, more speed, more fun!
Getting Started: It Just Works
The best part? You don’t need to do anything special to enable it. We believe in convention over configuration. Docker Model Runner automatically detects compatible Vulkan hardware and uses it for inferencing. If a Vulkan-compatible GPU isn’t found, it seamlessly falls back to CPU.
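Want to verify that your system actually exposes a Vulkan-capable device? One quick check (an optional step, not something Model Runner requires) is the vulkaninfo utility from the standard Khronos vulkan-tools package, which on recent versions supports a summary mode:

vulkaninfo --summary

If the output lists your GPU along with a driver version, Model Runner should be able to pick it up for Vulkan-accelerated inferencing.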
Ready to give it a try? Just run the following command in your terminal:
docker model run ai/gemma3
This command will:
- Pull the Gemma 3 model.
- Detect whether you have a Vulkan-compatible GPU with the necessary drivers installed.
- Run the model, using your GPU to accelerate inferencing.
It’s that simple. You can now chat with a powerful LLM running directly on your own machine, faster than ever.
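Prefer to call the model from code rather than the interactive chat? Model Runner also exposes an OpenAI-compatible REST API. Here’s a minimal sketch, assuming you’ve enabled TCP host access on the default port 12434 (the exact enablement step and port may vary with your Docker Desktop version and settings):

curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ai/gemma3", "messages": [{"role": "user", "content": "Hello from Vulkan!"}]}'

Because acceleration happens at the engine level, the same request benefits from your Vulkan-capable GPU automatically.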
Join Us and Help Shape the Future of Local AI!
Docker Model Runner is an open-source project, and we’re building it in the open with our community. Your contributions are vital as we expand hardware support and add new features.
Head over to our GitHub repository to get involved:
https://github.com/docker/model-runner
Please star the repo to show your support, fork it to experiment, and consider contributing back with your own improvements.

Learn more
- Check out the Docker Model Runner General Availability announcement
- Visit the Model Runner GitHub repository. Docker Model Runner is open source, and we welcome collaboration and contributions from the community.
- Get started with Model Runner using a simple hello GenAI application