We recently showed how to pair OpenCode with Docker Model Runner for a privacy-first, cost-effective AI coding setup. Today, we’re bringing the same approach to Claude Code, Anthropic’s agentic coding tool.
This post walks through how to configure Claude Code to use Docker Model Runner, giving you full control over your data, infrastructure, and spend.
Figure 1: Using local models like gpt-oss to power Claude Code
What Is Claude Code?
Claude Code is Anthropic’s command-line tool for agentic coding. It lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows through natural language commands.
Docker Model Runner (DMR) allows you to run and manage large language models locally. It exposes an Anthropic-compatible API, making it straightforward to integrate with tools like Claude Code.
Install Claude Code
Use the official install script for your platform:
macOS / Linux:
curl -fsSL https://claude.ai/install.sh | bash
Windows PowerShell:
irm https://claude.ai/install.ps1 | iex
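Either script installs the claude CLI. Verify the installation by printing the version:

claude --version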
Using Claude Code with Docker Model Runner
Claude Code supports custom API endpoints through the ANTHROPIC_BASE_URL environment variable. Since Docker Model Runner exposes an Anthropic-compatible API, integrating the two is simple.
Note for Docker Desktop users:
If you are running Docker Model Runner via Docker Desktop, make sure TCP access is enabled:
docker desktop enable model-runner --tcp
Once enabled, Docker Model Runner will be accessible at http://localhost:12434.
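Before wiring up Claude Code, you can sanity-check the endpoint directly. Claude Code appends the standard Anthropic Messages path (/v1/messages) to ANTHROPIC_BASE_URL, so a minimal curl probe looks like the sketch below. This assumes Docker Model Runner serves the Messages API at that path and requires no API key locally, and that a gpt-oss model is already pulled:

# minimal sketch: assumes the Anthropic Messages API at /v1/messages, no local API key
curl http://localhost:12434/v1/messages \
  -H "content-type: application/json" \
  -d '{"model": "gpt-oss", "max_tokens": 64, "messages": [{"role": "user", "content": "Hello"}]}'

A JSON response here confirms the endpoint is reachable before you point Claude Code at it.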
Increasing Context Size
For coding tasks, context length matters. Models like glm-4.7-flash, qwen3-coder, and devstral-small-2 ship with a 128K-token context window by default, but gpt-oss defaults to just 4,096 tokens.
Docker Model Runner makes it easy to repackage any model with an increased context size:
docker model pull gpt-oss
docker model package --from gpt-oss --context-size 32000 gpt-oss:32k
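To confirm the repackaged model picked up the larger context window, you can inspect its metadata (the exact fields shown vary by Docker Model Runner version):

docker model inspect gpt-oss:32k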
Once packaged, point Claude Code at your local endpoint and select the new model. Start an interactive session:

ANTHROPIC_BASE_URL=http://localhost:12434 claude --model gpt-oss:32k

Or run a one-shot prompt directly:

ANTHROPIC_BASE_URL=http://localhost:12434 claude --model gpt-oss:32k "Describe this repo."
That’s it. Claude Code will now send all requests to your local Docker Model Runner instance.
Run Claude Code locally with gpt-oss using Docker Model Runner
Here’s what it looks like in action:
Figure 2: Claude Code powered by Docker Model Runner keeps everything local and in your control
$ docker model pull gpt-oss
$ docker model package --from gpt-oss --context-size 32000 gpt-oss:32k
$ ANTHROPIC_BASE_URL=http://localhost:12434 claude --model gpt-oss:32k "Describe this repo."
▐▛███▜▌ Claude Code v2.1.12
▝▜█████▛▘ gpt-oss:32k · API Usage Billing
▘▘ ▝▝ ~/workspace/model-runner
❯ Describe this repo.
⏺ Read(README.md)
⎿ Read 423 lines
⏺ Docker Model Runner (DMR)
- A Go-based tool that lets developers pull, run, and serve AI models
directly from Docker Hub or any OCI-compatible registry.
- Consists of two main components:
a. model-runner – a server/daemon that manages model lifecycle,
inference, and metrics.
b. model-cli – a CLI plugin (docker model) that communicates with
the server, listing models, pulling new ones, and executing inference.
- Supports multiple inference backends: llama.cpp, vLLM, and others.
- Built with Docker Desktop in mind, but also works with Docker Engine
on Linux.
- Provides a REST API for model management and inference, plus a
/metrics endpoint for Prometheus-style monitoring.
Claude Code reads your repository, reasons about its structure, and provides an accurate summary, all while keeping your code entirely on your local machine.
Monitor the requests sent by Claude Code
Want to see exactly what Claude Code sends to Docker Model Runner? Use the docker model requests command:
docker model requests --model gpt-oss:32k | jq .
Figure 3: Monitor requests sent by Claude Code to the LLM
This outputs the raw requests, which is useful for understanding how Claude Code communicates with the model and debugging any compatibility issues.
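Because the output is JSON, you can slice it further with jq. If you're unsure of the payload's shape, a safe first step is listing the top-level keys of each logged object before drilling into specific fields:

docker model requests --model gpt-oss:32k | jq 'keys'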
Making It Persistent
For convenience, set the environment variable in your shell profile:
# Add to ~/.bashrc, ~/.zshrc, or equivalent
export ANTHROPIC_BASE_URL=http://localhost:12434
Then simply run:
claude --model gpt-oss:32k "Describe this repo."
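Keep in mind that the exported variable redirects every claude invocation in that shell, including ones you may want to send to Anthropic's hosted API. If you'd rather keep both side by side, skip the export and define a small wrapper function instead (the name claude-local here is just a suggestion):

# Add to ~/.bashrc, ~/.zshrc, or equivalent
claude-local() {
  ANTHROPIC_BASE_URL=http://localhost:12434 claude --model gpt-oss:32k "$@"
}

Then claude-local "Describe this repo." goes to Docker Model Runner, while plain claude behaves as before.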
How You Can Get Involved
The strength of Docker Model Runner lies in its community, and there’s always room to grow. To get involved:
- Star the repository: Show your support by starring the Docker Model Runner repo.
- Contribute your ideas: Create an issue or submit a pull request. We’re excited to see what ideas you have!
- Spread the word: Tell your friends and colleagues who might be interested in running AI models with Docker.
We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!
Learn More
- Read the companion post: OpenCode with Docker Model Runner for Private AI Coding
- Check out the Docker Model Runner General Availability announcement
- Visit our Model Runner GitHub repo
- Get started with a simple hello GenAI application
- Learn more about Claude Code from Anthropic’s documentation