IBM Granite 4.0 Models Now Available on Docker Hub

Posted on October 6, 2025

Developers can now discover and run IBM’s latest open-source Granite 4.0 language models from the Docker Hub model catalog, and start building in minutes with Docker Model Runner. Granite 4.0 pairs strong, enterprise-ready performance with a lightweight footprint, so you can prototype locally and scale confidently.

The Granite 4.0 family is designed for speed, flexibility, and cost-effectiveness, making it easier than ever to build and deploy generative AI applications.

About Docker Hub

Docker Hub is the world’s largest registry for containers, trusted by millions of developers to find and share high-quality container images at scale. Building on this legacy, it is now also becoming a go-to place for developers to discover, manage, and run local AI models. Docker Hub hosts our curated local AI model collection, packaged as OCI Artifacts and ready to run. You can easily download, share, and upload models on Docker Hub, making it a central hub for both containerized applications and the next wave of generative AI.

Why Granite 4.0 on Docker Hub matters

Granite 4.0 isn’t just another set of language models. It introduces a next-generation hybrid architecture that delivers incredible performance and efficiency, even when compared to larger models.

  • Hybrid architecture. Granite 4.0 cleverly combines the linear-scaling efficiency of Mamba-2 with the precision of transformers. Select models also leverage a Mixture of Experts (MoE) strategy – instead of using the entire model for every task, they activate only the necessary “experts”, or subsets of parameters. This results in faster processing and memory usage reductions of more than 70% compared to similarly sized traditional models.
  • “Theoretically Unconstrained” Context. By removing positional encoding, Granite 4.0 can process incredibly long documents, with context lengths tested up to 128,000 tokens. Context length is limited only by your hardware, opening up powerful use cases for document analysis and Retrieval-Augmented Generation (RAG).
  • Fit-for-Purpose Sizes. The family includes several sizes, from the 3B parameter Micro models to the 32B parameter Small model, allowing you to pick the right balance of performance and resource usage for your specific needs.

What’s in the Granite 4.0 family

Sizes and targets (8-bit, batch=1, 128K context):

  • H-Small (32B total, ~9B active): Workhorse for RAG and agents; runs on L4-class GPUs.
  • H-Tiny (7B total, ~1B active): Latency-friendly for edge/local use; runs on consumer-grade GPUs like the RTX 3060.
  • H-Micro (3B, dense): Ultra-light for on-device use and concurrent agents; extremely low RAM footprint.
  • Micro (3B, dense): Traditional dense option when Mamba-2 support isn’t available.

In practice, these footprints mean you can run capable models on accessible hardware – a big win for local development and iterative agent design.

Run in seconds with Docker Model Runner

Docker Model Runner gives you a portable, reproducible way to run local models with an OpenAI-compatible API, from laptop development to CI and cloud.

# Example: start a chat with Granite 4.0 Micro
docker model run ai/granite-4.0-micro

Prefer a different size? Pick your Granite 4.0 variant in the Model Catalog and run it with the same command style. See the Model Runner guide for enabling the runner, chat mode, and API usage.
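Once a model is running, you can also reach it programmatically through the OpenAI-compatible API. A minimal sketch using only the Python standard library is below; the base URL is an assumption about the default Model Runner endpoint, so check your local configuration for the actual host and port.

```python
# Sketch: calling a locally running Granite model through Docker Model Runner's
# OpenAI-compatible chat completions API.
# NOTE: BASE_URL is an assumed local default -- verify it against your setup.
import json
import urllib.request

BASE_URL = "http://localhost:12434/engines/v1"  # assumption, not guaranteed
MODEL = "ai/granite-4.0-micro"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str) -> str:
    """POST the request to the local endpoint and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Summarize what Docker Model Runner does in one sentence."))
```

Because the API shape matches OpenAI's, existing OpenAI client libraries can typically be pointed at the local endpoint by overriding their base URL.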

What you can build (fast)

Granite’s lightweight and versatile nature makes it perfect for a wide range of applications. Combined with Docker Model Runner, you can easily build and scale projects like:

  • Document Summarization and Analysis: Process and summarize long legal contracts, technical manuals, or research papers with ease.
  • Smarter RAG Systems: Build powerful chatbots and assistants that pull information from external knowledge bases, CRMs, or document repositories.
  • Complex Agentic Workflows: Leverage the compact models to run multiple AI agents concurrently for sophisticated, multi-step reasoning tasks.
  • Edge AI Applications: Deploy Granite 4.0 Tiny in resource-constrained environments for on-device chatbots or smart assistants that don’t rely on the cloud.
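To make the RAG idea concrete, here is a deliberately simplified retrieval sketch: it scores documents by keyword overlap with the query and assembles a grounded prompt. A production system would use embeddings and a vector store; the helper names below are illustrative, not part of any Granite or Docker API.

```python
# Minimal keyword-overlap retrieval sketch for a RAG pipeline.
# Illustrative only: real systems use embeddings, not word overlap.
def score(query: str, doc: str) -> int:
    """Count query words that also appear in the document (case-insensitive)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the top-k documents ranked by keyword overlap with the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(query, docs, k=2))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The resulting prompt string is what you would send to a Granite model via Model Runner's chat endpoint; the long context window means you can afford to include generous amounts of retrieved text.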

Join the Open-Source AI Community

This partnership is all about empowering developers to build the next generation of AI applications. The Granite 4.0 models are available under a permissive Apache 2.0 license, giving you the freedom to customize and use them commercially.

We invite you to explore the models on Docker Hub and start building today. To help us improve the developer experience for running local models, get involved in our Docker Model Runner GitHub repository:

  • Star the repo to show your support
  • Fork it to experiment
  • Consider contributing back with your own improvements


Granite 4.0 is here. Run it, build with it, and see what’s possible with Docker Model Runner.
