Models & Labs

Anthropic's Claude Models Now on NVIDIA GB300 in Azure

NVIDIA Blog · June 29, 2026 · high confidence

Why it matters

→ Enterprises can now build more capable AI agents with better performance.
→ Integrated NVIDIA tools give AI agents domain-specific capabilities.
→ The release strengthens the strategic partnership among Microsoft, NVIDIA, and Anthropic.

Anthropic's Claude models are now generally available on Microsoft Azure, running on NVIDIA's GB300 Blackwell Ultra GPUs. The setup lets enterprises build more capable AI agents with better performance and efficiency. NVIDIA and Anthropic are also integrating NVIDIA tools so that developers can give Claude agents domain-specific abilities. The release is part of a strategic partnership among Microsoft, NVIDIA, and Anthropic aimed at expanding enterprise access to advanced AI.

More from NVIDIA Blog

Market & Regulation · business

Palantir Integrates NVIDIA Nemotron for Secure AI in US Agencies

Palantir is using NVIDIA's Nemotron open models to expand AI capabilities for U.S. government agencies, with a focus on secure, customizable deployments. The collaboration lets agencies run and train AI models on their own infrastructure while retaining control over their data and model weights. The solution integrates with Palantir's Sovereign AI Operating System to support data security and operational efficiency in sensitive environments. It also reflects the growing role of open models in providing transparency, customization, and cost efficiency for AI deployments in both government and enterprise settings.

NVIDIA Blog · Jun 29, 2026

More in Models & Labs

Models & Labs · models

vLLM v0.24.0 Release Enhances Model Support

The vLLM v0.24.0 release is a significant update with contributions from 256 developers. It adds support for new models such as MiniMax-M3 and DiffusionGemma, and improves performance with optimizations including the FlashInfer sparse index cache and higher throughput for DeepSeek-V4. The update also expands Model Runner V2, which now supports quantized models by default and integrates GraniteMoE. Together, these changes give developers broader model coverage and better tools for deployment and performance tuning.

vLLM Releases · Jun 30, 2026

Models & Labs · models

llama.cpp b9833 Release Enhances MiniCPM5 Parser

The b9833 release of llama.cpp focuses on refining the MiniCPM5 parser. It adds a new tool call parser, refactors the PEG parser, and adjusts the Jinja min/max API for better compatibility with Jinja2. The release also reverts some shared mapper changes to keep JSON parsing strict for tool-call arguments. The changes are aimed at more reliable and efficient handling of XML tool calls and grammar triggers.

llama.cpp Releases · Jun 30, 2026

Models & Labs · models

llama.cpp b9835 Release Expands Platform Support

The b9835 release of llama.cpp continues to broaden platform compatibility without adding major new features. It includes support for ROCm 7.2 on Ubuntu x64, which matters to AMD GPU users looking for alternatives to NVIDIA's CUDA. The release also keeps a wide range of builds across macOS, Linux, Windows, and openEuler, so developers can deploy on diverse systems. Although it introduces no major changes, it reinforces llama.cpp's position as a versatile tool for AI inference across multiple environments.

llama.cpp Releases · Jun 30, 2026