Models & Labs

llama.cpp b9833 Release Enhances MiniCPM5 Parser

llama.cpp Releases | June 30, 2026 | high confidence

Why it matters

→ Adds a tool call parser for MiniCPM5 and refines the existing PEG parser.
→ Aligns the Jinja min/max API with Jinja2 for better template compatibility.
→ Reverts certain shared mapper changes so that JSON parsing stays strict.

The llama.cpp b9833 release centers on the MiniCPM5 parser. It introduces a new tool call parser for the model and refines the existing PEG parser based on review feedback. It also aligns the Jinja min/max API with Jinja2 and reverts certain shared mapper changes so that JSON parsing stays strict. The changes are intended to improve the parser's efficiency and reliability, particularly when handling XML tool calls.
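To make the idea of strict JSON parsing inside an XML-style tool call concrete, here is a minimal Python sketch. It is not llama.cpp code and does not reproduce the MiniCPM5 format: the `<tool_call>`, `<name>`, and `<arguments>` tags and the `parse_tool_calls` helper are hypothetical, chosen only for illustration. The sketch extracts each call with a regular expression and rejects arguments that are not standard JSON.

```python
import json
import re

# Hypothetical tag layout, for illustration only. This is an assumption and is
# NOT taken from the MiniCPM5 chat template or from llama.cpp.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<name>(?P<name>[^<]+)</name>\s*"
    r"<arguments>(?P<args>.*?)</arguments>\s*</tool_call>",
    re.DOTALL,
)


def _reject_constant(token):
    # Python's json module accepts NaN and Infinity by default;
    # strict JSON does not allow them, so refuse them here.
    raise ValueError(f"non-standard JSON constant: {token}")


def parse_tool_calls(text):
    """Return a list of (name, arguments) pairs found in model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        # json.loads already rejects single quotes and trailing commas.
        args = json.loads(match.group("args"), parse_constant=_reject_constant)
        if not isinstance(args, dict):
            raise ValueError("tool arguments must be a JSON object")
        calls.append((match.group("name").strip(), args))
    return calls


if __name__ == "__main__":
    sample = (
        "<tool_call><name>get_weather</name>"
        '<arguments>{"city": "Paris"}</arguments></tool_call>'
    )
    print(parse_tool_calls(sample))
```

Strictness here means malformed arguments such as `{'city': 'Paris'}` or `{"x": NaN}` raise a `ValueError` instead of being silently repaired, so a caller can tell a bad tool call apart from a valid one.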

