Open Source

llama.cpp b9658 Release Expands Platform Support

llama.cpp Releases · June 16, 2026 · high confidence

Why it matters

→ Expands platform support, making llama.cpp usable on more systems.
→ Adds ROCm 7.2 support on Ubuntu x64, improving compatibility with AMD GPU setups.
→ Reinforces llama.cpp's position as a flexible inference runtime.

The b9658 release of llama.cpp expands platform support. Notably, it includes ROCm 7.2 support on Ubuntu x64, alongside existing compatibility with macOS, Windows, and Linux. The release introduces no new model architectures; its focus is platform coverage, which reinforces llama.cpp's role as a flexible inference runtime across different hardware setups.


More from llama.cpp Releases

Open Source · models

llama.cpp b9653 Release Expands Platform Support

The b9653 release of llama.cpp continues to broaden platform compatibility, notably adding Vulkan support for Ubuntu and Windows and ROCm 7.2 for Ubuntu x64. Although KleidiAI support is disabled on macOS Apple Silicon, the release still offers builds across macOS, Linux, Windows, and openEuler. It introduces no new models or quantization methods; the focus is on making llama.cpp available across more hardware configurations, giving developers a wider range of systems on which to run inference.

llama.cpp Releases · Jun 16, 2026

Open Source · models

llama.cpp b9654 Release Expands Platform Support

The b9654 release of llama.cpp continues to broaden platform compatibility, though without major new features. It includes support for ROCm 7.2 on Ubuntu x64, which matters for AMD GPU users seeking an alternative to NVIDIA's CUDA. Although KleidiAI support is disabled on macOS Apple Silicon, the release still covers a wide range of systems, including Windows with CUDA 12 and 13 DLLs. The update reinforces llama.cpp's aim of being a versatile inference runtime across diverse hardware configurations.

llama.cpp Releases · Jun 16, 2026

Models & Labs · models

llama.cpp b9655 Release Fixes Grammar Bug

The b9655 release of llama.cpp fixes a grammar generator issue that had resurfaced in recent updates, which matters for developers whose applications rely on precise grammar parsing. The update also corrects an erroneous case in the PEG parser test. The release brings no new features, but it makes the existing infrastructure more dependable across macOS, Linux, and Windows.

llama.cpp Releases · Jun 16, 2026

More in Open Source

Open Source · agents

OpenEnv Gains Open Source Community Support

OpenEnv is developing into an open-source tool for agentic reinforcement learning (RL), now backed by a coalition of major AI organizations including Meta-PyTorch, Nvidia, and Hugging Face. The initiative aims to standardize the interface between RL environments and trainers, promoting interoperability and efficiency. By serving as a common socket for RL components, OpenEnv is meant to ease integration across different ecosystems and to support the development of specialized models and harnesses, making RL more accessible to the open-source community.

Hugging Face Blog · Jun 8, 2026

Open Source · models

JetBrains Open-Sources Mellum2 12B MoE

JetBrains has open-sourced Mellum2, a 12-billion-parameter mixture-of-experts (MoE) model.

Lev Selector · Jun 5, 2026

Open Source · research

Google Open Sources AI Hydrology Model for Flood Forecasting

Google has open-sourced its advanced AI-based hydrology model, aiming to enhance global flood forecasting capabilities. This move allows National Meteorological and Hydrological Services to integrate sophisticated AI tools into their workflows, potentially improving the accuracy and timeliness of flood warnings. By releasing the model on GitHub, Google empowers local experts to refine and adapt the technology using their own data, fostering a more resilient approach to flood management. This initiative democratizes access to cutting-edge forecasting tools, especially benefiting regions with limited resources.

Google Research Blog · Jun 3, 2026