Open Source

llama.cpp b9654 Release Expands Platform Support

llama.cpp Releases | June 16, 2026 | high confidence

Why it matters

→ Expands support for AMD GPUs with ROCm 7.2, offering an alternative to CUDA.
→ Maintains broad compatibility across diverse hardware platforms.
→ Continues llama.cpp's role as a versatile inference runtime.

The b9654 release of llama.cpp focuses on expanding platform support rather than introducing new features. It includes ROCm 7.2 support for Ubuntu x64, giving AMD GPU users more options. KleidiAI support on macOS Apple Silicon is disabled in this release, but compatibility remains broad across other systems, including Windows with CUDA 12 and 13. The release continues llama.cpp's effort to serve as a versatile inference runtime across multiple hardware platforms.


More from llama.cpp Releases

Open Source | models

llama.cpp b9653 Release Expands Platform Support

The b9653 release of llama.cpp continues to broaden platform compatibility, adding Vulkan support for Ubuntu and Windows and ROCm 7.2 for Ubuntu x64. KleidiAI support for macOS Apple Silicon is disabled, but the release still offers a wide array of builds across macOS, Linux, Windows, and openEuler. It introduces no new models or quantization methods; the focus is on making llama.cpp available across more hardware configurations.

llama.cpp Releases | Jun 16, 2026

Models & Labs | models

llama.cpp b9655 Release Fixes Grammar Bug

The b9655 release of llama.cpp fixes a grammar generator issue that had re-emerged in recent updates, and corrects an erroneous case in the PEG parser test. The fix matters for developers who rely on precise grammar parsing in their applications. The release brings no new features but makes llama.cpp more dependable for developers working across macOS, Linux, and Windows.

llama.cpp Releases | Jun 16, 2026

Open Source | models

llama.cpp b9658 Release Expands Platform Support

The b9658 release of llama.cpp again broadens platform compatibility, with ROCm 7.2 support on Ubuntu x64. It continues to support macOS, Windows, and Linux, with specific builds for Vulkan and SYCL. No new model architectures are introduced; the release reinforces llama.cpp's role as a versatile inference runtime across a variety of hardware setups.

llama.cpp Releases | Jun 16, 2026

More in Open Source

Open Source | agents

OpenEnv Gains Open Source Community Support

OpenEnv is developing into an open-source tool for agentic reinforcement learning (RL), now backed by a coalition of major AI organizations including Meta-PyTorch, Nvidia, and Hugging Face. The initiative aims to standardize the interface between RL environments and trainers, promoting interoperability and efficiency. By serving as a common socket for RL components, OpenEnv lets parts from different ecosystems work together. The goal is to make developing specialized models and harnesses more accessible and efficient for the open-source community.

Hugging Face Blog | Jun 8, 2026

Open Source | models

JetBrains Open-Sources Mellum2 12B MoE

JetBrains has open-sourced Mellum2, a 12-billion-parameter mixture-of-experts (MoE) model.

Lev Selector | Jun 5, 2026

Open Source | research

Google Open Sources AI Hydrology Model for Flood Forecasting

Google has open-sourced its advanced AI-based hydrology model, aiming to enhance global flood forecasting capabilities. This move allows National Meteorological and Hydrological Services to integrate sophisticated AI tools into their workflows, potentially improving the accuracy and timeliness of flood warnings. By releasing the model on GitHub, Google empowers local experts to refine and adapt the technology using their own data, fostering a more resilient approach to flood management. This initiative democratizes access to cutting-edge forecasting tools, especially benefiting regions with limited resources.

Google Research Blog | Jun 3, 2026