Open Source

llama.cpp b9724 Release with Bug Fixes

llama.cpp Releases | June 20, 2026 | high confidence

Why it matters

→ Bug fixes improve the stability and reliability of llama.cpp.
→ Support for macOS, Windows, and Ubuntu keeps it usable across developer platforms.
→ Enhancements for Vulkan and ROCm 7.2 broaden usability.

The b9724 release of llama.cpp is a bug-fix release aimed at improving stability and performance. Key changes include fixes to the build process and a new sanity check in the get_u32() function. The release supports a wide range of platforms, including macOS, Windows, and Ubuntu, with specific enhancements for Vulkan and ROCm 7.2. It introduces no new features; the focus is on making llama.cpp more reliable for developers.


More from llama.cpp Releases

Models & Labs | models

llama.cpp b9726 Release Adds New Features

The b9726 release of llama.cpp enhances server functionality with a new --agent argument, making command-line operations more efficient. By removing redundant web UI naming compatibility, the update simplifies the codebase. This release extends support to macOS, Linux, Windows, and openEuler, with specific improvements for AMD GPUs through ROCm 7.2 and NVIDIA GPUs with CUDA 12 and 13. While no new models are introduced, the update focuses on refining the platform's adaptability and ease of use for developers working in diverse computing environments.

llama.cpp Releases | Jun 20, 2026

Open Source | models

llama.cpp b9728 Release Expands Platform Support

The b9728 release of llama.cpp continues to broaden platform compatibility, with one notable exception: macOS Apple Silicon is supported, but the KleidiAI feature is disabled there, indicating a focus on stability over new features. The release also supports a variety of Linux distributions, including Ubuntu with ROCm 7.2 and Vulkan, as well as Windows with CUDA 12 and 13. The update keeps llama.cpp a versatile inference runtime across diverse hardware while remaining conservative about new capabilities.

llama.cpp Releases | Jun 20, 2026

Models & Labs | models

llama.cpp b9731 Release Optimizes Token Sorting

The b9731 release of llama.cpp optimizes how token probabilities are calculated. By adopting std::partial_sort, it sorts only the top-n tokens, cutting the time per operation from 8555.6 microseconds to 704.3 microseconds, roughly a 12x speedup. The change applies across macOS, Linux, and Windows builds. The update introduces no new features; existing capabilities such as KleidiAI on Apple Silicon, ROCm 7.2 on Ubuntu, and CUDA 12 and 13 on Windows remain in place, and the focus is on making core functionality more efficient.

llama.cpp Releases | Jun 20, 2026
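
The top-n idea described in the b9731 summary can be illustrated with a minimal C++ sketch. This is not llama.cpp's actual code: the TokenProb struct and the top_n_tokens function are hypothetical names chosen for illustration. It only shows the general technique, where std::partial_sort orders the first n elements and leaves the rest in unspecified order, which costs roughly N log n comparisons instead of the N log N of a full sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate type for illustration; not llama.cpp's real struct.
struct TokenProb {
    int   id;    // token id
    float prob;  // probability assigned to the token
};

// Returns the n most probable candidates in descending order of probability.
// Takes the vector by value so the caller's data is left untouched.
std::vector<TokenProb> top_n_tokens(std::vector<TokenProb> cands, std::size_t n) {
    // Clamp n so the "middle" iterator never runs past the end.
    n = std::min(n, cands.size());
    assert(n <= cands.size());

    const auto middle = cands.begin() + static_cast<std::ptrdiff_t>(n);

    // Only [begin, middle) ends up sorted; the tail order is unspecified.
    std::partial_sort(cands.begin(), middle, cands.end(),
                      [](const TokenProb & a, const TokenProb & b) {
                          return a.prob > b.prob;  // highest probability first
                      });

    // Drop everything past the top n.
    cands.erase(middle, cands.end());
    return cands;
}
```

Calling top_n_tokens(candidates, 2) on a five-element candidate list would return just the two highest-probability entries, without paying to order the remaining three.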

More in Open Source

Open Source | coding

Kimi K2.7 and GLM-5.2 Models Released

Moonshot AI and Zhipu AI have released new open-weight coding models, Kimi K2.7 and GLM-5.2.

Lev Selector | Jun 19, 2026

Open Source | agents

Anthropic Releases Open Source Tool for AI Agents

Anthropic has launched a new open-source tool called Claude Code, designed to simplify the creation of AI agents. This tool allows users to build and deploy AI agents without needing to write code or manage servers, making it accessible to a broader audience. The process involves an interactive setup that defines success criteria and schedules tasks, all managed in the cloud. This release could democratize AI agent development, enabling more people to experiment and innovate with AI technologies without technical barriers.

Duncan Rogoff | Jun 19, 2026

Open Source | coding

GitHub Limits Open Pull Requests for Non-Writers

GitHub has introduced a feature that lets repository maintainers cap the number of open pull requests from users without write access. The change aims to streamline contribution management by reducing the clutter of low-quality or drive-by pull requests. Maintainers can also designate trusted contributors who may exceed the limit without being granted full collaborator access. The goal is to help maintainers focus on meaningful contributions and cut unnecessary review and CI overhead.

GitHub Changelog | Jun 17, 2026