Open Source

llama.cpp b9771 Release Trims Shader Variants

llama.cpp Releases | June 24, 2026 | high confidence

Why it matters

→ Reduces the shader variant explosion, which mainly benefits Vulkan users.
→ Decreases binary size.
→ Maintains builds across macOS, Linux, Windows, and openEuler, covering a range of hardware configurations.

The b9771 release of llama.cpp makes 'mul_mm ALIGNED' a spec constant, which reduces the shader variant explosion and decreases binary size. Because a spec constant is supplied when a pipeline is created rather than when a shader is compiled, fewer precompiled shader variants have to ship with the build. The change is most relevant to Vulkan users. The release keeps builds available for macOS, Linux, Windows, and openEuler. It introduces no new features and continues the project's ongoing optimization work.
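To make the variant-count argument concrete, here is a minimal sketch in Python. It is a toy model, not llama.cpp's shader generator: the type list, the naming scheme, and the `create_pipeline` helper are all hypothetical and exist only to show why moving one boolean option from a compile-time define to a specialization constant halves the number of compiled binaries along that axis.

```python
from itertools import product

# Hypothetical option axes, chosen for illustration only.
DATA_TYPES = ["f32", "f16", "q4_0", "q8_0"]
ALIGNED_VALUES = [False, True]


def variants_with_define(types, aligned_values):
    """ALIGNED as a preprocessor define: one compiled binary per (type, aligned) pair."""
    return [
        f"mul_mm_{t}{'_aligned' if a else ''}"
        for t, a in product(types, aligned_values)
    ]


def variants_with_spec_constant(types):
    """ALIGNED as a spec constant: one compiled binary per type."""
    return [f"mul_mm_{t}" for t in types]


def create_pipeline(binary_name, aligned):
    """Stand-in for pipeline creation: same binary, different specialization value."""
    return {"binary": binary_name, "spec_constants": {"ALIGNED": int(aligned)}}


before = variants_with_define(DATA_TYPES, ALIGNED_VALUES)
after = variants_with_spec_constant(DATA_TYPES)
print(len(before), len(after))  # prints "8 4"

# Both pipelines reuse the single "mul_mm_f16" binary.
aligned_pipeline = create_pipeline("mul_mm_f16", aligned=True)
unaligned_pipeline = create_pipeline("mul_mm_f16", aligned=False)
```

In this toy model the saving is a factor of two on one axis. In a real backend the same factor would apply across every other combination of options, which is why a single boolean can have a noticeable effect on binary size.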


More from llama.cpp Releases

Models & Labs | models

llama.cpp b9767 Release Enhances MTP Inference

The b9767 release of llama.cpp improves MTP inference by optimizing the mat-vec path for small batches, which makes decoding more efficient. It also adds a barrier in the NUM_COLS loop of the mul-mat-vec path, which is expected to boost performance. No new model architectures are included. Builds cover macOS, Linux, and Windows, including macOS on Apple Silicon, Ubuntu with ROCm 7.2, and Windows with CUDA 12 and 13.

llama.cpp Releases | Jun 24, 2026

Models & Labs | models

Granite Speech Plus Support Added in b9768 Release

The b9768 release of llama.cpp integrates Granite Speech Plus, which enhances audio processing with multi-layer concatenation. The update also resolves naming inconsistencies and standardizes feature layer usage, so it is most relevant to developers working on audio applications. No new models are introduced. The release is a refinement of the existing audio features.

llama.cpp Releases | Jun 24, 2026

Open Source | models

llama.cpp b9773 Release Expands Platform Support

The b9773 release of llama.cpp continues to broaden platform compatibility without adding major new features. It includes support for ROCm 7.2 on Ubuntu x64, which matters to AMD GPU users looking for an alternative to NVIDIA's CUDA. The release also maintains builds across macOS, Linux, Windows, and openEuler, so developers can deploy llama.cpp in many different computing environments.

llama.cpp Releases | Jun 24, 2026

More in Open Source

Open Source | coding

Hugging Face Automates Weekly Releases with AI

Hugging Face has moved the huggingface_hub Python client from a 4-6 week release cycle to weekly releases. The new process combines open-source tools with AI, which drafts release notes and automates mechanical tasks, while humans keep control of the steps that need judgment. It is designed to be replicable by other maintainers and does not depend on proprietary tools.

Hugging Face Blog | Jun 23, 2026

Open Source | coding

OpenAI Launches Patch the Planet Initiative

OpenAI's new initiative, Patch the Planet, aims to bolster the security of open-source projects by helping maintainers identify and address vulnerabilities. It combines AI technology with expert review. By providing tools and support, OpenAI is addressing a common problem in the open-source community, where security is often overlooked because of resource constraints. The initiative could make widely used open-source software safer for developers and users.

OpenAI | Jun 22, 2026

Open Source | models

OpenRouter Launches Fusion for Model Routing

OpenRouter has introduced Fusion, a new tool for model routing in AI systems.

The AI Daily Brief | Jun 21, 2026