The b9004 release of llama.cpp has been announced, providing builds for multiple platforms. The update covers macOS (Apple Silicon and Intel), various Ubuntu configurations, and Windows with CUDA, along with builds for Android and openEuler. This broadens the accessibility of llama.cpp for developers working in diverse environments.
The latest b9056 release of llama.cpp continues its trend of broadening platform compatibility, now including support for macOS Apple Silicon with KleidiAI enabled and a variety of Linux configurations such as Ubuntu with Vulkan and ROCm 7.2. This update also enhances Windows support with CUDA 12 and 13 DLLs, making it more versatile for developers working across different environments. While there are no groundbreaking new features, the release solidifies llama.cpp's position as a flexible inference runtime across diverse hardware setups. Developers can now leverage these updates to optimize performance on their specific systems, whether they're using Apple Silicon, AMD, or NVIDIA GPUs.
The latest b9057 release of llama.cpp continues its trend of broadening platform compatibility, now optimizing for RISC-V CPUs with q1_0 dot support. This update enhances performance across a wide array of systems, including macOS, Linux, Windows, and Android, with specific builds for Apple Silicon, Vulkan, and CUDA environments. Notably, the inclusion of ROCm 7.2 for Ubuntu x64 and CUDA 13 for Windows x64 signifies a commitment to supporting diverse hardware configurations. While no new models are introduced, this release solidifies llama.cpp's position as a versatile inference runtime across multiple architectures.
The b9058 release of llama.cpp significantly enhances its reach by supporting more platforms, making it a versatile tool for developers. It now includes KleidiAI support for macOS Apple Silicon, which optimizes performance on Apple's ARM architecture. The update also brings Vulkan support to both Ubuntu and Windows, boosting graphics processing capabilities. With the integration of ROCm 7.2 for Ubuntu, AMD GPU users see improved compatibility, narrowing the gap with NVIDIA. Additionally, Windows users benefit from CUDA 12 and 13 DLLs, catering to NVIDIA GPU needs. This release positions llama.cpp as a more adaptable solution for developers working with diverse hardware setups.
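Whichever backend a given build ships with (Metal, Vulkan, ROCm, or CUDA), the application-facing API stays the same. The sketch below is a minimal, non-authoritative example of loading a GGUF model with GPU offload through llama.cpp's C API; the model path and layer count are placeholders, and the function names follow recent versions of llama.h, so they may differ slightly across releases.

```cpp
// Minimal sketch: load a GGUF model with GPU offload via llama.cpp's C API.
// Function names follow recent versions of llama.h and may differ between releases;
// the model path and n_gpu_layers value are illustrative placeholders.
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();  // initializes whichever backend this build was compiled with

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;  // offload as many layers as possible (Metal/Vulkan/ROCm/CUDA alike)

    llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // ... create a context, tokenize the prompt, and run decoding here ...

    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

The key point is that backend selection happens at build time, so the same `n_gpu_layers` setting drives offload on Apple Silicon, AMD, and NVIDIA hardware without code changes.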
DeepSeek V4 is an open-source AI model offering near state-of-the-art capabilities at a significantly lower cost than competitors.
The v0.18.2rc0 release includes a fix for how the max_pixels parameter is handled in the PaddleOCR-VL image processor's transformations.
Anthropic has released a suite of plugins that enhance the Claude ecosystem.