The latest b9008 release of llama.cpp expands its platform support, offering builds for macOS, Linux, Windows, and Android. The release adds Vulkan builds for both Ubuntu and Windows, as well as ROCm 7.2 on Ubuntu, broadening compatibility across hardware architectures. With coverage spanning Apple Silicon, Intel, and CUDA systems, the update reinforces llama.cpp's position as a flexible inference runtime across platforms.
The b9010 release of llama.cpp tackles a crucial bug in CUDA device PCI bus ID detection, which previously caused out-of-memory errors by failing to recognize multiple GPUs. This update significantly improves multi-GPU support, especially for Windows users leveraging CUDA. The release also brings enhancements for macOS, Linux, and Windows, with specific improvements for Apple Silicon and Vulkan integration. While it doesn't introduce groundbreaking new features, this update strengthens llama.cpp's reliability and compatibility across different hardware setups, including ROCm 7.2 and KleidiAI on Apple Silicon.
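For context on what PCI bus ID detection does, here is a minimal standalone sketch (not llama.cpp's own code) that uses the CUDA runtime API to list each visible device alongside its PCI bus ID. Distinct bus IDs are what let a runtime tell physical GPUs apart; if that identification goes wrong, work and memory can pile up on a single card, which is the kind of failure the fix addresses.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Enumerate visible CUDA devices and print each one's PCI bus ID.
// Illustrative only: this mirrors the general mechanism, not llama.cpp's code.
int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "no CUDA devices visible\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        char busId[64] = {0};
        cudaDeviceGetPCIBusId(busId, sizeof(busId), dev);

        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, dev);

        std::printf("device %d: %s  PCI bus id: %s\n", dev, prop.name, busId);
    }
    return 0;
}
```

On a healthy multi-GPU setup, each device line should report a unique bus ID (e.g. `0000:17:00.0` vs. `0000:65:00.0`, values here are hypothetical); duplicate or missing IDs are a sign that device enumeration is not distinguishing the cards.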
The b9002 version of llama.cpp has been released, supporting multiple platforms.
The b9004 release of llama.cpp introduces support for various platforms including macOS, Linux, Android, and Windows.
DeepSeek V4 is an open-source AI model offering near state-of-the-art capabilities at a significantly lower cost than competitors.
The v0.18.2rc0 release includes a fix for handling the max_pixels parameter in the PaddleOCR-VL image processor across transformations.
Anthropic has released a suite of plugins that enhance the Claude ecosystem.