The latest llama.cpp release, b9496, expands support across multiple platforms, including Ubuntu and Windows. Ubuntu builds include ROCm 7.2 and Vulkan support, while Windows users gain access to CUDA 12 and CUDA 13 DLLs for GPU acceleration. Some features are disabled, however, including KleidiAI on macOS Apple Silicon and SYCL on Windows. The update continues llama.cpp's effort to improve compatibility and performance across a wide range of hardware configurations.
The b9489 release of llama.cpp brings notable improvements for CUDA users, chiefly by reserving space for quantized key-value (KV) caches at startup. It also addresses previous feedback and removes certain assertions in the ggml-cuda.cu file. The release introduces no new models or quantization techniques, but it continues to refine compatibility across macOS, Linux, and Windows, and it includes ROCm 7.2 and KleidiAI support for environments beyond CUDA. Overall, this iteration is a step toward making llama.cpp a more versatile and efficient tool for AI development.
The b9490 release of llama.cpp continues the trend of broadening platform compatibility, though with some notable exceptions. KleidiAI support is disabled for macOS Apple Silicon users, while the Linux offerings are strengthened with Vulkan and ROCm 7.2 support on Ubuntu. Windows users get CUDA 12 and CUDA 13 DLLs as GPU options. Despite the disabled features, the update reflects llama.cpp's aim of being a versatile inference runtime across diverse systems.