The b9081 release of llama.cpp has been announced, featuring expanded support across multiple platforms. This update includes macOS Apple Silicon with KleidiAI, Vulkan support for both Ubuntu and Windows, and ROCm 7.2 for Ubuntu x64. Windows builds also add CUDA 12 and 13 support. While no new models are introduced, the release improves llama.cpp's compatibility and usability across a wide range of hardware setups, reinforcing its role as a versatile AI inference tool.
The b9073 release of llama.cpp marks a significant expansion in platform compatibility, enhancing its accessibility across various operating systems. With KleidiAI now enabled for macOS Apple Silicon, M-series Mac users can expect improved performance. The update also includes builds for Ubuntu featuring ROCm 7.2 and OpenVINO, alongside Windows versions with CUDA 12 and 13, reflecting a commitment to supporting diverse hardware. This positions llama.cpp as a versatile inference runtime, catering to developers across different environments without introducing new model architectures.
The b9075 release of llama.cpp brings a notable improvement for CUDA users by integrating the snake activation function into a single elementwise kernel. This enhancement is particularly advantageous for audio decoders like BigVGAN and Vocos, which previously depended on a more complex five-operation sequence. By streamlining these operations, the update promises better performance and efficiency across data types such as F32, F16, and BF16. This development reflects llama.cpp's ongoing focus on refining its CUDA capabilities, making it a more compelling option for developers dealing with complex activation functions.
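The release notes don't reproduce the kernel itself, but the fusion is easy to picture. Assuming the standard snake definition used in the BigVGAN literature, snake(x) = x + sin²(αx)/α, the change collapses a five-operation chain into one elementwise expression. The NumPy sketch below (function names are illustrative, not from the llama.cpp source) contrasts the two formulations:

```python
import numpy as np

def snake_fused(x, alpha=1.0):
    # One elementwise expression, analogous to the fused CUDA kernel:
    # snake(x) = x + sin^2(alpha * x) / alpha
    return x + np.sin(alpha * x) ** 2 / alpha

def snake_five_ops(x, alpha=1.0):
    # The same activation as the five separate ops an audio-decoder
    # graph previously chained together, each a full pass over the data.
    t = alpha * x   # 1. scale by alpha
    t = np.sin(t)   # 2. sine
    t = t * t       # 3. square
    t = t / alpha   # 4. rescale by 1/alpha
    return x + t    # 5. residual add

x = np.linspace(-3.0, 3.0, 7, dtype=np.float32)
assert np.allclose(snake_fused(x), snake_five_ops(x))
```

On a GPU, each of the five ops would otherwise launch its own kernel and round-trip the tensor through memory; fusing them into one kernel reads and writes each element once, which is why the change helps regardless of whether the data is F32, F16, or BF16.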
The v0.18.2rc0 release includes a fix for how the max_pixels parameter is handled in the PaddleOCR-VL image processor across its transformations.