The b9124 release of llama.cpp expands platform support, enhancing its utility for developers. This update exposes model modalities via the /v1/models endpoint, allowing clients to discover what inputs a loaded model accepts. It ships builds for a range of operating systems, including macOS on Apple Silicon, Windows with CUDA 13, and Ubuntu with ROCm 7.2. While no new models are introduced, this release solidifies llama.cpp's role as a versatile inference runtime across multiple hardware setups.
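A client could use the modality information to pick a model for a given task. The sketch below parses a hypothetical /v1/models response; the exact shape of the modalities field is an assumption for illustration, not confirmed by the release notes.

```python
import json

# Hypothetical /v1/models payload; the "modalities" field layout is an
# assumption for illustration, not llama.cpp's documented schema.
sample_response = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gemma-3-4b-it", "object": "model",
     "modalities": ["text", "vision"]}
  ]
}
""")

def models_supporting(response, modality):
    """Return ids of models whose advertised modalities include `modality`."""
    return [m["id"] for m in response["data"]
            if modality in m.get("modalities", [])]

print(models_supporting(sample_response, "vision"))  # ['gemma-3-4b-it']
```

In a real client the payload would come from `GET http://localhost:8080/v1/models` against a running llama-server instance.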
The latest b9116 release of llama.cpp introduces MiMo v2.5, enhancing vision support with fused qkv for improved performance. This update addresses previous issues such as an f16 vision overflow and includes various cleanups for better code maintenance. With expanded platform support spanning macOS, Linux, and Windows, the release broadens accessibility for developers working on diverse systems. The focus on vision capabilities marks a significant step toward making llama.cpp a more versatile tool for AI developers, particularly those integrating vision functionality.
The latest b9118 release of llama.cpp continues its trend of broadening platform compatibility, now including support for a wide array of systems such as macOS, Linux, Windows, and Android. Notably, this update introduces Vulkan support on Ubuntu and Windows, alongside ROCm 7.2 for AMD GPUs, which is a significant step for users seeking alternatives to NVIDIA's CUDA. The inclusion of KleidiAI on Apple Silicon further enhances performance for M-series Macs. While there are no new model architectures, this release solidifies llama.cpp's position as a versatile inference runtime across diverse hardware configurations.
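For developers building from source rather than using the prebuilt binaries, the backends mentioned above are selected at configure time. The flag names below follow llama.cpp's ggml CMake options as commonly documented; verify them against your checkout, as option names have changed across releases.

```shell
# Sketch: choosing a ggml backend when configuring llama.cpp
# (flag names assumed from llama.cpp's build docs; confirm locally).
cmake -B build -DGGML_VULKAN=ON           # Vulkan (Ubuntu / Windows)
cmake -B build -DGGML_HIP=ON              # ROCm / HIP (AMD GPUs)
cmake -B build -DGGML_CUDA=ON             # CUDA (NVIDIA GPUs)
cmake --build build --config Release
```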
The v0.18.2rc0 release includes a fix for how the max_pixels parameter is handled in the PaddleOCR-VL image processor during image transformations.
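For context, a max_pixels parameter in a vision-language image processor typically caps the total pixel count of an input image, downscaling while preserving aspect ratio. The following is a generic sketch of that constraint, not the PaddleOCR-VL processor's actual implementation.

```python
import math

def clamp_to_max_pixels(width, height, max_pixels):
    """Scale (width, height) down so that width * height <= max_pixels,
    approximately preserving aspect ratio. Generic illustration of the
    constraint a max_pixels parameter enforces; hypothetical helper."""
    if width * height <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, int(width * scale)), max(1, int(height * scale))

# A 12 MP image clamped to a 1 MP budget:
print(clamp_to_max_pixels(4000, 3000, 1_000_000))
```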