The b9515 release of llama.cpp is a maintenance update that moves duplicated imatrix code into a single common loader. It also brings back LLAMA_TRACE and adds an early exit when metadata is missing during quantization. The release supports multiple platforms, including macOS, Linux, Windows, and openEuler, with specific builds for backends such as Vulkan, ROCm, and CUDA. The update introduces no new features; its aim is a more maintainable codebase.
The b9503 release of llama.cpp fixes the embedding size of the Gemma 4 audio projector by removing projection_dim from clip_n_mmproj_embd. Builds are provided for macOS, Linux, and Windows, including specific builds for Apple Silicon, ROCm 7.2, and CUDA 12 and 13. The release introduces no new features; it is a technical refinement focused on reliability and stability.
The b9504 release of llama.cpp continues to broaden platform coverage. It notably includes an Ubuntu build with ROCm 7.2, aimed at AMD GPU users. Features such as KleidiAI on macOS and SYCL on Windows are not yet active, but the release still makes llama.cpp a more adaptable tool for developers working across different systems.