Release b9002 of llama.cpp is out, with support for macOS, Linux, Android, and Windows. Covered targets include macOS on Apple Silicon, Ubuntu, and Windows with CUDA. The release is intended to improve performance across different hardware configurations, and developers building on llama.cpp can move to this version to pick up those changes.
The b9004 release of llama.cpp introduces support for various platforms including macOS, Linux, Android, and Windows.
The latest update to llama.cpp includes optimizations for mixture-of-experts (MoE) models on Adreno GPUs and various fixes across platforms.
The latest update to HMX Flash Attention includes several optimizations and fixes for performance and correctness.