The latest release of Llama.cpp, version b8998, has been announced, with prebuilt binaries for multiple operating systems: macOS (Apple Silicon and Intel), various Linux distributions, Android, and Windows, in both CPU-only and GPU-enabled configurations. Notably, the release ships CUDA, Vulkan, and SYCL builds across different architectures, expanding the accessibility of Llama.cpp for developers working on diverse platforms.
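As a rough sketch of how such backend-specific builds are typically produced from source, the standard ggml CMake backend flags can be used (these are general llama.cpp build options, not details taken from this release's notes):

```shell
# Configure and build llama.cpp with a GPU backend enabled.
# Pick ONE of the backend flags below; all are standard ggml CMake options.
cmake -B build -DGGML_CUDA=ON      # NVIDIA GPUs via CUDA
# cmake -B build -DGGML_VULKAN=ON  # cross-vendor GPUs via Vulkan
# cmake -B build -DGGML_SYCL=ON    # Intel GPUs via SYCL
cmake --build build --config Release -j
```

Omitting all backend flags produces the CPU-only build that the prebuilt CPU packages correspond to.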
The latest update for llama-quant fixes a tensor-type issue that occurred when the default qtype is overridden. The release includes builds for various platforms.
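For context, overriding the default qtype on a per-tensor basis is done through the quantize tool. A minimal sketch, assuming the `--tensor-type` per-tensor override option present in recent llama-quantize builds (file names here are placeholders):

```shell
# Quantize to Q4_K_M overall, but override the attn_v tensors to Q6_K.
# --tensor-type PATTERN=TYPE is the kind of per-tensor override
# this fix relates to; model file names are placeholders.
./build/bin/llama-quantize \
    --tensor-type attn_v=q6_k \
    model-f16.gguf model-q4_k_m.gguf Q4_K_M
```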
The latest Llama.cpp release adds Vulkan support for asymmetric flash attention (FA) in the coopmat2 path, enhancing mixed-quantization capabilities.
The latest update to ggml-webgpu fixes vectorized handling in the mul_mat and mul_mat_id operations. The release includes builds for various operating systems and architectures.