The release of v0.20.1rc0 introduces a new system_fingerprint field to API responses, enhancing compatibility with OpenAI's API. This update was co-authored by Claude from Anthropic.
The latest version b8991 of llama.cpp has been released, featuring updates for various operating systems.
The latest update to llama-mmap improves compatibility with various platforms and model sizes. Key enhancements include support for 32-bit wasm and style updates to gguf.cpp.

The v0.19.0rc0 release introduces a feature for CPU key-value cache offloading, enhancing performance. This update was signed off by Yifan Qiao.