
b9084 Release Enhances llama.cpp with HTP Kernel

llama.cpp Releases · May 9, 2026 · high confidence

Why it matters

→ The update adds an HTP kernel for the Gated Delta Net operation, optimized for HVX.
→ It extends support across macOS, Linux, and Windows, broadening its usability.
→ Fused kernels reduce vector reload overhead, improving efficiency.

The b9084 release of llama.cpp adds an HTP (Hexagon Tensor Processor) kernel for the Gated Delta Net operation. The kernel targets HVX (Hexagon Vector eXtensions) and uses fused kernels to reduce vector reload overhead. The update also extends support across macOS, Linux, and Windows. Together, these changes make llama.cpp more efficient and adaptable for developers working with AI models.
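To illustrate what kernel fusion buys here, the following is a minimal sketch in plain Python, not the llama.cpp HTP code. It assumes the single-head gated delta rule in the form S_t = alpha * S_(t-1) * (I - beta * k k^T) + beta * v k^T with output o_t = S_t q; the actual ggml operation may differ in details such as gate shapes or query scaling. The function names and the row-wise fusion strategy are hypothetical, chosen only to show how a fused kernel can touch each row of the state once instead of reloading it on every pass.

```python
def gated_delta_step_unfused(S, q, k, v, alpha, beta):
    """One gated delta rule step as four separate passes over the state S.

    S is a list of rows (one row per value dimension); q, k, v are lists.
    Every pass re-reads S, which models the reload overhead fusion avoids.
    """
    # Pass 1: apply the decay gate to the whole state.
    S = [[alpha * s for s in row] for row in S]
    # Pass 2: predict the value currently stored under key k.
    pred = [sum(s * kj for s, kj in zip(row, k)) for row in S]
    # Pass 3: delta update, moving the stored value toward v.
    S = [[s + beta * (vi - pi) * kj for s, kj in zip(row, k)]
         for row, vi, pi in zip(S, v, pred)]
    # Pass 4: read out with the query.
    o = [sum(s * qj for s, qj in zip(row, q)) for row in S]
    return S, o


def gated_delta_step_fused(S, q, k, v, alpha, beta):
    """Same arithmetic, but each row of S is loaded once and fully processed.

    On vector hardware the row would stay in registers across all four steps.
    """
    new_S, o = [], []
    for row, vi in zip(S, v):
        row = [alpha * s for s in row]                     # decay
        pred = sum(s * kj for s, kj in zip(row, k))        # prediction
        row = [s + beta * (vi - pred) * kj for s, kj in zip(row, k)]  # update
        o.append(sum(s * qj for s, qj in zip(row, q)))     # readout
        new_S.append(row)
    return new_S, o
```

Both functions perform the same operations in the same order per element, so they return the same state and output; the fused variant differs only in how often each row of the state is read.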

