b9070 release adds Q4_0 MoE GEMM for Adreno

llama.cpp Releases | May 8, 2026 | high confidence

Why it matters

→ Adds Q4_0 MoE GEMM support for Adreno GPUs, which matters for on-device inference on Qualcomm-based mobile hardware.
→ Includes whitespace fixes and code cleanup alongside the new feature.
→ Supports macOS, Linux, Windows, and Android.

The b9070 release of llama.cpp adds Q4_0 MoE GEMM support for Adreno GPUs, that is, matrix multiplication for mixture-of-experts layers whose weights are stored in the 4-bit Q4_0 quantization format. The change is aimed at improving performance for mobile developers targeting Qualcomm's Adreno GPUs. The update, co-authored by Li He from Qualcomm, also includes housekeeping such as whitespace fixes and code cleanup. The release supports macOS, Linux, Windows, and Android. It optimizes existing capabilities rather than adding support for new models.
