Models & Labs

llama.cpp b9784 Release Enhances Hexagon Performance

llama.cpp Releases · June 26, 2026 · high confidence

Why it matters

→ Enhances performance of matrix multiplication on the Hexagon architecture.
→ Optimizes register usage and activation processing.
→ Improves efficiency for developers using diverse hardware configurations.

The b9784 release of llama.cpp focuses on optimizing matrix multiplication on Hexagon. The key change is a rework of the MUL_MAT and MUL_MAT_ID operations, which introduces a 32x32 tiled weight repack and updated kernel parameters. These updates aim to improve performance and efficiency for users running llama.cpp on the Hexagon architecture. The release adds no new models; it speeds up existing operations instead.
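To make the idea of a tiled weight repack concrete, here is a minimal sketch in Python. It is not the llama.cpp Hexagon implementation: the function names `repack_tiled` and `tiled_index`, the flat-list representation, and the row-major ordering inside and across tiles are all assumptions chosen for illustration. Only the 32x32 tile size comes from the release summary. The sketch reorders a row-major matrix so that each 32x32 block occupies one contiguous run of memory, which is the general reason kernels repack weights before a matrix multiply.

```python
TILE = 32  # tile edge; the only value taken from the release summary


def repack_tiled(weights, rows, cols, tile=TILE):
    """Reorder a flat row-major rows x cols matrix so that every
    tile x tile block is stored contiguously.

    Hypothetical layout: tiles are visited left to right, top to bottom,
    and elements inside a tile stay row-major.
    """
    if rows % tile or cols % tile:
        raise ValueError("rows and cols must be multiples of the tile size")
    packed = []
    for tile_row in range(0, rows, tile):          # top edge of each tile
        for tile_col in range(0, cols, tile):      # left edge of each tile
            for r in range(tile_row, tile_row + tile):
                start = r * cols + tile_col        # offset in the source layout
                packed.extend(weights[start:start + tile])
    return packed


def tiled_index(r, c, cols, tile=TILE):
    """Map a (row, col) coordinate to its offset in the repacked buffer."""
    tiles_per_row = cols // tile
    tile_id = (r // tile) * tiles_per_row + (c // tile)
    return tile_id * tile * tile + (r % tile) * tile + (c % tile)


if __name__ == "__main__":
    rows, cols = 64, 64
    weights = list(range(rows * cols))             # value == row-major offset
    packed = repack_tiled(weights, rows, cols)
    # Element (1, 0) sits right after the first 32 values of tile 0.
    print(packed[tiled_index(1, 0, cols)] == weights[1 * cols + 0])
```

With this layout a kernel can load one whole tile with sequential reads rather than striding across matrix rows. The actual tile ordering and element packing used in b9784 may differ.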


More from llama.cpp Releases

Open Source · models

llama.cpp b9781 Release Expands Platform Support

The latest b9781 release of llama.cpp continues its trend of broadening platform compatibility, though without major new features. Notably, the release includes support for ROCm 7.2 on Ubuntu x64, which is significant for AMD GPU users seeking alternatives to NVIDIA's CUDA. While KleidiAI support for macOS Apple Silicon is disabled, the release still covers a wide array of platforms, including Windows and openEuler. This update reinforces llama.cpp's position as a versatile inference runtime, though it remains focused on platform expansion rather than introducing new model architectures.

llama.cpp Releases · Jun 26, 2026

Open Source · models

llama.cpp b9782 Release Expands Platform Support

The latest b9782 release of llama.cpp continues its trend of broadening platform compatibility, though without major new features. Notably, the release includes support for ROCm 7.2 on Ubuntu x64, which is significant for AMD GPU users seeking alternatives to NVIDIA's CUDA. While KleidiAI support for Apple Silicon remains disabled, the release still covers a wide array of platforms, from Windows to openEuler. This update solidifies llama.cpp's position as a versatile inference runtime, though it doesn't introduce groundbreaking changes.

llama.cpp Releases · Jun 26, 2026

Open Source · models

llama.cpp b9785 Release Expands Platform Support

The latest b9785 release of llama.cpp continues its trend of broadening platform compatibility, though without major new features. Notably, the release includes support for ROCm 7.2 on Ubuntu x64, which is significant for AMD GPU users seeking alternatives to NVIDIA's CUDA. While KleidiAI support for Apple Silicon remains disabled, the release still covers a wide array of platforms, from macOS to Windows and openEuler. This update solidifies llama.cpp's position as a versatile inference runtime, though it doesn't introduce groundbreaking changes.

llama.cpp Releases · Jun 26, 2026

More in Models & Labs

Models & Labs · models

OpenAI Develops Custom Chip 'Jalapeño'

OpenAI has announced the development of its first custom chip, named 'Jalapeño'.

The AI Daily Brief · Jun 25, 2026

Models & Labs · models

GPT-5.5 Instant Now Available for Free Users

OpenAI has updated GPT-5.5 Instant, making it accessible to users on the free tier.

The AI Daily Brief · Jun 25, 2026

Models & Labs · models

Unconventional AI Aims to Slash AI Power Use by 1,000x

Unconventional AI, led by former Databricks AI chief Naveen Rao, is developing a new computing architecture that could cut the power consumption of AI inference by a factor of up to 1,000. Its first model, Un-0, is meant to show that an oscillator-based architecture can match the performance of state-of-the-art diffusion models in image generation. The model currently runs in software simulation; the company plans to release chip schematics soon and aims to build a complete inference stack. If it works, the approach could ease the energy constraints that threaten to limit AI scaling.

TechCrunch AI · Jun 25, 2026