16 × AI: AI signal, amplified

An AI news engine that ingests trusted sources, scores with Claude, and posts only what clears the bar.

© 2026 16 × AI. All rights reserved. Curated by Claude. Posts every 6 hours. No newsletter, no funnel.
Models & Labs

llama.cpp b9726 Release Adds New Features

llama.cpp Releases · June 20, 2026 · high confidence

Why it matters

  • The new --agent argument simplifies server operations for developers.
  • Removing redundant web UI naming compatibility streamlines the codebase.
  • Broader platform support makes the runtime usable across more development environments.

The b9726 release of llama.cpp adds an --agent argument to the server, which simplifies command-line operation, and removes redundant web UI naming compatibility to streamline the codebase. Builds cover macOS, Linux, Windows, and openEuler, with enhancements for both AMD and NVIDIA GPUs. The release introduces no new models; its focus is platform coverage and developer accessibility.


More from llama.cpp Releases

Open Source · coding

llama.cpp b9724 Release with Bug Fixes

The b9724 release of llama.cpp is a stability update: a series of bug fixes, including build-process improvements and overflow prevention in the area() function. Builds cover macOS, Windows, and Ubuntu, with Vulkan and ROCm 7.2 on Ubuntu, CUDA 12 and 13 on Windows, and KleidiAI on Apple Silicon. The release adds no major features; it makes llama.cpp more reliable for developers working across different environments.

llama.cpp Releases · Jun 20, 2026
Open Source · models

llama.cpp b9728 Release Expands Platform Support

The b9728 release of llama.cpp continues to broaden platform compatibility, with one notable exception: macOS Apple Silicon is supported, but the KleidiAI feature is disabled, which suggests a preference for stability over new features. The release also covers Linux, including Ubuntu with ROCm 7.2 and Vulkan, and Windows with CUDA 12 and 13. The update keeps llama.cpp a versatile inference runtime across diverse hardware while staying conservative about new capabilities.

llama.cpp Releases · Jun 20, 2026
Models & Labs · models

llama.cpp b9731 Release Optimizes Token Sorting

The b9731 release of llama.cpp optimizes how token probabilities are sorted. By adopting std::partial_sort, it sorts only the top-n tokens instead of the full list, cutting the time per operation from 8555.6 microseconds to 704.3 microseconds, roughly a 12x speedup. The change applies across macOS, Linux, and Windows. The release adds no new features; builds continue to cover KleidiAI on Apple Silicon, ROCm 7.2 on Ubuntu, and CUDA 12 and 13 on Windows.

llama.cpp Releases · Jun 20, 2026
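
The top-n sorting idea in the b9731 item can be illustrated with a minimal C++ sketch. This is not llama.cpp's code: the TokenProb struct and the top_n_tokens function are hypothetical names chosen for illustration, and only std::partial_sort itself comes from the release summary. The sketch assumes each candidate token carries an id and a probability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration; not llama.cpp's actual data structure.
struct TokenProb {
    int   id;  // token id
    float p;   // probability assigned to the token
};

// Return the n most probable tokens, highest probability first.
// std::partial_sort fully orders only the first n positions, so the work
// is roughly O(V log n) for a vocabulary of V tokens, compared with
// O(V log V) for sorting the whole vector.
std::vector<TokenProb> top_n_tokens(std::vector<TokenProb> probs, std::size_t n) {
    n = std::min(n, probs.size());  // clamp so the middle iterator stays in range
    std::partial_sort(
        probs.begin(),
        probs.begin() + static_cast<std::ptrdiff_t>(n),
        probs.end(),
        [](const TokenProb & a, const TokenProb & b) { return a.p > b.p; });
    probs.resize(n);  // drop the unsorted tail
    return probs;
}
```

Calling top_n_tokens(candidates, 10) on a full vocabulary returns the ten most probable entries in descending order. The saving grows with the gap between n and the vocabulary size, which is consistent with the direction of the speedup the release reports, though the exact figures depend on the real implementation.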

More in Models & Labs

Models & Labs · models

Claude Fable 5 Withdrawn Amid Negotiations

Claude Fable 5 was released and then withdrawn as Anthropic negotiates access with the administration.

Lev Selector · Jun 19, 2026
Models & Labs · models

OpenRouter Fusion Combines Models to Reduce Costs

OpenRouter Fusion uses model ensembles to reduce hallucinations and improve accuracy while lowering costs.

Lev Selector · Jun 19, 2026
Models & Labs · productivity

Enterprise AI Focuses on Inference Optimization

Enterprises prioritize inference optimization with model panels and smart routing.

The AI Daily Brief · Jun 19, 2026