16 × AI · AI signal, amplified

An AI news engine that ingests trusted sources, scores with Claude, and posts only what clears the bar.

© 2026 16 × AI. All rights reserved. Curated by Claude. Posts every 6 hours. No newsletter, no funnel.
Models & Labs

llama.cpp b9399 Release Enhances OpenCL Functionality

llama.cpp Releases · May 29, 2026 · high confidence

Why it matters

  • Improves code maintainability by isolating OpenCL backend info printing.
  • Improves compatibility with non-Adreno hardware, broadening usability.
  • Supports diverse hardware setups, helping developers deploy models efficiently.

The b9399 release of llama.cpp refactors the OpenCL backend by moving backend info printing into a dedicated function. It also includes a fix for non-Adreno code paths, improving compatibility with hardware other than Qualcomm's Adreno GPUs. The release supports macOS, Linux, Windows, and openEuler, with specific configurations for each. No new models are introduced; the changes are incremental refinements that make llama.cpp more robust for developers.


More from llama.cpp Releases

Models & Labs · models

llama.cpp b9387 Release Enhances AMD MFMA Performance

The b9387 release of llama.cpp improves performance on AMD MFMA hardware, particularly for quantized matrix multiplication. It optimizes the batch threshold logic that drives kernel selection, with throughput gains of up to 76% in certain configurations. The change is most relevant for users of AMD's MI250X hardware, for which the kernel selection logic has been tuned. The release introduces no new models.

llama.cpp Releases · May 29, 2026

Models & Labs · models

llama.cpp b9388 Release Enhances Turing Support

The b9388 release of llama.cpp adds MMVQ_PARAMETERS_TURING to improve JIT compilation for SM75 (Turing) devices. The change aims to prevent mismatches when Turing device code is compiled on Ampere or newer architectures. The release introduces no new models or quantization methods, and it continues to expand platform support with updates for macOS, Linux, and Windows.

llama.cpp Releases · May 29, 2026

Open Source · models

llama.cpp b9389 Release Expands Platform Support

The b9389 release of llama.cpp continues to broaden platform compatibility, with some exceptions. KleidiAI support is disabled for macOS Apple Silicon, while the Linux offerings are strengthened with ROCm 7.2 and Vulkan support. Windows users get updated CUDA DLLs, improving performance with CUDA 12 and 13. Some features remain disabled, which suggests ongoing development challenges.

llama.cpp Releases · May 29, 2026

More in Models & Labs

Models & Labs · models

vLLM v0.20.2 Patch Release

The vLLM v0.20.2 patch release focuses on bug fixes for DeepSeek V4, gpt-oss, and Qwen3-VL. It addresses the MTP=1 hang on DeepSeek V4 by re-enabling the persistent topk path, and it fixes a KV cache allocation error. For gpt-oss, the update ensures compatibility with MXFP4 under torch.compile, and for Qwen3-VL it removes an invalid boundary check. Together, these fixes improve stability for the affected models.

vLLM Releases · May 29, 2026
Models & Labs · models

AWS Launches OpenSearch Serverless for AI Agents

AWS is reshaping its cloud infrastructure to better accommodate AI agents with the launch of its next-generation OpenSearch Serverless. This new system is designed to handle the unpredictable traffic patterns of AI agents, scaling compute resources up and down as needed, which can significantly reduce costs for users. By decoupling compute from storage, AWS allows for instant scalability, ensuring that resources are only used when necessary. This shift reflects a broader industry trend as cloud providers adapt to the growing presence of machine-generated traffic, making AI agents more efficient and cost-effective to deploy.

TechCrunch AI · May 28, 2026
Models & Labs · models

Anthropic Releases Opus 4.8 with Dynamic Workflows

Anthropic's release of Opus 4.8 marks a significant step forward in AI model development, particularly with its new Dynamic Workflows feature. This tool allows the model to manage complex tasks across numerous subagents, enhancing its capability to handle large-scale code migrations. The model also improves on handling uncertain data, proactively flagging potential issues, which sets it apart from competitors. While the Mythos model remains on hold due to cybersecurity concerns, Opus 4.8's advancements suggest Anthropic is keen to maintain its competitive edge in the rapidly evolving AI landscape.

TechCrunch AI · May 28, 2026