The b9489 release of llama.cpp introduces enhancements primarily for CUDA users, including reserving space for quantized key-value caches at startup. This update also addresses review comments and removes specific assertions in the ggml-cuda.cu file. The release maintains support across multiple platforms such as macOS, Linux, and Windows, though no new models or quantization methods are included. These changes aim to improve the overall efficiency and compatibility of llama.cpp for developers utilizing CUDA.
The latest b9490 release of llama.cpp continues its trend of broadening platform compatibility, though with some notable exceptions. While macOS Apple Silicon users see KleidiAI support disabled, the release strengthens its Linux offerings with Vulkan and ROCm 7.2 support on Ubuntu. Windows users benefit from CUDA 12 and 13 DLLs, enhancing GPU performance options. Despite some features being disabled, this update demonstrates llama.cpp's commitment to being a versatile inference runtime across diverse systems.
The b9491 release of llama.cpp resolves PDL race conditions by removing the 'restrict' qualifier from PDL kernel headers, where it had been causing compatibility issues. The update adds preprocessor directives so that performance is maintained on older architectures, and it wraps 'restrict' in macros to simplify its use. The release also addresses the PDL restrict issue on Hopper architectures. For developers, these changes improve compatibility and performance across operating systems and hardware configurations.
The b9493 release of llama.cpp continues to broaden its platform reach, notably integrating ROCm 7.2 for Ubuntu x64, which offers better support for AMD GPU users. Although features like KleidiAI on macOS Apple Silicon remain inactive, the update emphasizes extending functionality across various systems, including Vulkan support for both Ubuntu and Windows. While no new models are introduced, this release strengthens llama.cpp's role as a versatile inference runtime across multiple operating systems. Developers can now take advantage of improved GPU support, making it a more inclusive tool for those working outside the NVIDIA ecosystem.
© Google Research Blog
Google has open-sourced its advanced AI-based hydrology model, aiming to enhance global flood forecasting capabilities. This move allows National Meteorological and Hydrological Services to integrate sophisticated AI tools into their workflows, potentially improving the accuracy and timeliness of flood warnings. By releasing the model on GitHub, Google empowers local experts to refine and adapt the technology using their own data, fostering a more resilient approach to flood management. This initiative democratizes access to cutting-edge forecasting tools, especially benefiting regions with limited resources.
© TechCrunch AI
Microsoft's new Agent Control Specification (ACS) offers developers a unified way to manage AI agent behavior across various environments. By allowing teams to define specific policies, ACS ensures agents operate within set boundaries, reducing the risk of unintended actions. This open-source standard integrates controls into a common governance layer, making it easier to audit and reuse across different systems. With ACS, developers can maintain consistent oversight, enhancing both security and compliance in AI deployments.
© Lev Selector
Cohere has open-sourced its Command A+ model, making it accessible for public use.