The b9503 release of llama.cpp fixes an issue with the Gemma 4 audio projector embedding size. The update removes projection_dim from clip_n_mmproj_embd, a narrow adjustment to how the projector's embedding size is determined rather than a new feature. As with other recent builds, the release ships binaries for macOS, Linux, and Windows.
The b9504 release of llama.cpp continues to broaden its reach, enhancing compatibility across multiple environments. This update notably includes support for Ubuntu with ROCm 7.2, which boosts performance for AMD GPU users. While features like KleidiAI on macOS and SYCL on Windows are not yet active, the release still represents a significant step in making llama.cpp a more adaptable tool for developers. By focusing on expanding compatibility and improving the runtime experience, this update strengthens llama.cpp's position as a versatile option for developers working with different systems.
The b9505 release of llama.cpp continues its trend of broadening compatibility across various systems, though with some notable exceptions. While macOS Apple Silicon users see KleidiAI support disabled, the release strengthens its presence on Windows with CUDA 12 and 13 DLLs, and extends Vulkan support to more environments. The inclusion of ROCm 7.2 for Ubuntu x64 users further narrows the gap between AMD and NVIDIA GPU support. This update underscores llama.cpp's commitment to being a versatile inference runtime, though some features remain disabled, indicating ongoing development challenges.
The b9509 release of llama.cpp avoids unnecessary checkpoint restores when new tokens are detected. The runtime now applies a conservative -1 subtraction only when no new tokens are present, which reduces redundant KV state restoration. Workloads that repeatedly extend an existing context stand to benefit most, since less cached state has to be restored between requests. The release does not introduce new models or architectures; it ships for macOS, Linux, and Windows, including builds with ROCm 7.2 and CUDA 12 and 13.
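The rule described for b9509 can be illustrated with a small sketch. This is not llama.cpp's code: the function names `common_prefix_len` and `tokens_to_reuse` are hypothetical, checkpoints themselves are not modeled, and the sketch assumes the -1 exists so that at least one token is re-evaluated when the cache already covers the whole prompt.

```python
from typing import Sequence


def common_prefix_len(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Count how many leading tokens the cache and the prompt share."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reuse(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Return how many cached tokens can be kept for this prompt.

    Hypothetical helper: the -1 is applied only when the prompt adds
    nothing beyond the shared prefix (assumption: one token must still
    be evaluated in that case). When new tokens exist, the full shared
    prefix is kept, so no extra state needs to be rolled back.
    """
    n_keep = common_prefix_len(cached, prompt)
    has_new_tokens = len(prompt) > n_keep
    if not has_new_tokens and n_keep > 0:
        n_keep -= 1  # conservative step back, only without new tokens
    return n_keep


if __name__ == "__main__":
    cache = [1, 2, 3]
    print(tokens_to_reuse(cache, [1, 2, 3, 4]))  # new token present
    print(tokens_to_reuse(cache, [1, 2, 3]))     # nothing new
```

In this sketch, a prompt that extends the cache keeps the entire shared prefix, while an identical prompt steps back by one token. Applying the subtraction unconditionally would discard one reusable token on every request, which is the kind of redundant restoration the release notes describe avoiding.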
© Google Research Blog
Google has open-sourced its advanced AI-based hydrology model, aiming to enhance global flood forecasting capabilities. This move allows National Meteorological and Hydrological Services to integrate sophisticated AI tools into their workflows, potentially improving the accuracy and timeliness of flood warnings. By releasing the model on GitHub, Google empowers local experts to refine and adapt the technology using their own data, fostering a more resilient approach to flood management. This initiative democratizes access to cutting-edge forecasting tools, especially benefiting regions with limited resources.
© TechCrunch AI
Microsoft's new Agent Control Specification (ACS) offers developers a unified way to manage AI agent behavior across various environments. By allowing teams to define specific policies, ACS ensures agents operate within set boundaries, reducing the risk of unintended actions. This open-source standard integrates controls into a common governance layer, making it easier to audit and reuse across different systems. With ACS, developers can maintain consistent oversight, enhancing both security and compliance in AI deployments.
© Lev Selector
Cohere has open-sourced its Command A+ model, making it accessible for public use.