The b9596 release of llama.cpp introduces expanded platform support, including ROCm 7.2 for Ubuntu x64, enhancing usability for AMD GPU users. This update aims to reduce the performance gap between AMD and NVIDIA GPUs, making llama.cpp more accessible across different systems. While certain features like KleidiAI on macOS remain disabled, the release still represents a significant step forward in platform compatibility. This update allows developers to explore improved performance on a wider range of hardware configurations.
Read originalThe latest b9590 release of llama.cpp addresses a critical issue where the LFM2 template handler was ignoring the json_schema from response_format, focusing solely on tool-calling grammar. This update ensures more robust handling of JSON schemas, which is crucial for developers relying on precise data formatting. The release also includes a variety of platform-specific builds, though some features like KleidiAI on macOS and SYCL on Windows remain disabled. This update is a step forward in refining the tool's functionality, particularly for those working with complex data structures.
The b9591 release of llama.cpp brings notable improvements to Multi-Task Processing (MTP) by removing padding and optimizing data handling. The update refines the ggml_gated_delta_net function, which now only requires the initial recurrent state and uses a snapshot count as an operational parameter, enhancing processing efficiency. These changes are implemented across all backends, addressing previous review comments and fixing CI build errors. With support for diverse hardware configurations, including macOS Apple Silicon, ROCm 7.2 on Ubuntu, and CUDA 12 and 13 on Windows, this release is a significant step forward for developers seeking improved performance and reliability.
The b9601 release of llama.cpp significantly extends its reach by supporting more platforms, enhancing its utility for developers. This update includes Ubuntu builds with ROCm 7.2, which is a boon for AMD GPU users seeking alternatives to NVIDIA's CUDA. Although features like KleidiAI on macOS and SYCL on Windows are currently disabled, the release still represents a meaningful step in making llama.cpp adaptable to a wider range of hardware. While no new models are introduced, the focus on expanding runtime compatibility marks a strategic move to increase the tool's versatility.
OpenEnv is evolving into a pivotal open-source tool for agentic reinforcement learning (RL), now backed by a coalition of major AI organizations including Meta-PyTorch, Nvidia, and Hugging Face. This initiative aims to standardize the interface between RL environments and trainers, promoting interoperability and efficiency. By serving as a common socket for various RL components, OpenEnv facilitates seamless integration across different ecosystems. This move is set to enhance the development of specialized models and harnesses, making RL more accessible and efficient for the open-source community.
© Lev SelectorJetBrains has open-sourced Mellum2, a 12 billion parameter mixture of experts model.
© Google Research BlogGoogle has open-sourced its advanced AI-based hydrology model, aiming to enhance global flood forecasting capabilities. This move allows National Meteorological and Hydrological Services to integrate sophisticated AI tools into their workflows, potentially improving the accuracy and timeliness of flood warnings. By releasing the model on GitHub, Google empowers local experts to refine and adapt the technology using their own data, fostering a more resilient approach to flood management. This initiative democratizes access to cutting-edge forecasting tools, especially benefiting regions with limited resources.