The b9010 release of llama.cpp fixes a bug in CUDA PCI bus ID detection that caused additional GPUs to be ignored, leading to out-of-memory errors. The fix improves multi-GPU support, particularly for users on Windows. The update also includes platform-specific builds for macOS, Linux, and Windows, with notable support for Apple Silicon and Vulkan. This release focuses on stability and compatibility rather than new features.
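For context, here is a minimal sketch of how CUDA devices and their PCI bus IDs can be enumerated with the CUDA runtime API. This is illustrative only, not llama.cpp's actual detection code; a bug of the kind described above would surface here as a missing or duplicated bus ID.

```cpp
// Illustrative sketch (not llama.cpp's code): list each visible CUDA device
// together with its PCI bus ID using the CUDA runtime API.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA devices visible\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        char bus_id[64] = {0};
        // cudaDeviceGetPCIBusId writes a string such as "0000:3B:00.0".
        if (cudaDeviceGetPCIBusId(bus_id, sizeof(bus_id), dev) == cudaSuccess) {
            printf("device %d -> PCI bus ID %s\n", dev, bus_id);
        }
    }
    return 0;
}
```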
The b9008 release of llama.cpp continues its trend of broadening platform support, making it a versatile tool for developers across various systems. This update includes new builds for macOS, Linux, Windows, and Android, with notable additions like Vulkan support on Ubuntu and Windows, and ROCm 7.2 on Ubuntu. By enhancing compatibility with different architectures, including Apple Silicon and Intel on macOS, and CUDA on Windows, llama.cpp is positioning itself as a go-to runtime for diverse hardware environments. While there are no groundbreaking new features, the release solidifies llama.cpp's role as a flexible and accessible inference tool for developers.
The b9002 release of llama.cpp is available, with builds for multiple platforms.
The b9004 release of llama.cpp introduces support for various platforms, including macOS, Linux, Android, and Windows.
DeepSeek V4 Pro is a new AI model with 1.6 trillion parameters.
DeepSeek has launched a preview of its V4 model.
NVIDIA has introduced the Nemotron 3 Nano Omni multimodal AI agent.