Llama.cpp has released version b9038, focusing on improving OpenCL memory estimation. The update uses CL_DEVICE_GLOBAL_MEM_SIZE to provide more accurate memory estimates, aiding developers in optimizing AI models. This enhancement is part of a broader effort to support diverse hardware, including macOS, Windows, and Linux platforms. The release does not include new models but enhances the tool's utility for AI inference.
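The estimation itself happens in llama.cpp's C++ OpenCL backend (the device capacity comes from `clGetDeviceInfo` with `CL_DEVICE_GLOBAL_MEM_SIZE`), but the underlying arithmetic is simple. The sketch below is a hypothetical Python illustration of that kind of logic; the function name, the headroom fraction, and the layer sizes are all assumptions, not llama.cpp's actual code:

```python
def estimate_offloadable_layers(global_mem_bytes, layer_bytes, reserve_frac=0.1):
    """Illustrative only: given the device's reported global memory and
    per-layer weight sizes, count how many layers fit on the device.
    `reserve_frac` holds back headroom for context/KV and scratch buffers."""
    usable = int(global_mem_bytes * (1.0 - reserve_frac))
    fit = 0
    used = 0
    for size in layer_bytes:
        if used + size > usable:
            break
        used += size
        fit += 1
    return fit

# Example: a 4 GiB device (as reported via CL_DEVICE_GLOBAL_MEM_SIZE)
# and a hypothetical 32-layer model at ~150 MiB per layer.
layers = [150 * 1024 * 1024] * 32
print(estimate_offloadable_layers(4 * 1024**3, layers))  # → 24
```

With an accurate device-memory figure, an estimate like this lets the runtime pick an offload count that will not fail at allocation time.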
The latest b9041 release of llama.cpp continues its trend of broadening platform compatibility, making it a versatile choice for developers across different environments. Notably, this update includes support for macOS Apple Silicon with KleidiAI enabled, as well as expanded Vulkan and ROCm 7.2 support on Ubuntu. This release doesn't introduce new models but focuses on enhancing the runtime's adaptability across various hardware configurations. By doing so, llama.cpp strengthens its position as a go-to inference runtime for developers seeking flexibility beyond NVIDIA's CUDA ecosystem.
Llama.cpp's latest update expands its functionality by integrating IBM's Granite-Speech, significantly enhancing its audio processing capabilities. The update features a Conformer encoder with Shaw relative position encoding and a QFormer projector, which compresses audio features into a fixed-length sequence in the LLM embedding space. The port reproduces HF transformers' output token-for-token on test audio clips, validating its correctness. By incorporating these audio processing techniques, llama.cpp becomes a more versatile tool for developers, extending its utility beyond text to sophisticated audio handling.
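The core idea of a QFormer-style projector is that a small, fixed set of learned query vectors cross-attends over a variable-length sequence of audio encoder frames, producing a fixed number of embeddings for the LLM regardless of clip length. Below is a minimal NumPy sketch of just that compression step; it omits the real projector's multi-head attention, projection matrices, and layer norms, and all names and dimensions are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def qformer_compress(audio_feats, queries, d):
    """Single-head cross-attention: learned queries attend over audio
    frames, yielding a fixed-size output however long the clip is."""
    scores = queries @ audio_feats.T / np.sqrt(d)  # (n_queries, T)
    attn = softmax(scores, axis=-1)
    return attn @ audio_feats                      # (n_queries, d)

rng = np.random.default_rng(0)
d = 64            # embedding width (illustrative)
T = 500           # variable number of encoder frames
n_queries = 16    # fixed number of learned queries
audio = rng.normal(size=(T, d))
queries = rng.normal(size=(n_queries, d))
out = qformer_compress(audio, queries, d)
print(out.shape)  # → (16, 64)
```

Whatever the input length `T`, the LLM always receives `n_queries` audio tokens, which is what makes the projector's output compatible with a fixed prompt budget.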
The transition from vLLM V0 to V1 represents a major backend overhaul, prioritizing parity before modifying reinforcement learning objectives. By resolving issues such as processed rollout logprobs and runtime defaults, the vLLM team ensured that V1's outputs meet the expectations set by V0. This approach demonstrates the critical role of backend accuracy in preserving training integrity. With these adjustments, V1 now mirrors V0's behavior, creating a stable foundation for future enhancements in RL objectives without the complications of backend discrepancies.
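Parity work of this kind typically reduces to checking that the new backend's rollout logprobs agree with the old one's token-for-token within a numerical tolerance. A hypothetical sketch of such a check, not vLLM's actual test code (the function name and tolerance are assumptions):

```python
import math

def logprobs_match(v0_logprobs, v1_logprobs, atol=1e-5):
    """Illustrative parity check: per-token logprobs from the two
    backends must agree within an absolute tolerance."""
    if len(v0_logprobs) != len(v1_logprobs):
        return False
    return all(math.isclose(a, b, abs_tol=atol)
               for a, b in zip(v0_logprobs, v1_logprobs))

# Agreement up to float noise passes; a real divergence fails.
print(logprobs_match([-0.12, -1.5], [-0.12, -1.5 + 1e-7]))  # → True
print(logprobs_match([-0.12, -1.5], [-0.12, -1.4]))         # → False
```

Gating RL-objective changes behind a check like this keeps backend drift from silently corrupting training signals.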
© TechCrunch AI
Genesis AI, a startup backed by Khosla Ventures, has unveiled its first full-stack robotics model, GENE-26.5, featuring human-like robotic hands. This development marks a significant step as the company aims to bridge the 'embodiment gap' in robotics by mimicking human hand functionality. The robotic hands are capable of performing complex tasks such as cooking and lab work, showcasing their potential for real-world applications. The startup's innovative approach includes a sensor-loaded glove for data collection, which could revolutionize how robots are trained. This move positions Genesis AI as a notable player in the robotics industry, with plans to expand further into general-purpose robotics.