
Thinking Machines, an AI company founded by ex-OpenAI CTO Mira Murati, is developing 'interaction models' to enhance real-time collaboration between humans and AI. These models aim to process multiple modalities like audio, video, and text simultaneously, allowing for more natural and responsive interactions. The company plans to release a limited research preview in the coming months, with a wider rollout expected later this year. This development could redefine how AI systems interact with users, making them more intuitive and effective.
Read originalThe b9103 release of llama.cpp continues its trend of broadening platform compatibility, making it a versatile tool for developers across various systems. With this update, Apple Silicon users benefit from KleidiAI support, enhancing performance on M-series Macs. The inclusion of ROCm 7.2 for Ubuntu x64 further narrows the gap between AMD and NVIDIA GPUs, offering more options for local inference. This release doesn't introduce new models but solidifies llama.cpp's position as a go-to runtime for diverse hardware configurations, ensuring developers can deploy AI models efficiently across multiple environments.
The b9109 release of llama.cpp brings notable advancements in parallel drafting, enhancing the efficiency of model processing. By refining speculative contexts and supporting multiple spec types, the update optimizes the acceptance of tokens and the drafting process. This release ensures compatibility with macOS, Linux, and Windows, including specific support for Apple Silicon with KleidiAI, ROCm 7.2, and CUDA 12 and 13. While it doesn't introduce new model architectures, the focus on refining existing capabilities makes llama.cpp a more robust tool for developers. The improvements in speculative processing and platform-specific enhancements make it a valuable update for those working with AI models.