
Ollama has updated its MLX engine to achieve its highest performance yet on Apple Silicon. By utilizing Apple's unified memory and the Metal-backed MLX framework, the engine now delivers faster and more efficient model outputs. The update introduces support for NVIDIA's NVFP4 format, enhancing output quality while maintaining performance. Additionally, new optimizations have increased processing speed by 20%, and a snapshot system improves responsiveness in agent workflows. This update enhances the portability and efficiency of AI models on consumer hardware.
Read originalClaude Code's latest update introduces the Claude Fable 5, a Mythos-class model now safe for general use. This model surpasses previous offerings in capability, marking a significant step forward for developers using Claude Code. Additionally, the update resolves an issue with session transcripts not saving when launched from certain environments. This release enhances both the power and reliability of the Claude Code platform, offering developers a more robust toolset for their projects.
The latest b9590 release of llama.cpp addresses a critical issue where the LFM2 template handler was ignoring the json_schema from response_format, focusing solely on tool-calling grammar. This update ensures more robust handling of JSON schemas, which is crucial for developers relying on precise data formatting. The release also includes a variety of platform-specific builds, though some features like KleidiAI on macOS and SYCL on Windows remain disabled. This update is a step forward in refining the tool's functionality, particularly for those working with complex data structures.
The b9591 release of llama.cpp brings notable improvements to Multi-Task Processing (MTP) by removing padding and optimizing data handling. The update refines the ggml_gated_delta_net function, which now only requires the initial recurrent state and uses a snapshot count as an operational parameter, enhancing processing efficiency. These changes are implemented across all backends, addressing previous review comments and fixing CI build errors. With support for diverse hardware configurations, including macOS Apple Silicon, ROCm 7.2 on Ubuntu, and CUDA 12 and 13 on Windows, this release is a significant step forward for developers seeking improved performance and reliability.