
NVIDIA has enhanced Google DeepMind's DiffusionGemma model to achieve faster text generation on NVIDIA GPUs. DiffusionGemma, built on the Gemma 4 architecture, generates text in parallel blocks rather than sequentially, allowing for up to 4x faster performance. This model is optimized for NVIDIA's hardware, including GeForce RTX and DGX systems, enabling local, low-latency AI applications. The open model is available under an Apache 2.0 license and can be accessed through platforms like Hugging Face Transformers.
Read original
© NVIDIA BlogNVIDIA is making significant strides in the robotaxi industry with the introduction of its Halos Operating System, designed to enhance safety and reliability in autonomous vehicles. This system, built on the NVIDIA DRIVE Hyperion platform, integrates a certified OS foundation, standardized interfaces, and safety guardrails for AI, ensuring vehicles operate within verifiable limits. The Halos OS also includes a comprehensive safety evaluation framework, drawing from extensive research and patents, to support scalable deployment. This development marks a crucial step in making autonomous vehicles safer and more reliable, paving the way for broader adoption in cities worldwide.
© NVIDIA BlogNVIDIA's Confidential Computing technology is now a key component of Apple's Private Cloud Compute, which is expanding its capabilities to Google Cloud. Announced at Apple's WWDC, this integration uses NVIDIA Blackwell GPUs to bolster server-side inference for Apple's Foundation Models. The technology ensures that sensitive data remains protected during processing, marking a significant advancement in secure AI infrastructure. This development demonstrates the growing necessity of merging high-performance AI processing with stringent privacy and security measures.
Claude Code's latest update introduces the Claude Fable 5, a Mythos-class model now safe for general use. This model surpasses previous offerings in capability, marking a significant step forward for developers using Claude Code. Additionally, the update resolves an issue with session transcripts not saving when launched from certain environments. This release enhances both the power and reliability of the Claude Code platform, offering developers a more robust toolset for their projects.
The latest b9590 release of llama.cpp addresses a critical issue where the LFM2 template handler was ignoring the json_schema from response_format, focusing solely on tool-calling grammar. This update ensures more robust handling of JSON schemas, which is crucial for developers relying on precise data formatting. The release also includes a variety of platform-specific builds, though some features like KleidiAI on macOS and SYCL on Windows remain disabled. This update is a step forward in refining the tool's functionality, particularly for those working with complex data structures.
The b9591 release of llama.cpp brings notable improvements to Multi-Task Processing (MTP) by removing padding and optimizing data handling. The update refines the ggml_gated_delta_net function, which now only requires the initial recurrent state and uses a snapshot count as an operational parameter, enhancing processing efficiency. These changes are implemented across all backends, addressing previous review comments and fixing CI build errors. With support for diverse hardware configurations, including macOS Apple Silicon, ROCm 7.2 on Ubuntu, and CUDA 12 and 13 on Windows, this release is a significant step forward for developers seeking improved performance and reliability.