
NVIDIA's Vera CPU is being integrated into the new Mission and Vision supercomputers at Los Alamos National Laboratory (LANL), with the aim of advancing scientific discovery through AI. The systems will rely on the Vera CPU, whose performance is described as more than three times that of traditional x86 CPUs, to support agentic AI in scientific research. Expected to be operational by 2027, the supercomputers will serve both classified national security work and fundamental science research. The project continues a long-standing collaboration between LANL and NVIDIA focused on extreme codesign for simulation workloads.
© NVIDIA Blog
NVIDIA and AWS are collaborating to streamline AI deployment at scale, addressing key challenges like low-latency inference and GPU price-performance. The introduction of EC2 G7 instances, powered by NVIDIA RTX PRO 4500 Blackwell GPUs, offers significant performance improvements over previous generations, making them ideal for AI, graphics, and data analytics workloads. Additionally, NVIDIA's cuVS library now powers GPU-accelerated vector indexing in Amazon OpenSearch, drastically reducing costs and time for building large-scale vector databases. This partnership ensures that enterprises can leverage high-performance AI infrastructure without the complexity of managing it themselves.
NVIDIA's new Agent Toolkit is a significant step towards creating specialized AI agents that can be customized and trusted by enterprises. By providing a modular foundation of models, tools, and secure runtime, the toolkit allows businesses to build AI systems tailored to their specific workflows. This development is particularly impactful in industries like life sciences and healthcare, where AI agents can drastically reduce the time needed for complex tasks such as protein design and clinical documentation. The toolkit's open nature ensures that companies can integrate these agents into existing systems, enhancing efficiency and control.
NVIDIA's technology now powers over 400 of the world's 500 fastest supercomputers, as ranked by the TOP500 list. This dominance is driven by the company's GPUs and networking solutions, with NVIDIA Grace CPUs seeing increased adoption. The systems are not only fast but also energy-efficient, as evidenced by their top rankings in the Green500 list. The trend highlights the growing reliance on accelerated computing for AI and scientific research, with high-performance computing increasingly defined by NVIDIA's hardware stack.
The b9767 release of llama.cpp introduces significant improvements to MTP inference by optimizing the mat-vec path for small batches, which enhances decoding efficiency. A new barrier in the NUM_COLS loop of the mul-mat-vec process is expected to boost performance. While no new model architectures are included, this update refines the platform's capabilities across macOS, Linux, and Windows. Notably, it supports macOS Apple Silicon, Ubuntu with ROCm 7.2, and Windows with CUDA 12 and 13. This release continues llama.cpp's focus on performance optimization and compatibility, making it a more powerful tool for developers.
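As a rough illustration of what a small-batch mat-vec path means, the sketch below multiplies a weight matrix by a handful of input vectors by running one matrix-vector product per input column. It is plain Python, not llama.cpp's kernel code; the function names and the `num_cols` variable are hypothetical and only echo the NUM_COLS loop named in the release summary.

```python
def mat_vec(weights, vec):
    """Multiply a row-major matrix (a list of rows) by a single vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def small_batch_mul_mat(weights, columns):
    """Multiply `weights` by a small batch of input vectors.

    Each input vector gets its own mat-vec product, which mirrors the idea
    of looping over a few columns instead of launching a full
    matrix-matrix kernel. Returns one output vector per input vector.
    """
    num_cols = len(columns)  # hypothetical stand-in for NUM_COLS
    return [mat_vec(weights, columns[c]) for c in range(num_cols)]


weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2 weight matrix
columns = [[1.0, 0.0], [0.5, 0.5]]              # a batch of two input vectors
outputs = small_batch_mul_mat(weights, columns)
```

For very small batches, reusing the weight rows across a few columns tends to be cheaper than a general matrix-matrix routine, which is the usual motivation for a dedicated path of this kind.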
The b9768 release of llama.cpp expands its capabilities by integrating Granite Speech Plus, which enhances audio processing with multi-layer concatenation. This update is particularly relevant for developers focused on audio applications, as it resolves naming inconsistencies and standardizes feature layer usage. While no new models are introduced, the release fortifies the existing framework, making it more reliable for audio tasks. This iteration marks a refinement in the tool's functionality, especially for those utilizing its audio features.
The latest b9774 release of llama.cpp brings significant improvements to Vulkan support, enabling backend tests for various mathematical operations like SQR, SQRT, SIN, and COS. This update also enhances the handling of noncontiguous data in norm operations, broadening the library's applicability across different platforms. While the release doesn't introduce new models, it strengthens the existing infrastructure, particularly for developers working with Vulkan and other supported platforms. This makes llama.cpp a more robust choice for those looking to leverage GPU capabilities beyond NVIDIA's CUDA ecosystem.
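To make "noncontiguous data in norm operations" concrete, the sketch below normalizes elements that are not adjacent in memory by reading them through an offset and a stride. It is a plain-Python illustration, not the Vulkan backend's code: RMS normalization is chosen here only as an example of a norm operation, and `rms_norm_strided` and its parameters are hypothetical names.

```python
import math


def rms_norm_strided(buf, offset, stride, n, eps=1e-6):
    """RMS-normalize n elements of a flat buffer, read with a fixed stride."""
    values = [buf[offset + i * stride] for i in range(n)]
    mean_sq = sum(v * v for v in values) / n
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in values]


# A 2 x 3 row-major matrix stored as one flat buffer.
rows, cols = 2, 3
buf = [3.0, 1.0, 2.0,
       4.0, 1.0, 2.0]

# Rows are contiguous (stride 1); columns are noncontiguous (stride = cols).
row0 = rms_norm_strided(buf, offset=0, stride=1, n=cols)
col0 = rms_norm_strided(buf, offset=0, stride=cols, n=rows)
```

A kernel that assumes stride 1 would read the wrong elements for `col0`, so handling the stride explicitly is what lets the same operation run on views such as transposed or sliced tensors without copying them first.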