
Hewlett Packard Enterprise (HPE) and NVIDIA have expanded their AI Factory to support the development and deployment of agentic AI systems. This includes the introduction of the NVIDIA Vera CPU, designed for real-time data processing in agent systems, and the NVIDIA Agent Toolkit for managing autonomous agents. The expansion also features NVIDIA Confidential Computing for secure data handling. This collaboration aims to facilitate the transition of agentic AI from proof of concept to production, providing enterprises with the necessary tools and infrastructure for advanced AI applications.
© NVIDIA Blog
France is making significant strides in AI infrastructure, leveraging NVIDIA technologies to enhance its capabilities. With the construction of a new 44-megawatt data center by Mistral and the deployment of NVIDIA Blackwell instances by Scaleway, France is positioning itself as a key player in Europe's AI landscape. The collaboration with NVIDIA and other partners is fostering the development of open models tailored to local languages and cultural contexts, ensuring compliance with European regulations. This initiative marks a shift from pilot projects to full-scale AI production, promising to accelerate AI adoption across various industries in France.
© NVIDIA Blog
NVIDIA XR AI is transforming how AI agents interact with the physical world by integrating with AR glasses and XR devices. This developer library allows for the creation of spatially aware, multimodal AI agents that can perceive, reason, and act in real time, providing low-latency, context-aware assistance. By leveraging NVIDIA's accelerated computing platforms, these agents can operate effectively in dynamic environments, from factory floors to research labs. This development marks a significant step in embedding AI into everyday workflows, enhancing productivity and decision-making across various industries.
© NVIDIA Blog
Coherent's groundbreaking for an expanded facility in Sherman, Texas, marks a significant step in bolstering the U.S. semiconductor manufacturing landscape. This expansion will enhance the production of indium phosphide wafers, crucial for the optical components that form the backbone of AI systems. Supported by a $50 million CHIPS Act grant, this move aligns with broader efforts to reindustrialize the U.S. and strengthen domestic supply chains. As AI systems grow, the need for efficient optical connectivity becomes paramount, and Coherent's facility aims to meet this demand, ensuring that AI infrastructure can scale effectively.
The latest release candidate for vLLM, version 0.22.1rc1, introduces a change in the Docker setup by removing the use of extra-index-url for the flashinfer-jit-cache. This adjustment simplifies the Docker configuration, potentially reducing dependency management issues and improving build reliability. While this update might seem minor, it reflects ongoing efforts to streamline the development process and enhance the usability of vLLM for developers. This change is particularly relevant for those maintaining Docker environments and looking for more efficient ways to manage dependencies.
The latest b9688 release of llama.cpp introduces significant updates to its server capabilities, including a new model management API and real-time SSE updates. These enhancements aim to streamline the deployment and management of AI models, making it easier for developers to integrate and maintain models in various environments. The update also includes a download API and a delete endpoint, providing more control over model assets. While the release doesn't introduce new models, it strengthens the infrastructure, making llama.cpp a more robust choice for developers working with diverse hardware configurations.
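For readers who have not consumed Server-Sent Events before, the sketch below shows how a client might split an SSE text stream into events. It is a minimal illustration of the standard SSE wire format only; the event names (`progress`, `done`) and the JSON payload fields are hypothetical and are not taken from the llama.cpp release.

```python
import json
import re


def parse_sse(stream_text: str) -> list[dict]:
    """Split Server-Sent Events text into a list of {'event', 'data'} dicts."""
    events = []
    event_type, data_lines = "message", []
    # SSE lines may end in CRLF, LF, or CR.
    for line in re.split(r"\r\n|\n|\r", stream_text):
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    return events


# Hypothetical payload: event names and JSON fields are assumptions for illustration.
sample = (
    "event: progress\n"
    'data: {"model": "example.gguf", "percent": 50}\n'
    "\n"
    ": keep-alive\n"
    "event: done\n"
    'data: {"model": "example.gguf"}\n'
    "\n"
)
events = parse_sse(sample)
first_payload = json.loads(events[0]["data"])
```

A real client would read the stream incrementally from an HTTP response instead of a complete string, but the per-line rules stay the same.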
The latest release of llama.cpp, version b9689, enhances its Metal backend by adding support for f16 and bf16 tensor types in the concat operator. This update broadens the compatibility of the Metal backend, which previously supported only f32 and i32 types. By templating kernel_concat on type T and adding type-specific pipeline getters, the release lets the same kernel code serve each supported data type. This development is particularly relevant for developers working on macOS and iOS platforms, as it expands the capabilities of AI models running on Apple Silicon and other supported devices.
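The reason a concat operator generalizes easily across types is that it only moves elements and never interprets them. The sketch below illustrates that idea in Python on raw bytes; it is not the Metal kernel, and the function name `concat_rows` and the `ELEMENT_SIZE` table are hypothetical stand-ins for the per-type dispatch described above.

```python
import struct

# Element sizes in bytes for the tensor types named in the release note.
ELEMENT_SIZE = {"f32": 4, "i32": 4, "f16": 2, "bf16": 2}


def concat_rows(a: bytes, b: bytes, rows: int, dtype: str) -> bytes:
    """Concatenate two row-major 2D tensors along the column axis.

    Both buffers must hold `rows` rows. Only bytes are copied, so the
    same routine works for every dtype in ELEMENT_SIZE.
    """
    size = ELEMENT_SIZE[dtype]  # raises KeyError for unsupported types
    if rows <= 0 or len(a) % (rows * size) or len(b) % (rows * size):
        raise ValueError("buffer length does not match rows and dtype")
    a_row = len(a) // rows  # bytes per row of a
    b_row = len(b) // rows  # bytes per row of b
    out = bytearray()
    for r in range(rows):
        out += a[r * a_row:(r + 1) * a_row]
        out += b[r * b_row:(r + 1) * b_row]
    return bytes(out)


# Two f16 tensors of shape 2x2 and 2x1, packed with struct's half-float code "e".
a = struct.pack("<4e", 1.0, 2.0, 3.0, 4.0)
b = struct.pack("<2e", 9.0, 8.0)
result = struct.unpack("<6e", concat_rows(a, b, rows=2, dtype="f16"))
```

Because the routine depends only on element size, bf16 buffers pass through it unchanged even though Python's `struct` module has no bf16 format code.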