Models & Labs

Google Unveils Gemini Omni and Gemini 3.5 Models

Google AI Blog · May 29, 2026 · high confidence

Why it matters

→ Gemini Omni lets users create and edit video through natural-language conversation.
→ Gemini 3.5 Flash targets agentic tasks and coding, which matters for complex, multi-step workflows.
→ Integration across Google products, including Search and the Gemini app, puts these capabilities in front of everyday users.


Google introduced its latest AI models, Gemini Omni and Gemini 3.5, at Google I/O 2026. Gemini Omni lets users create and edit videos through conversational commands. Gemini 3.5 Flash focuses on agentic tasks and coding, with capabilities aimed at complex workflows. The models are being integrated across Google's platforms, including Search and the Gemini app, where they power personalized AI agents and interactive tools.


More from Google AI Blog

Models & Labs · agents

Google Enhances Gemini API with Managed Agents

Google's latest update to the Gemini API introduces managed agents with new capabilities such as environment hooks and model selection. The agents can now execute complex tasks autonomously within a cloud sandbox. A free tier allows experimentation at no cost, and budget controls guard against excessive resource consumption. The update is aimed at developers who want to automate their coding workflows.

Google AI Blog · Jul 28, 2026

More in Models & Labs

Models & Labs · models

llama.cpp Adds GLM-5.2 Speculative Decoding Support

The latest llama.cpp update adds speculative decoding support for GLM-5.2 using its NextN/MTP (multi-token prediction) features. The change covers tensor loading and context management, particularly for models using the GLM_DSA architecture. It also adds options for exporting models with or without the MTP feature, which gives developers some flexibility in how they package GLM-5.2.
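For readers unfamiliar with the technique, speculative decoding has a cheap drafter propose several tokens, which the full model then verifies, keeping the longest run it agrees with. The following is a minimal greedy sketch in Python using toy stand-in models. It is not llama.cpp's implementation, and every name in it (`speculative_decode`, `toy_target`, `toy_draft`, the `k` parameter) is hypothetical. In an MTP setup the draft tokens would typically come from the model's own extra prediction heads rather than from a separate model.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token context to one greedy next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy draft-and-verify loop producing n_new tokens."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1. The drafter proposes up to k tokens autoregressively.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(min(k, goal - len(out))):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2. The target checks each proposed position in order. A real runtime
        #    would score all positions in one batched forward pass.
        for tok in proposal:
            expected = target(out)
            out.append(expected)      # always keep the target's own choice
            if expected != tok:       # first disagreement: discard the rest
                break
    return out


# Toy models (assumptions for illustration only).
def toy_target(ctx: List[int]) -> int:
    return (sum(ctx) * 31 + len(ctx)) % 50


def toy_draft(ctx: List[int]) -> int:
    # Agrees with the target except when the context length is a multiple of 3.
    guess = toy_target(ctx)
    return (guess + 1) % 50 if len(ctx) % 3 == 0 else guess
```

Because every kept token is the target's own greedy choice, the output matches plain greedy decoding. Any speedup in a real runtime comes from verifying the whole draft in a single batched pass, which this toy loop does not model.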

llama.cpp Releases · Jul 30, 2026

Models & Labs · models

llama.cpp b10178 Release Adds Trace Logging

The b10178 release of llama.cpp adds trace logging for slot similarity checking in the server, giving developers visibility into how a prompt cache slot is selected. The trace output includes skip reasons and similarity calculations, which can help with performance tuning. No new model architectures are introduced. The release continues to support a wide range of platforms, such as macOS with KleidiAI, Ubuntu with ROCm 7.2, and Windows with CUDA 12 and 13.
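As an illustration of what such a trace might report, the sketch below scores each cached slot by the fraction of the incoming prompt covered by a shared token prefix and records why a slot is skipped. The scoring rule, the `threshold` parameter, the function names, and the message wording are all assumptions for illustration, not llama.cpp's actual selection logic or log format.

```python
from typing import Dict, List, Optional, Tuple


def common_prefix_len(a: List[int], b: List[int]) -> int:
    """Number of leading tokens that two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def select_slot(
    prompt: List[int],
    slots: Dict[int, List[int]],  # hypothetical: slot id -> cached prompt tokens
    threshold: float = 0.5,       # hypothetical minimum similarity to reuse a slot
) -> Tuple[Optional[int], List[str]]:
    """Return the most similar slot id (or None) plus a human-readable trace."""
    trace: List[str] = []
    if not prompt:
        trace.append("no slot selected: empty prompt")
        return None, trace

    best_id: Optional[int] = None
    best_sim = 0.0
    for slot_id, cached in slots.items():
        if not cached:
            trace.append(f"slot {slot_id}: skipped (empty cache)")
            continue
        lcp = common_prefix_len(prompt, cached)
        sim = lcp / len(prompt)   # share of the prompt already cached
        if sim < threshold:
            trace.append(
                f"slot {slot_id}: skipped (similarity {sim:.2f} < {threshold:.2f})"
            )
            continue
        trace.append(f"slot {slot_id}: candidate (similarity {sim:.2f})")
        if best_id is None or sim > best_sim:
            best_id, best_sim = slot_id, sim

    trace.append(f"selected slot: {best_id}")
    return best_id, trace
```

Printing the returned trace line by line gives a per-slot account of the decision, which is the kind of detail that helps explain an unexpected cache miss.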

llama.cpp Releases · Jul 30, 2026

Models & Labs · models

llama.cpp b10180 Release Enhances SYCL Performance

The b10180 release of llama.cpp improves SYCL performance for unary elementwise operations. It introduces a contiguous fast path, switches to 32-bit index math, and uses fastdiv for elementwise index calculations, all aimed at reducing per-element overhead. The release adds no new models. llama.cpp continues to support macOS, Linux, and Windows; the speedups described here apply to the SYCL backend.
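The term fastdiv usually refers to replacing division by a fixed divisor with a precomputed multiply and shift, which is cheaper than a hardware divide on many GPUs. The sketch below shows the standard 32-bit multiply-high formulation in Python. Whether the SYCL kernels use exactly this formulation is an assumption, and the function names are hypothetical.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def init_fastdiv(d: int) -> Tuple[int, int]:
    """Precompute (multiplier, shift) for a fixed divisor 1 <= d < 2**32."""
    assert 1 <= d <= MASK32
    shift = (d - 1).bit_length()          # ceil(log2(d)); 0 when d == 1
    mult = ((1 << 32) * ((1 << shift) - d)) // d + 1
    return mult, shift


def fastdiv(n: int, mult: int, shift: int) -> int:
    """Compute n // d for 0 <= n < 2**32 without a division."""
    hi = (n * mult) >> 32                 # high half of the 32x32-bit product
    # Note: hi + n can need 33 bits. Python integers do not overflow, but a
    # real 32-bit kernel must bound n or widen this addition.
    return (hi + n) >> shift


def unravel_2d(idx: int, ne0: int, fd: Tuple[int, int]) -> Tuple[int, int]:
    """Split a flat element index into (row, col) for a row length of ne0."""
    row = fastdiv(idx, *fd)
    col = idx - row * ne0                 # remainder without a modulo
    return row, col


# Usage: precompute once per tensor shape, then reuse for every element.
fd = init_fastdiv(7)
row, col = unravel_2d(23, 7, fd)          # 23 == 3 * 7 + 2
```

The precomputation happens once per divisor, so the per-element cost is one multiply, one add, and one shift. A contiguous fast path goes a step further by skipping this index math entirely when elements are laid out back to back.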

llama.cpp Releases · Jul 30, 2026