Google DeepMind, along with Schmidt Sciences and other partners, has announced a $10 million funding initiative to support research in multi-agent AI safety. This effort aims to address the challenges posed by the interaction of numerous AI agents across digital environments. The funding will support global researchers in developing frameworks to understand and mitigate the risks associated with these interactions. The initiative underscores the importance of establishing safety standards as AI systems become more interconnected and complex.
© Google DeepMind
Google DeepMind's DiffusionGemma applies diffusion techniques to text generation, producing text blocks up to four times faster than traditional models. This 26B Mixture-of-Experts model, designed for speed-critical applications, moves beyond the sequential token-by-token approach and generates 256 tokens in parallel. It offers very fast inference on GPUs, but trades off some quality compared to the standard Gemma 4 models. The design is particularly useful for developers building real-time interactive AI applications, as it maximizes hardware utilization and reduces latency bottlenecks.
© Google DeepMind
Google DeepMind's Gemini 3.5 Live Translate marks a significant leap in real-time speech translation, offering fluid and natural-sounding translations across 70+ languages. Unlike traditional systems, it provides continuous translation, maintaining the speaker's intonation and pacing, and operates just seconds behind the speaker. This model is now available for developers via the Gemini Live API and is being integrated into Google Meet and the Google Translate app. The rollout promises to enhance multilingual communication in various settings, from business meetings to everyday conversations.
© Google DeepMind
Google DeepMind's Gemma 4 12B model is a significant step forward in multimodal AI, offering advanced capabilities in a compact form. By eliminating traditional encoders, it processes visual and audio inputs directly through its language model backbone, reducing latency and memory usage. This makes it feasible to run sophisticated AI tasks on consumer laptops with just 16GB of RAM. The model's open-source release under an Apache 2.0 license encourages widespread adoption and innovation, enabling developers to create powerful applications without the need for high-end hardware.
© MIT News AI
MIT researchers have shown that Random Utility Models (RUMs) can be learned more accurately when people choose among three alternatives instead of two, because three-way choices reveal correlations in preferences. This result challenges the traditional pairwise comparison method, which fails to capture how preferences for different options are interconnected. Using a best-of-three approach, the team has developed algorithms that efficiently extract preference information and yield more accurate predictions. The advance is relevant to AI models and their commercial applications, particularly in areas like large language models and digital platforms.
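To illustrate why three-way choices can expose correlations that pairwise comparisons hide, here is a minimal simulation sketch. It is not the MIT team's algorithm; it is a toy Gaussian RUM with assumed parameters (the function name `simulate_choices`, the correlation `rho`, and the three items A, B, C are all hypothetical). A and B have correlated utilities, C is independent, and all three have the same mean and variance.

```python
import random


def simulate_choices(rho, n_trials=100_000, seed=0):
    """Estimate choice frequencies under a toy Gaussian random utility model.

    Assumption for illustration: utilities of A and B have correlation rho,
    C is independent, and every utility has mean 0 and variance 1.
    """
    rng = random.Random(seed)
    a_beats_c = 0  # pairwise comparison: A preferred over C
    c_is_best = 0  # best-of-three: C chosen from {A, B, C}
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # component common to A and B
        a = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        b = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        c = rng.gauss(0.0, 1.0)
        if a > c:
            a_beats_c += 1
        if c > a and c > b:
            c_is_best += 1
    return a_beats_c / n_trials, c_is_best / n_trials


# Same marginal utilities, different correlation between A and B.
pair_indep, triple_indep = simulate_choices(rho=0.0)
pair_corr, triple_corr = simulate_choices(rho=0.95)

print(f"P(A beats C), independent: {pair_indep:.3f}")
print(f"P(A beats C), correlated:  {pair_corr:.3f}")
print(f"P(C is best of three), independent: {triple_indep:.3f}")
print(f"P(C is best of three), correlated:  {triple_corr:.3f}")
```

In this toy setup the pairwise frequency stays near one half whatever the correlation, so pairwise data cannot distinguish the two models. The best-of-three frequency for C is near one third when the utilities are independent and rises toward one half as A and B become nearly interchangeable, which is the kind of signal a three-way comparison can pick up.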
Hugging Face's blog post dives into the profiling of PyTorch operations, focusing on the shift from basic matrix operations to using nn.Linear and constructing a Multilayer Perceptron (MLP). The article reveals how nn.Linear manages operations by integrating bias addition into the matrix multiplication kernel, effectively reducing overhead. It also examines the limited impact of torch.compile on single operations, pointing out its potential in more complex scenarios. These insights are crucial for developers aiming to optimize deep learning models on GPUs, as they provide a deeper understanding of how to maximize performance and efficiency.
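The fused bias addition described above can be checked with a short sketch. This is not the code from the Hugging Face post; it assumes a CPU-only run with small, arbitrary tensor shapes, and uses `torch.addmm` as a stand-in for the fused matmul-plus-bias path. Which kernels actually appear in the profiler output depends on the PyTorch build and device.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile

torch.manual_seed(0)

# Assumed toy sizes: batch of 3, 8 input features, 4 output features.
layer = nn.Linear(8, 4)
x = torch.randn(3, 8)

with torch.no_grad():
    # Two separate operations: a matmul followed by a bias add.
    unfused = x @ layer.weight.T + layer.bias

    # One call that computes bias + x @ W^T together.
    fused = torch.addmm(layer.bias, x, layer.weight.T)

    # Profile the nn.Linear forward pass to see which ops it dispatches to.
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        out = layer(x)

print(torch.allclose(out, unfused, atol=1e-5))
print(torch.allclose(out, fused, atol=1e-5))
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=8))
```

All three computations produce the same values up to floating-point tolerance; the difference is in how many kernels are launched, which is what the profiler table helps inspect. On a GPU, the same comparison would use `ProfilerActivity.CUDA` and tensors moved to the device.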
© AI Explained
The inventor of the transformer model has issued a warning regarding potential risks associated with AI advancements.