Google DeepMind's AlphaEvolve, powered by Gemini, has become integral to Google's infrastructure, optimizing TPU designs and cache policies. It has also been deployed commercially, enhancing efficiency in industries like finance, logistics, and advertising. Notably, it doubled training speed for Klarna's models and improved routing efficiency for FM Logistic. This marks a significant step towards AI systems that autonomously learn and optimize, expanding their impact across various sectors.
The latest b9060 release of llama.cpp introduces several new SYCL operations, including FILL, CUMSUM, and DIAG, which expand the library's computational capabilities. This update also addresses a critical issue that caused aborts during test-backend-ops, ensuring more stable performance. With the addition of scope_dbg_print to both new and existing SYCL operations, developers gain enhanced debugging tools. This release continues to broaden llama.cpp's platform support, making it a more versatile tool for developers working across different environments.
The b9066 release of llama.cpp brings notable improvements for CUDA users by integrating cublasSgemmStridedBatched, which collapses the per-batch inner loop of batched matrix multiplications into a single strided GEMM call. This enhancement is designed to boost performance for developers leveraging CUDA. The update also extends compatibility to macOS on Apple Silicon, Ubuntu with ROCm, and Windows with CUDA 12 and 13, so developers can work seamlessly across different systems. While no new models are introduced, the release strengthens llama.cpp's role as a flexible tool for developers working with diverse hardware setups.