vLLM has released version 0.20.1, which drops the deadsnakes PPA in favor of building Python from source. The change avoids the slow and unreliable waits on the Launchpad PPA servers and is expected to streamline installation for users.
© GitHub Changelog
GitHub will deprecate the GPT-5.2 and GPT-5.2-Codex models by June 1, 2026. Users are advised to update their workflows to supported models.
The latest update to ggml-webgpu addresses vectorized handling in the mul-mat and mul-mat-id functions. This release includes support for various operating systems and architectures.
The v0.19.0rc0 release introduces a feature for CPU key-value cache offloading, enhancing performance. This update was signed off by Yifan Qiao.