
Hugging Face has introduced MolmoMotion, a model that predicts 3D motion trajectories from video frames and language instructions. It represents motion as a sparse set of 3D points, which is cheaper than rendering full video. Alongside the model, Hugging Face released the MolmoMotion-1M dataset and the PointMotionBench benchmark for evaluating accuracy. MolmoMotion outperforms existing methods, which makes it relevant to applications in robotics and video generation.
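To make the efficiency argument concrete, the sketch below counts how many numbers a sparse point trajectory needs compared with a dense RGB clip of the same length. Every size in it (point count, horizon, frame resolution) is a hypothetical value chosen for illustration and does not come from the MolmoMotion release.

```python
# All sizes below are assumptions for illustration, not MolmoMotion's actual settings.
NUM_POINTS = 64      # tracked 3D points (assumed)
NUM_STEPS = 16       # predicted timesteps (assumed)
FRAME_HEIGHT = 480   # dense video frame height (assumed)
FRAME_WIDTH = 640    # dense video frame width (assumed)
CHANNELS = 3         # RGB


def trajectory_values(num_points: int, num_steps: int) -> int:
    """Numbers needed for a sparse trajectory: one (x, y, z) per point per step."""
    return num_points * num_steps * 3


def video_values(height: int, width: int, channels: int, num_steps: int) -> int:
    """Numbers needed for dense frames covering the same number of steps."""
    return height * width * channels * num_steps


sparse = trajectory_values(NUM_POINTS, NUM_STEPS)
dense = video_values(FRAME_HEIGHT, FRAME_WIDTH, CHANNELS, NUM_STEPS)
ratio = dense // sparse

print(f"sparse trajectory: {sparse} values")
print(f"dense video:       {dense} values")
print(f"dense / sparse:    {ratio}x")
```

Under these assumed sizes the dense representation carries several thousand times more values than the point trajectory. The exact ratio depends entirely on the chosen resolution and point count.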
© Hugging Face Blog
The Strands Robots SDK, an open-source toolkit from AWS, simplifies deploying AI models from the Hugging Face Hub to robot hardware. It integrates the LeRobot stack as AgentTools, so developers can build a single agent that handles simulation, policy inference, and deployment to physical robots with minimal code changes. The integration also coordinates multiple robots over a peer mesh network. Because the SDK keeps dataset formats consistent between simulation and hardware, developers can move from testing to real-world use without converting data.
GLM-5.2 targets long-horizon coding tasks with a 1M-token context window. It introduces IndexShare, which reduces computational demands while maintaining performance across extended contexts. The release positions GLM-5.2 as a leading open-source model: it outperforms its predecessor and closes the gap with proprietary models on key benchmarks. Effort level control lets users trade performance against computational cost, which makes the model practical for sustained engineering work that requires extensive context.
Hugging Face has implemented the Agentic Resource Discovery (ARD) specification, an open standard developed in collaboration with companies including Microsoft and Google. ARD lets AI agents discover and use capabilities at runtime without pre-installation, replacing static catalogs with intent-based search. The Hugging Face Discover Tool serves as a reference implementation and provides search access to a wide range of AI skills and services. The standard is a step toward more flexible and scalable agent ecosystems in which tools can be discovered and integrated across federated registries.
The latest release candidate for vLLM, version 0.22.1rc1, changes the Docker setup by removing the use of extra-index-url for the flashinfer-jit-cache. This simplifies the Docker configuration and may reduce dependency management issues and improve build reliability. The change is minor, but it is relevant for anyone who maintains Docker environments for vLLM and wants simpler dependency management.
The b9688 release of llama.cpp updates its server with a new model management API and real-time updates over Server-Sent Events (SSE). It also adds a download API and a delete endpoint, giving developers more control over model assets. These changes make it easier to deploy and manage models in different environments. The release introduces no new models. Instead, it strengthens the server infrastructure for developers working across diverse hardware configurations.
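For readers unfamiliar with SSE, the sketch below shows how a client might decode such a stream. It follows the generic SSE wire format (`event:` and `data:` fields, with a blank line ending each event) and is not taken from llama.cpp. The event name `model_status` and the JSON payload fields are hypothetical, since the summary does not describe the actual event schema or endpoint paths.

```python
import json


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []          # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line terminates the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):         # lines starting with ':' are comments
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):        # one leading space after the colon is dropped
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream contents: the event name and payload shape are assumptions.
SAMPLE_STREAM = [
    ": keep-alive comment\n",
    "event: model_status\n",
    'data: {"model": "example-model", "state": "loading"}\n',
    "\n",
    "event: model_status\n",
    'data: {"model": "example-model", "state": "ready"}\n',
    "\n",
]

events = [(name, json.loads(payload)) for name, payload in parse_sse(SAMPLE_STREAM)]
for name, payload in events:
    print(name, payload["model"], payload["state"])
```

In practice the lines would come from a streaming HTTP response rather than a list. The parser only needs an iterable of text lines, so it can be tested without a running server.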
The b9689 release of llama.cpp extends its Metal backend by adding support for f16 and bf16 tensor types in the concat operator, which previously supported only f32 and i32. The change templates kernel_concat on type T and adds type-specific pipeline getters, so one kernel definition serves every supported data type. This is particularly relevant for developers targeting macOS and iOS, since it expands what models can run on the Metal backend on Apple Silicon and other supported devices.