
AWS has introduced a new version of OpenSearch Serverless, tailored for the unique demands of AI agents. This update allows for dynamic scaling of compute resources, addressing the unpredictable nature of agent-driven workloads. By separating compute from storage, AWS ensures that users only pay for active usage, potentially lowering costs. This move is part of a larger industry trend as cloud providers adapt to increasing machine-generated traffic. The integration with AI development platforms like Vercel and Kiro further simplifies deployment for developers.
© TechCrunch AI
Glean has achieved a remarkable $300 million in annual recurring revenue, tripling its figures in just 15 months. This growth is particularly notable as the company faces new competition from tech giants like Google and Microsoft in the enterprise AI search market. Glean's edge lies in its 'context graph' technology, which enhances AI efficiency by reducing computing costs for enterprises. This feature is increasingly appealing to businesses aiming to manage their AI budgets more effectively. As the market becomes more crowded, Glean's ability to offer tailored AI solutions gives it a significant advantage. The company's revenue model, which includes both consumption-based and hybrid pricing, reflects its adaptability to client needs.
© TechCrunch AI
Asana's acquisition of StackAI marks a strategic move to enhance its AI capabilities and position itself as a leader in AI-native workplace platforms. By integrating StackAI's no-code agent-building technology, Asana aims to deepen its integration into existing business systems like Salesforce and Slack, offering more sophisticated automation solutions. This acquisition is part of Asana's broader AI pivot, which includes products like AI Studio and AI Teammates. Despite recent market challenges, Asana's leadership is optimistic that these advancements will drive growth and recovery.
© TechCrunch AI
Anthropic's latest $65 billion funding round propels it to a near $1 trillion valuation, setting the stage for a potential IPO. This massive influx of capital, led by major investors like Altimeter Capital and Sequoia, reflects the intense interest in AI startups. The funds are earmarked for advancing safety research and scaling its Claude model, which has seen significant enterprise adoption. As Anthropic competes with OpenAI, this round highlights the escalating stakes in the AI race, with both companies eyeing public market debuts. The strategic involvement of partners like Samsung and Amazon further amplifies Anthropic's growth trajectory. With these resources, Anthropic is poised to enhance its AI capabilities and expand its market presence.
The vLLM v0.20.2 release is a minor update focusing on bug fixes for DeepSeek V4, gpt-oss, and Qwen3-VL. This patch addresses specific issues such as the MTP=1 hang on DeepSeek V4 by re-enabling the persistent topk path and fixing a KV cache allocation error. For gpt-oss, the update ensures compatibility with MXFP4 under torch.compile, while Qwen3-VL sees the removal of an invalid boundary check. These fixes enhance the stability and performance of the models, ensuring smoother operations under various conditions.
The b9387 release of llama.cpp improves quantized matrix multiplication performance on AMD MFMA hardware. By optimizing the batch threshold logic, the update delivers throughput gains of up to 76% in certain configurations. It is most relevant for users of AMD's MI250X hardware, since it fine-tunes the kernel selection logic to maximize performance. The release doesn't introduce new models, but it makes existing operations more efficient on these AMD GPUs.
The latest b9388 release of llama.cpp introduces optimizations for Turing architecture, specifically adding MMVQ_PARAMETERS_TURING to improve JIT compilation for SM75 Turing devices. This update aims to prevent mismatches when compiling Turing device code on Ampere or newer architectures. While the release doesn't introduce new models or quantization methods, it continues to expand platform support, including updates for macOS, Linux, and Windows. The focus remains on refining compatibility and performance across diverse hardware configurations, making llama.cpp a more versatile tool for developers.