
The latest Replicate blog discusses concepts in GPT models, introduces real-time speech-to-text capabilities in the browser, and announces the upcoming availability of H100 GPUs.
The latest version b8991 of llama.cpp has been released, featuring updates for various operating systems.
The latest update to llama-mmap improves compatibility with various platforms and model sizes. Key enhancements include support for 32-bit wasm and style updates to gguf.cpp.
