HONG KONG, May 19, 2026 /PRNewswire/ -- Tencent Cloud, the cloud business of global technology company Tencent, today announced a strategic collaboration with Stream, the company behind the open-source AI agent framework Vision Agents, to accelerate the development of real-time, multimodal AI agents.
Through this collaboration, Tencent Real-Time Communication (Tencent RTC) becomes an officially supported edge transport plugin for Vision Agents, giving developers worldwide a low-latency path to build and deploy interactive AI applications across global markets, including regions where network complexity and real-time performance are critical.
Unlocking Lower-Latency Transport for Enhanced Experiences Across China and Asia
Vision Agents is an open-source, edge-agnostic Python framework from Stream that helps developers quickly build low-latency vision AI applications. Rather than retrofitting video onto a voice-centric stack, Vision Agents was designed as a video-first solution — running models such as YOLO, Roboflow, OpenAI Realtime, and Google Gemini on every frame, with sub-500ms end-to-end latency and over 25 out-of-the-box integrations across LLM, STT, TTS, vision, RAG, telephony, and avatar providers.
As an edge transport plugin, Tencent RTC can replace the default communication layer in Vision Agents, giving developers immediate access to Tencent Cloud’s enterprise-grade backbone (more than 3,200 global nodes, sub-300ms worldwide latency, AI-driven noise suppression, and weak-network resilience) while keeping every existing LLM, STT, TTS, vision, and avatar plugin unchanged.
The integration supports both audio and video, making it suitable for voice agents, video agents, and multimodal scenarios, with use cases such as gaming assistants, virtual avatars, sports coaching, and robotics. AI agents can join Tencent RTC (TRTC) rooms and interact with participants in real time through high-quality audio and video streams.
Tencent RTC operates a globally distributed real-time network with particularly strong performance in some markets in Asia where many global real-time stacks face connectivity and latency challenges. By integrating Tencent RTC, Vision Agents gives developers worldwide a reliable transport option for delivering low-latency, multimodal AI experiences to users in Asia. Developers can improve real-time communication performance simply by swapping the transport plugin, leaving the rest of the agent unchanged.
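For readers who want a picture of what swapping the transport layer means in practice, the following is a minimal, self-contained sketch of the pluggable-transport idea. It is not the Vision Agents or Tencent RTC API: the names `EdgeTransport`, `DefaultEdge`, `TencentRTCEdge`, `Agent`, and `swap_edge` are hypothetical stand-ins chosen for illustration, and the plugin values are placeholder strings.

```python
from dataclasses import dataclass, replace
from typing import Protocol


class EdgeTransport(Protocol):
    """Hypothetical interface: anything that can carry media for an agent."""

    name: str

    def join(self, room_id: str) -> str: ...


@dataclass(frozen=True)
class DefaultEdge:
    """Stand-in for a framework's default communication layer."""

    name: str = "default"

    def join(self, room_id: str) -> str:
        return f"{self.name} transport joined room {room_id}"


@dataclass(frozen=True)
class TencentRTCEdge:
    """Stand-in for a Tencent RTC transport plugin (illustrative only)."""

    name: str = "tencent-rtc"

    def join(self, room_id: str) -> str:
        return f"{self.name} transport joined room {room_id}"


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent: one transport plus independent model plugins."""

    edge: EdgeTransport
    llm: str
    stt: str
    tts: str

    def join(self, room_id: str) -> str:
        # The agent delegates all media transport to whichever edge it holds.
        return self.edge.join(room_id)


def swap_edge(agent: Agent, edge: EdgeTransport) -> Agent:
    """Return a copy of the agent with only the transport replaced."""
    return replace(agent, edge=edge)


if __name__ == "__main__":
    agent = Agent(edge=DefaultEdge(), llm="llm-plugin", stt="stt-plugin", tts="tts-plugin")
    asia_agent = swap_edge(agent, TencentRTCEdge())
    print(asia_agent.join("demo"))
```

Because the agent in this sketch depends only on the transport interface, changing the edge leaves the LLM, STT, and TTS settings untouched, which is the property the announcement describes.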
Wison Xie, Head of Product at Tencent RTC, said: “Vision Agents represents exactly where conversational AI is heading, beyond voice-only, into agents that can truly see, hear, and act in real time. By bringing Tencent RTC’s global real-time backbone to the Vision Agents framework, we are giving developers worldwide a turnkey path to ship multimodal agents that perform reliably from Silicon Valley to Shenzhen. This collaboration reinforces our commitment to powering the next generation of real-time AI experiences for enterprises and developers across global markets.”
Neevash Ramdial, Director of Marketing and Vision Agents Lead at Stream, said: “Our goal with Vision Agents is to make real-time AI development faster, more flexible, and open, giving developers the freedom to choose the models, infrastructure, and plugins that work best for their applications. Developers building global conversational AI applications also need reliable real-time performance in every market, and Tencent RTC brings high-quality, low-latency connectivity across Asia to the Vision Agents ecosystem. We’re excited to work with Tencent RTC to help developers scale multimodal AI experiences worldwide while having the freedom to use whichever plugin or model best fits their app.”
About Tencent Cloud:
Tencent Cloud, one of the world’s leading cloud companies, is committed to creating innovative solutions to resolve real-world issues and enabling digital transformation for smart industries. Through our extensive global infrastructure, Tencent Cloud provides businesses across the globe with stable and secure industry-leading cloud products and services, leveraging technological advancements such as cloud computing, Big Data analytics, AI, IoT, and network security. It is our constant mission to meet the needs of industries across the board, including the fields of gaming, media and entertainment, finance, healthcare, property, retail, travel, and transportation.
About Tencent RTC:
Tencent RTC provides real-time communication solutions, including audio/video calling, live streaming, and in-game voice. With enterprise-grade security, AI-powered enhancements, and a global network of over 3,200 nodes, Tencent RTC powers mission-critical communication for customers worldwide.
About Vision Agents:
Vision Agents is Stream’s open-source framework that helps developers quickly build real-time video AI applications. It works out of the box with most major LLM, speech-to-text, text-to-speech, avatar, and infrastructure providers, so teams can go from idea to production in just a few lines of code, building everything from real-time sports coaches to rich, context-aware avatars.




