[20251022 AI Daily] Qwen's new Deep Research update lets you turn its reports into webpages, podcasts in seconds

Today's Fresh News · 10-21
Length: about 4,500 words; estimated reading time: 15 minutes

Qwen's new Deep Research update lets you turn its reports into webpages, podcasts in seconds

Chinese e-commerce giant Alibaba’s Qwen Team has introduced a major expansion to its Qwen Deep Research tool. The update allows users to generate comprehensive research reports, interactive web pages, and multi-speaker podcasts with just a few clicks. This functionality is part of a proprietary release, distinct from many of Qwen’s previous open-source model offerings.

The core workflow begins with a user request inside the Qwen Chat interface. Qwen collaborates by asking clarifying questions to shape the research scope, pulling data from web and official sources, and analyzing or resolving inconsistencies, even generating custom code when needed. Once the research is complete, users can convert the output into various formats: a PDF-style report, a live, professional-grade web page, and an audio podcast.

The web page includes inline graphics generated by Qwen Image, making it suitable for public presentations, classrooms, or publishing. The podcast feature allows users to select from 17 different speaker names as the host and 7 as the co-host. The website is hosted via a public link, while the podcast must be downloaded by the user and can't be linked to publicly.

The feature is available through the Qwen Chat app. As of this writing, no pricing details have been provided for Qwen3-Max or for the Deep Research capabilities specifically. The update was announced via the team’s official X account (@Alibaba_Qwen) on October 21, 2025.


DeepSeek drops open-source model that compresses text 10x through images, defying conventions

DeepSeek, a Chinese AI research company, has released a new model called DeepSeek-OCR, which fundamentally reimagines how large language models process information. The model compresses text through visual representation up to 10 times more efficiently than traditional text tokens. This breakthrough challenges a core assumption in AI development and could pave the way for language models with dramatically expanded context windows, potentially reaching tens of millions of tokens.

The model consists of two primary components: DeepEncoder, a novel 380-million-parameter vision encoder, and a 3-billion-parameter mixture-of-experts language decoder with 570 million activated parameters. DeepEncoder combines Meta’s Segment Anything Model (SAM) for local visual perception with OpenAI’s CLIP model for global visual understanding, connected through a 16x compression module.
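
As a rough illustration of what a 16x compression module means for token counts, the sketch below counts the patch tokens an image yields and divides by 16. The patch size of 16 pixels and the example resolutions are assumptions made for illustration and are not stated in this digest; `vision_tokens` is a hypothetical helper, not part of the DeepSeek-OCR code.

```python
def vision_tokens(image_size: int, patch_size: int = 16, compression: int = 16) -> tuple[int, int]:
    """Return (patch tokens before compression, vision tokens after compression).

    Assumes a square image cut into non-overlapping square patches.
    patch_size=16 is an illustrative assumption; compression=16 follows
    the "16x compression module" described in the text.
    """
    patches_per_side = image_size // patch_size
    patch_tokens = patches_per_side ** 2
    return patch_tokens, patch_tokens // compression


if __name__ == "__main__":
    # Hypothetical input resolutions, chosen only to show the arithmetic.
    for size in (640, 1024):
        before, after = vision_tokens(size)
        print(f"{size}x{size}: {before} patch tokens -> {after} vision tokens")
```

Under these assumptions, a 640x640 input yields 1,600 patch tokens and 100 vision tokens after compression.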

To validate their compression claims, DeepSeek researchers tested the model on the Fox benchmark, achieving 97.3% accuracy on documents containing 700-800 text tokens using just 100 vision tokens, representing an effective compression ratio of 7.5x. The model is designed to support five distinct resolution modes, each optimized for different compression ratios and use cases.
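
The 7.5x figure can be checked with simple arithmetic: dividing the text-token range by the vision-token count gives 7x to 8x, with 7.5x at the midpoint. The helper below is a hypothetical sketch of that calculation, not code from the release.

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens represented per vision token."""
    return text_tokens / vision_tokens


# Figures quoted above: documents of 700-800 text tokens, 100 vision tokens.
low = compression_ratio(700, 100)
high = compression_ratio(800, 100)
midpoint = (low + high) / 2

print(f"ratio range: {low}x to {high}x, midpoint {midpoint}x")
```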

The implications of this breakthrough extend far beyond compression. The approach could eliminate the need for complex and limiting tokenizers, enabling new capabilities such as handling formatting information lost in pure text representations. The training process employed pipeline parallelism across 160 Nvidia A100-40G GPUs, with the vision encoder divided between two pipeline stages and the language model split across two others.

The full model weights, training code, and inference scripts were released on GitHub and Hugging Face, gaining over 4,000 stars within 24 hours of release. This open-source release ensures the technique will be widely explored and tested.


Google's new vibe coding AI Studio experience lets anyone build, deploy apps live in minutes

Google AI Studio has introduced a new vibe coding interface, along with new buttons, suggestions, and community features, that lets anyone with an idea for an app build it and deploy it live on the web within minutes. The updated Build tab is available at ai.studio/build and is free to start.

The updated Build tab introduces a new layout and workflow where users can select from Google’s suite of AI models and features to power their applications. The default model is Gemini 2.5 Pro, which suits most use cases. Users can describe what they want to build, and the system automatically assembles the necessary components using Gemini’s APIs.

The platform supports mixing capabilities like Nano Banana, Veo, Imagen, Flashlight, and Google Search. Once an app is generated, users land in a fully interactive editor. The editor displays the full source of the app and allows editing of each component, such as React entry points, API calls, or styling files.

One standout feature is the “I’m Feeling Lucky” button, which generates randomized app concepts and configures the app setup accordingly. Examples produced during demos include interactive map-based chatbots, dream garden designers, and trivia game apps.

The platform also offers context-aware feature suggestions, generated by Gemini’s Flashlight capability, which analyze the current app and propose relevant improvements. The new experience is available at no cost for users who want to experiment, prototype, or build lightweight apps. Advanced capabilities, such as using models like Veo 3.1 or deploying through Cloud Run, do require switching to a paid API key.


iFlytek's just-released earnings report: net profit surged 202%

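iFlytek's latest earnings report shows that its net profit surged 202%. Large AI models were the key engine driving the recovery in its results. The strong report demonstrates iFlytek's strength in AI as well as its leading position in technical innovation and market application. Breakthrough progress in large AI models brought a marked improvement in iFlytek's results and laid a solid foundation for future growth.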

SenseTime appears at the Shanghai International Intellectual Property Forum to discuss the new landscape of AI×IP innovation

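At the recent Shanghai International Intellectual Property Forum, SenseTime showcased its innovations in AI and discussed the new landscape of innovation and change created by combining AI with intellectual property. SenseTime described in detail innovative applications of AI in the IP field and also pointed out the new challenges that arise along the way. These applications offer new ideas and methods for IP protection and help the industry develop in a healthy way.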

A melee of real and fake: in the year of the Agent, how do we cut through the conceptual fog?

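In the year the AI Agent concept took off, the market has become a melee of real and fake offerings. The contest over AI Agents is not a matter of real versus fake but of short-term hype versus long-term value. Many companies have launched their own AI Agent products, but few offer real practical value. How to cut through the conceptual fog and find AI Agent products with long-term value has become a key question for the industry.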

How large models get poisoned

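An article on large-model "poisoning" has drawn wide attention. It notes that many large models have been "poisoned" during development: bad data influenced the model during training, leaving its outputs biased or misleading. The article calls on the industry to strengthen data-quality management to ensure that large models are reliable and safe.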
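
The effect of bad training data can be seen even in a toy setting. The sketch below is not taken from the article and is far simpler than a large model: it fits a one-dimensional classifier whose decision threshold is the midpoint between the two class means, then shows how a few mislabelled samples shift that threshold and flip a prediction. All data points and function names are hypothetical.

```python
from statistics import mean


def fit_threshold(samples):
    """Return the midpoint between the two class means of 1-D labelled data."""
    class0 = [x for x, label in samples if label == 0]
    class1 = [x for x, label in samples if label == 1]
    return (mean(class0) + mean(class1)) / 2


def predict(x, threshold):
    """Classify x as 1 if it lies above the threshold, else 0."""
    return 1 if x > threshold else 0


# Clean data: class 0 clusters near 2, class 1 clusters near 8.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]

# Poisoned data: three high-valued samples wrongly labelled as class 0.
poisoned = clean + [(8.0, 0), (9.0, 0), (10.0, 0)]

clean_threshold = fit_threshold(clean)
poisoned_threshold = fit_threshold(poisoned)

print("clean threshold:", clean_threshold)
print("poisoned threshold:", poisoned_threshold)
print("prediction for 6.0 (clean):", predict(6.0, clean_threshold))
print("prediction for 6.0 (poisoned):", predict(6.0, poisoned_threshold))
```

With the clean data the input 6.0 is classified as class 1; after the mislabelled samples are added, the same input falls below the shifted threshold and is classified as class 0.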

Summary

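Today's AI news centers on technical progress and applied innovation. Alibaba's Qwen Team released a major update to its Deep Research tool that lets users quickly generate output in multiple formats, including web pages and podcasts. DeepSeek released an open-source model that compresses text through images, challenging the conventional way AI processes text. Google AI Studio introduced a new development experience that lowers the barrier to building AI applications. Together, these developments show how broadly and deeply AI is being applied across fields and open up more possibilities for future innovation.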


Author: Qwen/Qwen2.5-32B-Instruct
Sources: TMTPost (钛媒体), QbitAI (量子位), VentureBeat
Editor: Xiaokang (小康)
