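The data on this page comes from the public leaderboards of Artificial Analysis (AA), a third-party AI model evaluation platform. It complements the main /llm-rankings page, which is built on real OpenRouter call data: AA leans toward benchmark capability evaluation (Intelligence Index, blind-test Elo Arena), while OpenRouter reflects ecosystem usage. The 4 tabs cover general LLMs, text-to-video, image-to-video, and an agent product directory.

https://artificialanalysis.ai/leaderboards/models · Top 20 extracted · sorted by Intelligence Index, descending

| # | Model | Vendor | Context | Intelligence Index | $/1M tok | tok/s | First chunk (s) | Response (s) |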
|---|---|---|---|---|---|---|---|---|
| 1 | Claude Fable 5 (with fallback) | Anthropic | 1M | 60 | $7.70 | — | — | — |
| 2 | Claude Opus 4.8 (max) | Anthropic | 1M | 56 | $3.85 | 63 | 27.91 | 35.79 |
| 3 | GPT-5.5 (xhigh) | OpenAI | 922k | 55 | $4.35 | 58 | 110.58 | 119.18 |
| 4 | Claude Opus 4.7 (max) | Anthropic | 1M | 54 | $3.85 | 51 | 18.08 | 27.83 |
| 5 | GPT-5.5 (high) | OpenAI | 922k | 53 | $4.35 | 51 | 47.34 | 57.05 |
| 6 | GLM-5.2 (max) | Z AI | 1M | 51 | $0.90 | 106 | 2.11 | 25.64 |
| 7 | Gemini 3.5 Flash | Google | 1M | 50 | $1.31 | 153 | 18.18 | 21.44 |
| 8 | Claude Sonnet 4.6 (max) | Anthropic | 1M | 47 | $2.31 | 54 | 100.33 | 109.52 |
| 9 | GPT-5.5 (medium) | OpenAI | 922k | — | $4.35 | 53 | 10.39 | 19.89 |
| 10 | Gemini 3.1 Pro Preview | Google | 1M | 46 | $1.74 | 122 | 23.97 | 28.05 |
| 11 | Qwen3.7 Max | Alibaba | 1M | 46 | $1.43 | 91 | 2.81 | 34.67 |
| 12 | Gemini 3.5 Flash (medium) | Google | 1M | — | $1.31 | 160 | 15.09 | 18.22 |
| 13 | MiniMax-M3 | MiniMax⭐ | 1M | 44 | $0.22 | 63 | 3.44 | 43.18 |
| 14 | DeepSeek V4 Pro (Max) | DeepSeek | 1M | 44 | $0.18 | 64 | 1.75 | 77.96 |
| 15 | GPT-5.3 Codex (xhigh) | OpenAI | 400k | — | $1.87 | 74 | 119.09 | 125.81 |
| 16 | Muse Spark | Meta | 262k | 43 | — | — | — | — |
| 17 | Kimi K2.6 | Kimi | 256k | 43 | $0.70 | 44 | 2.31 | 114.47 |
| 18 | Claude Opus 4.7 (Non-reasoning, high) | Anthropic | 1M | — | $3.85 | 47 | 1.28 | 11.85 |
| 19 | MiMo-V2.5-Pro | Xiaomi | 1M | 42 | $0.18 | 35 | 2.87 | 73.75 |
| 20 | Kimi K2.7 Code | Kimi | 256k | 42 | $0.70 | 51 | 2.41 | 51.15 |

💡 Note: The Intelligence Index comes from Artificial Analysis's composite evaluation (higher is better); prices are blended USD per 1M tokens.
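
As a rough illustration of how the Intelligence Index and blended price columns can be read together, the sketch below computes index points per blended dollar for four rows copied from the table. The `index_per_dollar` metric is a hypothetical convenience ratio, not something Artificial Analysis publishes, and it ignores speed and latency entirely.

```python
# Rows copied from the LLM table: (model, Intelligence Index, blended $ per 1M tokens).
rows = [
    ("Claude Opus 4.8 (max)", 56, 3.85),
    ("GLM-5.2 (max)", 51, 0.90),
    ("MiniMax-M3", 44, 0.22),
    ("DeepSeek V4 Pro (Max)", 44, 0.18),
]


def index_per_dollar(index: int, price: float) -> float:
    """Hypothetical value metric: index points per blended dollar (per 1M tokens)."""
    return index / price


# Sort from best to worst value under this assumed metric.
ranked = sorted(rows, key=lambda r: index_per_dollar(r[1], r[2]), reverse=True)

for model, index, price in ranked:
    print(f"{model}: {index_per_dollar(index, price):.1f} index points per $")
```

Under this ratio the low-priced models come out far ahead of the top-ranked ones, which is why the two columns are better read side by side than collapsed into one number.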

https://artificialanalysis.ai/video/leaderboard/text-to-video · Text-to-Video Elo Arena blind-test leaderboard · sorted by Elo score, descending

| # | Rank range | Vendor | Model | Elo | 95% CI | Samples | Released | $/min |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | ByteDance Seed | Dreamina Seedance 2.0 720p | 1217 | -9/9 | 9,052 | Mar 2026 | $9.07 |
| 2 | 2 | Alibaba-ATH | HappyHorse-1.0 | 1123 | -9/9 | 5,649 | Apr 2026 | $13.20 |
| 3 | 3-5 | KlingAI | Kling 3.0 1080p (Pro) | 1104 | -9/9 | 6,678 | Feb 2026 | $20.16 |
| 4 | 3-7 | Skywork AI | SkyReels V4 | 1104 | -10/10 | 2,809 | Mar 2026 | $21.00 |
| 5 | 3-8 | KlingAI | Kling 3.0 Omni 1080p (Pro) | 1099 | -9/9 | 6,473 | Feb 2026 | $16.80 |
| 6 | 5-9 | KlingAI | Kling 3.0 720p (Standard) | 1095 | -9/9 | 6,614 | Feb 2026 | $15.12 |
| 7 | 5-9 | KlingAI | Kling 3.0 Omni 720p (Standard) | 1095 | -9/9 | 6,404 | Feb 2026 | $13.44 |
| 8 | 5-9 | Google | Veo 3.1 | 1094 | -9/9 | 6,601 | Jan 2026 | $24.00 |
| 9 | 6-11 | Google | Veo 3.1 Fast | 1087 | -9/9 | 6,121 | Jan 2026 | $9.00 |
| 10 | 9-11 | Google | Veo 3.1 Lite | 1083 | -9/9 | 5,814 | Mar 2026 | $4.80 |
| 11 | 9-11 | Vidu | Vidu Q3 Pro | 1081 | -8/8 | 6,946 | Jan 2026 | $9.60 |
| 12 | 12-13 | PixVerse | PixVerse V6 | 1068 | -9/9 | 7,802 | Mar 2026 | $6.90 |
| 13 | 12-13 | xAI | grok-imagine-video | 1068 | -9/9 | 6,867 | Jan 2026 | $4.20 |
| 14 | 14 | Alibaba | Wan 2.6 | 1020 | -9/9 | 5,308 | Dec 2025 | $9.00 |
| 15 | 15-14 | ByteDance Seed | Seedance 1.5 pro | 1000 | 0/0 | 5,088 | Dec 2025 | $11.86 |
| 16 | 16 | KlingAI | Kling 2.6 Pro (January) | 990 | -9/9 | 6,403 | Jan 2026 | $8.40 |
| 17 | 17 | Lightricks | LTX-2.3 Fast Open Weights | 974 | -9/9 | 5,905 | Mar 2026 | $2.40 |
| 18 | 18 | Lightricks | LTX-2.3 Pro Open Weights | 955 | -9/9 | 6,054 | Mar 2026 | $3.60 |
| 19 | 19-20 | Lightricks | LTX-2 Fast Open Weights | 945 | -9/9 | 5,167 | Oct 2025 | $2.40 |
| 20 | 19-20 | PixVerse | PixVerse V5.6 | 943 | -9/9 | 5,308 | Feb 2026 | — |

💡 Note: Elo scores come from blind-test user votes in the AA Arena (higher is better); prices are calculated for generating 1 minute of 1080p video.
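
To make the Elo and 95% CI columns easier to interpret, here is a minimal sketch using the textbook Elo expected-score formula and a simple interval-overlap check. The page does not say which rating model Artificial Analysis actually fits, so both functions are assumptions for illustration, and the function names are hypothetical.

```python
def expected_win_rate(elo_a: float, elo_b: float) -> float:
    """Textbook Elo expected score of A against B (assumed, not AA's stated method)."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


def intervals_overlap(elo_a: float, ci_a: float, elo_b: float, ci_b: float) -> bool:
    """True if the symmetric intervals elo_a +/- ci_a and elo_b +/- ci_b overlap."""
    return abs(elo_a - elo_b) <= ci_a + ci_b


# Values copied from the text-to-video table.
seedance_2 = (1217, 9)    # Dreamina Seedance 2.0 720p
happyhorse = (1123, 9)    # HappyHorse-1.0
kling_pro = (1104, 9)     # Kling 3.0 1080p (Pro)
skyreels = (1104, 10)     # SkyReels V4

print(f"{expected_win_rate(seedance_2[0], happyhorse[0]):.3f}")
print(intervals_overlap(*seedance_2, *happyhorse))
print(intervals_overlap(*kling_pro, *skyreels))
```

Overlapping intervals suggest why several adjacent models share a rank range: with these sample sizes, their scores cannot be separated with confidence.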

https://artificialanalysis.ai/video/leaderboard/image-to-video · Image-to-Video Elo Arena blind-test leaderboard · sorted by Elo score, descending

| # | Rank range | Vendor | Model | Elo | 95% CI | Samples | Released | $/min |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | ByteDance Seed | Dreamina Seedance 2.0 720p | 1193 | -9/9 | 7,684 | Mar 2026 | $9.07 |
| 2 | 2 | xAI | grok-imagine-video-1.5-preview | 1113 | -11/11 | 2,825 | May 2026 | $8.40 |
| 3 | 3-4 | Alibaba-ATH | HappyHorse-1.0 | 1092 | -9/9 | 5,655 | Apr 2026 | $13.20 |
| 4 | 3-6 | Google | Veo 3.1 | 1089 | -9/9 | 6,548 | Jan 2026 | $24.00 |
| 5 | 3-9 | Skywork AI | SkyReels V4 | 1082 | -11/11 | 2,629 | Mar 2026 | $21.00 |
| 6 | 4-9 | xAI | grok-imagine-video | 1081 | -9/9 | 6,808 | Jan 2026 | $4.20 |
| 7 | 5-10 | Google | Veo 3.1 Fast | 1077 | -9/9 | 5,941 | Jan 2026 | $9.00 |
| 8 | 5-10 | PixVerse | PixVerse V6 | 1077 | -8/8 | 7,754 | Mar 2026 | $6.90 |
| 9 | 5-10 | KlingAI | Kling 3.0 1080p (Pro) | 1075 | -9/9 | 6,395 | Feb 2026 | $20.16 |
| 10 | 7-11 | KlingAI | Kling 3.0 720p (Standard) | 1070 | -9/9 | 6,500 | Feb 2026 | $15.60 |
| 11 | 10-13 | Vidu | Vidu Q3 Pro | 1063 | -8/8 | 7,096 | Jan 2026 | $9.60 |
| 12 | 11-14 | KlingAI | Kling 3.0 Omni 1080p (Pro) | 1061 | -9/9 | 5,891 | Feb 2026 | $16.80 |
| 13 | 11-14 | Google | Veo 3.1 Lite | 1061 | -9/9 | 5,713 | Mar 2026 | $4.80 |
| 14 | 12-14 | KlingAI | Kling 3.0 Omni 720p (Standard) | 1053 | -9/9 | 5,977 | Feb 2026 | $13.44 |
| 15 | 15 | KlingAI | Kling 2.6 Pro (January) | 1009 | -9/9 | 6,616 | Jan 2026 | $8.40 |
| 16 | 16-15 | ByteDance Seed | Seedance 1.5 pro | 1000 | 0/0 | 5,294 | Dec 2025 | $11.86 |
| 17 | 17-19 | Lightricks | LTX-2.3 Fast Open Weights | 955 | -9/9 | 6,073 | Mar 2026 | $2.40 |
| 18 | 17-19 | Lightricks | LTX-2.3 Pro Open Weights | 952 | -9/9 | 6,160 | Mar 2026 | $3.60 |
| 19 | 17-19 | PixVerse | PixVerse V5.6 | 952 | -9/9 | 5,379 | Feb 2026 | — |
| 20 | 20 | Lightricks | LTX-2 Fast Open Weights | 939 | -9/9 | 5,232 | Oct 2025 | $2.40 |
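
💡 Note: Elo scores come from blind-test user votes in the AA Arena (higher is better); prices are calculated for generating 1 minute of 1080p video.

https://artificialanalysis.ai/agents · Product directory, Top 9 · includes ChaosAI's own products ⭐

| # | Product | Vendor | Released | Platforms | Pricing |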
|---|---|---|---|---|---|
| 1 | Claude Cowork | Anthropic | Jan 2026 | macOS, Windows | Subscription, Enterprise; $20–200/mo |
| 2 | ChatGPT Agent | OpenAI | Jul 2025 | Web, Mobile, Desktop | Subscription, Enterprise; $20–200/mo |
| 3 | OpenClaw⭐ | Peter Steinberger | Nov 2025 | macOS, Linux, Windows | Free; Free + API costs |
| 4 | Hermes Agent⭐ | Nous Research | Feb 2026 | macOS, Linux, Windows | Free; Free + API costs |
| 5 | Microsoft Copilot | Microsoft | Nov 2023 | Windows, macOS, Web, iOS, Android | Subscription, Enterprise, Usage-based; $9.99–30/mo |
| 6 | Google Workspace Studio | Google | Dec 2025 | Web | Subscription, Enterprise; ~$8–27/mo |
| 7 | Gemini Enterprise | Google | Oct 2025 | Web | Free, Subscription, Enterprise; $21–30/mo |
| 8 | Manus | Meta | Mar 2025 | Web, Windows, macOS, iOS, Android | Free, Subscription, Usage-based; $20–200/mo |
| 9 | Microsoft Copilot Cowork | Microsoft | Mar 2026 | Web | Subscription, Enterprise, Usage-based; $21–30/mo |

📋 Product descriptions (in table order)

Claude Cowork: Desktop-first agent that executes tasks in a sandboxed Linux VM with access to local files. The agent can edit and create local files, and has pre-set skills for handling common file types and general work tasks (e.g. editing Excel spreadsheets). Connects to external tools via computer use, MCP connectors, or browser extension.

ChatGPT Agent: Agentic system within the ChatGPT app, powered by GPT-5.4 Thinking/Pro, OpenAI's frontier model for agentic workflows. It runs on a cloud VM with no direct file access. GPT-5.4 brings native computer-use capabilities and desktop-level automation, designed for longer-horizon tasks with up to ~30+ minutes of autonomous execution. It is the successor to Operator and combines Codex-level coding and Deep Research into one experience.

OpenClaw: Open-source, self-hosted agent platform that turns existing chat apps (e.g. WhatsApp, Slack, Teams) into autonomous AI assistants. Also has a macOS menu bar companion app and a WebChat interface. Tasks run in the background with results in chat. Supports community Skills marketplace, NVIDIA NemoClaw stack for secure scaling, and security models.

Hermes Agent: Open-source, self-hosted agent platform with persistent cross-session memory and a self-improving skills system. Creates procedural skills from experience and reuses them across sessions. Tasks run in the background with results in chat. Includes 40+ built-in tools (e.g. image gen, TTS, vision, Telegram, Discord, Slack, WhatsApp).

Microsoft Copilot: Agents run inside the Microsoft 365 ecosystem (Teams, Word, Excel, PowerPoint, and Outlook), executing multi-step tasks like document creation, data analysis, and scheduling. Connects to external tools via MCP, Power Platform connectors, and Microsoft Graph. Copilot Studio lets organizations build and deploy custom autonomous agents.

Google Workspace Studio: No-code agent builder inside Google Workspace for creating and sharing AI agents. Agents reason, adapt to new information, and handle tasks like email summarization, meeting action items, and document management, with deep Google ecosystem integration (Gmail, Drive, Sheets, Maps, Docs, etc.) and external app support via Marketplace connectors.

Gemini Enterprise: Agentic platform for enterprise knowledge workers. Connects to company data via prebuilt connectors (M365, Salesforce, Jira, Confluence, Box). Offers permission-aware search, prebuilt agents like Deep Research, a no-code workbench, and Code Assist.

Manus: General-purpose agent that can orchestrate multiple AI models per task and autonomously plan and execute (web automation, coding, data work, PDF/PPT generation) in a sandboxed environment. Produces deliverables including research reports, websites, and data analysis.

Microsoft Copilot Cowork: Execution layer for Microsoft 365 Copilot that handles autonomous, multi-step actions across Outlook, Teams, Excel, Word, PowerPoint, and Work IQ. Fully sandboxed within your M365 cloud tenant, permission-aware, and enterprise-compliant. Plans, coordinates, and executes tasks (calendar cleanup, competitive research, launch plans, etc.) in the background with checkpoints, clarification requests, and user approvals.

💡 Note: Data comes from the Artificial Analysis agent directory · ⭐ marks products developed in-house by this platform.