主流 Agent 框架对比 / Comparing Mainstream Agent Frameworks | Agent 产品架构师知识库 / Agent Product Architect Knowledge Base

01 · 为什么先选框架 / Why Framework Selection Comes First

为什么先选框架，而不是先写代码？

Why Choose a Framework Before Writing a Single Line of Code?

框架不只是工具——它决定了你系统的结构、调试方式、扩展路径，以及出了问题时你能不能找到原因。

A framework isn't just a tool — it determines your system's structure, how you debug, how you scale, and whether you can ever find the root cause when something goes wrong.

⚠️ 一个真实的代价：框架选错了，三个月白费 / A Real Cost: Wrong Framework, Three Months Wasted

某出行公司的 AI 团队，花了三个月用 AutoGen 构建了一套客服 Agent 系统。上线前压测发现：在高并发场景下，多个 Agent 之间的对话消耗了大量 token，响应时间超过 8 秒，远超产品要求的 2 秒。

An AI team at a travel company spent three months building a customer service Agent system with AutoGen. Pre-launch load testing showed that under high concurrency, the conversations between Agents consumed a large number of tokens and pushed response time above 8 seconds, far beyond the 2-second product requirement.

问题的根源在于：AutoGen 的多 Agent 对话设计更适合"深度推理、低并发"的场景，而不是"高并发、强实时性"的客服场景。这个问题不是优化能解决的，是框架的底层架构决定的。团队最终不得不重新选型，用 LangGraph 重写了整套系统。

The root cause: AutoGen's multi-agent dialogue design suits "deep reasoning, low concurrency" scenarios, not "high concurrency, strict real-time" customer service. Tuning could not fix this, because the problem stems from the framework's underlying architecture. The team ultimately had to choose again and rewrote the entire system on LangGraph.

教训：框架选型不是技术决策，是产品决策。它应该在你想清楚"系统要在什么场景下跑、用户体验要达到什么标准"之后做出，而不是先选个"最流行的"框架再去适配需求。

Lesson: Framework selection is a product decision, not a technical one. Make it after you have worked out which scenarios the system must handle and what UX standard it must meet, rather than picking the most popular framework first and bending the requirements to fit it.

选框架之前，先把这三个问题回答清楚：① 你的任务有多复杂、流程有多长？② 团队的技术深度是什么水平？③ 你现在是在验证 idea，还是在做生产系统？这三个问题的答案，基本上就决定了框架的选择方向。

Before choosing a framework, answer these three questions: ① How complex is your task, and how long is the workflow? ② How deep is your team's technical expertise? ③ Are you validating an idea or building a production system? The answers largely determine which framework direction to take.

02 · LangGraph / State-Graph Driven Control

LangGraph：状态图驱动的精确控制

LangGraph: Precise Control Through State Graphs

如果你需要对 Agent 的每一步执行有完整的控制权——知道它在哪个节点、为什么走这条路、出了问题怎么回退——LangGraph 是目前最成熟的选择。

If you need complete control over every step of Agent execution — knowing which node it's on, why it took this path, how to roll back when something fails — LangGraph is currently the most mature choice.

LangGraph 企业级首选 / Enterprise First Choice

由 LangChain 团队开发，用"状态图"（State Graph）管理 Agent 的执行流程。核心思路是：把整个 Agent 的工作流程拆成一个个"节点"（Node），节点之间的流转由"边"（Edge）决定，每一步都有完整的状态记录和可控的流转逻辑。

Built by the LangChain team, LangGraph manages Agent execution using State Graphs. Core idea: break the entire Agent workflow into "nodes," with transitions between nodes governed by "edges." Every step has complete state tracking and controllable routing logic.

最适合场景 / Best For

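流程复杂、需要精确控制、生产稳定性要求高的企业级 Agent / Enterprise-grade Agents with complex flows, a need for precise control, and strict production-stability requirements

上手难度 / Learning Curve

🟡 中高（需要理解状态图概念，有一定 Python 基础） / Medium-high (requires understanding state-graph concepts and some Python experience)

生产稳定性 / Prod Stability

⭐⭐⭐⭐⭐ 目前企业级 Agent 最成熟的框架 / Currently the most mature framework for enterprise-grade Agents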

LangGraph 的核心优势

LangGraph's Core Strengths

① 完整的状态管理：知道 Agent 在哪、做了什么 / Complete State Management: Know Where the Agent Is and What It Did

每个节点执行时，当前状态（用户输入、历史记录、中间结果）都被完整保留。这意味着出了问题，你可以精确定位到哪一步出错——而不是面对一个"结果不对但不知道为什么"的黑盒。

During each node's execution, the complete current state (user input, history, intermediate results) is preserved. When something goes wrong, you can pinpoint exactly which step failed — rather than facing a black box where "the result is wrong but we don't know why."

② 条件流转：根据结果动态决定下一步 / Conditional Routing: Decide the Next Step from the Result

执行完一个节点后，可以根据输出结果动态决定走哪条路——成功继续、失败重试、置信度低转人工。这让复杂的业务逻辑可以被精确表达，而不是让 Agent 靠"猜"来决定下一步。

After executing a node, the next path is dynamically chosen based on output — continue on success, retry on failure, escalate to human when confidence is low. Complex business logic can be expressed precisely, rather than letting the Agent "guess" what to do next.

③ 人在回路（Human-in-the-Loop）原生支持 / Native Human-in-the-Loop Support

LangGraph 内置了"暂停执行、等待人工介入"的机制。在出行场景里，支付这类高风险操作可以设计成：Agent 到达支付节点后自动暂停，等待用户确认，确认后继续执行。这不需要额外开发，框架本身支持。

LangGraph has built-in support for "pause execution, await human intervention." In travel scenarios, high-risk actions like payments can be designed to auto-pause when the Agent reaches the payment node, wait for user confirmation, then resume. No extra development needed — the framework handles it natively.

④ 并行执行：多个节点同时跑，提升速度 / Parallel Execution: Run Multiple Nodes at Once for Speed

不依赖前一步结果的任务可以并行执行——比如同时查询火车票和机票，而不是先查完一个再查另一个。这在需要整合多路数据的场景下，能显著降低总响应时间。

Independent tasks can execute in parallel — e.g., querying train and flight availability simultaneously instead of sequentially. This significantly reduces total response time in scenarios requiring data from multiple sources.
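The train-and-flight example can be sketched with plain `asyncio`. This is a framework-agnostic illustration, not LangGraph's own API: `query_trains` and `query_flights` are hypothetical stand-ins for real ticketing calls, and the sleep durations are made-up latencies.

```python
import asyncio
import time


async def query_trains(route: str) -> dict:
    # Hypothetical stand-in for a real train-ticket API call.
    await asyncio.sleep(0.2)
    return {"source": "train", "route": route}


async def query_flights(route: str) -> dict:
    # Hypothetical stand-in for a real flight-search API call.
    await asyncio.sleep(0.3)
    return {"source": "flight", "route": route}


async def gather_transport(route: str) -> list:
    # Both lookups start immediately, so the total wait is roughly
    # the slower of the two rather than their sum.
    return list(await asyncio.gather(query_trains(route), query_flights(route)))


def run_demo(route: str = "Beijing-Shanghai"):
    start = time.perf_counter()
    results = asyncio.run(gather_transport(route))
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Calling `run_demo()` returns both results in the order the coroutines were passed to `asyncio.gather`, with an elapsed time close to the slower lookup. The gain only applies when the two tasks really are independent of each other.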

    LangGraph 的适用信号：如果你发现自己在用代码写大量的 if-else 来控制 Agent 的流程，或者你的系统已经出了问题但无法定位在哪里——这两个信号都在告诉你：你需要 LangGraph。
  

    Signs LangGraph is right for you: If you find yourself writing massive if-else chains in code to control Agent flow, or your system is failing but you can't locate where — both signals say you need LangGraph.
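The pause-and-confirm behavior described under strength ③ above can be pictured with a small, framework-agnostic sketch. It does not use LangGraph's interrupt or checkpointing APIs; `BookingRun`, `advance`, and `resume` are hypothetical names chosen only to show the shape of the idea: persist the state, stop before the high-risk step, and continue once a person has answered.

```python
from dataclasses import dataclass, field


@dataclass
class BookingRun:
    """State of one booking flow (all fields are illustrative)."""
    order_id: str
    amount: float
    status: str = "planning"  # planning -> awaiting_confirmation -> paid / cancelled
    log: list = field(default_factory=list)


def advance(run: BookingRun) -> BookingRun:
    """Run the automatic steps, then stop before the payment step."""
    if run.status == "planning":
        run.log.append("itinerary selected")
        run.status = "awaiting_confirmation"  # pause point before the high-risk action
    return run


def resume(run: BookingRun, approved: bool) -> BookingRun:
    """Continue from the pause point once the user has decided."""
    if run.status != "awaiting_confirmation":
        raise ValueError(f"run is not paused (status={run.status})")
    run.status = "paid" if approved else "cancelled"
    run.log.append(f"user approved={approved}")
    return run
```

Because the paused run is ordinary data, it could be stored and resumed minutes or hours later, which is the property that matters for payment-style confirmations.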
  

LangGraph 状态图示意 / LangGraph State Graph Structure
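In place of the diagram, here is a minimal sketch of the node / edge / state idea in plain Python. It is not LangGraph code: `run_graph`, the node functions, and the routing thresholds (retry up to three attempts, hand off below 0.7 confidence) are all assumptions made for illustration. It shows state being carried through every step, a trace of visited nodes, and a router that chooses between continue, retry, and human handoff.

```python
from typing import Callable, Dict

END = "END"


def run_graph(nodes: Dict[str, Callable[[dict], dict]],
              routers: Dict[str, Callable[[dict], str]],
              start: str, state: dict, max_steps: int = 20) -> dict:
    """Execute nodes until a router returns END, recording each visited node."""
    current = start
    state = {**state, "trace": []}
    for _ in range(max_steps):
        state = nodes[current](state)
        state["trace"].append(current)
        current = routers[current](state)
        if current == END:
            return state
    raise RuntimeError("max_steps exceeded; possible routing loop")


def understand(state: dict) -> dict:
    return {**state, "intent": "book_ticket"}


def search(state: dict) -> dict:
    attempts = state.get("attempts", 0) + 1
    ok = attempts >= 2  # hypothetical rule: first attempt fails, second succeeds
    return {**state, "attempts": attempts, "ok": ok, "confidence": 0.9 if ok else 0.0}


def answer(state: dict) -> dict:
    return {**state, "reply": "Here are your options."}


def handoff(state: dict) -> dict:
    return {**state, "reply": "Transferring to a human agent."}


def route_after_search(state: dict) -> str:
    if not state["ok"]:
        return "search" if state["attempts"] < 3 else "handoff"  # retry, then escalate
    return "answer" if state["confidence"] >= 0.7 else "handoff"


nodes = {"understand": understand, "search": search, "answer": answer, "handoff": handoff}
routers = {
    "understand": lambda s: "search",
    "search": route_after_search,
    "answer": lambda s: END,
    "handoff": lambda s: END,
}
final = run_graph(nodes, routers, "understand", {"user_input": "Beijing to Shanghai"})
```

After the run, `final["trace"]` lists every node in the order it executed, which is the kind of record that lets you locate the failing step instead of guessing.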

03 · CrewAI / Multi-Role Collaboration

CrewAI：多角色分工协作

CrewAI: Multi-Role Agent Collaboration

有些任务需要"一个团队"而不是"一个人"——CrewAI 用多个扮演不同角色的 Agent 分工完成任务，最像人类团队协作的框架。

Some tasks need a team, not one person. CrewAI uses multiple Agents playing different roles to collaboratively complete tasks — the framework most analogous to how human teams work.

CrewAI 内容生产 / 快速验证 · Content & Rapid Prototyping

CrewAI 的核心概念是"Crew"（团队）——你定义一组 Agent，每个 Agent 有自己的角色（Role）、目标（Goal）、背景知识（Backstory），然后把一个大任务分配给这个团队，由团队内部协调完成。

CrewAI's core concept is the "Crew." You define a group of Agents, each with its own Role, Goal, and Backstory, then assign a large task to the crew and let them coordinate to complete it internally.

最适合场景 / Best For

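内容生产、市场调研、报告撰写等需要多角色分工的任务 / Content production, market research, report writing, and other tasks that benefit from a division of roles

上手难度 / Learning Curve

🟢 低（角色定义直观，上手快，适合快速验证） / Low (role definitions are intuitive, quick to pick up, well suited to rapid validation)

生产稳定性 / Prod Stability

⭐⭐⭐ 适合中低并发场景，高并发下需要额外设计 / Suited to low-to-medium concurrency; high concurrency needs additional design work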

用出行场景举例：可以定义一个"行程规划师 Agent"负责理解用户需求和整体方案、一个"票务查询 Agent"负责实时查询交通信息、一个"酒店推荐 Agent"负责选择住宿、一个"审核 Agent"负责检查整个方案是否合理。这四个角色分工明确，CrewAI 负责协调他们之间的信息传递和任务依赖。

Travel scenario example: define an "Itinerary Planner Agent" for understanding needs and overall planning, a "Ticketing Agent" for real-time transport queries, a "Hotel Recommendation Agent" for accommodation selection, and a "Review Agent" for checking the full plan's coherence. These four roles have clear responsibilities; CrewAI handles coordination, information passing, and task dependencies between them.
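The four-role travel crew can be sketched in plain Python to make the Role / Goal / Backstory idea concrete. This is not CrewAI's API: `RoleAgent`, `run_crew`, and the `work` functions are hypothetical, and each `work` function stands in for what would be an LLM-backed step in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RoleAgent:
    role: str
    goal: str
    backstory: str
    work: Callable[[dict], dict]  # stand-in for an LLM-backed step


def run_crew(agents: List[RoleAgent], request: dict) -> dict:
    """Pass a shared context through each role in order."""
    context = dict(request)
    for agent in agents:
        context = agent.work(context)
        context.setdefault("handled_by", []).append(agent.role)
    return context


def plan(ctx: dict) -> dict:
    return {**ctx, "plan": {"city": ctx["city"], "nights": ctx["nights"]}}


def find_transport(ctx: dict) -> dict:
    return {**ctx, "transport": "high-speed rail"}


def pick_hotel(ctx: dict) -> dict:
    return {**ctx, "hotel": "hotel near the station"}


def review(ctx: dict) -> dict:
    # The reviewer approves only if every earlier role produced its part.
    return {**ctx, "approved": all(k in ctx for k in ("plan", "transport", "hotel"))}


crew = [
    RoleAgent("Itinerary Planner", "Understand needs and shape the overall plan",
              "Experienced trip planner", plan),
    RoleAgent("Ticketing", "Query transport options", "Knows rail and air schedules", find_transport),
    RoleAgent("Hotel Recommendation", "Choose accommodation", "Knows local hotels", pick_hotel),
    RoleAgent("Review", "Check the plan for coherence", "Detail-oriented auditor", review),
]
result = run_crew(crew, {"city": "Hangzhou", "nights": 2})
```

The sketch runs the roles in a fixed sequence; the point is only that each Agent is described by who it is and what it is responsible for, while the coordination layer handles passing information between them.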

    CrewAI 最大的优势：角色定义非常直观——你用自然语言描述每个 Agent 是谁、负责什么、擅长什么，CrewAI 会据此构建 Agent 的行为。对于没有深厚技术背景的团队，这是门槛最低的多 Agent 协作方案。
  

    CrewAI's biggest advantage: Role definition is highly intuitive — you describe each Agent in natural language: who it is, what it handles, what it's good at. CrewAI uses this to shape Agent behavior. For teams without deep technical backgrounds, this is the lowest-barrier multi-agent collaboration solution.
  

    CrewAI 的边界：它对任务执行流程的控制粒度不如 LangGraph 精细——当你需要"在某个特定条件下做某件具体的事"时，CrewAI 的表达能力有限。适合内容生产类任务，不适合强逻辑控制的业务系统。
  

    CrewAI's ceiling: It offers less fine-grained control over execution flow than LangGraph. When you need "do this specific thing under this specific condition," CrewAI's expressiveness has limits. Good for content production; not suited for business systems requiring tight logical control.
  

04 · AutoGen / Conversation-Driven, Code-First

AutoGen：对话驱动，代码见长

AutoGen: Conversation-Driven Multi-Agent with Strong Code Execution

微软出品，擅长让多个 Agent 通过对话协作，尤其在涉及代码生成和执行的场景下表现突出。

Built by Microsoft, AutoGen excels at multi-agent collaboration through conversation, particularly in scenarios involving code generation and execution.

AutoGen 代码执行 / 深度推理 · Code Execution & Deep Reasoning

AutoGen 的核心模式是"多个 Agent 互相对话"——一个 Agent 提出方案，另一个 Agent 批评或修正，通过多轮对话逐步收敛到最优解。它的代码执行能力特别强：Agent 可以写出 Python 代码、自动执行、根据执行结果修改代码，循环迭代。

AutoGen's core mode is "multiple Agents conversing with each other" — one Agent proposes a solution, another critiques or refines it, converging to the best answer through multiple rounds. Its code execution capability is particularly strong: Agents can write Python code, auto-execute it, modify based on results, and iterate in a loop.
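The propose-and-critique loop can be sketched without any framework. The code below is not AutoGen's API: `converse`, `planner`, and `validator` are hypothetical, the two callables stand in for LLM-backed Agents, and the 1000 CNY budget is a made-up rule.

```python
from typing import Callable, Optional, Tuple


def converse(propose: Callable[[Optional[str]], str],
             critique: Callable[[str], Optional[str]],
             max_rounds: int = 5) -> Tuple[str, int]:
    """Alternate proposer and critic until the critic has no objection."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback = None
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = propose(feedback)
        feedback = critique(draft)
        if feedback is None:  # critic accepts the draft
            return draft, round_no
    return draft, max_rounds  # give up and return the last draft


def planner(feedback: Optional[str]) -> str:
    # Hypothetical planner: starts with a pricey plan, cuts cost when asked.
    return "plan: flight, 900 CNY" if feedback else "plan: flight, 1500 CNY"


def validator(draft: str) -> Optional[str]:
    price = int(draft.split(", ")[1].split()[0])
    return None if price <= 1000 else "over budget, reduce cost"


final_plan, rounds = converse(planner, validator)
```

Every extra round is another pair of model calls, which is why the number of rounds matters for cost and latency. The `max_rounds` cap keeps a disagreement between the two Agents from looping indefinitely.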

最适合场景 / Best For

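数据分析、代码生成、复杂推理、需要 Agent 写代码解决问题的场景 / Data analysis, code generation, complex reasoning, and scenarios where the Agent must write code to solve the problem

上手难度 / Learning Curve

🟡 中等（概念简洁，但深度定制需要较强 Python 能力） / Medium (the concepts are simple, but deep customization requires strong Python skills)

生产稳定性 / Prod Stability

⭐⭐⭐ 适合低并发的深度推理任务，高并发客服场景慎用 / Suited to low-concurrency deep-reasoning tasks; use with caution in high-concurrency customer service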

AutoGen 在出行场景的典型应用：用户描述一个复杂的行程需求，AutoGen 可以让"规划 Agent"写出一套初始方案，"验证 Agent"检查价格和时间是否合理，"优化 Agent"根据反馈调整——这个过程非常像一个真实的人类团队在讨论方案。

AutoGen in a travel scenario: a user describes a complex itinerary requirement. AutoGen lets a "Planning Agent" draft an initial plan, a "Validation Agent" check whether prices and timing are reasonable, and an "Optimization Agent" refine based on feedback — very similar to how a real human team would discuss a plan.

    AutoGen 的注意事项：多 Agent 对话会产生大量 token 消耗，响应时间相对较长。在需要快速响应（<2秒）的 C 端产品场景中，需要格外谨慎。它更适合"后台深度处理"而非"前台实时交互"的场景。
  

    AutoGen's caveats: Multi-agent dialogue generates significant token consumption and relatively long response times. In consumer-facing products requiring fast responses (<2 seconds), use extreme caution. AutoGen fits "background deep processing" better than "real-time front-end interaction."
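A back-of-envelope estimate shows why multi-agent dialogue gets expensive. The model below is a simplification built on stated assumptions: turns run one after another, every turn re-reads the whole transcript so far, and each turn produces the same number of tokens. The function name and all input numbers are hypothetical, not measurements of AutoGen.

```python
def estimate_conversation(rounds: int, agents_per_round: int,
                          tokens_per_turn: int, seconds_per_call: float) -> dict:
    """Rough token and latency estimate for a sequential multi-agent dialogue."""
    calls = rounds * agents_per_round
    # Turn k re-reads the k earlier turns, so input tokens grow with each turn.
    input_tokens = sum(turn * tokens_per_turn for turn in range(calls))
    output_tokens = calls * tokens_per_turn
    latency_s = calls * seconds_per_call  # sequential turns add up
    return {
        "calls": calls,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "latency_s": latency_s,
    }


# Illustrative inputs only: 3 rounds, 3 Agents, 300 tokens per turn, 1 s per call.
multi_agent = estimate_conversation(3, 3, 300, 1.0)
single_call = estimate_conversation(1, 1, 300, 1.0)
```

With these inputs the multi-agent run needs 9 sequential calls against 1 for the single call, and its input tokens grow roughly with the square of the number of turns. Under these assumptions, a tight real-time budget is exceeded by the number of sequential turns alone, before any per-call tuning comes into play.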
  

05 · Dify / Low-Code, Fast Validation

Dify：低代码平台，快速验证

Dify: Low-Code Platform for Fast Prototyping and Validation

不写代码，在可视化界面里搭出 Agent 流程——Dify 是验证 Agent idea 的最快方式，但有清晰的上限。

Build Agent workflows in a visual interface without writing code. Dify is the fastest way to validate an Agent idea — but it has a clear ceiling.

Dify 低代码 / 快速原型 · Low-Code & Rapid Prototype

Dify 是一个 LLM 应用开发平台，提供可视化的工作流编辑器——你可以拖拽节点、配置参数、连接工具，不需要写代码就能搭出一个能用的 Agent。它内置了对话管理、知识库、变量处理、API 集成等常用能力。

Dify is an LLM application development platform with a visual workflow editor. You can drag and drop nodes, configure parameters, and connect tools — no code required to build a working Agent. It ships with built-in conversation management, knowledge bases, variable handling, and API integrations.

最适合场景 / Best For

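快速验证 Agent idea、没有专职工程师的团队、内部工具 / Rapid validation of Agent ideas, teams without dedicated engineers, internal tools

上手难度 / Learning Curve

🟢 最低（产品经理可以直接操作，不依赖工程师） / Lowest (product managers can work in it directly, without depending on engineers)

生产稳定性 / Prod Stability

⭐⭐⭐ 小规模场景可用，复杂业务逻辑受限 / Usable at small scale; limited for complex business logic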

Dify 的最大价值在于：让不写代码的人也能快速验证 Agent idea。产品经理可以直接在 Dify 上搭出一个出行咨询 Agent，测试用户反应，收集反馈——完全不需要等工程师开发。这个验证周期从"几周"变成了"几天"甚至"几小时"。

Dify's greatest value: letting non-engineers rapidly validate Agent ideas. A product manager can build a travel consultation Agent directly in Dify, test user reactions, and gather feedback — no waiting for engineers. The validation cycle goes from "weeks" to "days" or even "hours."

    Dify 的上限在哪里：当你的业务逻辑变复杂时——需要精确的条件分支、复杂的数据处理、高并发性能优化——Dify 的可视化配置就不够用了。这时候需要迁移到代码框架。从 Dify 迁移到 LangGraph 这条路，比从零开始要难，因为 Dify 的抽象层和 LangGraph 的概念不完全对应。所以一开始就要想清楚：你是在做一个"永久的原型"，还是一个"要上生产的系统"？
  

    Where Dify hits its ceiling: When business logic becomes complex (precise conditional branching, complex data processing, high-concurrency performance tuning), Dify's visual configuration is no longer enough, and you need to migrate to a code framework. Migrating from Dify to LangGraph is harder than starting from scratch, because Dify's abstraction layer doesn't map cleanly onto LangGraph's concepts. So decide upfront: are you building a "permanent prototype" or a "system that goes to production"?
  

06 · 选型核心逻辑 / The Core Selection Logic

选型核心逻辑与框架无关思维

The Core Selection Logic — and Framework-Agnostic Thinking

没有最好的框架，只有最合适的框架。更重要的是：建立"框架无关"的思维，让你在框架迭代、被替代时，能快速适应。

There's no best framework, only the most suitable one. More importantly: develop framework-agnostic thinking so you can adapt quickly when frameworks iterate or get replaced.

四框架横向对比

Head-to-Head Comparison: All Four Frameworks

维度 / Dimension	LangGraph	CrewAI	AutoGen	Dify
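上手难度 / Learning Curve	中高 / Medium-high	低 / Low	中 / Medium	最低 / Lowest
流程控制精度 / Control Precision	最高 / Highest	中 / Medium	中 / Medium	有限 / Limited
多 Agent 协作 / Multi-Agent Support	支持（图节点） / Supported (graph nodes)	原生设计 / Native design	对话式协作 / Conversational collaboration	基础支持 / Basic support
代码执行 / Code Execution	需自行集成 / Integrate yourself	需自行集成 / Integrate yourself	原生支持 / Native support	有限 / Limited
高并发性能 / High Concurrency	强 / Strong	中 / Medium	弱（多对话消耗大） / Weak (multi-turn dialogue is costly)	取决于部署 / Depends on deployment
生产稳定性 / Production Stability	最成熟 / Most mature	中等 / Moderate	中等 / Moderate	适合小规模 / Suited to small scale
适合阶段 / Best Stage	生产系统 / Production systems	验证 + 内容场景 / Validation + content scenarios	深度推理任务 / Deep reasoning tasks	概念验证（PoC） / Proof of concept (PoC)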

选型决策的三个核心问题

The Three Core Questions That Drive Framework Selection

你的任务有多复杂？流程有多长？ / How complex is your task, and how long is the workflow?

单步任务（用户问→模型答）：连框架都不需要，直接调 API 就行。多步任务但流程固定：Dify 或 CrewAI 够用。多步任务且有复杂条件分支、重试逻辑、人工介入节点：LangGraph。

Single-step tasks (user asks → model answers): you don't even need a framework, just call the API directly. Multi-step tasks with fixed flows: Dify or CrewAI. Multi-step tasks with complex conditional branching, retry logic, or human-in-the-loop nodes: LangGraph.

你的团队有多少技术深度？ / How deep is your team's technical expertise?

全是产品和运营，没有工程师：Dify。有工程师但 AI 经验有限：CrewAI 或 AutoGen 快速起步，再迁移。有经验的 AI 工程团队，追求生产稳定性：直接上 LangGraph。

All product/ops, no engineers: Dify. Have engineers but limited AI experience: start with CrewAI or AutoGen, migrate later. Experienced AI engineering team aiming for production stability: go straight to LangGraph.

你现在是在验证 idea，还是在做生产系统？ / Are you validating an idea or building a production system?

验证 idea（未来两周内）：Dify 或 CrewAI，速度第一。做生产系统（要上线服务真实用户）：从一开始就用 LangGraph，省去后续迁移的痛苦。别用"先用 Dify 验证，再迁到 LangGraph"——这条路的成本比你想象的高。

Validating an idea (within two weeks): Dify or CrewAI, speed first. Building a production system (going live for real users): use LangGraph from day one and avoid the migration pain. Don't fall for "validate with Dify, then migrate to LangGraph" — that migration costs more than you think.
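The three questions can be condensed into a small decision helper. This is a simplified reading of the guidance above, not a complete rule set: the function name, the flags, and the priority order used when answers conflict are assumptions, and it leaves out the AutoGen path and the response-time requirement.

```python
def suggest_frameworks(*, multi_step: bool, complex_branching: bool,
                       has_engineers: bool, production: bool) -> list:
    """Map the three selection questions to a shortlist (illustrative only)."""
    if not multi_step:
        # Single-step question answering: no framework needed.
        return ["direct API call"]
    if not has_engineers:
        # Product and ops only: a low-code platform is the realistic option.
        return ["Dify"]
    if production or complex_branching:
        # Branching, retries, human-in-the-loop, or real users.
        return ["LangGraph"]
    # Multi-step, fixed flow, still validating: optimize for speed.
    return ["Dify", "CrewAI"]


shortlist = suggest_frameworks(
    multi_step=True, complex_branching=True, has_engineers=True, production=False
)
```

Writing the rules down this way makes the trade-offs explicit and easy to argue about in a review, which is more useful than the specific answers this toy version returns.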

框架会迭代、会被替代，底层的设计思维才是长期有效的资产。

Frameworks iterate and get replaced; the underlying design thinking is the durable, long-term asset.

所谓"框架无关思维"，是指：理解 Agent 系统的核心抽象——节点、状态、流转逻辑、工具调用、记忆管理——而不是把自己绑死在某一个框架的 API 上。这些核心抽象在 LangGraph、CrewAI、AutoGen 里都存在，只是表达方式不同。掌握了底层概念，遇到任何新框架都能快速上手。

Framework-agnostic thinking means understanding the core abstractions of Agent systems (nodes, state, routing logic, tool calls, memory management) rather than tying yourself to one framework's API. These abstractions exist in LangGraph, CrewAI, and AutoGen alike; only the way each framework expresses them differs. Master the underlying concepts and you can pick up any new framework quickly.

产品负责人的决策清单 / Product Owner Checklist

框架选型前的自查清单 / Self-Check Before Selecting a Framework

我们的 Agent 任务，是"单步回答"还是"多步执行"？多步执行的流程大概有几个节点？
Is our Agent task "single-step answering" or "multi-step execution"? If multi-step, roughly how many nodes are in the flow?

对响应时间有多严格的要求？如果是 C 端产品要求 <2s，某些框架需要谨慎评估。
How strict are our response-time requirements? If it is a consumer product requiring <2s, certain frameworks need careful evaluation.

团队里有没有有经验的 AI 工程师？如果没有，选上手难度低的框架，避免卡在配置上。
Does the team have experienced AI engineers? If not, choose a framework with a gentle learning curve to avoid getting stuck on configuration.

这是一个要上线服务真实用户的系统，还是一个内部原型？两种情况的框架选择逻辑完全不同。
Is this a production system serving real users, or an internal prototype? The selection logic is completely different for each.

有没有想清楚"出了问题怎么排查"？可观测性是否在框架选型的考量里？
Have we thought through how to debug when something goes wrong? Is observability part of the framework selection criteria?

ARCHITECT NOTE · 本篇核心结论 / Chapter Takeaways

框架选型是 Agent 系统搭建里最被低估的决策之一。选错了，你会在几个月后才意识到代价——通常是在系统已经做了一大半的时候。

Framework selection is one of the most underestimated decisions in building an Agent system. Choose wrong, and you'll realize the cost months later — usually when the system is already half-built.

记住这个原则：框架选型应该由你的场景需求决定，而不是由"当下最流行"或"团队最熟悉"决定。如果你的产品需要上生产、服务真实用户、有稳定性要求——从第一天就用 LangGraph，哪怕上手慢一点。

Remember this principle: Framework selection should be driven by your scenario requirements — not by "what's trending" or "what we know best." If your product needs to go to production, serve real users, and meet stability requirements — use LangGraph from day one, even if the learning curve is steeper.

更重要的长期投资是：在任何框架之上，建立起你自己对 Agent 系统核心抽象的理解。框架是会换的，底层的思维方式不会过时。

The more important long-term investment: above any framework, build your own understanding of the core Agent system abstractions. Frameworks get replaced. The underlying way of thinking doesn't go stale.

📖 中英词汇对照表

Glossary of Key Terms · Agent Frameworks / Architecture Concepts

状态图

State Graph

LangGraph 的核心概念，用有向图表示 Agent 的执行流程，图中的节点是执行单元，边是流转条件，当前状态被完整保留在图的上下文中。

LangGraph's core concept — a directed graph representing Agent execution flow. Nodes are execution units, edges are routing conditions, and the current state is fully preserved in the graph context.

人在回路

Human-in-the-Loop (HITL)

在 Agent 执行流程中设计人工介入节点，允许在关键决策点暂停执行、等待人工确认后再继续。

Designing human intervention checkpoints in the Agent execution flow — allowing the system to pause at critical decision points, await human confirmation, then resume.

多 Agent 系统

Multi-Agent System (MAS)

由多个独立 Agent 组成的系统，每个 Agent 承担特定职责，通过协作共同完成复杂任务。

A system composed of multiple independent Agents, each with specific responsibilities, collaborating to complete complex tasks together.

低代码平台

Low-Code Platform

通过可视化界面和配置代替代码编写，降低构建 Agent 应用门槛的开发平台。Dify 是代表性产品。

Development platforms that replace hand-written code with visual interfaces and configuration, lowering the barrier to building Agent applications. Dify is a representative example.

条件流转

Conditional Routing

Agent 根据当前执行结果动态决定下一步走哪个节点/分支的能力，是复杂业务逻辑的核心实现机制。

The ability for an Agent to dynamically choose which node/branch to follow next based on current execution results — the core mechanism for implementing complex business logic.

框架无关思维

Framework-Agnostic Thinking

理解 Agent 系统的底层抽象（状态、节点、工具、记忆），不将思维绑定于特定框架的 API，使得在不同框架间切换时能快速适应。

Understanding the underlying abstractions of Agent systems (state, nodes, tools, memory) without binding thinking to a specific framework's API — enabling fast adaptation when switching between frameworks.

并行执行

Parallel Execution

在 Agent 工作流中，让不互相依赖的多个节点同时执行，以减少总响应时间。

Running multiple independent nodes simultaneously within an Agent workflow to reduce total response time.

概念验证

Proof of Concept (PoC)

在正式开发前，用最小投入验证 Agent 设计思路和核心功能是否可行的早期原型。

An early prototype that validates whether an Agent's design concept and core functionality are feasible — before committing to full development.

生产稳定性

Production Stability

系统在真实用户规模和真实流量下持续稳定运行的能力，包括高并发处理、容错、故障恢复等。

A system's ability to run continuously and stably at real user scale and traffic levels — including high-concurrency handling, fault tolerance, and failure recovery.

选型决策

Framework Selection Decision

根据场景复杂度、团队技术深度和系统阶段，选择最合适的 Agent 框架的决策过程。选错框架可能导致数月后系统重写。

The decision process of selecting the most suitable Agent framework based on scenario complexity, team technical depth, and system stage. A wrong choice can force a full rewrite months later.

角色（Agent Role）

Agent Role

在 CrewAI 等多 Agent 框架中，为每个 Agent 定义的职责描述，决定 Agent 的行为边界和专注方向。

In multi-agent frameworks like CrewAI, the responsibility description defined for each Agent — determining its behavioral boundaries and area of focus.

代码执行沙盒

Code Execution Sandbox

Agent 执行生成代码的隔离环境，防止恶意代码或错误代码影响生产系统。AutoGen 的代码执行特性需要配合沙盒使用。

An isolated environment for Agents to execute generated code — preventing malicious or erroneous code from affecting the production system. AutoGen's code execution capability must be paired with a sandbox.
