知识库 · 第二篇 · 中英双语

LLM 能力边界与幻觉机制
LLM Capability Boundaries and Hallucination Mechanisms

AI 为什么会"一本正经地说错话"?
Why Does AI "Confidently Say the Wrong Thing"?

这是 Agent 产品里最容易被低估的问题。大多数团队在早期 Demo 阶段感觉模型很聪明,然后在上线后被各种奇怪的错误砸中,却不知道为什么。原因只有一个:他们没有搞清楚模型能做什么、不能做什么,以及当它"说错了"的时候,错误是怎么产生的。

This is the most underestimated problem in Agent products. Most teams find the model impressively smart during the early demo phase, then get hit by all kinds of strange errors after launch without knowing why. The reason is singular: they never worked out what the model can and cannot do, and how the errors arise when it "gets things wrong."

这篇文章不讲技术细节,讲的是产品架构师应该建立的那套认知——只有清楚了这些,你才知道在设计系统时,哪里要设护栏、哪里可以放手让 AI 来。

This article skips the technical details and focuses on the mental model a product architect should build. Only once this is clear will you know, when designing the system, where to place guardrails and where you can let the AI run free.

难度:基础-中级 适合:产品架构师 / 负责人 阅读:约 20 分钟 语言:中英双语 / Bilingual
01 · AI 为什么会"一本正经地说错话"
02 · 模型在哪些地方最容易出错
03 · 不同模型表现一样吗?怎么选
04 · 产品层面怎么设"护栏"规避幻觉
05 · 工具调用有多可靠?边界在哪里
01 · 幻觉机制 / Hallucination

AI 为什么会"一本正经地说错话"?

Why Does AI State Incorrect Information with Absolute Confidence?

幻觉(Hallucination)是 AI 领域里听起来玄乎、但其实原理很好理解的概念。搞清楚它怎么产生的,是设计 Agent 系统的第一步。

Hallucination sounds mystical, but the mechanism is easy to understand. Figuring out how it arises is the first step in designing an Agent system.

有人问 ChatGPT:"北京到上海的高铁 G1 次,2024年6月1日还有没有二等座?" AI 给出了非常具体的回答:有余票,价格 553 元,16:28 发车,20:56 到达。

Someone asked ChatGPT: "Are there second-class seats available on the G1 high-speed train from Beijing to Shanghai on June 1, 2024?" The AI gave a very specific answer: tickets available, price ¥553, departure at 16:28, arrival at 20:56.

听起来非常权威。但这些信息是假的——AI 根本无法查询实时余票,它只是生成了一段"看起来像是真实车次信息"的内容。

But all of this was fabricated. The AI cannot check real-time ticket availability — it simply generated content that looked like real train schedule information.

模型的本质是什么?

What Is a Language Model, Fundamentally?

理解幻觉,要先理解模型在做什么。大模型的工作原理,用一句话解释就是:根据前面所有的内容,预测"接下来最可能出现的词"是什么。它不是在查资料,不是在做逻辑推导,也不是在"思考"——它是在做一个极其复杂的概率计算。

To understand hallucination, you first need to understand what the model is doing. In one sentence: a large language model predicts "the most likely next word" based on everything that came before it. It is not looking up information, not doing logical reasoning, and not "thinking" — it is performing an extraordinarily complex probability calculation.

给定"北京到上海高铁G1次,2024年"这串词,接下来最可能跟的是什么?模型学过大量车票信息相关的文本,所以它知道车次信息应该包含价格、发车时间、到达时间……于是它生成了这些——哪怕它并不知道 2024 年 6 月 1 日那天的真实情况。

Given the phrase "G1 high-speed train from Beijing to Shanghai, 2024," what is most likely to follow? The model has been trained on large amounts of ticket-related text, so it knows that train information should include price, departure time, arrival time... and so it generates exactly that — even though it has no idea what the actual situation was on June 1, 2024.
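用一个极简的玩具示例来说明"预测下一个词"是什么意思(分数纯属虚构):模型给每个候选词打分,再把分数变成概率分布,全程没有任何"查询"动作。A toy sketch, with invented numbers, of what "predicting the next word" means: the model scores candidate continuations and turns the scores into a probability distribution; no lookup ever happens.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits.values())
    exps = {tok: math.exp(s - m) for tok, s in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Hypothetical scores a model might assign to candidate continuations
# after a prefix like "G1 高铁……票价" — these numbers are invented.
logits = {"553": 4.2, "550": 3.1, "暂无数据": 0.5, "苹果": -6.0}

probs = softmax(logits)
best = max(probs, key=probs.get)
# The model emits whichever continuation scores highest — "553" wins
# because it *looks* like a plausible fare, not because any database said so.
```

"暂无数据"(no data available)这个诚实的回答概率很低,不是因为它错,而是因为训练语料里车次信息后面很少跟着这句话。The honest answer "no data available" gets a low probability not because it is wrong, but because training text rarely follows train info with that phrase.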

模型生成内容的过程 · 概率预测 vs 查询事实 / Probability Generation vs. Fact Retrieval

人们以为 AI 在做的事:用户提问 → AI 去数据库查询事实 → 返回真实准确的答案。但 AI 并没有"查询"这个动作。
AI 实际在做的事:用户提问 → AI 计算"接下来什么词的概率最高" → 生成"听起来最合理"的内容。内容可能对,也可能只是"听起来像对的"。
幻觉 = 模型生成了"概率上合理但事实上错误"的内容,而它自己并不知道这是错的。

幻觉为什么这么难发现?

Why Is Hallucination So Hard to Detect?

普通的错误,错得很明显——比如把"北京"打成"北蟹",一眼就看出来了。但幻觉不一样:模型生成的错误内容,往往格式正确、语言流畅、语气笃定,和正确答案放在一起根本看不出区别。

Ordinary errors are obvious — like typing "Bejing" instead of "Beijing," you notice it immediately. But hallucination is different: the incorrect content generated by the model is typically well-formatted, fluently written, and stated with complete confidence. Placed next to a correct answer, it looks indistinguishable.

更危险的是:模型没有"我不确定"的内置自觉。它不会在错误答案前加上"我不太确定,但是……"——它会用和说出正确答案时完全一样的语气,给出错误的信息。

What makes this even more dangerous: the model has no built-in sense of uncertainty. It won't preface a wrong answer with "I'm not sure, but..." — it delivers incorrect information in exactly the same confident tone it uses for correct information.

模型不是在"查资料",它是在"生成听起来像是对的内容"。
这两件事看起来像,但本质上完全不同。
The model is not "looking up information" — it is "generating content that sounds correct."
These may look the same from the outside, but they are fundamentally different operations.

幻觉有哪几种?

Types of Hallucination

不是所有"说错了"都是同一种问题,区分清楚才知道怎么应对:

Not all "wrong answers" are the same type of problem. Identifying the category helps you choose the right countermeasure:

幻觉类型 / Type · 表现 / What It Looks Like · 例子 / Example

事实性幻觉 / Factual Hallucination
编造了不存在的事实。例:说某篇论文存在,但实际上根本没有这篇论文。

时效性错误 / Temporal Error
给出的信息是过时的。例:票价、法规条文、人事变动,训练截止之后的内容模型并不知道。

精度幻觉 / Precision Hallucination
数字算错,或给出假精确数字。例:说"这条路线全程 312.7 公里",但这个数字是捏造的。

逻辑跳跃 / Reasoning Gap
推理过程有漏洞,但结论听起来合理。例:推荐路线时跳过了某个重要的换乘节点。

过度自信 / Overconfidence
对不确定的内容给出了确定的答案。例:"这个限号规定每天都执行",但实际有例外情况。

格式幻觉 / Format Hallucination
内容对,但格式是捏造的。例:要求输出 JSON,但某个字段名和约定的不一致。
产品负责人的决策清单 / Product Owner Checklist

在开始设计 Agent 之前,先问自己这几个问题

  • 我的 Agent 需要回答有确定正确答案的问题吗?(比如票价、法规、库存数量)——如果是,必须接入实时数据源,不能依赖模型"背诵" Does my Agent need to answer questions with definitive correct answers (e.g., fares, regulations, inventory)? If so, it must connect to real-time data sources — not rely on the model's "memory."
  • 用户会把 Agent 的回答当作可以直接行动的依据吗?——如果是,需要在高风险输出前加人工确认或数据验证 Will users treat the Agent's answers as direct basis for action? If so, add human confirmation or data validation before high-risk outputs.
  • 如果 Agent 说错了,后果有多严重?——后果越严重,需要的护栏越多 If the Agent is wrong, how severe are the consequences? The more severe, the more guardrails you need.
02 · 能力边界 / Capability Boundaries

模型在哪些地方最容易出错?

Where Are Models Most Likely to Fail?

每个模型都有自己不擅长的事。不了解这些边界,你就会在错误的地方信任它,然后在出问题时感到莫名其妙。

Every model has things it is bad at. If you don't know these boundaries, you will trust it in the wrong places, then be baffled when things go wrong.

问 AI:"如果我 2024年3月15日购买了一张6个月有效期的火车票,有效期截止到哪一天?"大多数模型会说:2024年9月15日。听起来对,对吧?但等等——这是需要数日期的计算,包含月份的天数差异,还要考虑"6个月后"是否刚好落在那一天……

Ask an AI: "If I bought a train ticket with a 6-month validity period on March 15, 2024, what is the expiration date?" Most models will say: September 15, 2024. Sounds right, doesn't it? But wait — this requires counting actual calendar days, accounting for month lengths, and determining whether "6 months later" lands precisely on that date...

这种"看起来简单但其实需要精确推算"的问题,是模型最常翻车的场景之一。更危险的是——它算完了还是很自信,根本不会告诉你"我可能算错了"。

This type of "looks simple but requires precise calculation" problem is one of the most common failure scenarios for models. What makes it more dangerous: the model remains completely confident after calculating — it will never tell you "I might have gotten this wrong."

五个最容易翻车的场景

Five High-Failure Scenarios

高风险 · 01 / HIGH RISK

精确数学计算
Precise Math

加减乘除、日期计算、价格汇总。模型在做这些的时候,是"生成答案"而非"真的算"——复杂计算错误率出乎意料地高。

Basic arithmetic, date math, price totals. The model is "generating an answer," not actually computing — error rates on complex calculations are surprisingly high.

→ 解法:让模型调用计算器工具 / Use a calculator tool
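为什么"给它工具"有效?下面用一个假设性的 Python 示例说明:把"6 个月有效期"这类日期运算交给确定性的代码,同一个问题的答案永远一致,而模型只负责决定何时调用它。A hypothetical Python sketch of why "give it a tool" works: deterministic date arithmetic replaces the model's guess, so the 6-month-validity question from earlier always gets the same answer; the model only decides when to call it.

```python
from datetime import date
import calendar

def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last valid day
    of the target month (e.g. Aug 31 + 1 month -> Sep 30)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# The example from the text: ticket bought 2024-03-15, valid 6 months.
purchase = date(2024, 3, 15)
expiry = add_months(purchase, 6)   # deterministic: 2024-09-15
```

边界情况(月末、闰年)正是模型"凭感觉生成"最容易出错、而代码天然正确的地方。Edge cases (month ends, leap years) are exactly where a model's generated answer drifts and where code is trivially correct.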

高风险 · 02 / HIGH RISK

实时信息
Real-Time Information

今天的天气、实时余票、最新政策、当前油价。模型的知识截止到训练日期,之后发生的事它一无所知,但它可能不会承认这一点。

Today's weather, live ticket availability, latest regulations, current prices. The model's knowledge has a training cutoff — it knows nothing about events after that date, yet may not admit it.

→ 解法:接入实时数据源 / Connect to live data sources

高风险 · 03 / HIGH RISK

超长文本中段
Lost in the Middle

就算模型支持 10 万 token 级别的上下文,放在文本中间位置的信息也常常会被"遗漏"。研究将这种现象称为"迷失在中间"(Lost in the Middle)效应。

Even with 100K-token context windows, information placed in the middle of a long document is frequently overlooked. Research calls this the "Lost in the Middle" effect.

→ 解法:关键信息放开头/结尾;使用 RAG / Place key info at start or end; use RAG

中风险 · 04 / MED RISK

多步逻辑推理
Multi-Step Reasoning

需要连续推理 5 步以上的问题,模型容易在中途走偏,最终结论看起来合理但前提已经错了。

Problems requiring 5+ consecutive reasoning steps often cause the model to drift mid-process — the final conclusion sounds reasonable, but the premises are already wrong.

→ 解法:让模型显式写出每步推理 / Force explicit step-by-step reasoning

中风险 · 05 / MED RISK

小众垂直领域
Niche Domain Knowledge

模型学习的数据以大众内容为主,小众领域(比如特定城市的地方政策、特定行业的专业规范)的知识往往不全甚至有误。

Training data skews toward mainstream content. Niche domain knowledge — local regulations, industry-specific standards — is often incomplete or incorrect.

→ 解法:接入垂直知识库(RAG) / Inject domain knowledge via RAG

"迷失在中间"效应

The "Lost in the Middle" Effect

斯坦福大学的研究发现,当你给模型一个很长的文档时,它对文档开头和结尾的内容记得最清楚,中间部分的信息则明显容易被忽略。

Stanford research found that when given a long document, models remember the content at the beginning and end most clearly, while information in the middle is significantly more likely to be missed.

模型对长文档不同位置的注意力示意 / Model Attention Distribution Across Document Positions

开头:注意力高,记得住 / Beginning: high attention, remembered
中间:注意力明显下降,容易被忽略 / Middle: attention drops noticeably, easily missed
结尾:注意力高,记得住 / End: high attention, remembered

对于 Agent 产品来说,这意味着:如果你把重要的系统指令或关键约束条件塞在一个超长的 Prompt 中间,模型很可能"忘了"它。重要的约束要放在 Prompt 的开头,或者用结构清晰的格式强调。

For Agent products, this means: if you bury critical system instructions or key constraints in the middle of a long prompt, the model will likely "forget" them. Place important constraints at the beginning of the prompt, or use clearly structured formatting to emphasize them.

一条实用的经验法则:对于需要精确性的任务(计算、查询、核实信息),永远不要让模型"自己想"——给它工具,让它调用工具来获取准确信息,而不是依赖它的"记忆"。模型的"记忆"是训练数据,不是真实世界的当前状态。
A practical rule of thumb: For tasks requiring precision (calculations, lookups, fact verification), never let the model "figure it out" — give it tools and let it call those tools for accurate information, rather than relying on its "memory." The model's "memory" is its training data, not the current state of the real world.
03 · 模型选型 / Model Selection

不同模型表现一样吗?Agent 场景怎么选?

Do All Models Perform the Same? How to Choose for Agent Use Cases?

"哪个模型最聪明"是个伪问题。对 Agent 产品来说,真正要问的是:"这个模型在我的具体场景下,稳不稳定?"

某团队在内部测试时发现,GPT-4 在回答复杂问题时表现出色,但在一个需要频繁调用工具、严格按照 JSON 格式输出的 Agent 流程里,它有时候会在 JSON 里多塞一段解释文字,导致下游解析失败。

One team found in internal testing that GPT-4 excelled at complex questions, but in an Agent workflow requiring frequent tool calls and strict JSON output, it would sometimes insert explanatory text inside the JSON — causing downstream parsing failures.

而另一个在"聪明程度"测试上分数略低的模型,在工具调用的格式遵从性上却非常稳定——几乎每次都能给出规范的结构化输出。最后他们换了模型。对 Agent 系统来说,稳定比"聪明"更重要。

A slightly lower-scoring model on intelligence benchmarks was rock-solid on format adherence — producing well-structured output almost every time. They switched models. For Agent systems, consistency beats cleverness.
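针对"JSON 里多塞解释文字"这类问题,一个常见的防御性做法(示意,字段名为虚构)是:即便要求"只输出 JSON",也按"模型可能夹带文字"来解析,并显式校验字段契约。A common defensive pattern against "prose stuffed into JSON" (a sketch; field names are hypothetical): even when you ask for JSON only, parse as if the model may wrap the payload in text, and validate the contract explicitly.

```python
import json
import re

REQUIRED_FIELDS = {"train_no", "price"}   # hypothetical output contract

def parse_model_output(raw: str) -> dict:
    """Tolerate explanatory text around the JSON payload,
    but fail loudly if the contract itself is violated."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    payload = json.loads(match.group(0))
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return payload

# The model was told "JSON only" but added prose anyway:
raw = 'Sure! Here is the result:\n{"train_no": "G1", "price": 553}'
data = parse_model_output(raw)
```

解析失败时抛出异常而不是静默传递,才能触发上游的重试逻辑。Raising instead of silently passing through is what lets upstream retry logic kick in.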

Agent 场景里真正重要的几个维度

What Actually Matters in Agent Scenarios

评估维度 / Dimension · 重要性 / Importance · 具体指的是什么 / What It Means

指令遵从能力 / Instruction Following(⭐⭐⭐⭐⭐ 最重要)
你让它"只输出 JSON,不要解释",它能不能做到?Agent 流程的每一步都依赖这一点。

工具调用准确率 / Function Calling Accuracy(⭐⭐⭐⭐⭐ 最重要)
调用工具时,传入的参数格式对不对、字段有没有缺失、调用的工具选对了吗?

长上下文稳定性 / Long-Context Consistency(⭐⭐⭐⭐)
对话轮次多了之后,有没有"忘记"前面说过的事,或者前后矛盾。

拒绝幻觉能力 / Hallucination Refusal(⭐⭐⭐⭐)
当它不确定时,能不能说"我不知道",而不是编造一个答案。

语言生成质量 / Generation Quality(⭐⭐⭐)
回复是否流畅自然。对最终用户体验有影响,但不是最核心的。

推理能力 / Reasoning Capability(⭐⭐⭐)
能不能处理复杂的多步骤任务,对 Agent 的复杂任务规划很重要。

响应速度 / Latency(⭐⭐)
对时间敏感的场景影响体验,但可以用流式输出(Streaming)来弥补。

主流模型在 Agent 场景下的特点

Major Models for Agent Use Cases

GPT-4o / GPT-4 Turbo

OpenAI

工具调用成熟度高,生态最完善,文档和社区资源最丰富。是目前 Agent 产品的主流选择,但价格相对较高,速度有时不稳定。

Most mature function-calling ecosystem, richest community resources. The mainstream choice for Agent products — but relatively expensive, and latency can be inconsistent.

适合:核心决策链路 / Best for: core decision pipeline

Claude 3.5 / Claude 3 Opus

Anthropic

指令遵从能力强,在需要"严格按格式输出"的场景下表现出色。长上下文处理能力好,幻觉率相对较低。

Strong instruction-following, excellent at strict format compliance. Good long-context handling and relatively lower hallucination rates.

适合:格式遵从性要求高的 Agent / Best for: format-strict Agents

Gemini 1.5 Pro

Google

超长上下文(100万 token)是核心优势,处理超大文档不需要切分。多模态能力强,可以处理图片、视频等。

1M-token context window is the flagship advantage — process massive documents without chunking. Strong multimodal capabilities for image and video inputs.

适合:大文档/多模态场景 / Best for: large docs, multimodal

Qwen、Kimi 等

国产模型 / Chinese Models

中文理解和生成能力强,对国内场景的知识覆盖更好,有些模型针对工具调用做了专项优化。价格通常也更有竞争力。

Superior Chinese language understanding, better coverage of China-specific knowledge. Some are optimized specifically for function calling. Generally more cost-competitive.

适合:中文产品/成本敏感场景 / Best for: Chinese-language, cost-sensitive

不要只用一个模型——"路由策略"是高级玩法

Don't Use Just One Model — "Routing Strategy" Is the Advanced Play

成熟的 Agent 产品不会只用一个模型。更聪明的做法是:根据任务类型,用不同的模型处理不同的子任务。

Mature Agent products don't use just one model. The smarter approach: route different subtasks to different models based on task type.

简单意图识别 → 用小模型

用户说"我要查票"还是"我要退票"——这种简单分类,不需要最贵的模型,用轻量级的模型处理,速度快、成本低。

Simple intent classification — "check tickets" vs. "refund" — doesn't need your most expensive model. A lightweight model handles this faster and cheaper.

复杂任务规划 → 用大模型

帮用户规划多城市行程、处理复杂的退改签策略——这时候用旗舰模型,确保推理质量。

Multi-city itinerary planning, complex refund/change policy handling — use a flagship model here to ensure reasoning quality.

重复性格式化任务 → 用专项优化模型

把工具调用的结果格式化输出给用户——这种高度结构化的任务,可以用针对指令遵从优化的小模型,稳定且便宜。

Formatting tool call results for user display — highly structured tasks like this work great with a small, instruction-optimized model. Stable and cheap.
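上面三条路由策略可以简化成一个查表函数(模型名为虚构占位,替换成你实际部署的模型即可)。The three routes above reduce to a minimal lookup function (model names are hypothetical placeholders; substitute your actual deployments):

```python
# Hypothetical model identifiers — substitute your real deployments.
ROUTES = {
    "intent_classification": "small-fast-model",          # cheap, low-latency
    "trip_planning":         "flagship-model",            # reasoning-heavy
    "result_formatting":     "instruction-tuned-small",   # format-stable
}
DEFAULT_MODEL = "flagship-model"

def pick_model(task_type: str) -> str:
    """Route each subtask to the cheapest model that handles it reliably;
    unknown task types fall back to the strongest model."""
    return ROUTES.get(task_type, DEFAULT_MODEL)
```

兜底到旗舰模型是一个保守的设计选择:宁可偶尔多花钱,也不把未知任务交给能力不足的小模型。Falling back to the flagship model is the conservative choice: occasionally overpaying beats handing an unknown task to an underpowered model.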
选模型的底线原则:在正式上线前,把你最担心的那几个边界场景(最复杂的任务、最可能出错的输入)在候选模型上都跑一遍,看谁更稳定——不是谁看起来更聪明,而是谁在你的场景里更不容易出奇怪的错误。
The baseline principle for model selection: Before going live, run your most-feared edge cases (most complex tasks, most error-prone inputs) across all candidate models. See who is more stable — not who seems smarter, but who produces fewer unexpected failures in your specific scenario.
04 · 护栏设计 / Guardrail Design

产品层面怎么规避幻觉风险?

How to Mitigate Hallucination Risk at the Product Layer?

很多团队想到"降低幻觉",第一反应是"优化 Prompt"。这是必要的,但远远不够。更可靠的方式是在系统架构层面设置护栏——不要假设模型会永远输出正确的内容,而是假设它随时可能出错,然后提前设计好应对方案。

When teams think "reduce hallucination," the first instinct is "optimize the prompt." That is necessary but far from sufficient. The more reliable approach is guardrails at the system architecture level: don't assume the model will always produce correct output; assume it can fail at any moment, and design the response in advance.

银行 ATM 机会核对你的密码,不是因为不信任你,而是因为密码错误是可以发生的,所以需要验证机制。给 Agent 设置护栏,逻辑完全一样:不是因为不信任模型,而是因为幻觉是可以发生的。

A bank ATM verifies your PIN not because it doesn't trust you, but because PIN errors can happen, so a verification mechanism is needed. Adding guardrails to an Agent follows the same logic: not distrust in the model, but recognition that hallucinations can and do happen.

五层护栏,从轻到重

Five Layers of Guardrails, from Light to Heavy

1 · 输入层:控制好"喂给模型的东西"

在把用户请求发给模型之前,先做预处理:过滤掉明显的干扰信息、规范化格式、确保系统提示(System Prompt)里的关键约束在最显眼的位置。垃圾进,垃圾出——输入质量决定输出质量的上限。

Input Layer: Pre-process before sending to the model — filter noise, normalize format, ensure key System Prompt constraints are prominently positioned. Garbage in, garbage out — input quality caps output quality.
2 · Prompt 层:给模型明确的约束和"逃生路线"

除了告诉模型"你要做什么",还要明确告诉它"当你不确定时,应该怎么处理"。比如:"如果你不确定票价是否准确,请说'我需要查询实时数据,请稍候',而不是给出一个可能不准确的数字。"

Prompt Layer: Beyond telling the model "what to do," explicitly tell it "what to do when uncertain." Example: "If you're unsure the ticket price is accurate, say 'I need to check real-time data, one moment' — don't give a potentially wrong number."
3 · 工具层:让模型"查"而不是"记"

所有需要精确性的信息——票价、余票、路况、政策——都通过工具调用(Function Calling)从数据源实时获取,不依赖模型的"知识"。工具就是模型的"外挂手册",让它知道去哪儿查,而不是凭记忆猜。

Tool Layer: All precision-required information — fares, availability, traffic, policies — is fetched in real-time via Function Calling, not from the model's training data. Tools are the model's "cheat sheet" — they tell it where to look, rather than letting it guess from memory.
4 · 输出层:格式校验 + 范围校验

模型的输出在到达下一步之前,先过一道验证:格式校验(JSON 结构是否完整?必填字段有没有?)和范围校验(价格是不是一个合理的数字?日期是不是在有效范围内?)。不合格的输出触发重试,而不是直接传下去。

Output Layer: Before output reaches the next stage, run two checks: format validation (is JSON structure complete? are required fields present?) and range validation (is the price a reasonable number? is the date within valid bounds?). Failing output triggers a retry — not a passthrough.
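"格式校验 + 范围校验"可以用一个很薄的函数示意(字段名和数值边界均为举例,不是真实的业务规则):The format + range checks can be sketched as a thin function (field names and bounds are illustrative, not a real business schema):

```python
def validate_fare_output(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the output may pass.
    A non-empty list should trigger a retry upstream, never a passthrough."""
    problems = []
    # Format checks: required fields present, with the right types.
    if not isinstance(payload.get("price"), (int, float)):
        problems.append("price missing or not numeric")
    if not isinstance(payload.get("depart_date"), str):
        problems.append("depart_date missing or not a string")
    # Range check: is the value plausible at all?
    price = payload.get("price")
    if isinstance(price, (int, float)) and not (0 < price < 10_000):
        problems.append(f"price {price} outside plausible range")
    return problems

ok = validate_fare_output({"price": 553, "depart_date": "2024-06-01"})
bad = validate_fare_output({"price": -1})
```

返回"问题列表"而不是直接抛异常,是为了把所有问题一次性反馈给模型,减少重试轮数。Returning a problem list instead of raising lets you feed all issues back to the model at once, reducing retry rounds.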
5 · 决策层:高风险操作必须人工确认

支付、退票、删除数据——这些不可逆操作,不管模型多么"确定",都必须先给用户看清楚要做什么,用户确认后才执行。这是最后一道护栏,也是最重要的一道。

Decision Layer: Payment, ticket cancellation, data deletion — irreversible actions must always show the user exactly what will happen and require explicit confirmation before executing. This is the last guardrail, and the most important one.
× 没有护栏的系统 / No Guardrails

用户问票价 → 模型直接回答

  • 可能返回过时的训练数据里的价格 / May return stale training-data prices
  • 可能随机生成一个"看起来合理"的数字 / May generate a plausible-sounding number
  • 用户信以为真,到了才发现价格不对 / User trusts it, arrives and finds it's wrong
  • 投诉、退款、信任损失 / Complaints, refunds, trust loss
✓ 有护栏的系统 / With Guardrails

用户问票价 → 触发工具调用

  • 模型识别需要实时数据,调用票务查询工具 / Model triggers live ticket lookup tool
  • 工具返回真实价格,模型基于真实数据回答 / Tool returns real price, model answers from real data
  • 输出通过范围校验(价格是否合理?) / Output passes range validation
  • 给用户展示的是真实且验证过的价格 / User sees real, validated price

有些团队的做法是:让模型输出内容之后,再让同一个模型检查自己的输出对不对。这是行不通的——模型在检查自己的输出时,会倾向于觉得它是对的。就像让一个作者校对自己的文章,他往往看不到自己的错误。

Some teams have the model check its own output for correctness after generating it. This doesn't work — when checking its own output, the model tends to agree with itself. It's like asking an author to proofread their own writing — they rarely catch their own mistakes.

更好的做法:用一个独立的验证机制来检查输出——可以是规则引擎(检查格式、数值范围)、可以是另一个专门做验证的小模型、或者是明确的人工确认节点。验证者和生成者不能是同一个"人"。
A better approach: Use an independent validation mechanism — a rules engine (check format, numeric ranges), a separate small model dedicated to verification, or an explicit human confirmation step. The validator and the generator cannot be the same entity.
产品负责人的决策清单 / Product Owner Checklist

护栏设计的四个必问问题

  • 我们系统里哪些信息是必须实时准确的(票价/余票/状态)?这些必须通过工具调用获取,不能依赖模型知识 Which information in our system must be real-time accurate (fares/availability/status)? These must be retrieved via tool calls — not from model knowledge.
  • 模型的输出在到达用户之前,有没有独立的格式和范围校验?(不能由模型自己验证自己) Is there an independent format and range validation step before model output reaches the user? (Cannot be self-validated by the model.)
  • 当模型给出一个我们没有预期的输出时,系统会怎么处理?(是报错、降级、还是直接传给用户) When the model produces unexpected output, what does the system do? (Error out, degrade gracefully, or pass it to the user?)
  • 有没有定期抽查模型输出,看有没有出现幻觉的情况?(不能假设上线后就没问题了) Do we regularly sample-check model outputs for hallucination? (Cannot assume everything is fine after launch.)
05 · 工具调用 / Function Calling

工具调用有多可靠?边界在哪里?

How Reliable Is Function Calling? Where Are Its Boundaries?

工具调用(Function Calling)是 Agent 系统的核心能力——正是因为模型能调用工具,Agent 才能真正"做事"而不只是"说话"。但很多团队高估了工具调用的可靠性,在边界情况下被坑。

Function Calling is the core capability of an Agent system: precisely because the model can call tools, an Agent can actually "do things" rather than just "talk." But many teams overestimate its reliability and get burned in edge cases.

给模型定义了一个查询订单的工具,要求传入 order_id(字符串)和 user_id(整数)。99% 的时候,模型调用这个工具完全没问题。

A tool is defined for querying orders, requiring order_id (string) and user_id (integer). 99% of the time, the model calls it perfectly.

但有时候,用户的 user_id 包含了字母(比如"USR_001"这种格式),模型有时会自作主张地把它转成数字,或者把两个字段搞混……下游系统收到了错误的参数,查询失败,但错误日志只显示"接口调用异常",根本找不到原因。

But sometimes, when user_id contains letters (like "USR_001"), the model may autonomously convert it to a number, or swap two fields... The downstream system receives wrong parameters, the query fails, and the error log just says "API call exception" — impossible to trace.

工具调用可能出现哪些问题?

What Can Go Wrong with Function Calling?

问题类型 · 01

参数类型错误
Type Mismatch

要求整数,模型传了字符串;要求数组,模型传了单个值。在边界情况下出现率不低。

Expected integer, got string; expected array, got single value. Occurs frequently in edge cases.

问题类型 · 02

工具选择错误
Wrong Tool Selected

当你提供了多个功能相似的工具时,模型可能选错了一个。比如"取消预订"和"退款"是两个工具,模型有时会搞混。

With multiple similar tools, the model may pick the wrong one. "Cancel booking" vs "Refund" are often confused.

问题类型 · 03

多余文字输出
Extraneous Text Output

要求"只返回工具调用,不要额外解释",但模型在调用工具前后加了一段文字,导致解析失败。

Instructed to return only a function call, but the model adds explanatory text — causing downstream parsing to fail.

问题类型 · 04

参数值捏造
Parameter Fabrication

当模型不确定某个参数的值时,它有时会"猜"一个听起来合理的值,而不是说"我没有这个信息"。

When unsure of a parameter value, the model sometimes guesses a plausible-sounding one instead of admitting "I don't have this information."

让工具调用更可靠的五个实践

Five Practices for More Reliable Function Calling

实践 / Practice · 做什么 / What to Do · 为什么有效 / Why It Works

工具描述写清楚 / Clear Tool Descriptions
做什么:每个工具的 description 字段要精确说明用途、参数含义、边界情况。
为什么有效:模型主要靠描述来判断调用哪个工具,描述越清楚,选对的概率越高。

减少相似工具数量 / Minimize Similar Tools
做什么:功能相近的工具合并,或者用清晰的命名区分。
为什么有效:选择越多越容易混淆;保持工具集精简,每个工具职责单一。

参数做强校验 / Strict Parameter Validation
做什么:在工具接收到参数后,先做类型和范围校验,不合规的参数直接报错返回。
为什么有效:让模型知道"这次调用参数不对",它会重新尝试,而不是用错误参数执行。

必填参数不要有默认值 / No Defaults for Required Params
做什么:如果参数是必须的,不要给它设默认值。
为什么有效:有默认值时,模型可能"懒得传";强制必填,逼模型给出真实的参数。

记录完整调用日志 / Full Call Logging
做什么:每次工具调用的参数和返回值都要完整记录。
为什么有效:出了问题时,可以精确复现"模型传了什么参数",不用靠猜。
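把"参数强校验"和"完整调用日志"落到代码上大致是下面这样(工具、字段和校验规则均为虚构示例,对应前文 user_id 被转成数字的故事):A sketch of strict parameter validation plus full call logging in practice (the tool, fields, and rules are hypothetical, echoing the earlier story of user_id being coerced to a number):

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tool_calls")

def get_order(order_id: str, user_id: str) -> dict:
    """Hypothetical order-lookup tool with strict validation.
    Rejecting bad parameters with a reason lets the model retry with
    corrected arguments, instead of running a query that fails silently."""
    errors = []
    if not isinstance(order_id, str) or not order_id:
        errors.append("order_id must be a non-empty string")
    # user_id is an opaque string like "USR_001" — never coerce it to int.
    if not isinstance(user_id, str) or not user_id.startswith("USR_"):
        errors.append("user_id must be a string like 'USR_<n>'")
    # Full call log: exact arguments, so failures can be replayed, not guessed.
    log.info("tool=get_order args=%s errors=%s",
             json.dumps({"order_id": order_id, "user_id": user_id}), errors)
    if errors:
        return {"ok": False, "errors": errors}   # model sees why, and retries
    return {"ok": True, "order": {"id": order_id, "user": user_id}}

result = get_order("ORD42", "USR_001")
```

注意返回结构里带上了 errors:这比只在日志里写"接口调用异常"有用得多。Note the errors field in the return value: far more useful than a log line that just says "API call exception."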

工具描述写得好不好,差距有多大?

How Much Does Tool Description Quality Matter?

这是实际工程里最常被低估的细节之一。来看一个具体对比:

This is one of the most underestimated details in real-world engineering. Here's a concrete comparison:

× 模糊的工具描述 / Vague Description

查询列车 / query_train

description: "查询列车信息"
params: from, to, date

模型不清楚 from/to 是城市名还是车站代码,date 是什么格式,也不知道什么时候该用这个工具。

Model doesn't know if from/to are city names or station codes, what date format to use, or when to call this vs. other tools.

✓ 清晰的工具描述 / Clear Description

search_trains

description: "查询两城市间高铁/动车班次。当用户询问余票、时刻表或想要订票时使用。不适用于已购票查询(用 get_order)"

明确了:① 什么时候用 ② 和其他工具的区别 ③ 参数格式要求

Explicitly defines: ① when to use this tool ② how it differs from other tools ③ required parameter formats
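上面这条"清晰的工具描述",写成 OpenAI 风格的工具定义大致如下(具体封装格式因厂商而异,参数名为示例):The "clear description" card above, written as an OpenAI-style tool definition (the exact envelope varies by provider; parameter names are illustrative):

```python
# OpenAI-style tool definition; the description carries the three pieces
# the text calls out: when to use it, how it differs, parameter formats.
search_trains_tool = {
    "name": "search_trains",
    "description": (
        "Search high-speed/bullet train schedules between two cities. "
        "Use when the user asks about availability, timetables, or wants "
        "to book. NOT for looking up already-purchased tickets "
        "(use get_order for that)."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "from_city": {"type": "string",
                          "description": "Departure city name, e.g. 'Beijing' (not a station code)"},
            "to_city": {"type": "string",
                        "description": "Arrival city name, e.g. 'Shanghai'"},
            "date": {"type": "string",
                     "description": "Travel date in ISO format YYYY-MM-DD"},
        },
        # Per the practice above: required params, no defaults.
        "required": ["from_city", "to_city", "date"],
    },
}
```

描述里显式写出"不适用于已购票查询(用 get_order)",正是为了解决前文"相似工具选错"的问题。Spelling out "NOT for purchased tickets (use get_order)" in the description directly targets the wrong-tool-selected failure mode described earlier.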

ARCHITECT NOTE · 本章核心结论 / Chapter Takeaways

了解模型的能力边界,不是为了对 AI 失望,而是为了把它用在对的地方、以正确的方式使用。

Understanding model capability boundaries isn't about being disappointed in AI — it's about using it in the right places, in the right ways.

模型不擅长的事(精确计算、实时信息、超长文本中段内容)——给它工具,让它查,而不是让它猜。模型容易出错的地方——在架构层面设护栏,不要假设它永远正确。

What models can't do well (precise math, real-time info, middle-of-document content) — give them tools to look it up, not guess. Where models are error-prone — set guardrails at the architecture level, don't assume it's always right.

工具描述写得清楚、参数做强校验、完整记录日志——这三件事做好了,能解决 80% 的工具调用问题。

Clear tool descriptions, strict parameter validation, complete call logging — getting these three things right solves 80% of Function Calling problems.

最终的原则是:对模型的能力持客观态度,既不神话它,也不妖魔化它。知道它能做什么、不能做什么,然后做好系统设计的托底工作。

The ultimate principle: approach model capabilities objectively — neither mythologize nor demonize. Know what it can and cannot do, then design solid system-level safeguards accordingly.

📖 中英词汇对照表

Glossary of Key Terms · AI / Agent / Travel Industry

大语言模型
Large Language Model (LLM)
通过海量文本训练、能理解和生成自然语言的 AI 模型,如 GPT-4、Claude、Gemini 等。
An AI model trained on vast text data that can understand and generate natural language, e.g., GPT-4, Claude, Gemini.
幻觉
Hallucination
模型生成了"概率上合理但事实上错误"的内容,且语气笃定,不自知犯错。
When a model generates content that is probabilistically plausible but factually incorrect, stated with complete confidence.
智能体 / 代理
Agent
能够感知环境、做出决策、调用工具执行任务的 AI 系统,超越单纯对话的 AI 应用形态。
An AI system that perceives its environment, makes decisions, and invokes tools to execute tasks — beyond simple conversation.
工具调用 / 函数调用
Function Calling / Tool Use
模型识别到需要外部能力时,生成结构化调用指令,由系统执行并将结果返回给模型。
When the model recognizes a need for external capabilities, it generates structured call instructions; the system executes and returns results.
系统提示
System Prompt
在对话开始前给模型设定角色、规则、约束的指令,用户通常不可见但全程生效。
Instructions given to the model before conversation begins, setting its role, rules, and constraints — invisible to users but active throughout.
检索增强生成
Retrieval-Augmented Generation (RAG)
在模型回答前先从知识库检索相关内容,将其作为上下文传入,提高回答的准确性和时效性。
Before the model answers, relevant content is retrieved from a knowledge base and passed as context — improving accuracy and timeliness.
上下文窗口
Context Window
模型一次能"看到"并处理的最大文本量,以 token 为单位。超出窗口的内容模型无法访问。
The maximum amount of text a model can "see" and process at once, measured in tokens. Content beyond the window is inaccessible.
令牌 / 词元
Token
模型处理文本的最小单位,大约相当于英文 3/4 个单词或中文 1-2 个字。计费和上下文计算的基本单位。
The smallest unit of text processing for models — roughly 3/4 of an English word or 1-2 Chinese characters. The basic unit for billing and context calculation.
知识截止日期
Training Cutoff / Knowledge Cutoff
模型训练数据的截止时间点,之后发生的事件模型一无所知,需要通过工具或 RAG 补充。
The date after which the model has no knowledge of world events. Real-time data or RAG must be used to supplement post-cutoff information.
迷失在中间效应
Lost in the Middle Effect
研究发现模型对长文档中间部分的注意力明显低于开头和结尾,导致中间关键信息易被遗漏。
Research finding that model attention to the middle sections of long documents is significantly lower than to the beginning and end.
流式输出
Streaming Output
模型边生成边输出,用户不用等全部生成完才看到内容,类似打字机效果,改善响应延迟体验。
Output is delivered token-by-token as the model generates it, like a typewriter effect — improving perceived latency.
微调
Fine-tuning
用特定领域数据对预训练模型进行再训练,使其更好地适应特定任务或风格。
Re-training a pre-trained model on domain-specific data to better adapt it to particular tasks or styles.
提示词工程
Prompt Engineering
通过设计和优化输入提示来引导模型输出期望结果的技术,是目前引导模型行为最常用、成本最低的手段之一。
The practice of designing and optimizing input prompts to guide model outputs toward desired results.
思维链
Chain of Thought (CoT)
让模型在给出最终答案前先写出推理步骤,能显著提升复杂推理任务的准确率。
Prompting the model to write out reasoning steps before the final answer — significantly improves accuracy on complex reasoning tasks.
护栏
Guardrail
在 AI 系统中设置的各种验证、限制和保险机制,防止模型输出错误或有害内容。
Validation, restriction, and safety mechanisms built into an AI system to prevent incorrect or harmful model outputs.
指令遵从
Instruction Following
模型按照给定指令执行任务的能力,是 Agent 场景中最重要的模型能力维度之一。
The model's ability to follow given instructions when executing tasks — one of the most critical capability dimensions for Agent scenarios.
多模态
Multimodal
能处理多种类型输入(文字、图片、音频、视频等)的 AI 模型或系统。
An AI model or system capable of processing multiple input types — text, images, audio, video, etc.
模型路由
Model Routing
根据任务类型和复杂度,将不同子任务分配给不同模型处理的策略,兼顾质量与成本。
A strategy of assigning different subtasks to different models based on task type and complexity — balancing quality and cost.
余票查询
Ticket Availability Check
实时查询指定班次、日期、座位等级的剩余可售票数量。属于必须接入实时数据源的功能。
Real-time query of remaining sellable tickets for a specific route, date, and seat class. Requires live data source integration.
退改签
Refund / Change / Endorsement
对已购机票或火车票进行退票、改期或背书的操作,通常涉及手续费规则,是 Agent 场景的高风险操作。
Operations to refund, reschedule, or endorse purchased tickets — typically involves fee rules and is a high-risk action in Agent systems.
动态定价
Dynamic Pricing
根据需求、时间、剩余座位等因素实时调整票价的机制,航空业普遍使用。
A mechanism that adjusts ticket prices in real-time based on demand, timing, remaining availability, and other factors — prevalent in aviation.
行程单 / 电子客票
Itinerary / E-Ticket
数字化的旅行凭证,包含出行人信息、航班/车次信息、座位信息等,等同于纸质票的法律效力。
A digital travel credential containing passenger info, flight/train details, and seat information — legally equivalent to a paper ticket.
联运 / 中转
Interline / Transfer / Transit
需要乘坐多段交通工具才能完成的旅程,涉及多个承运人或换乘节点的协调。
A journey requiring multiple transportation segments, involving coordination across carriers or transfer points.
舱位等级
Cabin Class / Booking Class
座位服务级别(经济/商务/头等)及航空公司内部用于管理票价的舱位代码(如Y、B、M等)。
Service level (economy/business/first class) and airline internal booking class codes (Y, B, M, etc.) used to manage fares.
全球分销系统
Global Distribution System (GDS)
连接航空公司、酒店、租车公司与旅行社的分销网络,如 Amadeus、Sabre、Travelport。
A distribution network connecting airlines, hotels, and car rentals with travel agencies, e.g., Amadeus, Sabre, Travelport.
机器人流程自动化
Robotic Process Automation (RPA)
通过软件机器人模拟人工操作界面来自动化重复性业务流程,常与 AI Agent 结合使用。
Using software bots to simulate human UI interactions for automating repetitive business processes — often combined with AI Agents.
接口 / 应用程序接口
API (Application Programming Interface)
系统间交换数据和调用功能的标准化接口,Agent 通过 API 调用外部工具(查票、支付、地图等)。
A standardized interface for exchanging data and invoking functions between systems. Agents use APIs to call external tools (ticketing, payment, maps, etc.).
编排 / 工作流
Orchestration / Workflow
协调多个 Agent、工具或服务按特定顺序和逻辑执行复杂任务的机制。
A mechanism for coordinating multiple Agents, tools, or services to execute complex tasks in a specific sequence and logic.
延迟 / 响应时间
Latency / Response Time
从发出请求到收到响应的时间间隔,对实时出行场景(余票查询、导航)体验影响显著。
The time between sending a request and receiving a response. Significantly impacts real-time travel scenarios (availability checks, navigation).
幂等性
Idempotency
对同一请求执行多次与执行一次结果相同的性质,对防止重复支付、重复订单至关重要。
The property where executing the same request multiple times produces the same result as executing it once — critical for preventing duplicate payments and orders.