# Agentic RAG

Agentic RAG fuses an agent's planning ability with RAG's retrieval ability, letting the system decide on its own when to retrieve, what to retrieve, and how to integrate the retrieved information to answer complex questions.
## From Traditional RAG to Agentic RAG

```mermaid
graph TB
    subgraph Traditional["Traditional RAG"]
        A1[User question] --> A2[Vector search]
        A2 --> A3[Assemble context]
        A3 --> A4[LLM generation]
    end
    subgraph Agentic["Agentic RAG"]
        B1[User question] --> B2[Agent planning]
        B2 --> B3{Retrieval needed?}
        B3 -->|yes| B4[Pick retrieval tool]
        B3 -->|no| B5[Answer directly]
        B4 --> B6[Multi-round retrieval]
        B6 --> B7{Sufficient information?}
        B7 -->|no| B2
        B7 -->|yes| B8[Synthesize answer]
    end
    style B2 fill:#e3f2fd,stroke:#1976d2,stroke-width:3px
    style B7 fill:#fff3e0,stroke:#f57c00,stroke-width:2px
```
## Comparison

| Dimension | Traditional RAG | Agentic RAG |
|---|---|---|
| Retrieval decision | Always retrieves | Decided dynamically |
| Retrieval rounds | Single pass | Multi-round, iterative |
| Tool use | Vector search only | Multiple tools (search, SQL, APIs) |
| Query rewriting | Simple rewrites | Autonomous decomposition by the agent |
| Quality control | None | Self-reflection and verification |
| Best fit | Simple Q&A | Complex multi-step research |
| Complexity | Low | High |
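The structural difference in the table boils down to control flow. Here is a minimal sketch of the contrast; `retriever`, `llm`, `needs_retrieval`, and `is_sufficient` are hypothetical stand-ins for real components, not an actual API:

```python
def traditional_rag(query, retriever, llm):
    # Fixed pipeline: always retrieve exactly once, then generate.
    docs = retriever(query)
    return llm(f"Context: {docs}\nQuestion: {query}")


def agentic_rag(query, retriever, llm, needs_retrieval, is_sufficient,
                max_rounds=3):
    # The agent decides whether to retrieve at all, and keeps retrieving
    # until the evidence looks sufficient (or a round limit is hit).
    docs = []
    if needs_retrieval(query):
        for _ in range(max_rounds):
            docs += retriever(query)
            if is_sufficient(query, docs):
                break
    return llm(f"Context: {docs}\nQuestion: {query}")
```

The traditional pipeline is a straight line; the agentic version wraps retrieval in two decision points, which is exactly what the rest of this chapter implements in full.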
## Implementing Agentic RAG

```python
"""
Agentic RAG engine
"""
import json
from dataclasses import dataclass, field
from enum import Enum
from abc import ABC, abstractmethod


class ToolType(Enum):
    VECTOR_SEARCH = "vector_search"
    KEYWORD_SEARCH = "keyword_search"
    SQL_QUERY = "sql_query"
    WEB_SEARCH = "web_search"
    CALCULATOR = "calculator"


@dataclass
class RetrievalResult:
    """A single retrieval hit."""
    source: str
    content: str
    relevance: float
    tool_used: ToolType


@dataclass
class AgentStep:
    """One step in the agent's execution trace."""
    thought: str
    action: str
    result: str
    step_num: int


@dataclass
class AgenticRAGState:
    """Mutable state of an Agentic RAG run."""
    query: str
    sub_questions: list[str] = field(default_factory=list)
    retrieved_docs: list[RetrievalResult] = field(default_factory=list)
    steps: list[AgentStep] = field(default_factory=list)
    is_sufficient: bool = False
    final_answer: str = ""


class RetrievalTool(ABC):
    """Base class for retrieval tools."""

    @abstractmethod
    def search(self, query: str, top_k: int = 5) -> list[RetrievalResult]:
        ...


class VectorSearchTool(RetrievalTool):
    """Vector-index retrieval tool."""

    def __init__(self, index, embedder):
        self.index = index
        self.embedder = embedder

    def search(self, query: str, top_k: int = 5) -> list[RetrievalResult]:
        embedding = self.embedder.encode(query)
        results = self.index.search(embedding, top_k)
        return [
            RetrievalResult(
                source=r["id"],
                content=r["text"],
                relevance=r["score"],
                tool_used=ToolType.VECTOR_SEARCH,
            )
            for r in results
        ]


class AgenticRAGEngine:
    """Agentic RAG engine: plan, retrieve iteratively, then synthesize."""

    MAX_ITERATIONS = 5

    def __init__(self, llm_client, tools: dict[str, RetrievalTool]):
        self.llm = llm_client
        self.tools = tools

    def answer(self, query: str) -> AgenticRAGState:
        """Answer a complex question."""
        state = AgenticRAGState(query=query)

        # Step 1: analyze the question and plan sub-questions.
        plan = self._plan(query)
        state.sub_questions = plan["sub_questions"]

        # Step 2: retrieve iteratively until the agent decides to answer.
        for iteration in range(self.MAX_ITERATIONS):
            action = self._decide_action(state)
            if action["type"] == "search":
                results = self._execute_search(
                    action["tool"], action["query"]
                )
                state.retrieved_docs.extend(results)
                state.steps.append(AgentStep(
                    thought=action["reasoning"],
                    action=f"search:{action['tool']}",
                    result=f"found {len(results)} results",
                    step_num=iteration + 1,
                ))
            elif action["type"] == "answer":
                state.is_sufficient = True
                break

        # Step 3: synthesize the final answer from the collected evidence.
        state.final_answer = self._synthesize(state)
        return state

    def _plan(self, query: str) -> dict:
        """Plan the retrieval strategy by decomposing the question."""
        prompt = (
            "Analyze the question below, decompose it into sub-questions, "
            "and choose a retrieval strategy.\n"
            f"Question: {query}\n"
            f"Available tools: {list(self.tools.keys())}\n"
            'Output JSON: {"sub_questions": [...], "strategy": "..."}'
        )
        result = self.llm.generate(prompt)
        try:
            plan = json.loads(result)
            if plan.get("sub_questions"):
                return plan
        except (json.JSONDecodeError, TypeError):
            pass
        # Fall back to treating the whole query as one sub-question.
        return {"sub_questions": [query], "strategy": "iterative"}

    def _decide_action(self, state: AgenticRAGState) -> dict:
        """Decide the next action: search with some tool, or answer."""
        # Crude sufficiency check: stop once enough evidence is collected.
        if len(state.retrieved_docs) >= 10:
            return {"type": "answer"}
        # Pick a tool not yet used (a simplification: one search per tool).
        used = {s.action for s in state.steps}
        for sq in state.sub_questions:
            for tool_name in self.tools:
                if f"search:{tool_name}" not in used:
                    return {
                        "type": "search",
                        "tool": tool_name,
                        "query": sq,
                        "reasoning": f"search with {tool_name}: {sq}",
                    }
        return {"type": "answer"}

    def _execute_search(self, tool_name: str, query: str) -> list[RetrievalResult]:
        """Run one search with the named tool."""
        tool = self.tools.get(tool_name)
        if tool is None:
            return []
        return tool.search(query, top_k=5)

    def _synthesize(self, state: AgenticRAGState) -> str:
        """Synthesize the final answer from the retrieved evidence."""
        context = "\n".join(
            f"[{r.source}] ({r.tool_used.value}, relevance {r.relevance:.2f}) "
            f"{r.content[:200]}"
            for r in sorted(
                state.retrieved_docs, key=lambda x: x.relevance, reverse=True
            )[:10]
        )
        prompt = (
            "Answer the question using the retrieval results below.\n"
            f"Question: {state.query}\n"
            f"Results:\n{context}\n"
            "Give an accurate, complete answer. If the evidence is "
            "insufficient, say which parts need more research."
        )
        return self.llm.generate(prompt)
```
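The sufficiency check in `_decide_action` above is a crude document count. A common refinement is to delegate the self-reflection step to the LLM itself. Here is a sketch under the same assumption as the engine, namely an `llm_client` object exposing a `generate(prompt) -> str` method:

```python
def check_sufficiency(llm_client, query: str, docs: list[str]) -> bool:
    """Ask the LLM whether the retrieved evidence can answer the query."""
    evidence = "\n".join(f"- {d[:200]}" for d in docs)
    prompt = (
        "Can the question be fully answered from the evidence alone?\n"
        f"Question: {query}\n"
        f"Evidence:\n{evidence}\n"
        "Reply with exactly YES or NO."
    )
    # Normalize the reply so minor formatting variations still parse.
    reply = llm_client.generate(prompt).strip().upper()
    return reply.startswith("YES")
```

A result of `False` would send the loop back to planning for another retrieval round, matching the "Sufficient information?" decision node in the diagram at the start of the chapter.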
## Typical Application Scenarios

```mermaid
graph LR
    A[Agentic RAG scenarios] --> B[Enterprise research assistant]
    A --> C[Technical Q&A system]
    A --> D[Legal and regulatory lookup]
    A --> E[Financial analysis reports]
    B --> B1[Multi-source integration<br/>and report generation]
    C --> C1[Joint retrieval over<br/>code, docs, and APIs]
    D --> D1[Combined analysis of statutes,<br/>cases, and interpretations]
    E --> E1[Correlating filings, news,<br/>and announcements]
    style A fill:#e8eaf6,stroke:#3f51b5,stroke-width:3px
```
## Chapter Summary

| Topic | Key point |
|---|---|
| Core difference | The agent controls retrieval decisions; it is not a fixed pipeline |
| Multi-round retrieval | Search iteratively until the information is sufficient |
| Tool routing | Choose different retrieval tools based on the question type |
| Self-reflection | Check information sufficiency, then continue or answer |
| Best fit | Complex research questions rather than simple Q&A |
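To make the tool-routing row concrete, here is a minimal heuristic router. Real systems usually let the LLM pick the tool, as the engine in this chapter does; the keyword rules below are purely illustrative assumptions:

```python
def route_tool(query: str) -> str:
    """Pick a retrieval tool name based on simple query features."""
    q = query.lower()
    # Aggregation wording suggests structured data: route to SQL.
    if any(k in q for k in ("sum", "average", "count", "total")):
        return "sql_query"
    # Recency wording suggests fresh information: route to the web.
    if any(k in q for k in ("latest", "today", "news")):
        return "web_search"
    # Quoted phrases favor exact-match keyword search.
    if '"' in query:
        return "keyword_search"
    # Default to semantic search over the vector index.
    return "vector_search"
```

The returned names mirror the `ToolType` values used by the engine, so a router like this could feed directly into its `tools` dictionary.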
Next chapter: Code Agents