OpenAI Agent, Part 2: How Deep Research Works

Table of Contents

  • Technical Principles
  • Similar Open-Source Projects
    • OpenDeepResearcher
    • open-deep-research
    • ollama-deep-researcher
    • smolagents' open_deep_research
  • References

On February 2, OpenAI launched its second agent, Deep Research, whose functionality closely resembles the Deep Research feature that Google Gemini released in November 2024.

Technical Principles

Deep Research is trained with end-to-end reinforcement learning to reason across domains and carry out complex browsing tasks. The core principle of this approach is that the model learns to autonomously plan and execute a multi-step process to find relevant data, including the ability to backtrack and adapt based on real-time information. This lets the model handle tasks such as browsing user-uploaded files, generating and refining plots, and citing web sources.

Similar Open-Source Projects

OpenDeepResearcher

Repository: https://github.com/mshumer/OpenDeepResearcher

This project leans on asyncio and aiohttp for asynchronous requests and responses. Taking it as a concrete example, the deep-research workflow runs as follows (a sketch of the full loop appears after the steps):

  1. Generate several related search queries from the user's research topic:

async def generate_search_queries_async(session, user_query):
    """Ask the LLM to produce up to four precise search queries (in Python list format)
    based on the user's query."""
    prompt = (
        "You are an expert research assistant. Given the user's query, generate up to four distinct, "
        "precise search queries that would help gather comprehensive information on the topic. "
        "Return only a Python list of strings, for example: ['query1', 'query2', 'query3']."
    )
    messages = [
        {"role": "system", "content": "You are a helpful and precise research assistant."},
        {"role": "user", "content": f"User Query: {user_query}\n\n{prompt}"}
    ]
    response = await call_openrouter_async(session, messages)
    if response:
        try:
            # Expect exactly a Python list (e.g., "['query1', 'query2']")
            search_queries = eval(response)
            if isinstance(search_queries, list):
                return search_queries
            else:
                print("LLM did not return a list. Response:", response)
                return []
        except Exception as e:
            print("Error parsing search queries:", e, "\nResponse:", response)
            return []
    return []
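
Note that generate_search_queries_async delegates the actual model call to call_openrouter_async, which the excerpt doesn't show. A minimal sketch of such a helper, assuming OpenRouter's OpenAI-compatible chat-completions endpoint (the constant names and default model below are hypothetical placeholders, not the repo's actual values):

import aiohttp  # the helpers all share one ClientSession, passed in as `session`

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
OPENROUTER_API_KEY = "sk-or-..."               # hypothetical; read from your environment
DEFAULT_MODEL = "anthropic/claude-3.5-sonnet"  # hypothetical default model

async def call_openrouter_async(session, messages):
    """POST chat messages to OpenRouter; return the assistant's reply text, or None on error."""
    headers = {"Authorization": f"Bearer {OPENROUTER_API_KEY}"}
    payload = {"model": DEFAULT_MODEL, "messages": messages}
    try:
        async with session.post(OPENROUTER_URL, json=payload, headers=headers) as resp:
            if resp.status == 200:
                data = await resp.json()
                return data["choices"][0]["message"]["content"]
            print("OpenRouter error:", resp.status, await resp.text())
            return None
    except Exception as e:
        print("Error calling OpenRouter:", e)
        return None

(As an aside, parsing the model's reply with eval executes arbitrary code; ast.literal_eval would be the safer drop-in.)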
  2. For each query, call a search-engine API asynchronously to retrieve the URLs (or text) of relevant web pages:

async def perform_search_async(session, query):
    """Asynchronously perform a Google search using SERPAPI for the given query.
    Returns a list of result URLs."""
    params = {"q": query, "api_key": SERPAPI_API_KEY, "engine": "google"}
    try:
        async with session.get(SERPAPI_URL, params=params) as resp:
            if resp.status == 200:
                results = await resp.json()
                if "organic_results" in results:
                    links = [item.get("link") for item in results["organic_results"] if "link" in item]
                    return links
                else:
                    print("No organic results in SERPAPI response.")
                    return []
            else:
                text = await resp.text()
                print(f"SERPAPI error: {resp.status} - {text}")
                return []
    except Exception as e:
        print("Error performing SERPAPI search:", e)
        return []
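
Because each search is an independent coroutine, the queries from step 1 can be fanned out concurrently with asyncio.gather. A minimal sketch of that pattern (the helper name search_all is ours, not the repo's):

import asyncio

async def search_all(session, queries):
    """Run all SERPAPI searches concurrently and return de-duplicated links,
    preserving first-seen order."""
    batches = await asyncio.gather(*(perform_search_async(session, q) for q in queries))
    seen, links = set(), []
    for batch in batches:
        for link in batch:
            if link not in seen:
                seen.add(link)
                links.append(link)
    return links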
  3. Process each web link:
async def process_link(session, link, user_query, search_query):
    """Process a single link: fetch its content, judge its usefulness,
    and if useful, extract the relevant context."""
    print(f"Fetching content from: {link}")
    page_text = await fetch_webpage_text_async(session, link)
    if not page_text:
        return None
    usefulness = await is_page_useful_async(session, user_query, page_text)
    print(f"Page usefulness for {link}: {usefulness}")
    if usefulness == "Yes":
        context = await extract_relevant_context_async(session, user_query, search_query, page_text)
        if context:
            print(f"Extracted context from {link} (first 200 chars): {context[:200]}")
            return context
    return None
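
Here fetch_webpage_text_async, is_page_useful_async, and extract_relevant_context_async are further helpers not reproduced in this excerpt: the first fetches a page as text, the other two are single-purpose LLM prompts. Given that jina.ai appears in this article's references, the fetcher plausibly goes through Jina's reader endpoint (prefixing a URL with https://r.jina.ai/ returns the page as clean text). A minimal sketch under that assumption:

JINA_BASE_URL = "https://r.jina.ai/"  # Jina reader: prefix a URL to get the page as text
JINA_API_KEY = "jina_..."             # hypothetical; a key raises rate limits

async def fetch_webpage_text_async(session, url):
    """Fetch a web page as plain text via the Jina reader. Returns "" on failure."""
    headers = {"Authorization": f"Bearer {JINA_API_KEY}"}
    try:
        async with session.get(JINA_BASE_URL + url, headers=headers) as resp:
            if resp.status == 200:
                return await resp.text()
            print(f"Jina fetch error for {url}: {resp.status}")
            return ""
    except Exception as e:
        print("Error fetching webpage text:", e)
        return ""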
  4. Use the LLM as a judge: given everything gathered so far, decide whether additional queries are needed to fill gaps:

async def get_new_search_queries_async(session, user_query, previous_search_queries, all_contexts):
    """Based on the original query, the previously used search queries, and all the extracted contexts,
    ask the LLM whether additional search queries are needed. If yes, return a Python list of up to four
    queries; if the LLM thinks research is complete, it should return "<done>"."""
    context_combined = "\n".join(all_contexts)
    prompt = (
        "You are an analytical research assistant. Based on the original query, the search queries performed so far, "
        "and the extracted contexts from webpages, determine if further research is needed. "
        "If further research is needed, provide up to four new search queries as a Python list (for example, "
        "['new query1', 'new query2']). If you believe no further research is needed, respond with exactly <done>."
        "\nOutput only a Python list or the token <done> without any additional text."
    )
    messages = [
        {"role": "system", "content": "You are a systematic research planner."},
        {"role": "user", "content": f"User Query: {user_query}\nPrevious Search Queries: {previous_search_queries}\n\nExtracted Relevant Contexts:\n{context_combined}\n\n{prompt}"}
    ]
    response = await call_openrouter_async(session, messages)
    if response:
        cleaned = response.strip()
        # The sentinel "<done>" signals that research is complete.
        if cleaned == "<done>":
            return "<done>"
        try:
            new_queries = eval(cleaned)
            if isinstance(new_queries, list):
                return new_queries
            else:
                print("LLM did not return a list for new search queries. Response:", response)
                return []
        except Exception as e:
            print("Error parsing new search queries:", e, "\nResponse:", response)
            return []
    return []
  5. Have the LLM write the final report from all the material gathered:
async def generate_final_report_async(session, user_query, all_contexts):
    """Generate the final comprehensive report using all gathered contexts."""
    context_combined = "\n".join(all_contexts)
    prompt = (
        "You are an expert researcher and report writer. Based on the gathered contexts below and the original query, "
        "write a comprehensive, well-structured, and detailed report that addresses the query thoroughly. "
        "Include all relevant insights and conclusions without extraneous commentary."
    )
    messages = [
        {"role": "system", "content": "You are a skilled report writer."},
        {"role": "user", "content": f"User Query: {user_query}\n\nGathered Relevant Contexts:\n{context_combined}\n\n{prompt}"}
    ]
    report = await call_openrouter_async(session, messages)
    return report
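
Putting the five steps together, the top-level driver loops: search, extract, ask the judge whether more queries are needed, then write the report. A minimal sketch of that loop, reusing the search_all helper sketched earlier (the repo's actual driver adds iteration caps and richer logging; details may differ):

import asyncio
import aiohttp

async def research(user_query, max_iterations=3):
    """Run the full deep-research pipeline and return the final report (sketch)."""
    async with aiohttp.ClientSession() as session:
        queries = await generate_search_queries_async(session, user_query)  # step 1
        all_queries, all_contexts = [], []
        for _ in range(max_iterations):
            if not queries:
                break
            all_queries.extend(queries)
            links = await search_all(session, queries)  # step 2: concurrent searches
            # Step 3: judge and extract from every link concurrently.
            # (For simplicity the originating search query is approximated by user_query.)
            contexts = await asyncio.gather(
                *(process_link(session, link, user_query, user_query) for link in links))
            all_contexts.extend(c for c in contexts if c)
            # Step 4: LLM-as-judge decides whether more searching is needed.
            queries = await get_new_search_queries_async(
                session, user_query, all_queries, all_contexts)
            if queries == "<done>":
                break
        return await generate_final_report_async(session, user_query, all_contexts)  # step 5

# Usage:
# report = asyncio.run(research("the state of solid-state battery research"))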

open-deep-research

Repository: https://github.com/nickscamara/open-deep-research

An AI application written in TypeScript; it uses Firecrawl to search the web and extract page content, and a fine-tuned o3 model for deep reasoning.
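
The project itself is TypeScript, but the extract-and-search idea carries over directly. A minimal sketch in Python, assuming the firecrawl-py SDK (FirecrawlApp, search, and scrape_url are its documented entry points, though exact signatures and return shapes vary across SDK versions; the API key is hypothetical):

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-...")  # hypothetical API key

# Search the web for sources on a topic...
results = app.search("openai deep research agent")

# ...then scrape a chosen page into LLM-ready text and hand it
# to a reasoning model for multi-step analysis.
page = app.scrape_url("https://openai.com/index/introducing-deep-research/")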

ollama-deep-researcher

Repository: https://github.com/langchain-ai/ollama-deep-researcher

Built on LangGraph; it uses Ollama to run the 8B version of DeepSeek-R1 locally as the deep reasoning model, with the Tavily or Perplexity search APIs for web search.
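
A minimal sketch of the two building blocks, assuming the ollama and tavily-python client libraries (this is not the repo's actual LangGraph code; the model tag follows Ollama's registry naming):

import ollama
from tavily import TavilyClient

# Reasoning step: chat with the locally served DeepSeek-R1 8B model.
reply = ollama.chat(
    model="deepseek-r1:8b",
    messages=[{"role": "user", "content": "What should I search for next about topic X?"}],
)
print(reply["message"]["content"])

# Search step: query Tavily's web-search API.
tavily = TavilyClient(api_key="tvly-...")  # hypothetical key
results = tavily.search("topic X recent developments")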

smolagents' open_deep_research

Repository: https://github.com/huggingface/smolagents/tree/main/examples/open_deep_research

A deep research agent built on Hugging Face's smolagents framework.
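
For flavor, the library's canonical minimal agent looks like the sketch below; the actual open_deep_research example layers a text-browser tool stack and file inspectors on top of this, so treat it as the smolagents hello-world rather than the example's real configuration:

from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A code-writing agent with one web-search tool attached.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("Collect recent findings on X and summarize them with sources.")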

References

https://openai.com/index/introducing-deep-research/
https://jina.ai/
https://openrouter.ai/

