API 能力 OpenAI Responses API 支持新一代智能体API - 结合对话补全的简洁性与强大的工具调用能力

/v1/responses 请求端点介绍 API易全面支持 OpenAI 最新的 Responses API，这是 2025年3月推出的新一代智能体构建接口。Responses API 结合了 Chat Completions 的简洁性与 Assistants API 的工具使用和状态管理能力，为开发者提供更灵活、更强大的AI应用构建体验。新一代API：Responses API 是 Chat Completions 的超集，提供 Chat Completions 功能的同时，还支持内置工具、状态管理等高级特性。—— 但是它仅支持少数新的 OpenAI 模型，具体请看下文。

🚀 核心特性内置工具支持 Web搜索、文件搜索、代码解释器、函数调用等丰富工具状态管理通过 previous_response_id 维护对话上下文和状态推理保持 O3/O4-mini 推理模型的推理令牌在请求间保持连续完全兼容支持所有支持工具的 GPT-4.1、O3系列模型

📋 支持的模型

推理模型（推荐） O3 系列：o3, o3-pro, o4-mini 特色：推理令牌跨请求保持，提供更智能的上下文理解

对话模型 GPT-4.1 系列：gpt-4.1, gpt-4.1-mini 特色：强大的工具调用和多模态能力模型要求：只有较新的模型才支持 /v1/responses 端点。旧模型如 GPT-3.5 不支持此接口。

🔧 基础用法

简单对话

cURL

Python

Node.js

Copy curl https://api.apiyi.com/v1/responses
-H "Content-Type: application/json"
-H "Authorization: Bearer YOUR_API_KEY"
-d '{ "model": "gpt-4.1", "input": "Hello! How can you help me today?", "instructions": "You are a helpful assistant." }'

实际响应示例基于您的测试结果，以下是完整的响应格式：

Copy { "id": "resp_6884fcab4930819dbbc02f15cbe63f6c0a92c38ff214d10a", "object": "response", "created_at": 1753545899, "status": "completed", "background": false, "error": null, "incomplete_details": null, "instructions": "You are a helpful assistant.", "max_output_tokens": null, "max_tool_calls": null, "model": "gpt-4.1-2025-04-14", "output": [ { "id": "msg_6884fcab8f18819dbcdf349f01b424f80a92c38ff214d10a", "type": "message", "status": "completed", "content": [ { "type": "output_text", "annotations": [], "logprobs": [], "text": "Hello! How can I assist you today?" } ], "role": "assistant" } ], "parallel_tool_calls": true, "previous_response_id": null, "prompt_cache_key": null, "reasoning": { "effort": null, "summary": null }, "safety_identifier": null, "service_tier": "default", "store": true, "temperature": 1.0, "text": { "format": { "type": "text" } }, "tool_choice": "auto", "tools": [], "top_logprobs": 0, "top_p": 1.0, "truncation": "disabled", "usage": { "input_tokens": 19, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 10, "output_tokens_details": { "reasoning_tokens": 0 }, "total_tokens": 29 }, "user": null, "metadata": {} }

📊 请求参数详解

必需参数参数类型说明 model string 模型名称，如 gpt-4.1, o3 input string 用户输入内容

可选参数参数类型默认值说明 instructions string null 系统指令，定义助手行为 previous_response_id string null 上一个响应的 ID，用于维护上下文 temperature float 1.0 控制输出随机性 (0-2) max_output_tokens int null 最大输出令牌数 tools array [] 可用工具列表 tool_choice string ”auto” 工具选择策略 parallel_tool_calls boolean true 是否允许并行工具调用 store boolean true 是否存储对话用于训练 metadata object 自定义元数据

🛠️ 内置工具支持

函数调用

Copy response = client.responses.create( model="gpt-4.1", input="What's the weather like in Beijing?", instructions="You are a helpful weather assistant.", tools=[ { "type": "function", "function": { "name": "get_weather", "description": "Get current weather for a city", "parameters": { "type": "object", "properties": { "city": { "type": "string", "description": "City name" } }, "required": ["city"] } } } ] )

代码解释器

Copy response = client.responses.create( model="gpt-4.1", input="Create a chart showing sales data: Jan:100, Feb:150, Mar:120", instructions="You are a data analyst. Use code interpreter to create visualizations.", tools=[{"type": "code_interpreter"}] )

文件搜索

Copy response = client.responses.create( model="gpt-4.1", input="Search for information about quarterly reports", instructions="You are a document analyst.", tools=[{"type": "file_search"}] )

🔄 状态管理

维护对话上下文

Copy

第一轮对话

response1 = client.responses.create( model="gpt-4.1", input="My name is Alice. Please remember this.", instructions="You are a helpful assistant with good memory." )

第二轮对话 - 使用 previous_response_id 维护上下文

response2 = client.responses.create( model="gpt-4.1", input="What's my name?", instructions="You are a helpful assistant with good memory.", previous_response_id=response1.id )

print(response2.output[0].content[0].text) # 应该回答 "Alice"

多轮工具调用

Copy def multi_turn_conversation(): response_id = None

for user_input in ["What's 2+2?", "Now multiply that by 3", "And divide by 2"]:
    response = client.responses.create(
        model="o3",
        input=user_input,
        instructions="You are a math tutor. Show your reasoning.",
        previous_response_id=response_id,
        tools=[{"type": "code_interpreter"}]
    )

    print(f"User: {user_input}")
    print(f"Assistant: {response.output[0].content[0].text}")

    response_id = response.id  # 保持上下文

📈 推理模型特性

O3/O4-mini 推理保持推理模型在 Responses API 中具有特殊优势：

Copy

使用 O3 进行复杂推理

response = client.responses.create( model="o3", input="Solve this step by step: If a train travels 120km in 2 hours, then speeds up 20% for the next hour, how far did it travel in total?", instructions="Think through this problem step by step, showing all reasoning." )

查看推理过程

reasoning_tokens = response.usage.output_tokens_details.reasoning_tokens print(f"Reasoning tokens used: {reasoning_tokens}")

继续对话，推理上下文会保持

follow_up = client.responses.create( model="o3", input="Now what if the train slowed down 10% in the fourth hour?", previous_response_id=response.id )

🆚 与 Chat Completions 对比特性 Chat Completions Responses API 基础对话 ✅ 支持 ✅ 支持流式响应 ✅ 支持 ✅ 支持函数调用 ✅ 支持 ✅ 增强支持内置工具 ❌ 不支持 ✅ 丰富工具状态管理 ❌ 无状态 ✅ 有状态推理保持 ❌ 不支持 ✅ O3/O4支持文件搜索 ❌ 不支持 ✅ 支持代码解释器 ❌ 不支持 ✅ 支持

迁移示例从 Chat Completions 迁移到 Responses API：

Chat Completions

Responses API

Copy

旧方式

response = client.chat.completions.create( model="gpt-4.1", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"} ] ) content = response.choices[0].message.content

🔧 高级功能

并行工具调用

Copy response = client.responses.create( model="gpt-4.1", input="Get weather for Beijing and Shanghai, then calculate travel time between them", instructions="You are a travel assistant.", parallel_tool_calls=True, tools=[ {"type": "function", "function": {"name": "get_weather", ...}}, {"type": "function", "function": {"name": "calculate_distance", ...}} ] )

输出格式控制

Copy response = client.responses.create( model="gpt-4.1", input="Summarize this data in JSON format", instructions="Always respond in valid JSON.", text={ "format": { "type": "json_object" } } )

推理努力控制（O3系列）

Copy response = client.responses.create( model="o3", input="Solve this complex physics problem", instructions="Think carefully and show detailed reasoning.", reasoning={ "effort": "high" # low, medium, high } )

📊 响应字段详解

核心字段字段类型说明 id string 响应唯一标识符 object string 固定为 “response” created_at integer 创建时间戳 status string 状态：completed/failed/in_progress model string 实际使用的模型版本 output array 输出消息数组 usage object Token 使用统计

输出消息格式

Copy { "id": "msg_xxx", "type": "message", "status": "completed", "content": [ { "type": "output_text", "text": "响应内容", "annotations": [], "logprobs": [] } ], "role": "assistant" }

使用统计

Copy { "usage": { "input_tokens": 19, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 10, "output_tokens_details": { "reasoning_tokens": 0 // 仅推理模型 }, "total_tokens": 29 } }

🚨 错误处理

标准错误格式

Copy { "error": { "type": "invalid_request_error", "code": "model_not_supported", "message": "The model 'gpt-3.5-turbo' is not supported for the responses endpoint.", "param": "model" } }

常见错误错误码说明解决方案 model_not_supported 模型不支持 Responses API 使用支持的新模型 invalid_previous_response_id 无效的上一个响应ID 检查响应ID是否正确 tool_not_available 工具不可用检查工具配置 max_tokens_exceeded 超出令牌限制减少输入或设置max_output_tokens

💡 最佳实践

状态管理策略

Copy class ConversationManager: def init(self, model="gpt-4.1", instructions="You are a helpful assistant."): self.model = model self.instructions = instructions self.last_response_id = None

def send_message(self, input_text, tools=None):
    response = client.responses.create(
        model=self.model,
        input=input_text,
        instructions=self.instructions,
        previous_response_id=self.last_response_id,
        tools=tools or []
    )

    self.last_response_id = response.id
    return response.output[0].content[0].text

def reset_conversation(self):
    self.last_response_id = None

使用示例

conv = ConversationManager() print(conv.send_message("Hello, I'm Alice")) print(conv.send_message("What's my name?")) # 会记住是 Alice

工具调用优化

Copy def smart_tool_calling(user_input): # 根据输入智能选择工具 available_tools = []

if "weather" in user_input.lower():
    available_tools.append(weather_tool)
if "calculate" in user_input.lower():
    available_tools.append(calculator_tool)
if "search" in user_input.lower():
    available_tools.append(search_tool)

response = client.responses.create(
    model="gpt-4.1",
    input=user_input,
    instructions="Use the appropriate tools to help the user.",
    tools=available_tools,
    tool_choice="auto"
)

return response

推理模型优化

Copy def optimized_reasoning(complex_problem): response = client.responses.create( model="o3", input=complex_problem, instructions="Think step by step and show your reasoning process.", reasoning={ "effort": "high" # 对复杂问题使用高推理努力 }, temperature=0.1 # 降低随机性以获得一致结果 )

# 分析推理使用情况
reasoning_tokens = response.usage.output_tokens_details.reasoning_tokens
total_cost = calculate_cost(response.usage)

return {
    "answer": response.output[0].content[0].text,
    "reasoning_tokens": reasoning_tokens,
    "cost": total_cost
}

🔮 未来发展

即将推出的功能完整的 Assistants API 功能集成（2026年上半年）更多内置工具：Web搜索、计算机使用等模型上下文协议 (MCP) 支持增强的多模态能力

迁移时间线现在：可以开始使用 Responses API 2026年上半年：功能对等 Assistants API 2026年：Assistants API 弃用公告 2027年：完全迁移到 Responses API

第一轮对话 ​

第二轮对话 - 使用 previous_response_id 维护上下文 ​

使用 O3 进行复杂推理 ​

查看推理过程 ​

继续对话，推理上下文会保持 ​

旧方式 ​

使用示例 ​

第一轮对话

第二轮对话 - 使用 previous_response_id 维护上下文

使用 O3 进行复杂推理

查看推理过程

继续对话，推理上下文会保持

旧方式

使用示例