
OpenAI Responses API

The Responses API is OpenAI’s interface for generating model responses and building agents. It is a superset of the Chat Completions API, with additional support for built-in tools, server-side conversation state management, and multi-modal inputs.

OpenAI is configured as an optional dependency in Strands Agents. To install, run:

pip install 'strands-agents[openai]' strands-agents-tools

After installing dependencies, you can import and initialize the OpenAI Responses provider as follows:

from strands import Agent
from strands.models.openai_responses import OpenAIResponsesModel

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Hello!")
print(response)

OpenAIResponsesModel can connect to Amazon Bedrock’s OpenAI-compatible endpoints powered by Mantle. Authenticate with a Bedrock API key and point the client at your region’s Mantle endpoint.

from strands import Agent
from strands.models.openai_responses import OpenAIResponsesModel

region = "us-east-1"

model = OpenAIResponsesModel(
    model_id="openai.gpt-oss-120b",
    client_args={
        "api_key": "<BEDROCK_API_KEY>",
        "base_url": f"https://bedrock-mantle.{region}.api.aws/v1",
    },
)

agent = Agent(model=model)
response = agent("What is 2+2?")
print(response)

The client_args configure the underlying OpenAI client. For a complete list of available arguments, refer to the OpenAI Python SDK.
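Beyond api_key and base_url, any keyword accepted by the OpenAI client constructor can be passed through client_args. As a minimal sketch (timeout and max_retries are documented OpenAI Python SDK client options; the values below are illustrative, not recommendations):

```python
# client_args is forwarded to the underlying OpenAI client constructor.
client_args = {
    "api_key": "<KEY>",
    "timeout": 30.0,   # per-request timeout in seconds
    "max_retries": 3,  # automatic retries on transient errors
}
```

Pass this dict as client_args=client_args when constructing OpenAIResponsesModel.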

The model configuration sets parameters for inference:

| Parameter | Description | Example | Options |
|-----------|-------------|---------|---------|
| model_id | ID of a model to use | gpt-4o | reference |
| params | Model and tool parameters | {"tools": [{"type": "web_search"}]} | reference |
| stateful | Enable server-side conversation state | True | True / False |

Built-in tools run server-side and are passed via the params configuration. They work alongside any function tools registered on the agent.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={"tools": [{"type": "web_search"}]},
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("What are the latest developments in AI?")

Web search responses include URL citations that are streamed through the SDK’s citation system.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "file_search", "vector_store_ids": ["vs_abc123"]}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("What does the document say about pricing?")

File search requires a vector store with uploaded files. Text responses stream correctly; file citation annotations are not yet mapped to the SDK citation schema.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "code_interpreter", "container": {"type": "auto"}}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Calculate the SHA-256 hash of 'hello world'")

The model executes Python code server-side and includes the results in its text response. The executed code and stdout/stderr are not currently surfaced to the caller.

The mcp built-in tool connects the model to a remote MCP server, letting it call tools hosted externally without any local MCP client setup.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [
            {
                "type": "mcp",
                "server_label": "deepwiki",
                "server_url": "https://mcp.deepwiki.com/mcp",
                "require_approval": "never",
            }
        ]
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Using deepwiki, what language is the strands-agents/sdk-python repo written in?")

The model discovers and calls tools exposed by the remote MCP server. The approval flow is not currently surfaced, so require_approval must be set to "never".

The shell built-in tool runs shell commands inside a hosted container managed by OpenAI.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "shell", "environment": {"type": "container_auto"}}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Use the shell to compute the md5sum of the string 'hello world'.")

The model executes commands server-side and includes the output in its text response.

When stateful=True, the model manages conversation history server-side using OpenAI’s previous_response_id mechanism. The agent’s local message history is cleared after each turn, reducing payload size for multi-turn conversations.

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    stateful=True,
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)

agent("My name is Alice.")
# agent.messages is empty; conversation state is on the server

response = agent("What is my name?")
# The model remembers "Alice" via server-side state