OpenAI Responses API
The Responses API is OpenAI’s interface for generating model responses and building agents. It is a superset of the Chat Completions API, with additional support for built-in tools, server-side conversation state management, and multi-modal inputs.
Installation
```bash
pip install 'strands-agents[openai]' strands-agents-tools
```

Requires `openai>=2.0.0`. Install or upgrade with `pip install -U openai`.
```bash
npm install @strands-agents/sdk openai
```

After installing dependencies, you can import and initialize the Responses provider as follows:
Python:

```python
from strands import Agent
from strands.models.openai_responses import OpenAIResponsesModel

model = OpenAIResponsesModel(
    model_id="gpt-4o",
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Hello!")
print(response)
```

TypeScript:

```typescript
import { Agent } from '@strands-agents/sdk'
import { OpenAIModel } from '@strands-agents/sdk/models/openai'

const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
})

const agent = new Agent({ model })
const response = await agent.invoke('Hello!')
console.log(response)
```

In TypeScript, both the Responses API and Chat Completions API are available through a single `OpenAIModel` class. The Responses API is selected by default. Pass `api: 'chat'` to use Chat Completions instead.
Amazon Bedrock (Mantle)
The Responses provider can connect to Amazon Bedrock’s OpenAI-compatible endpoints powered by Mantle. Authenticate with a Bedrock API key and point the client at your region’s Mantle endpoint.
Python:

```python
from strands import Agent
from strands.models.openai_responses import OpenAIResponsesModel

region = "us-east-1"
model = OpenAIResponsesModel(
    model_id="openai.gpt-oss-120b",
    client_args={
        "api_key": "<BEDROCK_API_KEY>",
        "base_url": f"https://bedrock-mantle.{region}.api.aws/v1",
    },
)

agent = Agent(model=model)
response = agent("What is 2+2?")
print(response)
```

TypeScript:

```typescript
import { Agent } from '@strands-agents/sdk'
import { OpenAIModel } from '@strands-agents/sdk/models/openai'

const region = 'us-east-1'
const model = new OpenAIModel({
  modelId: 'openai.gpt-oss-120b',
  apiKey: '<BEDROCK_API_KEY>',
  clientConfig: {
    baseURL: `https://bedrock-mantle.${region}.api.aws/v1`,
  },
})

const agent = new Agent({ model })
const response = await agent.invoke('What is 2+2?')
console.log(response)
```

Configuration
Client Configuration
The `client_args` dictionary configures the underlying OpenAI client. For a complete list of available arguments, refer to the OpenAI Python SDK.
The `clientConfig` option configures the underlying OpenAI client. For a complete list of available options, refer to the OpenAI TypeScript SDK.
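For example, here is a minimal sketch (Python) passing common OpenAI client options through `client_args`; `timeout` and `max_retries` are standard OpenAI client constructor arguments, and the values shown are illustrative:

```python
from strands.models.openai_responses import OpenAIResponsesModel

# client_args are forwarded to the underlying OpenAI client constructor.
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    client_args={
        "api_key": "<KEY>",
        "timeout": 60.0,    # request timeout in seconds (illustrative value)
        "max_retries": 3,   # automatic retry count (illustrative value)
    },
)
```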
Model Configuration
The model configuration sets parameters for inference:
Python:

| Parameter | Description | Example | Options |
|---|---|---|---|
| `model_id` | ID of a model to use | `gpt-4o` | reference |
| `params` | Model and tool parameters | `{"tools": [{"type": "web_search"}]}` | reference |
| `stateful` | Enable server-side conversation state | `True` | `True` / `False` |
TypeScript:

| Parameter | Description | Example | Options |
|---|---|---|---|
| `modelId` | ID of a model to use | `'gpt-4o'` | reference |
| `maxTokens` | Maximum tokens to generate | `1000` | reference |
| `temperature` | Controls randomness (0-2) | `0.7` | reference |
| `topP` | Nucleus sampling (0-1) | `0.9` | reference |
| `stateful` | Enable server-side conversation state | `true` | `true` / `false` |
| `params` | Additional parameters (e.g., built-in tools) | `{ tools: [{ type: 'web_search' }] }` | reference |
Built-in Tools
Built-in tools run server-side and are passed via the `params` configuration. They work alongside any function tools registered on the agent, as the sketch below shows.
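A minimal sketch mixing a server-side built-in tool with a local function tool; the `current_time` tool is a hypothetical illustration, not part of the SDK:

```python
from datetime import datetime, timezone

from strands import Agent, tool
from strands.models.openai_responses import OpenAIResponsesModel

@tool
def current_time() -> str:
    """Return the current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat()

# web_search runs on OpenAI's servers; current_time runs locally in your process.
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={"tools": [{"type": "web_search"}]},
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model, tools=[current_time])
response = agent("What time is it now, and what's a current AI headline?")
```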
Web Search
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={"tools": [{"type": "web_search"}]},
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("What are the latest developments in AI?")
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  params: {
    tools: [{ type: 'web_search' }],
  },
})

const agent = new Agent({ model })
const response = await agent.invoke('What are the latest developments in AI?')
```

Web search responses include URL citations that are streamed through the SDK’s citation system.
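To observe those citations as they arrive, one option is to stream the agent’s events; a minimal sketch assuming the standard Strands `stream_async` iterator (the exact shape of citation events is not documented here, so the sketch simply prints non-text events for inspection):

```python
import asyncio

async def main() -> None:
    agent = Agent(model=model)  # the web_search-enabled model from above
    async for event in agent.stream_async("What are the latest developments in AI?"):
        if "data" in event:
            print(event["data"], end="")  # incremental text deltas
        else:
            print(event)  # other events; inspect these for citation payloads

asyncio.run(main())
```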
File Search
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "file_search", "vector_store_ids": ["vs_abc123"]}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("What does the document say about pricing?")
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  params: {
    tools: [
      {
        type: 'file_search',
        vector_store_ids: ['vs_abc123'],
      },
    ],
  },
})

const agent = new Agent({ model })
const response = await agent.invoke('What does the document say about pricing?')
```

File search requires a vector store with uploaded files. Text responses stream correctly; file citation annotations are not yet mapped to the SDK citation schema.
Code Interpreter
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "code_interpreter", "container": {"type": "auto"}}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Calculate the SHA-256 hash of 'hello world'")
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  params: {
    tools: [
      {
        type: 'code_interpreter',
        container: { type: 'auto' },
      },
    ],
  },
})

const agent = new Agent({ model })
const response = await agent.invoke("Calculate the SHA-256 hash of 'hello world'")
```

The model executes code server-side and includes the results in its text response. The executed code and stdout/stderr are not currently surfaced to the caller.
Remote MCP
Section titled “Remote MCP”The mcp built-in tool connects the model to a remote MCP server, letting it call tools hosted externally without any local MCP client setup.
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [
            {
                "type": "mcp",
                "server_label": "deepwiki",
                "server_url": "https://mcp.deepwiki.com/mcp",
                "require_approval": "never",
            }
        ]
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Using deepwiki, what language is the strands-agents/sdk-python repo written in?")
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  params: {
    tools: [
      {
        type: 'mcp',
        server_label: 'deepwiki',
        server_url: 'https://mcp.deepwiki.com/mcp',
        require_approval: 'never',
      },
    ],
  },
})

const agent = new Agent({ model })
const response = await agent.invoke(
  'Using deepwiki, what language is the strands-agents/sdk-typescript repo written in?'
)
```

The model discovers and calls tools exposed by the remote MCP server. The approval flow is not currently surfaced, so `require_approval` must be set to `"never"`.
Shell

The `shell` built-in tool runs shell commands inside a hosted container managed by OpenAI.
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    params={
        "tools": [{"type": "shell", "environment": {"type": "container_auto"}}],
    },
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
response = agent("Use the shell to compute the md5sum of the string 'hello world'.")
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  params: {
    tools: [
      {
        type: 'shell',
        environment: { type: 'container_auto' },
      },
    ],
  },
})

const agent = new Agent({ model })
const response = await agent.invoke(
  'Use the shell to compute the md5sum of the string "hello world".'
)
```

The model executes commands server-side and includes the output in its text response.
Server-side Conversation State
When `stateful` is enabled, the model manages conversation history server-side using OpenAI’s `previous_response_id` mechanism. The agent’s local message history is cleared after each turn, reducing payload size for multi-turn conversations.
Python:

```python
model = OpenAIResponsesModel(
    model_id="gpt-4o",
    stateful=True,
    client_args={"api_key": "<KEY>"},
)

agent = Agent(model=model)
agent("My name is Alice.")
# agent.messages is empty; conversation state is on the server

response = agent("What is my name?")
# The model remembers "Alice" via server-side state
```

TypeScript:

```typescript
const model = new OpenAIModel({
  modelId: 'gpt-4o',
  apiKey: '<KEY>',
  stateful: true,
})

const agent = new Agent({ model })
await agent.invoke('My name is Alice.')
// agent.messages is empty; state is on the server

const response = await agent.invoke('What is my name?')
// The model remembers "Alice" via server-side state
```

When using a stateful model, the agent automatically uses a null conversation manager and throws an error if a conversation manager is also supplied.
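A short sketch of that guard, reusing the stateful `model` from the example above; the exact exception type is not specified in this doc, so the snippet catches broadly as an assumption:

```python
from strands import Agent
from strands.agent.conversation_manager import SlidingWindowConversationManager

# A stateful model keeps history server-side, so a local conversation
# manager conflicts with it and the SDK raises an error at construction.
try:
    agent = Agent(
        model=model,
        conversation_manager=SlidingWindowConversationManager(),
    )
except Exception as err:  # exact exception type is an assumption
    print(f"Rejected as expected: {err}")
```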
Advanced Features
Token Counting
Token counting is used by context management strategies to estimate input tokens before each model call.
The OpenAI Responses provider can use the native `responses.input_tokens.count()` API, which provides exact token counts including messages, instructions, and tool specifications.
You can enable native token counting with:
```python
model = OpenAIResponsesModel(
    model_id="gpt-4.1",
    use_native_token_count=True,
)
```

When disabled (or if the API call fails), the provider falls back to estimation with a character-based heuristic (characters ÷ 4 for text, characters ÷ 2 for JSON).
The OpenAI Responses provider does not currently implement native token counting in the TypeScript SDK. It uses estimation with a character-based heuristic (characters ÷ 4 for text, characters ÷ 2 for JSON).
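A hedged sketch of that fallback heuristic; the function below is a hypothetical illustration of the documented ratios, not the SDK’s actual implementation:

```python
import json

def estimate_tokens(content) -> int:
    """Rough token estimate: chars / 4 for plain text, chars / 2 for JSON."""
    if isinstance(content, str):
        return len(content) // 4  # plain text: ~4 characters per token
    return len(json.dumps(content)) // 2  # JSON content: ~2 characters per token

print(estimate_tokens("What are the latest developments in AI?"))  # text heuristic
print(estimate_tokens({"tools": [{"type": "web_search"}]}))        # JSON heuristic
```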
References
- Python API
- OpenAI Responses API
- Amazon Bedrock Mantle
- OpenAI Chat Completions (alternative provider using the Chat Completions API)