strands.models.model
Abstract base class for Agent model providers.
BaseModelConfig
class BaseModelConfig(TypedDict)
Defined in: src/strands/models/model.py:186
Base configuration shared by all model providers.
Attributes:
context_window_limit - Maximum context window size in tokens for the model. This value represents the total token capacity shared between input and output.
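Because the base configuration is a TypedDict, provider-specific configs can extend it with their own fields. A minimal sketch of that pattern, using stand-in names (`MyProviderConfig` and `model_id` are hypothetical here, not part of the documented API):

```python
from typing import TypedDict


class BaseModelConfig(TypedDict, total=False):
    """Mirror of the documented base config shape."""

    # Total token capacity shared between input and output.
    context_window_limit: int


class MyProviderConfig(BaseModelConfig, total=False):
    """Hypothetical provider-specific config extending the base."""

    model_id: str


config: MyProviderConfig = {
    "model_id": "example-model",
    "context_window_limit": 128_000,
}
```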
CacheConfig
@dataclass
class CacheConfig()
Defined in: src/strands/models/model.py:198
Configuration for prompt caching.
Attributes:
strategy - Caching strategy to use.
- “auto”: Automatically detect model support and inject cachePoint to maximize cache coverage.
- “anthropic”: Inject cachePoint in Anthropic-compatible format without a model support check.
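The two strategies can be captured with a `Literal` type. A sketch mirroring the documented dataclass shape (the `Literal` annotation and the `"auto"` default are assumptions, not confirmed by the source):

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class CacheConfig:
    """Mirror of the documented prompt-caching config."""

    # "auto" probes model support before injecting cachePoint;
    # "anthropic" injects the Anthropic-format cachePoint unconditionally.
    strategy: Literal["auto", "anthropic"] = "auto"


cache = CacheConfig(strategy="anthropic")
```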
Model
class Model(abc.ABC)
Defined in: src/strands/models/model.py:210
Abstract base class for Agent model providers.
This class defines the interface for all model implementations in the Strands Agents SDK. It provides a standardized way to configure and process requests for different AI model providers.
stateful
@property
def stateful() -> bool
Defined in: src/strands/models/model.py:218
Whether the model manages conversation state server-side.
Returns:
False by default. Model providers that support server-side state should override this.
context_window_limit
@property
def context_window_limit() -> int | None
Defined in: src/strands/models/model.py:227
Maximum context window size in tokens, or None if not configured.
update_config
@abc.abstractmethod
def update_config(**model_config: Any) -> None
Defined in: src/strands/models/model.py:238
Update the model configuration with the provided arguments.
Arguments:
**model_config- Configuration overrides.
get_config
@abc.abstractmethod
def get_config() -> Any
Defined in: src/strands/models/model.py:248
Return the model configuration.
Returns:
The model’s configuration.
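A common way to satisfy the `update_config`/`get_config` contract is to store a config dict and merge overrides into it. A sketch with a stand-in class (this is an illustration of the contract, not the SDK's actual implementation):

```python
from typing import Any


class ConfigHolderSketch:
    """Illustrates the update_config/get_config contract with a plain dict."""

    def __init__(self, **model_config: Any) -> None:
        self._config: dict[str, Any] = dict(model_config)

    def update_config(self, **model_config: Any) -> None:
        # Overrides are merged into the existing configuration;
        # fields not mentioned are left unchanged.
        self._config.update(model_config)

    def get_config(self) -> dict[str, Any]:
        # Return a copy so callers cannot mutate internal state.
        return dict(self._config)


m = ConfigHolderSketch(model_id="example-model", temperature=0.2)
m.update_config(temperature=0.7)
```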
structured_output
@abc.abstractmethod
def structured_output(
    output_model: type[T],
    prompt: Messages,
    system_prompt: str | None = None,
    **kwargs: Any
) -> AsyncGenerator[dict[str, T | Any], None]
Defined in: src/strands/models/model.py:258
Get structured output from the model.
Arguments:
output_model - The output model to use for the agent.
prompt - The prompt messages to use for the agent.
system_prompt - System prompt to provide context to the model.
**kwargs - Additional keyword arguments for future extensibility.
Yields:
Model events with the last being the structured output.
Raises:
ValidationException - The response format from the model does not match the output_model.
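Since `structured_output` is an async generator whose last event carries the parsed output, callers typically drain the generator and keep the final event. A self-contained sketch with a fake model and stand-in types (the `"output"` event key and event shapes here are assumptions for illustration, not documented guarantees):

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncGenerator


@dataclass
class WeatherReport:
    """Stand-in for a caller-defined output model."""

    city: str
    temp_c: float


async def fake_structured_output(
    output_model: type[WeatherReport],
) -> AsyncGenerator[dict[str, Any], None]:
    # A real provider would stream intermediate model events first.
    yield {"event": "partial"}
    # The final event carries the parsed structured output.
    yield {"output": output_model(city="Berlin", temp_c=21.5)}


async def get_output() -> WeatherReport:
    last: dict[str, Any] = {}
    async for event in fake_structured_output(WeatherReport):
        last = event
    return last["output"]


report = asyncio.run(get_output())
```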
stream
@abc.abstractmethod
def stream(
    messages: Messages,
    tool_specs: list[ToolSpec] | None = None,
    system_prompt: str | None = None,
    *,
    tool_choice: ToolChoice | None = None,
    system_prompt_content: list[SystemContentBlock] | None = None,
    invocation_state: dict[str, Any] | None = None,
    **kwargs: Any
) -> AsyncIterable[StreamEvent]
Defined in: src/strands/models/model.py:279
Stream conversation with the model.
This method handles the full lifecycle of conversing with the model:
- Format the messages, tool specs, and configuration into a streaming request
- Send the request to the model
- Yield the formatted message chunks
Arguments:
messages - List of message objects to be processed by the model.
tool_specs - List of tool specifications to make available to the model.
system_prompt - System prompt to provide context to the model.
tool_choice - Selection strategy for tool invocation.
system_prompt_content - System prompt content blocks for advanced features like caching.
invocation_state - Caller-provided state/context that was passed to the agent when it was invoked.
**kwargs - Additional keyword arguments for future extensibility.
Yields:
Formatted message chunks from the model.
Raises:
ModelThrottledException - When the model service is throttling requests from the client.
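Because `stream` returns an async iterable of formatted chunks, consumers iterate with `async for` and accumulate deltas. A self-contained sketch with a fake stream (the `contentBlockDelta` event shape is an assumption for this sketch, not taken from the source):

```python
import asyncio
from typing import Any, AsyncIterable


async def fake_stream(
    messages: list[dict[str, Any]],
) -> AsyncIterable[dict[str, Any]]:
    """Stand-in for Model.stream; event shapes are illustrative only."""
    for text in ("Hel", "lo"):
        yield {"contentBlockDelta": {"delta": {"text": text}}}


async def collect_text() -> str:
    parts: list[str] = []
    async for event in fake_stream([{"role": "user", "content": "Hi"}]):
        # Pull the text delta out of each chunk, if present.
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        parts.append(delta.get("text", ""))
    return "".join(parts)


text = asyncio.run(collect_text())
```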
count_tokens
async def count_tokens(
    messages: Messages,
    tool_specs: list[ToolSpec] | None = None,
    system_prompt: str | None = None,
    system_prompt_content: list[SystemContentBlock] | None = None
) -> int
Defined in: src/strands/models/model.py:315
Estimate token count for the given input before sending to the model.
Used for proactive context management (e.g., triggering compression at a threshold). Uses tiktoken’s cl100k_base encoding when available, otherwise falls back to a heuristic (characters / 4 for text, characters / 2 for JSON). Accuracy varies by model provider. Not intended for billing or precise quota calculations.
Subclasses may override this method to provide model-specific token counting using native APIs for improved accuracy.
Arguments:
messages - List of message objects to estimate tokens for.
tool_specs - List of tool specifications to include in the estimate.
system_prompt - Plain string system prompt. Ignored if system_prompt_content is provided.
system_prompt_content - Structured system prompt content blocks. Takes priority over system_prompt.
Returns:
Estimated total input tokens.
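The documented fallback heuristic (characters / 4 for text, characters / 2 for JSON) can be sketched as follows. The message/content-block shape assumed below is simplified for illustration; this is not the SDK's exact code:

```python
import json
from typing import Any


def estimate_tokens(messages: list[dict[str, Any]]) -> int:
    """Heuristic fallback: ~4 chars per token for plain text,
    ~2 chars per token for JSON-serialized structured content."""
    total = 0
    for message in messages:
        for block in message.get("content", []):
            if "text" in block:
                total += len(block["text"]) // 4
            else:
                # Structured blocks (e.g. tool use) are denser per token.
                total += len(json.dumps(block)) // 2
    return total


n = estimate_tokens([{"role": "user", "content": [{"text": "a" * 40}]}])
```

Because this is only a heuristic, results should be used for threshold checks (e.g. triggering compression), not billing.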
_ModelPlugin
class _ModelPlugin(Plugin)
Defined in: src/strands/models/model.py:347
Plugin that manages model-related lifecycle hooks.
name
@property
def name() -> str
Defined in: src/strands/models/model.py:351
A stable string identifier for this plugin.
init_agent
def init_agent(agent: "Agent") -> None
Defined in: src/strands/models/model.py:369
Register model lifecycle hooks with the agent.
Arguments:
agent- The agent instance to register hooks with.