strands.vended_plugins.context_offloader.plugin
ContextOffloader plugin for managing large tool outputs.
This module provides the ContextOffloader plugin that intercepts oversized tool results, persists each content block to a storage backend, and replaces the in-context result with a truncated preview and per-block references.
Example:
    from strands import Agent
    from strands.vended_plugins.context_offloader import (
        ContextOffloader,
        InMemoryStorage,
        FileStorage,
    )

    # In-memory storage
    agent = Agent(plugins=[
        ContextOffloader(storage=InMemoryStorage())
    ])

    # File storage with custom thresholds and retrieval tool enabled
    agent = Agent(plugins=[
        ContextOffloader(
            storage=FileStorage("./artifacts"),
            max_result_tokens=5_000,
            preview_tokens=2_000,
            include_retrieval_tool=True,
        )
    ])

LineRange
    class LineRange(TypedDict)

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:56
A span of lines to retrieve (1-indexed, inclusive).
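As a rough illustration of the 1-indexed, inclusive convention, the sketch below maps such a range onto Python's 0-indexed, end-exclusive slicing. The `start` and `end` keys follow the `line_range` description under `retrieve_offloaded_content`; `LineRangeSketch` and `select_lines` are hypothetical names used only for this example and are not part of the plugin.

```python
from typing import TypedDict


class LineRangeSketch(TypedDict):
    """Hypothetical stand-in for LineRange: 1-indexed, inclusive bounds."""

    start: int
    end: int


def select_lines(text: str, line_range: LineRangeSketch) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for the requested span."""
    lines = text.splitlines()
    # Convert 1-indexed inclusive bounds to a 0-indexed, end-exclusive range,
    # clamping to the lines that actually exist.
    first = max(line_range["start"], 1) - 1
    last = min(line_range["end"], len(lines))
    return [(i + 1, lines[i]) for i in range(first, last)]


sample = "alpha\nbeta\ngamma\ndelta"
print(select_lines(sample, {"start": 2, "end": 3}))
# [(2, 'beta'), (3, 'gamma')]
```

Clamping out-of-range bounds instead of raising is a choice made for this sketch; the plugin may handle them differently.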
ContextOffloader
    class ContextOffloader(Plugin)

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:73
Plugin that offloads oversized tool results to reduce context consumption.
When a tool result exceeds the configured token threshold, this plugin stores each content block individually to a storage backend and replaces the in-context result with a truncated text preview plus per-block references.
Token estimation uses the count_tokens method of the agent's model, which leverages tiktoken when available and falls back to character-based heuristics.
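To make the character-based fallback concrete, here is a minimal sketch of a chars/4 estimate and a matching preview slice. The ratio of four characters per token comes from the `preview_tokens` description under `__init__`; the function names and the choice to round up are assumptions for illustration, not the plugin's actual code, which delegates to the model's `count_tokens`.

```python
import math


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate: about four characters per token (assumption)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def preview_by_tokens(text: str, preview_tokens: int, chars_per_token: int = 4) -> str:
    """Keep roughly the first `preview_tokens` tokens of the text."""
    return text[: preview_tokens * chars_per_token]


result = "x" * 12_000
print(estimate_tokens(result))                 # 3000
print(len(preview_by_tokens(result, 1_000)))   # 4000
```

A heuristic like this is cheap but approximate, so results near the threshold may be classified differently than with an exact tokenizer.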
Content type handling:
- Text: stored as text/plain, replaced with a preview
- JSON: stored as application/json, replaced with a preview
- Image: stored in its native format (e.g., image/png), replaced with a placeholder showing format and size
- Document: stored in its native format (e.g., application/pdf), replaced with a placeholder showing format, name, and size
- Unknown types: passed through unchanged
The plugin operates proactively at tool execution time via AfterToolCallEvent, before the result enters the conversation. SlidingWindowConversationManager, by contrast, truncates reactively after context overflow.
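The offload decision described above can be sketched for the text-only case as follows. This is a simplified illustration under stated assumptions: a plain dict stands in for the storage backend, tokens are estimated with a chars/4 heuristic, and the `ref_N` reference and placeholder wording are hypothetical. The default thresholds (2,500 and 1,000) are the ones documented for `__init__`.

```python
def estimate_tokens(text: str) -> int:
    """Character-based estimate (assumption: about 4 characters per token)."""
    return len(text) // 4


def offload_if_large(
    text: str,
    storage: dict[str, str],
    max_result_tokens: int = 2_500,
    preview_tokens: int = 1_000,
) -> str:
    """Return text unchanged, or a preview plus reference if it is too large."""
    if estimate_tokens(text) <= max_result_tokens:
        return text  # small enough to stay in context as-is

    # Hypothetical reference format; the plugin's real references differ.
    reference = f"ref_{len(storage) + 1}"
    storage[reference] = text

    # Keep only the head of the result in context.
    preview = text[: preview_tokens * 4]
    return f"{preview}\n[offloaded {len(text)} chars, reference: {reference}]"


store: dict[str, str] = {}
small = offload_if_large("ok", store)
large = offload_if_large("x" * 20_000, store)
print(small, len(store), large[-40:])
```

Because the check happens when the tool returns, the full result never enters the conversation; only the preview and reference do.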
Arguments:
- storage - Backend for storing offloaded content (required).
- max_result_tokens - Offload results whose estimated token count exceeds this threshold.
- preview_tokens - Number of tokens to keep as a text preview in context.
- include_retrieval_tool - Whether to register the retrieve_offloaded_content tool. Defaults to True.
Example:
    from strands import Agent
    from strands.vended_plugins.context_offloader import ContextOffloader, InMemoryStorage

    agent = Agent(plugins=[
        ContextOffloader(storage=InMemoryStorage())
    ])

__init__
    def __init__(storage: Storage,
                 max_result_tokens: int = _DEFAULT_MAX_RESULT_TOKENS,
                 preview_tokens: int = _DEFAULT_PREVIEW_TOKENS,
                 *,
                 include_retrieval_tool: bool = True) -> None

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:117
Initialize the ContextOffloader plugin.
Arguments:
- storage - Backend for storing offloaded content.
- max_result_tokens - Offload results whose estimated token count exceeds this threshold. Defaults to _DEFAULT_MAX_RESULT_TOKENS (2,500).
- preview_tokens - Number of tokens to keep as a text preview in context. Uses tiktoken for exact slicing when available and falls back to a chars/4 heuristic otherwise. Defaults to _DEFAULT_PREVIEW_TOKENS (1,000).
- include_retrieval_tool - Whether to register the retrieve_offloaded_content tool so the agent can fetch offloaded content. Defaults to True.
Raises:
- ValueError - If max_result_tokens is not positive, preview_tokens is negative, or preview_tokens >= max_result_tokens.
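The three conditions can be read as a small validation routine. The sketch below is one way to express them; `validate_thresholds` and the error messages are hypothetical and may not match the plugin's own checks or wording.

```python
def validate_thresholds(max_result_tokens: int, preview_tokens: int) -> None:
    """Raise ValueError for the invalid combinations listed above."""
    if max_result_tokens <= 0:
        raise ValueError("max_result_tokens must be positive")
    if preview_tokens < 0:
        raise ValueError("preview_tokens must not be negative")
    if preview_tokens >= max_result_tokens:
        # A preview as large as the threshold would defeat the offload.
        raise ValueError("preview_tokens must be less than max_result_tokens")


validate_thresholds(2_500, 1_000)  # documented defaults pass

try:
    validate_thresholds(1_000, 1_000)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Note that a preview of zero tokens is allowed by these rules, which leaves only the reference in context.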
init_agent
    def init_agent(agent: Agent) -> None

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:178
Conditionally register the retrieval tool and bind storage.
retrieve_offloaded_content
    @tool(context=True)
    async def retrieve_offloaded_content(
            reference: str,
            tool_context: ToolContext,
            pattern: str | None = None,
            line_range: LineRange | None = None,
            context_lines: int | None = None) -> dict | str

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:195
Retrieve offloaded content by reference.
When a tool result was too large to keep in context, it was stored externally and replaced with a preview and a reference. Use this tool with that reference to access the stored content.
Returns:
- With pattern: matching lines with line numbers and surrounding context
- With line_range: the specified span of lines with line numbers
- Without pattern/line_range: the full original content (use sparingly, since this re-injects all tokens)
Constraints:
- pattern/line_range/context_lines only work on text content. For binary content, omit them.
- Line numbers in results are 1-indexed and can be used in follow-up line_range calls.
Examples:
- {"reference": "ref_1", "pattern": "error"} -> lines containing "error" with 5 lines context
- {"reference": "ref_1", "pattern": "error|warning", "context_lines": 3} -> regex, 3 lines context
- {"reference": "ref_1", "line_range": {"start": 10, "end": 25}} -> lines 10-25
- {"reference": "ref_1", "pattern": "TODO", "line_range": {"start": 1, "end": 50}} -> search within range
Arguments:
- reference - The reference string from the offload placeholder (e.g. "mem_1_tool-123_0").
- pattern - Regex or keyword to grep for. Returns only matching lines with context, not the full content.
- line_range - Return only this span of lines. A dict with 'start' and 'end' keys (1-indexed). Combine with pattern to search within the range.
- context_lines - Lines before AND after each match (like grep -C). Default: 5. Without pattern/line_range, returns the first N lines.
- tool_context - Injected by the framework. Not user-facing.
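The pattern search with surrounding context behaves much like grep -C. The following sketch shows one way such a search could work on stored text, using Python's standard `re` module. `grep_with_context` and its output format are hypothetical and are not the tool's implementation; only the 1-indexed line numbers and the default of 5 context lines are taken from the description above.

```python
import re


def grep_with_context(text: str, pattern: str, context_lines: int = 5) -> list[str]:
    """Return numbered lines that match `pattern`, plus nearby context."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep: set[int] = set()
    for index, line in enumerate(lines):
        if regex.search(line):
            # Collect the match and its neighbours, clamped to the text bounds.
            low = max(index - context_lines, 0)
            high = min(index + context_lines, len(lines) - 1)
            keep.update(range(low, high + 1))
    # Emit 1-indexed line numbers so they can be reused in a line_range call.
    return [f"{i + 1}: {lines[i]}" for i in sorted(keep)]


log = "\n".join(["start", "load config", "error: missing key", "retry", "done"])
for row in grep_with_context(log, "error|warning", context_lines=1):
    print(row)
# 2: load config
# 3: error: missing key
# 4: retry
```

Collecting indices in a set merges overlapping context windows, so a line near two matches is returned only once.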