strands.vended_plugins.context_offloader.plugin

ContextOffloader plugin for managing large tool outputs.

This module provides the ContextOffloader plugin that intercepts oversized tool results, persists each content block to a storage backend, and replaces the in-context result with a truncated preview and per-block references.

Example:

from strands import Agent
from strands.vended_plugins.context_offloader import (
    ContextOffloader,
    InMemoryStorage,
    FileStorage,
)

# In-memory storage
agent = Agent(plugins=[
    ContextOffloader(storage=InMemoryStorage())
])

# File storage with custom thresholds and retrieval tool enabled
agent = Agent(plugins=[
    ContextOffloader(
        storage=FileStorage("./artifacts"),
        max_result_tokens=5_000,
        preview_tokens=2_000,
        include_retrieval_tool=True,
    )
])

LineRange

class LineRange(TypedDict)

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:56

A span of lines to retrieve (1-indexed, inclusive).

ContextOffloader

class ContextOffloader(Plugin)

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:73

Plugin that offloads oversized tool results to reduce context consumption.

When a tool result exceeds the configured token threshold, this plugin stores each content block individually to a storage backend and replaces the in-context result with a truncated text preview plus per-block references.

Token estimation uses the agent’s model count_tokens method, which leverages tiktoken when available and falls back to character-based heuristics.

Content type handling:

Text: stored as text/plain, replaced with a preview
JSON: stored as application/json, replaced with a preview
Image: stored in its native format (e.g., image/png), replaced with a placeholder showing format and size
Document: stored in its native format (e.g., application/pdf), replaced with a placeholder showing format, name, and size
Unknown types: passed through unchanged

This operates proactively at tool execution time via AfterToolCallEvent, before the result enters the conversation — unlike SlidingWindowConversationManager which truncates reactively after context overflow.

Arguments:

storage - Backend for storing offloaded content (required).
max_result_tokens - Offload results whose estimated token count exceeds this threshold.
preview_tokens - Number of tokens to keep as a text preview in context.
include_retrieval_tool - Whether to register the retrieve_offloaded_content tool. Defaults to True.

Example:

from strands import Agent
from strands.vended_plugins.context_offloader import ContextOffloader, InMemoryStorage

agent = Agent(plugins=[
    ContextOffloader(storage=InMemoryStorage())
])

init

def __init__(storage: Storage,
             max_result_tokens: int = _DEFAULT_MAX_RESULT_TOKENS,
             preview_tokens: int = _DEFAULT_PREVIEW_TOKENS,
             *,
             include_retrieval_tool: bool = True) -> None

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:117

Initialize the ContextOffloader plugin.

Arguments:

storage - Backend for storing offloaded content.
max_result_tokens - Offload results whose estimated token count exceeds this threshold. Defaults to _DEFAULT_MAX_RESULT_TOKENS (2,500).
preview_tokens - Number of tokens to keep as a text preview in context. Uses tiktoken for exact slicing when available, falls back to chars/4 heuristic. Defaults to _DEFAULT_PREVIEW_TOKENS (1,000).
include_retrieval_tool - Whether to register the retrieve_offloaded_content tool so the agent can fetch offloaded content. Defaults to True.

Raises:

ValueError - If max_result_tokens is not positive, preview_tokens is negative, or preview_tokens >= max_result_tokens.

init_agent

def init_agent(agent: Agent) -> None

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:178

Conditionally register the retrieval tool and bind storage.

retrieve_offloaded_content

@tool(context=True)
async def retrieve_offloaded_content(
        reference: str,
        tool_context: ToolContext,
        pattern: str | None = None,
        line_range: LineRange | None = None,
        context_lines: int | None = None) -> dict | str

Defined in: src/strands/vended_plugins/context_offloader/plugin.py:195

Retrieve offloaded content by reference.

When a tool result was too large to keep in context, it was stored externally and replaced with a preview and a reference. Use this tool with that reference to access the stored content.

Returns:

With pattern: matching lines with line numbers and surrounding context
With line_range: the specified span of lines with line numbers
Without pattern/line_range: the full original content (use sparingly — re-injects all tokens)

Constraints:

pattern/line_range/context_lines only work on text content. For binary content, omit them.
Line numbers in results are 1-indexed and can be used in follow-up line_range calls.

Examples:

\{"reference" - “ref_1”, “pattern”: “error”} -> lines containing “error” with 5 lines context
\{"reference" - “ref_1”, “pattern”: “error|warning”, “context_lines”: 3} -> regex, 3 lines context
\{"reference" - “ref_1”, “line_range”: {“start”: 10, “end”: 25}} -> lines 10-25
\{"reference" - “ref_1”, “pattern”: “TODO”, “line_range”: {“start”: 1, “end”: 50}} -> search within range

Arguments:

reference - The reference string from the offload placeholder (e.g. “mem_1_tool-123_0”).
pattern - Regex or keyword to grep for. Returns only matching lines with context — not the full content.
line_range - Return only this span of lines. A dict with ‘start’ and ‘end’ keys (1-indexed). Combine with pattern to search within the range.
context_lines - Lines before AND after each match (like grep -C). Default: 5. Without pattern/line_range, returns first N lines.
tool_context - Injected by the framework. Not user-facing.