S3 Vectors Memory Plugin
The S3 Vectors Memory Plugin gives any Strands Agent long-term semantic memory backed by Amazon S3 Vectors. At the end of a conversation, the plugin summarizes the exchange using the agent’s own model and stores the summary as a searchable vector. On subsequent conversations, relevant summaries are retrieved and injected into the system prompt — the agent remembers without bloating the context window.
Available in two modes:
- Single-tenant — one shared index, ambient AWS credentials
- Multi-tenant — one index per tenant, IAM credentials scoped per tenant via the Token Vending Machine (TVM) pattern
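The Token Vending Machine pattern is commonly built on AWS STS: a broker assumes one shared role and attaches a session policy that narrows the temporary credentials to a single tenant's resources. The sketch below illustrates that idea only; it is not taken from the plugin's source. The `tenant-<id>` index naming, the selected `s3vectors` actions, the ARN layout, and all helper names are assumptions made for illustration.

```python
import json


def tenant_index_arn(region: str, account_id: str, bucket: str, tenant_id: str) -> str:
    """Build the ARN of one tenant's index (hypothetical naming: tenant-<id>)."""
    return (
        f"arn:aws:s3vectors:{region}:{account_id}:"
        f"bucket/{bucket}/index/tenant-{tenant_id}"
    )


def tenant_session_policy(index_arn: str) -> str:
    """Return a JSON session policy that allows vector access to one index only."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimal action set for reading and writing memories.
                "Action": [
                    "s3vectors:PutVectors",
                    "s3vectors:QueryVectors",
                    "s3vectors:GetVectors",
                ],
                "Resource": index_arn,
            }
        ],
    }
    return json.dumps(policy)


def vend_tenant_credentials(role_arn: str, tenant_id: str, policy_json: str) -> dict:
    """Assume the TVM role with the session policy attached (requires AWS access)."""
    import boto3  # imported lazily so the helpers above work without boto3

    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"tenant-{tenant_id}",
        Policy=policy_json,  # effective permissions = role policy AND session policy
        DurationSeconds=900,
    )
    return response["Credentials"]
```

Because a session policy can only narrow what the role already allows, the shared role can grant access to every index in the bucket while each vended credential set reaches exactly one.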
Requirements
- Python 3.10+
- `strands-agents >= 1.0.0`
- `boto3 >= 1.34.0`
- `cachetools >= 5.3.0`
- AWS account with Amazon S3 Vectors access
- Amazon Bedrock access for:
  - An embedding model: `amazon.nova-2-multimodal-embeddings-v1:0` (default)
  - A chat model, e.g. `us.anthropic.claude-sonnet-4-5-20250929-v1:0`
Installation

```bash
pip install strands-s3-vectors-memory
```

AWS Setup

1. Create an S3 Vectors bucket:

```bash
aws s3vectors create-vector-bucket --vector-bucket-name my-vector-memory
```

2. Create the index:

```bash
aws s3vectors create-index \
  --vector-bucket-name my-vector-memory \
  --index-name memory \
  --data-type float32 --dimension 1024 --distance-metric cosine \
  --metadata-configuration '{"nonFilterableMetadataKeys":["content","stored_at","conversation_id","type"]}'
```

3. For multi-tenant, create the TVM IAM role using the setup script provided in the repository:

```bash
bash setup_tvm_role.sh my-vector-memory
export S3_VECTOR_TVM_ROLE_ARN=<printed-arn>
```

Single-tenant
```python
import os

from strands import Agent
from strands.models import BedrockModel
from strands_s3_vectors_memory import S3VectorMemory, S3VectorMemoryPlugin

BASE_PROMPT = """You are a helpful assistant.
{memory_context}
Use prior context naturally in your responses."""

store = S3VectorMemory(bucket_name=os.environ["S3_VECTOR_BUCKET_NAME"])
plugin = S3VectorMemoryPlugin(store=store, base_prompt=BASE_PROMPT)

agent = Agent(
    model=BedrockModel(),
    name="assistant",  # required: used as the memory namespace key
    plugins=[plugin],
    system_prompt=BASE_PROMPT,
)

# Turn 1: the agent responds; memory is not yet stored
agent("My favourite framework is Strands Agents.", invocation_state={
    "user_id": "user-001",
    "conversation_id": "conv-001",
    "end_session": False,
})

# Turn 2: end_session=True triggers background summarization and vector storage
agent("Thanks, bye.", invocation_state={
    "user_id": "user-001",
    "conversation_id": "conv-001",
    "end_session": True,
})

# Next session: the plugin retrieves the stored summary and injects it into the prompt
agent("What do you know about my preferences?", invocation_state={
    "user_id": "user-001",
    "conversation_id": "conv-002",
    "end_session": False,
})
```

`BASE_PROMPT` must contain a `{memory_context}` placeholder. The plugin fills it with retrieved summaries on the first turn of each conversation, or replaces it with an empty string when no relevant memories are found.
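To make the placeholder behaviour concrete, here is a minimal sketch of how a `{memory_context}` slot could be filled. It illustrates the substitution described above and is not the plugin's actual code; the function name `render_system_prompt` and the bullet-list layout of the summaries are assumptions.

```python
PLACEHOLDER = "{memory_context}"


def render_system_prompt(base_prompt: str, summaries: list[str]) -> str:
    """Substitute retrieved summaries (or an empty string) for the placeholder."""
    if PLACEHOLDER not in base_prompt:
        raise ValueError("base_prompt must contain a {memory_context} placeholder")

    if summaries:
        # Hypothetical layout: one bullet per retrieved summary.
        bullets = "\n".join(f"- {summary}" for summary in summaries)
        context = "Relevant context from earlier conversations:\n" + bullets
    else:
        # No relevant memories: the placeholder collapses to nothing.
        context = ""

    # str.replace leaves any other braces in the prompt untouched,
    # which str.format would try to interpret as fields.
    return base_prompt.replace(PLACEHOLDER, context)
```

A prompt template that also contains literal braces, such as a JSON example, stays intact with this approach.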
Multi-tenant

```python
import os

from strands import Agent
from strands.models import BedrockModel
from strands_s3_vectors_memory import MultiTenantS3VectorMemory, S3VectorMemoryPlugin

# BASE_PROMPT is the same template defined in the single-tenant example.
store = MultiTenantS3VectorMemory(
    bucket_name=os.environ["S3_VECTOR_BUCKET_NAME"],
    tvm_role_arn=os.environ["S3_VECTOR_TVM_ROLE_ARN"],
)
plugin = S3VectorMemoryPlugin(store=store, base_prompt=BASE_PROMPT)

agent = Agent(
    model=BedrockModel(),
    name="assistant",
    plugins=[plugin],
    system_prompt=BASE_PROMPT,
)

agent("Our Q4 budget is $2M.", invocation_state={
    "tenant_context": {"tenantId": "tenant-001"},
    "user_id": "user-456",
    "conversation_id": "conv-001",
    "end_session": True,
})
```

Configuration
Environment variables

| Variable | Required | Description |
|---|---|---|
| `S3_VECTOR_BUCKET_NAME` | Yes | S3 Vectors bucket name |
| `AWS_REGION` | No (default: `us-east-1`) | AWS region |
| `S3_VECTOR_TVM_ROLE_ARN` | Multi-tenant only | TVM IAM role ARN |
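An application may want to check these variables once at startup rather than fail on the first request. The following is a small sketch of that check; `MemorySettings` and `load_settings` are hypothetical names that are not part of the package, and the defaults simply mirror the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MemorySettings:
    bucket_name: str
    region: str
    tvm_role_arn: Optional[str]


def load_settings(
    env: Mapping[str, str] = os.environ, multi_tenant: bool = False
) -> MemorySettings:
    """Read the documented variables, applying the documented default region."""
    bucket = env.get("S3_VECTOR_BUCKET_NAME")
    if not bucket:
        raise RuntimeError("S3_VECTOR_BUCKET_NAME is required")

    role_arn = env.get("S3_VECTOR_TVM_ROLE_ARN")
    if multi_tenant and not role_arn:
        raise RuntimeError("S3_VECTOR_TVM_ROLE_ARN is required in multi-tenant mode")

    return MemorySettings(
        bucket_name=bucket,
        region=env.get("AWS_REGION", "us-east-1"),
        tvm_role_arn=role_arn,
    )
```

Passing a plain dictionary as `env` keeps the function easy to exercise in tests without touching the real environment.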
invocation_state keys

| Key | Required | Description |
|---|---|---|
| `user_id` | Yes | Scopes vector filter to this user |
| `conversation_id` | Yes | Scopes buffer and summary key |
| `end_session` | No (default `False`) | If `True`, summarizes and stores the conversation after the response (non-blocking) |
| `tenant_context` | Multi-tenant only | Dict with `tenantId` key |
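Callers that build `invocation_state` from request data may find it useful to validate it before invoking the agent. The helper below is a sketch based only on the table above; `validate_invocation_state` is a hypothetical name, and the plugin's own handling of missing keys may differ.

```python
def validate_invocation_state(state: dict, multi_tenant: bool = False) -> dict:
    """Check the documented keys and return a copy with end_session defaulted."""
    missing = [key for key in ("user_id", "conversation_id") if not state.get(key)]

    if multi_tenant:
        tenant = state.get("tenant_context")
        if not isinstance(tenant, dict) or not tenant.get("tenantId"):
            missing.append("tenant_context.tenantId")

    if missing:
        raise ValueError("missing invocation_state keys: " + ", ".join(missing))

    normalized = dict(state)  # shallow copy; the caller's dict is not mutated
    normalized["end_session"] = bool(state.get("end_session", False))
    return normalized
```

For example, `agent(message, invocation_state=validate_invocation_state(state))` would surface a missing `user_id` as a `ValueError` before any model call is made.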