
S3 Vectors Memory Plugin

The S3 Vectors Memory Plugin gives any Strands Agent long-term semantic memory backed by Amazon S3 Vectors. At the end of a conversation, the plugin summarizes the exchange using the agent’s own model and stores the summary as a searchable vector. In later conversations, it retrieves the relevant summaries and injects them into the system prompt, so the agent remembers earlier sessions without carrying full transcripts in the context window.
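
As a rough mental model of the flow described above, the sketch below buffers turns per conversation and turns them into one stored summary when the session ends. It is not the plugin's code: `SessionBuffer`, its methods, and the in-memory `stored` list are hypothetical stand-ins, and the real plugin summarizes with the agent's model and writes to S3 Vectors rather than to a Python list.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class SessionBuffer:
    """Hypothetical buffer: collects turns and summarizes them on session end."""

    def __init__(self, summarize: Callable[[List[str]], str]) -> None:
        self._summarize = summarize
        self._turns: Dict[str, List[str]] = defaultdict(list)
        self.stored: List[dict] = []  # stand-in for the vector index

    def record(self, conversation_id: str, text: str, end_session: bool = False) -> None:
        """Buffer one turn; on end_session, summarize and store the conversation."""
        self._turns[conversation_id].append(text)
        if end_session:
            turns = self._turns.pop(conversation_id)
            self.stored.append({
                "conversation_id": conversation_id,
                "content": self._summarize(turns),
            })


# Usage with a trivial summarizer (a real one would call a model)
buffer = SessionBuffer(summarize=lambda turns: " | ".join(turns))
buffer.record("conv-001", "My favourite framework is Strands Agents.")
buffer.record("conv-001", "Thanks, bye.", end_session=True)
```

Only the summary is kept, which is why later sessions can recall earlier ones without replaying the transcript.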

Available in two modes:

  • Single-tenant — one shared index, ambient AWS credentials
  • Multi-tenant — one index per tenant, IAM credentials scoped per tenant via the Token Vending Machine (TVM) pattern

Requirements

  • Python 3.10+
  • strands-agents >= 1.0.0
  • boto3 >= 1.34.0
  • cachetools >= 5.3.0
  • AWS account with Amazon S3 Vectors access
  • Amazon Bedrock access for:
    • An embedding model — amazon.nova-2-multimodal-embeddings-v1:0 (default)
    • A chat model — e.g. us.anthropic.claude-sonnet-4-5-20250929-v1:0

Installation

pip install strands-s3-vectors-memory

1. Create an S3 Vectors bucket:

aws s3vectors create-vector-bucket --vector-bucket-name my-vector-memory

2. Create the index:

aws s3vectors create-index \
  --vector-bucket-name my-vector-memory \
  --index-name memory \
  --data-type float32 --dimension 1024 --distance-metric cosine \
  --metadata-configuration '{"nonFilterableMetadataKeys":["content","stored_at","conversation_id","type"]}'
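
For setups that provision resources from Python instead of the CLI, the two commands above can be approximated with boto3. This is a minimal sketch, assuming the installed boto3 is recent enough to expose the `s3vectors` client; `index_request` and `create_memory_index` are hypothetical helper names, not part of the plugin.

```python
def index_request(bucket: str, index: str = "memory", dimension: int = 1024) -> dict:
    """Build keyword arguments that mirror the CLI flags shown above."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "dataType": "float32",
        "dimension": dimension,
        "distanceMetric": "cosine",
        "metadataConfiguration": {
            "nonFilterableMetadataKeys": [
                "content", "stored_at", "conversation_id", "type",
            ],
        },
    }


def create_memory_index(bucket: str, region: str = "us-east-1") -> None:
    """Create the bucket and index; needs AWS credentials and network access."""
    import boto3  # imported lazily so index_request() works without boto3

    client = boto3.client("s3vectors", region_name=region)
    client.create_vector_bucket(vectorBucketName=bucket)
    client.create_index(**index_request(bucket))
```

Calling `create_memory_index("my-vector-memory")` would perform steps 1 and 2 in one go. The keys listed as non-filterable are stored with each vector but cannot be used in query filters.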

3. For multi-tenant, create the TVM IAM role using the setup script provided in the repository:

bash setup_tvm_role.sh my-vector-memory
export S3_VECTOR_TVM_ROLE_ARN=<printed-arn>
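
The TVM pattern referenced in step 3 generally works by assuming one shared role with an inline session policy that narrows access to a single tenant's resources. The sketch below illustrates that idea only; it is not taken from the plugin or from `setup_tvm_role.sh`. The function names, the convention that the index name equals the tenant ID, the ARN layout, and the action list are all assumptions made for illustration.

```python
import json


def tenant_session_policy(region: str, account_id: str, bucket: str, tenant_id: str) -> str:
    """Return a session policy limited to one tenant's index.

    Assumption: each tenant's index is named after its tenant ID.
    """
    index_arn = (
        f"arn:aws:s3vectors:{region}:{account_id}:"
        f"bucket/{bucket}/index/{tenant_id}"
    )
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Assumed minimal action set for storing and searching memories
            "Action": [
                "s3vectors:PutVectors",
                "s3vectors:QueryVectors",
                "s3vectors:GetVectors",
            ],
            "Resource": index_arn,
        }],
    }
    return json.dumps(policy)


def vend_tenant_credentials(role_arn: str, policy: str, tenant_id: str) -> dict:
    """Assume the TVM role with the scoped policy; needs AWS credentials."""
    import boto3  # imported lazily so the policy builder works without boto3

    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"tenant-{tenant_id}",
        Policy=policy,
    )
    return response["Credentials"]
```

Because a session policy can only reduce what the role already allows, credentials vended this way cannot reach another tenant's index even if the role itself covers the whole bucket.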

Single-tenant usage

import os

from strands import Agent
from strands.models import BedrockModel
from strands_s3_vectors_memory import S3VectorMemory, S3VectorMemoryPlugin

BASE_PROMPT = """You are a helpful assistant.
{memory_context}
Use prior context naturally in your responses."""

store = S3VectorMemory(bucket_name=os.environ["S3_VECTOR_BUCKET_NAME"])
plugin = S3VectorMemoryPlugin(store=store, base_prompt=BASE_PROMPT)

agent = Agent(
    model=BedrockModel(),
    name="assistant",  # required: used as the memory namespace key
    plugins=[plugin],
    system_prompt=BASE_PROMPT,
)

# Turn 1: the agent responds; nothing is stored yet
agent("My favourite framework is Strands Agents.", invocation_state={
    "user_id": "user-001", "conversation_id": "conv-001", "end_session": False,
})

# Turn 2: end_session=True triggers background summarization and storage
agent("Thanks, bye.", invocation_state={
    "user_id": "user-001", "conversation_id": "conv-001", "end_session": True,
})

# Next session: the plugin retrieves the stored summary and injects it into the prompt
agent("What do you know about my preferences?", invocation_state={
    "user_id": "user-001", "conversation_id": "conv-002", "end_session": False,
})

BASE_PROMPT must contain a {memory_context} placeholder. The plugin fills it with retrieved summaries on the first turn of each conversation, or replaces it with an empty string when no relevant memories are found.
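
To make the placeholder behaviour concrete, here is a minimal sketch of how a template with `{memory_context}` could be filled. It only illustrates the rule stated above; `render_system_prompt` and the "Relevant memories" wording are hypothetical, and the plugin's actual formatting of retrieved summaries may differ.

```python
from typing import List

PLACEHOLDER = "{memory_context}"


def render_system_prompt(base_prompt: str, summaries: List[str]) -> str:
    """Fill the placeholder with summaries, or with an empty string if none."""
    if PLACEHOLDER not in base_prompt:
        raise ValueError(f"base_prompt must contain a {PLACEHOLDER} placeholder")
    if summaries:
        context = "Relevant memories:\n" + "\n".join(f"- {s}" for s in summaries)
    else:
        context = ""
    # str.replace avoids str.format errors if the prompt has other braces
    return base_prompt.replace(PLACEHOLDER, context)


template = "You are a helpful assistant.\n{memory_context}\nUse prior context naturally."
with_memory = render_system_prompt(template, ["User's favourite framework is Strands Agents."])
without_memory = render_system_prompt(template, [])
```

Putting the placeholder on its own line, as in the example prompt, keeps the template readable whether or not any memories are found.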

Multi-tenant usage

import os

from strands import Agent
from strands.models import BedrockModel
from strands_s3_vectors_memory import MultiTenantS3VectorMemory, S3VectorMemoryPlugin

# Same template as in the single-tenant example
BASE_PROMPT = """You are a helpful assistant.
{memory_context}
Use prior context naturally in your responses."""

store = MultiTenantS3VectorMemory(
    bucket_name=os.environ["S3_VECTOR_BUCKET_NAME"],
    tvm_role_arn=os.environ["S3_VECTOR_TVM_ROLE_ARN"],
)
plugin = S3VectorMemoryPlugin(store=store, base_prompt=BASE_PROMPT)

agent = Agent(
    model=BedrockModel(),
    name="assistant",
    plugins=[plugin],
    system_prompt=BASE_PROMPT,
)

# tenant_context selects the tenant whose index and credentials are used
agent("Our Q4 budget is $2M.", invocation_state={
    "tenant_context": {"tenantId": "tenant-001"},
    "user_id": "user-456",
    "conversation_id": "conv-001",
    "end_session": True,
})

Environment variables

| Variable | Required | Description |
|---|---|---|
| S3_VECTOR_BUCKET_NAME | Yes | S3 Vectors bucket name |
| AWS_REGION | No (default: us-east-1) | AWS region |
| S3_VECTOR_TVM_ROLE_ARN | Multi-tenant only | TVM IAM role ARN |
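
An application may want to validate these variables once at startup rather than fail on the first request. The loader below is a small sketch of that; `MemorySettings` and `load_settings` are hypothetical names and not part of the plugin, which reads nothing from this helper.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MemorySettings:
    bucket_name: str
    region: str
    tvm_role_arn: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None,
                  multi_tenant: bool = False) -> MemorySettings:
    """Read the variables from the table above, applying the documented default."""
    env = os.environ if env is None else env
    bucket = env.get("S3_VECTOR_BUCKET_NAME")
    if not bucket:
        raise KeyError("S3_VECTOR_BUCKET_NAME is required")
    role = env.get("S3_VECTOR_TVM_ROLE_ARN")
    if multi_tenant and not role:
        raise KeyError("S3_VECTOR_TVM_ROLE_ARN is required in multi-tenant mode")
    return MemorySettings(
        bucket_name=bucket,
        region=env.get("AWS_REGION", "us-east-1"),
        tvm_role_arn=role,
    )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the real environment.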

invocation_state keys

| Key | Required | Description |
|---|---|---|
| user_id | Yes | Scopes vector filter to this user |
| conversation_id | Yes | Scopes buffer and summary key |
| end_session | No (default False) | If True, summarizes and stores the conversation after the response (non-blocking) |
| tenant_context | Multi-tenant only | Dict with tenantId key |
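
Since the same dictionary shape is passed on every call, a small builder can keep the keys consistent. This is a sketch only; `build_invocation_state` is a hypothetical helper that produces the keys listed above and is not shipped with the plugin.

```python
from typing import Optional


def build_invocation_state(user_id: str,
                           conversation_id: str,
                           end_session: bool = False,
                           tenant_id: Optional[str] = None) -> dict:
    """Assemble an invocation_state dict with the keys the plugin documents."""
    if not user_id or not conversation_id:
        raise ValueError("user_id and conversation_id are required")
    state = {
        "user_id": user_id,
        "conversation_id": conversation_id,
        "end_session": end_session,
    }
    if tenant_id is not None:
        # Multi-tenant mode only
        state["tenant_context"] = {"tenantId": tenant_id}
    return state


# Usage: the final turn of a multi-tenant conversation
final_turn_state = build_invocation_state(
    "user-456", "conv-001", end_session=True, tenant_id="tenant-001",
)
```

The result can be passed directly as `invocation_state=final_turn_state` in the agent calls shown earlier.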