# Ollama
Ollama is a framework for running open-source large language models locally. Strands provides native support for Ollama, allowing you to use locally-hosted models in your agents.
The `OllamaModel` class in Strands enables seamless integration with Ollama's API, supporting:
- Text generation
- Image understanding
- Tool/function calling
- Streaming responses
- Configuration management
## Getting Started

### Prerequisites

First, install the Python client into your Python environment:

```bash
pip install 'strands-agents[ollama]' strands-agents-tools
```

Next, you'll need to install and set up Ollama itself.
### Option 1: Native Installation

1. Install Ollama by following the instructions at ollama.ai.

2. Pull your desired model:

   ```bash
   ollama pull llama3.1
   ```

3. Start the Ollama server:

   ```bash
   ollama serve
   ```
### Option 2: Docker Installation

1. Pull the Ollama Docker image:

   ```bash
   docker pull ollama/ollama
   ```

2. Run the Ollama container:

   ```bash
   docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
   ```

   Note: Add `--gpus=all` if you have a GPU and Docker GPU support is configured.

3. Pull a model using the Docker container:

   ```bash
   docker exec -it ollama ollama pull llama3.1
   ```

4. Verify the Ollama server is running:

   ```bash
   curl http://localhost:11434/api/tags
   ```
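The `/api/tags` endpoint returns JSON describing the models the server has pulled. You can script the same check in Python; the `model_names` and `has_model` helpers below are illustrative, not part of Strands or Ollama:

```python
import json
from urllib.request import urlopen

def model_names(tags: dict) -> list:
    """Extract model names from an Ollama /api/tags response."""
    return [m["name"] for m in tags.get("models", [])]

def has_model(tags: dict, name: str) -> bool:
    """True if `name` (with or without a ':tag' suffix) has been pulled."""
    return any(n == name or n.split(":")[0] == name for n in model_names(tags))

# Against a running server:
# tags = json.load(urlopen("http://localhost:11434/api/tags"))
# print("llama3.1 pulled:", has_model(tags, "llama3.1"))
```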
## Basic Usage

Here's how to create an agent using an Ollama model:

```python
from strands import Agent
from strands.models.ollama import OllamaModel

# Create an Ollama model instance
ollama_model = OllamaModel(
    host="http://localhost:11434",  # Ollama server address
    model_id="llama3.1"             # Specify which model to use
)

# Create an agent using the Ollama model
agent = Agent(model=ollama_model)

# Use the agent
agent("Tell me about Strands agents.")  # Prints model output to stdout by default
```

## Configuration Options
The `OllamaModel` supports various configuration parameters:

| Parameter | Description | Default |
|-----------|-------------|---------|
| `host` | The address of the Ollama server | Required |
| `model_id` | The Ollama model identifier | Required |
| `keep_alive` | How long the model stays loaded in memory | `"5m"` |
| `max_tokens` | Maximum number of tokens to generate | None |
| `temperature` | Controls randomness (higher = more random) | None |
| `top_p` | Controls diversity via nucleus sampling | None |
| `stop_sequences` | List of sequences that stop generation | None |
| `options` | Additional model parameters (e.g., `top_k`) | None |
| `additional_args` | Any additional arguments for the request | None |
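For orientation, several of these parameters correspond to fields in the `options` object of Ollama's native API (`temperature`, `top_p`, `top_k`, `stop`, and `num_predict` for the token cap). The sketch below shows that mapping for illustration only; Strands performs the real translation internally, and `build_ollama_options` is a hypothetical helper:

```python
def build_ollama_options(temperature=None, top_p=None, max_tokens=None,
                         stop_sequences=None, options=None) -> dict:
    """Collect non-None parameters into an Ollama-style `options` dict."""
    opts = dict(options or {})
    mapping = {
        "temperature": temperature,
        "top_p": top_p,
        "num_predict": max_tokens,  # Ollama's name for the generation cap
        "stop": stop_sequences,
    }
    # Only include parameters the caller actually set
    opts.update({k: v for k, v in mapping.items() if v is not None})
    return opts
```

For example, `build_ollama_options(temperature=0.7, stop_sequences=["END"], options={"top_k": 40})` yields `{"top_k": 40, "temperature": 0.7, "stop": ["END"]}`.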
### Example with Configuration

```python
from strands import Agent
from strands.models.ollama import OllamaModel

# Create a configured Ollama model
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3.1",
    temperature=0.7,
    keep_alive="10m",
    stop_sequences=["###", "END"],
    options={"top_k": 40}
)

# Create an agent with the configured model
agent = Agent(model=ollama_model)

# Use the agent
response = agent("Write a short story about an AI assistant.")
```

## Advanced Features

### Updating Configuration at Runtime

You can update the model configuration during runtime:
```python
# Create the model with initial configuration
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3.1",
    temperature=0.7
)

# Update configuration later
ollama_model.update_config(
    temperature=0.9,
    top_p=0.8
)
```

This is especially useful if you want a tool to update the model's config for you:
```python
@tool
def update_model_id(model_id: str, agent: Agent) -> str:
    """
    Update the model id of the agent.

    Args:
        model_id: Ollama model id to use.
    """
    print(f"Updating model_id to {model_id}")
    agent.model.update_config(model_id=model_id)
    return f"Model updated to {model_id}"

@tool
def update_temperature(temperature: float, agent: Agent) -> str:
    """
    Update the temperature of the agent.

    Args:
        temperature: Temperature value for the model to use.
    """
    print(f"Updating temperature to {temperature}")
    agent.model.update_config(temperature=temperature)
    return f"Temperature updated to {temperature}"
```

### Using Different Models

Ollama supports many different models, and you can switch between them (make sure they are pulled first). See the list of available models at https://ollama.com/search.
```python
# Create models for different use cases
creative_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3.1",
    temperature=0.8
)

factual_model = OllamaModel(
    host="http://localhost:11434",
    model_id="mistral",
    temperature=0.2
)

# Create agents with different models
creative_agent = Agent(model=creative_model)
factual_agent = Agent(model=factual_model)
```

### Structured Output

Ollama supports structured output for models with tool calling capabilities. When you pass the `structured_output_model` parameter to an agent invocation, the Strands SDK converts your Pydantic model to a tool specification that compatible Ollama models can understand.
```python
from pydantic import BaseModel, Field
from strands import Agent
from strands.models.ollama import OllamaModel

# 1) Define the Pydantic model
class BookAnalysis(BaseModel):
    """Analyze a book's key information."""
    title: str = Field(description="The book's title")
    author: str = Field(description="The book's author")
    genre: str = Field(description="Primary genre or category")
    summary: str = Field(description="Brief summary of the book")
    rating: int = Field(description="Rating from 1-10", ge=1, le=10)

# 2) Define the Ollama model and prompt
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3.1",
)

prompt = """
Analyze this book: "The Hitchhiker's Guide to the Galaxy" by Douglas Adams.
It's a science fiction comedy about Arthur Dent's adventures through space
after Earth is destroyed. It's widely considered a classic of humorous
sci-fi with good ratings.
"""

# 3) Pass the model to the agent
agent = Agent(model=ollama_model)
result = agent(prompt, structured_output_model=BookAnalysis)

# 4) Access the `structured_output` from the result
result_info: BookAnalysis = result.structured_output
print(f"Title: {result_info.title}")
print(f"Author: {result_info.author}")
print(f"Genre: {result_info.genre}")
print(f"Rating: {result_info.rating}")
```

## Tool Support

Ollama models that support tool use can use tools through Strands' tool system:
```python
from strands import Agent
from strands.models.ollama import OllamaModel
from strands_tools import calculator, current_time

# Create an Ollama model
ollama_model = OllamaModel(
    host="http://localhost:11434",
    model_id="llama3.1"
)

# Create an agent with tools
agent = Agent(
    model=ollama_model,
    tools=[calculator, current_time]
)

# Use the agent with tools
response = agent("What's the square root of 144 plus the current time?")
```

## Troubleshooting

### Common Issues
1. **Connection Refused**:
   - Ensure the Ollama server is running (`ollama serve`, or check the Docker container status)
   - Verify the host URL is correct
   - For Docker: check that port 11434 is properly exposed

2. **Model Not Found**:
   - Pull the model first: `ollama pull model_name` or `docker exec -it ollama ollama pull model_name`
   - Check for typos in the `model_id`

3. **Module Not Found**:
   - If you encounter `ModuleNotFoundError: No module named 'ollama'`, the `ollama` dependency is not installed in your Python environment
   - To fix it, run `pip install 'strands-agents[ollama]'`
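The first check above can be scripted. A minimal reachability diagnostic, where `diagnose` is an illustrative helper rather than part of Strands:

```python
from urllib.error import URLError
from urllib.request import urlopen

def diagnose(host: str = "http://localhost:11434") -> str:
    """Return a short hint for the most common Ollama connectivity problem."""
    try:
        urlopen(f"{host}/api/tags", timeout=5)
    except URLError:
        return "Connection refused: start the server with `ollama serve` or check the Docker container."
    return "Ollama server is reachable."
```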