What Are the Prerequisites for Using Ollama
Before getting started, ensure your system meets these requirements:
- Operating System: macOS, Linux, or Windows
- RAM: Minimum 8GB (16GB+ recommended)
- Storage: At least 10GB free space
- Continue extension installed
How to Install Ollama - Step-by-Step
Step 1: Install Ollama
Choose the installation method for your operating system:
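The commands below are a sketch of the usual installation routes; check https://ollama.com/download for the current installer for your platform.

```bash
# macOS: download the app from https://ollama.com/download, or use Homebrew
brew install ollama

# Linux: official install script (also registers a systemd service)
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download and run the installer from https://ollama.com/download
```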
Step 2: Start Ollama Service
After installation, start the Ollama service:
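A minimal sketch, assuming a default install (on macOS and Windows the desktop app normally starts the service for you):

```bash
# Linux: the install script registers a systemd service
sudo systemctl start ollama

# Any platform: run the server manually in the foreground instead
ollama serve

# Verify it is responding (should print "Ollama is running")
curl http://localhost:11434
```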
Step 3: Download Models
Important: Always use `ollama pull` instead of `ollama run` to download models. The `run` command starts an interactive session, which isn’t needed for Continue.
Model tags work as follows:
- `:latest` - Default version (used if no tag is specified)
- `:32b`, `:7b`, `:1.5b` - Parameter count versions
- `:instruct`, `:base` - Model variants
If a model page shows `deepseek-r1:32b` on Ollama’s website, you must pull it with that exact tag. Using just `deepseek-r1` will pull `:latest`, which may be a different size.
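For example, to pull the exact tag and confirm what is installed (model name taken from the hub block discussed below):

```bash
# Pull the exact tag shown on the model page
ollama pull deepseek-r1:32b

# Confirm which tags are installed locally
ollama list
```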
How to Configure Ollama with Continue
There are multiple ways to configure Ollama models in Continue:
Method 1: Using Hub Model Blocks in Local config.yaml
The easiest way is to use pre-configured model blocks from the Continue Hub in your local configuration, for example in `~/.continue/assistants/My Local Assistant.yaml`:
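A minimal sketch of such an assistant file; the `uses:` line references the hub block, and the exact top-level fields may vary with your Continue version:

```yaml
name: My Local Assistant
version: 0.0.1
schema: v1
models:
  # Pulls the model configuration (provider, model name, roles) from the Continue Hub
  - uses: ollama/deepseek-r1-32b
```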
Important: Hub blocks only provide configuration - you still need to pull the model locally. The hub block `ollama/deepseek-r1-32b` configures Continue to use `model: deepseek-r1:32b`, but the actual model must be installed. If the model isn’t installed, Ollama will return:
404 model "deepseek-r1:32b" not found, try pulling it first
Method 2: Using Autodetect
Continue can automatically detect available Ollama models. You can configure this in your YAML (`~/.continue/config.yaml`):
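A minimal sketch of the autodetect entry (field names follow Continue's config.yaml schema as I understand it; adjust to your version):

```yaml
models:
  - name: Ollama Autodetect
    provider: ollama
    # AUTODETECT tells Continue to list whatever `ollama list` reports
    model: AUTODETECT
```

With this entry in place, the detected models appear in the model selector: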
- Click on the model selector dropdown
- Select “Autodetect” option
- Continue will scan for available Ollama models
- Select your desired model from the detected list
The Autodetect feature scans your local Ollama installation and lists all available models. When set to `AUTODETECT`, Continue will dynamically populate the model list based on what’s installed locally via `ollama list`. This is useful for quickly switching between models without manual configuration. For any roles not covered by the detected models, you may need to manually configure them. You can also set `apiBase` to the IP address of a remote machine serving Ollama.
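A minimal sketch of pointing a model at a remote Ollama server; the IP address is a placeholder for your own machine:

```yaml
models:
  - name: Remote Llama 3.1
    provider: ollama
    model: llama3.1:8b
    # Address of the machine running Ollama (placeholder IP)
    apiBase: http://192.168.1.100:11434
```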
Method 3: Manual Configuration
For custom configurations or models not on the hub:
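A minimal manual entry, using one of the models recommended later in this guide (field names follow Continue's config.yaml schema; adjust to your version):

```yaml
models:
  - name: Qwen2.5 Coder 7B
    provider: ollama
    model: qwen2.5-coder:7b
    roles:
      - chat
      - edit
```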
Model Capabilities and Tool Support
Some Ollama models support tools (function calling), which is required for Agent mode. However, not all models that claim tool support work correctly.
Checking Tool Support
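One way to check, assuming a recent Ollama version: `ollama show` prints a capabilities section for installed models.

```bash
# Look for "tools" in the capabilities section of the output
ollama show llama3.1:8b
```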
Known Issue: Some models like DeepSeek R1 may show “Agent mode is not
supported” or “does not support tools” even with capabilities configured. This
is a known limitation where the model’s actual tool support differs from its
advertised capabilities.
If Agent Mode Shows “Not Supported”
- First, add `capabilities: [tool_use]` to your model config (see the sketch after this list)
- If you still get errors, the model may not actually support tools despite documentation
- Use a different model known to work with tools (e.g., Llama 3.1, Mistral)
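A minimal sketch of adding the capability override to a model entry:

```yaml
models:
  - name: DeepSeek R1 32B
    provider: ollama
    model: deepseek-r1:32b
    # Force-enable tool use if Continue does not detect it automatically
    capabilities: [tool_use]
```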
How to Configure Advanced Settings
For optimal performance, consider these advanced configuration options:
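A sketch of the kind of per-model tuning Continue supports; the `defaultCompletionOptions` field names are assumptions based on Continue's config.yaml schema and may differ in your version:

```yaml
models:
  - name: Llama 3.1 8B (tuned)
    provider: ollama
    model: llama3.1:8b
    defaultCompletionOptions:
      # Keep the context window within what your RAM can hold
      contextLength: 8192
      temperature: 0.5
```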
What Are the Best Practices for Ollama
How to Choose the Right Model
Choose models based on your specific needs (see recommended models for more options; a config sketch pairing these roles follows this list):
- Code Generation:
  - `qwen2.5-coder:7b` - Excellent for code completion
  - `codellama:13b` - Strong general coding support
  - `deepseek-coder:6.7b` - Fast and efficient
- Chat & Reasoning:
  - `llama3.1:8b` - Latest Llama with tool support
  - `mistral:7b` - Fast and versatile
  - `deepseek-r1:32b` - Advanced reasoning capabilities
- Autocomplete:
  - `qwen2.5-coder:1.5b` - Lightweight and fast
  - `starcoder2:3b` - Optimized for code completion
- Memory Requirements:
  - 1.5B-3B models: ~4GB RAM
  - 7B models: ~8GB RAM
  - 13B models: ~16GB RAM
  - 32B models: ~32GB RAM
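A sketch of wiring two of these recommendations into Continue by role; the role names follow Continue's config.yaml schema as I understand it and may differ by version:

```yaml
models:
  - name: Llama 3.1 8B
    provider: ollama
    model: llama3.1:8b
    roles: [chat, edit]
  - name: Qwen2.5 Coder 1.5B
    provider: ollama
    model: qwen2.5-coder:1.5b
    roles: [autocomplete]
```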
How to Optimize Performance
To get the best performance from Ollama:
- Monitor system resources with `ollama ps` to see memory usage
- Adjust context window size based on available RAM
- Use appropriate model sizes for your hardware
- Enable GPU acceleration when available (NVIDIA CUDA or AMD ROCm)
- Check the Ollama server logs (for example, `journalctl -u ollama` on Linux) to debug performance issues
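For example, a quick resource check while a model is loaded (`nvidia-smi` applies only to NVIDIA GPUs):

```bash
# Show which models are loaded and how much memory they use
ollama ps

# If you have an NVIDIA GPU, confirm the model is actually running on it
nvidia-smi
```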
How to Troubleshoot Ollama Issues
Common Configuration Problems
“404 model not found, try pulling it first”
This error occurs when the model isn’t installed locally.
Problem: Using a hub block or config that references a model not yet pulled
Solution: Pull the missing model locally, e.g. `ollama pull deepseek-r1:32b`.
Model Tag Mismatches
Problem: `ollama pull deepseek-r1` installs `:latest` but the hub block expects `:32b`
Solution: Always pull with the exact tag:
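For example, using the hub block's model from earlier:

```bash
# Wrong: pulls deepseek-r1:latest, which the hub block does not reference
ollama pull deepseek-r1

# Correct: pulls the exact tag the hub block expects
ollama pull deepseek-r1:32b
```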
“Agent mode is not supported”
Problem: Model doesn’t support tools/function calling
Solutions:
- Add `capabilities: [tool_use]` to your model config
- If still not working, the model may not actually support tools
- Switch to a model with confirmed tool support (Llama 3.1, Mistral)
Using Hub Blocks in Local Config
Problem: Unclear how to use hub models locally
Solution: Create a local assistant file:
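A minimal file, mirroring Method 1 above (path and block name are the ones used earlier in this guide; exact fields may vary with your Continue version):

```yaml
# ~/.continue/assistants/My Local Assistant.yaml
name: My Local Assistant
version: 0.0.1
schema: v1
models:
  - uses: ollama/deepseek-r1-32b
```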
How to Fix Connection Problems
- Verify Ollama is running: `curl http://localhost:11434`
- Check service status: `systemctl status ollama` (Linux)
- Ensure port 11434 is not blocked by firewall
- For remote connections, set `OLLAMA_HOST=0.0.0.0:11434`
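A sketch of the remote setup, assuming a manually started server (for a systemd install, set the variable in the service environment instead); the IP is a placeholder:

```bash
# On the machine running Ollama: listen on all interfaces, not just localhost
export OLLAMA_HOST=0.0.0.0:11434
ollama serve

# From the client machine: verify the server is reachable
curl http://192.168.1.100:11434
```

Then point `apiBase` in your Continue config at that address, as shown in Method 2.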
How to Resolve Performance Issues
- Insufficient RAM: Use smaller models (7B instead of 32B)
- Model too large: Check available memory with `ollama ps`
- GPU issues: Verify CUDA/ROCm installation for GPU acceleration
- Slow generation: Adjust `num_gpu` layers in model configuration
- Check system diagnostics: `ollama ps` for active models and memory usage
What Are Example Workflows with Ollama
How to Use Ollama for Code Generation
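As a quick sanity check that your local model can generate code outside Continue, you can call Ollama's generate endpoint directly (the model name is one of the recommendations above):

```bash
# Ask the local model to write a small function; stream: false returns one JSON response
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Write a Python function that reverses a string.",
  "stream": false
}'
```

In day-to-day use you would simply select the model in Continue's chat or edit mode rather than calling the API by hand.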
How to Use Ollama for Code Review
Use Continue with Ollama to:
- Analyze code quality
- Suggest improvements
- Identify potential bugs
- Generate documentation
Conclusion
Ollama with Continue provides a powerful local development environment for AI-assisted coding. You now have complete control over your AI models, ensuring privacy and enabling offline development workflows.
This guide is based on Ollama v0.11.x and Continue v1.1.x. Please check for updates regularly.