# Memory and RAG Tutorial (15 minutes)

## Overview
Learn how to add persistent memory and knowledge systems to your agents using RAG (Retrieval-Augmented Generation). You'll set up vector databases, implement document ingestion, and create knowledge-aware agents.
## Prerequisites
- Complete the Multi-Agent Collaboration tutorial
- Docker installed (for database setup)
- Basic understanding of vector databases
## Learning Objectives
By the end of this tutorial, you'll understand:
- How to set up vector databases for agent memory
- Document ingestion and chunking strategies
- RAG implementation for knowledge-aware responses
- Hybrid search combining semantic and keyword matching
## What You'll Build
A knowledge-aware agent system that can:
- Ingest documents into a vector database
- Search knowledge using semantic similarity
- Generate responses enhanced with retrieved context
- Remember conversations across sessions
## Part 1: Basic Memory Setup (5 minutes)
Start with in-memory storage to understand the concepts.
### Create a Memory-Enabled Project

```bash
# Create project with basic memory
agentcli create knowledge-agent --memory memory --agents 2
cd knowledge-agent
```
### Understanding Memory Configuration

The generated `agentflow.toml` includes memory settings:
```toml
[agent_memory]
provider = "memory"   # In-memory storage (temporary)
auto_embed = true     # Automatically create embeddings
max_results = 10      # Maximum search results
dimensions = 1536     # Embedding dimensions

[agent_memory.embedding]
provider = "ollama"   # Local embeddings (recommended)
model = "nomic-embed-text:latest"
```
### Test Basic Memory

```bash
# Make sure Ollama is running with the embedding model
ollama pull nomic-embed-text:latest

# Set your API key and run
export OPENAI_API_KEY=your-api-key-here
go run main.go
```
The agents now have basic memory capabilities, but data is lost when the program stops.
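Conceptually, the in-memory provider stores each remembered text alongside its embedding vector and ranks stored entries by cosine similarity at query time. Here is a minimal self-contained sketch of that idea, using toy 3-dimensional vectors instead of real 1536-dimensional embeddings; `entry`, `cosine`, and `search` are illustrative names, not AgenticGoKit's API:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// entry pairs a stored text with its embedding vector.
type entry struct {
	text string
	vec  []float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search ranks stored entries by similarity to the query vector and
// returns at most maxResults texts (the max_results setting above).
func search(store []entry, query []float64, maxResults int) []string {
	sort.SliceStable(store, func(i, j int) bool {
		return cosine(store[i].vec, query) > cosine(store[j].vec, query)
	})
	if len(store) > maxResults {
		store = store[:maxResults]
	}
	out := make([]string, len(store))
	for i, e := range store {
		out[i] = e.text
	}
	return out
}

func main() {
	// Toy embeddings; a real embedding provider produces these vectors.
	store := []entry{
		{"agents collaborate in parallel", []float64{1, 0, 0}},
		{"memory survives restarts", []float64{0, 1, 0}},
	}
	fmt.Println(search(store, []float64{0.9, 0.1, 0}, 1)) // closest entry first
}
```

Because everything lives in a Go slice, this kind of store vanishes on exit, which is exactly why Part 2 moves to a persistent provider.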
## Part 2: Persistent Memory with PostgreSQL (5 minutes)

Set up persistent memory using PostgreSQL with the pgvector extension.
### Create a Persistent Memory Project

```bash
# Create project with PostgreSQL memory
agentcli create persistent-agent --memory pgvector --rag default --agents 2
cd persistent-agent
```
### Start the Database

The project includes a `docker-compose.yml` file:

```bash
# Start PostgreSQL with pgvector extension
docker-compose up -d

# Wait for database to be ready (about 30 seconds)
docker-compose logs -f postgres
```
### Understanding Persistent Configuration
```toml
[agent_memory]
provider = "pgvector"         # PostgreSQL with vector extension
connection = "postgres://agentflow:password@localhost:5432/agentflow?sslmode=disable"
enable_knowledge_base = true  # Enable document storage
enable_rag = true             # Enable RAG functionality

[agent_memory.documents]
supported_types = ["pdf", "txt", "md", "web"]
auto_chunk = true             # Automatically chunk documents
chunk_size = 1000             # Tokens per chunk
chunk_overlap = 200           # Overlap between chunks

[agent_memory.search]
hybrid_search = true          # Combine semantic + keyword search
keyword_weight = 0.3          # 30% keyword, 70% semantic
semantic_weight = 0.7
```
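The `chunk_size`/`chunk_overlap` settings describe a sliding window over the document: each chunk repeats the tail of the previous one so that sentences straddling a boundary stay searchable. The sketch below illustrates the window using whitespace-separated words; a real pipeline counts model tokens, and `chunkWords` is a hypothetical helper, not part of AgenticGoKit:

```go
package main

import (
	"fmt"
	"strings"
)

// chunkWords splits text into chunks of at most size words, where each
// chunk repeats the last overlap words of the previous one. This mirrors
// what chunk_size and chunk_overlap configure, but over words instead of
// tokens for simplicity.
func chunkWords(text string, size, overlap int) []string {
	words := strings.Fields(text)
	var chunks []string
	step := size - overlap
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break
		}
	}
	return chunks
}

func main() {
	// With size 4 and overlap 2, consecutive chunks share two words.
	for _, c := range chunkWords("a b c d e f g h", 4, 2) {
		fmt.Printf("%q\n", c)
	}
}
```

Larger overlaps improve recall at chunk boundaries but store more duplicate text, which is the trade-off behind the 1000/200 defaults.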
### Test Persistent Memory

```bash
export OPENAI_API_KEY=your-api-key-here
go run main.go
```
Now your agents have persistent memory that survives restarts!
## Part 3: RAG Implementation (5 minutes)
Implement full RAG (Retrieval-Augmented Generation) with document ingestion.
### Create a RAG-Enabled System

```bash
# Create comprehensive RAG system
agentcli create rag-system --template rag-system
cd rag-system
```
### Start the Enhanced Database

```bash
docker-compose up -d
```
### Understanding RAG Configuration

```toml
[agent_memory]
provider = "pgvector"
enable_rag = true
rag_max_context_tokens = 4000      # Max context for RAG
rag_personal_weight = 0.3          # Weight for personal memory
rag_knowledge_weight = 0.7         # Weight for knowledge base

[agent_memory.documents]
enable_metadata_extraction = true  # Extract document metadata
enable_url_scraping = true         # Support web URLs
max_file_size = "10MB"             # Maximum document size

[agent_memory.search]
hybrid_search = true               # Semantic + keyword search
top_k = 5                          # Top results to retrieve
score_threshold = 0.7              # Minimum similarity score
```
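To see how `score_threshold`, `top_k`, and the personal/knowledge weights interact, consider this sketch of result ranking: weak matches are discarded first, the survivors are reweighted by source, and only the top k remain. The `hit` type and `rankContext` function are illustrative, not AgenticGoKit's implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// hit is one retrieval result with its raw similarity score and origin.
type hit struct {
	text   string
	score  float64 // similarity in [0, 1]
	source string  // "personal" or "knowledge"
}

// rankContext drops hits below the score threshold, scales the rest by
// the source weight (rag_personal_weight / rag_knowledge_weight), and
// returns the texts of the top k results.
func rankContext(hits []hit, threshold, personalW, knowledgeW float64, k int) []string {
	var kept []hit
	for _, h := range hits {
		if h.score < threshold {
			continue // score_threshold: discard weak matches
		}
		if h.source == "personal" {
			h.score *= personalW
		} else {
			h.score *= knowledgeW
		}
		kept = append(kept, h)
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].score > kept[j].score })
	if len(kept) > k {
		kept = kept[:k] // top_k: cap the context size
	}
	out := make([]string, len(kept))
	for i, h := range kept {
		out[i] = h.text
	}
	return out
}

func main() {
	hits := []hit{
		{"user prefers Go", 0.9, "personal"},
		{"pgvector stores embeddings", 0.8, "knowledge"},
		{"weak match", 0.5, "knowledge"},
	}
	// With the 0.3/0.7 weights, the knowledge hit outranks the personal one.
	fmt.Println(rankContext(hits, 0.7, 0.3, 0.7, 5))
}
```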
### Add Documents to the Knowledge Base

Create a sample document:

```bash
# Create a sample knowledge document
cat > knowledge.md << 'EOF'
# AgenticGoKit Knowledge Base

## Multi-Agent Systems

AgenticGoKit supports multiple orchestration patterns:

- Collaborative: Agents work in parallel
- Sequential: Agents work in a pipeline
- Mixed: Combination of both patterns

## Memory Systems

AgenticGoKit provides several memory providers:

- In-memory: Fast but temporary
- PostgreSQL: Persistent with pgvector
- Weaviate: Dedicated vector database

## Tool Integration

Agents can use external tools through MCP:

- Web search capabilities
- File operations
- Custom API integrations
EOF
```
### Test RAG System

```bash
export OPENAI_API_KEY=your-api-key-here
go run main.go
```
The system will:
- Ingest the knowledge document
- Chunk it into searchable pieces
- Embed chunks using vector embeddings
- Retrieve relevant context for queries
- Generate enhanced responses using RAG
### Query the Knowledge Base
The agents can now answer questions using the ingested knowledge:
- "What orchestration patterns does AgenticGoKit support?"
- "How do I set up persistent memory?"
- "What tools can agents use?"
## Memory Providers Comparison

| Provider | Persistence | Performance | Use Case |
|---|---|---|---|
| `memory` | ❌ Temporary | ⚡ Fastest | Development, testing |
| `pgvector` | ✅ Persistent | 🚀 Fast | Production, SQL integration |
| `weaviate` | ✅ Persistent | 🚀 Fast | Advanced vector operations |
## RAG Configuration Options

### Document Processing

```bash
# Customize document processing
agentcli create doc-system --memory pgvector --rag 512
```
### Search Configuration

```bash
# Fine-tune search behavior
agentcli create search-system --memory pgvector --rag default
```
### Embedding Models

```bash
# Use OpenAI embeddings (requires API key)
agentcli create openai-system --memory pgvector --embedding openai

# Use local Ollama embeddings (recommended)
agentcli create local-system --memory pgvector --embedding ollama:nomic-embed-text
```
## Advanced Memory Features

### Session Memory

```bash
# Enable session-based memory isolation
agentcli create session-system --template chat-system
```
Session memory keeps conversations separate for different users or contexts.
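The isolation boils down to keying every memory operation by a session ID, so one user's history never surfaces in another's retrieval. A minimal sketch of that idea (the `sessionStore` type is illustrative, not AgenticGoKit's session API):

```go
package main

import "fmt"

// sessionStore keeps a separate message history per session ID.
type sessionStore struct {
	histories map[string][]string
}

func newSessionStore() *sessionStore {
	return &sessionStore{histories: map[string][]string{}}
}

// Remember appends a message to the history of one session only.
func (s *sessionStore) Remember(sessionID, msg string) {
	s.histories[sessionID] = append(s.histories[sessionID], msg)
}

// History returns only the messages stored under the given session ID.
func (s *sessionStore) History(sessionID string) []string {
	return s.histories[sessionID]
}

func main() {
	store := newSessionStore()
	store.Remember("alice", "my favorite color is blue")
	store.Remember("bob", "my favorite color is red")
	fmt.Println(store.History("alice")) // only alice's messages
}
```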
### Hybrid Search

```bash
# Configure hybrid search weights
agentcli create hybrid-system --memory pgvector --rag default
```
Hybrid search combines:
- Semantic search: Understanding meaning and context
- Keyword search: Exact term matching
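The two signals are blended as a weighted sum using the `semantic_weight`/`keyword_weight` settings from earlier. A one-line illustration of the arithmetic (not the actual scoring code, which ranks against the index rather than two floats):

```go
package main

import "fmt"

// hybridScore blends a semantic similarity score with a keyword match
// score using the configured weights.
func hybridScore(semantic, keyword, semanticWeight, keywordWeight float64) float64 {
	return semanticWeight*semantic + keywordWeight*keyword
}

func main() {
	// A document that matches the query's meaning but not its exact terms
	// still ranks well under the default 70/30 split.
	fmt.Printf("%.2f\n", hybridScore(0.9, 0.2, 0.7, 0.3)) // prints 0.69
}
```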
## Troubleshooting

### Common Issues
**Database connection failed:**

```bash
# Check if PostgreSQL is running
docker-compose ps

# Check logs
docker-compose logs postgres

# Restart if needed
docker-compose restart postgres
```
**Embedding model not found:**

```bash
# For Ollama embeddings
ollama pull nomic-embed-text:latest
ollama list  # Verify model is installed

# Check Ollama is running
curl http://localhost:11434/api/tags
```
**RAG not working:**

- Verify documents are ingested
- Check the `agentflow.toml` configuration
- Ensure the embedding provider is working
### Performance Issues
**Slow search:**

- Reduce the `top_k` value
- Increase `score_threshold`
- Use smaller embedding models

**High memory usage:**

- Reduce `chunk_size`
- Limit `max_results`
- Use pgvector instead of in-memory storage
## Memory System Architecture

```text
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Documents    │────▶│    Chunking     │────▶│   Embeddings    │
│   (PDF, MD,     │     │  (1000 tokens)  │     │   (Vector DB)   │
│    TXT, Web)    │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Agent      │◀────│   RAG Context   │◀────│   Similarity    │
│    Response     │     │    Injection    │     │     Search      │
└─────────────────┘     └─────────────────┘     └─────────────────┘
```
## Next Steps
Now that your agents have memory and knowledge capabilities:
- Add Tools: Learn Tool Integration to connect external services
- Go Production: Check Production Deployment for scaling
- Advanced Memory: Explore Memory System Tutorials for deep dives
## Key Takeaways

- **Memory providers**: Choose based on persistence and performance needs
- **RAG**: Combines retrieval and generation for knowledge-aware responses
- **Document processing**: Automatic chunking and embedding make documents searchable
- **Hybrid search**: Combining semantic and keyword matching gives the best results
- **Session memory**: Isolates conversations for multi-user scenarios
## Further Reading
- Memory Systems Deep Dive - Advanced memory concepts
- Vector Databases Guide - Database comparison
- RAG Implementation - Advanced RAG patterns