Lab 7: Memory & Feedback Systems¶
⏱️ Estimated completion time: 50 minutes
Overview¶
This lab demonstrates a hybrid memory architecture that combines short-term buffer memory with long-term vector storage, featuring a feedback loop for continuous learning. It showcases:
- Typed memory structures and management
- Integration of memory systems in state graphs
- Context window creation for relevant information retrieval
- Feedback mechanisms for memory quality improvement
Learning Objectives¶
By the end of this lab, you will understand:
- How to implement hybrid memory systems in agents
- Short-term vs. long-term memory management strategies
- Context window construction for LLM interactions
- Memory quality assessment and feedback loops
Prerequisites¶
- Python 3.8+
- LangGraph installed (pip install langgraph)
- Understanding of vector databases (conceptual)
Key Concepts¶
Hybrid Memory Architecture¶
- Short-term Memory: Recent conversation history and interactions
- Long-term Memory: Persistent knowledge stored in vector databases
- Context Window: Combined relevant information for current processing
Memory Feedback Loop¶
- Quality assessment of retrieved information
- Continuous improvement of memory relevance
- Dynamic adjustment of memory importance scores
Lab Code¶
#!/usr/bin/env python3
"""
Chapter 7 - Hybrid Memory System with LangGraph
-----------------------------------------------
This example demonstrates how to implement a hybrid memory architecture that combines:
1. Short-term buffer memory for recent interactions
2. Long-term vector storage for semantic search
3. Memory feedback loop for continuous learning
Key concepts:
- Typed memory structures
- Memory integration in the state graph
- Context windows for relevant information retrieval
- Feedback mechanism for memory quality improvement
"""
import argparse
import time
from datetime import datetime
from typing import Any, Dict, List, TypedDict
from langgraph.graph import StateGraph
# ---------------------------------------------------------------------------
# Mock vector database ------------------------------------------------------
# ---------------------------------------------------------------------------
class MockVectorDB:
"""Simple mock vector database for demonstration purposes."""
def __init__(self):
self.docs = []
self.doc_id = 0
def add_document(self, text: str, metadata: Dict) -> int:
"""Add document to the store and return its ID."""
self.doc_id += 1
self.docs.append({
"id": self.doc_id,
"text": text,
"metadata": metadata,
"created_at": datetime.now().isoformat()
})
print(f"Added document #{self.doc_id} to vector store")
return self.doc_id
def search(self, query: str, top_k: int = 3) -> List[Dict]:
"""
Search for relevant documents.
This is a mock implementation that uses simple keyword matching.
In a real system, this would use embeddings and vector similarity.
"""
# Simulate retrieval latency
time.sleep(0.1)
# Simple keyword search (in real code, would be vector similarity)
matches = []
query_terms = set(query.lower().split())
# Find docs with term overlap
for doc in self.docs:
doc_terms = set(doc["text"].lower().split())
# Calculate simple overlap score
overlap = len(query_terms.intersection(doc_terms))
if overlap > 0:
matches.append({
"id": doc["id"],
"text": doc["text"],
"metadata": doc["metadata"],
"score": overlap / len(query_terms) # Normalize score
})
# Sort by score and take top_k
sorted_matches = sorted(matches, key=lambda x: x["score"], reverse=True)
return sorted_matches[:top_k]
def update_document_quality(self, doc_id: int, quality_score: float) -> None:
"""Update quality score metadata for a document."""
for doc in self.docs:
if doc["id"] == doc_id:
doc["metadata"]["quality"] = quality_score
print(f"Updated quality score for document #{doc_id}: {quality_score:.2f}")
break
# ---------------------------------------------------------------------------
# State definition ----------------------------------------------------------
# ---------------------------------------------------------------------------
class MemoryEntry(TypedDict):
text: str
timestamp: str
type: str # 'user' or 'agent'
metadata: Dict[str, Any]
class MemoryState(TypedDict, total=False):
query: str # Current user query
short_term: List[MemoryEntry] # Recent conversation history
long_term_results: List[Dict] # Results from long-term memory
context_window: List[str] # Combined context for the agent
response: str # Agent's response
memory_quality: Dict[int, float] # Quality ratings for memory items
# ---------------------------------------------------------------------------
# Initialize vector store ---------------------------------------------------
# ---------------------------------------------------------------------------
# This would typically be a persistent database
vector_db = MockVectorDB()
# Populate with some initial knowledge
initial_knowledge = [
{
"text": "The user prefers vegetarian food options when traveling.",
"metadata": {"source": "preference", "category": "food", "quality": 0.9}
},
{
"text": "The user traveled to Japan in 2022 and enjoyed the cherry blossom season.",
"metadata": {"source": "travel_history", "category": "location", "quality": 0.8}
},
{
"text": "The user likes hotels with good gym facilities and high-speed internet.",
"metadata": {"source": "preference", "category": "accommodation", "quality": 0.7}
},
{
"text": "The user prefers window seats on flights and typically books economy class.",
"metadata": {"source": "preference", "category": "transport", "quality": 0.85}
},
{
"text": "The user is interested in historical sites and museums when visiting new places.",
"metadata": {"source": "preference", "category": "activities", "quality": 0.75}
}
]
# Add initial knowledge to the vector store
for item in initial_knowledge:
vector_db.add_document(item["text"], item["metadata"])
# ---------------------------------------------------------------------------
# Memory system nodes -------------------------------------------------------
# ---------------------------------------------------------------------------
def update_short_term_memory(state: MemoryState) -> MemoryState:
"""
Update short-term memory with the current query.
In a real system, this would also include the previous response if available.
"""
# Initialize short-term memory if it doesn't exist
if "short_term" not in state:
state["short_term"] = [] # type: ignore
# Add the current query to short-term memory
new_entry: MemoryEntry = {
"text": state["query"],
"timestamp": datetime.now().isoformat(),
"type": "user",
"metadata": {"source": "conversation"}
}
# Append to short-term memory
state["short_term"].append(new_entry) # type: ignore
    # Limit the size of short-term memory, keeping only the last 10 entries
    if len(state["short_term"]) > 10:
        state["short_term"] = state["short_term"][-10:]  # type: ignore
print(f"Updated short-term memory with query: {state['query']}")
return state
def search_long_term_memory(state: MemoryState) -> MemoryState:
"""
Retrieve relevant information from long-term (vector) memory.
"""
query = state["query"]
# Search the vector database
results = vector_db.search(query, top_k=3)
# Store results in state
state["long_term_results"] = results # type: ignore
# Log the retrieved information
print(f"Retrieved {len(results)} items from long-term memory")
for i, result in enumerate(results):
print(f" {i+1}. {result['text']} (score: {result['score']:.2f})")
return state
def build_context_window(state: MemoryState) -> MemoryState:
"""
Combine short-term and long-term memory to create a context window.
"""
# Initialize context window
context_items = []
# Add relevant items from short-term memory
short_term = state.get("short_term", [])
for item in short_term[-5:]: # Last 5 interactions
context_items.append(f"Recent interaction: {item['text']}")
# Add relevant items from long-term memory
long_term = state.get("long_term_results", [])
for item in long_term:
quality = item["metadata"].get("quality", 0.5)
if item["score"] > 0.2 and quality > 0.6: # Filter by relevance and quality
context_items.append(f"From memory ({item['metadata']['category']}): {item['text']}")
# Store the assembled context window
state["context_window"] = context_items # type: ignore
print(f"Built context window with {len(context_items)} items")
return state
def generate_response(state: MemoryState) -> MemoryState:
"""
Generate a response based on the query and context window.
In a real system, this would use an LLM. Here we'll use a simple template.
"""
query = state["query"]
context = state.get("context_window", [])
# Simple template-based response for demo purposes
# In a real system, this would be an LLM call using the context
response = f"I'm responding to your query: '{query}'\n"
    if context:
        response += "\nI've considered the following information:\n"
        for item in context:
            response += f"- {item}\n"
    else:
        # Default response when no context is available
        response += "\nI don't have much context about this query yet."
# Store the response
state["response"] = response # type: ignore
# Add the response to short-term memory for future context
new_entry: MemoryEntry = {
"text": response,
"timestamp": datetime.now().isoformat(),
"type": "agent",
"metadata": {"source": "conversation"}
}
state["short_term"].append(new_entry) # type: ignore
return state
def provide_memory_feedback(state: MemoryState) -> MemoryState:
"""
Evaluate the quality of retrieved memory items based on their usefulness.
In a real system, this would use an LLM to evaluate relevance to the query.
"""
query = state["query"]
long_term = state.get("long_term_results", [])
# Initialize memory quality tracking if not present
if "memory_quality" not in state:
state["memory_quality"] = {} # type: ignore
# Evaluate each retrieved memory item
for item in long_term:
doc_id = item["id"]
relevance = item["score"]
# Simple quality score based on relevance
# In a real system, this would be a more sophisticated evaluation
quality = min(0.95, relevance * 1.2)
# Store quality score in state
state["memory_quality"][doc_id] = quality # type: ignore
# Update quality score in vector database
vector_db.update_document_quality(doc_id, quality)
return state
def store_new_memory(state: MemoryState) -> MemoryState:
"""
Store important new information in long-term memory.
In a real system, this would use an LLM to extract key information.
"""
query = state["query"]
# Simple heuristic: store queries that look like preferences or facts
# In a real system, this would use an LLM to extract key information
if "I like" in query or "I prefer" in query or "I want" in query:
# Extract what seems to be a preference
metadata = {
"source": "preference",
"category": "general",
"quality": 0.8,
"extracted_from": "user query"
}
# Add to vector store
vector_db.add_document(query, metadata)
print(f"Stored new preference in long-term memory: {query}")
return state
# ---------------------------------------------------------------------------
# Graph construction --------------------------------------------------------
# ---------------------------------------------------------------------------
def build_memory_graph() -> StateGraph:
"""Build the graph for the memory system."""
g = StateGraph(MemoryState)
# Define the processing nodes
g.add_node("update_short_term", update_short_term_memory)
g.add_node("search_long_term", search_long_term_memory)
g.add_node("build_context", build_context_window)
g.add_node("generate_response", generate_response)
g.add_node("provide_feedback", provide_memory_feedback)
g.add_node("store_new_memory", store_new_memory)
# Define the flow
g.set_entry_point("update_short_term")
g.add_edge("update_short_term", "search_long_term")
g.add_edge("search_long_term", "build_context")
g.add_edge("build_context", "generate_response")
g.add_edge("generate_response", "provide_feedback")
g.add_edge("provide_feedback", "store_new_memory")
# Set the exit point
g.set_finish_point("store_new_memory")
return g
# ---------------------------------------------------------------------------
# Main function -------------------------------------------------------------
# ---------------------------------------------------------------------------
def main():
# Parse command-line arguments
parser = argparse.ArgumentParser(description="Hybrid Memory System Demo")
parser.add_argument("--query", type=str, default="Can you recommend some vegetarian restaurants for my trip?",
help="User query to process")
args = parser.parse_args()
# Build and compile the memory graph
graph = build_memory_graph().compile()
# Create initial state with user query
initial_state: MemoryState = {"query": args.query}
# Print header
print("\n=== Hybrid Memory System Demo ===\n")
print(f"Processing query: \"{args.query}\"\n")
# Execute the graph
final_state = graph.invoke(initial_state)
# Display the response
print("\n=== Agent Response ===\n")
print(final_state["response"])
# Show memory statistics
print("\n=== Memory System Stats ===")
print(f"Short-term memory size: {len(final_state.get('short_term', []))} entries")
print(f"Long-term memory items retrieved: {len(final_state.get('long_term_results', []))} items")
print(f"Context window items: {len(final_state.get('context_window', []))} items")
# Process a follow-up query to demonstrate memory continuity
if args.query != "I prefer hotels with swimming pools and room service":
print("\n=== Processing Follow-up Query ===")
follow_up = "I prefer hotels with swimming pools and room service"
print(f"Follow-up query: \"{follow_up}\"\n")
# Preserve short-term memory from previous interaction
second_state: MemoryState = {
"query": follow_up,
"short_term": final_state.get("short_term", [])
}
# Execute the graph again
follow_up_state = graph.invoke(second_state)
# Display the response
print("\n=== Agent Response to Follow-up ===\n")
print(follow_up_state["response"])
if __name__ == "__main__":
main()
How to Run¶
- Save the code above as 07_memory_feedback.py
- Install dependencies: pip install langgraph
- Run the script: python 07_memory_feedback.py
- Try with custom queries: python 07_memory_feedback.py --query "I like luxury hotels with spa facilities"
Expected Output¶
Added document #1 to vector store
Added document #2 to vector store
Added document #3 to vector store
Added document #4 to vector store
Added document #5 to vector store
=== Hybrid Memory System Demo ===
Processing query: "Can you recommend some vegetarian restaurants for my trip?"
Updated short-term memory with query: Can you recommend some vegetarian restaurants for my trip?
Retrieved 1 items from long-term memory
1. The user prefers vegetarian food options when traveling. (score: 0.25)
Built context window with 2 items
Updated quality score for document #1: 0.30
=== Agent Response ===
I'm responding to your query: 'Can you recommend some vegetarian restaurants for my trip?'
I've considered the following information:
- Recent interaction: Can you recommend some vegetarian restaurants for my trip?
- From memory (food): The user prefers vegetarian food options when traveling.
=== Memory System Stats ===
Short-term memory size: 2 entries
Long-term memory items retrieved: 1 items
Context window items: 2 items
=== Processing Follow-up Query ===
Follow-up query: "I prefer hotels with swimming pools and room service"
Updated short-term memory with query: I prefer hotels with swimming pools and room service
Retrieved 1 items from long-term memory
1. The user likes hotels with good gym facilities and high-speed internet. (score: 0.20)
Built context window with 3 items
Updated quality score for document #3: 0.24
Added document #6 to vector store
Stored new preference in long-term memory: I prefer hotels with swimming pools and room service
=== Agent Response to Follow-up ===
I'm responding to your query: 'I prefer hotels with swimming pools and room service'
I've considered the following information:
- Recent interaction: Can you recommend some vegetarian restaurants for my trip?
- Recent interaction: I'm responding to your query: 'Can you recommend some vegetarian restaurants for my trip?'
I've considered the following information:
- From memory (food): The user prefers vegetarian food options when traveling.
- Recent interaction: I prefer hotels with swimming pools and room service
Key Concepts Explained¶
Hybrid Memory Architecture¶
graph TD
A[User Query] --> B[Update Short-term Memory]
B --> C[Search Long-term Memory]
C --> D[Build Context Window]
D --> E[Generate Response]
E --> F[Provide Feedback]
F --> G[Store New Memory]
H[Vector Database] --> C
C --> H
F --> H
G --> H
Memory Components¶
Short-term Memory¶
- Purpose: Store recent conversation history
- Characteristics: Limited size, ephemeral, fast access
- Use Cases: Context for current conversation, recent user preferences
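A minimal sketch of such a buffer, using collections.deque with maxlen so the oldest entries are evicted automatically (the entry shape mirrors the lab's MemoryEntry):

from collections import deque
from datetime import datetime

# Bounded short-term buffer: deque(maxlen=...) drops the oldest entry
# once the limit is reached, giving a fixed size and fast appends.
short_term = deque(maxlen=10)

def remember(text: str, speaker: str) -> None:
    short_term.append({
        "text": text,
        "type": speaker,  # 'user' or 'agent'
        "timestamp": datetime.now().isoformat(),
        "metadata": {"source": "conversation"},
    })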
Long-term Memory¶
- Purpose: Store persistent knowledge and learned information
- Characteristics: Effectively unbounded size, persistent, semantic search
- Use Cases: User preferences, historical data, domain knowledge
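In production, the lab's keyword-matching mock would be replaced by embedding similarity. A minimal sketch, assuming a hypothetical embed() call into whatever embedding model you use and documents that carry a precomputed "embedding" vector:

import math
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(query: str, docs: List[Dict], top_k: int = 3) -> List[Dict]:
    q_vec = embed(query)  # embed() is an assumed embedding-model call
    scored = [{**doc, "score": cosine(q_vec, doc["embedding"])} for doc in docs]
    return sorted(scored, key=lambda d: d["score"], reverse=True)[:top_k]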
Context Window¶
- Purpose: Combine relevant information for LLM processing
- Characteristics: Filtered and ranked information
- Use Cases: Prompt augmentation, relevant context delivery
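Because model context is finite, the assembled window is usually trimmed to a budget. A rough sketch, assuming items arrive pre-sorted by relevance and using a crude four-characters-per-token estimate (substitute a real tokenizer in practice):

def trim_to_budget(items: list, max_tokens: int = 1000) -> list:
    """Keep the highest-priority items that fit an approximate token budget."""
    window, used = [], 0
    for text in items:
        cost = len(text) // 4 + 1  # crude token estimate
        if used + cost > max_tokens:
            break
        window.append(text)
        used += cost
    return window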
Memory Feedback Loop¶
Quality Assessment¶
- Relevance scoring for retrieved information
- Usage tracking for memory items
- Continuous improvement of retrieval quality
Dynamic Updating¶
- Real-time quality score adjustments
- Preference learning from interactions
- Memory importance weighting
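One simple way to realize these adjustments is an exponential moving average, so each new relevance observation nudges the stored quality score instead of overwriting it; the 0.3 learning rate below is an arbitrary assumption:

def update_quality(old_quality: float, observed_relevance: float,
                   alpha: float = 0.3) -> float:
    """Blend the prior quality score with the latest relevance signal."""
    return (1 - alpha) * old_quality + alpha * observed_relevance

# Example: a 0.9-quality memory that scores poorly (0.2) on one turn
# drifts down to 0.69 rather than collapsing straight to 0.2.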
Advanced Patterns¶
Hierarchical Memory¶
class HierarchicalMemory:
"""Multi-level memory system with different retention policies."""
def __init__(self):
self.working_memory = [] # Current session
self.episodic_memory = [] # Recent sessions
self.semantic_memory = {} # Long-term facts
self.procedural_memory = {} # Learned procedures
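A hypothetical promotion step, run at session end, might move noteworthy working-memory items up a level; the importance threshold here is an assumption:

def end_session(mem: HierarchicalMemory, threshold: float = 0.7) -> None:
    """Promote important working-memory items to episodic memory, then clear."""
    for item in mem.working_memory:
        if item.get("importance", 0.0) >= threshold:
            mem.episodic_memory.append(item)
    mem.working_memory = []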
Memory Consolidation¶
def consolidate_memory(short_term: List[MemoryEntry]) -> List[Dict]:
    """Convert short-term memories to long-term storage.
    Note: is_important, extract_key_information, and enrich_metadata are
    placeholder helpers (often LLM-backed) that you would implement yourself.
    """
    consolidated = []
    for entry in short_term:
        if is_important(entry):  # heuristic or LLM-based importance check
            consolidated.append({
                "text": extract_key_information(entry),
                "metadata": enrich_metadata(entry),
                "consolidation_time": datetime.now().isoformat()
            })
    return consolidated
Forgetting Mechanisms¶
import math

def apply_forgetting_curve(memory_item: Dict) -> float:
    """Apply an Ebbinghaus-style forgetting curve to memory importance."""
    # created_at is stored as an ISO-format string (see MockVectorDB.add_document)
    created_at = datetime.fromisoformat(memory_item["created_at"])
    time_elapsed = datetime.now() - created_at
    access_frequency = memory_item.get("access_count", 1)
    # Exponential decay, slowed by frequent access (reinforcement)
    importance = math.exp(-time_elapsed.days / (access_frequency * 30))
    return max(0.1, importance)  # Minimum retention floor
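A possible use, assuming documents shaped like the mock store's entries: periodically sweep the store and drop anything that has decayed to the retention floor:

# Hypothetical pruning pass over the mock store's documents
vector_db.docs = [
    doc for doc in vector_db.docs
    if apply_forgetting_curve(doc) > 0.1  # drop items stuck at the floor
]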
Exercises¶
- Implement forgetting curves: Add time-based decay to memory importance
- Add memory categories: Implement specialized storage for different information types
- Create memory visualization: Build a dashboard showing memory state and quality
- Implement memory search ranking: Add more sophisticated retrieval algorithms
- Add external memory sources: Integrate with external knowledge bases
Real-World Applications¶
- Personal Assistants: Remember user preferences and history
- Customer Service: Maintain conversation context and customer history
- Educational Tutors: Track learning progress and adapt content
- Healthcare Agents: Maintain patient history and treatment context
- E-commerce: Personalized recommendations based on behavior history
Performance Considerations¶
- Memory Size Management: Implement efficient pruning strategies
- Search Optimization: Use vector similarity for fast retrieval
- Context Window Limits: Balance information richness with processing speed
- Quality vs. Quantity: Trade-off between memory completeness and relevance
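As one sketch of size management (the cap of 1,000 documents is an arbitrary assumption), evict the lowest-quality items once the store grows past a limit:

def prune_store(db: MockVectorDB, max_docs: int = 1000) -> None:
    """Keep only the highest-quality documents when the store exceeds the cap."""
    if len(db.docs) > max_docs:
        db.docs.sort(key=lambda d: d["metadata"].get("quality", 0.5), reverse=True)
        del db.docs[max_docs:]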
Integration with Production Systems¶
Vector Database Integration¶
# Example with Pinecone, Weaviate, or Chroma.
# Note: this uses the legacy pinecone-client interface; current releases
# use `from pinecone import Pinecone` instead of pinecone.init().
import pinecone

def create_production_memory():
    """Initialize a production-ready vector database connection."""
    pinecone.init(api_key="your-api-key", environment="your-environment")
    index = pinecone.Index("agent-memory")
    return ProductionMemorySystem(index)  # wrapper class sketched below
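ProductionMemorySystem is a placeholder; a skeleton for it might look like the following, assuming a hypothetical embed() helper and the legacy client's upsert/query methods:

class ProductionMemorySystem:
    """Hypothetical wrapper pairing an embedding model with a Pinecone index."""
    def __init__(self, index):
        self.index = index

    def add(self, doc_id: str, text: str, metadata: dict) -> None:
        # Store the raw text alongside the vector so results are self-describing
        self.index.upsert(vectors=[(doc_id, embed(text), {**metadata, "text": text})])

    def search(self, query: str, top_k: int = 3):
        return self.index.query(vector=embed(query), top_k=top_k,
                                include_metadata=True)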
LLM Integration¶
def llm_memory_extraction(text: str) -> Dict:
    """Use an LLM to extract structured information from text.
    Note: llm is any chat-model client with an invoke() method, and
    parse_structured_response is a placeholder output parser.
    """
    prompt = f"Extract key facts and preferences from: {text}"
    response = llm.invoke(prompt)
    return parse_structured_response(response)
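parse_structured_response is likewise a placeholder; if the prompt instructs the model to reply in JSON, a minimal version (assuming the response object exposes its text as .content, as LangChain chat models do) could be:

import json

def parse_structured_response(response) -> Dict:
    """Parse a JSON object out of the model's reply; fall back to raw text."""
    try:
        return json.loads(response.content)
    except (AttributeError, TypeError, json.JSONDecodeError):
        return {"raw": getattr(response, "content", str(response))}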