Overview
Conversation memory is the foundation of agent memory systems. This architecture handles storing, summarizing, and retrieving conversation history to maintain continuity across sessions while managing context window limitations.
The Challenge
Context Window Limits
LLMs have finite context windows:
Session Continuity
Users expect agents to remember:
Architecture Components
┌─────────────────────────────────────────────────────────────┐
│                       Current Session                       │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  Message 1 → Message 2 → Message 3 → ... → Message N  │  │
│  └───────────────────────────────────────────────────────┘  │
└──────────────────────────┬──────────────────────────────────┘
                           │ session end
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                   Summarization Pipeline                    │
│  ├── Extract key facts and decisions                        │
│  ├── Identify action items and outcomes                     │
│  ├── Note preferences expressed                             │
│  └── Generate session summary                               │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                       Memory Storage                        │
│     ┌─────────────────┐             ┌─────────────────┐     │
│     │ Session Index   │             │ Full Transcripts│     │
│     │ (Summaries +    │             │ (Raw messages   │     │
│     │  Embeddings)    │             │  if needed)     │     │
│     └─────────────────┘             └─────────────────┘     │
└──────────────────────────┬──────────────────────────────────┘
                           │ new session starts
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      Context Assembly                       │
│  ├── Retrieve relevant past session summaries               │
│  ├── Load any ongoing task context                          │
│  ├── Include user profile/preferences                       │
│  └── Assemble into system prompt                            │
└─────────────────────────────────────────────────────────────┘
Message Storage
What to Store
For each message:
Message:
├── id: unique identifier
├── session_id: which conversation
├── user_id: whose conversation
├── role: user | assistant
├── content: message text
├── timestamp: when sent
├── tokens: token count
└── metadata: any additional context
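As a rough illustration, the message fields above could map onto a Python dataclass like the one below. The `rough_token_count` heuristic and the `make_message` helper are hypothetical names introduced for this sketch; a real system would use the tokenizer that matches its model.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Literal


def rough_token_count(text: str) -> int:
    """Crude placeholder (about 4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


@dataclass
class Message:
    """One stored message, mirroring the fields listed above."""
    session_id: str
    user_id: str
    role: Literal["user", "assistant"]
    content: str
    tokens: int
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: dict[str, Any] = field(default_factory=dict)


def make_message(session_id: str, user_id: str, role: str, content: str) -> Message:
    """Build a Message, rejecting roles outside the documented set."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    return Message(session_id, user_id, role, content, rough_token_count(content))


msg = make_message("s-1", "u-1", "user", "Remind me about the deploy checklist.")
```

Storing the token count at write time means later budgeting steps do not need to re-tokenize every message.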
Session Metadata
For each conversation session:
Session:
├── id: unique identifier
├── user_id: whose session
├── started_at: timestamp
├── ended_at: timestamp
├── message_count: number of messages
├── summary: generated summary
├── summary_embedding: for retrieval
├── topics: extracted topics
└── outcome: resolved | ongoing | abandoned
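The session record could be sketched in the same way. The `Outcome` enum and the `close` method are assumptions made for illustration; the source only lists the fields and the three outcome values.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    RESOLVED = "resolved"
    ONGOING = "ongoing"
    ABANDONED = "abandoned"


@dataclass
class Session:
    """Per-session record, mirroring the fields listed above."""
    id: str
    user_id: str
    started_at: datetime
    ended_at: Optional[datetime] = None
    message_count: int = 0
    summary: Optional[str] = None
    summary_embedding: Optional[list[float]] = None
    topics: list[str] = field(default_factory=list)
    outcome: Outcome = Outcome.ONGOING

    def close(self, summary: str, embedding: list[float],
              topics: list[str], outcome: Outcome) -> None:
        """Fill in the fields that only exist once the session has ended."""
        self.ended_at = datetime.now(timezone.utc)
        self.summary = summary
        self.summary_embedding = embedding
        self.topics = list(topics)
        self.outcome = outcome


session = Session(id="s-1", user_id="u-1", started_at=datetime.now(timezone.utc))
session.close("Discussed deploy steps.", [0.1, 0.2], ["deploy"], Outcome.RESOLVED)
```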
Summarization Strategy
When to Summarize
What to Extract
**Session Summary:**
**Facts Learned:**
**Task State:**
Summarization Prompt
Summarize this conversation for future reference:
[CONVERSATION]
Extract:
5. One-paragraph summary
Format as structured JSON.
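A minimal sketch of how such a prompt might be filled in and its reply parsed. The helper names and the `summary` key in the sample reply are hypothetical; the actual JSON keys depend on the fields the prompt asks for. Models sometimes wrap JSON in a Markdown fence, so the parser tolerates that.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker


def render_prompt(template: str, messages: list[tuple[str, str]]) -> str:
    """Replace the [CONVERSATION] placeholder with a 'role: text' transcript."""
    transcript = "\n".join(f"{role}: {text}" for role, text in messages)
    return template.replace("[CONVERSATION]", transcript)


def parse_summary_reply(reply: str) -> dict:
    """Parse the model's JSON reply, tolerating a surrounding Markdown fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]          # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]                 # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)                    # raises json.JSONDecodeError on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


template = (
    "Summarize this conversation for future reference:\n"
    "[CONVERSATION]\n"
    "Format as structured JSON."
)
prompt = render_prompt(template, [("user", "Hi"), ("assistant", "Hello!")])
reply = FENCE + 'json\n{"summary": "Greeting exchanged."}\n' + FENCE
parsed = parse_summary_reply(reply)
```

Failing loudly on malformed output lets the caller retry the summarization rather than store a broken record.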
Retrieval Strategy
Starting a New Session
- Recency (last few sessions)
- Relevance (if topic known)
- Importance (flagged sessions)
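One way to combine the three signals above is a weighted score. The sketch below is illustrative only: the weights, the 7-day half-life, and the `SessionRecord` shape are assumptions, not values from the source.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass
class SessionRecord:
    id: str
    ended_at: datetime
    summary_embedding: Sequence[float]
    flagged: bool = False  # "importance" signal


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(session: SessionRecord, now: datetime,
          query_embedding: Optional[Sequence[float]] = None,
          half_life_days: float = 7.0) -> float:
    """Weighted mix of recency, relevance, and importance (weights are arbitrary)."""
    age_days = max(0.0, (now - session.ended_at).total_seconds() / 86400)
    recency = 0.5 ** (age_days / half_life_days)   # halves every half_life_days
    relevance = 0.0
    if query_embedding is not None:                # topic may be unknown at start
        relevance = cosine(query_embedding, session.summary_embedding)
    importance = 1.0 if session.flagged else 0.0
    return 0.4 * recency + 0.4 * relevance + 0.2 * importance


def select_sessions(sessions: Sequence[SessionRecord], now: datetime,
                    query_embedding: Optional[Sequence[float]] = None,
                    limit: int = 3) -> list[SessionRecord]:
    """Return the top-scoring sessions, best first."""
    ranked = sorted(sessions, key=lambda s: score(s, now, query_embedding), reverse=True)
    return ranked[:limit]
```

When no topic is known yet, the relevance term drops out and the ranking falls back to recency plus flagged sessions.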
During Conversation
When the user references past discussion:
Context Budget
Divide the limited context window among its consumers, for example:
Total Context: 8000 tokens
├── System Prompt: 500 tokens
├── User Profile: 200 tokens
├── Recent Sessions: 1000 tokens (2-3 summaries)
├── Current Session: 5000 tokens
└── Buffer: 1300 tokens
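The allocation above can be expressed as a small budget object plus a helper that keeps items until a slot is full. The names `ContextBudget` and `fit_within` are hypothetical, and the token counts per summary are made-up sample values.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ContextBudget:
    """Token allocation matching the example breakdown above."""
    system_prompt: int = 500
    user_profile: int = 200
    recent_sessions: int = 1000
    current_session: int = 5000
    buffer: int = 1300

    @property
    def total(self) -> int:
        return sum(getattr(self, f.name) for f in fields(self))


def fit_within(items: list[tuple[str, int]], limit: int) -> list[str]:
    """Keep (text, token_count) items in priority order until one would exceed the limit."""
    kept, used = [], 0
    for text, tokens in items:
        if used + tokens > limit:
            break
        kept.append(text)
        used += tokens
    return kept


budget = ContextBudget()
summaries = [("summary A", 400), ("summary B", 450), ("summary C", 300)]
selected = fit_within(summaries, budget.recent_sessions)
```

Here the third summary is dropped because 400 + 450 + 300 would exceed the 1000-token slot for recent sessions.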
Sliding Window Patterns
Simple Sliding Window
Keep last N messages:
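A minimal sketch of this pattern, using `collections.deque` with `maxlen` so the oldest message is dropped automatically. The class name and message shape are assumptions for illustration.

```python
from collections import deque


class SlidingWindow:
    """Keeps only the most recent max_messages messages."""

    def __init__(self, max_messages: int) -> None:
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        # A deque with maxlen silently discards the oldest item when full.
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def context(self) -> list[dict[str, str]]:
        """Messages to send to the model, oldest first."""
        return list(self._messages)


window = SlidingWindow(max_messages=3)
for i in range(5):
    window.add("user", f"m{i}")
```

After five messages the window holds only `m2`, `m3`, and `m4`; anything older is lost, which is the main weakness of this pattern.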
Summarize and Slide
Summarize older messages:
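One possible shape for this pattern: keep a running summary plus the most recent messages, and fold the oldest messages into the summary when the window overflows. The `concat_summarizer` below is a stand-in for an LLM call, and `max_recent` and `fold_size` are hypothetical parameters.

```python
from typing import Callable

# (previous_summary, messages_to_fold) -> new_summary
Summarizer = Callable[[str, list[str]], str]


def concat_summarizer(previous: str, messages: list[str]) -> str:
    """Stand-in for an LLM summarization call: just joins the text."""
    parts = ([previous] if previous else []) + messages
    return " | ".join(parts)


class SummarizingWindow:
    def __init__(self, summarize: Summarizer, max_recent: int = 4, fold_size: int = 2) -> None:
        if not 1 <= fold_size <= max_recent:
            raise ValueError("fold_size must be between 1 and max_recent")
        self.summarize = summarize
        self.max_recent = max_recent
        self.fold_size = fold_size
        self.summary = ""
        self.recent: list[str] = []

    def add(self, message: str) -> None:
        self.recent.append(message)
        if len(self.recent) > self.max_recent:
            # Fold the oldest messages into the running summary.
            oldest = self.recent[:self.fold_size]
            self.recent = self.recent[self.fold_size:]
            self.summary = self.summarize(self.summary, oldest)

    def context(self) -> list[str]:
        """Summary of older messages (if any) followed by the recent ones."""
        head = [f"Summary of earlier conversation: {self.summary}"] if self.summary else []
        return head + self.recent


win = SummarizingWindow(concat_summarizer, max_recent=3, fold_size=2)
for text in ["m1", "m2", "m3", "m4", "m5"]:
    win.add(text)
```

Folding several messages at once rather than one at a time reduces the number of summarization calls.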
Hierarchical Summarization
Multiple levels of compression:
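The source does not spell out the levels, so the sketch below assumes a generic scheme: group items into fixed-size chunks, summarize each chunk, and repeat until a single top-level summary remains. `build_hierarchy`, `fan_in`, and the bracket-joining summarizer are hypothetical stand-ins.

```python
from typing import Callable, Sequence


def build_hierarchy(items: Sequence[str],
                    summarize: Callable[[list[str]], str],
                    fan_in: int = 3) -> list[list[str]]:
    """Return all levels, from raw items (level 0) up to a single summary."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    levels = [list(items)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        chunks = [current[i:i + fan_in] for i in range(0, len(current), fan_in)]
        levels.append([summarize(chunk) for chunk in chunks])
    return levels


def bracket_join(chunk: list[str]) -> str:
    """Stand-in for an LLM call; makes the grouping visible."""
    return "[" + "+".join(chunk) + "]"


levels = build_hierarchy(["a", "b", "c", "d", "e", "f", "g"], bracket_join, fan_in=3)
```

With seven items and a fan-in of three, level 1 holds three summaries and level 2 holds one. Retrieval can then pick the coarsest level that fits the available budget.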
Implementation Example
Session End Handler
async function onSessionEnd(sessionId):
    messages = await getSessionMessages(sessionId)
    summary = await llm.summarize(messages)
    embedding = await embed(summary.text)

    await storage.saveSession({
        id: sessionId,
        summary: summary.text,
        summary_embedding: embedding,
        topics: summary.topics,
        facts: summary.facts,
        ended_at: now()
    })

    // Also save individual facts to memory
    for fact in summary.facts:
        await memory.add({
            content: fact,
            source: sessionId,
            type: "conversation_fact"
        })
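For readers who want something executable, here is a hedged Python sketch of the same flow using `asyncio`. The in-memory store, `fake_summarize`, and `fake_embed` are stand-ins invented for this example; a real system would call an LLM, an embedding model, and a database.

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Summary:
    text: str
    topics: list[str]
    facts: list[str]


class InMemoryStore:
    """Stand-in for both the session storage and the fact memory."""
    def __init__(self) -> None:
        self.sessions: dict[str, dict] = {}
        self.facts: list[dict] = []

    async def save_session(self, record: dict) -> None:
        self.sessions[record["id"]] = record

    async def add_fact(self, record: dict) -> None:
        self.facts.append(record)


async def fake_summarize(messages: list[str]) -> Summary:
    # Placeholder: treats lines starting with "FACT:" as extracted facts.
    facts = [m[len("FACT:"):].strip() for m in messages if m.startswith("FACT:")]
    return Summary(text=f"{len(messages)} messages exchanged", topics=["demo"], facts=facts)


async def fake_embed(text: str) -> list[float]:
    return [float(len(text))]  # placeholder one-dimensional "embedding"


async def on_session_end(session_id: str, messages: list[str], store: InMemoryStore,
                         summarize=fake_summarize, embed=fake_embed) -> Summary:
    summary = await summarize(messages)
    embedding = await embed(summary.text)
    await store.save_session({
        "id": session_id,
        "summary": summary.text,
        "summary_embedding": embedding,
        "topics": summary.topics,
        "facts": summary.facts,
        "ended_at": datetime.now(timezone.utc),
    })
    # Facts are independent of each other, so they can be written concurrently.
    await asyncio.gather(*(
        store.add_fact({"content": fact, "source": session_id, "type": "conversation_fact"})
        for fact in summary.facts
    ))
    return summary


store = InMemoryStore()
asyncio.run(on_session_end("s-1", ["hello", "FACT: user prefers dark mode"], store))
```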
Context Assembly
async function assembleContext(userId, currentTopic):
    profile = await getProfile(userId)
    recentSessions = await getRecentSessions(userId, limit=3)

    // Default to no topic matches so buildPrompt always receives a list
    relevantSessions = []
    if currentTopic:
        relevantSessions = await searchSessions(
            userId,
            query=currentTopic,
            limit=2
        )

    ongoingTasks = await getOngoingTasks(userId)

    return buildPrompt({
        profile,
        recentSessions,
        relevantSessions,
        ongoingTasks
    })
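The source does not show `buildPrompt`, so the following is only one plausible sketch of it in Python. It assumes each session is a dict with `id` and `summary` keys, and it de-duplicates sessions that appear in both the recent and the relevant lists so they are not included twice. The section headings are arbitrary.

```python
def build_prompt(profile: str,
                 recent_sessions: list[dict],
                 relevant_sessions: list[dict],
                 ongoing_tasks: list[str]) -> str:
    """Assemble memory sections into one block of text for the system prompt."""
    seen: set[str] = set()
    summaries: list[str] = []
    for session in [*recent_sessions, *relevant_sessions]:
        if session["id"] in seen:
            continue  # same session matched by both recency and relevance
        seen.add(session["id"])
        summaries.append(session["summary"])

    sections: list[str] = []
    if profile:
        sections.append("User profile:\n" + profile)
    if summaries:
        sections.append("Past sessions:\n" + "\n".join(f"- {s}" for s in summaries))
    if ongoing_tasks:
        sections.append("Ongoing tasks:\n" + "\n".join(f"- {t}" for t in ongoing_tasks))
    return "\n\n".join(sections)


prompt = build_prompt(
    profile="Prefers concise answers.",
    recent_sessions=[{"id": "s-2", "summary": "Planned the Q3 migration."}],
    relevant_sessions=[
        {"id": "s-2", "summary": "Planned the Q3 migration."},
        {"id": "s-1", "summary": "Chose Postgres over MySQL."},
    ],
    ongoing_tasks=["Finish migration runbook"],
)
```

Empty sections are omitted entirely so a new user with no history does not spend tokens on blank headings.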