Memory as Tool Architecture - Agent Memory Architecture

Overview

Instead of automatically injecting memories into context, this architecture gives agents explicit tools to read and write memories. The agent decides when to store information and when to retrieve it. This provides transparency, control, and efficient context usage.

The Tool-Based Approach

Explicit Over Implicit

Traditional approach:

System automatically retrieves relevant memories

Injects them into every prompt

Agent doesn't know what memories exist

No control over storage

Tool-based approach:

Agent explicitly calls `remember()` to store

Agent explicitly calls `recall()` to retrieve

Full visibility into memory operations

Intentional memory management

Memory Tools

Store Memory

Tool: remember

Description: Store important information for future reference

Parameters:

- content: The information to remember

- type: fact | preference | event | task

- importance: low | medium | high

- tags: Array of categorization tags

Returns: memory_id for reference

Retrieve Memories

Tool: recall

Description: Search for relevant memories

Parameters:

- query: What to search for

- type: Optional type filter

- limit: Max results (default 5)

- time_range: Optional date filter

Returns: Array of relevant memories with scores

Update Memory

Tool: update_memory

Description: Modify an existing memory

Parameters:

- memory_id: Which memory to update

- content: New content (optional)

- importance: New importance (optional)

Returns: Updated memory

Forget

Tool: forget

Description: Remove a memory

Parameters:

- memory_id: Which memory to remove

- reason: Why it's being forgotten

Returns: Confirmation

Architecture Flow

User: "Remember that I prefer window seats on flights"

Agent thinks: "User wants me to store a preference"

Agent calls: remember(

content="User prefers window seats on flights",

type="preference",

importance="medium",

tags=["travel", "flights", "seating"]

)

Agent responds: "Got it! I'll remember you prefer window seats."

---

User: "Book me a flight to NYC"

Agent thinks: "I should check if I have any flight preferences"

Agent calls: recall(

query="flight preferences travel",

type="preference"

)

Returns: [

{content: "User prefers window seats", score: 0.92},

{content: "User likes morning flights", score: 0.78}

]

Agent: "I'll look for morning flights with window seats to NYC..."

Benefits

Transparency

Users see memory operations:

Agent explicitly says "Let me remember that"

Agent explicitly says "Let me check my notes"

Clear audit trail of memory actions

No hidden context injection

Efficiency

Only retrieve when needed:

No wasted context on irrelevant memories

Agent judges when recall is valuable

Larger effective context window

Faster responses when no recall needed

Control

Agent manages its own memory:

Decides what's worth remembering

Chooses appropriate importance levels

Can update or remove outdated info

Intentional knowledge management

Debuggability

Easy to understand behavior:

See exactly what memories were retrieved

Understand why agent knows something

Trace decisions to specific recalls

Fix issues by adjusting stored memories

Implementation

System Prompt

Instruct agent on memory tool usage:

You have access to memory tools:

Use `remember` to store important information the user shares

Use `recall` before tasks where past context might help

Use `forget` when information is outdated or user requests

Guidelines:

Remember explicit preferences and important facts

Don't remember transient information

Recall before giving personalized recommendations

Be transparent about what you remember

Memory Store

Backend for the tools:

Vector database for semantic search

Structured storage for metadata

User isolation for multi-tenant

Efficient updates and deletes

Tool Handler

Process memory tool calls:

function handleMemoryTool(tool, params, userId):

switch tool:

case "remember":

embedding = embed(params.content)

return memory_store.add({

user_id: userId,

content: params.content,

embedding: embedding,

type: params.type,

importance: params.importance,

tags: params.tags,

created_at: now()

})

case "recall":

return memory_store.search({

user_id: userId,

query_embedding: embed(params.query),

filters: {type: params.type},

limit: params.limit

})

case "forget":

return memory_store.delete(params.memory_id)

When to Use

Good Fit

Agents with diverse tasks (not all need memory)

Transparency is important

Context window is limited

Users want to understand agent behavior

Consider Alternatives

Always-on memory needed (auto-inject better)

Simple, single-purpose agents

Memory is always relevant to task

Overhead of tool calls is undesirable

Hybrid Approach

Combine explicit and implicit:

Auto-inject critical memories (user name, key preferences)

Tools for everything else

Best of both worlds

Agent can still request additional context