MemGPT: Towards LLMs as Operating Systems

Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, Joseph E. Gonzalez

arXiv · 2023

virtual-context · memory-management · operating-systems · long-context

TL;DR

Introduces a memory management system inspired by OS virtual memory, enabling LLMs to work with effectively unbounded context through hierarchical memory tiers and self-directed memory operations.

Key Contribution

MemGPT introduces a novel approach to handling LLM context limitations by treating the context window as "main memory" and using external storage as "disk." The LLM itself manages memory through function calls, deciding when to move information between memory tiers.

Architecture

Memory Hierarchy

  • Main Context: Limited LLM context window (the "RAM")
  • External Storage: Vector database and structured storage (the "disk")
  • Memory Management: LLM-directed paging between tiers
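The two tiers can be sketched as plain data structures (a minimal, illustrative sketch; the class and field names are mine, loosely following the paper's split of main context into system instructions, working context, and a FIFO message queue, with Python lists standing in for the vector database):

```python
from dataclasses import dataclass, field

@dataclass
class MainContext:
    """The 'RAM': everything here is visible to the LLM on every call."""
    system_instructions: str = ""
    working_context: list[str] = field(default_factory=list)  # persistent facts
    fifo_queue: list[str] = field(default_factory=list)       # rolling message history

@dataclass
class ExternalStorage:
    """The 'disk': unbounded, visible only through explicit retrieval calls."""
    archival: list[str] = field(default_factory=list)  # stands in for a vector DB
    recall: list[str] = field(default_factory=list)    # full conversation log
```

Anything in `MainContext` costs tokens on every LLM call; anything in `ExternalStorage` is free until explicitly paged in.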

Self-Directed Memory

Unlike traditional RAG, where retrieval is automatic, MemGPT lets the LLM:

  • Explicitly request memory retrieval
  • Decide what to store and evict
  • Manage its own working set
  • Handle memory pressure gracefully

Key Mechanisms

Memory Functions

The LLM has access to memory operations:

  • `core_memory_append`: Add to persistent context
  • `core_memory_replace`: Update persistent context
  • `archival_memory_insert`: Store to long-term
  • `archival_memory_search`: Retrieve from long-term
  • `conversation_search`: Search past conversations
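The function names above come from the paper; the toy implementations below are mine (dictionaries and substring matching stand in for the real prompt sections and embedding-based search):

```python
# Toy in-memory backing stores; a real system would keep core memory in
# the prompt and back archival memory with a vector database.
core_memory: dict[str, str] = {"persona": "", "human": ""}
archival_memory: list[str] = []
conversation_log: list[str] = []

def core_memory_append(section: str, text: str) -> None:
    """Add a line to a persistent in-context memory section."""
    core_memory[section] = (core_memory[section] + "\n" + text).strip()

def core_memory_replace(section: str, old: str, new: str) -> None:
    """Rewrite part of a persistent in-context memory section."""
    core_memory[section] = core_memory[section].replace(old, new)

def archival_memory_insert(text: str) -> None:
    """Store a fact in long-term (out-of-context) memory."""
    archival_memory.append(text)

def archival_memory_search(query: str) -> list[str]:
    # Substring match stands in for embedding similarity search.
    return [m for m in archival_memory if query.lower() in m.lower()]

def conversation_search(query: str) -> list[str]:
    """Search past messages that have scrolled out of context."""
    return [m for m in conversation_log if query.lower() in m.lower()]
```

In MemGPT these are exposed to the model as callable functions, so memory movement happens only when the model decides to invoke them.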

Paging System

When context fills up:

  • System detects memory pressure
  • Older/less relevant content moved to archival
  • Space freed for new information
  • LLM can recall archived content when needed
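The steps above can be sketched as follows (a simplified illustration: a whitespace token count and FIFO eviction stand in for the real tokenizer and relevance-aware eviction, and the paper's summarization of evicted messages is omitted):

```python
def token_count(messages: list[str]) -> int:
    # Crude whitespace tokenizer as a stand-in for the model's tokenizer.
    return sum(len(m.split()) for m in messages)

def handle_memory_pressure(queue: list[str], archival: list[str],
                           limit: int, target: float = 0.5) -> None:
    """Evict the oldest messages to archival until usage falls to target*limit."""
    while queue and token_count(queue) > target * limit:
        archival.append(queue.pop(0))  # oldest first, like a FIFO page-out

# Example: a 100-token budget with a warning threshold at 70% usage.
queue = [f"message {i} with some words" for i in range(20)]
archival: list[str] = []
if token_count(queue) > 0.7 * 100:
    handle_memory_pressure(queue, archival, limit=100)
```

After the flush, the evicted messages remain recoverable through search over archival storage, so nothing is permanently lost.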

Evaluation

Tasks Tested

  • Document QA: Answer questions requiring information across long documents
  • Conversational Agents: Multi-session conversations with memory
  • Nested Retrieval: Tasks requiring multiple retrieval steps

Results

  • Outperforms fixed-context baselines on long documents
  • Maintains conversation coherence across many sessions
  • Successfully manages memory without human intervention

Implications for Agent Memory

Design Principles

  • Let agents manage their own memory
  • Provide explicit memory operations as tools
  • Use hierarchical storage with different access patterns
  • Enable self-reflection about memory state
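Putting these principles together, memory operations can be exposed to the model as ordinary callable tools. The dispatch loop below is a hypothetical sketch in which a scripted list stands in for actual LLM function-call output; the names are illustrative, not MemGPT's real API:

```python
def insert_archival(store: list[str], text: str) -> str:
    store.append(text)
    return "ok"

def search_archival(store: list[str], query: str) -> str:
    hits = [m for m in store if query.lower() in m.lower()]
    return "; ".join(hits) or "no results"

archive: list[str] = []
# Tool registry: the agent framework routes model-emitted calls here.
tools = {
    "archival_memory_insert": lambda args: insert_archival(archive, args["text"]),
    "archival_memory_search": lambda args: search_archival(archive, args["query"]),
}

# Scripted stand-in for two function calls emitted by the model:
model_calls = [
    ("archival_memory_insert", {"text": "User's birthday is in May"}),
    ("archival_memory_search", {"query": "birthday"}),
]
results = [tools[name](args) for name, args in model_calls]
```

Each tool returns a string result that is fed back into the context, which is how the model "sees" its own memory state and can reflect on it.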

Practical Considerations

  • Memory operations consume context/tokens
  • Requires careful prompt engineering
  • Trade-off between autonomy and efficiency

Related Work

  • Builds on retrieval-augmented generation
  • Inspired by operating systems literature
  • Related to tool-use and function calling
  • Connects to cognitive architecture research

Citation

@article{packer2023memgpt,
  title={MemGPT: Towards LLMs as Operating Systems},
  author={Packer, Charles and Fang, Vivian and Patil, Shishir G and Lin, Kevin and Wooders, Sarah and Gonzalez, Joseph E},
  journal={arXiv preprint arXiv:2310.08560},
  year={2023}
}