# Research Papers

Academic research on memory systems for AI agents: key papers on architectures, retrieval methods, cognitive approaches, and empirical evaluations.
## Retrieval-Augmented Generation

- **Self-RAG** (Asai et al., 2023): Trains LLMs to adaptively retrieve passages and critique their own outputs using special reflection tokens, improving both accuracy and attribution (a minimal sketch of the idea follows this list).
- **Retrieval-Augmented Generation for Large Language Models: A Survey** (Gao et al., 2023): Comprehensive survey of RAG techniques, covering retrieval methods, generation approaches, and augmentation strategies for grounding LLMs in external knowledge.
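To make the reflection-token idea concrete, here is a minimal Python sketch. The token names, stub model, and stub retriever are all hypothetical simplifications: in Self-RAG the fine-tuned model itself emits the retrieval and critique tokens, whereas here plain functions stand in for it.

```python
# Minimal sketch of Self-RAG-style adaptive retrieval. Token names and
# stub functions are hypothetical; in the paper, a fine-tuned LLM emits
# the reflection tokens itself rather than calling external checks.

RETRIEVE = "[Retrieve]"   # model signals that it wants evidence

def model_generate(prompt: str) -> str:
    """Stand-in for the LLM; a real system would sample from the model."""
    if "evidence:" not in prompt:
        return RETRIEVE                      # ask for retrieval first
    return f"answer grounded in {prompt.split('evidence:')[-1].strip()}"

def retrieve(query: str, k: int = 3) -> list[str]:
    """Stand-in retriever returning k candidate passages."""
    return [f"passage-{i} about {query}" for i in range(k)]

def is_relevant(query: str, passage: str) -> bool:
    """Stand-in for the [IsRel] critique step (model-scored in Self-RAG)."""
    return query.split()[0] in passage

def self_rag(query: str) -> str:
    draft = model_generate(query)
    if RETRIEVE not in draft:                # model was confident on its own
        return draft
    passages = [p for p in retrieve(query) if is_relevant(query, p)]
    # Generate one candidate per relevant passage; Self-RAG would rank
    # them with further reflection tokens, here we just take the first.
    candidates = [model_generate(f"{query}\nevidence: {p}") for p in passages]
    return candidates[0] if candidates else draft

print(self_rag("memory systems for agents"))
```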
## Cognitive Architectures

## Memory Architectures
- **LongMem** (Wang et al., 2023): Decouples memory from model parameters, pairing a frozen LLM with a trainable retrieval side-network so that a cached memory bank provides effectively unlimited context.
- **MemGPT** (Packer et al., 2023): A memory-management system inspired by OS virtual memory, letting LLMs handle unbounded context through a hierarchical memory that the model pages in and out itself via self-directed memory operations (see the paging sketch after this list).
- **RETRO** (Borgeaud et al., 2022): Shows that retrieval from a massive corpus (2 trillion tokens) can match the performance of models with 25x more parameters, demonstrating retrieval as an efficient alternative to scaling parameters.
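The virtual-memory analogy in MemGPT can be illustrated with a toy class. This is a sketch under heavy simplification, with hypothetical names: the real system exposes memory operations to the LLM as function calls so the model pages data in and out itself, and recall is semantic rather than keyword-based.

```python
from collections import deque

class HierarchicalMemory:
    """Toy OS-style memory hierarchy for an agent: a small main context
    (analogous to RAM / the model's context window) backed by unbounded
    external storage (analogous to disk)."""

    def __init__(self, context_limit: int = 4):
        self.context_limit = context_limit
        self.main_context: deque[str] = deque()  # what the model "sees"
        self.external_storage: list[str] = []    # archival memory

    def write(self, item: str) -> None:
        self.main_context.append(item)
        # Page the oldest items out when the context window overflows.
        while len(self.main_context) > self.context_limit:
            self.external_storage.append(self.main_context.popleft())

    def page_in(self, keyword: str) -> list[str]:
        # Naive keyword match standing in for semantic search.
        return [m for m in self.external_storage if keyword in m]

mem = HierarchicalMemory()
for i in range(8):
    mem.write(f"turn {i}: notes on topic-{i % 2}")
print(list(mem.main_context))   # recent turns, still in the window
print(mem.page_in("topic-0"))   # older turns recalled from storage
```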
## Foundational Work
- **ReAct** (Yao et al., 2022): Introduces the ReAct paradigm, in which a language model interleaves reasoning traces with actions, yielding more grounded and interpretable agent behavior (a minimal loop is sketched below).
- **Toolformer** (Schick et al., 2023): Demonstrates that LLMs can learn to use external tools (calculator, search, etc.) in a self-supervised way by generating and filtering their own tool-call training data.
- **Dense Passage Retrieval** (Karpukhin et al., 2020): Introduces DPR, showing that learned dense embeddings significantly outperform sparse methods such as BM25 for open-domain QA retrieval (see the toy example below).
- **REALM** (Guu et al., 2020): Introduces end-to-end training of retrieval-augmented language models, jointly learning what to retrieve and how to use it by marginalizing over retrieved documents during pre-training (the objective is given below).
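A minimal ReAct loop, sketched with a stub model and a stub search tool (both hypothetical). In the paper a single prompted LLM produces the Thought and Action lines; here a canned policy stands in so the control flow is visible.

```python
# Minimal ReAct-style loop: the model alternates free-form "Thought"
# text with "Action" commands, and tool results are fed back in as
# "Observation" lines. Stubs stand in for the LLM and the tools.

def llm(transcript: str) -> str:
    """Stand-in policy; a real ReAct agent prompts an LLM here."""
    if "Observation" not in transcript:
        return "Thought: I should look this up.\nAction: search[agent memory]"
    return "Thought: I have enough context.\nAction: finish[use retrieval plus a memory hierarchy]"

def search(query: str) -> str:
    """Stand-in search tool."""
    return f"stub results for '{query}'"

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += "\n" + step
        action = step.rsplit("Action: ", 1)[1]
        if action.startswith("finish["):
            return action[len("finish["):-1]          # final answer
        if action.startswith("search["):
            obs = search(action[len("search["):-1])
            transcript += f"\nObservation: {obs}"     # grounds the next thought
    return "no answer within step budget"

print(react("How do agents manage long-term memory?"))
```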
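Dense retrieval itself fits in a few lines: embed the query and every passage as vectors and rank by inner product, as DPR does. DPR learns two BERT encoders with a contrastive objective; the hashed bag-of-words encoder below is an assumed stand-in so the example runs without any model weights.

```python
# Dense retrieval in miniature: precompute passage embeddings, embed the
# query, and rank by inner-product similarity (the scoring DPR uses).

import numpy as np

DIM = 64

def embed(text: str) -> np.ndarray:
    """Toy encoder: hashed bag-of-words (DPR uses a trained BERT)."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

passages = [
    "BM25 is a sparse lexical ranking function.",
    "Dense passage retrieval embeds questions and passages with BERT.",
    "Virtual memory pages data between RAM and disk.",
]
index = np.stack([embed(p) for p in passages])   # precomputed passage matrix

query = embed("how does dense retrieval with passage embeddings work")
scores = index @ query                            # inner-product similarity
print(passages[int(np.argmax(scores))])           # best-matching passage
```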
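The end-to-end objective in REALM is worth stating: the retrieved document $z$ is treated as a latent variable and marginalized out, which is what lets gradients reach the retriever. In the paper's notation, with input $x$, output $y$, and corpus $\mathcal{Z}$:

$$
p(y \mid x) = \sum_{z \in \mathcal{Z}} p(y \mid z, x)\, p(z \mid x),
\qquad
p(z \mid x) \propto \exp\!\big(\mathrm{Embed}(x)^\top \mathrm{Embed}(z)\big)
$$

Because $p(z \mid x)$ appears inside the likelihood, improving retrieval directly improves the training signal; in practice the sum over the corpus is approximated with the top-k documents under the current retriever.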