ReAct: Synergizing Reasoning and Acting in Language Models - Agent Memory Research

Key Contribution

ReAct introduces a simple but powerful paradigm: interleaving reasoning (thinking) with acting (using tools). This synergy lets a language model both reason about a task and take grounded actions, with each informing the other: reasoning decides which action to take, and the action's result feeds the next reasoning step.

The ReAct Paradigm

Traditional Approaches

Reasoning only (Chain-of-Thought): think through the problem without taking external actions

Acting only (WebGPT, etc.): take actions without explicit reasoning

ReAct Approach

Alternate between:

**Thought**: Reason about current situation, plan next step

**Action**: Execute action (search, lookup, etc.)

**Observation**: Receive result of action

Repeat until task complete
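
The Thought / Action / Observation cycle above can be sketched as a small control loop. This is a minimal illustration, not the paper's implementation: `react_loop`, `scripted_policy`, and the `max_steps` budget are hypothetical names, and the policy is a hard-coded stand-in for a language model call.

```python
from typing import Callable, Dict, List, Tuple

# A "policy" stands in for the language model: given the transcript so far,
# it returns the next thought plus an action name and its argument.
Policy = Callable[[List[str]], Tuple[str, str, str]]


def react_loop(question: str, policy: Policy,
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 8) -> Tuple[str, List[str]]:
    """Run Thought -> Action -> Observation until Finish or max_steps."""
    transcript = [f"Question: {question}"]
    for step in range(1, max_steps + 1):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought {step}: {thought}")
        transcript.append(f"Action {step}: {action}[{arg}]")
        if action == "Finish":
            return arg, transcript
        tool = tools.get(action)
        observation = tool(arg) if tool else f"Unknown action: {action}"
        transcript.append(f"Observation {step}: {observation}")
    return "", transcript  # step budget exhausted without an answer


def scripted_policy(transcript: List[str]) -> Tuple[str, str, str]:
    """Hypothetical stand-in for an LLM: search once, then finish."""
    observations = [line for line in transcript if line.startswith("Observation")]
    if not observations:
        return "I need to look this up.", "Search", "capital of France"
    return "The observation contains the answer.", "Finish", "Paris"


answer, trace = react_loop(
    "What is the capital of France?",
    scripted_policy,
    {"Search": lambda q: "Paris is the capital of France."},
)
```

In a real agent the policy would call a model with the transcript as its prompt; the step budget is one simple way to guarantee termination.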

Example Trace

Question: What is the elevation of the birthplace of the inventor of the telephone?

Thought 1: I need to find who invented the telephone.

Action 1: Search[inventor of telephone]

Observation 1: Alexander Graham Bell invented the telephone...

Thought 2: Alexander Graham Bell invented the telephone. Now I need his birthplace.

Action 2: Search[Alexander Graham Bell birthplace]

Observation 2: Bell was born in Edinburgh, Scotland...

Thought 3: Bell was born in Edinburgh. Now I need Edinburgh's elevation.

Action 3: Search[Edinburgh elevation]

Observation 3: Edinburgh has an elevation of 47 meters...

Thought 4: The elevation of Edinburgh is 47 meters.

Action 4: Finish[47 meters]

Why It Works

Reasoning Helps Acting

Plans what action to take next

Interprets action results

Handles unexpected observations

Tracks progress toward goal

Acting Helps Reasoning

Grounds reasoning in real information

Reduces hallucination by checking claims against retrieved information

Provides new information to reason about

Enables complex multi-step tasks

Evaluation

Tasks

HotpotQA: Multi-hop question answering

FEVER: Fact verification

ALFWorld: Interactive text game

WebShop: Online shopping via web navigation

Results

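Outperforms acting-only baselines across tasks; against Chain-of-Thought it is stronger on fact verification and slightly weaker on HotpotQA, and combining the two works best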

More interpretable due to visible reasoning

Better at recovering from errors

Fewer hallucinations than pure reasoning

Relevance to Agent Memory

Memory in ReAct

The observation history serves as working memory:

Past observations inform current reasoning

Accumulated knowledge guides actions

Errors can be recognized and corrected
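
One way to picture the observation history as working memory is a scratchpad that records each step and renders the most recent ones back into the model's context. The sketch below is an assumption-laden illustration: `WorkingMemory`, `Step`, and the `max_steps` window are hypothetical, and the windowing is a simplification added here rather than something ReAct prescribes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    thought: str
    action: str
    observation: str


@dataclass
class WorkingMemory:
    """Keeps every step, but renders only the most recent `max_steps`."""
    max_steps: int = 5  # hypothetical context window, counted in steps
    steps: List[Step] = field(default_factory=list)

    def add(self, thought: str, action: str, observation: str) -> None:
        self.steps.append(Step(thought, action, observation))

    def render(self) -> str:
        # Number steps by their position in the full history so that
        # references such as "Observation 2" stay stable after truncation.
        start = max(0, len(self.steps) - self.max_steps)
        lines: List[str] = []
        for number, step in enumerate(self.steps[start:], start=start + 1):
            lines.append(f"Thought {number}: {step.thought}")
            lines.append(f"Action {number}: {step.action}")
            lines.append(f"Observation {number}: {step.observation}")
        return "\n".join(lines)


pad = WorkingMemory(max_steps=2)
pad.add("Find the inventor.", "Search[inventor of telephone]",
        "Alexander Graham Bell invented the telephone...")
pad.add("Find his birthplace.", "Search[Alexander Graham Bell birthplace]",
        "Bell was born in Edinburgh, Scotland...")
pad.add("Find the elevation.", "Search[Edinburgh elevation]",
        "Edinburgh has an elevation of 47 meters...")
context = pad.render()
```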

Extensions for Long-term Memory

ReAct can be extended with:

Persistent storage of observations

Retrieval of relevant past experiences

Learning from successful traces
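
The first two extensions, persistent storage and retrieval, could look roughly like the sketch below. It is an assumption rather than anything from the paper: `ObservationStore` is a hypothetical class, the JSON Lines file is one arbitrary storage choice, and word overlap stands in for whatever similarity measure (for example, embeddings) a real system would use.

```python
import json
import tempfile
from pathlib import Path
from typing import List


class ObservationStore:
    """Append-only store of (query, observation) pairs in a JSON Lines file."""

    def __init__(self, path: Path):
        self.path = path

    def save(self, query: str, observation: str) -> None:
        with self.path.open("a", encoding="utf-8") as handle:
            record = {"query": query, "observation": observation}
            handle.write(json.dumps(record) + "\n")

    def load(self) -> List[dict]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]

    def retrieve(self, query: str, k: int = 3) -> List[dict]:
        """Return up to k stored records ranked by shared query words."""
        wanted = set(query.lower().split())
        scored = []
        for record in self.load():
            overlap = len(wanted & set(record["query"].lower().split()))
            if overlap:
                scored.append((overlap, record))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [record for _, record in scored[:k]]


with tempfile.TemporaryDirectory() as tmp:
    store = ObservationStore(Path(tmp) / "observations.jsonl")
    store.save("Edinburgh elevation", "Edinburgh has an elevation of 47 meters...")
    store.save("inventor of telephone", "Alexander Graham Bell invented the telephone...")
    hits = store.retrieve("elevation of Edinburgh", k=1)
```

Retrieved records could be prepended to the prompt as extra observations before the agent decides whether a fresh search is needed.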

Implementation Patterns

Prompt Structure

Solve a question answering task with interleaving Thought, Action, Observation steps.

Thought: [reasoning about current state]

Action: [one of Search[query], Lookup[term], Finish[answer]]

Observation: [result of action]

(repeat)
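
The prompt structure above can be assembled programmatically. The following is a minimal sketch under stated assumptions: `build_react_prompt` and its parameters are hypothetical, and ending the prompt with an open `Thought N:` line is one common way to cue the model's next step, not a requirement of the format.

```python
from typing import List, Tuple

INSTRUCTION = (
    "Solve a question answering task with interleaving "
    "Thought, Action, Observation steps."
)


def build_react_prompt(question: str,
                       steps: List[Tuple[str, str, str]],
                       exemplars: str = "") -> str:
    """Render instruction, optional few-shot exemplars, and the steps so far.

    Each step is a (thought, action, observation) tuple.
    """
    parts = [INSTRUCTION]
    if exemplars:
        parts.append(exemplars.strip())
    parts.append(f"Question: {question}")
    for number, (thought, action, observation) in enumerate(steps, start=1):
        parts.append(f"Thought {number}: {thought}")
        parts.append(f"Action {number}: {action}")
        parts.append(f"Observation {number}: {observation}")
    # Leave the next thought open for the model to complete.
    parts.append(f"Thought {len(steps) + 1}:")
    return "\n".join(parts)


prompt = build_react_prompt(
    "What is the elevation of the birthplace of the inventor of the telephone?",
    [("I need to find who invented the telephone.",
      "Search[inventor of telephone]",
      "Alexander Graham Bell invented the telephone...")],
)
```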

Action Space

Define available actions:

Search[query]: Web/knowledge search

Lookup[term]: Look up in retrieved document

Finish[answer]: Complete with answer
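
Because the action space is a fixed set of `Name[argument]` strings, the model's output has to be parsed before a tool can be called. A minimal sketch, assuming the three actions listed above and a hypothetical helper named `parse_action`:

```python
import re
from typing import Optional, Tuple

# Only the three actions defined above are accepted.
ACTION_PATTERN = re.compile(r"^(Search|Lookup|Finish)\[(.*)\]$")


def parse_action(line: str) -> Optional[Tuple[str, str]]:
    """Parse 'Action 3: Search[query]' or 'Search[query]' into (name, argument).

    Returns None when the line is not a well-formed, known action.
    """
    text = line.strip()
    # Drop an optional "Action N:" prefix.
    text = re.sub(r"^Action(?:\s+\d+)?:\s*", "", text)
    match = ACTION_PATTERN.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
```

Returning `None` for malformed or unknown actions lets the caller feed an error message back as the next observation instead of crashing.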

Limitations

Requires careful prompt engineering

Action space must be predefined

Can get stuck in loops, repeating the same thought or action

Reasoning adds latency
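
The looping failure mode can be partly mitigated with a simple guard in the agent loop. The sketch below is one hypothetical heuristic, not something the paper proposes: `is_stuck` flags a trace once any identical action string has been issued more than `max_repeats` times.

```python
from collections import Counter
from typing import Iterable


def is_stuck(actions: Iterable[str], max_repeats: int = 2) -> bool:
    """Return True if any action string occurs more than max_repeats times."""
    counts = Counter(action.strip().lower() for action in actions)
    return any(count > max_repeats for count in counts.values())
```

When the guard fires, an agent might stop early, or inject an observation telling the model that the repeated action is not yielding new information.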

Impact

ReAct has become foundational for:

Agent architectures (LangChain, AutoGPT)

Tool-using language models

Interactive AI systems

Multi-step reasoning systems

Citation

@inproceedings{yao2023react,
  title={ReAct: Synergizing Reasoning and Acting in Language Models},
  author={Yao, Shunyu and Zhao, Jeffrey and Yu, Dian and Du, Nan and Shafran, Izhak and Narasimhan, Karthik and Cao, Yuan},
  booktitle={International Conference on Learning Representations},
  year={2023}
}

TL;DR