How AI Agents Work: Core Design Patterns

A vendor-neutral guide to how AI agents work: the agent loop, tool use, ReAct, planning, reflection, and memory, with primary sources.

By Aditya Kumar Jha · 2026-06-01 · 13 min read · AI Explained

An AI agent never runs its own tools. It is a language model placed inside a loop: the model emits a structured request, your code runs the search or API call, and the result is fed back as text for its next turn. That is the whole idea. Strip the marketing off almost any agent and you find the same four-beat loop (perceive, reason, act, observe) wrapped around a handful of patterns that have barely changed since the foundational papers landed between 2020 and 2023. This is the durable mental model: the patterns outlast the model releases.

Quick takeaways: an agent is an LLM in a loop with tools. The model never runs a tool itself; it emits a structured request, your system runs it, and the result is fed back into the context. The core patterns are tool use, ReAct (reason plus act), planning, reflection, and memory. Multi-agent setups are just these patterns composed. And the agent's weakest link is almost always memory.

Workflow versus agent: the distinction that matters

Anthropic's widely referenced 2024 guide, Building Effective Agents, draws the cleanest line here. A workflow is a system where the LLM and its tools follow predefined code paths that you wrote. An agent is a system where the LLM dynamically directs its own process, choosing which tools to call and in what order to reach a goal. Most production systems people call agents are not agents at all. They are workflows: the route is fixed, and that is usually the right call, because workflows are predictable and cheap to debug. Reach for a true agent only when the path cannot be known in advance.
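
The distinction is easier to see side by side. The sketch below is a minimal illustration, not a production design: `summarize`, `translate`, and `scripted_chooser` are hypothetical stand-ins (the chooser plays the role of a model call), and the only point is where the routing decision lives.

```python
from typing import Callable

# Hypothetical steps. Real ones would be model or tool calls.
def summarize(text: str) -> str:
    return text.split(".")[0] + "."

def translate(text: str) -> str:
    return f"[fr] {text}"  # fake marker, not a real translation

STEPS: dict[str, Callable[[str], str]] = {"summarize": summarize, "translate": translate}

def workflow(text: str) -> str:
    """Workflow: the route is fixed in code. Always summarize, then translate."""
    return translate(summarize(text))

def agent(text: str, choose_next: Callable[[str, list[str]], str], max_steps: int = 4) -> str:
    """Agent: a chooser (standing in for the model) picks each next step at run time."""
    history: list[str] = []
    for _ in range(max_steps):
        step = choose_next(text, history)
        if step == "done":
            break
        text = STEPS[step](text)
        history.append(step)
    return text

def scripted_chooser(text: str, history: list[str]) -> str:
    """Stand-in for a model decision: summarize only long inputs, then translate once."""
    if len(text.split()) > 8 and "summarize" not in history:
        return "summarize"
    if "translate" not in history:
        return "translate"
    return "done"
```

With the same input the workflow always takes both steps, while the agent skips the summary when the text is already short. That flexibility is exactly what makes the agent harder to predict and debug.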

The agent loop

Every agent runs the same cycle. The model perceives its current context, reasons about what to do, acts by requesting a tool, and then observes the result, which is appended back into its context for the next pass. The loop repeats until the task is done or a stopping condition is hit. The single most important thing to understand is what the model does and does not do: it does not execute anything. It outputs a structured request, for example a function name and arguments, and the surrounding harness actually runs the search, the API call, or the database query and hands the result back as text the model reads on its next turn.

This is why an agent can be confidently, specifically wrong: it reasons over whatever the last observation put in front of it. Good agents are mostly good plumbing: clean tools, clear observations, and a context that does not overflow.
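
The loop described above can be sketched in a few lines. This is an illustration under stated assumptions: `scripted_model` is a hypothetical stand-in for a real LLM call, `get_weather` is a made-up tool, and the JSON shapes (`tool`, `args`, `final`) are invented for the example rather than taken from any vendor's API.

```python
import json
from typing import Callable

# Hypothetical tool: a stand-in for a real search or API call.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}

def scripted_model(context: list[str]) -> str:
    """Stand-in for an LLM call. It only ever returns text: a tool request or a final answer."""
    observations = [line for line in context if line.startswith("OBSERVATION: ")]
    if not observations:
        return json.dumps({"tool": "get_weather", "args": {"city": "Paris"}})
    return json.dumps({"final": observations[-1].removeprefix("OBSERVATION: ")})

def run_agent(task: str, model: Callable[[list[str]], str], max_steps: int = 5) -> str:
    context = [f"TASK: {task}"]                      # perceive: what the model can see
    for _ in range(max_steps):                       # stopping condition: step limit
        reply = json.loads(model(context))           # reason: the model emits a request
        if "final" in reply:
            return reply["final"]
        tool = TOOLS.get(reply["tool"])
        if tool is None:
            context.append(f"OBSERVATION: error, unknown tool {reply['tool']}")
            continue
        result = tool(**reply["args"])               # act: the harness runs the tool
        context.append(f"OBSERVATION: {result}")     # observe: result fed back as text
    return "stopped: step limit reached"
```

Note that the model function never touches `TOOLS`. Everything that executes lives in `run_agent`, which is the harness.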

Pattern 1: Tool use and function calling

Tool use is the foundation everything else sits on. You describe a set of tools to the model, each with a name, a description, and a schema for its inputs. When the model decides a tool is needed, it returns a structured call matching that schema instead of a plain answer. Your code runs the tool and returns the result. This is how an LLM, which only produces text, reaches out to the live world: search, code execution, a calendar, a payments API, a private database. Without tools, a model is a closed book. With them, it becomes an operator.
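
As a rough sketch of the harness side, the code below declares one tool with a name, description, and input schema, then validates and runs a call the model emitted as JSON text. The `add` tool and the field names (`input_schema`, `arguments`) are assumptions chosen for the example; real providers each define their own exact format.

```python
import json

def add(a: float, b: float) -> float:
    return a + b

# What the model is shown: name, description, and a schema for the inputs.
TOOL_SPECS = [
    {
        "name": "add",
        "description": "Add two numbers.",
        "input_schema": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    }
]

# What only your code sees: the implementations.
IMPLEMENTATIONS = {"add": add}

def execute_tool_call(raw_call: str) -> str:
    """Validate and run a model-emitted call. Always returns text the model can read."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError:
        return "error: call was not valid JSON"
    if not isinstance(call, dict):
        return "error: call must be a JSON object"
    spec = next((s for s in TOOL_SPECS if s["name"] == call.get("name")), None)
    if spec is None:
        return f"error: unknown tool {call.get('name')!r}"
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return "error: arguments must be an object"
    missing = [key for key in spec["input_schema"]["required"] if key not in args]
    if missing:
        return f"error: missing arguments {missing}"
    return str(IMPLEMENTATIONS[spec["name"]](**args))
```

Returning errors as plain text rather than raising is a deliberate choice here: the error becomes an observation, so the model gets a chance to correct its own call.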

Pattern 2: ReAct, reason plus act

ReAct, introduced by Yao and colleagues in 2022, is the pattern that made agents work in practice. Instead of reasoning in one block and then acting, the model interleaves the two: it writes a short thought, takes an action, reads the observation, then writes the next thought informed by what it just learned. The reasoning steers the actions, and the actions feed real information back into the reasoning. The payoff is grounding. Because the model checks the world between steps rather than reasoning in a vacuum, the paper reports fewer hallucinated facts than chain-of-thought prompting alone on its question-answering and fact-verification benchmarks. Almost every modern agent framework is a descendant of this thought, action, observation loop.
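
A toy version of the thought, action, observation loop might look like the following. It is a sketch, not the paper's implementation: `scripted_model` stands in for the LLM, `lookup` is a hypothetical tool over a two-entry table, and the `Action: name[argument]` text format is one common convention assumed here for parsing.

```python
import re
from typing import Callable

# Hypothetical lookup tool backed by a tiny in-memory table.
FACTS = {"author of Dune": "Frank Herbert", "birth year of Frank Herbert": "1920"}

def lookup(query: str) -> str:
    return FACTS.get(query, "no result")

ACTION_RE = re.compile(r"Action: (\w+)\[(.*?)\]")

def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM: each thought depends on the observations seen so far."""
    if "Observation: Frank Herbert" not in transcript:
        return "Thought: I need the author first.\nAction: lookup[author of Dune]"
    if "Observation: 1920" not in transcript:
        return ("Thought: The author is Frank Herbert; now his birth year.\n"
                "Action: lookup[birth year of Frank Herbert]")
    return "Thought: I have both facts.\nAction: finish[Frank Herbert, 1920]"

def react(question: str, model: Callable[[str], str], max_steps: int = 6) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = model(transcript)                 # thought + action, as text
        transcript += "\n" + step
        match = ACTION_RE.search(step)
        if match is None:
            transcript += "\nObservation: could not parse an action"
            continue
        name, arg = match.groups()
        if name == "finish":
            return arg
        observation = lookup(arg) if name == "lookup" else f"unknown action {name}"
        transcript += f"\nObservation: {observation}"   # fed into the next thought
    return "no answer within step limit"
```

The second lookup is only possible because the first observation landed in the transcript. That dependency between steps is the interleaving the pattern is named for.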

Pattern 3: Planning and task decomposition

Hard tasks fail when an agent tries to do them in one leap. Planning is the pattern of breaking a goal into ordered sub-steps before executing, then working through them. There is no single founding paper here; it is a family of techniques, from decomposing a prompt into modular sub-tasks to plan-and-solve approaches that draft a plan first and carry it out second. The practical value is the same across all of them: a plan turns an impossible one-shot request into a sequence of steps each small enough to get right, and it gives the agent something to check progress against.
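
One simple shape for this is plan first, execute second. The sketch below assumes a hypothetical `scripted_planner` in place of a model call and a hand-written `execute_step` in place of real tools. It illustrates the general idea rather than any specific published method.

```python
from typing import Callable

def scripted_planner(goal: str) -> list[str]:
    """Stand-in for an LLM planning call: returns ordered sub-steps for the goal."""
    return ["collect numbers", "sum numbers", "format report"]

def execute_step(step: str, state: dict) -> dict:
    """Hypothetical executor: each branch stands in for a tool or model call."""
    if step == "collect numbers":
        state["numbers"] = [3, 4, 5]
    elif step == "sum numbers":
        state["total"] = sum(state["numbers"])
    elif step == "format report":
        state["report"] = f"Total: {state['total']}"
    else:
        raise ValueError(f"no executor for step: {step}")
    return state

def plan_and_execute(goal: str, planner: Callable[[str], list[str]]) -> dict:
    plan = planner(goal)                  # draft the whole plan before acting
    state: dict = {}
    completed: list[str] = []
    for step in plan:                     # then work through it in order
        state = execute_step(step, state)
        completed.append(step)            # progress is checkable against the plan
    return {"plan": plan, "completed": completed, "state": state}
```

Keeping `plan` and `completed` side by side is what gives the agent, and whoever is debugging it, something concrete to check progress against.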

Pattern 4: Reflection and self-critique

An agent that can review its own work is far more reliable than one that cannot. Reflection is the pattern of having the model critique a result, name what went wrong, and try again with that critique in hand. The Reflexion paper from Shinn and colleagues in 2023 formalized a strong version: the agent reflects in plain language on feedback from a failed attempt and stores those reflections in a short episodic memory, so the next attempt is informed by the last. It improves without any change to the model's weights, purely by feeding its own lessons back into context. In production this often appears as an evaluator-optimizer pair: one step generates, another judges and sends it back for a fix.
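
A loose sketch in the spirit of that loop: generate, evaluate, reflect in plain language, store the reflection, and retry with it in context. All three functions prefixed `scripted_` or named `evaluate` are hypothetical stand-ins. In a real system the generator and reflector would be model calls, and the evaluator might be a test suite or a second model.

```python
def scripted_generator(task: str, reflections: list[str]) -> str:
    """Stand-in for the model: its draft changes once a lesson is in context."""
    if any("too long" in lesson for lesson in reflections):
        return "Agents loop over tools."
    return "An agent is a language model that runs inside a loop and calls tools."

def evaluate(draft: str, max_words: int = 5) -> tuple[bool, str]:
    """Stand-in evaluator: passes drafts at or under the word limit."""
    count = len(draft.split())
    if count <= max_words:
        return True, "ok"
    return False, f"draft has {count} words, limit is {max_words}"

def scripted_reflector(feedback: str) -> str:
    """Stand-in for the model turning raw feedback into a plain-language lesson."""
    return f"The previous attempt was too long ({feedback}); keep only the core claim."

def reflect_and_retry(task: str, max_attempts: int = 3) -> tuple[str, int, list[str]]:
    reflections: list[str] = []               # short episodic memory of lessons
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = scripted_generator(task, reflections)
        passed, feedback = evaluate(draft)
        if passed:
            return draft, attempt, reflections
        reflections.append(scripted_reflector(feedback))
    return draft, max_attempts, reflections
```

Nothing about the generator changes between attempts except what is in `reflections`, which mirrors the point above: the improvement comes from context, not from the weights.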

Pattern 5: Memory, the part that breaks first

Memory is where most agents quietly fall apart, and it splits cleanly in two. Short-term memory is the context window: everything the agent can see in the current run, wiped when the run ends. Long-term memory is a separate, persistent store the agent can search across runs. The standard way to build the second is retrieval, the pattern introduced as retrieval-augmented generation by Lewis and colleagues in 2020: keep knowledge in an external index, and at each step pull the few relevant pieces into the context window rather than trying to hold everything at once.

This is the part worth dwelling on, because it is the difference between a demo and something useful. An agent with no long-term memory restarts cold every session and re-learns nothing. The fix is not a bigger context window, which still resets; it is an external memory layer the agent reads from on demand. That is the role MemX plays for a personal AI: a durable store of your documents, notes, voice memos, and decisions that any model can retrieve from in plain language, so the memory survives even when you switch the model underneath. The model is the reasoning engine. The memory is the part you own and keep.
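
To make the external memory layer concrete, here is a deliberately small sketch: notes are persisted to a JSON file so they survive a restart, and a search pulls only the closest matches into context. It is a generic illustration and not the API of MemX or any other product. The `MemoryStore` class is hypothetical, and word-count cosine similarity stands in for the learned embeddings a real retrieval system would typically use.

```python
import json
import math
import re
from collections import Counter
from pathlib import Path

def _vector(text: str) -> Counter:
    """Bag-of-words vector: a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Long-term memory: lives on disk, outside any single run's context window."""

    def __init__(self, path: Path):
        self.path = path
        self.notes: list[str] = json.loads(path.read_text()) if path.exists() else []

    def add(self, note: str) -> None:
        self.notes.append(note)
        self.path.write_text(json.dumps(self.notes))   # persist immediately

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return up to k notes most similar to the query, skipping zero-overlap notes."""
        query_vec = _vector(query)
        scored = [(_cosine(query_vec, _vector(note)), note) for note in self.notes]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for score, note in scored[:k] if score > 0]
```

A new `MemoryStore` pointed at the same file sees everything an earlier run wrote, and only the few retrieved notes need to enter the context window for the current step.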

Multi-agent orchestration is just these patterns composed

Multi-agent systems sound like a separate category. They are not: they are the same patterns arranged so that several model calls cooperate. Anthropic's guide catalogs the common shapes: prompt chaining (each step feeds the next), routing (classify the request, then send it to the right handler), parallelization (split work across calls and merge), orchestrator-workers (a lead model delegates subtasks to workers), and evaluator-optimizer (one generates, another critiques in a loop). Frameworks such as LangGraph, CrewAI, and AutoGen give you scaffolding for these shapes, but the shapes themselves are what to learn, because they outlive any framework.
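
As one example of these shapes, orchestrator-workers reduces to a split, a fan-out, and a merge. The sketch below uses hypothetical `scripted_orchestrator` and `scripted_worker` functions in place of model calls, and a thread pool to run the workers in parallel. It does not follow any particular framework's API.

```python
from concurrent.futures import ThreadPoolExecutor

def scripted_orchestrator(task: str) -> list[str]:
    """Stand-in for a lead model call that splits a task into subtasks."""
    return [part.strip() for part in task.split(" and ")]

def scripted_worker(subtask: str) -> str:
    """Stand-in for a worker model call handling one subtask."""
    return f"done: {subtask}"

def orchestrate(task: str) -> str:
    subtasks = scripted_orchestrator(task)              # lead model delegates
    with ThreadPoolExecutor() as pool:
        # map() returns results in input order, even if workers finish out of order
        results = list(pool.map(scripted_worker, subtasks))
    return "; ".join(results)                           # merge the workers' outputs
```

Each worker here could itself be a full agent loop with its own tools. The composition does not change the underlying patterns.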

How agents connect to tools: MCP

One last piece ties the room together. The Model Context Protocol, introduced by Anthropic in late 2024, is an open standard for connecting agents to external tools and data in a uniform way, so you do not hand-write a custom integration for every source. It has become broadly adopted, and Anthropic has since donated it to the Agentic AI Foundation, a directed fund under the Linux Foundation, with the protocol keeping its own technical governance. The takeaway for a builder is simple: tool use is becoming standardized, which means the agent patterns above are the durable skill, not the wiring beneath them.
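
For a feel of what standardized means on the wire: per the published specification, MCP messages are JSON-RPC 2.0, and tools are discovered and invoked through `tools/list` and `tools/call` requests. The sketch below only builds those two request shapes. The `get_weather` tool and its `city` argument are hypothetical, and the transport, initialization handshake, and responses are omitted, so check the current specification before relying on exact field names.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched

def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Build a JSON-RPC 2.0 request as text. Sending it is left to the transport."""
    message: dict = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name with arguments (hypothetical tool and argument).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Paris"}},
)
```

The structure is the same tool-use pattern as before, a name plus arguments, only with an agreed envelope around it so any client can talk to any server.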

The patterns at a glance

Pattern	What it does	Anchor reference
Tool use	Lets the model call the live world via structured requests	Function calling: the model emits a call, your code runs it
ReAct	Interleaves reasoning and actions to stay grounded	Yao et al., 2022 (arXiv 2210.03629)
Planning	Breaks a goal into ordered, checkable sub-steps	A family of methods, e.g. decomposed prompting
Reflection	Critiques its own output and retries with the lesson	Reflexion, Shinn et al., 2023 (arXiv 2303.11366)
Memory	Persists knowledge across runs via retrieval	RAG, Lewis et al., 2020 (arXiv 2005.11401)

Key takeaway: an AI agent is an LLM in a loop that can use tools, check its work, and remember. The five patterns (tool use, ReAct, planning, reflection, and memory) are the stable core. Learn them once and every new framework and model release becomes a variation on a theme you already understand.

What is an AI agent in simple terms?

An AI agent is a language model placed inside a loop where it can use tools, observe the results, and decide its next step toward a goal. Unlike a plain chatbot that just answers, an agent can take actions (search, call an API, run code) and react to what it finds, repeating until the task is done.

What is the difference between a workflow and an agent?

In a workflow, the LLM and tools follow predefined code paths you wrote, so the route is fixed and predictable. In an agent, the model dynamically decides which tools to call and in what order. Most reliable production systems are workflows; use a true agent only when the path cannot be planned in advance.

What is the ReAct pattern?

ReAct, from a 2022 paper by Yao and colleagues, interleaves reasoning and acting: the model writes a short thought, takes an action, reads the result, then reasons again with that new information. Because it checks the world between steps, it stays grounded and hallucinates less than reasoning alone.

Do AI agents actually run the tools themselves?

No. The model never executes anything. It outputs a structured request, such as a tool name and arguments, and the surrounding system runs the tool and feeds the result back into the model's context. This separation is why an agent's reliability depends heavily on the quality of its tools and observations.

Why do AI agents struggle with memory?

An agent's short-term memory is its context window, which resets every run, and a bigger window still resets. Persisting knowledge across sessions requires a separate long-term store the agent retrieves from, the retrieval-augmented pattern. Without that external memory layer, an agent restarts cold and re-learns nothing each time.
