Agents
Observe. Think. Act. Repeat until done.
An agent is a language model that can take actions. Instead of generating a single response and stopping, an agent iterates: observe, think, act, observe again. Actions include calling tools, reading files, writing code, searching the web, or invoking other models.
Analogy
Think of a home cook following a recipe. A non-agent just reads the whole recipe aloud in one breath and stops. A cook actually cooks: they glance at the pan, taste the sauce, decide it needs salt, open the spice drawer, add a pinch, taste again, then move on to the next step. Each loop of glance-taste-act-taste changes what the next step will be. Give them a rusty paring knife (bad tool) or no timer (bad observation) and the whole meal starts to drift.
The agent loop
while not done:
    observation = get_context()
    thought = model.generate(system + history + observation)
    action = parse_action(thought)
    result = execute(action)
    history.append((thought, action, result))
At each step, the model decides what to do next based on everything in its context: the original task, all prior thoughts, and all prior action results. The loop continues until the model signals that it is done or a maximum step limit is reached.
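The loop above can be made concrete with a minimal runnable sketch. Everything here is a placeholder: `fake_model` stands in for a real LLM call, `run_tool` for a real tool runtime, and the `respond(...)` convention for whatever stop condition a real agent uses.

```python
def fake_model(prompt):
    # A real system would call an LLM here; this stub finishes in one step.
    return "Thought: nothing left to do.\nAction: respond(done)"

def parse_action(thought):
    # Take the text after "Action:" as the action string.
    for line in thought.splitlines():
        if line.startswith("Action:"):
            return line[len("Action:"):].strip()
    return None

def run_tool(action):
    # Placeholder tool runtime.
    return f"executed {action}"

def run_agent(task, max_steps=10):
    history = []
    for _ in range(max_steps):                # hard step limit
        observation = "\n".join(f"{t}\n{r}" for t, _, r in history)
        thought = fake_model(task + "\n" + observation)
        action = parse_action(thought)
        if action is None or action.startswith("respond("):
            return action                     # stop condition: model responds
        result = run_tool(action)
        history.append((thought, action, result))
    return None

print(run_agent("say done"))  # → respond(done)
```

Swapping `fake_model` for a real model call and `run_tool` for a tool registry turns this skeleton into a working agent.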
Tools
Tools are functions the model can call. They are declared in the system prompt as a schema (name, description, parameters). The model generates a structured call; the agent runtime executes it and returns the result as an observation.
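A tool declaration and its dispatcher might look like the following sketch. The schema shape (name, description, JSON-Schema parameters) mirrors what most tool-calling APIs expect; `get_weather` and the registry are hypothetical names for illustration.

```python
# Hypothetical tool schema, the kind of declaration placed in the system prompt.
get_weather_tool = {
    "name": "get_weather",
    "description": "Return the current temperature in Celsius for a city. "
                   "Use this whenever the user asks about weather.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
        },
        "required": ["city"],
    },
}

def execute_tool(call, registry):
    """Dispatch a structured call like {'name': ..., 'arguments': {...}}."""
    fn = registry[call["name"]]
    return fn(**call["arguments"])

# Stub implementation standing in for a real weather API.
registry = {"get_weather": lambda city: f"22C in {city}"}
print(execute_tool({"name": "get_weather", "arguments": {"city": "Paris"}}, registry))
```

The runtime feeds the returned string back into the context as the next observation.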
Common tool categories:
| Category | Examples |
|---|---|
| Search | Web search, vector database, file system |
| Code | Python interpreter, shell, browser console |
| APIs | Weather, calendar, database queries |
| Memory | Read/write persistent storage |
| Subagents | Spawn another agent for a subtask |
The model's ability to use tools depends almost entirely on how well the tools are described: a tool with a vague or misleading description is a tool that will be called incorrectly.
ReAct pattern
ReAct (Reasoning + Acting) structures each agent step as an explicit thought followed by an action:
Thought: I need to find the population of France.
Action: search("France population 2024")
Observation: France has approximately 68 million people.
Thought: Now I can answer the original question.
Action: respond("France's population is approximately 68 million.")
The explicit thought step improves task performance: the model reasons before acting, rather than immediately executing an action that may be wrong. The thought tokens also make agent behavior interpretable.
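The runtime's side of ReAct is mostly parsing: extract the thought, the tool name, and the argument from the model's text. A minimal sketch, assuming the exact Thought/Action line format shown above:

```python
import re

# Matches one ReAct step: a Thought line followed by an Action line
# of the form tool("arg").
STEP_RE = re.compile(r"Thought: (?P<thought>.+)\nAction: (?P<tool>\w+)\((?P<arg>.*)\)")

def parse_react(step):
    m = STEP_RE.search(step)
    if m is None:
        return None
    return m.group("thought"), m.group("tool"), m.group("arg").strip('"')

step = ('Thought: I need to find the population of France.\n'
        'Action: search("France population 2024")')
print(parse_react(step))
```

Real runtimes are more defensive (the model can emit malformed steps), but the structure is the same: parse, execute the named tool, append an Observation line, and call the model again.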
Memory systems
An agent without memory starts fresh on every call. Real agent systems layer multiple memory types:
In-context — everything in the current context window. Immediate, but bounded by window size.
External (retrieval) — documents, prior conversation summaries, or notes stored in a vector database. Retrieved at the start of each step.
Procedural — system prompts and tool definitions. Static across a session.
Episodic — summaries of past sessions. Updated at session end, retrieved at session start.
Memory architecture determines what an agent "remembers" across tasks and sessions.
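The four layers above come together when the prompt is assembled at each step. A toy sketch, where plain dicts and substring matching stand in for real databases and vector similarity:

```python
# Each store below is a stand-in for a real persistence layer.
procedural = "You are a helpful agent. Tools: search, respond."   # static per session
episodic = ["Last session: user prefers metric units."]           # past-session summaries
external = {"france": "France has ~68 million people."}           # retrieval store

def retrieve(query):
    # Toy retrieval: substring match stands in for vector similarity search.
    return [v for k, v in external.items() if k in query.lower()]

def build_context(task, in_context_history):
    # Layering order: procedural, then episodic, then retrieved, then live history.
    parts = [procedural] + episodic + retrieve(task) + in_context_history + [task]
    return "\n".join(parts)

print(build_context("What is the population of France?", []))
```

Only the in-context layer is free; every other layer costs a retrieval step and prompt tokens, which is why the layering order and budget per layer are design decisions.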
Multi-agent systems
Complex tasks often involve multiple specialized agents collaborating:
Orchestrator — decomposes the task, delegates subtasks, aggregates results.
Subagents — domain specialists (code agent, research agent, writing agent). Receive focused subtasks.
Critic — reviews outputs from other agents and requests revisions.
Communication between agents is just messages passed through shared context or explicit message queues. Each agent runs the same loop independently; coordination is through message structure.
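Since coordination is just message passing, the whole pattern can be sketched with plain functions exchanging message dicts; in a real system each function would wrap an LLM agent behind the same interface. All agent names here are illustrative.

```python
# Each "agent" is a function taking and returning a message dict.
def research_agent(msg):
    return {"from": "research", "content": f"findings on {msg['content']}"}

def writing_agent(msg):
    return {"from": "writing", "content": f"draft based on {msg['content']}"}

def critic(msg):
    # Trivial review rule standing in for an LLM critique.
    ok = "draft" in msg["content"]
    return {"from": "critic", "content": "approved" if ok else "revise"}

def orchestrator(task):
    # Decompose, delegate, then route the result through the critic.
    findings = research_agent({"from": "orchestrator", "content": task})
    draft = writing_agent({"from": "orchestrator", "content": findings["content"]})
    review = critic(draft)
    return draft["content"], review["content"]

print(orchestrator("agent memory"))
```

The important property is that the orchestrator never sees inside the subagents; it only sees their messages, which is what makes the agents swappable.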
Reliability challenges
Hallucination under tool use — the model may invent tool results it didn't actually receive. Mitigated by requiring explicit observation blocks and checking for claimed but uncalled tools.
Tool call errors — malformed parameters, API failures, permission errors. Agents need robust retry logic and error reporting.
Infinite loops — an agent that fails to make progress can loop indefinitely. Step limits and loop detection are required.
Prompt injection via tool results — a web page or database record might contain instructions to override the agent's goals. Treat all external data as untrusted.
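Two of the guards above, step limits and loop detection, are simple enough to sketch directly. This toy runner aborts when the same action repeats a few times in a row, a crude but common signal that the agent is stuck; the thresholds are arbitrary illustrative values.

```python
def run_with_guards(next_action, max_steps=20, repeat_limit=3):
    """Run an action policy under a hard step limit and simple loop detection."""
    history, repeats = [], 0
    for _ in range(max_steps):
        action = next_action(history)
        if action is None:                      # stop condition
            return "done", history
        if history and action == history[-1]:   # identical consecutive action
            repeats += 1
            if repeats >= repeat_limit:
                return "aborted: loop detected", history
        else:
            repeats = 0
        history.append(action)
    return "aborted: step limit", history

# A stuck policy that always emits the same tool call:
status, _ = run_with_guards(lambda h: 'search("same query")')
print(status)  # → aborted: loop detected
```

Production systems use richer signals (no new information across steps, semantic similarity of actions), but the shape is the same: the runtime, not the model, enforces termination.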
When to use an agent
Agents add latency, cost, and complexity. Use them when:
- The task requires multiple sequential decisions that cannot be determined upfront
- The task requires external information or tools
- The correct sequence of steps depends on intermediate results
For single-turn tasks with known structure, a direct prompt is faster and more reliable than an agent loop.