Agentic Design Patterns: A book that made me rethink "What exactly is an Agent?"

By: rootdata|2026/05/26 03:45:00

Author: Yanhua

Antonio Gullí is an engineering director at Google. His 453-page book breaks the development of AI Agents down into 21 design patterns.

But this is not a book review. My motivation for reading it was specific: I have written about Harness Engineering, shared the pitfalls I hit with Clawdbot, and, in "AI agents are not magic," discussed the seven turning points that take an Agent from burning tokens to being truly useful. After each piece, I was left with a question I hadn't fully thought through: is there a reusable underlying logic behind all of this?

This book gave me the answer, and it was deeper than I expected.

You may not be writing an Agent at all

The harshest judgment in the book is hidden in the prologue.

Most of the "AI" that people use is just Level 0: a bare LLM with no tools, no memory, and no actions. Ask it which film won Best Picture at the 2025 Oscars, and it guesses. The book states plainly: Level 0 is not an Agent.

Moving up is where the real Agents are:

Level 1: Tool User

The Agent starts using tools: search, APIs, databases. But it’s not just about "being able to call interfaces"; it also needs to judge when to call, what to call, and how to use the results. The book provides a very specific example: when a user asks, "What new shows are there recently?", the Agent realizes that this information is not in the training data and proactively calls the search tool to find it, then synthesizes the result. The key step is "realizing on its own." It’s not a human telling it, "go search," but rather it judging that it needs to search. This judgment ability is the threshold for Level 1.
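
The "realizing on its own" step can be sketched as a loop in which the model's reply, not the caller, decides whether a tool runs. This is a minimal illustration, not the book's code: `fake_model`, `fake_search`, and the `TOOL:` / `ANSWER:` / `OBSERVATION:` text protocol are all hypothetical stand-ins for a real LLM call and a real search API.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]
Model = Callable[[str], str]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call.

    It requests the search tool when the question looks time-sensitive,
    and answers directly once an observation is present in the prompt.
    """
    if "OBSERVATION:" in prompt:
        return "ANSWER: " + prompt.split("OBSERVATION:", 1)[1].strip()
    if "recently" in prompt.lower():
        return "TOOL: search"
    return "ANSWER: (answered from training data)"


def run_agent(question: str, model: Model, tools: Dict[str, Tool]) -> str:
    """One Level 1 turn: the model decides whether a tool is needed."""
    reply = model(f"QUESTION: {question}")
    if reply.startswith("TOOL: "):
        tool_name = reply[len("TOOL: "):].strip()
        observation = tools[tool_name](question)
        # Second call: synthesize an answer from the tool result.
        reply = model(f"QUESTION: {question}\nOBSERVATION: {observation}")
    if reply.startswith("ANSWER: "):
        return reply[len("ANSWER: "):]
    return reply


search_log: List[str] = []  # records every query the search tool receives


def fake_search(query: str) -> str:
    """Hypothetical search tool; a real one would call a search API."""
    search_log.append(query)
    return f"search results for '{query}'"
```

Calling `run_agent("What new shows are there recently?", fake_model, {"search": fake_search})` triggers one search, while a question such as "What is 2 + 2?" is answered without touching the tool.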

Level 2: Strategic Thinker

Two more elements are added: planning and Context Engineering. The book defines Context Engineering: not just piling up information, but carefully selecting, trimming, and packaging context. A clever example is given: a user wants to find a coffee shop between two locations. The Agent first calls the map tool to gather a bunch of data, then judges that "only the street names are needed next," trims the map output into a short list, and feeds it to the local search tool. Each step is about reducing noise in the information.

There’s a sentence in the book that I read several times: "To achieve the highest accuracy with AI, it must be given short, focused, and powerful context." Context Engineering is about doing this.
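
The coffee-shop example can be sketched as a trimming step between two tools. The shape of the map output (`street`, `distance_m`, `polyline` keys) and the function names are assumptions made for illustration; the point is only that the second tool receives a short list instead of the raw payload.

```python
from typing import Any, Dict, List


def trim_route(route_steps: List[Dict[str, Any]]) -> List[str]:
    """Keep only the unique street names, in travel order."""
    seen = set()
    streets = []
    for step in route_steps:
        name = step.get("street")  # steps without a street are dropped
        if name and name not in seen:
            seen.add(name)
            streets.append(name)
    return streets


def build_search_prompt(streets: List[str]) -> str:
    """Package the trimmed context for the (hypothetical) local search tool."""
    return "Find coffee shops on: " + ", ".join(streets)


# Hypothetical raw output from a map tool: mostly noise for the next step.
raw_route = [
    {"street": "Main St", "distance_m": 120, "polyline": "a~l~Fjk~uOwHJy@P"},
    {"street": "Main St", "distance_m": 300, "polyline": "o}l~Fbm~uOgAd@"},
    {"street": "Oak Ave", "distance_m": 450, "polyline": "w_m~Fhn~uOsB|@"},
    {"instruction": "arrive"},
]

prompt = build_search_prompt(trim_route(raw_route))
```

Here `prompt` contains two street names and nothing else, which is the "short, focused" context the quote asks for.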

At this level, the Agent can also self-reflect. After completing a task, it reviews its work, identifies problems, and makes corrections on its own. I will elaborate on this later.

Level 3: Multi-Agent Collaboration

The book's stance is clear: stop trying to build one all-powerful super Agent. The more reliable approach is to build a team, such as a project manager Agent plus researcher, designer, and copywriter Agents. The book's example is a new product launch: a "project manager Agent" coordinates everything, assigning tasks to a "market research Agent," a "product design Agent," and a "marketing Agent." The key is communication: how Agents pass data, synchronize state, and handle conflicts. The chapter illustrates six communication topologies, from the simplest single Agent to the most flexible custom mix, and explains which scenarios each one suits.

After reading these four levels, I suddenly understood why many people say, "My Agent is not useful." The model is not the problem; the issue is that you are treating it like a chatbot, and it may not even have reached Level 1.

Context Engineering: The Most Underestimated Concept in the Book

I wrote an article on Harness Engineering, arguing that track design matters more than engine horsepower. After reading this book, I realized that Context Engineering is Harness Engineering applied at the prompt level.

Traditional Prompt Engineering only cares about "how you ask." The book's Context Engineering is concerned with "what context is in front of the Agent before you ask." It covers four layers of information:

First layer, system prompt. Defines who the Agent is, what tone to use, and what boundaries to set. Most people only write this layer.
Second layer, external data. Documents retrieved by RAG, return values from tool calls, real-time API data. This is where most people get stuck: they know they need to feed data but don’t know how to do it without overwhelming the model.
Third layer, implicit data. User identity, interaction history, environmental state. Things that are not explicitly stated but the Agent should know. For example, if you tell the Agent, "Help me send an email to John to confirm tomorrow's meeting," it should know what tomorrow's meeting is in your calendar and what your relationship with John is.
Fourth layer, feedback loop. After each output, the Agent automatically evaluates quality and adjusts the context strategy for the next time. The book refers to this as "automated context optimization," and Google’s Vertex AI Prompt Optimizer is an engineering implementation of this idea.
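
One way to picture the four layers is as fields of a single object that gets rendered into the prompt. The sketch below is an assumption-laden illustration, not the book's implementation: the class name `AgentContext`, the section labels, and the "keep the first `max_docs` documents" rule are all hypothetical, and a real system would rank documents by relevance rather than by position.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentContext:
    system_prompt: str                                            # layer 1
    external_data: List[str] = field(default_factory=list)        # layer 2
    implicit_data: Dict[str, str] = field(default_factory=dict)   # layer 3
    feedback_notes: List[str] = field(default_factory=list)       # layer 4

    def render(self, max_docs: int = 2) -> str:
        """Assemble the layers into one prompt, capping external documents."""
        parts = [f"[system]\n{self.system_prompt}"]
        if self.implicit_data:
            lines = [f"{k}: {v}" for k, v in sorted(self.implicit_data.items())]
            parts.append("[implicit]\n" + "\n".join(lines))
        if self.external_data:
            # Cap the volume so retrieved data does not drown the model.
            parts.append("[external]\n" + "\n".join(self.external_data[:max_docs]))
        if self.feedback_notes:
            parts.append("[feedback]\n" + "\n".join(self.feedback_notes))
        return "\n\n".join(parts)


ctx = AgentContext(
    system_prompt="You are a scheduling assistant.",
    external_data=["doc A", "doc B", "doc C"],
    implicit_data={"user": "Alice", "timezone": "UTC"},
    feedback_notes=["Last reply was too long; be brief."],
)
text = ctx.render(max_docs=2)
```

Most instruction files only fill `system_prompt`; the other three fields are where the "what to feed and how much" decisions live.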

When I read this, I remembered a previous experience I shared in "AI agents are not magic," where I mentioned that "your agent needs rules, and many rules." Looking back, those rules are essentially the manual version of Context Engineering, which the book has systematized.

Reflection: Two Agents are Really Better than One

This is the most practically valuable pattern in the entire book for me.

The core of Reflection is simple: the Agent reviews its work after completing a task and makes corrections on its own. But the implementation method is crucial. The book clearly states: The Producer and Critic must use two different Agents, with different system prompts. A single persona reviewing its own work will always have blind spots. If you have the same LLM write code and then review its own code, it is very likely to say, "It’s pretty good."

The book provides a complete code example.

The Producer's prompt is "You are a Python developer, write a function to calculate the factorial, handling edge cases and exceptions."
The Critic's prompt is "You are a nitpicking senior engineer, review the code line by line, checking for bugs, style, missed edge cases, and areas for improvement. If it’s perfect, output CODE_IS_PERFECT; otherwise, list all the issues."
Then there’s a for loop: Producer writes code → Critic reviews → Producer makes changes based on feedback → Critic reviews again → until Critic says CODE_IS_PERFECT or the maximum iteration count is reached.
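
The loop described above can be sketched in a few lines. This is not the book's listing: the two prompts are paraphrased from the ones quoted above, and `producer` and `critic` are hypothetical callables of the form `(system_prompt, message) -> reply` that stand in for two separately configured LLM calls.

```python
from typing import Callable, Tuple

PRODUCER_PROMPT = (
    "You are a Python developer. Write a function to calculate the "
    "factorial, handling edge cases and exceptions."
)
CRITIC_PROMPT = (
    "You are a nitpicking senior engineer. Review the code line by line for "
    "bugs, style, missed edge cases, and areas for improvement. If it is "
    "perfect, output CODE_IS_PERFECT; otherwise, list all the issues."
)

# Hypothetical LLM interface: (system_prompt, message) -> reply text.
LLM = Callable[[str, str], str]


def reflect(task: str, producer: LLM, critic: LLM,
            max_iterations: int = 3) -> Tuple[str, int]:
    """Run Producer -> Critic rounds; return (final draft, reviews performed)."""
    draft = producer(PRODUCER_PROMPT, task)
    for iteration in range(1, max_iterations + 1):
        critique = critic(CRITIC_PROMPT, draft)
        if "CODE_IS_PERFECT" in critique:
            return draft, iteration
        # Pass only the latest draft and critique forward, not the full history.
        revision_request = (
            f"{task}\n\nPrevious draft:\n{draft}\n\nCritique:\n{critique}"
        )
        draft = producer(PRODUCER_PROMPT, revision_request)
    return draft, max_iterations
```

Sending only the most recent draft and critique to the Producer is a design choice made in this sketch to keep each call short; if the iteration cap is hit, the last revision is returned without a final review.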

It’s that simple. But the book reminds us of a cost issue that is easily overlooked: each reflection loop is a new LLM call, and the more iterations, the more expensive it becomes. Additionally, as the conversation history expands, the context window gets filled with earlier versions and critiques, reducing the actual usable reasoning space. Therefore, the best practice for Reflection is: set a reasonable maximum iteration count (the book uses 3), and stop once the Critic is satisfied; don’t pursue perfection.

The uses extend far beyond writing code. Writing articles, making plans, summarizing documents, and solving logic problems can all apply the Producer-Critic model. The book lists seven application scenarios that share the same core logic: produce first, then review, then correct.

Multi-Agent: More Complex Is Not Better

What I liked most about the Multi-Agent Collaboration chapter is the six communication topology diagrams. Many people jump straight into complexity, but in most scenarios, three types are sufficient:

Single Agent (Independent Execution): Tasks can be broken down into independent sub-problems, and each Agent handles its own. Simple and easy to maintain.
Peer-to-Peer Network: Agents communicate directly with each other, with no central control node. Decentralized and fault-tolerant; if one Agent fails, it doesn’t affect the whole system. However, coordination costs are high, and it can easily become chaotic.
Supervisor (Central Coordination): A Supervisor Agent manages a group of Worker Agents. It allocates tasks, collects results, and resolves conflicts. Clear hierarchy and easy management. However, the Supervisor is a single point of failure and a performance bottleneck.
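
As a rough sketch of the Supervisor topology, the function below allocates subtasks by role, collects the results, and contains worker failures so one broken Worker does not sink the run. The names (`supervise`, the role keys, the `FAILED:` and `UNASSIGNED:` markers) are hypothetical, and each Worker is a plain callable standing in for an Agent.

```python
from typing import Callable, Dict

Worker = Callable[[str], str]  # hypothetical Worker Agent: subtask -> result


def supervise(subtasks: Dict[str, str],
              workers: Dict[str, Worker]) -> Dict[str, str]:
    """Allocate each subtask to the Worker for its role and collect results."""
    results: Dict[str, str] = {}
    for role, subtask in subtasks.items():
        worker = workers.get(role)
        if worker is None:
            results[role] = "UNASSIGNED: no worker for this role"
            continue
        try:
            results[role] = worker(subtask)
        except Exception as exc:
            # A failing Worker is reported, not allowed to crash the run.
            results[role] = f"FAILED: {exc}"
    return results
```

Every subtask passes through this one function, which is exactly why the Supervisor is both easy to manage and a single point of failure: if `supervise` itself breaks, nothing runs.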

The other three (Supervisor-as-Tool, hierarchical, custom mix) are variations and combinations of the first three. The book puts it practically: the topology you need depends on the complexity of your task. The more fragmented the task, the higher the communication costs; at a certain point, the Supervisor model can be more efficient than the hierarchical one.

My experience is that many people spend 80% of their time on communication protocols when building Multi-Agents, forgetting to ask a more fundamental question: does this task really need multiple Agents? The book clearly states that a Level 2 single Agent with Reflection is often sufficient. Level 3 is meant for scenarios that a single Agent truly cannot handle.

The Three-Layer Memory Model: I Had a Vague Sense of It but Never Named It

The Memory chapter resonated with me the most because when I wrote the articles on Obsidian + Claude, I was constantly pondering a question: how should the Agent's memory be layered?

The book provides the answer:

Session (Conversation Layer): The context window of the current conversation, which is the shortest memory and disappears once the conversation ends. Long-context models simply enlarge this window, but essentially it’s still temporary, and each inference has to process the entire window, which is costly and slow.
State (State Layer): Temporary data during the current task. For example, "What is the current task?", "How far has it progressed?", "What data has been generated in between?". Longer than Session, but cleared once the task ends; the book uses Google ADK's State mechanism as a complete example.
Memory (Persistent Layer): Long-term memory that spans sessions and tasks. User preferences, learned experiences, important historical decisions stored in databases or vector stores, with semantic retrieval. The book emphasizes an important point: Memory is not just about storage; it also requires designing a complete strategy for "what to store, when to store, and how to retrieve." Storing too much creates noise, while storing too little is insufficient.
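
The three layers differ mainly in when they are cleared, which a small class can make concrete. This is a simplified sketch and not Google ADK's API: `AgentMemory` and its methods are hypothetical, and keyword overlap stands in for the semantic retrieval that a vector store would provide.

```python
from typing import Any, Dict, List


class AgentMemory:
    def __init__(self) -> None:
        self.session: List[str] = []      # conversation turns
        self.state: Dict[str, Any] = {}   # scratch data for the current task
        self.long_term: List[str] = []    # survives sessions and tasks

    def end_task(self) -> None:
        """State is cleared once the task ends."""
        self.state.clear()

    def end_session(self) -> None:
        """Session disappears once the conversation ends."""
        self.session.clear()

    def remember(self, fact: str) -> None:
        """A minimal 'what to store' policy: skip exact duplicates."""
        if fact not in self.long_term:
            self.long_term.append(fact)

    def recall(self, query: str, top_k: int = 2) -> List[str]:
        """Return the stored facts sharing the most words with the query."""
        query_words = set(query.lower().split())
        scored = []
        for fact in self.long_term:
            overlap = len(query_words & set(fact.lower().split()))
            if overlap:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:top_k]]
```

The `remember` and `recall` methods are where the "what to store, when to store, and how to retrieve" strategy would go; the duplicate check and the `top_k` cap are only the crudest possible versions of it.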

In my previous article on Clawdbot, I mentioned "state files" and "workspace documents." Those were essentially my manual attempts at building State and Memory layers, and the book gives that practice a framework.

Five Assumptions, the Fifth is the Most Absurd

The book closes with five assumptions about the future of Agents. The first four are still reasonable extrapolations: general-purpose Agents that evolve from coding to project management, deeply personalized Agents that proactively discover your needs, embodied intelligence that moves from screens into the physical world, and Agents that become independent economic entities.

The fifth assumption shocked me: Transforming Multi-Agent.

You only declare a goal, such as "create an e-commerce business selling premium coffee." The system automatically decides: first create a "market research Agent" and a "branding Agent." After running some data, it judges that the branding Agent is no longer needed and splits it into three new Agents: "Logo Design Agent," "Website Building Agent," and "Supply Chain Agent." If the Website Building Agent becomes a bottleneck, the system will automatically duplicate three parallel Agents to work on different pages simultaneously. Throughout the process, the system continuously optimizes each Agent's prompt and reorganizes the team structure.

The book refers to this as a "goal-driven, self-transforming multi-Agent system." It is not executing a plan you wrote; it is generating its own plans, adjusting its plans, and reorganizing its execution team on its own.

This reminds me of Karpathy's AutoResearch: write a program.md, define goals, metrics, and boundaries, and hit "start." Humans are outside the loop. But this book pushes it further: even how the Agent team is formed and reorganized is left to the system to decide. Humans only declare "what they want."

Three Actions You Can Take Immediately

After finishing the book, I came away with three actions that can be implemented right away:

First, add a Critic to your current Agent. Whether you are using Claude Code, CrewAI, or a framework you built yourself, add a step at the end of your existing workflow: have another Agent (with a different system prompt) review the output of the previous step. Code generation plus code review, article writing plus fact-checking, planning plus feasibility assessment. It adds one more LLM call, but the quality improvement is often significant. The Producer-Critic model in the book is plug-and-play.
Second, start doing Context Engineering, not just Prompt Engineering. Look back at the instruction files you wrote for the Agent. If they are all rules about "how you should do it," lacking context about "what environment you are facing right now," fill that in. Tell the Agent what project it is currently in, what decisions have been made previously, and what user preferences are. The Context Engineering chapter in the book and your AGENTS.md are two expressions of the same thing.
Third, don’t rush into Multi-Agent. Get your single Agent to Level 2: with tools, Reflection, and Memory. The book repeatedly emphasizes that a Level 2 single Agent combined with Producer-Critic and Context Engineering can cover the vast majority of practical scenarios. Level 3 is meant for tasks that truly require cross-domain, multi-stage, and parallel division of labor. Most people's problem is not that they lack enough Agents, but that they haven't optimized a single Agent.

This book runs 453 pages and is published by Springer (2025). The code examples cover LangChain/LangGraph, Google ADK, CrewAI, and the OpenAI API. The foreword is written by the Google Cloud AI VP, and there is a recommendation from the CIO of Goldman Sachs, which is unexpectedly well-written.

But the reason I recommend it is not for its "comprehensiveness." It’s because after reading it, you will realize one thing: the pitfalls you encountered with Agents over the past six months have already been organized into patterns by someone else. You don’t need to reinvent Reflection, you don’t need to guess how to layer Memory, and you don’t need to experiment with which communication topology to use for Multi-Agent.

Someone has drawn the map for you; all that’s left is to walk it.

Are you using AI Agents for development? What level is your current Agent at?
