AI Agent Architecture — AI & LLM
15 key concepts in AI agent architecture
Key Terms in AI Agent Architecture
- Agent Loop
- The core pattern behind every AI agent. The LLM thinks about what to do, takes an action (like reading a file), observes the result, and thinks again. Think, Act, Observe, repeat. Unlike a chatbot that responds once, an agent keeps going through this loop until the task is done. This is what turns an LLM from a text generator into a problem solver.
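The loop above can be sketched in a few lines. This is a minimal illustration, not any specific framework's API: `llm_decide` stands in for the model call, `tools` is a dict of callables, and the `finish` tool name is an assumption for signaling completion.

```python
def agent_loop(llm_decide, tools, max_steps=10):
    """Minimal think-act-observe loop: the LLM picks an action,
    the system executes it, and the observation feeds the next step."""
    observation, history = None, []
    for _ in range(max_steps):
        action = llm_decide(observation, history)          # Think
        if action["tool"] == "finish":                     # task is done
            return action["args"]["answer"]
        observation = tools[action["tool"]](**action["args"])  # Act + Observe
        history.append((action, observation))
    return None  # step budget exhausted without finishing
```

The `max_steps` cap matters: without it, the loop itself becomes one of the runaway behaviors the kill switches below exist to stop.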
- Planner and Executor
- Two roles inside the agent runtime. The Planner is the LLM reasoning about what to do next (which tool to call, what file to read, how to fix an error). The Executor is the system component that actually runs the tool call in a controlled environment. Separating them is important because the Planner can make mistakes, and the Executor can enforce safety limits.
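The safety benefit of the split shows up in the Executor: it can reject anything the Planner proposes before it runs. A minimal sketch, with an illustrative allowlist as the enforced limit:

```python
ALLOWED_TOOLS = {"read_file", "search_files"}  # example executor-enforced allowlist

def execute(action):
    """Executor: validates the Planner's requested action before running it.
    The Planner (an LLM) may propose anything; the Executor enforces limits."""
    if action["tool"] not in ALLOWED_TOOLS:
        return {"ok": False, "error": f"tool {action['tool']!r} not permitted"}
    # ...dispatch to the real tool implementation here...
    return {"ok": True, "result": f"ran {action['tool']}"}
```

Because the check lives outside the model, a hallucinated or malicious tool call fails closed instead of executing.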
- Tool Calling
- Agents cannot directly interact with the outside world. Instead, they request actions through structured tool calls (search_files, edit_file, run_command), and the system executes them in a controlled environment. Like a doctor writing prescriptions instead of dispensing medicine directly. Each tool has a typed schema defining its inputs and outputs.
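A typed schema makes bad calls detectable before execution. A sketch of a JSON-Schema-style tool registry with a required-field check (tool and field names are illustrative, not any provider's format):

```python
# Illustrative tool registry; each tool declares its input types.
TOOLS = {
    "edit_file": {
        "description": "Replace a range of lines in a file",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "start_line": {"type": "integer"},
                "new_text": {"type": "string"},
            },
            "required": ["path", "start_line", "new_text"],
        },
    },
}

def validate_call(name, args):
    """Reject a tool call whose arguments are missing required fields."""
    schema = TOOLS[name]["input_schema"]
    missing = [k for k in schema["required"] if k not in args]
    return (len(missing) == 0, missing)
```

A real system would also type-check each field; the principle is the same — the schema is the contract between model and executor.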
- MCP (Model Context Protocol)
- A standardized interface for how agents discover and use tools, developed to make tools portable across different LLM providers. Before MCP, every provider had its own tool format. MCP defines tool discovery, input schemas, output formats, and timeouts in one standard. Think of it as USB-C for AI tools: one connector that works everywhere.
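Concretely, MCP speaks JSON-RPC; a server advertises its tools via a `tools/list` response. The shape below follows the MCP spec's tool listing, but treat the exact fields as illustrative:

```python
import json

# Sketch of an MCP-style tools/list response (field names per the MCP
# spec's tool listing; the search_files tool itself is a made-up example).
tools_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_files",
                "description": "Search the workspace for a pattern",
                "inputSchema": {
                    "type": "object",
                    "properties": {"pattern": {"type": "string"}},
                    "required": ["pattern"],
                },
            }
        ]
    },
}

wire_format = json.dumps(tools_list_response)  # what actually crosses the transport
```

Any MCP-aware client can read this listing and call the tool, regardless of which LLM provider sits behind it — that portability is the point.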
- Execution Sandbox
- An isolated environment where agent commands run safely, so a buggy or malicious command cannot damage the host system. Docker containers provide basic isolation for L2 tasks. Firecracker microVMs provide full virtual machine isolation for L3 autonomous sessions that run for hours. Like a chemistry lab's fume hood: the experiment can proceed, but the hazards stay contained.
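For the Docker tier, isolation mostly comes down to the flags you run with. A sketch that builds such an invocation — the flags are standard `docker run` options, but the specific limits are illustrative and should be tuned per workload:

```python
def sandbox_command(image, cmd):
    """Build a docker invocation for basic L2-style isolation."""
    return [
        "docker", "run", "--rm",
        "--network=none",   # no outbound network access
        "--memory=512m",    # cap RAM
        "--cpus=1",         # cap CPU
        "--read-only",      # immutable root filesystem
        "--workdir", "/work",
        image, "sh", "-c", cmd,
    ]
```

`--network=none` is the flag that does the most work here: even a fully compromised agent process cannot exfiltrate anything it reads.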
- Human-in-the-Loop Levels
- Different approval modes based on the risk of the change. Fix a typo: auto-apply, no approval needed. Edit a single function: show diff, auto-approve after 5 seconds. Multi-file refactor: show the plan first, require explicit 'go ahead'. Delete files: always require explicit approval. The risk level determines how much human oversight the system requires.
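The mapping from risk to approval mode is just a policy function. A sketch using the tiers above — the field names and thresholds are illustrative choices, not a standard:

```python
def approval_mode(change):
    """Map a proposed change to the oversight it requires.
    Checks run from highest risk down, so the strictest rule wins."""
    if change["deletes_files"]:
        return "explicit_approval"    # deletions always need a human yes
    if change["files_touched"] > 1:
        return "plan_approval"        # show the plan, wait for 'go ahead'
    if change["lines_changed"] <= 2:
        return "auto_apply"           # typo-level edit
    return "diff_with_timeout"        # show diff, auto-approve after 5 s
```

Ordering matters: a multi-file change that also deletes files must hit the deletion rule first, which is why the checks go from most to least restrictive.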
- 3-Strikes Rule
- A safety mechanism to prevent agents from getting stuck in infinite loops. If the agent encounters the same error pattern 3 times, it stops trying to fix it automatically and asks a human for guidance. Without this, an agent can burn hundreds of dollars in tokens going in circles trying to fix an unfixable error.
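The rule reduces to a counter keyed by error signature. A minimal sketch (how you normalize an error into a signature — stripping timestamps, line offsets, etc. — is the real design work and is elided here):

```python
from collections import Counter

class ThreeStrikes:
    """Stop auto-retrying once the same error signature recurs 3 times."""
    def __init__(self, limit=3):
        self.limit = limit
        self.seen = Counter()

    def record(self, error_signature):
        """Log one occurrence; returns True when it's time to ask a human."""
        self.seen[error_signature] += 1
        return self.seen[error_signature] >= self.limit
```

The signature matters: if two slightly different tracebacks count as different errors, the agent can loop forever while each counter stays at 1.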
- Checkpointing
- Periodically saving the agent's progress so work is not lost if something crashes. Every N steps, the system makes a git commit (capturing file state) and saves a JSON file (capturing the agent's plan, completed tasks, and decisions). If the process dies, it resumes from the last checkpoint instead of starting over. Critical for sessions that run for hours.
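The JSON half of the checkpoint can be sketched directly; the filename and state fields are illustrative, and in a full system each save would be paired with a `git commit` capturing file state:

```python
import json
from pathlib import Path

def save_checkpoint(workdir, step, plan, completed):
    """Persist the agent's plan and progress every N steps.
    (A git commit of the working tree would accompany this in practice.)"""
    state = {"step": step, "plan": plan, "completed": completed}
    Path(workdir, "checkpoint.json").write_text(json.dumps(state))

def load_checkpoint(workdir):
    """Resume from the last saved state after a crash, if one exists."""
    path = Path(workdir) / "checkpoint.json"
    return json.loads(path.read_text()) if path.exists() else None
```

Keeping file state in git and plan state in JSON means a resume restores both what the code looks like and what the agent thought it was doing.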
- Agent Memory Hierarchy
- Three layers of memory with different lifetimes. Working memory (the current context window): lasts one LLM call, limited to the context window size. Session memory (database): lasts for the current task, stores tool results and progress. Project memory (filesystem, like CLAUDE.md): permanent, stores architecture decisions and coding conventions that persist across sessions.
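The three layers can be modeled as one object with three lifetimes. A sketch — the structure is illustrative, and real systems back the session layer with a database rather than a dict:

```python
class AgentMemory:
    """Three lifetimes: working (one LLM call), session (one task),
    project (persists across sessions, e.g. a CLAUDE.md-style file)."""
    def __init__(self, project_notes):
        self.project = project_notes  # permanent: conventions, decisions
        self.session = {}             # per-task: tool results, progress
        self.working = []             # per-call: messages for the next prompt

    def context_for_next_call(self):
        # Working memory is rebuilt for each call from the longer-lived layers.
        return {"notes": self.project,
                "results": self.session,
                "messages": self.working}
```

The key property is the rebuild direction: working memory is disposable because anything worth keeping has already been promoted to the session or project layer.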
- Context Compaction
- When a long-running agent session fills up the context window with old tool results, the system compresses older entries. Recent results (last 20) are kept verbatim. Older results are summarized into one-line descriptions ('Read auth.ts: found JWT middleware using RS256'). Decisions are kept permanently. This prevents the agent from forgetting early decisions while making room for new information.
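The compaction pass above is a simple partition: recent entries survive verbatim, older tool results collapse to their summaries, decisions pass through untouched. A sketch with an assumed entry shape (`kind` and `summary` fields are illustrative):

```python
def compact(entries, keep_recent=20):
    """Compress an agent's history: last `keep_recent` entries verbatim,
    older tool results reduced to one-line summaries, decisions kept forever."""
    older, recent = entries[:-keep_recent], entries[-keep_recent:]
    compacted = []
    for e in older:
        if e["kind"] == "decision":
            compacted.append(e)  # decisions are permanent
        else:
            compacted.append({"kind": "summary", "text": e["summary"]})
    return compacted + recent
```

Since a full tool result can be thousands of tokens and its summary one line, compaction frees most of the window while the entry count (and the agent's sense of what happened) stays intact.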
- Progressive Autonomy
- A trust model where the system starts cautious and earns more freedom over time. New users approve every change the agent proposes. As the system proves reliable (high acceptance rate for that specific user and codebase), it gradually auto-approves low-risk changes. The developer can always revoke trust. Like training wheels that come off as confidence builds.
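The trust decision can be reduced to a gate on per-user, per-codebase history. A sketch — the 95% threshold and 50-sample minimum are illustrative numbers, not a standard:

```python
def can_auto_approve(risk, accepted, proposed, min_samples=50, threshold=0.95):
    """Auto-approve only low-risk changes, and only after this user/codebase
    pair has enough history with a high enough acceptance rate."""
    if risk != "low":
        return False               # high-risk changes always go to a human
    if proposed < min_samples:
        return False               # not enough history: training wheels stay on
    return accepted / proposed >= threshold
```

Revoking trust is just resetting the counters (or raising the threshold), so the developer stays in control of the dial.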
- Multi-Agent Orchestration
- Splitting a large task across multiple specialized agents that work in parallel, coordinated by an orchestrator. One agent handles backend code, another handles frontend, a third handles infrastructure. A reviewer agent checks their output before committing. File-level locks prevent two agents from editing the same file simultaneously.
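The file-locking piece can be sketched with per-path locks handed out by the orchestrator (an in-process sketch; distributed setups would use a lock service instead):

```python
import threading
from collections import defaultdict

class FileLocks:
    """File-level locks so two agents never edit the same file at once."""
    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per file path

    def acquire(self, path):
        """Non-blocking: returns False if another agent holds the file."""
        with self._guard:
            lock = self._locks[path]
        return lock.acquire(blocking=False)

    def release(self, path):
        self._locks[path].release()
```

An agent that fails to acquire a file either waits or asks the orchestrator to reschedule the subtask — which is exactly the coordination the orchestrator exists to do.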
- Agentic RAG
- When a query is too complex for a single retrieval step, the agent decomposes it into sub-queries and retrieves information iteratively. 'Why is checkout slow and what changed after the Q3 migration?' becomes two separate searches, with results from the first informing the second. The agent decides when it has enough context to generate an answer, or when to search again.
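The decompose-retrieve-decide cycle can be sketched as a loop over three LLM-backed callables (`decompose`, `retrieve`, and `enough` are hypothetical stand-ins for model and search calls):

```python
def agentic_rag(question, decompose, retrieve, enough, max_rounds=5):
    """Iterative retrieval: split the question into sub-queries, search,
    and let the model decide when the gathered context suffices."""
    context = []
    queries = decompose(question, context)
    for _ in range(max_rounds):
        for q in queries:
            context.extend(retrieve(q))
        if enough(question, context):
            return context                       # ready to generate an answer
        queries = decompose(question, context)   # earlier results inform the next round
    return context  # round budget exhausted; answer with what we have
```

The checkout example maps on directly: round one retrieves performance data, and what it finds shapes the Q3-migration query in round two.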
- Kill Switches
- Hard limits that stop an agent unconditionally. Token budget exceeded: stop. Wall-clock timeout (e.g., 5 minutes for L2, 4 hours for L3): stop. Same error 3 times: stop. No heartbeat for 2 minutes: assume crashed, restart from checkpoint. These exist because LLM agents can get stuck in subtle ways that look like progress but aren't.
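All four switches are unconditional checks run between steps. A sketch using the L2-style limits quoted above (the state-dict shape is illustrative):

```python
import time

def should_kill(state, budget_tokens=50_000, timeout_s=300,
                max_same_error=3, heartbeat_s=120):
    """Return the tripped kill switch, or None if the agent may continue.
    Checked unconditionally between steps; no LLM reasoning can override it."""
    now = time.monotonic()
    if state["tokens_used"] >= budget_tokens:
        return "token_budget_exceeded"
    if now - state["started_at"] >= timeout_s:
        return "wall_clock_timeout"
    if state["same_error_count"] >= max_same_error:
        return "repeated_error"
    if now - state["last_heartbeat"] >= heartbeat_s:
        return "no_heartbeat"      # assume crashed; restart from checkpoint
    return None
```

The point of putting this outside the agent is precisely that stuck-but-busy failure modes look like progress from the inside; only an external monitor with hard numbers can tell the difference.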
- Token Budget per Task
- A spending limit that prevents agents from running indefinitely. Set a hard ceiling (e.g., 50K tokens or $0.50 per task). The system tracks token consumption in real-time. At 80% consumed: warn the user. At 100%: stop execution, save a checkpoint, and present whatever partial results are available.
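The warn-then-stop behavior is a small state machine around a running total. A sketch using the thresholds above:

```python
class TokenBudget:
    """Hard per-task ceiling with a warning at 80% of the limit."""
    def __init__(self, limit=50_000, warn_at=0.8):
        self.limit, self.warn_at, self.used = limit, warn_at, 0

    def add(self, tokens):
        """Record consumption; returns 'ok', 'warn', or 'stop'."""
        self.used += tokens
        if self.used >= self.limit:
            return "stop"   # halt, checkpoint, surface partial results
        if self.used >= self.warn_at * self.limit:
            return "warn"   # tell the user the budget is nearly spent
        return "ok"
```

Checkpointing at the stop boundary is what makes the ceiling safe: the user loses budget, not work.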