Agents

June 26, 2025
AIAgentsLLMs

Quick Start

Google recently open-sourced Gemini CLI, an AI agent that lives in the terminal. You can quickly get it running with:

npm install -g @google/gemini-cli
gemini

 ███            █████████  ██████████ ██████   ██████ █████ ██████   █████ █████
░░░███         ███░░░░░███░░███░░░░░█░░██████ ██████ ░░███ ░░██████ ░░███ ░░███
  ░░░███      ███     ░░░  ░███  ░███░█████░███  ░███  ░███░███ ░███  ░███
    ░░░███   ░███          ░██████    ░███░░███ ░███  ░███  ░███░░███░███  ░███
     ███░    ░███    █████ ░███░░█    ░███ ░░░  ░███  ░███  ░███ ░░██████  ░███
   ███░      ░░███  ░░███  ░███ ░███      ░███  ░███  ░███  ░░█████  ░███
 ███░         ░░█████████  ██████████ █████     █████ █████ █████  ░░█████ █████
░░░            ░░░░░░░░░  ░░░░░░░░░░ ░░░░░     ░░░░░ ░░░░░ ░░░░░    ░░░░░ ░░░░░

What is an “Agent”?

✨ Insight

AI agents are systems that can think, act, and learn from results. Unlike chatbots that only generate text, agents interact with the world through tools and APIs to complete real tasks.

The core pattern is the Thought-Action-Observation (TAO) loop:

Loading diagram...

Agents are Just Programs

✨ Insight

Agents don't contain AI models! The Gemini CLI is simply a program that makes API calls to Google's external LLM servers. It's no different from a weather app calling a weather API—the intelligence lives in the "cloud", not in your terminal.

The entire “agent” is just orchestration code:

Loading diagram...


From Theory to Practice

When you ask “How many files are in the src directory?”, the agent doesn’t guess—it uses the TAO loop to gather real information. The key difference: agents perform real-world actions, while chatbots only process training data.

Loading diagram...


How Gemini Works: Streaming Architecture

Google built Gemini CLI around a streaming-first design that processes events in real-time. Instead of waiting for complete responses, it streams thoughts, actions, and results as they happen.

The Four Core Components

🔄 Conversation Manager

Handles the back-and-forth with Google's AI and manages conversation history

⚙️ Tool Scheduler

Manages when and how tools run, including safety approvals

🔧 Tool Registry

Library of available actions like reading files, running commands, web searches

🖥️ User Interface

Terminal display that shows everything happening in real-time

Loading diagram...

Safety and Control

Gemini CLI has three safety modes:

  • 🛡️ DEFAULT: Ask permission for dangerous operations (delete files, run commands)
  • ⚡ AUTO_EDIT: Auto-approve file edits, but ask for everything else
  • 🚀 YOLO: Run everything automatically (for experienced users)

✨ Smart Safety

The system can tell the difference between safe operations (reading files) and dangerous ones (deleting files). It only asks permission when it actually matters.

Real Example: Finding TypeScript Files

When you ask “Find all TypeScript files and analyze their imports”, here’s what happens:

👤 You: "Find all TypeScript files and analyze their imports"
🤔 AI Thinks: "I need to find .ts files, then read each one to check imports"
⚡ AI Acts: Runs find_files("*.ts") and multiple read_file() commands
👁️ AI Observes: Gets list of files and their contents
💬 AI Responds: "Found 15 TypeScript files. Here are the import patterns..."

Extensible Tools

The Tool Registry automatically discovers and loads tools from multiple sources—built-in capabilities, project-specific discovery commands, and MCP servers. This means the AI’s abilities expand dynamically based on your project’s needs, whether that’s connecting to databases, GitHub APIs, or development environments through MCP protocol.

  • Built-in tools for files, web, commands
  • MCP (Model Context Protocol) server support
  • Project-specific tool discovery

Loading diagram...


Key Takeaways

🎯 What Makes Agents Special

  • They can actually DO things, not just talk
  • They use real-time reasoning and feedback loops
  • They're just programs that coordinate AI with tools

🚀 Why This Matters

  • Agents will become our main AI interface
  • The patterns here work for any domain
  • Safety and control are built-in from the start
  • Real-time streaming makes everything feel natural

The Future of AI Agents

✨ Looking Ahead

As AI models get better, agents like Gemini CLI show us the path forward: AI that can think, act, and learn in real-time while keeping humans in control. The magic isn't in the AI model itself—it's in how we connect AI reasoning to real-world actions.

The patterns from Gemini CLI—streaming responses, safety controls, extensible tools, and transparent reasoning—are the blueprint for the next generation of AI interfaces. We’re moving from “AI that talks” to “AI that works.”


This post was written with assistance from Gemini CLI and Claude Code (mainly Claude) for research tasks on the Gemini CLI Codebase and helping with blog writing + styling