Google ADK
- This chapter discusses how to build agents using the Google Agent Development Kit (ADK).
- Source and important links
Tools
- Definition: Tools are functions that extend the capabilities of LLM-based agents beyond their inherent knowledge and reasoning.
- Types of Tools in ADK:
- Python functions (short-lived or long-running).
- Other AI agents.
- Classes derived from BaseTool.
- Specialized tools like MCP tools or OpenAPI tools.
- Default Tools Provided:
- Google Search.
- Code Execution.
- Guidelines for Defining Function-Based Tools:
- Use descriptive function names (e.g., get_weather, schedule_meeting) with type hints for return types.
- Include detailed docstrings and type hints for all parameters.
- Use clear, specific parameter names (e.g., city: str instead of c: str) without default values.
- Return a dictionary with:
- status: "success" or "error".
- data: The result or output.
- reason: Explanation for failures (if applicable).
- Long-Running Tasks:
- Suitable for tasks requiring extended processing or human-in-the-loop interaction.
- Must be wrapped with LongRunningFunctionTool for proper handling.
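A minimal sketch of the long-running pattern. The ask_for_approval function and its fields are hypothetical; the LongRunningFunctionTool import path follows the ADK docs and is shown as a comment since the wrapping itself needs the library installed:

```python
# Hypothetical long-running task: a reimbursement that needs human approval.
def ask_for_approval(purpose: str, amount: float) -> dict:
    """Requests human approval for a reimbursement (returns immediately with a pending status)."""
    # A real implementation would create a ticket; the final result arrives
    # later, outside this call, which is why the tool is "long-running".
    return {"status": "pending", "purpose": purpose, "amount": amount}

# Wrapping for ADK (import path assumed from the ADK docs):
# from google.adk.tools import LongRunningFunctionTool
# approval_tool = LongRunningFunctionTool(func=ask_for_approval)
```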
def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city (e.g., "New York", "London", "Tokyo").

    Returns:
        dict: A dictionary containing the weather information.
            Includes a 'status' key ('success' or 'error').
            If 'success', includes a 'report' key with weather details.
            If 'error', includes an 'error_message' key.
    """
    print(f"--- Tool: get_weather called for city: {city} ---")  # Log tool execution
    city_normalized = city.lower().replace(" ", "")  # Basic normalization
    # Mock weather data
    mock_weather_db = {
        "newyork": {"status": "success", "report": "The weather in New York is sunny with a temperature of 25°C."},
        "london": {"status": "success", "report": "It's cloudy in London with a temperature of 15°C."},
        "tokyo": {"status": "success", "report": "Tokyo is experiencing light rain and a temperature of 18°C."},
    }
    if city_normalized in mock_weather_db:
        return mock_weather_db[city_normalized]
    else:
        return {"status": "error", "error_message": f"Sorry, I don't have weather information for '{city}'."}
weather_agent = Agent(
    name="weather_agent_v1",
    model=AGENT_MODEL,  # Can be a string for Gemini or a LiteLlm object
    description="Provides weather information for specific cities.",
    instruction="You are a helpful weather assistant. "
                "When the user asks for the weather in a specific city, "
                "use the 'get_weather' tool to find the information. "
                "If the tool returns an error, inform the user politely. "
                "If the tool is successful, present the weather report clearly.",
    tools=[get_weather],  # Pass the function directly
)
Types of Agents
- Overview: ADK supports three main types of agents for building AI systems.
- LLM Agents:
- Powered by large language models (LLMs) as their core engine.
- Capabilities include understanding natural language, reasoning, planning, and generating responses.
- Can autonomously select appropriate tools based on tasks.
- Workflow Agents:
- Specialize in managing the execution flow of other agents without relying on LLMs for control logic.
- Subtypes:
- Sequential Agent: Executes tasks in a predefined order.
- Parallel Agent: Runs multiple tasks concurrently.
- Loop Agent: Repeats tasks based on specified conditions.
- Custom Agents:
- User-defined agents tailored for specific use cases not covered by LLM or Workflow agents.
- Multi-Agent Systems:
- Combine different agent types (e.g., Workflow and LLM agents) to create complex architectures for advanced task handling.
LLM Agent Configuration
- Required Parameters:
- model: Specifies the LLM model to use (e.g., a Gemini model name; see the model provider's documentation for available identifiers).
- name: Unique identifier for the agent.
- Optional Parameters:
- description: Summarizes the agent’s capabilities.
- Used in multi-agent systems to determine task routing.
- Should be unique and clearly distinguish the agent from others.
- instruction: Defines the agent’s goals, tool usage, and output format.
- Specify explicitly when and how tools should be used.
- It's best to follow the model provider's prompt-engineering guidance; different providers publish their own best practices.
- tools: Functions enabling specific tasks (e.g., web search).
- Configuration Options:
- temperature: Controls the LLM’s randomness (higher values increase creativity, lower values enhance determinism).
- input_schema: Enforces the structure of data sent to the agent.
- output_schema: Specifies the desired output format, with ADK converting the LLM’s response to match.
- output_key: Tags the agent’s output for storage and later retrieval.
capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country.",
    instruction="""You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of France?"
Example Response: "The capital of France is Paris."
""",
    tools=[get_capital_city],
    output_key="capital_city",  # Stores output in state['capital_city']
)
# Define a tool function
def get_capital_city(country: str) -> str:
    """Retrieves the capital city for a given country."""
    # Replace with actual logic (e.g., API call, database lookup)
    capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
    return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")
Workflow Agents
- Overview: Workflow agents manage the execution flow of tasks or sub-agents in a structured manner. ADK offers two approaches to define workflows:
- Workflow Agents: Use predefined agents (e.g., SequentialAgent, ParallelAgent, LoopAgent) for predictable, deterministic workflows.
- LLM Instructions: Define workflows within an LLM’s instructions, which can be less deterministic and harder to test (see Delegations section for details).
Sequential Agent
- Function: Executes sub-agents in a specified order.
- Use Case Example: Code development pipeline.
- Sub-agents work sequentially: Code Writing → Code Review → Code Refactoring.
- Behavior: Each sub-agent completes its task before the next begins, ensuring a linear workflow.
code_pipeline_agent = SequentialAgent(
    name="CodePipelineAgent",
    sub_agents=[code_writer_agent, code_reviewer_agent, code_refactorer_agent],
    description="Executes a sequence of code writing, reviewing, and refactoring.",
    # The agents will run in the order provided: Writer -> Reviewer -> Refactorer
)
Loop Agents
- Function: Iteratively executes sub-agents in a sequential manner.
- Termination: Stops when a predefined number of iterations is reached or a termination condition is met.
- Use Case Example: Generating an image with exactly four apples.
- The agent repeatedly generates images and checks the apple count until the condition (four apples) is satisfied.
refinement_loop = LoopAgent(
    name="RefinementLoop",
    # Agent order is crucial: generate first, then count and possibly exit
    sub_agents=[
        generate_image_agent,
        count_images_agent,
    ],
    max_iterations=5  # Limit loops
)
count_images_agent = LlmAgent(
    name="ImageCounterAgent",
    model="...",
    instruction="..., if there are 4 apples, call the 'exit_loop' function and do not output any text.",
    ...
    tools=[exit_loop]
)
def exit_loop(tool_context: ToolContext):
    """Call this function ONLY when the critique indicates no further changes are needed, signaling the iterative process should end."""
    print(f"  [Tool Call] exit_loop triggered by {tool_context.agent_name}")
    tool_context.actions.escalate = True
    # Return empty dict as tools should typically return JSON-serializable output
    return {}
Parallel Agents
- Function: Executes sub-agents concurrently, ideal for independent tasks.
- Behavior: Each sub-agent operates in its own execution branch, with no automatic sharing of conversation history or state.
- Coordination Methods:
- Shared Data Store: Use with locks to prevent race conditions.
- External State Management: Utilize databases, queues, or similar systems.
- Post-Processing: Collect outputs from all sub-agents and apply logic to coordinate results.
- Use Case Example: Running multiple independent analyses simultaneously and aggregating results afterward.
# This agent orchestrates the concurrent execution of the researchers.
# It finishes once all researchers have completed and stored their results in state.
parallel_research_agent = ParallelAgent(
    name="ParallelWebResearchAgent",
    sub_agents=[researcher_agent_1, researcher_agent_2, researcher_agent_3],
    description="Runs multiple research agents in parallel to gather information."
)

# This is the main agent that will be run. It first executes the ParallelAgent
# to populate the state, and then executes the MergerAgent to produce the final output.
sequential_pipeline_agent = SequentialAgent(
    name="ResearchAndSynthesisPipeline",
    # Run parallel research first, then merge
    sub_agents=[parallel_research_agent, merger_agent],
    description="Coordinates parallel research and synthesizes the results."
)
Custom Agents
- Function: Enable custom orchestration logic by inheriting from BaseAgent and defining bespoke control flows.
- Purpose: Supports complex workflows beyond the capabilities of SequentialAgent, LoopAgent, or ParallelAgent.
- Use Case Example: An agent with conditional logic that branches based on specific states or conditions, tailored to unique requirements.
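A sketch of conditional-branching logic for a custom agent. The plain decision function below is runnable; the ADK-specific wiring (BaseAgent, InvocationContext, _run_async_impl) is shown only as comments, with signatures taken from the ADK docs, so treat the exact subclassing details as assumptions:

```python
def choose_branch(state: dict) -> str:
    """Plain decision logic: pick a sub-agent based on session state (keys are hypothetical)."""
    if state.get("sentiment") == "negative":
        return "escalation_agent"
    return "faq_agent"

# Inside a BaseAgent subclass this would drive the flow, roughly:
#
# class SupportRouter(BaseAgent):
#     async def _run_async_impl(self, ctx: InvocationContext):
#         target = choose_branch(ctx.session.state)
#         agent = self.escalation_agent if target == "escalation_agent" else self.faq_agent
#         async for event in agent.run_async(ctx):
#             yield event
```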
Delegations
- Overview: Beyond Workflow Agents, ADK allows dynamic control of agent execution flow using LLM instructions, session state, or context manipulation.
LLM-Driven Delegation
- Function: Uses the LLM’s understanding to dynamically route tasks to appropriate sub-agents within a hierarchy.
- Requirements:
- The coordinator agent must have clear instructions specifying how to delegate tasks.
- Sub-agents need well-defined descriptions to enable accurate task routing by the coordinator.
booking_agent = LlmAgent(name="Booker", description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", description="Provides general information and answers questions.")
coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash",
    instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
    description="Main coordinator.",
    # AutoFlow is typically used implicitly here
    sub_agents=[booking_agent, info_agent]
)
# If coordinator receives "Book a flight", its LLM should generate:
# FunctionCall(name='transfer_to_agent', args={'agent_name': 'Booker'})
# ADK framework then routes execution to booking_agent.
comparison_generator_agent = Agent(
    model=constants.MODEL,
    name="comparison_generator_agent",
    description="A helpful agent to generate comparisons.",
    instruction=prompt.COMPARISON_AGENT_PROMPT,
)
comparison_critic_agent = Agent(
    model=constants.MODEL,
    name="comparison_critic_agent",
    description="A helpful agent to critique comparisons.",
    instruction=prompt.COMPARISON_CRITIC_AGENT_PROMPT,
)
comparison_root_agent = Agent(
    model=constants.MODEL,
    name="comparison_root_agent",
    description="A helpful agent to compare titles.",
    instruction="""
    You are a routing agent.
    1. Route to `comparison_generator_agent` to generate the comparison.
    2. Route to `comparison_critic_agent` to critique this comparison.
    3. Loop through these agents.
    4. Stop when the `comparison_critic_agent` is satisfied.
    5. Relay the comparison report to the user.
    """,
    sub_agents=[comparison_generator_agent, comparison_critic_agent],
)
State- or Context-Driven Delegation
- Function: Modifies attributes in the context object to influence the agent’s behavior after a tool completes execution.
- Mechanism:
- Use tool_context.actions.transfer_to_agent to instruct the ADK framework to pause the current agent's execution and transfer control of the conversation to a specified agent.
main_agent = Agent(
    model='gemini-2.0-flash',
    name='main_agent',
    instruction="""You are the first point of contact for customer support of an analytics tool. Answer general queries. If the user indicates urgency, use the 'check_and_transfer' tool.""",
    tools=[check_and_transfer],
    sub_agents=[support_agent]
)
def check_and_transfer(query: str, tool_context: ToolContext) -> str:
    """Checks if the query requires escalation and transfers to another agent if needed."""
    if "urgent" in query.lower():
        print("Tool: Detected urgency, transferring to the support agent.")
        tool_context.actions.transfer_to_agent = "support_agent"
        return "Transferring to the support agent..."
    else:
        return f"Processed query: '{query}'. No further action needed."

escalation_tool = FunctionTool(func=check_and_transfer)

support_agent = Agent(
    model='gemini-2.0-flash',
    name='support_agent',
    instruction="""You are the dedicated support agent. Mention that you are a support handler and help the user with their urgent issue."""
)
Multi-Agent Architecture
- Overview: ADK enables the combination of multiple agent types (e.g., LLM, Workflow, Custom) to build robust and complex agent systems.
- Common Patterns:
- Linear/Sequential Pattern:
- Executes tasks sequentially using a SequentialAgent.
- Tasks are performed one after another in a predefined order.
- Parallel Fan-Out/Gather Pattern:
- Utilizes ParallelAgent to run tasks concurrently across multiple agents.
- Results are collected and processed afterward, often using a SequentialAgent for coordination.
fetch_api1 = LlmAgent(name="API1Fetcher", instruction="Fetch data from API 1.", output_key="api1_data")
fetch_api2 = LlmAgent(name="API2Fetcher", instruction="Fetch data from API 2.", output_key="api2_data")
gather_concurrently = ParallelAgent(
    name="ConcurrentFetch",
    sub_agents=[fetch_api1, fetch_api2]
)
synthesizer = LlmAgent(
    name="Synthesizer",
    instruction="Combine results from state keys 'api1_data' and 'api2_data'."
)
overall_workflow = SequentialAgent(
    name="FetchAndSynthesize",
    sub_agents=[gather_concurrently, synthesizer]  # Run parallel fetch, then synthesize
)
- Hierarchical Task Decomposition
- Involves conditional task delegation in a tree-like structure.
- A high-level agent decomposes a complex goal into sub-tasks and delegates them to lower-level agents for execution.
# Leaf agents used as tools
web_searcher = LlmAgent(name="WebSearch", description="Performs web searches for facts.")
summarizer = LlmAgent(name="Summarizer", description="Summarizes text.")

# Mid-level agent combining tools
research_assistant = LlmAgent(
    name="ResearchAssistant",
    model="gemini-2.0-flash",
    description="Finds and summarizes information on a topic.",
    tools=[web_searcher, summarizer]
)

# High-level agent delegating research
report_writer = LlmAgent(
    name="ReportWriter",
    model="gemini-2.0-flash",
    instruction="Write a report on topic X. Use the tool ResearchAssistant to gather information.",
    sub_agents=[research_assistant]
)
Dealing with Data
- How to accept and generate structured data, and how to pass data between tools and agents.
Structured I/O
- Use Pydantic classes to enforce structured input and output for agents.
- input_schema: Ensures inputs conform to a specific Pydantic schema.
- output_schema: Ensures outputs follow a defined Pydantic schema.
- Note: When output_schema is used, agents cannot call tools or transfer control to other agents.
- Requirement: Instruct the LLM on the expected JSON format for inputs and outputs.
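The agent example that follows references CountryInput and CapitalInfoOutput without defining them. A plausible sketch of those Pydantic models (the field names and descriptions are assumptions, not from the source):

```python
from pydantic import BaseModel, Field

class CountryInput(BaseModel):
    """Input schema: what the agent expects from the user."""
    country: str = Field(description="The country to get information about.")

class CapitalInfoOutput(BaseModel):
    """Output schema: the exact JSON shape the agent must produce."""
    capital: str = Field(description="The capital city of the country.")
    population_estimate: str = Field(description="An estimated population of the capital city.")
```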
structured_info_agent_schema = LlmAgent(
    model=MODEL_NAME,
    name="structured_info_agent_schema",
    description="Provides capital and estimated population in a specific JSON format.",
    instruction=f"""You are an agent that provides country information.
The user will provide the country name in a JSON format like {{"country": "country_name"}}.
Respond ONLY with a JSON object matching this exact schema:
{json.dumps(CapitalInfoOutput.model_json_schema(), indent=2)}
Use your knowledge to determine the capital and estimate the population. Do not use any tools.
""",
    input_schema=CountryInput,
    output_schema=CapitalInfoOutput,  # Enforce JSON output structure
    output_key="structured_info_result",  # Store final JSON response
)
Every Combination
- Session State:
- Agents within the same invocation share a common session state (a dictionary storing data).
- Agent outputs are automatically stored under the output_key for convenience.
- Suitable for SequentialAgent and LoopAgent; ParallelAgent may encounter race conditions, requiring safer alternatives (e.g., locks or external state management).
- Agents can reference session state in their instructions using the {variable_name} syntax, which injects the value stored at state['variable_name'].
- Tools can access session state using ToolContext.
Agent to Agent
- LLM Delegation:
- In hierarchical setups, the coordinator agent uses instructions to pass the current session to an appropriate sub-agent.
- State Variables:
- Access and use state variables within the instruction using the {variable_name} syntax.
- Similar to output_key, other state variables are available for use.
booking_agent = LlmAgent(name="Booker", description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", description="Provides general information and answers questions.")
coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash",
    instruction="You are an assistant. Delegate booking tasks to Booker and info requests to Info.",
    description="Main coordinator.",
    # AutoFlow is typically used implicitly here
    sub_agents=[booking_agent, info_agent]
)
# Conceptual Example: Using output_key and reading state
from google.adk.agents import LlmAgent, SequentialAgent
agent_A = LlmAgent(name="AgentA", instruction="Find the capital of France.", output_key="capital_city")
agent_B = LlmAgent(name="AgentB", instruction="Tell me about the city stored in state key {capital_city}.")
pipeline = SequentialAgent(name="CityInfo", sub_agents=[agent_A, agent_B])
# AgentA runs, saves "Paris" to state['capital_city'].
# AgentB runs, its instruction processor reads state['capital_city'] to get "Paris".
Agent to Tool
- Session State Variables:
- Tools can access session state data (e.g., output_key) via ToolContext.
- Arguments:
- Agents can pass required data directly to tools as function arguments.
agent_A = LlmAgent(name="AgentA", instruction="Find the capital of France.", output_key="capital_city")

def tool(tool_context: ToolContext):
    city = tool_context.state.get("capital_city", "NA")
    ...

pipeline = SequentialAgent(name="CityInfo", sub_agents=[agent_A, tool])
agent_A = Agent(name="AgentA", instruction="Help the user find the capital city of a country using the tool 'get_capital_city'.", tools=[get_capital_city])

def get_capital_city(country: str) -> dict:
    """Finds the capital for a given country.

    Args:
        country (str): The name of the country.

    Returns:
        dict: A dictionary containing the capital city information.
            Includes a 'status' key ('success' or 'error').
            If 'success', includes a 'report' key with city details.
            If 'error', includes an 'error_message' key.
    """
    return capital_city_result_dict
Tool to Agent
- Session State Variables:
- Tools can set session state variables, which agents can access in their instructions.
- Direct Agent Calls:
- Tools can invoke agents like regular functions, passing queries or questions along with relevant data.
pipeline = SequentialAgent(name="CityInfo", sub_agents=[tool, agent_A])

def tool(tool_context: ToolContext):
    tool_context.state["capital_city"] = "New York"
    ...

agent_A = LlmAgent(name="AgentA", instruction="What interesting things can be visited in {capital_city}?")

async def tool(question: str, tool_context: ToolContext):
    agent_tool = AgentTool(agent=agent_name)
    agent_output = await agent_tool.run_async(
        args={"request": question}, tool_context=tool_context
    )
    tool_context.state["agent_output"] = agent_output
    return agent_output
Tool Context
- Overview: When using tools, ToolContext is automatically injected into the function, so it does not need to be documented in the docstring.
- Session State Access:
- The tool_context.state attribute provides read and write access to the session's state (a dictionary).
- Key Prefixes:
- user: Limits state to the current user.
- app: Shares state across all users of the application.
- temp: Restricts state to the current invocation only.
# In an Agent:
my_agent = SequentialAgent(..., sub_agents=[update_user_preference, agent2])

def update_user_preference(preference: str, value: str, tool_context: ToolContext):
    """Updates a user-specific preference."""
    user_prefs_key = "user:preferences"
    # Get current preferences or initialize if none exist
    preferences = tool_context.state.get(user_prefs_key, {})
    preferences[preference] = value
    # Write the updated dictionary back to the state
    tool_context.state[user_prefs_key] = preferences
    print(f"Tool: Updated user preference '{preference}' to '{value}'")
    return {"status": "success", "updated_preference": preference}
More on ToolContext
- Controlling agents with ToolContext:
- transfer_to_agent = "agent_name": Halts the current agent's execution and transfers control to the specified agent.
- escalate = True:
- In hierarchical structures, signals that the current agent cannot handle the request and forwards it to the parent agent.
- In a LoopAgent, terminates the loop.
- skip_summarization = True:
- By default, ADK summarizes tool outputs. Setting this to True bypasses summarization, returning the raw tool output.
tool_context.actions.transfer_to_agent = "support_agent"
- Other things you can do with ToolContext
- Authentication: Manage authentication processes within the session.
- Artifacts: Access and manage files (documents generated by agents/tools or uploaded by users) in the current session.
- Search Memory: Query the user’s long-term memory using the configured memory_service.
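A sketch of a tool using artifacts through ToolContext. The save_report function and the "report.txt" filename are hypothetical; the save_artifact method name follows the ADK docs (a real implementation would pass a types.Part rather than a raw string):

```python
def save_report(report_text: str, tool_context) -> dict:
    """Sketch: persist a generated report as a session artifact and note its version in state."""
    # save_artifact returns the saved version number per the ADK docs;
    # treat the exact signature as an assumption.
    version = tool_context.save_artifact("report.txt", report_text)
    tool_context.state["last_report_version"] = version
    return {"status": "success", "version": version}
```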
Callbacks
- Overview: Callbacks enable observation, customization, and control of agent behavior at specific points in the execution flow without modifying the core ADK framework.
- Execution Points for Interception:
- Agent Callbacks:
- before_agent_callback: Sets necessary states or resources before the agent runs.
- after_agent_callback: Reformats or sanitizes the LLM's response after generation.
- LLM Callbacks:
- before_model_callback: Inspects or modifies data sent to the LLM, useful for blocking harmful requests.
- after_model_callback: Processes or validates data returned from the LLM.
- Tool Callbacks:
- before_tool_callback: Validates arguments before tool execution (e.g., range checks or SQL injection prevention).
- after_tool_callback: Processes or modifies tool outputs.
- Use Cases:
- Debugging: Log detailed information at various execution steps for troubleshooting.
- Customization and Control: Modify data flowing through agents to tailor behavior.
- Guardrails: Implement security measures, such as filtering harmful inputs or enforcing safe outputs.
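A guardrail sketch for the callback mechanism. The word-check helper below is runnable; the callback wiring is shown only as comments, with the before_model_callback signature taken from the ADK docs (the blocked-word list and agent names are hypothetical):

```python
BLOCKED_TERMS = {"password", "ssn"}  # hypothetical deny-list

def contains_blocked_term(user_text: str) -> bool:
    """Plain guardrail check used by the callback sketched below."""
    lowered = user_text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

# Hedged wiring sketch (signatures per the ADK docs; details may differ):
#
# def before_model_guardrail(callback_context: CallbackContext,
#                            llm_request: LlmRequest) -> Optional[LlmResponse]:
#     last_text = llm_request.contents[-1].parts[0].text or ""
#     if contains_blocked_term(last_text):
#         # Returning a response here skips the LLM call entirely.
#         return LlmResponse(content=types.Content(
#             role="model",
#             parts=[types.Part(text="I can't help with that request.")]))
#     return None  # proceed with the normal LLM call
#
# guarded_agent = LlmAgent(..., before_model_callback=before_model_guardrail)
```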
How to Set Up, Configure, and Run Your Application
- Creating a new project
- Use the command adk create to generate a new project, or manually create a root agent following the standard ADK folder structure.
- Running your app
- After creating the root agent (via adk create app_name or manually), run the app in one of two ways:
- ADK-Powered (Simpler, Less Flexible):
- adk web: Launches a web interface.
- adk run <agent_name>: Runs the app in CLI mode.
- adk api_server: Starts a FastAPI-powered API server.
- Requires a root agent defined in agent.py.
- Custom Runner (More Flexible, More Code):
- Configure Session, Runner, and Events manually.
- Handle incoming requests and responses programmatically.
- Pass any agent (e.g., root or other available agents) to the runner options for custom execution.
- What are Sessions, State, Events and Memory
- Session:
- Represents the current conversation thread.
- Acts as a container for all data related to a specific chat thread, including identification, history, data, and activity tracking.
- Types:
- In-Memory Session: Resets when the app closes.
- Database-Backed Session: Persists across app restarts for future access.
- Session has State and Events
- State:
- A dictionary storing dynamic key-value pairs within the current conversation.
- Used to track:
- Personalization (e.g., conversation_mode: "playful", user_preferred_name: "Zack").
- Task progress (e.g., booking_step: "pending").
- Events:
- Immutable records capturing specific points in an agent’s execution (e.g., user messages, agent replies, tool requests, tool results, state changes, control signals, errors).
- The entire process, from user query to agent response, is managed through the generation, interpretation, and processing of Event objects.
- Memory:
- A searchable, cross-session knowledge base accessible to agents.
- Supports Retrieval-Augmented Generation (RAG) services, enabling agents to query information from past chats or other sources.
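The custom-runner path described above can be sketched as follows. All of it is shown as comments because it needs the ADK library at runtime; the class and method names (Runner, InMemorySessionService, run_async, is_final_response) follow the ADK docs, but treat exact signatures as assumptions, and weather_agent refers to the agent defined earlier in this chapter:

```python
# from google.adk.runners import Runner
# from google.adk.sessions import InMemorySessionService
# from google.genai import types
#
# session_service = InMemorySessionService()
# session = await session_service.create_session(
#     app_name="weather_app", user_id="user_1", session_id="session_1")
#
# runner = Runner(agent=weather_agent, app_name="weather_app",
#                 session_service=session_service)
#
# # Each yielded Event captures one step (tool call, state change, reply, ...).
# content = types.Content(role="user", parts=[types.Part(text="Weather in London?")])
# async for event in runner.run_async(user_id="user_1", session_id="session_1",
#                                     new_message=content):
#     if event.is_final_response():
#         print(event.content.parts[0].text)
```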