
Deep Dive into the Google Agent Development Kit (ADK): Features and Code Examples



In our previous overview, we introduced the Google Agent Development Kit (ADK) as a powerful Python framework for building sophisticated AI agents. Now, let's dive deeper into some of the specific features that make ADK a compelling choice for developers looking to create agents that can reason, plan, use tools, and interact effectively with the world.

1. The Core: Configuring the `LlmAgent`

The heart of most ADK applications is the LlmAgent (aliased as Agent for convenience). This agent uses a Large Language Model (LLM) for its core reasoning and decision-making. Configuring it effectively is key:

  • name (str): A unique identifier for your agent within the application.
  • model (str | BaseLlm): Specify the LLM to use. You can provide a model name string (like 'gemini-1.5-flash') or an instance of a model class (e.g., Gemini()). ADK resolves string names using its registry.
  • instruction (str | Callable): This is crucial for guiding the agent's behavior, personality, and task execution. It can be a simple string or a callable function that dynamically generates instructions based on the current context.
  • tools (list[Callable | BaseTool]): A list of capabilities you grant the agent. This can include Python functions or instances of BaseTool subclasses. More on this below!
  • generate_content_config (types.GenerateContentConfig): Fine-tune the LLM's generation parameters, such as temperature, top-p, safety settings, and stop sequences.

Code Sample: Basic Agent Configuration


from google.adk import Agent
from google.genai import types

# Define safety settings. BLOCK_LOW_AND_ABOVE is the strictest threshold;
# choose a more permissive one if your use case needs to discuss such topics.
safety_settings = [
    types.SafetySetting(
        # Use the enums from google.genai.types so they match GenerateContentConfig
        category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE  # Example: adjust as needed
    ),
]

# Configure generation parameters
gen_config = types.GenerateContentConfig(
    temperature=0.7,
    top_p=0.9,
    safety_settings=safety_settings
)

# Create the agent
simple_chatbot = Agent(
    name='friendly_explainer',
    model='gemini-1.5-flash',
    instruction='You are Bob, a friendly and knowledgeable assistant who explains complex topics simply. Always introduce yourself as Bob.',
    generate_content_config=gen_config
)

# This agent can now be used with a Runner

2. The Power of Tools: Extending Agent Capabilities

Agents become truly useful when they can interact with external systems. ADK's tooling system covers plain Python functions, code execution, retrieval, and API integration.

Effortless Function Tools

One of ADK's most convenient features is its ability to turn standard Python functions into tools automatically. Simply add the function to the agent's tools list. ADK inspects the function's signature (type hints) and docstring to generate the necessary schema (FunctionDeclaration) for the LLM to understand how and when to use it.

Code Sample: Function as a Tool


import random
from google.adk import Agent

# Define a standard Python function with type hints and a docstring
def get_weather(city: str) -> str:
    """Gets the current weather for a specified city."""
    # In a real scenario, this would call a weather API
    print(f"--> Checking weather for {city}...")
    conditions = ["Sunny", "Cloudy", "Rainy", "Windy"]
    temp = random.randint(5, 30)
    condition = random.choice(conditions)
    result = f"The weather in {city} is {condition} with a temperature of {temp}°C."
    print(f"--> Result: {result}")
    return result

# Create an agent and simply add the function to its tools list
weather_agent = Agent(
    name='weather_reporter',
    model='gemini-1.5-flash',
    instruction='You provide weather information when asked about a city.',
    tools=[get_weather] # ADK handles the rest!
)

# Now, if the user asks "What's the weather like in London?",
# the agent can call the get_weather function.
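
To make the schema-generation step more concrete, here is a rough, standard-library-only sketch of how a function's signature and docstring could be turned into a schema-like description. It is not ADK's actual implementation: describe_function, the type mapping, and the output layout are assumptions made for illustration, and the extra units parameter is added only to show how defaults affect the "required" list.

```python
import inspect
from typing import get_type_hints

# Rough mapping from Python annotations to JSON-schema-style type names (an assumption).
_TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(func) -> dict:
    """Builds a simplified, schema-like description of a Python function."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)
    properties = {}
    required = []
    for name, param in signature.parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _TYPE_NAMES.get(hints.get(name), "string")}
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str, units: str = "metric") -> str:
    """Gets the current weather for a specified city."""
    return f"Weather for {city} in {units} units"


schema = describe_function(get_weather)
print(schema)
```

The takeaway is practical: the clearer the type hints and docstring, the more the LLM has to go on when deciding whether and how to call the tool.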

Built-in and External Code Execution

Agents can also execute code:

  • Built-in Execution: For Gemini 2 models, you can use the built_in_code_execution tool. This leverages the model's internal capabilities without running code in your environment. Simply import and add it to the tools list.
  • External Executors: For more control or different environments, assign an instance of a BaseCodeExecutor subclass to the agent's code_executor parameter. Options include VertexAiCodeExecutor (runs code securely in a managed Vertex AI environment), ContainerCodeExecutor, or even an UnsafeLocalCodeExecutor (use with extreme caution). These executors handle parsing code blocks from the LLM response and returning the output.

Code Sample: Adding Code Execution Capabilities


from google.adk import Agent
from google.adk.tools import built_in_code_execution # For Gemini 2+
from google.adk.code_executors import VertexAiCodeExecutor # Example external executor

# Option 1: Using Built-in Execution (Gemini 2+)
analysis_agent_builtin = Agent(
    name='data_analyst_builtin',
    model='gemini-2.0-flash-001', # Requires a Gemini 2 model
    instruction='Analyze data using code execution when necessary.',
    tools=[built_in_code_execution]
)

# Option 2: Using an External Executor (e.g., Vertex AI)
# Requires setting up the Vertex AI Code Interpreter Extension
vertex_executor = VertexAiCodeExecutor()  # resource_name might be needed

analysis_agent_external = Agent(
    name='data_analyst_external',
    model='gemini-1.5-flash', # Can use other models
    instruction='Analyze data using the provided code executor.',
    code_executor=vertex_executor # Assign the executor instance
    # Note: If using an external executor, don't add built_in_code_execution tool
)

Retrieval-Augmented Generation (RAG)

Give your agents access to up-to-date or domain-specific knowledge using retrieval tools. ADK provides a BaseRetrievalTool and specific implementations like VertexAiRagRetrieval. These tools allow the agent to query knowledge bases (like those hosted on Vertex AI) and incorporate the retrieved information into their responses. For newer models like Gemini 2, adding the VertexAiRagRetrieval tool can leverage the model's built-in RAG capabilities directly.

Code Sample: Adding a RAG Tool


from google.adk import Agent
from google.adk.tools.retrieval import VertexAiRagRetrieval

# Configure the RAG tool to point to your Vertex AI RAG Corpus/Resources
# Replace with your actual resource names/corpora IDs
doc_retriever = VertexAiRagRetrieval(
    name='internal_doc_search',
    description='Searches the company knowledge base for policy documents.',
    # Example: Specify either rag_corpora or rag_resources
    rag_corpora=['projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID']
    # rag_resources=[vertexai.preview.rag.RagResource(...)]
)

policy_agent = Agent(
    name='policy_advisor',
    model='gemini-2.0-flash-001', # Gemini 2 preferred for built-in integration
    instruction='Answer questions about company policy using the internal document search tool.',
    tools=[doc_retriever]
)

API Integration

While not shown in detail here, ADK includes powerful tools for interacting with REST APIs defined by OpenAPI specifications or Google API Discovery Docs (found in modules like openapi_tool and google_api_tool). These tools can automatically parse API definitions, handle authentication (using ADK's `auth` components), and allow the agent to call external APIs.
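
As a rough illustration of what "parsing an API definition" involves, the sketch below walks a tiny OpenAPI-style document and lists one tool-like descriptor per operation. It uses only plain Python and does not call ADK's openapi_tool module; the spec, the list_operations helper, and the descriptor layout are all hypothetical.

```python
# A tiny hand-written OpenAPI fragment (hypothetical API, for illustration only).
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/pets": {
            "get": {"operationId": "listPets", "summary": "List all pets."},
            "post": {"operationId": "createPet", "summary": "Create a pet."},
        },
        "/pets/{petId}": {
            "get": {"operationId": "getPet", "summary": "Fetch one pet by ID."},
        },
    },
}

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(spec: dict) -> list[dict]:
    """Returns one tool-like descriptor per (path, method) operation."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items may also hold non-operation keys such as "parameters".
            if method not in HTTP_METHODS:
                continue
            operations.append({
                "name": operation.get("operationId", f"{method}_{path}"),
                "description": operation.get("summary", ""),
                "method": method.upper(),
                "path": path,
            })
    return operations


tools = list_operations(spec)
for tool in tools:
    print(tool["name"], tool["method"], tool["path"])
```

A real integration also has to resolve parameters, request bodies, and authentication, which is the part the ADK tooling is described as handling.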

3. Orchestrating Agent Workflows

Complex tasks often require multiple steps or different specialized agents. ADK provides container agents to manage these workflows:

  • SequentialAgent: Runs a list of sub-agents one after another, passing the context along. Useful for multi-step processes.
  • ParallelAgent: (Found in `adk/agents/parallel_agent.py`) Runs sub-agents concurrently (behavior might depend on the runner implementation).
  • LoopAgent: (Found in `adk/agents/loop_agent.py`) Allows for iterative processes based on conditions.

ADK also supports dynamic agent transfer, where an LLM agent can decide to hand off control to another agent (parent, peer, or sub-agent) based on the conversation, using mechanisms like the transfer_to_agent_tool.

Code Sample: Sequential Workflow


from google.adk.agents import Agent, SequentialAgent
from google.adk.tools import google_search  # ADK's built-in Google Search tool

# Define specialized sub-agents
researcher = Agent(
    name='web_researcher',
    model='gemini-1.5-flash',
    instruction='Find information on the web about a topic using search tools.',
    tools=[google_search]
)

summarizer = Agent(
    name='report_summarizer',
    model='gemini-1.5-flash',
    instruction='Summarize the provided text into a concise report.',
    # This agent might expect text input via session state or context
)

# Create a sequential workflow
workflow_manager = SequentialAgent(
    name='research_and_summarize_workflow',
    description='Finds information online and then summarizes it.',
    sub_agents=[researcher, summarizer]
)

# Running workflow_manager will first execute researcher, then summarizer.
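
The three container patterns (sequential, parallel, loop) can be pictured without ADK at all. The following is a toy asyncio sketch, not ADK code: the step functions stand in for sub-agents, the shared dict stands in for session state, and run_sequential, run_parallel, and run_loop are hypothetical helpers named for this example.

```python
import asyncio


# Hypothetical stand-ins for sub-agents: each step reads and updates a shared state dict.
async def research(state: dict) -> None:
    state["notes"] = f"notes about {state['topic']}"


async def summarize(state: dict) -> None:
    state["summary"] = state["notes"].upper()


async def count_words(state: dict) -> None:
    state["word_count"] = len(state["notes"].split())


async def refine(state: dict) -> None:
    state["draft_quality"] = state.get("draft_quality", 0) + 1


async def run_sequential(steps, state: dict) -> None:
    """Runs steps one after another; later steps see earlier results."""
    for step in steps:
        await step(state)


async def run_parallel(steps, state: dict) -> None:
    """Runs independent steps concurrently."""
    await asyncio.gather(*(step(state) for step in steps))


async def run_loop(steps, state: dict, should_stop, max_iterations: int) -> None:
    """Repeats the steps until the stop condition holds or the cap is reached."""
    for _ in range(max_iterations):
        await run_sequential(steps, state)
        if should_stop(state):
            break


async def main() -> dict:
    state = {"topic": "solar panels"}
    await run_sequential([research], state)
    await run_parallel([summarize, count_words], state)
    await run_loop([refine], state, lambda s: s["draft_quality"] >= 3, max_iterations=10)
    return state


final_state = asyncio.run(main())
print(final_state)
```

Parallel steps here write to different keys on purpose; concurrent steps that touch the same state need more care.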

4. Planning and Reasoning Strategies

For tasks requiring complex reasoning and planning, ADK offers Planners. The LlmAgent can be configured with a planner. One example is the PlanReActPlanner. This planner guides the LLM to first generate an explicit plan (marked with /*PLANNING*/), then execute steps (often involving tool calls marked with /*ACTION*/), interleave reasoning about the results (/*REASONING*/), and potentially replan (/*REPLANNING*/) if needed, before producing the /*FINAL_ANSWER*/. This structured approach makes the agent's thought process more transparent and controllable.
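
One practical consequence of the tagged format is that a response can be split into its stages for logging or inspection. The sketch below does this with a regular expression over a made-up response string; split_tagged_response and the sample text are hypothetical and are not part of ADK.

```python
import re

# Tag names as described above; each appears in the text as /*TAG*/.
TAGS = ["PLANNING", "REPLANNING", "REASONING", "ACTION", "FINAL_ANSWER"]
TAG_PATTERN = re.compile(r"/\*(" + "|".join(TAGS) + r")\*/")


def split_tagged_response(text: str) -> list[tuple[str, str]]:
    """Splits tagged text into (tag, content) pairs in order of appearance."""
    # With one capture group, re.split yields: [preamble, tag, content, tag, content, ...]
    pieces = TAG_PATTERN.split(text)
    return [(pieces[i], pieces[i + 1].strip()) for i in range(1, len(pieces) - 1, 2)]


# A made-up response used only to exercise the parser.
sample = (
    "/*PLANNING*/ 1. Look up the weather. 2. Report it. "
    "/*ACTION*/ get_weather(city='London') "
    "/*REASONING*/ The tool returned a forecast. "
    "/*FINAL_ANSWER*/ It is sunny in London."
)

sections = split_tagged_response(sample)
for tag, content in sections:
    print(f"{tag}: {content}")
```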

5. Managing State, Memory, and Artifacts

Agents need context. ADK manages this through several service abstractions, often used implicitly by a Runner:

  • Session Service (`BaseSessionService`): Manages the history of events (user messages, agent responses, tool calls) and the current key-value state within a single conversation (session). Implementations like InMemorySessionService are provided.
  • Memory Service (`BaseMemoryService`): Allows for longer-term persistence and retrieval of information, potentially across sessions. You might use this to recall past interactions or facts.
  • Artifact Service (`BaseArtifactService`): Handles the storage and retrieval of binary data or files (like images, PDFs, CSVs) associated with a session. Implementations for in-memory and Google Cloud Storage (`GcsArtifactService`) exist.

The InMemoryRunner conveniently bundles in-memory versions of these services for easy local development and testing.

Code Sample: Using the InMemoryRunner


from google.adk import Agent
from google.adk.runners import InMemoryRunner
from google.genai import types
import uuid # For generating unique IDs

# A minimal agent to run
my_agent = Agent(name='test_agent', model='gemini-1.5-flash', instruction='Be helpful.')

# InMemoryRunner handles session, memory, artifact services internally
runner = InMemoryRunner(agent=my_agent, app_name='MyTestApp')

# Prepare input
user_input = "Hello ADK!"
message = types.Content(role='user', parts=[types.Part(text=user_input)])
user_id = str(uuid.uuid4())

# The session should exist before the run, so create it via the runner's session service.
# Note: in ADK releases where create_session is a coroutine, await it instead.
session = runner.session_service.create_session(app_name='MyTestApp', user_id=user_id)

# Run and process events
print(f"User: {user_input}")
for event in runner.run(user_id=user_id, session_id=session.id, new_message=message):
    if event.is_final_response() and event.content and event.content.parts:
        response_text = ''.join(part.text for part in event.content.parts if part.text)
        print(f"Agent: {response_text}")
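
To picture what a session service keeps track of, here is a toy in-memory store written with the standard library only. It is a simplified sketch of the idea (an ordered event history plus a key-value state per session), not ADK's InMemorySessionService; ToySession, ToySessionStore, and their methods are hypothetical names.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ToySession:
    app_name: str
    user_id: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    events: list = field(default_factory=list)  # ordered conversation history
    state: dict = field(default_factory=dict)   # key-value state for this session


class ToySessionStore:
    """Keeps sessions in a dict keyed by (app_name, user_id, session_id)."""

    def __init__(self) -> None:
        self._sessions = {}

    def create_session(self, app_name: str, user_id: str) -> ToySession:
        session = ToySession(app_name=app_name, user_id=user_id)
        self._sessions[(app_name, user_id, session.id)] = session
        return session

    def get_session(self, app_name: str, user_id: str, session_id: str):
        # Returns None when no such session exists.
        return self._sessions.get((app_name, user_id, session_id))

    def append_event(self, session: ToySession, author: str, text: str, state_delta=None) -> None:
        # Record the event and apply any state changes it carries.
        session.events.append({"author": author, "text": text})
        session.state.update(state_delta or {})


store = ToySessionStore()
session = store.create_session("MyTestApp", "user-1")
store.append_event(session, "user", "Hello ADK!")
store.append_event(session, "agent", "Hi there!", state_delta={"greeted": True})

fetched = store.get_session("MyTestApp", "user-1", session.id)
print(len(fetched.events), fetched.state)
```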

6. Advanced Customization: Callbacks

ADK provides numerous callback points within the agent lifecycle (e.g., before_model_callback, after_tool_callback, before_agent_callback) allowing developers to inspect, modify, or even intercept requests and responses at various stages. This enables fine-grained control and integration with custom logging, monitoring, or logic.
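
The "inspect, modify, or intercept" idea can be shown with a toy pipeline. In this sketch a before-model hook may return a response to skip the model call entirely, and an after-model hook may rewrite the response. The function names and signatures are simplified assumptions for illustration; ADK's real callbacks receive richer context and request/response objects.

```python
from typing import Callable, Optional


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call."""
    return f"model answer to: {prompt}"


def call_model(
    prompt: str,
    before_model: Optional[Callable[[str], Optional[str]]] = None,
    after_model: Optional[Callable[[str], str]] = None,
) -> str:
    # A before-hook that returns a value intercepts the request: the model is never called.
    if before_model is not None:
        early_response = before_model(prompt)
        if early_response is not None:
            return early_response
    response = fake_model(prompt)
    # An after-hook can inspect or rewrite the response before it is returned.
    if after_model is not None:
        response = after_model(response)
    return response


def block_secrets(prompt: str) -> Optional[str]:
    """Hypothetical guardrail: refuse prompts that mention passwords."""
    if "password" in prompt.lower():
        return "Sorry, I can't help with credentials."
    return None


def add_footer(response: str) -> str:
    """Hypothetical post-processing step, e.g. for audit tagging."""
    return response + " [logged]"


print(call_model("What is my password?", before_model=block_secrets))
print(call_model("Hi", before_model=block_secrets, after_model=add_footer))
```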

Conclusion

The Google Agent Development Kit (ADK) offers a rich set of features designed to streamline the creation of powerful and sophisticated AI agents. From its flexible tooling system and code execution capabilities to agent orchestration, planning strategies, and state management, ADK provides the components needed to build agents that can tackle complex, real-world tasks. By understanding and leveraging these features, developers can significantly accelerate the development of next-generation AI applications.
