Agents Don't Just Chat. They Act. Here's How I Built a Multi-Agent AI System with Google ADK and Shipped It Live.

A model waits for a question. An agent goes looking for answers.

That distinction sounds small. It changes everything about how you build.

I built a shopping assistant where three AI agents search Google Shopping, Reddit, and YouTube at the exact same moment. A fourth reads everything they found and writes one ranked recommendation with clickable links. The whole system runs on Google Cloud. An Android app calls it live. This is how I built it, deployed it, and wired it to a mobile app.


What is an AI Agent?

When you ask a language model something, it draws on what it was trained on and gives you a response. That is useful. But it is limited. The model cannot check today's prices. It cannot read a Reddit thread from this morning. It cannot search YouTube for recent reviews.

An agent is a model that has been given tools. Tools are exactly what they sound like: functions the agent can call. Search the web. Query a database. Call an API. The agent decides which tool to use, runs it, reads the result, and then decides what to do next.
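
To make that loop concrete, here is a minimal sketch in plain Python. It is not ADK code. `choose_action` is a hypothetical stand-in for the model's decision, and both tools return canned data instead of calling real services. The shape is the point: decide, call a tool, read the result, decide again.

```python
from typing import Callable

# Hypothetical tools with canned data; real ones would call live APIs.
def search_prices(product: str) -> str:
    return f"{product}: $299 at ExampleStore"

def search_reviews(product: str) -> str:
    return f"{product}: reviewers praise the battery life"

# The tool registry: names the "model" can pick from, mapped to functions.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_prices": search_prices,
    "search_reviews": search_reviews,
}

def choose_action(observations: list[str]) -> str:
    """Stand-in for the model: pick the next tool name, or 'finish'."""
    if len(observations) == 0:
        return "search_prices"
    if len(observations) == 1:
        return "search_reviews"
    return "finish"

def run_agent(product: str, max_steps: int = 5) -> list[str]:
    """Decide, act, observe, repeat until the 'model' says it is done."""
    observations: list[str] = []
    for _ in range(max_steps):
        action = choose_action(observations)
        if action == "finish":
            break
        observations.append(TOOLS[action](product))
    return observations
```

In a real agent, the model replaces `choose_action` and sees the tool results as part of its context. The `max_steps` cap is a common safeguard so a confused model cannot loop forever.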

Think of it like the difference between asking a friend for their opinion and hiring a researcher. The friend gives you what they know off the top of their head. The researcher goes and finds the actual answer.

The Agent Landscape Today

A few frameworks have emerged for building these systems, and they are worth knowing:

AWS Bedrock Agents: Amazon's managed agent platform. Connect Claude or Titan models to tools, databases, and Lambda functions without managing infrastructure yourself.
LangGraph: Open-source from LangChain. Build agent workflows as a graph where each node is a step. Good for complex branching and loops.
OpenAI Agents SDK: OpenAI's own framework for building agents with GPT models. Supports handoffs between multiple agents in a pipeline.
AutoGen: Microsoft's open-source framework. Agents communicate with each other in conversation threads. Widely used in research.
Google ADK (Agent Development Kit): Google's open-source framework. This is what I used.

ADK has several agent types, including LoopAgent for repeating steps until a condition is met, and custom agents (built by subclassing BaseAgent) for full control over the flow. But the two that matter for this project are SequentialAgent and ParallelAgent. They solve the two core problems in any multi-agent design.

Google ADK: Two Agent Types That Do the Work

SequentialAgent solves the "do this, then do that" problem. It runs its sub-agents one after another, in strict order, waiting for each to finish before starting the next. Like a relay race: the baton passes only when the runner crosses the line.

ParallelAgent solves the "do all of this at once" problem. It starts all its sub-agents concurrently, and they run independently of one another. The ParallelAgent finishes only when every sub-agent is done, at which point all their results are available. Like asking three people to check three different stores at the same time, instead of visiting each store yourself, one after another.

For this project, I combined both. A ParallelAgent fires three research agents simultaneously. Each one searches a different source. Once all three return, a SequentialAgent hands their combined output to a Synthesizer that writes the final recommendation.

Three sources searched simultaneously. One coherent recommendation at the end. The total wait time is roughly the same as a single search, not three stacked on top of each other.
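
The timing claim is easy to check with a small sketch. This is plain asyncio, not ADK: `research` and `synthesize` are hypothetical stand-ins, and the delays are made-up numbers that simulate slow searches. It shows the same fan-out then fan-in shape as the ParallelAgent and SequentialAgent combination described above.

```python
import asyncio
import time

async def research(source: str, delay: float) -> str:
    """Hypothetical research agent: sleeps to simulate a slow search."""
    await asyncio.sleep(delay)
    return f"findings from {source}"

def synthesize(findings: list[str]) -> str:
    """Hypothetical synthesizer: a real one would prompt a model."""
    return "Recommendation based on: " + "; ".join(findings)

async def run_pipeline() -> tuple[str, float]:
    start = time.perf_counter()
    # Fan out: start all three searches concurrently and wait for all of them.
    findings = await asyncio.gather(
        research("Google Shopping", 0.3),
        research("Reddit", 0.2),
        research("YouTube", 0.1),
    )
    # Fan in: synthesis runs only after every search has returned.
    report = synthesize(list(findings))
    return report, time.perf_counter() - start

if __name__ == "__main__":
    report, elapsed = asyncio.run(run_pipeline())
    print(report)
    print(f"elapsed: {elapsed:.2f}s")
```

The elapsed time should land near the slowest search (0.3 seconds here) rather than the 0.6 second sum, because the waits overlap.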

Deploying to Vertex AI Agent Engine

Running this on your laptop is the easy part. Making it available to a mobile app, at any time, from anywhere in the world, is the real problem.

Google provides Vertex AI Agent Engine for this. It is a managed hosting service for ADK agents. You point it at your code, it handles all the infrastructure, and it gives you a live endpoint your apps can call.

Deployment is a single command:

adk deploy agent_engine \
    --project=your-gcp-project-id \
    --region=us-central1 \
    --display_name="Personal Shopper Agent" \
    .

It takes about five to ten minutes. Google Cloud packages your code, builds a container in the cloud, and starts the agent running on its infrastructure. When it finishes, you get a Resource ID. That is your live agent.
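
As a rough illustration of what that Resource ID is for, the sketch below assembles the REST URL and JSON body for a streaming query. The `reasoningEngines` resource path and the `:streamQuery` method reflect my understanding of the Vertex AI REST API. The payload field names (`class_method`, `input`, `user_id`, `message`) and every value passed in are assumptions to verify against the current Agent Engine documentation. The function only builds the request; nothing is sent over the network.

```python
import json

def build_stream_query(project: str, location: str, resource_id: str,
                       user_id: str, message: str) -> tuple[str, bytes]:
    """Return (url, body) for a streaming query. Sends nothing."""
    # Full resource name of the deployed agent.
    resource = (
        f"projects/{project}/locations/{location}"
        f"/reasoningEngines/{resource_id}"
    )
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"{resource}:streamQuery?alt=sse"
    )
    payload = {
        "class_method": "stream_query",  # assumed method name; check the docs
        "input": {"user_id": user_id, "message": message},
    }
    return url, json.dumps(payload).encode("utf-8")
```

A caller would POST the body to the URL with an `Authorization: Bearer <token>` header, which is exactly the part a mobile app should not handle on its own.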

Calling It from an Android App

There is no single right way to connect a mobile app to a backend AI system. You could use Firebase Cloud Functions as a lightweight intermediary. You could put an API Gateway in front of Vertex AI. You could build a dedicated backend service with your own auth layer. Each approach has tradeoffs around security, latency, and complexity.

The approach I went with is a proxy. Here is why.

Vertex AI requires a Google Cloud access token with every request. These tokens are short-lived, expiring after about an hour by default, so the app would need a longer-lived credential, such as a service account key, to keep minting new ones. You cannot safely ship that inside an Android app binary, because anyone can decompile the app and extract it. A proxy solves this cleanly: a small Python server sits between the Android app and Vertex AI, handling all authentication on the app's behalf.

The proxy is packaged as a Docker container. A Docker container is like a lunchbox for your code: it seals your Python server, all its libraries, and everything it needs to run into one self-contained unit. Google Cloud Run takes that container image, runs it as a serverless service, and gives it a public URL.

Because the proxy runs on Cloud Run, it runs as a Google service account and can obtain access tokens automatically, with no key file to manage. It handles all the authentication on behalf of the Android app. The app never touches a credential. It just calls the proxy URL and gets a response back.
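
One way a proxy can deal with hourly expiry is to cache the token and refresh it shortly before it runs out. The sketch below is an assumption about how that could look, not the code from my repo. `TokenCache` and its parameters are hypothetical names. The metadata server URL and the `Metadata-Flavor` header are the standard way to ask for the service account's token from inside Google Cloud, and that call only works there.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)

def fetch_token_from_metadata_server() -> tuple[str, float]:
    """Ask the metadata server for a token. Only works on Google Cloud."""
    request = urllib.request.Request(
        METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        data = json.load(response)
    return data["access_token"], float(data["expires_in"])

class TokenCache:
    """Reuse a token until shortly before it expires, then refresh it."""

    def __init__(
        self,
        fetch: Callable[[], tuple[str, float]] = fetch_token_from_metadata_server,
        clock: Callable[[], float] = time.monotonic,
        refresh_margin: float = 60.0,  # seconds of safety before expiry
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._margin = refresh_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            # No token yet, or the current one is about to expire.
            self._token, lifetime = self._fetch()
            self._expires_at = now + lifetime
        return self._token
```

The proxy would call `cache.get()` on each incoming request and forward it to Vertex AI with an `Authorization: Bearer <token>` header. The fetch function and the clock are injectable so the refresh logic can be tested off Google Cloud.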

The full project walkthrough, including the Android app, the proxy code, and the agent architecture, is on the project page here.

See the Code

If you want to look at the Android code, the Cloud Run proxy, or the ADK agent system, it is all in one repo:

Personal Shopper on GitHub

Multi-agent systems changed how I think about building software. Most systems are a chain: one thing calls the next. A multi-agent system is a team: you assign work, everyone operates independently, and you collect what they found. The mental model shift is bigger than the technical shift.

Frameworks like Google ADK, AWS Bedrock Agents, and LangGraph are making this pattern genuinely accessible. You are not building the orchestration yourself. The hard part now is designing the right agents and prompting them with precision.

The gap between "running on my laptop" and "live on Google Cloud, called from an Android app" is smaller than it looks. And closing that gap is where things get interesting.