Skip to main content

Posts

🎤 SpeakWise: Build an AI Public Speaking Coach with Gemma 3n

SpeakWise: Build an AI Public Speaking Coach with Gemma 3n Project by: Rabimba Showcase for: Google Developer Expert (GDE) AI Sprint Imagine having a private, real-time AI coach that watches your presentations, listens to your speech, analyzes your slides and gestures, and provides actionable feedback to help you improve confidently. Using Google’s new Gemma 3n, we built exactly that: SpeakWise , an AI-powered public speaking coach that leverages multimodal understanding to transcribe, analyze, and critique your talks—all while keeping your data private. Github Code. 🚀 Why Gemma 3n? Gemma 3n is Google’s open multimodal model designed for on-device, privacy-preserving, real-time AI applications . It is uniquely capable of: 📡 Simultaneously processing audio, image, and text, forming a holistic understanding of your talk. 🗂️ Following advanced instructions (“Act as a world-class presentation coach” and structuring output into clear, actionable insights). ...
Recent posts

From Models to Agents: Shipping Enterprise AI Faster with Google’s MCP Toolbox & Agent Development Kit

This article is an expanded write-up of the talk I recently delivered as a Google Developer Expert during a talk in Denver. The full slide deck is embedded below for easy reference. Why another “agent framework”? Large-language models (LLMs) are superb at generating prose, but production-grade systems need agents that can reason, plan, call tools, and respect enterprise guard-rails. Traditionally, that means: Hand-rolling connectors to databases & APIs Adding authentication, rate-limits, and connection pools Patching in tracing & metrics later Hoping your YAML jungle survives the next refactor Google’s new duo— MCP Toolbox and the Agent Development Kit (ADK) —eliminates that toil so you can treat agent development like ordinary software engineering. MCP Toolbox in one minute ⏳ What Why it matters Open-source MCP server Implements the emerging Model Context Protocol ; any compliant age...

Deep Dive into the Google Agent Development Kit (ADK): Features and Code Examples

In our previous overview, we introduced the Google Agent Development Kit (ADK) as a powerful Python framework for building sophisticated AI agents. Now, let's dive deeper into some of the specific features that make ADK a compelling choice for developers looking to create agents that can reason, plan, use tools, and interact effectively with the world. 1. The Core: Configuring the `LlmAgent` The heart of most ADK applications is the LlmAgent (aliased as Agent for convenience). This agent uses a Large Language Model (LLM) for its core reasoning and decision-making. Configuring it effectively is key: name (str): A unique identifier for your agent within the application. model (str | BaseLlm): Specify the LLM to use. You can provide a model name string (like 'gemini-1.5-flash') or an instance of a model class (e.g., Gemini() ). ADK resolves string names using its registry. instruction (str | Callable): This is crucial for guiding the agent's be...

Build Smarter AI Agents Faster: Introducing the Google Agent Development Kit (ADK)

The world is buzzing about AI agents – intelligent entities that can understand goals, make plans, use tools, and interact with the world to get things done. But building truly capable agents that go beyond simple chatbots can be complex. You need to handle Large Language Model (LLM) interactions, manage conversation state, give the agent access to tools (like APIs or code execution), orchestrate complex workflows, and much more. Introducing the Google Agent Development Kit (ADK) , a comprehensive Python framework from Google designed to significantly simplify the process of building, testing, deploying, and managing sophisticated AI agents. Whether you're building a customer service assistant that interacts with your internal APIs, a research agent that can browse the web and summarize findings, or a home automation hub, ADK provides the building blocks you need. Core Concepts: What Makes ADK Tick? ADK is built around several key concepts that make agent development more s...

My Google I/O 2024 Adventure: A GDE's Front-Row Seat to the Gemini Era

Hey tech enthusiasts! Rabimba Karanjai here, your friendly neighborhood Google Developer Expert (GDE), back from an exhilarating whirlwind tour of Google I/O 2024. Let me tell you, this wasn't just your average tech conference – it was an AI-infused extravaganza that left me utterly mind-blown! And you know what made it even sweeter? I had front-row seats, baby! Huge shoutout to the GDE program for this incredible opportunity. Feeling grateful and a tad spoiled, I must admit. 😉 Gemini: The AI Marvel That's Stealing the Show Now, let's dive into the star of the show: Gemini . This ain't your grandpa's AI model – it's the multimodal powerhouse that's set to redefine how we interact with technology. Imagine an AI that doesn't just understand text, but images, videos, code, and even your wacky doodles. Yep, that's Gemini for you! Google's been cooking up this AI masterpiece, and boy, did they deliver! The keynote demo had us all gawk...

MovieBuff: Dive Deeper into Movies with Generative AI

MovieBuff: Dive Deeper into Movies Before You Watch MovieBuff: Dive Deeper into Movies Before You Watch Have you ever spent two hours watching a movie only to be disappointed? MovieBuff is here to help! This Streamlit application leverages the power of Google's Generative AI, specifically the Gemini-Pro model, to provide you with detailed information about movies and TV series before you invest your precious time. Motivation Choosing a movie can be overwhelming. With countless options available, it's hard to know which ones are worth watching. MovieBuff aims to solve this problem by offering a quick and easy way to explore movies based on your interests. How it Works MovieBuff is incredibly user-friendly. You can either: Enter the movie title and year: Simply type the name of the movie you're interested in, and MovieBuff will fetch relevant information like plot summaries, directors, genres, themes, main conflicts, settings, character descriptions, tr...

Curious case of Cisco AnyConnect and WSL2

One thing Covid has taught me is the importance of VPN. Also one other thing COVID has taught me while I work from home  is that your Windows Machine can be brilliant  as long as you have WSL2 configured in it. So imagine my dismay when I realized I cannot access my University resources while being inside the University provided VPN client. Both of the institutions I have affiliation with, requires me to use VPN software which messes up WSL2 configuration (which of course I realized at 1:30 AM). Don't get me wrong, I have faced this multiple times last two years (when I was stuck in India), and mostly I have been lazy and bypassed the actual problem by side-stepping with my not-so-noble  alternatives, which mostly include one of the following: Connect to a physical machine exposed to the internet and do an ssh tunnel from there (not so reliable since this is my actual box sitting at lab desk, also not secure enough) Create a poor man's socks proxy in that same box to have...