Google Audio Indexing

GAudi is short for Google Audio Indexing, and it’s the latest experiment to be added to Google Labs. The service lets you run a text search for words spoken in videos from YouTube’s US politicians’ channels. The functionality was previously available as part of an iGoogle gadget.

Will Google ever manage to apply this technology to all YouTube videos, and perhaps even to videos hosted on other upload platforms rather than YouTube? Google writes in its FAQ that the aim of GAudi “on Google Labs is broader and the US election is just a first step. We see it as an experiment platform where we can learn what features make the best user experience for people looking for spoken content on the Web.”
