Welcome to episode 338 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, Matt, and Jonathan are in the studio today to bring you all the latest in cloud and AI news, including a bit of a buying spree (including whole power companies), Veo 3.1, Cowork, and more – today in the cloud!
Titles we almost went with this week
- ❄️ Snowflake’s Ironic Timing: Buying Downtime Prevention Tool While Experiencing Downtime
- 🤪 Flexera Buys ProsperOps and Chaos Genius, Promises Less Chaos and More Prosperity
- 🛒 Flexera Goes Shopping: Two FinOps Acquisitions to Prosper and Reduce Chaos
- 🪙 Token of Appreciation: Gemini CLI Now Tracks Every Penny of Your AI Spend
- 🌨️ Snowflake Buys Observe to Stop Its Own Services from Melting Down
- 📱 Google’s Veo 3.1 Goes Vertical: Finally Understanding How People Actually Hold Their Phones
- 🪫 Alphabet’s New Power Move: Buying the Company That Literally Powers Data Centers
- 🔍 Dashboard Confessional: Gemini CLI Gets Transparent About Its Usage
- 🥸 Microsoft’s New Agent Works 24/7 and Never Asks for a Raise
- 🤖From Robot Vacuums That Climb Stairs to TVs You Can’t Feel: CES Gets Weird
- 🛍️ Agent Shopping: When Your AI Has Better Taste Than You Do
- 👎 The Cloud Pod hosts do not like any stories this week
- 🛌 AWS took a nap on announcements this week
- 🧑💼 Claude is my new co-worker
- 📰 Wake up, AWS, and give us some fun news
- 🖥️ The $200 Assistant: Is Cowork the End of Workplace Admins?
- 😦 Azure has more interesting announcements than AWS oh noooo
- 👜 If you can’t beat them in AI, just acquire everyone
- 📊 Notebook LM turns the Data Tables on you
AI Is Going Great – Or How ML Makes Money
01:11 Anthropic launches Cowork, a Claude Code-like for general computing – Ars Technica
- Anthropic launches Cowork, a new feature in the macOS Claude desktop app that extends Claude Code’s agentic capabilities to general office work tasks.
- Users can grant Claude access to specific folders and use plain language instructions to automate tasks like filling expense reports from receipt photos, writing reports from notes, or reorganizing files.
- Cowork lowers the technical barrier compared to Claude Code by making AI-assisted file operations accessible to non-developer knowledge workers, including marketers and office staff.
- The feature was developed after Anthropic observed users already applying Claude Code to general knowledge work despite its developer-focused positioning.
- The tool provides similar functionality to what was possible through Model Context Protocol integrations, but offers a more streamlined interface with Claude Code-style usability improvements.
- Users can submit new requests or modifications to ongoing tasks without waiting for the initial assignment to complete.
- Cowork represents a strategic expansion of Anthropic’s agentic AI approach beyond software development into broader productivity workflows. The feature demonstrates how AI agents with file system access can automate routine knowledge work tasks that previously required manual processing of documents and data.
02:15 📢 Ryan – “This week is the first time I actually tried to use AI to generate a PowerPoint presentation. It did not go well. It did generate some cool images, though.”
07:42 Enhanced Veo 3.1 capabilities are now available in the Gemini API.
- Google has released Veo 3.1 updates in the Gemini API and Google AI Studio, adding enhanced Ingredients to Video capabilities that maintain character identity and background consistency across generated videos.
- The model now supports native 9:16 vertical format generation optimized for mobile-first applications, eliminating the need to crop from landscape orientation.
- The updated model delivers professional-grade output with new 4K resolution support and improved 1080p quality using state-of-the-art enhancement techniques. All generated videos include SynthID digital watermarking for content provenance tracking.
- These capabilities are available today through the Gemini API for developers and Vertex AI for enterprise customers. Google AI Studio provides a demo app for testing the new features at ai.studio/apps/bundled/veo_studio.
- The vertical video format addresses the growing demand for social media content creation, while the 4K output positions Veo 3.1 for professional video production workflows. The character consistency improvements reduce the need for manual editing and post-processing in multi-shot video projects.
08:20 📢 Justin – “Don’t make the same mistakes that I do, and go try this and then get a $35 bill, which I did the first time I tried Veo out. So, do be cautious with this one!”
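For anyone who wants to try the new capabilities without repeating Justin’s $35 surprise, here is a minimal sketch of what a Veo 3.1 request might look like. The model name, SDK call, and config field values are assumptions based on the google-genai SDK’s video generation API; check the current docs and pricing before running, since video generation is billed per request.

```python
# Build a generation config for a vertical 9:16 Veo 3.1 request.
# Field names/values below are assumptions -- verify against the
# current Gemini API docs before spending real money.

def build_veo_config(vertical: bool = True, resolution: str = "1080p") -> dict:
    """9:16 is the new native vertical format; 16:9 remains the
    landscape default. Resolution could be raised to "4k" (assumed
    value) now that 4K output is supported."""
    return {
        "aspect_ratio": "9:16" if vertical else "16:9",
        "resolution": resolution,
    }

config = build_veo_config(vertical=True)

# The actual call would look roughly like this (names are assumptions):
#   from google import genai
#   client = genai.Client()  # reads the API key from the environment
#   op = client.models.generate_videos(
#       model="veo-3.1-generate-preview",
#       prompt="A barista pours latte art, vertical phone-style shot",
#       config=config,
#   )
#   # Video generation is a long-running operation; poll op until done.

print(config)
```

Keeping the config in one place makes it easy to confirm the aspect ratio and resolution (and therefore the price tier) before the request ever leaves your machine.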
11:08 Snowflake Announces Intent to Acquire Observe to Deliver AI-Powered Observability
- Snowflake is acquiring Observe to integrate AI-powered observability directly into its data platform, allowing customers to analyze telemetry data like logs, metrics, and traces alongside their business data.
- This consolidation eliminates the need for separate observability tools and reduces data movement between systems.
- The acquisition addresses the growing challenge of managing observability data at scale, which has become increasingly expensive and complex as organizations generate massive volumes of telemetry information.
- Observe’s approach stores data in a structured format that enables more efficient querying and analysis compared to traditional observability platforms.
- By bringing observability into Snowflake’s platform, customers can correlate operational metrics with business outcomes using the same SQL-based tools they already use for analytics.
- This unified approach should help teams identify how application performance issues directly impact revenue, customer experience, and other business metrics.
- The deal positions Snowflake to compete more directly with observability vendors like Datadog, Splunk, and New Relic by offering native capabilities rather than requiring third-party integrations.
- Organizations already using Snowflake for data warehousing can now consolidate their observability spend and simplify their tool stack.
12:08 📢 Ryan – “I don’t know how to feel about this; I feel like Snowflake is a part of an application, but it’s not the entirety of an application. I definitely see a use for this for data warehousing and visualizing, but I don’t think it replaces your traditional observability tools because you have too many data sources that are outside of Snowflake.”
Cloud Tools
13:58 Flexera acquires ProsperOps and Chaos Genius to expand its FinOps solution with agentic and AI-enabled cost optimization
- Flexera acquires two FinOps companies to add autonomous, AI-driven cost optimization across major cloud platforms and data analytics services: ProsperOps brings automated commitment management for AWS, Azure, and Google Cloud with over $6B in annual cloud usage under management, while Chaos Genius focuses specifically on Snowflake and Databricks optimization with reported cost reductions of up to 30%.
- The acquisitions shift Flexera’s FinOps approach from passive recommendations to active autonomous execution through agentic AI.
- This means the platform can automatically purchase and manage cloud commitments and optimize data workloads without requiring manual human intervention, addressing the challenge of dynamic cloud usage patterns that don’t align well with static commitment purchases.
- ProsperOps will continue operating as a separate brand while integrating with Flexera’s existing FinOps capabilities. The company was growing at over 90% and has generated more than $3 billion in lifetime savings for customers, suggesting strong market demand for automated rate optimization solutions.
- The Chaos Genius acquisition specifically targets the emerging problem of runaway costs in data analytics platforms like Snowflake and Databricks as AI workloads scale.
- This addresses a gap in traditional FinOps tools that primarily focused on compute and storage optimization but lacked specialized capabilities for modern data cloud platforms.
- These moves position Flexera to cover the complete FinOps Framework defined by the FinOps Foundation, combining cost visibility, workload optimization, and rate optimization in a single platform.
- This matters for enterprises struggling to manage costs across an increasingly complex mix of traditional cloud services, AI infrastructure, and specialized data platforms.
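The core problem an autonomous commitment manager is solving can be shown with a toy model: committed dollars are paid whether used or not, so with spiky usage there is a sweet spot, and over-committing actually costs more than pure on-demand. All numbers here (the 28% discount, the usage pattern) are hypothetical, chosen only to illustrate the shape of the curve and why Matt’s “guardrails” point matters.

```python
# Toy model of static commitments vs. dynamic cloud usage -- the gap
# agentic commitment management tries to close. Numbers are hypothetical.

def blended_cost(hourly_usage, commitment, on_demand_rate=1.0, discount=0.28):
    """Cost of covering `hourly_usage` (units of on-demand-equivalent
    spend per hour) with a fixed hourly `commitment` bought at a
    discount. Committed capacity is paid for whether used or not;
    overflow is billed at the on-demand rate."""
    committed_rate = on_demand_rate * (1 - discount)
    total = 0.0
    for usage in hourly_usage:
        total += commitment * committed_rate                  # paid regardless
        total += max(usage - commitment, 0) * on_demand_rate  # overflow
    return total

# Spiky daily pattern: 12 quiet hours, 12 busy hours.
usage = [4] * 12 + [10] * 12

baseline = blended_cost(usage, commitment=0)
for c in (4, 7, 10):
    cost = blended_cost(usage, c)
    print(f"commitment={c:>2}  cost={cost:6.1f}  savings={1 - cost / baseline:6.1%}")
```

With these numbers, committing at the quiet-hour floor saves ~16%, while committing at the busy-hour peak loses money relative to pure on-demand, which is exactly why “don’t go over X% coverage” guardrails exist.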
15:35 📢 Matt – “It definitely needs some pretty strong guardrails of what your business objective is, like don’t go over 90% savings plan or look at the secondary market for short term if you see a random burst for a few months. But it’s not a terrible idea…”
AWS
19:12 Weirdly enough, there are no AWS stories this week.
GCP
20:06 Instant insights: Gemini CLI’s New Pre-Configured Monitoring Dashboards | Google Cloud Blog
- Google has added pre-configured monitoring dashboards to Gemini CLI that provide immediate visibility into usage metrics like monthly active users, token consumption, and code changes without requiring custom query writing.
- The dashboards integrate with Google Cloud Monitoring and use OpenTelemetry for standardized data collection, allowing teams to track CLI adoption and performance across their organization.
- The implementation uses direct GCP exporters that bypass intermediate OTLP collector configurations, simplifying setup to three steps: setting the project ID, authenticating with proper IAM roles, and updating the settings.json file. This reduces infrastructure complexity compared to traditional OpenTelemetry deployments that require separate collector services.
- Organizations can analyze raw OpenTelemetry logs and metrics to answer specific questions like identifying power users by token consumption, tracking budget allocation by command type, and monitoring tool reliability through status codes. The data follows GenAI OpenTelemetry conventions, ensuring compatibility with other observability backends like Prometheus, Jaeger, or Datadog if teams want to switch platforms.
- The feature targets development teams using Gemini CLI who need to understand tool adoption patterns and justify AI tooling investments through concrete usage metrics.
- Engineering managers can track which developers benefit most from AI assistance and where token budgets are being allocated across different command types.
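The three-step setup described above amounts to pointing the CLI’s built-in OpenTelemetry exporter at your project. A minimal sketch of the `settings.json` fragment is below; the key names are assumptions based on Gemini CLI’s telemetry settings, so verify against the current docs.

```json
{
  "telemetry": {
    "enabled": true,
    "target": "gcp"
  }
}
```

Before this takes effect, set the project ID in your environment (e.g. `GOOGLE_CLOUD_PROJECT=<your-project-id>`) and authenticate with an account that holds the Monitoring and Logging writer IAM roles (for local use, `gcloud auth application-default login`). Because the direct GCP exporter ships in the CLI itself, there is no collector service to deploy or keep patched.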
21:55 📢 Ryan – “As long as there’s no metric for how stupid a question is, because that. That I don’t want.”
22:40 We’re advancing U.S. energy innovation with Intersect.
- Alphabet announced a definitive agreement to acquire Intersect, a company specializing in data center and energy infrastructure solutions.
- This acquisition aims to accelerate the deployment of data center capacity and energy generation infrastructure in the United States.
- The deal addresses a critical bottleneck in AI and cloud infrastructure expansion by bringing expertise in energy development and data center deployment under Alphabet’s umbrella. Intersect’s capabilities will help Google bring more computing capacity online faster, which is essential given the substantial power requirements of AI workloads and hyperscale cloud operations.
- This acquisition reflects the growing importance of energy infrastructure as a limiting factor for cloud providers, particularly as AI training and inference workloads drive unprecedented power demands. By acquiring energy infrastructure expertise, Google positions itself to better control the full stack from power generation through data center operations.
- The announcement provides limited technical details about integration timelines or specific projects, but signals Google’s commitment to vertical integration in the infrastructure space. This move follows similar investments by other hyperscalers in power generation and energy partnerships to support their expanding data center footprints.
22:50 📢 Justin – “If you can’t get the capacity from the vendor, just buy them – and then force them to do it. Good move!”
25:00 Google’s NotebookLM introduces Data Tables feature
- NotebookLM now includes Data Tables, a feature that automatically synthesizes information from multiple sources into structured tables that can be exported directly to Google Sheets.
- The feature is available today for Pro and Ultra users, with rollout to all users planned for the coming weeks.
- The feature addresses a common workflow challenge where valuable information is scattered across multiple documents, requiring manual compilation. Data Tables automates this process by extracting and organizing key facts into clean, structured formats without manual data entry.
- Use cases span professional and personal applications, including converting meeting transcripts into action item tables with owners and priorities, synthesizing research data like clinical trial outcomes across multiple papers, creating competitor analysis tables with pricing and strategy comparisons, and building study guides organized by relevant categories.
- The feature represents Google’s continued integration of AI capabilities into productivity tools, positioning NotebookLM as a research and synthesis tool rather than just a note-taking application.
- This builds on NotebookLM’s existing source analysis capabilities by adding structured data output.
- The tiered rollout strategy, with Pro and Ultra users receiving immediate access, suggests Google is testing the feature with power users before broader deployment, likely to gather usage patterns and refine the table generation algorithms.
25:52 📢 Justin – “I love creating spreadsheets; my budgets, all of my tracking of things, tasks I’m doing, vacation planning – it all lives in spreadsheets. And you’re going to take that away from me, Google? How dare you. AI is coming for my passion for spreadsheets.”
29:53 T5Gemma 2: The next generation of encoder-decoder models
- Google releases T5Gemma 2, a new generation of encoder-decoder models based on Gemma 3, available now in pre-trained checkpoints at three sizes: 270M-270M (370M total), 1B-1B (1.7B total), and 4B-4B (7B total) parameters. The models use tied word embeddings and merged decoder attention to reduce parameter count while maintaining capabilities, making them suitable for on-device applications and rapid experimentation.
- T5Gemma 2 adds multimodal vision capabilities using an efficient vision encoder for visual question answering and reasoning tasks, extends context windows to 128K tokens using Gemma 3’s alternating local and global attention mechanism, and supports over 140 languages out of the box.
- These represent the first multimodal and long-context encoder-decoder models in the Gemma family.
- The architecture merges decoder self-attention and cross-attention into a single unified layer, reducing model complexity and improving parallelization for better inference performance.
- This structural change, combined with tied embeddings, allows more active capabilities within the same memory footprint compared to the original T5Gemma.
- Benchmarks show T5Gemma 2 outperforms Gemma 3 on several multimodal tasks, delivers substantial quality gains on long-context problems compared to both Gemma 3 and T5Gemma, and shows improved performance on coding, reasoning, and multilingual tasks. Post-training results indicate better performance than decoder-only counterparts, making these models suitable for both research and production applications.
- The models are designed for developers to post-train for specific tasks before deployment, continuing the approach from the original T5Gemma of adapting pre-trained decoder-only models into an encoder-decoder architecture without the computational cost of training from scratch.
- Pre-trained checkpoints are available across multiple platforms for broad developer access.
31:14 📢 Jonathan – “I’m actually looking forward to playing with the T5Gemma model because the encoder part of it is what’s going to make it really special. Transformers have always had these two halves, encoder and decoder, and most LMs only use the decoder. And what that means is that as the attention is calculated for each token in the context window, it only ever attends to previous tokens in the message. So if you have a word, that word can only ever be related to something that you’ve already said in the conversation. But people aren’t like that. People go back and forth, and they refer back to things they said… people just suck at communication most of the time. And so what the encoder model does is it looks at the entire message holistically. It doesn’t only look at the last word; by the time it gets to the last word, it looks at everything and encodes the meaning of the entire text. And then from there, it passes it to the decoder, and the decoder starts generating text based on the entire knowledge of the whole thing.”
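Jonathan’s causal-vs-bidirectional distinction can be shown in a few lines of numpy: a decoder-only LM applies a causal mask so each token attends only to earlier positions, while an encoder (the half T5Gemma 2 brings back) attends over the whole input. This is a toy illustration of the masking mechanics only, with random vectors rather than real model weights.

```python
# Causal (decoder-style) vs. bidirectional (encoder-style) attention,
# illustrated with plain scaled dot-product attention over random vectors.
import numpy as np

def attention(q, k, v, causal=False):
    """Scaled dot-product attention over a (seq_len, dim) sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if causal:
        # Mask out future positions: position i may only see j <= i.
        seq_len = scores.shape[0]
        mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))  # 5 tokens, 8-dim embeddings

_, enc_w = attention(x, x, x, causal=False)  # encoder: bidirectional
_, dec_w = attention(x, x, x, causal=True)   # decoder: causal

# First token's attention row: the encoder spreads weight across all
# five tokens; the causal decoder puts all weight on token 0 itself.
print(enc_w[0].round(3))
print(dec_w[0].round(3))
```

The upper triangle of the causal weight matrix is exactly zero, which is the mechanical version of “a word can only relate to something already said” — and the reason an encoder over the full input can capture back-references a pure decoder sees only one direction of.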
33:39 New tech and tools for retailers to succeed in an agentic shopping era
- Google launches Universal Commerce Protocol (UCP), an open standard for agentic commerce co-developed with Shopify, Etsy, Wayfair, Target, and Walmart.
- UCP enables AI agents to interact across the entire shopping journey from discovery to post-purchase support, working alongside existing protocols like A2A, AP2, and MCP. The protocol is endorsed by over 20 companies, including Adyen, American Express, Mastercard, Stripe, and Visa.
- New agentic checkout feature goes live in AI Mode in Search and Gemini app, allowing shoppers to purchase from eligible U.S. retailers directly within Google’s AI surfaces.
- The integration uses Google Pay and PayPal for payments, with retailers maintaining seller of record status and the ability to customize the implementation. Global expansion and additional capabilities like loyalty rewards and product discovery are planned for the coming months.
- Business Agent launches tomorrow as a branded AI assistant that appears directly in Search results for retailers like Lowe’s, Michaels, Poshmark, and Reebok. U.S. retailers can activate and customize this agent through Merchant Center, with future capabilities including training on retailer data, customer insights, product offers, and direct agentic checkout within the chat experience.
- Google introduces Direct Offers pilot in AI Mode, allowing advertisers to present exclusive discounts and deals to shoppers during AI-powered searches. The system uses AI to determine when offers are relevant to display, initially focusing on discounts with plans to expand to bundles and free shipping. Early partners include Petco, e.l.f. Cosmetics, Samsonite, Rugs USA, and Shopify merchants.
- Merchant Center adds dozens of new data attributes designed for conversational commerce discovery across AI Mode, Gemini, and Business Agent. These attributes extend beyond traditional keywords to include product Q&A, compatible accessories, and substitutes, rolling out first to a small group of retailers before broader expansion.
35:20 📢 Ryan – “I think it’s important to standardize. In a web transaction where you’re doing shopping, there’s so many handoffs to different things, I can see, as more and more AI and agent-based or agent-assisted transactions happen, being able to talk a common language is super important.”
33:38 Read Sundar Pichai’s remarks at the 2026 National Retail Federation
- Google announced Universal Commerce Protocol (UCP), an open standard for agentic commerce built with Shopify, Etsy, Wayfair, Target, and Walmart. The protocol enables native checkout directly in Google Search AI Mode and Gemini, allowing retailers to maintain merchant of record status and own customer relationships while offering personalized pricing and loyalty enrollment at checkout.
- Gemini Enterprise for Customer Experience is now available in preview, providing retailers with integrated shopping assistants, support bots, and agentic search capabilities.
- The Home Depot and McDonald’s are already using these agents for customer service, while Kroger is testing a shopping agent that brings AI Mode functionality directly into retailer apps.
- Google processed over 90 trillion tokens through its API in December 2025, representing an 11x increase from 8.3 trillion tokens in December 2024. This growth demonstrates the rapid adoption of AI capabilities by retailers and the scale at which Google’s infrastructure is supporting commercial AI applications.
- Wing delivery service expanded to Houston, with Orlando, Tampa, and Charlotte coming soon, after doubling deliveries in existing markets during 2025 through its Walmart partnership.
- The expansion addresses the high cost and logistical challenges of last-mile delivery for retailers.
38:35 📢 Jonathan – “So is this how Google is going to make money in the future? Because obviously serving ads through AI is both controversial and a very lame customer experience. Are they going to start skimming off a percentage of sales for sales they direct to these retailers through their AI interface?”
Azure
39:58 Announcing public preview: Uncovering hidden threats with the Dynamic Threat Detection Agent | Microsoft Community Hub
- Microsoft launches the Dynamic Threat Detection Agent in public preview, an AI-powered backend service that runs continuously within Defender to identify hidden threats across Defender and Sentinel environments.
- The agent operates autonomously with no setup required, automatically generating alerts with natural language explanations, MITRE technique mappings, and remediation steps directly into existing XDR workflows.
- The agent achieves over 85% precision across thousands of alerts and 28 threat types by combining adaptive GenAI detection with hyperscale threat intelligence from TITAN and UEBA behavioral analytics.
- It runs a five-step investigation loop at machine scale, starting from high-priority incidents, building unified activity timelines, testing hypotheses through automated Q&A, and closing detection gaps with explainable alerts that include transparent reasoning traces.
- Public preview is free for Security Copilot customers and enabled by default for eligible organizations, with general availability planned for late 2026 when it transitions to Security Copilot’s SCU-based consumption model.
- Starting July 2026, the agent will be included with Microsoft 365 E5 licenses that have Security Copilot entitlement, and customers can disable it or monitor usage through detailed consumption reporting at any time.
- The service respects data residency by running region-local and integrates deeply with the Microsoft security ecosystem, using Sentinel to correlate third-party and native telemetry while surfacing Copilot-sourced detections in Defender.
- Built on Azure Synapse for massive scale, it can run thousands of parallel investigations and deliver near-real-time detections while continuously learning from analyst feedback to improve detection quality and reduce alert noise.
43:54 📢 Jonathan – “You don’t want to block a potential customer who’s about to press a button to spend tens of thousands of dollars either. I guess false positives are almost as bad as false negatives.”
45:26 Generally Available: Geo-Replication for Azure Service Bus Premium
- Azure Service Bus Premium now includes generally available Geo-Replication, allowing customers to replicate messaging infrastructure across regions for disaster recovery.
- This addresses a critical need for enterprises running mission-critical messaging workloads that require protection against regional outages.
- The feature provides active replication of Service Bus entities, including queues, topics, and subscriptions, between paired regions, maintaining message ordering and metadata consistency.
- Organizations can now implement cross-region failover strategies without building custom replication logic or managing multiple Service Bus namespaces manually.
- This capability is exclusive to the Premium tier of Service Bus, which starts at approximately $677 per month for the base messaging unit. Customers should factor in additional costs for cross-region data transfer and the secondary namespace when planning their disaster recovery architecture.
- The geo-replication option complements existing Service Bus disaster recovery features like Geo-Disaster Recovery (metadata-only failover), giving customers flexibility in choosing between cost-optimized metadata replication or full data replication based on their recovery time objectives.
- This is particularly relevant for financial services, healthcare, and retail sectors, where message loss during regional failures is unacceptable.
46:23 📢 Justin – “I’m surprised this wasn’t already part of premium, but I’m also sort of intrigued that they think people’s messaging strategies only involve two regions, because some of the cost architectures I’ve seen are like multiple regions with active replication across these things for geo-distributed applications that need to have globally low latency for user populations everywhere – and I guess I just can’t run that on this service. So I guess, screw you? Or wait for Azure Service Bus Ultra?”
After Show
46:38 CES 2026: The best tech announced so far | The Verge
- CES 2026 showcased significant infrastructure innovations, including Wi-Fi 8 routers from Asus and others, despite the standard not being finalized until 2028, plus solid-state battery breakthroughs from Donut Lab claiming 400 Wh/kg energy density that could give EVs 30 percent more range. These developments signal major shifts in networking and power infrastructure that cloud and edge computing deployments will eventually leverage.
- Smart home and IoT devices are getting serious upgrades with Matter compatibility becoming standard across Ikea and Philips Hue products, while spatial awareness features like Hue’s SpatialAware use AR to map rooms for better lighting distribution. For cloud professionals, this represents the maturation of IoT protocols and edge AI processing that will drive increased demand for home automation backend services.
- The display technology race is heating up with Samsung showing creaseless foldable OLED panels, Dell launching a 52-inch 6K Thunderbolt hub monitor, and LG reviving its Wallpaper TV with wireless video transmission. These advances in display tech and connectivity standards like Thunderbolt 5, delivering 120Gbps speeds, will impact how professionals design workspaces and remote work setups.
- AI wearables are moving beyond glasses with Razer’s Project Motoko headphones featuring 4K cameras, on-device AI processing via Qualcomm chips, and 36-hour battery life that eclipses current smart glasses. This shift toward headphone-based AI assistants could influence how voice interfaces and edge AI applications are developed for consumer devices.
- Robotics took center stage with practical home automation like Roborock’s stair-climbing Saros Rover vacuum and LG’s CLOiD dual-arm robot that can fold laundry and handle kitchen tasks. While still in development, these robots represent the convergence of computer vision, edge AI, and mechanical engineering that will require robust cloud backends for training and coordination.
Closing
And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.
