360: And you thought AWS was out of features for S3. Surprise!

July 1, 2026 01:22:25

 Welcome to episode 360 of The Cloud Pod, where the weather is always cloudy! Justin, Matt, and Jonathan (for a bit, anyway) are in the studio this week bringing you all the latest in cloud and AI news, including a bunch of analytics, some upgrades courtesy of AI agents, and some news from Kafka. There’s a lot to cover, so let’s get started! 

Titles we almost went with this week

  • 🚛 MSK Agent Skills Make Kafka Migration Less Kafkaesque
  • 🪙 One Token Pool to Rule All Claude Tools
  • 🚔 STRIDE Into Security Without Leaving Your IDE
  • 🧑‍💻 Your Code Must Be This Stable to Enter Production
  • 🏦 ChatGPT Gets a Budget So Karen Can’t Break the Bank
  • 🌋 One Platform to Train Them All and in Darkness Deploy Them
  • ❄️ Who Let the Agents Out? Snowflake Knows
  • 😮‍💨 Kafka Whisperer Now Comes With an AI Upgrade
  • 📄 Stop Reading Docs, Let MSK AI Do the Kafka Math
  • ✍️ Your AI Wrote That Pull Request, Own It
  • 🤖 Claude. Tag, you’re it! 
  • 🪣 See, There Are More Features That We Can Add to S3

A big thanks to this week’s sponsors:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

AI Is Going Great – or How ML Makes Money 

03:45 Claude Design now stays on brand for daily work 

  • Claude Design now integrates directly with Claude Code through two new slash commands: /design-sync pulls your design system into Claude Code, and /design lets you create and manage design projects without leaving the terminal, keeping both tools in sync throughout the workflow.
  • The rebuilt design system import supports GitHub repos, design files, and raw uploads, with Claude automatically checking its output against your components before rendering results. 
  • Enterprise admins can lock down a single approved system to enforce consistency across teams.
  • Anthropic updated the usage model, so Claude Design now shares a token pool with chat, Claude Cowork, and Claude Code rather than having separate limits, which should give most users more headroom and reduce how often they hit caps.
  • Export and integration options expanded substantially, with connectors now covering Adobe, Canva, Gamma, Lovable, Miro, Replit, Vercel, Wix, Base44, and standard PDF and PowerPoint formats, making it easier to move finished work into existing production pipelines.
  • Claude Design is available in beta on Pro, Max, Team, and Enterprise plans at claude.ai/design, with Enterprise having it disabled by default pending admin activation and output restricted to internal sharing only.

05:07 📢 Matt – “…when I’ve used it – just playing around with it, it produced really nice things. I just used half my session tokens real fast with iterations and things like that. So I would be careful using it, but it does great front-end design.”

Data+AI Summit – Top Announcements 

07:31 Introducing Lakehouse//RT: Real-Time Performance on a Unified Lakehouse

  • Lakehouse//RT is a landmark new real-time data warehouse engine (Reyden) delivering millisecond latency directly on the lakehouse, eliminating the need for separate serving layers.
  • Databricks announced Lakehouse//RT, a new real-time data warehouse built directly into the lakehouse, delivering millisecond query responsiveness at high concurrency without a separate serving layer or data copies.
  • Powered by a brand-new engine called Reyden, Lakehouse//RT targets operational analytics, BI and app serving, and observability workloads — the exact use cases that previously forced teams to bolt on Redis, Druid, or Pinot.
  • The core pitch is architectural simplification: eliminating the dedicated serving-layer copy means no more stale replicas, no extra governance surface, and no ETL pipeline to keep the hot tier in sync.
  • This directly challenges standalone real-time OLAP vendors (Apache Druid, ClickHouse, StarRocks) by collapsing their niche into the Databricks platform; practitioners should watch benchmark claims closely before ripping anything out.
  • No public pricing or GA date was shared in the announcement; availability details are expected to follow from the Data + AI Summit 2026 session track.

07:39 Introducing Genie ZeroOps: Put your data and AI operations on autopilot 

  • Genie ZeroOps is a genuinely novel autonomous ops agent purpose-built for data/ML pipelines, addressing a real pain point (pipeline failures, model drift) with concrete agentic automation.
  • Genie ZeroOps is a new autonomous background agent that continuously monitors Databricks production assets, pipelines, jobs, tables, and ML models, investigates failures before or as they occur, and proposes verified fixes.
  • The agent runs a four-step loop for every failure: detect, assess root cause, remediate with a fix, and verify the fix has no side effects, all without requiring a human to open a ticket first.
  • Databricks explicitly argues that generic coding agents (think GitHub Copilot Workspace) fall short here because they lack access to Spark logs, telemetry, and the data-lineage context needed to distinguish a code bug from an upstream schema change or late-arriving data.
  • The target pain point is real: data teams report spending the majority of their time on operational firefighting rather than building, and the proliferation of LLM-generated pipelines is making that worse, not better.
  • Because ZeroOps runs inside Databricks, it has native, governed access to Unity Catalog lineage, job run history, and cluster metrics, a meaningful advantage over external AIOps tools that need custom integrations.
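
The four-step loop described above (detect, assess, remediate, verify) can be sketched roughly as follows. This is only an illustration of the control flow, not Databricks' implementation: `Failure`, `Outcome`, `run_loop`, and the four callables are hypothetical names, and the toy lambdas stand in for whatever telemetry and lineage lookups the real agent performs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Failure:
    asset: str    # hypothetical: name of the job, table, or model
    symptom: str  # hypothetical: what the monitor observed


@dataclass
class Outcome:
    failure: Failure
    root_cause: str
    fix: str
    verified: bool


def run_loop(
    detect: Callable[[], Optional[Failure]],
    assess: Callable[[Failure], str],
    remediate: Callable[[Failure, str], str],
    verify: Callable[[Failure, str], bool],
) -> Optional[Outcome]:
    """Run one pass of detect -> assess -> remediate -> verify."""
    failure = detect()
    if failure is None:
        return None  # nothing to investigate on this pass
    root_cause = assess(failure)
    fix = remediate(failure, root_cause)
    # The fix is only proposed; verification records whether it looks safe.
    return Outcome(failure, root_cause, fix, verify(failure, fix))


# Toy usage with stand-in callables (all values are made up).
outcome = run_loop(
    detect=lambda: Failure("orders_daily", "schema mismatch"),
    assess=lambda f: "upstream column renamed",
    remediate=lambda f, cause: "map old column name to new one",
    verify=lambda f, fix: True,
)
```

Keeping each step as a separate callable makes it easy to see where a human approval gate could sit between "remediate" and applying the fix.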

07:49 Introducing OpenSharing: the Next Evolution of Delta Sharing for the Agentic Era

  • OpenSharing, as the industry’s first open protocol for sharing data, models, agents, and skills across any cloud/vendor, is a landmark open-source initiative that extends Delta Sharing’s proven adoption.
  • OpenSharing is the next evolution of Delta Sharing, repositioned as the industry’s first open protocol for sharing not just data tables and files, but also models, agents, and AI skills — across any cloud, any vendor, and any format.
  • Delta Sharing has already hit meaningful scale: 28,000+ data recipients and 33% of shares flowing cross-platform via open connectors, with adopters including SAP, Atlassian, Mercedes-Benz, S&P Global, and LSEG.
  • The key architectural expansion is moving beyond structured tables to semantic context, unstructured data, and AI artifacts, exactly what agentic workflows need to share across organizational and platform boundaries.
  • Delta Sharing is being spun out as an independent open-source project under OpenSharing, which should reduce vendor lock-in concerns and encourage broader ecosystem participation beyond the Databricks orbit.
  • For practitioners, the practical implication is that you could eventually share a fine-tuned model or an MCP-compatible agent skill with a partner org in the same zero-copy, no-replication way you share a Delta table today.

07:58 Lakeflow: A new era of agentic data engineering

  • Lakeflow Designer reaching GA and deep Genie Code integration across the full data engineering lifecycle is a significant platform milestone that consolidates ingestion, transformation, and orchestration. 
  • Databricks is rebranding and unifying its data engineering surface under the Lakeflow banner, covering ingestion, transformation, and orchestration in a single platform, all governed by Unity Catalog.
  • Lakeflow Designer is now generally available: a visual, no-code, AI-powered interface that democratizes pipeline building for non-engineers, directly competing with tools like Fivetran’s transformation UI and Azure Data Factory’s canvas.
  • Genie Code is now deeply integrated across the entire Lakeflow experience. You can use it to generate ingestion connectors, build Python/SQL pipelines, and scaffold jobs with tasks, triggers, and dependencies from natural language.
  • The agentic angle is significant because Lakeflow is unified and Unity Catalog provides end-to-end lineage; Genie Code has full context to not just build pipelines but also operate and repair them, tying directly into the ZeroOps announcement.
  • The fragmented data stack problem is real for most enterprises (Airflow + dbt + Fivetran + custom connectors), and Databricks is betting that a single governed surface will win out as AI agents become the primary pipeline operators.

08:07 Introducing Genie One, Genie Agents, and Genie Ontology 

  • Genie One, Genie Agents, and Genie Ontology represent a meaningful evolution of enterprise AI assistants with native Slack/Teams integration and governed, data-grounded answers.
  • Genie One is the next generation of Databricks’ conversational analytics assistant, evolving from a query tool into a ‘data-smart AI coworker’ that can move users from insight to action, not just answer questions.
  • Native integrations with Slack and Microsoft Teams are launching, letting users @mention Genie in any channel or thread to get governed, data-grounded answers without switching context to the Databricks UI.
  • Genie Ontology is the new mechanism for grounding agents in enterprise business context; it aggregates meaning scattered across dashboards, queries, pipelines, wikis, and tickets so agents stop hallucinating business logic.
  • Genie Agents extends the platform so organizations can deploy custom Genie-powered agents for specific workflows, all governed through Unity Catalog, bridging the gap between conversational BI and full agentic automation.
  • The core problem being solved is that current-gen agents do iterative probing that is ‘extremely slow and costly,’ producing generic or wrong answers; Genie One’s ontology layer is Databricks’ answer to that quality gap.

08:33 Unifying Data and Governance in the Agentic Era: What’s New with Azure Databricks

  • A comprehensive Azure Databricks summit roundup introducing LTAP architecture and Lakebase GA on Azure is highly relevant for Azure-focused practitioners, though it largely aggregates other announcements.
  • Azure Databricks is introducing what it calls the first true LTAP (Lake Transactional/Analytical Processing) architecture, a unified storage layer that brings analytical data, streaming pipelines, and live application transactions onto a single shared copy on the lakehouse, eliminating the need for a separate operational side-stack.
  • Azure Databricks Lakebase is now generally available: a fully managed, serverless Postgres database purpose-built for the agent era, featuring decoupled compute and storage and instant copy-on-write database branching for safe debugging of production data.
  • The four pillars framing the Azure Databricks roadmap are Agentic Data (real-time foundation), Agentic Dev & Work (AI coworkers in productivity tools), Agentic Marketing (lakehouse-embedded personalization), and intelligent governance, giving practitioners a clear map of where the platform is heading.
  • Copy-on-write database branching is a standout feature for DevOps practitioners: it lets you spin up an isolated branch of a production Postgres database instantly, eliminating compliance risk when reproducing and debugging live data issues.
  • The Azure-specific framing matters for enterprise architects: all of these capabilities run natively on Azure infrastructure, meaning they inherit Azure’s compliance certifications and integrate with existing Azure networking and identity controls.

08:42 Announcing Lakebase Search: agent-native retrieval built into Lakebase Postgres

  • Lakebase Search bringing native hybrid vector + full-text retrieval into Postgres via Lakebase (beta on AWS and Azure) is a concrete, practitioner-relevant launch that simplifies agent retrieval.
  • Lakebase Search is now in beta on AWS and Azure: hybrid vector and full-text retrieval built natively into Lakebase Postgres via two extensions: lakebase_vector and lakebase_text, so your entire agent loop can run against a single data backend.
  • The key insight driving the design is that agents operate search as a live read/write loop; they write new memory on one turn and need that data fully indexed and searchable on the very next turn, which traditional search engines built for read-only snapshots simply cannot support.
  • Databricks reports that agents now operate 4x more databases on Lakebase than human users do, making agent-first ergonomics a first-class design requirement rather than an afterthought.
  • The economics angle is notable: vector search causes severe data bloat (a 1 KB text file expands significantly when chunked and vectorized), and Lakebase Search is architected to handle the cold-storage economics of that pattern at scale, a direct shot at Pinecone and pgvector deployments on standard Postgres.
  • Being in beta means practitioners can start testing now, but production SLAs and pricing details are not yet locked, worth watching before committing RAG pipelines to the platform.

08:51 AI governance at Data + AI Summit 2026: What’s new with Unity AI Gateway

  • Unity AI Gateway extending governance to models, agents, MCP services, and tools with spend caps, routing, and runtime guardrails is a substantive enterprise AI governance capability.
  • Unity AI Gateway is Databricks’ new runtime governance layer for enterprise AI, extending Unity Catalog beyond data assets to cover models, agents, MCP services, and skills, governing not just what AI accesses but what it does at runtime.
  • Cost management is a first-class feature: the gateway provides spend visibility across providers, granular attribution by team or workload, hard spend caps, and intelligent routing to balance quality against cost, critical as organizations scale from a few models to fleets of agents.
  • Security and monitoring capabilities include unified tracing across agent calls, coding agent observability, Lakewatch-powered investigations, and an open ecosystem of security and identity partners, giving security teams the audit trail they need for regulated industries.
  • The governance model shift is significant for architects: traditional catalog governance was about access control (who can query this table), but Unity AI Gateway governs behavior (what can this agent invoke, what guardrails apply at inference time).
  • Multi-model, multi-agent, multi-vendor is the explicit design target; practitioners who are already mixing OpenAI, Anthropic, and open-source models across different agent frameworks will find the unified policy layer directly addresses the sprawl problem.
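
To make the "hard spend caps plus routing" idea concrete, here is a minimal sketch of one possible policy: pick the highest-quality model whose per-call cost still fits under the remaining budget. This is an assumption for illustration only; the announcement does not describe the gateway's actual routing logic, and `ModelOption`, `route`, the quality scores, and the costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical model label
    quality: float        # hypothetical quality score, higher is better
    cost_per_call: float  # hypothetical estimated cost in dollars


def route(
    options: Sequence[ModelOption], spent: float, cap: float
) -> Optional[ModelOption]:
    """Return the best affordable model, or None once the hard cap is hit."""
    remaining = cap - spent
    affordable = [m for m in options if m.cost_per_call <= remaining]
    if not affordable:
        return None  # hard cap: refuse rather than overspend
    return max(affordable, key=lambda m: m.quality)


options = [
    ModelOption("large", quality=0.9, cost_per_call=1.0),
    ModelOption("small", quality=0.6, cost_per_call=0.25),
]
early = route(options, spent=0.0, cap=10.0)    # plenty of budget left
late = route(options, spent=9.5, cap=10.0)     # only the cheap model fits
blocked = route(options, spent=10.0, cap=10.0)  # nothing fits
```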

08:56 Agent Bricks: Data + AI Summit 2026

  • Agent Bricks expanding into a full developer agent platform with 100k+ agents built and 1+ quadrillion tokens/year processed is a significant platform maturation announcement with real adoption metrics. 
  • Agent Bricks has hit serious scale since its launch: 100,000+ agents built on the platform and 1+ quadrillion tokens processed per year, with production deployments at AstraZeneca, 7-Eleven, Fox Corporation, and Block.
  • Databricks is repositioning Agent Bricks from an agent-building toolkit into a full developer agent platform, based on the hard-won lesson that the core agent loop is only 1% of the work; the other 99% is infrastructure: token capacity, deployment, security, evaluation, monitoring, context, and sharing.
  • The ‘hidden technical debt of agentic systems’ framing (borrowing from the classic 2015 NeurIPS ML technical debt paper) is a sharp message to practitioners who’ve built agents and then spent months building scaffolding around them.
  • The platform expansion addresses three critical challenges: giving agents the right context at scale, making agents trustworthy and governable in production, and enabling agents to be shared and composed across teams and organizations.
  • The unification of data and AI is central to the pitch: agents both consume data (via tools and context) and produce data (reasoning traces, memory, action logs), and Databricks argues that only a platform that governs both sides can manage agentic systems at enterprise scale.

What’s new with Unity Catalog at Data + AI Summit 2026

  • Unity Catalog’s evolution to a runtime governance decision-maker for agents with AI Gateway, Glossary, cross-cloud addressability, and Governance Hub is a substantive update.
  • Unity Catalog is evolving from a system of record into a runtime decision-maker for AI: with 14,000+ organizations already on the platform, Databricks is adding Unity AI Gateway, Glossary, Domains, and cross-cloud/cross-region addressability to meet the demands of the agentic era.
  • The new Glossary and Domains features create a governed, shared source of business meaning for both humans and agents, directly attacking the hallucination problem that occurs when agents lack enterprise context and fill gaps with inference.
  • Cross-cloud and cross-region addressability means one catalog, one set of policies, and consistent governance wherever workloads run, a significant operational simplification for enterprises with multi-cloud footprints.
  • The new Governance Hub provides a unified control plane for visibility across the entire AI estate, covering data assets, models, agents, tools, and MCP services under a single policy framework.
  • The philosophical shift Databricks is articulating from ‘govern access’ to ‘govern behavior’ is the most important architectural concept for practitioners to internalize as they design agentic systems that act autonomously on enterprise data.

Agentic AI Platform & Developer Tools

  • Agent Bricks: DAIS 2026 
    • Expands into a comprehensive agent platform for developers — with 100k+ agents already built and 1+ quadrillion tokens processed per year, it now addresses the “99% hidden debt” of agentic systems: token capacity, deployment, security, evaluation, and monitoring.
  • What’s New in Genie Code at Data + AI Summit 2026
    • Introduces a full-page command center for complex, multi-step data and ML workflows, plus scheduled tasks and production engineering upgrades — used by 90% of Databricks customers after 10x growth in the past year.
  • What’s New in the AI Platform: Agents for ML Engineering
    • Brings Genie Code support across the entire ML lifecycle — from feature engineering and experiment management to drift detection — alongside a new deep learning platform and real-time ML serving capabilities.
  • Databricks and NVIDIA: Building for the Agentic Era 
    • Deepens the partnership to embed NVIDIA Rubin GPUs, the new Vera CPU, and NVIDIA Agent Toolkit software into AI Runtime, Model Serving, and industry AI solutions on Databricks.
  • What’s coming next to Free Edition
    • Adds Genie Code, serverless GPUs, Lakebase, Agent Bricks, and Lakeflow Designer to the free tier — giving 500,000+ learners a complete end-to-end data and AI toolkit at no cost.

Real-Time Data & Lakehouse Architecture

  • Introducing Lakehouse//RT
    • Delivers millisecond-latency operational analytics directly on the lakehouse via the new Reyden engine — eliminating the need for a separate serving layer and the data copies, cost, and complexity that come with it.
  • Lakeflow: A New Era of Agentic Data Engineering
    • Unifies ingestion, transformation, and orchestration on a single platform governed by Unity Catalog, with Genie Code now deeply integrated and Lakeflow Designer reaching general availability.
  • Accelerate search queries with full-text search indexes on Databricks
    • (Now in Beta on DBR 18.2) Can speed up substring and keyword queries by 100x or more on open-format tables without changing table layouts — eliminating the need for external systems like Elasticsearch or Splunk.
  • What is data pipeline architecture?
    • Offers a timely explainer on pipeline design patterns and best practices (picking up light HN traction with 6 points), highlighting how Lakeflow collapses the old batch/streaming divide onto a single foundation.
  • Announcing Lakebase Search: agent-native retrieval built into Lakebase Postgres
    • Introduces hybrid vector and full-text retrieval via native Postgres extensions (`lakebase_vector` and `lakebase_text`), enabling agents to treat search as a live operational database with instant indexing of the latest writes.

Governance, Security & Compliance

Agentic Operations & Data Sharing

  • Introducing Genie ZeroOps
    • Puts data and AI operations on autopilot with an autonomous background agent that monitors pipelines, jobs, tables, and ML models — detecting failures, diagnosing root causes, and suggesting verified fixes without the limitations of generic coding agents.
  • Introducing OpenSharing: the Next Evolution of Delta Sharing for the Agentic Era
    • Evolves Delta Sharing into an independent open-source project that extends zero-copy sharing beyond tables and files to models, agents, and skills — across any cloud, vendor, or format.
  • Introducing OpenSharing SecureConnect
    • Adds a Databricks-managed proxy for storage access, eliminating per-recipient firewall configuration so providers can onboard new data-sharing recipients in minutes rather than weeks.
  • Introducing Genie One, Genie Agents, and Genie Ontology
    • Elevates Genie from a conversational analytics assistant to a data-smart AI coworker embedded natively in Slack and Microsoft Teams, grounded in enterprise context via Genie Ontology to eliminate hallucinations on business data questions.

Agentic Marketing, Apps & BI

  • Introducing CustomerLake: The Agentic CDP embedded in Databricks
    • Brings Customer 360, identity resolution, audience building, and campaign automation natively into the lakehouse — enabling always-on, 1:1 personalization at enterprise scale without duplicating sensitive customer data.
  • Introducing the Agentic CDP: A New Species of CDP for a New Era of Agents
    • (Co-authored by Ali Ghodsi and Reynold Xin) Makes the case that traditional CDPs are architecturally incompatible with millisecond agentic buying lifecycles, and outlines the three requirements (speed, hyper-personalization, and richer context) that only a lakehouse-native CDP can meet.
  • Enabling Governed Vibe Coding for Enterprise Apps on Databricks
    • Introduces App Spaces, Genie App Builder, and Serverless Micro Apps — tripling weekly active app users — so any business analyst can safely build and deploy governed applications without burdening platform teams.
  • Announcing Apps on Databricks Marketplace
    • (Public Preview) Lets customers discover, install, and run third-party data and AI apps directly inside their secure workspaces, while giving ISVs a no-egress distribution channel to thousands of enterprises.
  • Design Beautiful Dashboards in AI/BI
    • Introduces workspace-level and dashboard-level theming so teams can apply consistent brand identity across all AI/BI dashboards — with Genie Code prompts to build them from scratch.

Ecosystem, Partners & Customer Stories

14:58 New usage analytics and updated spend controls for enterprises

  • OpenAI has released new credit usage analytics and updated spend controls for ChatGPT Enterprise, available now in the Global Admin Console.
  • Admins can track credit consumption broken down by user, product, and model, and access this data programmatically via a unified Cost API.
  • The updated spend controls allow admins to set a default workspace-wide credit limit, configure limits for specific groups, and create individual overrides for high-usage employees. 
  • This replaces a one-size-fits-all approach with tiered controls that can accommodate power users without raising limits for the entire organization.
  • Employees can now view their own credit usage within workspace settings and submit requests for additional credits with context about their work, giving admins enough information to approve increases selectively rather than broadly.
  • The Cost API integration is worth noting for enterprise IT teams, as it allows organizations to pull credit usage data into their own internal systems for deeper analysis alongside other business spending data.
  • For organizations already managing cloud spend across AWS, Azure, or GCP, these controls bring ChatGPT Enterprise closer to the cost governance model those platforms offer, making it easier to treat AI usage as a managed line item rather than an untracked operational expense.
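
The tiered controls above (workspace default, group limits, individual overrides) amount to a precedence lookup, which can be sketched as below. The precedence order and the "take the highest group limit" rule for users in several groups are assumptions made for illustration; OpenAI's notes do not spell out either, and `effective_credit_limit` and its parameters are hypothetical names, not part of any OpenAI API.

```python
from typing import Mapping, Sequence


def effective_credit_limit(
    user: str,
    user_groups: Sequence[str],
    workspace_default: int,
    group_limits: Mapping[str, int],
    user_overrides: Mapping[str, int],
) -> int:
    """Resolve a user's credit limit: override, then group, then default."""
    # Assumption: an individual override always wins.
    if user in user_overrides:
        return user_overrides[user]
    # Assumption: with several group limits, the most generous one applies.
    limits = [group_limits[g] for g in user_groups if g in group_limits]
    if limits:
        return max(limits)
    return workspace_default


group_limits = {"data-science": 2000, "support": 800}
overrides = {"karen": 5000}

power_user = effective_credit_limit(
    "karen", ["support"], 500, group_limits, overrides
)
grouped = effective_credit_limit(
    "matt", ["support", "data-science"], 500, group_limits, overrides
)
default = effective_credit_limit("justin", [], 500, group_limits, overrides)
```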

15:06 📢 Justin – “…you might think that’s very FinOps-y of them, and you’d be right; because they worked with Apptio to basically build out this capability along with Anthropic and others in the focus groups from FinOps.”

17:06 Agent identity: a new access model for autonomous, team-wide AI 

  • Claude Tag introduces an “agent identity” access model where AI agents get their own dedicated service accounts rather than borrowing individual user credentials, allowing Claude to operate across shared Slack channels, GitHub, and data warehouses on behalf of entire teams rather than single users.
  • Permissions are scoped at the workspace and channel level rather than per-user, so admins can grant the engineering channel access to GitHub while confining CRM access to a separate private channel, with each private channel getting a distinct Claude identity that cannot cross into other channels.
  • Revoking a Claude identity immediately removes access across all connected systems simultaneously, which simplifies enterprise access management compared to auditing individual agent actions spread across dozens of user accounts.
  • Every network call, memory write, and routine executed under agent credentials is logged in an audit trail, and outbound traffic to any host not explicitly allowed by an admin is blocked, addressing common enterprise security concerns around autonomous agents.
  • Anthropic plans to add just-in-time credential grants so users can approve individual sensitive actions in the moment without permanently expanding the agent’s scope, plus an identity-aware overlay that checks both channel-level and user-level permissions before Claude acts.
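
The channel-scoped model described above can be pictured with a small sketch: one identity per channel, a tool allowlist, an outbound host allowlist, a revoke switch, and an audit entry for every decision. This is a toy model under stated assumptions, not Anthropic's implementation; `AgentIdentity`, its fields, and the example hosts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple
from urllib.parse import urlparse


@dataclass
class AgentIdentity:
    channel: str                   # the one channel this identity serves
    allowed_tools: FrozenSet[str]  # e.g. {"github"}
    allowed_hosts: FrozenSet[str]  # outbound hosts an admin has approved
    revoked: bool = False
    audit_log: List[Tuple[str, Optional[str], bool]] = field(
        default_factory=list
    )

    def can_use(self, tool: str) -> bool:
        """Check the tool allowlist and record the decision."""
        allowed = not self.revoked and tool in self.allowed_tools
        self.audit_log.append(("tool", tool, allowed))
        return allowed

    def can_call(self, url: str) -> bool:
        """Block outbound traffic to any host not explicitly allowed."""
        host = urlparse(url).hostname
        allowed = not self.revoked and host in self.allowed_hosts
        self.audit_log.append(("network", host, allowed))
        return allowed


engineering = AgentIdentity(
    channel="#engineering",
    allowed_tools=frozenset({"github"}),
    allowed_hosts=frozenset({"api.github.com"}),
)
```

Setting `revoked = True` on the identity makes every later check fail at once, which mirrors the "revoke once, lose access everywhere" behavior in the notes.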

18:22 📢 Justin – “It’s a little bit clunky right now, mostly tied to limitations I think of how bots interact with Slack, but I see the potential.” 

Security 

20:08 Temporary Cloudflare Accounts for AI agents

  • Cloudflare introduced temporary accounts for AI agents, allowing them to deploy Workers via the Wrangler CLI using a new --temporary flag without requiring any prior account signup or authentication flow.
  • Deployments stay live for 60 minutes, during which a human can claim the account permanently. If unclaimed, the account and all associated resources are automatically deleted.
  • The feature addresses a practical problem in agentic workflows where browser-based OAuth flows and MFA prompts create hard stops for background agents operating without a human in the loop.
  • Wrangler was updated to surface the --temporary flag in its output messages, so agents can discover and use it without explicit human instruction, enabling a write-deploy-verify loop without manual intervention.
  • Cloudflare is pairing this with broader efforts, including a Stripe partnership for agent-provisioned accounts and a WorkOS collaboration on OAuth standards for agents, suggesting a broader push toward standardizing how agents interact with cloud infrastructure.
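
The 60-minute claim window can be modeled as a tiny state function. This is a sketch of the lifecycle as described in the notes, not Cloudflare's code: `account_state` and the state labels are hypothetical, and treating the exact 60-minute mark as already expired is an assumption.

```python
from datetime import datetime, timedelta, timezone

CLAIM_WINDOW = timedelta(minutes=60)  # lifetime stated in the announcement


def account_state(created_at: datetime, now: datetime, claimed: bool) -> str:
    """Classify a temporary account as claimed, live, or deleted."""
    if claimed:
        return "claimed"  # a human took ownership, so it persists
    if now - created_at < CLAIM_WINDOW:
        return "live"     # still deployable and claimable
    return "deleted"      # unclaimed: account and resources are removed


created = datetime(2026, 7, 1, 1, 0, tzinfo=timezone.utc)
during = account_state(created, created + timedelta(minutes=30), False)
after = account_state(created, created + timedelta(minutes=61), False)
kept = account_state(created, created + timedelta(minutes=61), True)
```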

Cloud Tools

22:59 Introducing the Cloudflare One stack: agent-powered deployment

  • Cloudflare released the Cloudflare One stack, an open-source set of agent skills hosted on GitHub that helps automate the deployment, configuration, and management of Zero Trust network environments without requiring deep prior knowledge of Cloudflare’s product suite.
  • The stack ships as two skill files: cloudflare-one for general product guidance like VPN replacement and policy management, and cloudflare-one-migration for translating configurations from vendors like Zscaler and Palo Alto Networks into equivalent Cloudflare constructs.
  • When paired with the Cloudflare code mode MCP server, agents get a typed interface to the live Cloudflare API, allowing them to query account configurations and make changes through recommended workflows rather than ad-hoc API calls, while keeping credentials out of the model context.
  • The migration logic in the stack is the same as that used in Cloudflare’s existing Descaler and Deskope programs, which have moved enterprise customers from Zscaler and Netskope to Cloudflare One in hours rather than months, and this makes that capability self-serve for any customer or partner at any time.
  • The stack also handles ongoing operational tasks like recommending security rules based on live traffic, investigating anomalies in web gateway logs, and reporting on user experience metrics through the Digital Experience Monitoring toolkit, making it useful beyond initial migration scenarios.

23:48 📢 Justin – “This announcement also just taught me that Cloudflare has a Zscaler and Netskope competitor, which I did not know.”

AWS

28:21 Amazon S3 annotations: attach rich, queryable context directly to your objects

  • S3 Annotations is a new metadata capability that lets you attach up to 1,000 named annotations per object, each up to 1 MB, totaling up to 1 GB per object, in formats like JSON, XML, YAML, or plain text. 
  • This addresses a long-standing limitation where rich object context had to live in separate databases or sidecar files, requiring complex synchronization.
  • Annotations are mutable and move automatically with objects during copy, replication, and cross-region transfers, which is a meaningful improvement over the existing 10-tag limit and 2 KB user-defined metadata headers that S3 has historically offered.
  • When S3 Metadata is enabled, annotations automatically flow into Apache Iceberg-backed annotation tables queryable via Amazon Athena, with backfill support for existing annotated objects. The tables adapt to any JSON, XML, or YAML structure without schema migrations, and you can also query them using natural language through the S3 Tables MCP server.
  • Practical use cases include media companies tracking AI-generated transcripts and content ratings, financial services attaching sentiment analysis to research documents, and life sciences teams annotating clinical trial data for compliance audits without needing to restore archived objects from S3 Glacier.
  • Annotation storage is billed at S3 Standard rates regardless of the parent object’s storage class, so teams storing annotations on Glacier objects should factor that cost difference into their planning. 
  • The feature is available today in all AWS Regions, including China Regions, with annotation tables available wherever S3 Metadata is supported.
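
The limits quoted above (1,000 annotations per object, 1 MB each, 1 GB total) are easy to check locally before uploading anything. The sketch below does only that arithmetic; it does not call any AWS API. `check_annotations` is a hypothetical helper, and reading "MB" and "GB" as decimal units is an assumption.

```python
from typing import Dict, List

MAX_ANNOTATIONS = 1_000
MAX_ANNOTATION_BYTES = 1_000_000   # assumption: 1 MB means 10**6 bytes
MAX_TOTAL_BYTES = 1_000_000_000    # assumption: 1 GB means 10**9 bytes


def check_annotations(annotations: Dict[str, bytes]) -> List[str]:
    """Return a list of limit violations; an empty list means all is well."""
    problems = []
    if len(annotations) > MAX_ANNOTATIONS:
        problems.append(
            f"{len(annotations)} annotations exceeds {MAX_ANNOTATIONS}"
        )
    total = 0
    for name, body in annotations.items():
        total += len(body)
        if len(body) > MAX_ANNOTATION_BYTES:
            problems.append(f"annotation '{name}' is larger than 1 MB")
    if total > MAX_TOTAL_BYTES:
        problems.append("total annotation size exceeds 1 GB")
    return problems


ok = check_annotations({"transcript": b'{"lang": "en"}'})
too_big = check_annotations({"rating": b"x" * (MAX_ANNOTATION_BYTES + 1)})
```

With these numbers the per-object total follows from the other two limits (1,000 × 1 MB = 1 GB), so the final check only fires when one of the earlier ones already has.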

29:57 📢 Justin – “This is a pretty handy improvement. I’d kinda got to the point where I thought S3 had all the features it could possibly have, and they just keep surprising me.”

31:15 Introducing AWS Continuum for security at machine speed

  • AWS Continuum is a new security service in gated preview that automates the full vulnerability lifecycle, from discovery and prioritization to validation and remediation, using AI agents operating within guardrails defined by the security team.
  • The service addresses a common pain point where teams already have vulnerability findings but spend significant time on manual triage, exploitability validation, and cross-team coordination before fixes are deployed. Continuum handles that middle work automatically.
  • A notable technical detail is the sandbox-based exploit validation, where Continuum builds reproducible proof of exploitability in an isolated environment before flagging a vulnerability as confirmed, reducing noise from theoretical findings.
  • Continuum integrates with existing AWS security tooling, including GuardDuty and Security Hub, and absorbs the previously separate AWS Security Agent capabilities under a unified product umbrella as Continuum penetration testing and Continuum code scanning.
  • A new threat modeling feature is also launching in preview, automatically generating STRIDE-format threat models from design documents or source code, which could reduce the manual effort typically required during architecture review processes. 
  • Pricing has not been publicly disclosed yet, and access requires requesting entry to the gated preview at aws.amazon.com/continuum.

32:01 📢 Matt – “This is extremely nice. It reminds me a lot of what GitHub did with their security feature where they’re trying to help you prioritize…So actually prioritizing vulnerabilities, because if you have a large code base, it’s going to happen. Prioritization is the real key here.” 

37:35 AWS Security Agent adds threat modeling, Kiro power and Claude Code plugin, and more

  • AWS Security Agent, now part of AWS Continuum, has expanded beyond its re:Invent 2025 preview to cover the full software development lifecycle: threat modeling and design reviews at design time, code review at development time, and penetration testing (now GA) at deployment time.
  • The new threat modeling feature (preview) uses the STRIDE framework to analyze design documents or source code, mapping data flows, trust boundaries, and attack vectors, along with prioritized mitigations, thereby reducing the manual effort typically required for security architecture reviews.
  • Code review capabilities now support GitHub, GitLab, Bitbucket (including self-hosted versions), and Confluence, with pull request scanning that validates findings in simulated environments to confirm actual exploitability rather than just flagging potential issues.
  • The Kiro power and upcoming Claude Code plugin let developers trigger security scans, generate threat models, and remediate findings directly from their IDE without context switching, using an open MCP integration that works with any AI-powered IDE.
  • AWS Security Agent offers a 2-month free trial, with full pricing details on the product page. It is available in select AWS commercial regions, with regional availability and roadmap details listed on the AWS Capabilities by Region page.

38:29 📢 Justin – “It is not terribly priced for what it does; these are tools you’re spending a lot of money on, like threat modeling. Overall, I was not too scared off by the pricing on this one.”

40:24 AWS DevOps Agent adds release management capabilities to assess code changes before production (preview)

  • AWS DevOps Agent now includes release management capabilities in preview, adding pre-production code review and autonomous release testing to its existing post-deployment incident investigation features, effectively covering the full software delivery lifecycle.
  • The release readiness review feature evaluates pull requests against user-defined natural language standards or general best practices. It checks cross-repository dependency risks, compares access control changes against the Well-Architected Framework, and runs lightweight functional tests in an AWS-managed isolated environment before code enters the pipeline.
  • The autonomous release testing feature goes beyond static test suites by reasoning about what a specific code change does and generating tailored test plans covering functional correctness, behavioral regressions, and integration scenarios, producing structured artifacts including metrics, logs, and traces for each run.
  • Findings surface in multiple places, including the DevOps Agent console, GitHub and GitLab pull request comments, and directly in IDEs via Kiro or Claude Code plugins, with recommendations categorized as BLOCK, Proceed with Caution, or Safe to Release.
  • Both features are currently available at no additional cost during preview, limited to the US East (N. Virginia) Region, with GitHub or GitLab repository connectivity required to get started; standard DevOps Agent pricing applies to other features at aws.amazon.com/devops-agent/pricing.

47:04 Introducing Amazon Bedrock Managed Knowledge Base for faster, more accurate enterprise AI applications

  • Amazon Bedrock Managed Knowledge Base is a new fully managed RAG service that handles the entire pipeline, including storage, embeddings, re-ranking, and retrieval, allowing developers to connect to enterprise data sources such as S3, SharePoint, Confluence, Google Drive, and OneDrive without building custom connectors.
  • The Agentic Retriever feature addresses a real limitation in standard RAG by automatically creating multi-step query plans for complex questions, performing multi-hop retrieval across knowledge bases rather than relying on a single retrieval pass.
  • Smart Parsing automatically selects the right parsing strategy per data type and connector, handling multimodal content like images, tables, and video without manual configuration, which reduces the typical weeks of experimentation needed to reach production-quality retrieval.
  • The service integrates with AgentCore Gateway as a native target type, exposing knowledge bases via the Model Context Protocol so frameworks like LangChain, LlamaIndex, CrewAI, and Strands Agents can discover and use them without custom integration code.
  • Pricing is usage-based with no upfront commitments, charged on indexed data storage and number of retrievals, with availability in US East, US West, Asia Pacific, Europe, and AWS GovCloud regions today. 
  • Existing Bedrock Knowledge Bases API users can migrate by pointing to a new knowledge base ID with no code changes.

48:26 📢 Justin – “They’ve been on a journey with this one, trying to get something good.” 

49:42 Amazon CloudWatch Synthetics now supports multi-location canaries

  • CloudWatch Synthetics now supports multi-location canaries, letting you manage a single canary in one primary Region while CloudWatch automatically replicates it to additional Regions, consolidating all metrics and artifacts centrally. 
  • This eliminates the previous requirement of creating separate canaries per Region, which caused configuration drift and added operational overhead.
  • The feature is particularly useful for validating third-party dependencies like CDNs and payment processors across geographic locations, and for identifying region-specific performance bottlenecks that would otherwise be invisible from single-region monitoring.
  • A notable capability is the multi-location alarm configuration, which only triggers when issues are detected from multiple locations simultaneously, reducing alert noise and helping teams distinguish real customer-impacting problems from isolated regional blips.
  • Existing single-region canaries can be upgraded to multi-location by simply adding replica Regions without recreating them, lowering the migration barrier for teams already using CloudWatch Synthetics. The feature is available across all AWS commercial Regions that support CloudWatch Synthetics.
  • Pricing is not explicitly called out in the announcement, so teams should check the CloudWatch Synthetics pricing page before scaling out replica canaries, as replication across multiple Regions will likely increase canary run costs proportionally to the number of Regions selected.
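
To make the multi-location alarm idea concrete, here is a minimal Python sketch of the logic described above. It is not the CloudWatch API: `CanaryResult`, `should_alarm`, `relative_run_cost`, and the default threshold of two locations are hypothetical names and values chosen for illustration, and the cost helper simply encodes the assumption that every replica Region repeats the primary canary's schedule.

```python
from dataclasses import dataclass


@dataclass
class CanaryResult:
    location: str  # Region the canary ran from (hypothetical field)
    passed: bool   # whether that run succeeded


def should_alarm(results: list[CanaryResult], min_failing_locations: int = 2) -> bool:
    """Return True only when enough distinct locations report a failure."""
    failing = {r.location for r in results if not r.passed}
    return len(failing) >= min_failing_locations


def relative_run_cost(primary_runs: int, replica_regions: int) -> int:
    """Total canary runs, assuming each replica repeats the primary's schedule."""
    return primary_runs * (1 + replica_regions)


if __name__ == "__main__":
    blip = [CanaryResult("us-east-1", False), CanaryResult("eu-west-1", True)]
    outage = [CanaryResult("us-east-1", False), CanaryResult("eu-west-1", False)]
    print(should_alarm(blip))    # a single failing Region stays quiet
    print(should_alarm(outage))  # two failing Regions trip the alarm
    print(relative_run_cost(1000, 3))
```

Counting distinct failing locations rather than raw failures is what separates an isolated regional blip from a problem customers everywhere would notice.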

52:11 Run isolated sandboxes with full lifecycle control: AWS Lambda introduces MicroVMs 

  • AWS Lambda MicroVMs is a new serverless compute primitive built on Firecracker that provides VM-level isolation with near-instant startup, targeting multi-tenant applications that need to run user- or AI-generated code safely. It fills the gap between containers (fast but shared kernel) and full VMs (strong isolation but slow to start).
  • The image-then-launch model works by running your Dockerfile, initializing your application, and snapshotting the running memory and disk state, so every subsequent MicroVM launch resumes from that pre-initialized snapshot rather than booting cold. This means even large, stateful sessions start quickly enough to feel responsive to end users.
  • Each MicroVM supports up to 16 vCPUs, 32 GB of memory, and 32 GB of disk, with up to 8 hours of total runtime and configurable idle suspension policies that preserve full state while reducing cost. Auto-resume on incoming requests means the suspend/resume cycle is transparent to end users.
  • Practical use cases include AI coding assistants, interactive data analytics sessions, vulnerability scanners, and game servers running user-supplied scripts, all scenarios where you need per-user isolation without building custom virtualization infrastructure. Lambda Functions and Lambda MicroVMs are designed to complement each other rather than compete.
  • Lambda MicroVMs are available now in US East (N. Virginia, Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo) on ARM64 architecture. 
  • Pricing is on the AWS Lambda pricing page and follows usage-based billing, with suspended MicroVMs incurring lower costs than running ones.

53:28 📢 Matt – “This is one of those things it sounds really cool, but I don’t have a good use case to play with it yet.”

55:13 Amazon MSK now offers AI Agent Skills to help developers operate MSK efficiently and accelerate migrations to MSK

  • Amazon MSK now offers AI Agent Skills that integrate with coding assistants like Kiro, Claude Code, and Cursor to provide guided help for common Kafka operational tasks, including troubleshooting, sizing, configuring, monitoring, and cluster migration.
  • The skills are accessed through the AWS Agent Toolkit, which developers configure via the AWS CLI, then query conversationally with questions like “Is my Kafka cluster compatible with MSK Express?”, turning specialized knowledge into a self-service experience.
  • A key use case is accelerating migrations from self-managed Kafka to MSK Express, which offers up to 3x more throughput per broker, 20x faster scaling, and 90% reduced recovery time compared to Standard brokers running Apache Kafka.
  • This fits into AWS’s broader Agent Toolkit ecosystem, suggesting a pattern where AWS services will increasingly expose operational knowledge as consumable skills for AI coding agents rather than relying solely on documentation or support tickets.
  • No additional pricing was announced for the AI Agent Skills themselves, though standard MSK and MSK Express cluster costs apply based on broker type, size, and usage.

56:16 📢 Justin – “I welcome this new feature. It’s great.” 

1:01:29 Amazon CloudWatch Logs supports managed syslog ingestion

  • CloudWatch Logs now supports native syslog ingestion from network devices like firewalls, routers, switches, and Linux servers via a VPC endpoint, removing the need to deploy and manage log collection agents across infrastructure.
  • The feature supports three common syslog formats: RFC 5424, RFC 3164, and Cisco FTD/ASA. Together these cover a broad range of enterprise networking equipment and make adoption straightforward for teams already using Cisco gear.
  • CloudWatch automatically parses incoming syslog messages and extracts structured fields like facility, severity, hostname, and application name, so teams can immediately query logs using Logs Analytics without building custom parsing pipelines.
  • Transport options include TCP, TCP+TLS, and UDP, giving teams flexibility to match their existing device configurations while the TLS option addresses security requirements for sensitive log data in transit.
  • Standard CloudWatch Logs ingestion and storage pricing applies, and the feature is available in all commercial AWS Regions except Middle East (UAE), Middle East (Bahrain), and Israel (Tel Aviv).
  • Documentation is available at the CloudWatch Logs docs linked in the announcement.

1:01:43 📢 Justin – “Another feature built, I imagine, by AI, because this is something I’ve asked for for years. I’ve basically just come to the conclusion that everything I’ve wanted for years that doesn’t have enough revenue… is just being written by AI Agents over there.” 
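
For a sense of what "extracting structured fields" means here, the following is a rough Python sketch that pulls facility, severity, hostname, and application name out of an RFC 5424 header. It is not how CloudWatch implements parsing; `parse_rfc5424` and the regular expression are our own illustration. The one fixed rule it relies on comes from the RFC itself: the priority value in angle brackets equals facility × 8 + severity.

```python
import re

# Severity names indexed by their RFC 5424 numeric code (0-7).
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

# Header layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID ...
RFC5424_HEADER = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)$",
    re.DOTALL,
)


def parse_rfc5424(line: str) -> dict:
    """Extract a few structured fields from one RFC 5424 syslog line."""
    match = RFC5424_HEADER.match(line)
    if match is None:
        raise ValueError("not an RFC 5424 message")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(match["pri"]), 8)
    return {
        "facility": facility,
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "timestamp": match["timestamp"],
        "hostname": match["hostname"],
        "app_name": match["app_name"],
    }


if __name__ == "__main__":
    sample = (
        "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
        "'su root' failed for lonvick on /dev/pts/8"
    )
    print(parse_rfc5424(sample))
```

RFC 3164 and the Cisco formats use different header layouts, so a real pipeline would need a parser per format, which is the work the managed feature takes off your plate.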

GCP

1:03:07 Google AI Studio’s Interactions API for Gemini models and agents

  • Google’s Interactions API has reached general availability and is now the primary interface for Gemini models and agents, replacing the older generateContent API as the default for Google AI Studio and all documentation. 
  • The API uses a simplified step-based schema and is available through Python and JavaScript SDKs.
  • Managed Agents is a notable addition where a single API call provisions a remote Linux sandbox capable of reasoning, executing code, browsing the web, and managing files. 
  • Developers can use the default Antigravity agent or define custom agents with their own instructions, skills, and data sources.
  • Background execution lets developers set background=True on any call to run interactions asynchronously, which is useful for long-running tasks. The API also supports mixing built-in tools like Google Search and Google Maps with custom functions in a single request.
  • On the cost side, Flex inference offers a 50% cost reduction compared to Priority tier, giving developers a way to trade latency for lower pricing. 
  • Paid tier users also get 55-day retention on past interactions, which is useful for stateful agentic workflows.
  • The legacy generateContent API remains supported and will continue receiving new mainline Gemini models, but Google has signaled that frontier capabilities for long-running and agentic use cases will land exclusively on the Interactions API going forward. 
  • A migration guide is available at ai.google.dev/gemini-api/docs/migrate-to-interactions for teams planning to transition.

1:04:20 📢 Justin – “Well, that’s nice that Google didn’t kill the legacy one.” 

1:05:15 Query logs and traces with SQL in Observability Analytics

  • Log Analytics has been renamed Observability Analytics and now includes generally available support for querying trace data alongside logs using SQL, all within Cloud Logging without needing to move or duplicate data.
  • The core capability lets you write SQL queries that JOIN log and trace data together, enabling analysis like finding checkout requests over 5 seconds and identifying which microservice caused the slowdown, or calculating P95 latency across thousands of AI agent tool calls.
  • A notable use case is AI agent observability, where teams can run aggregate queries across millions of span events to calculate failure rates and latency percentiles per tool, then drill down by joining trace spans with logs to extract the exact LLM prompts that led to failures.
  • The Observability API is now GA, allowing teams to create linked BigQuery datasets from their observability buckets so AI agents and analytical workloads can query telemetry programmatically via standard BigQuery APIs, which is useful for automated monitoring pipelines.
  • Pricing is not explicitly detailed in the announcement, so teams should check Cloud Logging and BigQuery pricing pages directly, though the in-place analysis approach is positioned as a way to reduce costs compared to exporting and duplicating data elsewhere. 
  • Query examples are available at github.com/GoogleCloudPlatform/observability-analytics-samples.
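
The per-tool analysis described above would be written as SQL in Observability Analytics; as a language-neutral illustration, here is the same logic sketched in plain Python over in-memory records. The field names (`trace_id`, `tool`, `duration_ms`, `ok`) are assumptions made for the example, not the product's actual schema, and P95 is computed with the nearest-rank method, which may differ slightly from the percentile function a SQL engine uses.

```python
from collections import defaultdict


def p95(values: list[float]) -> float:
    """95th percentile using the nearest-rank method (1-based rank)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def tool_stats(spans: list[dict]) -> dict:
    """Aggregate failure rate and P95 latency per tool."""
    by_tool = defaultdict(list)
    for span in spans:
        by_tool[span["tool"]].append(span)
    stats = {}
    for tool, rows in by_tool.items():
        failures = sum(1 for row in rows if not row["ok"])
        stats[tool] = {
            "failure_rate": failures / len(rows),
            "p95_ms": p95([row["duration_ms"] for row in rows]),
        }
    return stats


def logs_for_failed_spans(spans: list[dict], logs: list[dict]) -> list[dict]:
    """Join logs to failed spans on trace_id (the drill-down step)."""
    failed_traces = {s["trace_id"] for s in spans if not s["ok"]}
    return [log for log in logs if log["trace_id"] in failed_traces]


if __name__ == "__main__":
    spans = [
        {"trace_id": "t1", "tool": "search", "duration_ms": 100, "ok": True},
        {"trace_id": "t2", "tool": "search", "duration_ms": 400, "ok": False},
    ]
    logs = [{"trace_id": "t2", "text": "prompt that led to the failure"}]
    print(tool_stats(spans))
    print(logs_for_failed_spans(spans, logs))
```

The two functions mirror the two query shapes in the announcement: an aggregate over span events first, then a join back to logs for the traces that failed.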

Azure

1:25:32 What’s new with Microsoft in open source and Kubernetes at Open Source Summit and KubeCon India 

  • AKS now offers agent pool rollback in general availability, letting operators revert both the Kubernetes version and node image with a single command across all node pool types, which reduces recovery time from bad upgrades without requiring manual reprovisioning or snapshot management.
  • Azure Kubernetes Fleet Manager now supports up to 1,000 member clusters, up from 200, and Managed Fleet Namespaces are generally available, allowing teams to define namespaces once as ARM resources and propagate them consistently across large multi-cluster estates, including Arc-enabled hybrid and multi-cloud environments.
  • GPU efficiency gets two notable additions: configurable scheduler profiles let teams pack pods more densely using the upstream Kubernetes scheduling framework without running a custom scheduler, and GPU memory profiling in preview adds function-level visibility through Prometheus and Grafana to catch memory leaks before out-of-memory crashes occur.
  • Artifact streaming from Azure Container Registry reduces pod startup for images under 10 GB from minutes to seconds by streaming only the layers needed at startup rather than pulling full images, which directly improves scale-out responsiveness for AI workloads.
  • The Azure SRE Agent now covers AKS incident scenarios in preview, automatically gathering diagnostic evidence and attributing failures to specific layers like workload, network, or cluster before proposing a next step, with writes remaining approval-gated and audited for operator control.

1:08:07 📢 Matt – “A lot of these are just nice quality of life. Being able to provision your namespace in ARM, especially when you’re redeploying across multiple environments, is a nice quality-of-life improvement.”

1:09:15 Rethinking cloud operations with agentic observability

  • Microsoft announced the general availability of the Azure Copilot Observability Agent, built on Azure Monitor, which correlates logs, metrics, traces, and topology signals across agents, applications, and infrastructure to help operators identify root causes faster. 
  • Pricing details were not disclosed in the announcement, so listeners should check the Azure Monitor pricing page for specifics.
  • The agent addresses a real operational pain point: a Microsoft and Material survey of 250 IT decision-makers found 84% report increased cloud complexity and 69% say it is outpacing their current operating model. The tool aims to reduce the manual effort of piecing together context across multiple monitoring tools.
  • Early customer results are notable, with KPMG reporting an estimated 250 engineering hours reclaimed monthly after adopting the capabilities. Other customers like PolicyVault and Ontinue report faster incident investigation by correlating telemetry with Azure resource health and surfacing actionable next steps.
  • The agent fits into a broader agentic operations model Microsoft is building on Azure, where systems generate signals, agents interpret and act on them, and outcomes feed back into improving future cycles. Governance features, including policy controls, auditability, and human oversight guardrails, are positioned as central to this model.
  • This is worth watching for teams running complex Azure workloads who currently rely on manual incident triage across multiple tools. The integration directly into existing workflows rather than requiring a separate platform is a practical consideration for 

Cloud Journey 

1:11:01 Running an AI-native engineering org 

  • Anthropic’s engineering director describes how agentic coding shifted the primary bottleneck from writing code to verifying it, meaning code review, security checks, and correctness validation now consume the time that implementation used to take.
  • The team replaced traditional sprint planning and design docs with just-in-time planning built around prototypes and PR discussions, reflecting that long-horizon roadmaps became obsolete when execution speed increased substantially.
  • Human review is now reserved for specific high-stakes areas like security-sensitive code, legal risk, and product judgment, while Claude handles style, linting, bug catching, and test generation automatically.
  • Role boundaries have blurred noticeably, with product managers writing more code and engineers taking on design and content work, which has practical implications for how teams hire and structure responsibilities.
  • The article suggests engineering leaders track three metrics as they adopt agentic workflows and cautions against treating throughput as the primary success measure, since the real goal is solving the underlying problem faster, not just generating more output.

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, at theCloudPod.net, where you can join our newsletter or Slack team, send feedback, or ask questions. You can also tweet at us with the hashtag #theCloudPod.
