Welcome to episode 342 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Matt are in the studio today to bring you all the latest in cloud and AI news this week. How do you feel about ads? How do you feel about ads while using AI? We’ve got options! We’ve got a round-up of tech Super Bowl ads, AI ads, earnings reports (from companies that frankly need the ad revenue), and a plethora of Opus 4.6 announcements, plus more. Let’s get started!
Titles we almost went with this week
- 🧑‍💼 ChatGPT Goes Full Mad Men: Your AI Assistant Now Comes With Commercial Breaks
- 🆕 Heroku’s New Feature: No New Features
- 📈 AWS Gives EC2 Instances a Storage Growth Spurt: 22.8TB of Local NVMe Now Available
- 👯 Identity Crisis Averted: IAM Identity Center Learns to Replicate Itself
- 🧑‍⚖️ JSON Schema Enforcement: Because Your LLM Needs Structure in Its Life
- 🏃 From Zero to Admin in 480 Seconds: A Serbian Speedrun Story
- 🦞 From Proof of Concept to Proof of Claw: DigitalOcean Tames AI Agent Infrastructure
- ☁️ Azure’s Growth Hits the Clouds: Microsoft’s 39% Increase Still Not Enough for Wall Street
- 🚣 One Lake to Rule Them All: Microsoft and Snowflake Finally Stop Fighting Over Your Data
- 🍴 Free Lunch Officially Over: ChatGPT Learns That Servers Cost Money
- 💳 Claude Won’t Sell You Anything (Except Maybe Peace of Mind)
- 🗺️ IAM Identity Center Goes Multi-Regional: Because One Region to Rule Them All Wasn’t Enough
- 🏢 Databricks Takes the Base Out of Database with Lakebase GA
- 🌐 I’m a Chrome Tab hoarder
General News
01:30 Super Bowl Ads of Note
- OpenAI: https://www.youtube.com/watch?v=aCN9iCXNJqQ
- Microsoft CoPilot: https://www.youtube.com/watch?v=Ndj9Jk-tGKo
- Base44?: https://www.youtube.com/watch?v=iKEUWtqvsis
- Gemini: https://www.youtube.com/watch?v=Z1yGy9fELtE
- Anthropic: https://www.youtube.com/watch?v=gmnjDLwZckA
- ai.com: https://www.youtube.com/watch?v=n7I-D4YXbzg&t=3s
16:35 📢 Justin – “If you ever want to know if there’s a bubble, spending dumb money on the Super Bowl on an ad that makes no sense is probably your number one clue.”
16:53 It’s Earnings Time!
Microsoft (MSFT) Q2 earnings report 2026
- Microsoft Q2 2026 earnings show Azure cloud growth slowing to 39% from 40% in the prior quarter, missing analyst expectations of 39.4% and causing shares to drop 7% in after-hours trading.
- The company’s gross margin hit a three-year low at 68% due to substantial AI infrastructure investments totaling $37.5 billion in capital expenditures, up 66% year over year.
- OpenAI now represents 45% of Microsoft’s $625 billion remaining commercial performance obligation after the company committed to a $250 billion cloud services deal during the quarter.
- This concentration raises questions about revenue dependence on a single customer, though Microsoft maintains that the remaining backlog is still larger and more diversified than most competitors’, with 28% growth.
- Microsoft 365 Copilot adoption reached 15 million seats out of 450 million total paid commercial seats, representing only 3.3% penetration.
- The company plans to raise prices on commercial Office subscriptions in July to help offset AI infrastructure costs and improve margins, while Q3 guidance projects Azure growth of 37-38% at constant currency.
- The More Personal Computing segment declined 3%, with gaming revenue down 9.5% due to an unspecified impairment charge, reflecting ongoing challenges in the Xbox division.
- Microsoft added nearly one gigawatt of data center capacity in the quarter alone, but continues to face supply constraints that cannot keep pace with customer demand for AI services.
20:27 Alphabet (GOOGL) Q4 2025 earnings
- Alphabet plans to spend between $175 billion and $185 billion on capital expenditures in 2026, more than double its 2025 spending, primarily targeting AI compute capacity for DeepMind and meeting cloud customer demand.
- This represents one of the largest infrastructure investments in tech history and signals the scale of resources required to compete in enterprise AI.
- Google Cloud revenue grew 48% year-over-year to $17.66 billion and beat analyst expectations, with backlog reaching $240 billion after increasing 55% sequentially.
- The cloud division’s performance demonstrates strong enterprise adoption of Google’s AI services and positions it as a more competitive alternative to AWS and Azure.
- Gemini AI now has 750 million monthly active users, up from 650 million last quarter, while Google reduced Gemini serving costs by 78% throughout 2025 through model optimizations and efficiency improvements.
- This cost reduction is critical for maintaining profitability as AI services scale to hundreds of millions of users.
- YouTube advertising revenue of $11.38 billion missed analyst expectations of $11.84 billion, which Alphabet attributed to difficult year-over-year comparisons against strong US election spending in Q4 2024.
- This shortfall highlights how political advertising cycles create volatility in digital ad revenue forecasting.
- Waymo recorded a $2.1 billion stock-based compensation charge following its $16 billion valuation fundraising round, contributing to Other Bets losses exceeding $3.6 billion despite serving 15 million autonomous rides across six US markets.
- The charge reflects the high cost of retaining talent in competitive autonomous vehicle development.
22:05 📢 Justin – “Gemini adoption must be ramping up much faster than I realized, because the fact that Microsoft was missing on earnings, and they’re the OpenAI provider for the most part… makes me question how well OpenAI is actually doing.”
22:50 AWS Q4 earnings report 2025
- AWS Q4 2025 revenue reached $35.58 billion with 24% year-over-year growth, maintaining its market leadership position, while operating margins improved to 35%.
- The cloud unit now represents 17% of Amazon’s total revenue but generates the majority of the company’s profits at $12.47 billion in operating income.
- Amazon plans to invest $200 billion in capital expenditures for 2026, primarily for AWS infrastructure, which significantly exceeds analyst expectations of $148.86 billion.
- The company added 4 gigawatts of computing capacity in 2025 and plans to double that by the end of 2027, with most investment directed toward AI workloads rather than traditional cloud services.
- AWS growth rate of 24% trails competitors Google Cloud at 48% and Azure at 39%, suggesting potential market share shifts in AI-driven cloud services. Both competitors are reporting stronger growth attributed to artificial intelligence workloads, which may indicate AWS is losing ground in the AI infrastructure race despite its overall market leadership.
- The company secured a $38 billion spending commitment from OpenAI and launched Nova Forge for advanced AI model customization at $100,000 annually.
- These moves demonstrate AWS’s strategy to compete in the generative AI training market, though the pricing and approach differ from competitors’ offerings.
- Capital expenditure guidance reveals that non-AI workloads are growing faster than anticipated, requiring additional infrastructure investment beyond AI capacity.
- This indicates traditional cloud computing demand remains strong and may be underestimated in current market analysis focused primarily on AI growth.
25:11 Capex Growth By Quarter
24:14 📢 Justin – “They also took a major write-off on Amazon Fresh, because they’re shutting that down as well. So just bad, bad all the way around for Amazon.”
29:23 An Update on Heroku
- Heroku is moving to a sustaining engineering model, meaning no new features will be developed while the platform continues to receive security patches, stability updates, and operational support.
- This represents a shift from active development to maintenance mode for the 15-year-old platform-as-a-service.
- Existing customers can continue using Heroku with no changes to pricing, billing, or service levels, and all core functionality, including applications, pipelines, teams, and add-ons, remains fully operational.
- Credit card-based accounts remain available for both current and new customers through the dashboard.
- Salesforce is ending new Enterprise Account contracts while honoring existing enterprise subscriptions and support agreements through their renewal periods. This signals a strategic pivot away from enterprise sales expansion while maintaining commitments to current large customers.
- The parent company is redirecting engineering resources toward enterprise AI capabilities rather than continuing platform-as-a-service innovation. This follows a pattern of Salesforce deprioritizing Heroku since the acquisition, including the 2022 elimination of free tiers and reduced feature velocity in recent years.
- Developers relying on Heroku for production workloads should evaluate long-term platform viability given the maintenance-only status, though no immediate migration is required.
- The announcement provides clarity for capacity planning but raises questions about the platform’s competitiveness as cloud-native alternatives continue advancing.
31:32 📢 Matt – “It’s a great platform as a service, and I’m sad to see it go, because there’s a lot of companies I’ve worked with in the past that have started there because it was just so easy. The problem for them, at least back in the day, was scaling and supporting and having a lot of other features, which meant I helped a lot of customers move from Heroku to AWS to gain other aspects of the platform that they needed. So it doesn’t really surprise me, but it was a good starting point for a lot of companies.”
35:58 AI-assisted cloud intrusion achieves admin access in 8 minutes | Sysdig
- An attacker achieved full AWS administrative access in just 8 minutes by exploiting credentials found in public S3 buckets, then used Lambda code injection to escalate privileges.
- The attack shows strong evidence of LLM assistance, including Serbian-language code comments, hallucinated AWS account IDs, and references to non-existent GitHub repositories.
- The threat actor compromised 19 different AWS principals through role chaining and cross-account access attempts, making detection difficult by distributing operations across multiple identities. They specifically targeted AI infrastructure by invoking 9 different Bedrock models and attempting to launch expensive GPU instances (p5.48xlarge and p4d.24xlarge) for potential model training or compute resale.
- The attack demonstrates how AI tools are accelerating offensive operations, with the attacker completing reconnaissance, privilege escalation, and resource abuse in under two hours.
- Organizations should implement least-privilege IAM policies, restrict Lambda UpdateFunctionCode permissions, and enable Bedrock model invocation logging to detect similar attacks (a rough example follows this list).
- Critical security gaps included overly permissive Lambda execution roles with administrative access and the ReadOnlyAccess policy on the compromised user, which enabled extensive reconnaissance across all AWS services.
- The attacker also attempted to deploy a Terraform-based backdoor that would create a publicly accessible Lambda function for generating persistent Bedrock credentials.
- The use of IP rotation, role chaining, and distributed operations across multiple principals shows sophisticated evasion techniques.
- Detection requires behavioral analytics that can identify patterns like rapid enumeration across services, unusual Bedrock model invocations, and Lambda code modifications rather than relying on single-event alerts.
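To make the mitigations concrete, here’s a minimal boto3 sketch of two of them: an explicit deny on Lambda code changes for roles that should never push code, and turning on Bedrock model invocation logging. Role, log-group, and ARN values are illustrative placeholders, not from the Sysdig write-up:

```python
# Sketch of two mitigations from the article; all resource names are placeholders.
import json
import boto3

iam = boto3.client("iam")
bedrock = boto3.client("bedrock")

# 1. Explicitly deny Lambda code-injection paths on a role that only needs read access.
deny_lambda_code_changes = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyLambdaCodeInjection",
        "Effect": "Deny",
        "Action": ["lambda:UpdateFunctionCode", "lambda:UpdateFunctionConfiguration"],
        "Resource": "*",
    }],
}
iam.put_role_policy(
    RoleName="app-readonly-role",  # hypothetical role name
    PolicyName="deny-lambda-code-injection",
    PolicyDocument=json.dumps(deny_lambda_code_changes),
)

# 2. Log every Bedrock model invocation so unusual model usage is auditable.
bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/invocations",  # hypothetical log group
            "roleArn": "arn:aws:iam::123456789012:role/bedrock-logging",  # placeholder
        },
        "textDataDeliveryEnabled": True,
    }
)
```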
34:24 📢 Ryan – “These are the types of examples I use when trying to talk to people about least-privilege development and how, even in your lower environments where you think you’re safe and you’re trying to develop things, it’s really not okay to start not using least-privilege access, because there are very creative ways in which you can do privilege escalation – this Lambda attack is a very good example. And now it’s going to be so easy because AI will just do it for you, and this really demonstrates it.”
AI Is Going Great – Or How ML Makes Money
37:09 Claude is a space to think \ Anthropic
- Anthropic commits to keeping Claude ad-free, stating that advertising would be incompatible with Claude’s role as a trusted assistant for work and deep thinking.
- The company will continue its subscription and enterprise-based revenue model rather than introducing sponsored content or product placements in conversations.
- Analysis of Claude conversations shows a substantial portion involves sensitive personal topics or complex technical work where ads would be inappropriate. Anthropic argues that AI conversations differ from search or social media because users share more context, and the open-ended format makes them more susceptible to commercial influence.
- The company identifies specific risks with ad-supported AI models, including unpredictable behavior changes when advertising incentives are introduced. For example, a user asking about sleep problems might receive recommendations influenced by commercial motives rather than purely helpful advice, making it difficult to distinguish genuine assistance from monetization attempts.
- Anthropic will support commerce through user-initiated interactions like agentic commerce, where Claude handles purchases on behalf of users, and third-party tool integrations with services like Figma and Asana.
- The key distinction is that these features are triggered by user requests rather than advertiser interests.
- The decision has clear tradeoffs for business model scalability compared to ad-supported competitors.
- Anthropic is addressing access through educational partnerships in 60+ countries, nonprofit discounts, and maintaining frontier-level intelligence in free tiers rather than monetizing user attention.
37:22 Claude Opus 4.6 \ Anthropic
- Claude Opus 4.6 is now generally available with a 1M token context window in beta, marking the first time an Opus-class model has offered this extended context capability.
- The model maintains $5/$25 per million token pricing, with premium pricing of $10/$37.50 for prompts exceeding 200k tokens.
- The model introduces adaptive thinking and four effort levels (low, medium, high, max) that let developers control how deeply Claude reasons through problems, balancing intelligence against speed and cost (a rough API sketch follows this list). Context compaction automatically summarizes older conversation history when approaching limits, enabling longer-running agentic tasks without hitting context windows.
- Opus 4.6 achieves state-of-the-art performance on Terminal-Bench 2.0 for agentic coding and outperforms GPT-5.2 by 144 Elo points on GDPval-AA, an evaluation of economically valuable knowledge work tasks.
- On the 8-needle 1M variant of MRCR v2, it scores 76% compared to Sonnet 4.5’s 18.5%, demonstrating substantially improved long-context retrieval without degradation.
- New product features include agent teams in Claude Code that work in parallel and coordinate autonomously, plus Claude in PowerPoint (research preview) and upgraded Claude in Excel for handling multi-step data processing and presentation tasks. The model also supports 128k output tokens and US-only inference at 1.1x pricing for compliance-sensitive workloads.
- Safety evaluations show Opus 4.6 maintains alignment comparable to its predecessor while exhibiting the lowest over-refusal rate of any recent Claude model.
- Anthropic developed six new cybersecurity probes to monitor potential misuse given the model’s enhanced security capabilities, and is using the model to find and patch vulnerabilities in open-source software.
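For the API-minded, here’s a minimal sketch of the new effort control, assuming the Python SDK exposes it as a top-level `effort` field — the parameter name and model ID here are our guesses from the announcement, not verified API names:

```python
# Hedged sketch: `effort` and the model ID are assumptions from the launch post;
# check Anthropic's Messages API docs for the exact parameter names.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",  # assumed model identifier
    max_tokens=2048,
    effort="max",             # one of low / medium / high / max per the post
    messages=[{"role": "user", "content": "Plan a refactor of this 500-file repo."}],
)
print(response.content[0].text)
```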
34:24 📢 Ryan – “One of the things that I’m constantly dabbling with is the context windows, and so I’m not so sure the context compaction works the way it’s advertised, because every time I go through a process like that, you lose so much.”
43:18 Introducing OpenAI Frontier | OpenAI
- OpenAI launches Frontier, an enterprise platform for building, deploying, and managing AI agents across existing infrastructure without requiring replatforming.
- The platform provides agents with shared business context by connecting siloed data warehouses, CRM systems, and internal applications, plus includes identity management, permissions, and governance controls for regulated environments.
- Frontier includes an agent execution environment where AI coworkers can reason over data, work with files, run code, and use tools while building memory from past interactions to improve performance.
- The platform works across local environments, enterprise cloud infrastructure, and OpenAI-hosted runtimes, with built-in evaluation and optimization capabilities to help agents learn what good performance looks like over time.
- OpenAI pairs Forward Deployed Engineers with customer teams to help develop best practices for production agent deployments, creating a feedback loop between business problems, deployment, and OpenAI Research. Early adopters include HP, Intuit, Oracle, State Farm, Thermo Fisher, and Uber, with existing customers like BBVA, Cisco, and T-Mobile piloting the platform.
- The platform uses open standards to integrate with existing systems and applications, allowing third-party agent apps to access shared business context without lengthy custom integrations. OpenAI is working with Frontier Partners including Abridge, Clay, Ambience, Decagon, Harvey, and Sierra, to design and support enterprise AI solutions on the platform.
- Frontier is currently available to a limited set of customers with broader availability planned over the next few months.
- OpenAI cites customer results, including a manufacturer reducing production optimization from six weeks to one day and a hardware company cutting test failure debugging from four hours to minutes.
44:35 📢 Ryan – “I think they’re extremely late to the market with this. AWS was too early, and they botched it. Gemini seems to be in the sweet spot, and OpenAI – it’s still not ready yet.”
46:28 Introducing GPT-5.3-Codex | OpenAI
- OpenAI released GPT-5.3-Codex, their most capable agentic coding model that combines the frontier coding performance of GPT-5.2-Codex with the reasoning capabilities of GPT-5.2, while running 25% faster.
- The model achieves state-of-the-art results on SWE-Bench Pro and Terminal-Bench 2.0 benchmarks, using fewer tokens than previous models, and can autonomously iterate on complex projects over millions of tokens spanning days.
- GPT-5.3-Codex represents the first self-improving model at OpenAI, where the Codex team used early versions to debug its own training, manage deployment, and diagnose test results.
- Internal teams report their work has fundamentally changed in the past two months, with researchers using Codex to monitor training runs, engineers using it to optimize harnesses and scale GPU clusters, and data scientists building custom pipelines and visualizations in under three minutes.
- The model extends beyond code generation to full computer operation, showing strong performance on OSWorld (visual desktop environment tasks) and matching GPT-5.2 on GDPval, which measures knowledge work across 44 occupations, including presentations, spreadsheets, and other professional deliverables.
- The Codex app now provides real-time updates and interactive steering, allowing users to direct and supervise multiple agents working in parallel.
- OpenAI classifies GPT-5.3-Codex as having high capability for cybersecurity under their Preparedness Framework, marking the first model directly trained to identify software vulnerabilities.
- They are deploying Trusted Access for Cyber, expanding the Aardvark security research agent beta, and committing 10 million dollars in API credits through their Cybersecurity Grant Program for open source and critical infrastructure defense.
- GPT-5.3-Codex is available now with paid ChatGPT plans across the Codex app, CLI, IDE extension, and web, with API access coming soon.
- The model was co-designed for and trained on NVIDIA GB200 NVL72 systems, with infrastructure improvements delivering the 25% speed increase for all Codex users.
47:48 📢 Ryan – “I’m surprised this is the first self-improving model.”
48:43 Testing ads in ChatGPT | OpenAI
- OpenAI is launching ads in ChatGPT for free and Go tier users in the US, while Plus, Pro, Business, Enterprise, and Education subscribers remain ad-free. Users can opt out of ads on the free tier in exchange for reduced daily message limits.
- Ads are contextually matched to conversation topics and chat history but do not influence ChatGPT responses, which remain independent. Advertisers receive only aggregate performance metrics like views and clicks, with no access to individual chats, memories, or personal details.
- The ad program excludes users under 18 and blocks ads near sensitive topics, including health, mental health, and politics. Users can dismiss ads, provide feedback, delete ad data with one tap, and manage personalization settings at any time.
- OpenAI positions this as infrastructure funding to maintain free tier performance and quality while supporting development of more powerful features.
- The company plans to expand ad formats, objectives, and buying models over time based on test results and user feedback.
49:45 Announcing Claude Opus 4.6 on Snowflake Cortex AI
- Snowflake Cortex AI now offers Claude Opus 4.6, Anthropic’s most capable model, providing enhanced reasoning and complex task handling directly within Snowflake’s data platform.
- This integration allows enterprises to process sensitive data without moving it outside their Snowflake environment, maintaining data governance and security controls.
- Claude Opus 4.6 delivers improved performance on coding tasks, mathematical reasoning, and multilingual capabilities compared to previous versions. The model excels at nuanced instructions and can handle sophisticated analysis workflows while operating on structured and unstructured data within Snowflake.
- Cortex AI’s serverless architecture means customers pay only for actual model usage without managing infrastructure or dealing with capacity planning.
- The integration supports both SQL and Python interfaces, enabling data teams to build AI applications using familiar tools and existing Snowflake data pipelines (a minimal Python sketch follows this list).
- Organizations can now combine Claude Opus 4.6 with Snowflake’s data clean rooms and governance features for compliant AI deployments in regulated industries.
- This addresses enterprise concerns about data residency and privacy while enabling advanced AI capabilities on proprietary datasets.
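For a flavor of the Python path, here’s a minimal sketch using snowflake-ml-python’s Cortex helper; the model identifier is assumed from the announcement and the connection parameters are placeholders:

```python
# Sketch: call the model through Cortex from Python; model ID is assumed.
from snowflake.snowpark import Session
from snowflake.cortex import Complete

connection = {"account": "...", "user": "...", "password": "...", "warehouse": "..."}
session = Session.builder.configs(connection).create()

answer = Complete(
    "claude-opus-4-6",  # model identifier assumed from the announcement
    "Summarize churn drivers from our Q4 support tickets table.",
    session=session,
)
print(answer)
```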
49:57 📢 Justin – “And just because we’re already 50 minutes into this, I will tell you we’re also getting Claude Opus 4.6 on multiple other providers, including Bedrock, Kiro, Vertex AI, and we’re getting it on Azure, in the Microsoft Foundry app, as well as some of the smaller cloud providers, like Databricks and DigitalOcean.”
50:45 Agent Bricks Supervisor Agent is Now GA: Orchestrate Enterprise Agents | Databricks Blog
- Databricks Agent Bricks Supervisor Agent is now Generally Available, providing a managed orchestration layer that coordinates multiple specialized agents through Unity Catalog governance.
- The supervisor uses dynamic routing to analyze user intent and delegate tasks between Genie Spaces for structured data queries, Knowledge Assistant agents for unstructured data, and MCP servers for tool execution.
- The platform implements On-Behalf-Of authentication where the supervisor acts as a transparent proxy, validating every data fetch and tool execution against the end user’s existing Unity Catalog permissions.
- This eliminates the common security gap where agents access data through broad service accounts that users themselves aren’t authorized to see.
- Agent Learning on Human Feedback is built directly into the Supervisor Agent, allowing teams to add questions and guidelines that improve routing decisions and response quality over time.
- Franklin Templeton reports reducing fund analysis tasks from days to seconds while maintaining compliance, and Zapier uses ALHF to refine orchestration between different Genie spaces without hard-coding routing logic.
- The system addresses enterprise agent sprawl, where teams toggle between dozens of specialized bots and duplicate work by creating agents that already exist.
- Supervisor Agent provides a single entry point that reasons about intent and coordinates specialized agents while maintaining full MLflow experiment tracking for measurable performance monitoring.
51:40 📢 Ryan – “It just goes to show you, depending on who your provider is, this is the type of platform you’re going to need, right? So if you already are using a whole bunch of AI execution on Snowflake, or if you’re only using it on OpenAI’s platform, you’re just going to need to sign on to the platform that’s already there.”
Cloud Tools
52:09 Introducing HashiCorp Agent Skills
- HashiCorp launches Agent Skills, an open-standard repository that packages domain expertise into portable instructions for AI assistants working with Terraform and Packer.
- These skills provide AI tools like Claude with specialized HashiCorp product knowledge, schema definitions, and best practices to reduce hallucinations and ensure code follows proper conventions.
- The initial skills pack addresses common DevOps challenges, including building and maintaining Terraform providers, generating style-compliant Terraform code, refactoring monolithic configurations into modules, and creating machine images with Packer across AWS, Azure, and Windows.
- HashiCorp partnered with Tessl to evaluate skill effectiveness using review and task-based evaluations against Anthropic’s best practices.
- Agent Skills differ from Model Context Protocol (MCP) as complementary technologies – MCP is the data pipe connecting information to AI, while Agent Skills are the knowledge textbooks. Installation takes seconds using npx, Tessl CLI, or Claude Code’s plugin marketplace with simple one-line commands.
- The skills solve a fundamental problem where AI assistants lack specific technical context for complex infrastructure tasks, particularly around HashiCorp’s plugin framework architectures and coding conventions.
- This prevents AI from suggesting outdated practices or generating code that doesn’t follow established patterns from official documentation.
- HashiCorp plans to expand beyond Terraform and Packer to cover additional products and welcomes community contributions through its GitHub repository.
- The open-standard format means these skills are portable and reusable across different AI assistants that support the Agent Skills specification.
53:17 📢 Justin – “I love this, because how many times I pointed Claude or others to the documentation, and said ‘I’m pretty sure you’re wrong, this is how it’s supposed to be done, here’s the doc.’ And it comes back and goes, you’re right, Justin, because you’re a genius. That’s what it always tells me.”
AWS
56:10 Amazon EC2 C8id, M8id, and R8id instances with up to 22.8 TB local NVMe storage are generally available
- In “instances so big we don’t know what to do with them,” may we present…
- AWS launches C8id, M8id, and R8id EC2 instances with up to 22.8TB of local NVMe storage, triple the capacity of sixth-generation instances.
- These new instances scale up to 96xlarge with 384 vCPUs and 3TiB of memory, delivering up to 43% higher compute performance and 3.3x more memory bandwidth than previous generation instances.
- The instances use custom Intel Xeon 6 processors exclusive to AWS, running at a 3.9 GHz sustained all-core turbo frequency. Performance improvements include up to 46% better I/O intensive database workload performance and 30% faster query results for real-time data analytics compared to sixth-generation instances.
- The Instance Bandwidth Configuration feature allows customers to dynamically shift up to 25% of bandwidth allocation between network and EBS, optimizing for specific workload requirements.
- The local NVMe storage is hardware-encrypted with XTS-AES-256 and ephemeral, meaning data is lost when instances stop or terminate.
- Currently available in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) Regions, with additional Regions planned.
- Instances can be purchased as On-Demand, Savings Plans, Spot Instances, Dedicated Instances, or Dedicated Hosts, with pricing varying by region and purchase model.
56:47 📢 Matt – “If it’s all core turbo, is it really turbo at that point?”
58:45 AWS IAM Identity Center now supports multi-Region replication for AWS account access and application use
- AWS IAM Identity Center now supports multi-Region replication, allowing organizations to replicate workforce identities, permission sets, and metadata from a primary Region to additional Regions for improved resiliency and disaster recovery.
- This means if the primary Region experiences a service disruption, users can still access AWS accounts through an active access portal endpoint in a secondary Region using their existing permissions.
- The feature requires using an organization instance of IAM Identity Center connected to an external IdP like Microsoft Entra ID or Okta, and you must first configure multi-Region customer-managed KMS keys before replicating to additional Regions (a quick sketch of that prerequisite follows this list).
- The primary Region remains the central management point for all configurations, while additional Regions provide read-only console access except for application management and user session revocation.
- Organizations can now deploy AWS managed applications closer to users and datasets to meet data residency requirements or improve performance, with applications accessing replicated workforce identities locally in each Region. This addresses compliance scenarios where datasets must remain in specific Regions while still providing centralized identity management.
- The feature is available at no additional cost in 17 enabled-by-default commercial AWS Regions, with only standard AWS KMS charges applying for customer-managed keys.
- All workforce actions are logged in CloudTrail in the Region where they occur, maintaining audit trails across multiple Regions for security and compliance monitoring.
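The KMS prerequisite is easy to miss; a minimal boto3 sketch of creating a multi-Region primary key and replicating it into a secondary Region might look like this (Region choices and description are illustrative):

```python
# Sketch of the multi-Region CMK prerequisite; Region choices are placeholders.
import boto3

kms = boto3.client("kms", region_name="us-east-1")

key = kms.create_key(
    Description="IAM Identity Center replication key",
    MultiRegion=True,  # creates a primary key that can be replicated elsewhere
)
key_id = key["KeyMetadata"]["KeyId"]

# Replicate the primary key into the Region you plan to replicate Identity Center to.
kms.replicate_key(KeyId=key_id, ReplicaRegion="us-west-2")
```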
59:32 📢 Justin – “I recently set up IAM Identity Center for the first time, and I was surprised that it was US East 1 only, so I’m pleased to see this is now available.”
1:00:25 Amazon ECS adds Network Load Balancer support for Linear and Canary deployments
- ECS now supports linear and canary deployment strategies natively with Network Load Balancers, bringing managed traffic shifting to TCP/UDP workloads that previously required custom solutions or third-party tools.
- This fills a deployment gap for applications needing NLB features like static IPs, long-lived connections, and low latency.
- The feature integrates with CloudWatch alarms for automatic rollback if deployment issues are detected, providing safety guardrails for production updates.
- Teams can shift traffic incrementally (linear) or start with a small percentage for validation (canary) before completing rollouts.
- Primary beneficiaries are latency-sensitive and connection-oriented workloads such as online gaming backends, financial transaction systems, and real-time messaging services that depend on NLB’s Layer 4 capabilities.
- These applications can now use the same deployment patterns ALB users have had access to for years.
- Available immediately in all AWS commercial and GovCloud US regions for both new and existing ECS services.
- Configuration is accessible through the AWS Console, CLI, and Infrastructure-as-Code tools with no additional cost beyond standard ECS and NLB pricing.
- This brings ECS deployment parity between ALB and NLB, eliminating a common pain point where teams had to choose between advanced deployment strategies and NLB’s technical requirements (a hedged config sketch follows). Documentation is available at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-type-linear.html
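Here’s a rough idea of what wiring a canary rollout with alarm-driven rollback could look like; the `strategy` and `bakeTimeInMinutes` fields are assumptions patterned on ECS’s blue/green deployment configuration, so check the docs linked above for the real shape:

```python
# Sketch only: the strategy enum and bake-time field are assumptions, not verified.
import boto3

ecs = boto3.client("ecs")

ecs.update_service(
    cluster="prod",
    service="game-backend",        # hypothetical NLB-fronted TCP service
    deploymentConfiguration={
        "strategy": "CANARY",      # assumed enum value for the new canary strategy
        "bakeTimeInMinutes": 10,   # observe canary traffic before the full shift
        "alarms": {                # CloudWatch alarms trigger automatic rollback
            "alarmNames": ["nlb-target-5xx-spike"],
            "enable": True,
            "rollback": True,
        },
    },
)
```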
1:01:19 📢 Ryan – “This is one of those rough edges that you hit unexpectedly. You want to use a network load balancer, typically because you have to. It’s easier to set up an application load balancer. You’re only using a network load balancer when it’s not your choice, but then you can’t deploy this app safely without lots of interruption or risk, and it’s kind of a problem.”
1:02:12 Structured outputs now available in Amazon Bedrock
- Amazon Bedrock now enforces JSON schema compliance at the model level, eliminating the need for custom validation logic and retry mechanisms when extracting structured data from foundation models.
- This addresses a common production pain point where formatting errors in LLM responses break downstream API integrations and automated workflows.
- The feature works in two modes: custom JSON schema definitions for response formatting, or strict tool definitions that ensure model tool calls match exact specifications (the tool-schema mode is sketched after this list).
- This reduces operational overhead by preventing malformed outputs before they reach application code, making AI integrations more reliable for production use cases like data extraction, form processing, and API orchestration.
- Available now for Anthropic Claude 4.5 models and select open-weight models across all commercial AWS Regions where Bedrock operates.
- The capability works with Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream APIs, providing flexibility for both synchronous and streaming applications.
- The practical benefit is fewer failed requests and reduced engineering time spent on output parsing and error handling.
- Organizations building production AI applications that feed into existing systems or databases can now rely on consistent, machine-readable responses without building extensive validation layers.
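As a minimal sketch of the tool-schema mode via the Converse API — the toolConfig shape is standard Converse, while the model ID and schema are illustrative, and the announcement doesn’t name the exact flag that switches on strict enforcement:

```python
# Sketch: force a tool call whose input must match a JSON schema.
import boto3

bedrock = boto3.client("bedrock-runtime")

extract_tool = {
    "toolSpec": {
        "name": "extract_invoice",
        "description": "Extract invoice fields as structured JSON.",
        "inputSchema": {"json": {
            "type": "object",
            "properties": {
                "vendor": {"type": "string"},
                "total": {"type": "number"},
                "due_date": {"type": "string"},
            },
            "required": ["vendor", "total"],
        }},
    }
}

response = bedrock.converse(
    modelId="anthropic.claude-sonnet-4-5-20250929-v1:0",  # illustrative model ID
    messages=[{"role": "user", "content": [{"text": "Invoice: ACME, $1,200, due 3/1."}]}],
    toolConfig={"tools": [extract_tool],
                "toolChoice": {"tool": {"name": "extract_invoice"}}},
)

# With schema enforcement, the toolUse input is guaranteed to match the schema.
for block in response["output"]["message"]["content"]:
    if "toolUse" in block:
        print(block["toolUse"]["input"])
```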
FYI Claude Opus 4.6 now available in Amazon Bedrock
- Claude Opus 4.6 is now available in Amazon Bedrock, positioning itself as Anthropic’s most capable model with particular strength in coding, agentic workflows, and enterprise applications.
- The model supports both 200K and 1M context windows in preview, enabling analysis of large codebases and extensive document sets without chunking.
- The model’s agentic capabilities allow it to manage complex multi-step tasks across dozens of tools with reduced oversight, including the ability to autonomously spin up subagents for task decomposition.
- This makes it suitable for enterprise workflows like financial analysis that would typically require days of manual work, cybersecurity threat detection, and cross-application data movement.
- For developers, Opus 4.6 handles full software lifecycle management from requirements gathering through implementation and maintenance, particularly for long-horizon projects and large-scale codebases.
- The model’s deep reasoning capabilities make it applicable to professional work requiring sophisticated multi-step orchestration.
- Regional availability varies by deployment, with specific regions listed in the AWS Bedrock documentation. Pricing follows Bedrock’s standard model-based pricing structure, though specific costs for Opus 4.6 are not detailed in the announcement and should be verified in the Bedrock console.
FYI Opus 4.6 is now available in Kiro
- Kiro has released Claude Opus 4.6 integration in its IDE and CLI, bringing in Anthropic’s newest state-of-the-art model, which Anthropic claims is the world’s best for coding.
- The model is available to Kiro Pro, Pro+, and Power customers in AWS US-East-1 region with a 2.2x credit multiplier, same as Opus 4.5.
- Opus 4.6 targets production code and sophisticated agents, with particular strength in large-scale codebases and long-horizon projects.
- Anthropic positions it as capable of helping senior engineers complete multi-day projects in hours through task delegation with reduced oversight requirements.
- The model integrates with Kiro’s spec-driven development workflows, enabling detailed but precise specifications on large existing projects and surgical precision updates with minimal user input.
- This represents a shift toward AI-assisted development at enterprise scale rather than simple code completion.
- Access requires authentication through Google, GitHub, AWS BuilderID, or AWS IAM Identity Center, with experimental support currently limited to the Northern Virginia region. Users can access the model immediately by downloading or restarting the Kiro app or CLI.
1:03:55 Amazon Redshift now supports allocating extra compute for automatic optimizations
- Amazon Redshift now allows database administrators to allocate dedicated compute resources specifically for automatic optimization tasks like table optimization, sorting, vacuuming, and analysis.
- This prevents maintenance operations from competing with user queries during peak usage periods, addressing a common pain point where DBAs had to manually schedule these tasks during off-hours.
- The feature includes cost controls for provisioned clusters, letting administrators cap the amount of extra compute resources that autonomics can consume. This prevents runaway costs while still enabling continuous optimization, and works alongside the new SYS_AUTOMATIC_OPTIMIZATION system table that provides visibility into what optimization operations are running and their resource consumption (a quick query sketch follows this list).
- This enhancement is available across all AWS Regions where Redshift operates, supporting both provisioned clusters and serverless workgroups.
- The feature essentially decouples database maintenance from query performance, which is particularly valuable for organizations running 24/7 analytics workloads that previously had no maintenance windows.
- The practical benefit is that Redshift databases can now stay optimized continuously without manual intervention or performance degradation during business hours.
- Organizations with high-concurrency analytics workloads or those operating across multiple time zones will see the most immediate value from this capability.
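Peeking at what autonomics is doing is a one-liner against the new system view; here’s a hedged sketch using the redshift_connector driver (the connection details are placeholders):

```python
# Sketch: inspect recent automatic optimization activity via the new system table.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    database="dev",
    user="admin",
    password="...",  # prefer IAM auth in practice
)
cur = conn.cursor()
cur.execute("SELECT * FROM sys_automatic_optimization LIMIT 20;")
for row in cur.fetchall():
    print(row)
```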
1:04:35 📢 Justin – “This is why I wanted a managed service from you, Amazon, so I didn’t have to think about this. This is you failing me.”
GCP
1:05:26 Introducing the Developer Knowledge API and MCP Server
- Google launches the Developer Knowledge API and Model Context Protocol server to provide AI assistants with programmatic access to official Google developer documentation as machine-readable Markdown.
- This addresses the problem of LLMs relying on outdated training data or web scraping when helping developers build with Google technologies like Firebase, Android, and Google Cloud.
- The MCP server implements the open Model Context Protocol standard, allowing popular AI assistants and IDEs to directly query Google’s documentation for real-time answers about API changes, code examples, and best practices. Developers can enable it through gcloud CLI and configure it in their AI assistant settings, with support for tools like Claude Desktop and various IDE extensions.
- The service is currently in public preview with free access through standard Google Cloud API quotas.
- Future plans include adding structured content support for code samples and API reference entities, expanding the documentation corpus, and reducing re-indexing latency before general availability.
- This integration benefits developers using AI coding assistants by ensuring responses reference current Google documentation rather than potentially stale information from model training cutoffs. The approach provides a canonical source of truth that updates as Google’s documentation changes.
- The Developer Knowledge API requires a Google Cloud project with the API enabled through gcloud beta services, and detailed configuration instructions are available in the official documentation at developers.google.com/knowledge/api and developers.google.com/knowledge/mcp.
1:04:35 📢 Ryan – “This won’t fix the fact that Google documentation is awful, but this will make it at least better.”
1:12:17 Delivering a secure, open, and sovereign digital world
- Google Cloud expands its Sovereign Cloud portfolio with three tiers – Data Boundary, Dedicated, and Air-Gapped – designed to meet varying data sovereignty requirements.
- Air-Gapped operates completely disconnected from Google Cloud and the internet, with no remote access possible by Google, while Dedicated allows partners to monitor and block updates with up to 12 months of independent operation if disconnected.
- The company announces substantial infrastructure investments across all continents, including new cloud regions in Thailand, Malaysia, and Sweden, plus subsea cables like TalayLink and Dhivaru for Asia-Pacific connectivity.
- Google commits to legal resistance against government shutdown orders and will enable qualified third parties to operate Google Cloud using Google’s code if Google becomes unable to continue operations.
- External Key Management lets customers store encryption keys outside Google Cloud with detailed access justifications required, while client-side encryption for Workspace ensures Google cannot read customer collaboration data.
- Google eliminated data transfer fees for customers migrating off the platform and expanded local ML processing for select Gemini models to 11 countries, including Australia, Brazil, Canada, France, Germany, India, Japan, Singapore, South Korea, and the UK.
- Notable sovereign cloud deployments include NATO Communication and Information Agency, German Armed Forces, UK Ministry of Defence, and Singapore government agencies using Air-Gapped, while France’s S3NS offers Premi3NS built on Dedicated with SecNumCloud 3.2 qualification from ANSSI.
- The portfolio targets highly regulated sectors like defense, government, banking, and healthcare, requiring strict data residency and operational independence guarantees.
FYI Expanding Vertex AI with Claude Opus 4.6
- Google Cloud adds Anthropic’s Claude Opus 4.6 to Vertex AI, positioning it as its most powerful model for enterprise workflows, including document generation, financial analysis, and complex coding tasks.
- The model excels at multi-step agentic workflows and can handle tasks like creating production-ready spreadsheets and presentations with fewer revision cycles, particularly valuable for finance and legal verticals requiring precision.
- Vertex AI provides a complete agentic stack beyond just model access, including Agent Development Kit for rapid prototyping, Agent Engine for serverless deployment, and Memory Bank for persistent context across interactions.
- Cost optimization features include provisioned throughput for fixed pricing, prompt caching with flexible TTL, batch predictions, and a 1M token context window in preview for Claude Opus 4.6.
- The platform integrates with Google Cloud’s security infrastructure, including Model Armor for protection against prompt injection and tool poisoning, plus Security Command Center for AI threat detection.
- Customer implementations show practical results, with Palo Alto Networks reporting a 20-30% increase in code development velocity and companies like Shopify, TELUS, and Replit using Claude on Vertex AI for production workloads.
- Claude Opus 4.6 is generally available on Vertex AI with deployment options through Google Cloud Marketplace for streamlined procurement. Regional availability and specific pricing details are documented at cloud.google.com/vertex-ai/generative-ai/pricing#claude-models, with the model accessible through the Vertex AI console and sample notebooks available on GitHub (a minimal client call is sketched below).
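As a minimal sketch with Anthropic’s Vertex client — the region and model string are assumptions based on where Claude models typically land on Vertex, so verify against the pricing page above:

```python
# Sketch: reach the model on Vertex AI; region and model ID are assumptions.
from anthropic import AnthropicVertex

client = AnthropicVertex(project_id="my-gcp-project", region="us-east5")

msg = client.messages.create(
    model="claude-opus-4-6",  # assumed Vertex model identifier
    max_tokens=1024,
    messages=[{"role": "user", "content": "Draft a one-page summary of Q4 results."}],
)
print(msg.content[0].text)
```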
1:13:15 GEAR program now available
- Google launches GEAR (Gemini Enterprise Agent Ready) as a specialized learning program within the Google Developer Program to help developers build production-ready AI agents.
- The program provides 35 monthly learning credits on the Google Skills platform for sandbox testing and lab access at no cost to participants.
- The program offers two main learning paths: Introduction to Agents for understanding agent architecture and integration with Gemini Enterprise, and Develop Agents with Agent Development Kit (ADK) for building agents with reasoning loops.
- Both paths focus on moving developers from experimentation to production-grade implementations using Google’s open-source ADK.
- GEAR includes a credential system with completion badges on Google Developer profiles and skill badges for intermediate and advanced expertise.
- For Google Cloud customers, a separate Get Certified cohort-based program offers instructor-led training and technical mentorship to prepare for industry-recognized certifications.
- The program addresses the shift toward agentic AI, where software can reason, plan, and execute complex workflows autonomously. Access requires creating or signing into a Google Developer Program profile and claiming the GEAR badge at developers.google.com/program/gear.
1:14:27 📢 Ryan – “I still think there’s a very large amount of people who don’t really understand sort of putting an agentic workflow in place to do what they want, right? It’s still pretty much fire-and-forget chat operations. And so there’s a lot of power in the tool once you know how to use it, but it is sort of less than straightforward, so I think this is a great course.”
Azure
1:15:26 Updates in two of our core priorities
- Microsoft announces major security leadership change with Hayete Gallot returning as EVP of Security, reporting directly to CEO Satya Nadella, while Charlie Bell transitions from leading security to focus on engineering quality as an individual contributor.
- This organizational shift reflects Microsoft’s continued emphasis on security as a top priority following recent Security Copilot and Purview adoption momentum.
- Gallot brings 15-plus years of Microsoft experience building Windows and Office franchises, plus recent Google Cloud customer experience leadership, positioning her to connect product development with customer value realization across Microsoft’s security portfolio.
- Her appointment comes as Microsoft integrates security into its new commercial cohorts operating model announced during recent earnings.
- Charlie Bell’s move from organizational leadership to an individual contributor engineering role is notable for a senior executive, with his new focus on Quality Excellence Initiative to improve engineering standards and product durability across Microsoft’s global scale operations. He will partner with Azure leadership, including Scott Guthrie, on quality improvements.
- Ales Holecek takes on the Chief Architect for Security role to bring platform architecture expertise to security products and connect them with Microsoft’s existing scale businesses and the Agent Platform. This architectural focus suggests deeper integration between security services and Microsoft’s broader cloud infrastructure.
- The timing aligns with Microsoft’s recent earnings report, highlighting security business growth and the company’s broader reorganization around commercial cohorts, indicating security will have dedicated product development rhythms separate from other business units. No specific pricing or feature changes were announced as part of this leadership transition.
1:17:19 📢 Justin – “I think this is them recreating the Amazon engineering operations review at Azure. I think he is basically building a weekly program team that is going to be running the wheel, if you’re familiar with Amazon’s wheel thing, where basically you – as a service owner – can be called on at any time and you have to deep dive into all your KPIs, how your system’s operating, service operations, recent incidents, and you have to answer that at Amazon. They do it every week.”
1:19:23 Enhanced storage resiliency with Azure NetApp Files – Elastic zone-redundant service
- Azure NetApp Files Elastic ZRS introduces synchronous replication across three or more availability zones within a region with automatic service-managed failover, maintaining the same mount target and endpoint during zone failures.
- This eliminates the need for customers to manage HA clusters or VM-level failover while guaranteeing zero data loss for mission-critical workloads.
- The service costs less than running three separate ANF volumes with cross-zone replication while providing the same multi-AZ high availability in a single volume. Volumes can be created as small as 1 GiB, offering flexibility for workloads of any size with support for both NFS and SMB protocols independently.
- ANF Elastic ZRS delivers enterprise data management capabilities, including instant snapshots, clones, tiering, and backup integration powered by NetApp ONTAP, plus efficient metadata operations through a shared QoS architecture that dynamically allocates IOPS.
- The service is particularly suited for healthcare, financial services, and other regulated industries requiring continuous uptime and compliance.
- The service is currently available in select Azure regions with rapid expansion planned, and future capabilities will include simultaneous multi-protocol access (NFS, SMB, and Object REST API), custom region pairs for cross-region replication, and a migration assistant for moving data from on-premises ONTAP systems.
- This represents a clear migration path for existing NetApp on-premises customers looking to modernize without re-architecting applications.
1:21:21 PostgreSQL on Azure supercharged for AI
- Microsoft has enhanced Azure Database for PostgreSQL with native AI capabilities, including direct integration with Microsoft Foundry for in-database LLM operations like embeddings and semantic search.
- The service now supports DiskANN vector indexing for high-performance similarity search (a rough setup sketch follows this list) and includes a new PostgreSQL extension for Visual Studio Code that enables database provisioning directly from the IDE with built-in Entra ID authentication.
- The platform introduces zero-ETL real-time analytics through Microsoft Fabric mirroring and native Parquet file support via the Azure Storage Extension, allowing direct read/write operations to Azure Storage using SQL commands. PostgreSQL 18 is now generally available on Azure with new V6 compute SKUs that deliver improved I/O performance and lower latency, while Elastic Clusters enable horizontal scaling for multi-tenant workloads.
- Azure HorizonDB was announced at Ignite as a new PostgreSQL-compatible service in private preview, designed specifically for AI-native workloads with scale-out compute and sub-millisecond latency.
- This positions Azure to support both traditional PostgreSQL workloads and next-generation AI applications requiring ultra-low latency and horizontal scale.
- The GitHub Copilot integration provides schema-aware SQL assistance within Visual Studio Code, while the new Model Context Protocol server for PostgreSQL enables direct agent framework connections in Microsoft Foundry.
- Nasdaq demonstrated a production use case with their Boardvantage platform, using Azure Database for PostgreSQL and Microsoft Foundry to add AI-powered document analysis and summarization to their board governance system serving nearly half of the Fortune 500.
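Here’s a hedged sketch of what the in-database setup could look like, assuming the azure_ai, pgvector, and pg_diskann extensions the post describes; the connection string and vector dimension are placeholders:

```python
# Sketch: enable the AI extensions and build a DiskANN index; names assumed.
import psycopg

dsn = "postgresql://user:pass@myserver.postgres.database.azure.com/mydb"  # placeholder

with psycopg.connect(dsn) as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.execute("CREATE EXTENSION IF NOT EXISTS azure_ai;")
    conn.execute("CREATE EXTENSION IF NOT EXISTS pg_diskann;")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            body text,
            embedding vector(1536)   -- dimension depends on the embedding model
        );
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS docs_embedding_idx
        ON docs USING diskann (embedding vector_cosine_ops);
    """)
```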
1:22:49 📢 Matt – “Nothing I like better than an LLM inside my database!”
FYI Claude Opus 4.6: Anthropic’s powerful model for coding, agents, and enterprise workflows is now available in Microsoft Foundry
- Claude Opus 4.6 is now available in Microsoft Foundry on Azure, bringing Anthropic’s most advanced reasoning model to enterprise customers with a 1M token context window in beta and 128K max output tokens.
- The model targets complex coding tasks, agentic workflows, and knowledge work across finance, legal, and cybersecurity domains, with new API features including adaptive thinking that dynamically adjusts reasoning depth and context compaction for long-running conversations.
- The integration connects Claude Opus 4.6 with Foundry IQ, enabling access to data across Microsoft 365, Fabric, and web sources within Azure’s governance and compliance framework.
- Customers like Adobe, Dentons, and Macroscope are using the model for code review, legal drafting, and document generation, with deployment available through both Microsoft Foundry and Copilot Studio for no-code agent building.
- Technical improvements include enhanced computer use capabilities for navigating interfaces and automating multi-application workflows, plus a new max effort control level that joins existing high, medium, and low settings for finer token allocation. The model handles large codebases effectively for refactoring and bug detection, with companies like Momentic AI processing millions of tokens per hour using the Azure infrastructure.
- Pricing follows a premium model beyond 200K tokens for the 1M context window beta, though specific per-token costs were not disclosed in the announcement.
- The focus is on production-grade deployments where Azure’s managed infrastructure and operational controls help compress development timelines from days to hours while maintaining enterprise security requirements.
1:23:45 Microsoft OneLake and Snowflake interoperability (Generally Available) | Microsoft Fabric Blog
- Microsoft OneLake and Snowflake now offer bidirectional Iceberg table interoperability in general availability, allowing customers to store and access data across both platforms without duplication.
- Changes made in one platform automatically reflect in the other, eliminating the need for traditional copy-heavy data integration approaches.
- Snowflake-managed Iceberg tables can now be natively stored in Microsoft OneLake, while Fabric data automatically converts to Iceberg format for direct Snowflake access.
- This addresses the challenge of enterprise data living across fragmented systems by providing a single copy of data accessible through either platform’s analytical engines.
- New UI elements launching next week include a Snowflake item in OneLake for simplified access without complex configurations, plus a Snowflake UI flow that pushes managed Iceberg tables directly into Fabric as discoverable OneLake items. The integration also supports OneLake table APIs working with Snowflake’s catalog-linked database feature.
- The target use case centers on data teams managing analytics and AI workloads across multiple platforms who want to avoid vendor lock-in and proprietary formats. Organizations can now choose the optimal storage location and analytical engine for each project while maintaining a unified data estate without operational overhead from data duplication.
- No specific pricing details were provided in the announcement, though the integration leverages existing OneLake and Snowflake licensing models. Customers can access quickstart guides and documentation through Microsoft Learn and Snowflake’s resources, with hands-on training available at FabCon and SQLCon 2026 in Atlanta from March 16-20.
1:24:44 📢 Ryan – “This is kind of neat. I mean, it’s unexpected because it is data, and the amount of data and what you’d have in a data lake is usually one of those elements that makes using a service very sticky, so providing sort of an easy way to get out of that is a surprise to me, but it’s also – from a customer perspective – if you’ve got data across both, like how fantastic is that? To be able to use it. I like it.”
1:25:52 Generally Available: Azure Container Storage v2.1.0 now with Elastic SAN integration and on-demand installation
- Azure Container Storage v2.1.0 brings native Elastic SAN integration, allowing Kubernetes workloads to leverage Azure’s shared block storage service for high-performance persistent volumes.
- This integration provides an alternative to existing Azure Disk and ephemeral disk options, particularly beneficial for workloads requiring shared storage across multiple pods.
- The release introduces an on-demand installation model that reduces the deployment footprint and operational overhead compared to previous versions. Instead of pre-installing all storage components, the system now deploys only the necessary drivers and resources when specific storage types are requested, streamlining cluster management.
- Elastic SAN support targets enterprise customers running stateful containerized applications that need consistent low-latency performance and the ability to scale storage independently from compute.
- Common use cases include database workloads, analytics platforms, and applications requiring shared persistent volumes across multiple container instances.
- The lightweight installation approach addresses a common pain point where organizations previously had to deploy full storage stacks even when using only a subset of available storage options.
- This change reduces resource consumption on AKS clusters and simplifies troubleshooting by limiting the number of active storage components.
1:26:26 📢 Justin – “The amount of SAN investment they’ve done in the last year is crazy to me.”
1:27:20 Five Reasons to attend SQLCon | Microsoft Fabric Blog
- SQLCon is a new SQL-focused conference co-located with FabCon in Atlanta, March 16-20, offering dual access with a single registration.
- The event features 50 SQL sessions covering SQL Server, Azure SQL, and SQL database in Fabric, with hands-on workshops Monday-Tuesday and conference sessions Wednesday-Friday.
- Microsoft is sending over 30 SQL product team members to deliver engineering insights, roadmap announcements, and live demos of upcoming capabilities, including SSMS and VS Code extensions, Copilot integrations, and Fabric SQL experiences. This provides direct access to product teams for technical questions and future planning.
- The combined conference format allows attendees to mix deep SQL technical sessions with broader Fabric, Power BI, data engineering, and AI content throughout the week.
- This structure benefits both specialists needing deep technical content and cross-functional teams building shared understanding across data platforms.
- Registration includes access to both conferences, hands-on workshops, Ask-the-Experts sessions with MVPs and engineers, and an attendee party at the Georgia Aquarium.
- Early-bird pricing and team discounts are available, with promo code SQLCMTY200 offering $200 off registration.
- The event targets DBAs, developers, data engineers, architects, and data team leaders working with SQL Server, Azure SQL, or SQL database in Fabric who need practical migration, modernization, performance tuning, and AI integration guidance.
Closing
And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod
