343: AWS CloudWatch Finally Hits Snooze

February 25, 2026 01:11:34

Welcome to episode 343 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Matt are in the studio this week bringing you all the latest in Cloud and AI news, including some of the smaller clouds like Cloudflare and Crusoe Cloud, as well as announcements from the big guys like Google’s Gemini DeepThink, Anthropic’s big payday, and Microsoft’s Notepad problem. We’ve got all this plus Matt screwing up his outro AGAIN, so let’s get started!

Titles we almost went with this week

  • 🏋️ Chrome’s WebMCP Protocol: Teaching AI Agents to Stop Doom-Scrolling the DOM and Actually Get Work Done
  • 🤳Claude Enterprise Self-Service: Because Sometimes You Just Want to Buy AI Without Small Talk
  • 💔 AWS EC2 Goes Inception Mode: Now You Can Virtualize Your Virtualization Without Going Broke
  • 📠 Amazon EC2 Nested Virtualization: Because Your Virtual Machine Was Lonely and Needed Its Own Virtual Machine
  • 👏 CloudWatch Alarm Mute Rules: Because Your Deployment Doesn’t Need a Standing Ovation at 3 AM
  • 🤑 Anthropic’s $380 Billion Valuation Proves AI Funding Has Gone Claude Nine
  • AWS EC2 Nested Virtualization Finally Escapes the Expensive Hardware Jail
  • 🧑‍🏫 Cloudflare Teaches AI Agents the Magic Words: Accept text/markdown and Save 13,000 Tokens
  • 👷 Crusoe Cloud’s MCP Server: Teaching AI Assistants to Stop Asking for the Manager and Just Fix Your Infrastructure
  • ⌨️ Azure’s New Agentic Copilot: Because Manually Clicking Through Dashboards Was So 2023
  • 🗺️ Chrome’s WebMCP Gives AI Agents a GPS for Websites Because Apparently They’ve Been Lost in the HTML This Whole Time 
  • 🧑‍💼 Anthropic Cuts Out the Middleman: Claude Enterprise Now Available Without the Enterprise Sales Dance
  • ⏰ AWS Gives CloudWatch the Silent Treatment: New Mute Rules Let Alarms Sleep Through Maintenance Windows
  • 📲 AWS CloudWatch Hits Snooze: Mute Rules End On-Call Nightmares
  • 🔇 AWS Gives CloudWatch the Silent Treatment

General News 

00:45 Bloat Risk? Microsoft’s Notepad Upgrade Also Introduced a Vulnerability | PCMag

  • Microsoft’s recent Notepad modernization introduced CVE-2026-20841, a vulnerability in the new Markdown support feature that allows malicious links in files to execute remote code. 
  • The flaw has been patched in the February 2026 security updates, but it highlights the security trade-offs when adding features to historically simple applications.
  • The vulnerability exploits Notepad’s Markdown rendering capability, which Microsoft added in May to support lightweight markup language formatting. When Notepad opens a specially crafted Markdown file, embedded malicious links can trigger unverified protocols that load and execute remote files on the system.
  • This incident raises questions about feature bloat in core Windows utilities, particularly as Microsoft continues adding network-dependent capabilities like AI-powered text writing to Notepad. Security researchers are debating whether basic text editors should have network functionality at all, given the expanded attack surface.
  • The vulnerability demonstrates how modernization efforts can introduce security risks in previously low-risk applications. 
  • Organizations using Windows need to ensure their systems receive the February 2026 security updates to address this specific flaw in Notepad’s Markdown implementation.

02:04 📢 Matt – “I’m just confused why they didn’t use Copilot on their pull request in order to identify this as a potential bug. I feel like it should have found it. Just sayin’…”  

03:13 WebMCP is available for early preview

  • Chrome is introducing WebMCP, a standardized protocol that lets websites expose structured tools and actions directly to AI agents, eliminating the need for agents to parse raw HTML and DOM elements. 
  • This addresses a key reliability problem in agentic workflows where AI agents currently struggle with inconsistent web interactions.
  • The protocol offers two interaction modes: a declarative API for simple HTML form-based actions and an imperative API for complex JavaScript-driven workflows (a rough sketch of the imperative flavor follows this list). This dual approach lets websites define exactly how agents should interact with features like booking systems, support ticket forms, and checkout processes.
  • Early use cases focus on high-value transactional workflows, including e-commerce product configuration, travel booking with complex filtering requirements, and automated customer support ticket creation with technical details. These scenarios benefit most from structured interactions versus unreliable DOM manipulation.
  • The early preview program requires sign-up for access to documentation and demos, indicating this is still in experimental stages. 
  • Developers interested in making their sites agent-ready will need to implement these new APIs to participate in the agentic web ecosystem Chrome is building.
  • This represents Chrome’s attempt to standardize how AI agents interact with websites before the market fragments with competing approaches. Sites that adopt WebMCP early may gain advantages as browser-based AI agents become more prevalent.
  • Interested in signing up for the preview? You can do that here.
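
For a rough feel of the imperative mode, here’s a minimal TypeScript sketch. The preview docs sit behind the sign-up, so everything here is an assumption about the surface: navigator.modelContext, registerTool, and the /api/tickets endpoint are placeholder names, not confirmed API.

```typescript
// Hypothetical sketch of WebMCP's imperative API — the preview's real surface
// may differ. navigator.modelContext, registerTool, and /api/tickets are all
// placeholder names, not confirmed API.
(navigator as any).modelContext?.registerTool({
  name: "create_support_ticket",
  description: "File a support ticket with a summary and technical details",
  inputSchema: {
    type: "object",
    properties: {
      summary: { type: "string" },
      details: { type: "string" },
    },
    required: ["summary"],
  },
  // The agent calls this directly instead of hunting through the DOM
  // for the right form fields.
  async execute({ summary, details }: { summary: string; details?: string }) {
    const res = await fetch("/api/tickets", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ summary, details }),
    });
    return { ticketId: (await res.json()).id };
  },
});
```

Per the announcement, the declarative mode covers the simple cases without custom JavaScript at all, by describing existing HTML forms to the agent.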

04:41 📢 Ryan – “It makes a lot of sense why they want to standardize on a specific protocol, but I can’t help but feel like this is the beginning of the end of human interaction; where you’re going to have an AI agent-to-agent protocol.” 

AI Is Going Great – Or How ML Makes Money 

07:27 Anthropic raises $30 billion in Series G funding at $380 billion post-money valuation \ Anthropic

  • Anthropic closed a $30 billion Series G at a $380 billion post-money valuation, reaching $14 billion in run-rate revenue with 10x annual growth for three consecutive years. 
  • The company now serves eight of the Fortune 10, with over 500 customers spending more than $1 million annually.
  • Claude Code, made generally available in May 2025, has grown to $2.5 billion in run-rate revenue and now accounts for 4% of all public GitHub commits worldwide. Business subscriptions quadrupled since early 2026, with enterprise customers representing over half of Claude Code’s revenue.
  • Opus 4.6 launched last week as the latest model release, leading the GDPval-AA benchmark for economically valuable knowledge work in finance and legal domains. The model powers agents capable of generating professional documents, spreadsheets, and presentations autonomously.
  • Anthropic expanded its product portfolio in January with over thirty launches, including Cowork, which extends Claude Code capabilities to broader knowledge work with eleven open-source plugins for specialized roles. 
  • Claude for Enterprise is now HIPAA-compliant and available for healthcare and life sciences organizations.
  • Claude remains the only frontier AI model available across all three major cloud platforms through AWS Bedrock, Google Cloud Vertex AI, and Microsoft Azure Foundry.
  • The company trains on diversified hardware, including AWS Trainium, Google TPUs, and NVIDIA GPUs, to optimize workload performance and resilience.

08:10 📢 Matt – “Those numbers are insane. I just want to make sure we’re all clear about that.”

15:16 Introducing Sonnet 4.6 \ Anthropic

  • Claude Sonnet 4.6 is now generally available across all Claude plans, API, and major cloud platforms at the same pricing as Sonnet 4.5 ($3/$15 per million tokens), with a 1M token context window in beta. 
  • The model now serves as the default for Free and Pro plan users, bringing Opus-class performance to a mid-tier price point.
  • Computer use capabilities have improved substantially, with Sonnet 4.6 scoring 94% on insurance benchmarks and showing human-level performance on tasks like navigating complex spreadsheets and multi-step web forms. 
  • The model demonstrates better resistance to prompt injection attacks compared to Sonnet 4.5 and performs similarly to Opus 4.6 on safety evaluations.
  • Coding performance has advanced significantly, with early users preferring Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time and even choosing it over Opus 4.5 59% of the time. 
  • Users report better instruction following, less overengineering, fewer hallucinations, and more consistent follow-through on multi-step tasks, with one customer reporting an 80.2% score on SWE-bench Verified.
  • Several features have reached general availability on the API, including code execution, memory, programmatic tool calling, tool search, and tool use examples. 
  • Web search and fetch tools now automatically write and execute code to filter search results, improving response quality and token efficiency.
  • The model supports both adaptive thinking and extended thinking modes, with context compaction in beta that automatically summarizes older context as conversations approach limits (see the API sketch after this list).
  • Claude in Excel now supports MCP connectors, allowing users to pull data from external sources like S&P Global, LSEG, and PitchBook directly within spreadsheets. 
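
If you want to kick the tires from the API side, here’s a minimal sketch using the Anthropic TypeScript SDK. The model id and token budgets are assumptions, not confirmed identifiers from the announcement.

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const msg = await client.messages.create({
  model: "claude-sonnet-4-6", // assumed model id — check the docs for the real one
  max_tokens: 16000,
  // Extended thinking: give the model a reasoning budget before it answers.
  // (budget_tokens must stay below max_tokens.)
  thinking: { type: "enabled", budget_tokens: 8000 },
  messages: [
    { role: "user", content: "Refactor this function to remove duplication: ..." },
  ],
});

console.log(msg.content); // thinking blocks plus the final answer
```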

17:42 📢 Ryan – “I haven’t played with Sonnet because it was just released, but playing around with Opus, you can see that each of these steps is another major improvement, and it is pretty fantastic to use.”

19:44 Token Anxiety – by Nikunj Kothari – Balancing Act

  • This article describes a cultural shift in San Francisco’s tech scene where developers are prioritizing AI agent management over social activities, with people leaving parties early to check on overnight code generation and spending weekends running 12-hour build sessions with AI assistants like Claude and Codex.
  • The piece highlights how AI coding tools have created a new productivity anxiety where developers feel compelled to keep agents running continuously, even during sleep, to maximize output and stay competitive as new model capabilities and context windows are released weekly.
  • Developers are adopting new vocabulary around AI models, discussing them like sommeliers evaluate wine and using animal training metaphors like keeping Claude on a tight leash for code review while giving it more slack for creative work.
  • The constant stream of benchmark improvements and new AI capabilities is creating pressure to continuously optimize workflows, as each advancement makes previous methods feel outdated and multiplies the sense that competitors are already leveraging these improvements.
  • This represents a broader shift in developer culture where traditional leisure activities are being replaced by AI-assisted building, with the primary social metric changing from what you accomplished to how many agents you have running in parallel.

24:25 📢 Ryan – “I still don’t know how everyone has these overnight workloads; I guess I don’t trust AI at all; I’m not going to let it run unsupervised.”

31:48 Alibaba Launches New LLM as China’s AI Battle Heats Up 

  • Qwen 3.5 is out. No industry freakout (like with DeepSeek) so far.

33:06 Seed News – ByteDance Seed Team

  • ByteDance officially launched Seedance 2.0, a next-generation video creation model with a unified multimodal audio-video architecture supporting text, image, audio, and video inputs. 
  • The model can process up to 9 images, 3 video clips, 3 audio clips, and natural language instructions simultaneously for comprehensive content referencing and editing.
  • The model delivers substantial improvements in complex motion rendering and physical accuracy, particularly excelling at multi-subject interactions like competitive figure skating with synchronized movements, mid-air spins, and precise landings that follow real-world physics. 
  • Industry evaluations show Seedance 2.0 achieves leading performance in motion stability, instruction following, and visual aesthetics compared to competing models.
  • Seedance 2.0 introduces dual-channel stereo audio generation with multi-track parallel output for background music, ambient effects, and voiceovers synchronized to visual rhythm. 
  • The model supports 15-second high-quality multi-shot audio-video output suitable for commercial advertising, film VFX, game animations, and explainer videos.
  • New video editing capabilities allow targeted modifications to specific clips, characters, actions, and storylines, plus video extension functionality for generating continuous shots based on user prompts. 
  • The model demonstrates improved instruction-following for complex scripts and open-ended prompts while maintaining subject consistency across extended sequences.
  • The unified multimodal architecture enables professional-grade content creation workflows where users can reference composition, motion, camera movement, visual effects, and audio elements from input assets, significantly lowering barriers to industrial-level video production without requiring specialized technical expertise.
  • https://www.instagram.com/reel/DUm4zSvEn76/ – John Wick cat video as mentioned. 

34:53 📢 Justin – “I’m surprised Hollywood stock didn’t crash today over this; very very impressive. Crazily so.”

AWS 

36:47 Announcing new Amazon EC2 general purpose M8azn instances

  • AWS launches M8azn instances powered by fifth-generation AMD EPYC Turin processors running at 5GHz, the highest CPU frequency available in the cloud. These general-purpose instances deliver 2x compute performance over M5zn and 24% better performance than M8a instances, with 4.3x higher memory bandwidth and 10x larger L3 cache.
  • The instances target latency-sensitive workloads like high-frequency trading, real-time financial analytics, and simulation modeling for automotive and aerospace industries. 
  • Built on sixth-generation Nitro Cards, they provide 2x networking throughput and 3x EBS throughput compared to M5zn instances.
  • M8azn instances come in nine sizes from 2 to 96 vCPUs with up to 384 GiB memory at a 4:1 memory-to-vCPU ratio, including two bare metal variants. Available in US East Virginia, US West Oregon, Tokyo, and Frankfurt regions through On-Demand, Spot, and Savings Plans pricing models.
  • The high-frequency positioning fills a specific niche for workloads requiring maximum single-threaded performance rather than just core count.
  • This complements AWS’s broader M8a lineup by offering customers a choice between standard frequency instances and these premium high-frequency variants for specialized use cases.

37:03 Announcing Amazon EC2 C8i, M8i, and R8i instances on second-generation AWS Outposts racks

  • AWS is bringing C8i, M8i, and R8i instances to second-generation Outposts racks, delivering 20% better performance and 2.5x more memory bandwidth compared to the previous C7i, M7i, and R7i generation. These instances also provide 20% more compute capacity within the same physical rack space and power consumption, improving density for on-premises deployments.
  • The new instances run on custom Intel Xeon 6 processors exclusive to AWS and target workloads that need enhanced on-premises performance, including large databases, memory-intensive applications, real-time analytics, high-performance video encoding, and CPU-based ML inference. 
  • This addresses the gap for customers who need cloud-class compute but must keep workloads on-premises due to latency, data residency, or regulatory requirements.
  • Second-generation Outposts racks continue AWS’s hybrid cloud strategy by extending the latest EC2 instance types to customer data centers with the same APIs and tooling as the public cloud. 
  • The availability varies by region, so customers should check the Outposts rack FAQs page for current country and territory support before planning deployments.
  • The performance improvements come primarily from the memory bandwidth increase and processor generation upgrade, which should benefit database operations, in-memory caching, and data-intensive applications that previously hit memory bottlenecks on Outposts. 
  • The power and space efficiency gains matter for customers with constrained data center capacity or energy budgets.

37:08 Amazon EC2 Hpc8a Instances powered by 5th Gen AMD EPYC processors are now available

  • AWS launches Hpc8a instances powered by 5th Gen AMD EPYC processors, delivering 40% higher performance and 42% greater memory bandwidth than the previous Hpc7a generation, while offering up to 25% better price-performance for tightly coupled HPC workloads like computational fluid dynamics and weather modeling.
  • The instances come in a single 96xlarge size with 192 cores, 768 GiB memory, and 300 Gbps Elastic Fabric Adapter networking, featuring customizable core counts at launch and sixth-generation AWS Nitro cards for offloaded virtualization functions. Simultaneous Multithreading is disabled by default to optimize HPC performance.
  • Available now in US East Ohio and Europe Stockholm regions, with support for AWS ParallelCluster, AWS Parallel Computing Service, and Amazon FSx for Lustre integration to simplify cluster management and provide sub-millisecond storage latencies. Customers can purchase as On-Demand Instances or through Savings Plans, with specific pricing available on the EC2 pricing page.
  • The 1:4 core-to-memory ratio and high core density target compute-intensive simulation workloads requiring rapid time-to-results, including crash simulations and high-resolution weather modeling within tight operational windows. The customizable core count feature allows right-sizing based on specific HPC workload requirements without paying for unused capacity.

39:20 📢 Ryan – “I’m sure they use a subcontractor for the actual maintenance. But I’m sure that you have to give them access and manage them just like you would any other remote hands for your data center.”

39:37 MSK simplifies Kafka topic management with new APIs and console integration

  • Amazon MSK now provides native AWS APIs for Kafka topic management, eliminating the need to set up and maintain separate Kafka admin clients. The three new APIs (CreateTopic, UpdateTopic, and DeleteTopic) work alongside the existing ListTopics and DescribeTopic APIs through the AWS CLI, SDKs, and CloudFormation, letting teams manage topics using standard AWS tooling and IAM permissions (see the sketch after this list).
  • The MSK console now consolidates all topic operations in one interface with guided defaults for creating and updating topics. Users can configure properties like replication factor, partition count, retention policies, and cleanup settings while viewing comprehensive partition-level metrics and configuration details directly in the console.
  • These capabilities are available at no additional cost for MSK provisioned clusters running Kafka version 3.6 and above across all regions where MSK is offered. Organizations need to configure appropriate IAM permissions to use the new APIs, with setup instructions available in the MSK Developer Guide.
  • The update addresses a common operational pain point where teams previously had to maintain separate Kafka admin tooling outside the AWS ecosystem. This integration brings Kafka topic management into standard AWS workflows, improving consistency with existing infrastructure-as-code practices and centralized access control through IAM.
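
Here’s roughly what topic creation looks like with standard AWS tooling instead of a Kafka admin client. CreateTopicCommand is the new API named above, but the request shape is an assumption based on the properties the console exposes (replication factor, partitions, retention).

```typescript
// CreateTopicCommand is the new API named in the announcement; the request
// shape below is an assumption based on the properties the console exposes.
import { KafkaClient, CreateTopicCommand } from "@aws-sdk/client-kafka";

const msk = new KafkaClient({ region: "us-east-1" });

await msk.send(
  new CreateTopicCommand({
    ClusterArn: "arn:aws:kafka:us-east-1:123456789012:cluster/demo/abc", // placeholder
    TopicName: "orders",
    Partitions: 6,
    ReplicationFactor: 3,
    Configs: { "retention.ms": "604800000", "cleanup.policy": "delete" },
  })
);
```

The nice part is that this call is now gated by IAM like any other AWS API, so topic management can live in the same infrastructure-as-code and access-control story as the rest of the stack.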

40:47 📢 Ryan – “I suspect this has more to do with Kafka than AWS because Kafka is notoriously hard to administer, so in a lot of cases there’s just not the ability…so I’m really happy to see this.”

42:40 Amazon Bedrock adds support for six fully-managed open weights models

  • Amazon Bedrock now supports six new open weights models, including DeepSeek V3.2, MiniMax M2.1, GLM 4.7, GLM 4.7 Flash, Kimi K2.5, and Qwen3 Coder Next, providing frontier-class performance at lower inference costs than proprietary alternatives. 
  • These models cover different enterprise needs from advanced reasoning and agentic tasks to autonomous coding with large output windows and lightweight production deployments.
  • The models run on Project Mantle, a new distributed inference engine that accelerates model onboarding to Bedrock while providing serverless inference with quality-of-service controls and automated capacity management. Project Mantle includes native OpenAI API compatibility, allowing customers to switch from OpenAI endpoints without code changes (sketched after this list).
  • The addition of these open weights models gives AWS customers more flexibility in model selection based on specific workload requirements and cost constraints. 
  • DeepSeek V3.2 and Kimi K2.5 handle complex reasoning tasks, while GLM 4.7 and MiniMax M2.1 support coding workflows with extended context windows, and Qwen3 Coder Next and GLM 4.7 Flash offer cost-efficient options for high-volume production use.
  • Project Mantle’s unified capacity pools and higher default quotas address common scaling challenges customers face when deploying large language models. 
  • The serverless architecture eliminates infrastructure management overhead, while the automated capacity management helps prevent quota limitations during peak usage periods.
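
The OpenAI compatibility is the practical hook: in principle you repoint the stock OpenAI client and keep your code. A sketch, with the base URL and model id as assumptions (check the Bedrock docs for the real values):

```typescript
import OpenAI from "openai";

// Point the stock OpenAI client at Bedrock instead. The base URL pattern and
// model id are assumptions — check the Bedrock docs for the real values.
const client = new OpenAI({
  baseURL: "https://bedrock-runtime.us-east-1.amazonaws.com/openai/v1",
  apiKey: process.env.AWS_BEARER_TOKEN_BEDROCK, // a Bedrock API key, not an OpenAI key
});

const completion = await client.chat.completions.create({
  model: "deepseek.v3-2", // hypothetical Bedrock model id
  messages: [{ role: "user", content: "Summarize our deployment runbook." }],
});

console.log(completion.choices[0].message.content);
```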

44:05 📢 Matt – “I like how they made it all compatible with OpenAI. It’s kind of like S3 compatibility; I feel like we’re slowly kind of coming to a standard, which means you can go play with it and see which model makes sense.”

46:02 Amazon EKS Auto Mode Announces Enhanced Logging for its Managed Kubernetes Capabilities

  • EKS Auto Mode now integrates with CloudWatch Vended Logs to automatically collect logs from its managed Kubernetes capabilities, including compute autoscaling, block storage, load balancing, and pod networking. 
  • This gives customers centralized visibility into Auto Mode’s infrastructure management operations without manual configuration.
  • The integration uses CloudWatch Vended Logs, which provides lower pricing than standard CloudWatch Logs while maintaining built-in AWS authentication and authorization. 
  • Customers can route logs to CloudWatch Logs, S3, or Kinesis Data Firehose, depending on their retention and analysis requirements, with standard destination charges applying.
  • Each Auto Mode capability can be configured independently as a log delivery source through the CloudWatch APIs or the AWS Console (see the sketch after this list).
  • This granular control allows teams to monitor specific components like the Karpenter-based autoscaler or VPC CNI networking without collecting unnecessary log data.
  • The feature addresses a common operational challenge where Auto Mode’s automated infrastructure management previously operated as a black box. DevOps teams can now troubleshoot issues like pod scheduling failures, storage provisioning problems, or load balancer configuration errors by examining the actual logs from Auto Mode’s control plane operations.
  • Available immediately in all regions where EKS Auto Mode operates, this logging capability helps bridge the observability gap between customer workloads and AWS-managed Kubernetes infrastructure components.
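
Hooking a capability up should look like standard CloudWatch Logs vended-log delivery. The three delivery APIs below are real; the Auto Mode logType value is an assumption:

```typescript
import {
  CloudWatchLogsClient,
  PutDeliverySourceCommand,
  PutDeliveryDestinationCommand,
  CreateDeliveryCommand,
} from "@aws-sdk/client-cloudwatch-logs";

const logs = new CloudWatchLogsClient({ region: "us-east-1" });

// 1. Register one Auto Mode capability as a delivery source.
await logs.send(new PutDeliverySourceCommand({
  name: "eks-auto-compute",
  resourceArn: "arn:aws:eks:us-east-1:123456789012:cluster/my-cluster",
  logType: "COMPUTE_AUTOSCALING", // hypothetical log type name
}));

// 2. Create a destination — a log group here, but S3 or Firehose ARNs also work.
const dest = await logs.send(new PutDeliveryDestinationCommand({
  name: "eks-auto-logs",
  deliveryDestinationConfiguration: {
    destinationResourceArn:
      "arn:aws:logs:us-east-1:123456789012:log-group:/eks/auto-mode",
  },
}));

// 3. Connect the source to the destination to start delivery.
await logs.send(new CreateDeliveryCommand({
  deliverySourceName: "eks-auto-compute",
  deliveryDestinationArn: dest.deliveryDestination!.arn!,
}));
```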

47:05 📢 Justin – “All I have to say is, some lovely CloudWatch PM just made their bonus this year by turning this on, as this is a lot of logging context that you now need to parse and pay for.”

49:26 AWS CloudWatch Alarm Mute Rules eliminate alert fatigue

  • CloudWatch Alarm Mute Rules let you temporarily silence alarm notifications during planned maintenance windows, deployments, or off-hours without disabling the underlying monitoring. 
  • The feature supports up to 100 alarms per rule with one-time or recurring schedules, and automatically triggers any suppressed actions once the mute period ends if the alarm state persists.
  • This addresses a common operational pain point where teams either ignore alerts during maintenance windows or use risky script-based workarounds that can be forgotten and leave monitoring disabled. 
  • The native integration eliminates the need for custom automation to manage notification states during planned activities.
  • The feature is available today across all AWS regions that support CloudWatch alarms at no additional cost beyond standard CloudWatch pricing. 
  • Configuration is done through the CloudWatch console or API (a hypothetical SDK sketch follows this list), with support for all alarm states, including OK, ALARM, and INSUFFICIENT_DATA.
  • Primary use cases include silencing non-critical alerts during scheduled deployments, muting development environment alarms outside business hours, and suppressing known issues during maintenance windows. 
  • This helps reduce alert fatigue while maintaining full visibility into system state and metrics collection.
  • The automatic re-triggering of muted actions ensures teams don’t miss persistent issues that started during a mute window, providing a safety mechanism that manual notification management typically lacks.
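
The announcement doesn’t spell out the SDK surface, so treat this as a purely hypothetical sketch of the shape; every identifier below is invented for illustration, while the 100-alarm limit and recurring schedules come from the announcement itself.

```typescript
import { CloudWatchClient } from "@aws-sdk/client-cloudwatch";
// Hypothetical command — the announcement doesn't name the SDK surface,
// so every identifier below is invented for illustration.
import { PutAlarmMuteRuleCommand } from "@aws-sdk/client-cloudwatch";

const cw = new CloudWatchClient({ region: "us-east-1" });

await cw.send(new PutAlarmMuteRuleCommand({
  Name: "weekly-deploy-window",
  AlarmNames: ["api-p99-latency", "api-error-rate"], // up to 100 alarms per rule
  Schedule: {
    Recurrence: "cron(0 3 ? * TUE *)", // recurring window: Tuesdays 03:00 UTC
    DurationMinutes: 60,
  },
  // Per the announcement, suppressed actions still fire after the window
  // ends if the alarm state persists — the safety net manual scripts lack.
}));
```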

50:49 📢 Ryan – “This is much nicer. Basically, set it for ignore for an hour and then have it kick back in. Glad to see this, but strange that it took this long.” 

52:48 Amazon EC2 supports nested virtualization on virtual Amazon EC2 instances

  • AWS now supports nested virtualization on standard EC2 instances, not just bare metal, allowing customers to run KVM or Hyper-V hypervisors inside virtual machines. This expands flexibility for development and testing scenarios that previously required more expensive bare metal instances.
  • The feature launches on the latest generation C8i, M8i, and R8i instance families across all commercial AWS regions. 
  • Customers can now run mobile app emulators, automotive hardware simulators, and Windows Subsystem for Linux on Windows workstations directly on virtual instances.
  • This capability addresses a long-standing limitation where nested virtualization required bare metal instances, which carry higher costs and longer provisioning times compared to standard virtual instances. 
  • The change makes nested environments more accessible for development teams and testing workflows.
  • Common use cases include software vendors who need to test their products across multiple operating systems, automotive companies simulating vehicle hardware environments, and mobile developers running Android or iOS emulators at scale. 
  • These workloads can now run on more cost-effective instance types with faster deployment.
  • The feature requires enabling hardware virtualization extensions in the instance configuration, with full documentation available in the EC2 user guide. Pricing follows standard EC2 rates for the C8i, M8i, and R8i instance families without additional charges for the nested virtualization capability itself.
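
As a sketch of the launch path: RunInstances is real, but the nested-virtualization toggle below is a hypothetical parameter name, so check the user guide for the actual configuration.

```typescript
import { EC2Client, RunInstancesCommand } from "@aws-sdk/client-ec2";

const ec2 = new EC2Client({ region: "us-east-1" });

await ec2.send(new RunInstancesCommand({
  ImageId: "ami-0123456789abcdef0", // placeholder AMI
  InstanceType: "m8i.4xlarge",      // one of the supported C8i/M8i/R8i families
  MinCount: 1,
  MaxCount: 1,
  // Hypothetical parameter name for the nested-virtualization toggle —
  // not in the SDK typings; see the EC2 user guide for the real flag.
  // @ts-expect-error
  NestedVirtualization: { Enabled: true },
}));
```

Once inside the guest, the usual check applies: `grep -cE 'vmx|svm' /proc/cpuinfo` should come back non-zero before you try to boot KVM guests.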

54:13 📢 Ryan – “These kinds of announcements are usually preceded or quickly followed with Nitro…and it’s neat. It’s neat how they isolate the hardware layer to match these workloads.” 

54:50 Announcing Amazon SageMaker Inference for custom Amazon Nova models

  • AWS now lets customers deploy custom-trained Amazon Nova models on SageMaker Inference with production-grade controls over instance types, auto-scaling, context length, and concurrency settings. 
  • This addresses customer requests for the same deployment flexibility they get with open-weight models, enabling full-rank customized Nova Micro, Nova Lite, and Nova 2 Lite models trained via SageMaker Training Jobs or HyperPod.
  • The service reduces inference costs by supporting more cost-effective EC2 G5 and G6 instances instead of requiring P5 instances, with auto-scaling based on 5-minute usage patterns and configurable inference parameters. 
  • Customers pay only for compute instances used with per-hour billing and no minimum commitments, following standard SageMaker pricing.
  • Deployment works through SageMaker Studio UI or SDK, supporting both real-time streaming and asynchronous batch inference modes.
  • The service includes advanced configuration options for context length up to 8000 tokens, max concurrency settings, and inference parameters like temperature and top-p for optimizing latency-cost-accuracy tradeoffs.
  • Currently available in US East N. Virginia and the US West Oregon regions, with support for Nova models with reasoning capabilities. 
  • Instance type requirements vary by model size, with Nova Micro supporting g5.12xlarge and up, Nova Lite requiring g5.48xlarge minimum, and Nova 2 Lite needing p5.48xlarge instances.

56:47 📢 Ryan – “It’s not an open-source model, and so it is kind of crazy that Nova offers that customization.” 

GCP

57:25 Gemini 3 Deep Think: AI model update designed for science

  • Google has released a major update to Gemini 3 Deep Think, a specialized reasoning mode designed for complex scientific and engineering problems where data is messy or incomplete, and solutions aren’t straightforward. 
  • The model achieved notable benchmark results, including 48.4% on Humanity’s Last Exam, 84.6% on ARC-AGI-2, and gold medal performance on the 2025 International Math, Physics, and Chemistry Olympiads.
  • Early adopters are using Deep Think for practical applications like identifying logical flaws in peer-reviewed mathematics papers, optimizing semiconductor crystal growth fabrication methods, and converting sketches into 3D-printable files with generated code. 
  • The model combines deep scientific knowledge with engineering utility to move beyond theoretical work into applied research.
  • The updated Deep Think is available now to Google AI Ultra subscribers through the Gemini app, with pricing following the existing Ultra subscription model. 
  • For the first time, Google is offering API access through an early access program for select researchers, engineers, and enterprises who can apply through a Google form.
  • The release targets scientific research institutions and engineering teams working on complex problems in physics, chemistry, materials science, and advanced mathematics, where traditional AI models struggle with ambiguous requirements. 
  • Deep Think’s ability to work with incomplete data and generate executable code for physical modeling makes it particularly relevant for R&D workflows.

1:00:19 New global queries in BigQuery span data from multiple regions

  • BigQuery global queries now allow users to run a single SQL statement across datasets stored in multiple geographic regions without requiring ETL pipelines or data replication. 
  • The feature automatically handles cross-region data movement in the background while respecting existing security controls like VPC Service Controls and requiring explicit opt-in at both the project and user level.
  • The primary use case targets multinational organizations that need to analyze distributed data for compliance or performance reasons, such as joining US customer data with European transaction logs and Asian operational data in one query. 
  • EssilorLuxottica is using this to perform cross-region aggregated analysis while maintaining data residency requirements for security and compliance. (DOES IT THOUGH?) 
  • Users maintain control over where queries execute and can specify the processing location to meet data residency requirements (see the sketch after this list), though cross-region data transfers will incur additional egress costs that organizations need to factor into their analytics budgets.
  • The feature is currently in preview with documentation available here.
  • This addresses a longstanding limitation in cloud data warehousing, where geographic data distribution required complex engineering solutions, now replaced by standard SQL queries that any authorized analyst can run directly from the BigQuery console. The feature respects governance controls by default and prevents accidental data movement through required permissions and explicit enablement.
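
A sketch from the Node client: the dataset names are illustrative, but the location option for pinning where the query runs is the real knob described above.

```typescript
import { BigQuery } from "@google-cloud/bigquery";

const bq = new BigQuery();

// Dataset names are illustrative; the point is one SQL statement spanning
// datasets stored in different regions, with no ETL pipeline in between.
const query = `
  SELECT c.customer_id, SUM(t.amount) AS total
  FROM us_sales.customers AS c       -- dataset stored in a US region
  JOIN eu_payments.transactions AS t -- dataset stored in an EU region
    ON t.customer_id = c.customer_id
  GROUP BY c.customer_id
`;

const [rows] = await bq.query({
  query,
  location: "US", // pin where the query executes for residency requirements
});
console.log(rows.slice(0, 5));
```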

1:01:36 📢 Matt – “I feel like it is compliant… if you’re running local and you’re not collecting anything that could be confidential. So it depends on how your lawyer at your company interprets it.”

Azure

1:03:47 Agentic cloud operations and Azure Copilot for AI‑driven workloads

  • Microsoft introduces agentic cloud operations through Azure Copilot, which uses AI agents to automate and coordinate cloud management tasks across the full infrastructure lifecycle. Instead of adding another dashboard, Azure Copilot provides a unified interface accessible through natural language, chat, console, or CLI that connects directly to a customer’s actual Azure environment, including subscriptions, resources, and policies.
  • Azure Copilot includes six specialized agents that handle migration discovery and dependency mapping, deployment with infrastructure-as-code generation, continuous observability across the full stack, cost and performance optimization with carbon impact analysis, resiliency management including ransomware protection, and troubleshooting with root cause diagnosis. 
  • These agents work as a connected system rather than isolated tools, correlating signals and taking action within existing RBAC and policy controls.
  • The service maintains governance through built-in oversight features, including Bring Your Own Storage for conversation history, which keeps operational data within the customer’s Azure environment for compliance and sovereignty requirements. 
  • All agent-initiated actions are reviewable, traceable, and auditable while respecting existing security policies and role-based access controls.
  • Target customers are organizations running modern applications and AI workloads at scale, where traditional manual operations cannot keep pace with rapid deployment cycles and infrastructure changes. 
  • The approach addresses environments where workloads move from experimentation to production in weeks and where telemetry streams continuously from every layer of the stack.
  • Pricing details were not disclosed in the announcement, though the service builds on existing Azure Copilot capabilities introduced at Microsoft Ignite. Organizations can access resources and get started at azure.microsoft.com/products/copilot.

1:05:39 📢 Matt – “Also, a developer actually understanding what they want and telling you what they want and actually being useful? I would love to see too, because how many times have we built something, deployed it, day before the release – we actually need these 16 other things that we didn’t tell you about that we manually did in our dev environment, which is why it’s working… and the release is tomorrow. Good luck. Why is it not done yet?”

1:06:18 General Availability: Instant access support for incremental snapshots of Azure Premium SSD v2 and Ultra Disk

  • Azure now offers instant access to incremental snapshots for Premium SSD v2 and Ultra Disk storage, eliminating the previous wait time when restoring disks from snapshots. 
  • This addresses a significant operational pain point for customers running high-performance workloads that require rapid disaster recovery or quick environment provisioning.
  • The feature specifically targets enterprise customers using Azure’s highest-tier storage options, Premium SSD v2 and Ultra Disk, which are typically deployed for mission-critical databases, SAP HANA, and other latency-sensitive applications. 
  • Previously, customers had to wait for snapshot data to fully hydrate before using restored disks, creating delays in recovery scenarios.
  • Incremental snapshots only capture changes since the last snapshot, reducing storage costs and backup windows compared to full snapshots. 
  • With instant access now available, customers can immediately mount and use restored disks while background hydration completes (see the sketch after this list), improving recovery time objectives for business continuity planning.
  • This capability brings Premium SSD v2 and Ultra Disk snapshot functionality closer to parity with standard Azure managed disk snapshots. 
  • The feature is now generally available across Azure regions where Premium SSD v2 and Ultra Disk are supported, though specific pricing for snapshot storage follows existing Azure snapshot pricing models based on stored data volume.
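
From the SDK, a restore is the same disk-from-snapshot call as before; instant access just means the disk is usable while hydration finishes. A sketch with placeholder resource names:

```typescript
import { ComputeManagementClient } from "@azure/arm-compute";
import { DefaultAzureCredential } from "@azure/identity";

const compute = new ComputeManagementClient(
  new DefaultAzureCredential(),
  "<subscription-id>" // placeholder
);

// Create a Premium SSD v2 disk from an incremental snapshot. With instant
// access, the disk is attachable immediately while hydration completes in
// the background.
await compute.disks.beginCreateOrUpdateAndWait("my-rg", "restored-db-disk", {
  location: "eastus",
  sku: { name: "PremiumV2_LRS" },
  creationData: {
    createOption: "Copy",
    sourceResourceId:
      "/subscriptions/<sub>/resourceGroups/my-rg/providers/Microsoft.Compute/snapshots/db-snap-001",
  },
});
```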

1:06:25 📢 Justin – “Welcome to what Amazon and Google have been doing for quite a while, so thanks, Azure!”

Emerging Clouds 

1:08:16 Introducing the Crusoe Cloud MCP server

  • Crusoe Cloud released an MCP server that connects AI coding assistants like Claude Code and Cursor directly to cloud infrastructure, but unlike typical API wrappers, it returns filtered responses designed specifically for LLM consumption to avoid flooding context windows with unnecessary data. 
  • The server includes composite tools like get_resource_relationships that map entire infrastructure topologies in a single call by fetching 11 resource types in parallel and resolving cross-references, something that doesn’t exist in their CLI or any single API endpoint (see the sketch after this list).
  • The cluster_health_check tool provides pre-analyzed node-level health metrics organized by InfiniBand pod placement, returning structured summaries with problem nodes flagged rather than raw metric time series that would require additional processing. 
  • This approach addresses a key limitation of AI agents working with cloud infrastructure: most MCP implementations just wrap CLI commands and return the same JSON a human would see, forcing the AI to parse through irrelevant metadata and empty fields.
  • The implementation reflects a broader trend of cloud providers releasing MCP servers, but Crusoe’s focus on response filtering and burst-heavy access patterns specific to AI agents suggests infrastructure management tools are being redesigned around LLM capabilities rather than human interaction patterns. For developers already using AI coding assistants, this enables natural language infrastructure queries and troubleshooting without manual scripting or console navigation.
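
Because it’s standard MCP underneath, any MCP client should be able to call those composite tools. A sketch with the official TypeScript SDK, where the launch and auth details and the argument shape are assumptions, and only the tool name comes from the announcement:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// How Crusoe's MCP server is launched and authenticated is an assumption.
const transport = new StdioClientTransport({
  command: "crusoe-mcp-server", // hypothetical binary name
});
const client = new Client({ name: "demo-client", version: "1.0.0" });
await client.connect(transport);

// One composite call returns the whole topology, instead of the agent
// stitching together a dozen paginated list calls itself.
const result = await client.callTool({
  name: "get_resource_relationships",
  arguments: { project_id: "my-project" }, // hypothetical argument shape
});
console.log(result.content);
```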

1:10:16 📢 Ryan – “This is gonna be chaos.” 

1:10:21 Introducing Markdown for Agents

  • Cloudflare now automatically converts HTML to markdown for AI agents using content negotiation headers, reducing token usage by up to 80 percent. 
  • When agents request pages with Accept: text/markdown, Cloudflare’s network performs the conversion in real time at the edge (see the sketch after this list), eliminating the need for downstream processing and reducing costs for AI systems.
  • The feature addresses a fundamental inefficiency where AI agents waste tokens parsing HTML markup, navigation elements, and styling that have no semantic value. 
  • A simple heading that costs 3 tokens in markdown can consume 12-15 tokens in HTML, and this blog post example shows 16,180 tokens in HTML versus 3,150 in markdown.
  • Cloudflare includes an x-markdown-tokens header with converted responses to help developers calculate context window sizes and chunking strategies. The service also automatically adds Content-Signal headers indicating the content can be used for AI training, search results, and agentic use, integrating with their Content Signals framework from Birthday Week.
  • The feature is available in beta at no cost for Pro, Business, and Enterprise plans, with Cloudflare already enabling it on their own blog and developer documentation. 
  • Popular coding agents like Claude Code and OpenCode already send the appropriate accept headers, positioning this as infrastructure for the shift from traditional SEO to AI-driven content discovery.
  • Cloudflare Radar now tracks content type distribution for AI bot traffic, allowing analysis of how different agents consume web content over time. This data is accessible through public APIs and shows early adoption patterns like OAI-Searchbot requesting markdown content.
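
This one you can try today. The Accept request header and x-markdown-tokens response header come straight from the announcement; the URL is just an example page:

```typescript
// Ask for the markdown representation via content negotiation.
const res = await fetch("https://blog.cloudflare.com/markdown-for-agents/", {
  headers: { Accept: "text/markdown" },
});

console.log(res.headers.get("content-type"));      // text/markdown when converted
console.log(res.headers.get("x-markdown-tokens")); // token estimate for context budgeting
const markdown = await res.text();
console.log(markdown.slice(0, 200));
```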

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod
