Welcome to episode 350 of The Cloud Pod, where the weather is always cloudy! Justin, Jonathan, and Matt are this week’s hosts, and they’ve scoured the clouds for all the latest news and announcements, including that Mythos drop. Is it the AI apocalypse that everyone is claiming? We’ve also got news from DigitalOcean, an email from space, Claude, and even some Guardrails. There’s a lot to cover, so let’s get started!
Titles we almost went with this week
- 🎙️ Two AIs Walk Into a Studio and Actually Sound Good
- 🧑💻 No More Idle GPUs Twiddling Their Tensor Cores
- 🗺️ When AWS Availability Zones Become Unavailability Zones
- 🪙 Token by Token Codex Pricing Finally Makes Cents
- 💸 Just Ask AWS Where All Your Money Went
- 🔐 You’ve Got mTLS: Amazon SES Locks Down Email Security
- 🗣️ Cost Explorer Finally Speaks Plain English
- 🗾 Missiles Make AWS Multi-Region Strategy Mandatory
- 🐚 Shell Yeah Your Agent State Now Persists
- 🪣 S3 Files Finally Lets You ls Your Bucket
- 🚀 Claude Found Your Zero-Day Before Lunch
- 🧱 One Guardrail to Rule All Your AWS Accounts
- Premium SSD Wins Azure VDI but Your Wallet Cries
- 😢 No More Amnesia: Your Bedrock Agent Keeps Its Memories
- 🦞 Pay Per Claw Anthropic Sharpens Its Pricing Policy
- 🧑🚀 Even Astronauts Need IT Support for Microsoft Outlook
- ❓ AWS still can’t answer the question of what EC2 Other is
- 📢 AWS announces several new Unavailability Zones
A big thanks to this week’s sponsor:
There are a lot of cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it’s really simple. Archera gives you the cost savings of a 1 or 3-year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you have committed to, Archera will literally cover the difference. Other cost management tools may say they offer “insured commitments”, but remember to ask: Will you actually give me my rebate? Because Archera will.
Check out thecloudpod.net/archera to schedule a demo today.
Follow Up
00:45 Ground control to Microsoft: Artemis 2 astronauts deal with Outlook hiccup in deep space
- Artemis 2 astronauts aboard NASA’s Orion spacecraft encountered a common Outlook configuration issue on their first day in space, requiring remote IT support from Mission Control to resolve it by reloading the commander’s files.
- NASA uses commercial off-the-shelf software like Microsoft Outlook for crew scheduling and personal communications, while keeping primary flight systems on separate radiation-hardened hardware, illustrating a practical separation of concerns in mission-critical environments.
- The Outlook issue stemmed from the app having configuration problems when no direct network connection is available, which the flight director noted is not uncommon, raising questions about offline-readiness for software deployed in connectivity-constrained environments.
- This incident is a useful reminder for cloud and enterprise software users that applications heavily dependent on network connectivity can behave unpredictably in low or no-connectivity scenarios, and offline mode reliability remains an important consideration for software selection.
- Microsoft has not issued a public comment, but the episode highlights how widely deployed enterprise software is, reaching use cases well beyond what vendors typically design or test for.
03:31 Iran declared AWS, Google, and Microsoft data centers military targets. The legal and strategic fallout is just beginning
- Iran’s April 2025 declaration named the Joint Warfighting Cloud Capability (JWCC) contract specifically, arguing that AWS, Google, Microsoft, and Oracle data centers hosting Pentagon AI and intelligence workloads have lost civilian status under the Geneva Conventions principle of distinction.
- The legal argument centers on the fact that classified military workloads share physical infrastructure with banking, healthcare, and consumer services.
- The JWCC contract, worth up to $9 billion, was deliberately designed to distribute military workloads across multiple commercial providers to avoid vendor lock-in, but this decision inadvertently spread the targeting problem across every major hyperscaler simultaneously rather than containing it to a single provider.
- Northern Virginia, the densest data center concentration on Earth, hosts the Pentagon’s most sensitive cloud workloads, meaning a single facility in Ashburn could simultaneously process classified Pentagon data, hospital records, and financial transactions with no practical way to separate them once a conflict begins.
- Insurance and operational costs are already responding to this risk, with businesses in geopolitically sensitive regions facing substantially higher premiums for multi-region redundancy and war-risk coverage, costs that will eventually pass through to end customers regardless of whether any strike occurs.
- The article identifies three structural fixes: DoD physically isolating JWCC workloads from civilian infrastructure, Congress updating defense cloud procurement rules to account for civilian collateral risk, and hyperscalers disclosing to commercial customers whether their specific facilities host military workloads.
- None of these changes are currently underway at a meaningful scale.
04:05 📢 Justin – “In the case of FedRAMP and JWCC, those are typically in the FedRAMP data centers in the US, so it’s a little bit of an interesting distinction, but there’s no guarantee that they’re not putting FedRAMP-type workloads into regions closer to the war zone. There’s no conversation about that, so I can see Iran’s point in this. And this will definitely make insurance and operating in clouds more expensive for companies who are very politically sensitive.”
AI Is Going Great – Or How ML Makes Money
08:07 Codex now offers pay-as-you-go pricing for teams
- OpenAI is introducing pay-as-you-go pricing for Codex-only seats within ChatGPT Business and Enterprise workspaces, billing on token consumption with no rate limits instead of a fixed per-seat fee, giving teams more cost visibility across workflows.
- ChatGPT Business annual pricing drops from $25 to $20 per seat for teams that want standard ChatGPT access with Codex usage limits included, while the new Codex-only seat option serves teams that want dedicated coding agent access without the broader ChatGPT bundle.
- OpenAI is offering eligible Business workspaces $100 in credits per new Codex-only team member added, capped at $500 per team for a limited time, which lowers the barrier for initial pilots.
- Codex now supports Plugins and Automations through its macOS and Windows apps, allowing teams to connect the coding agent to existing internal systems and tooling rather than treating it as a standalone tool.
- OpenAI reports over 2 million weekly active Codex builders and a 6x growth in Codex users within Business and Enterprise accounts since January, with named customers including Notion, Ramp, and Braintrust using it to standardize engineering workflows.
09:31 📢 Jonathan – “I think if you want the best performance, you’re going to have to pay for what you use. I think anyone that’s paying for a bundle is always going to be second class.”
13:57 Anthropic says Claude Code subscribers will need to pay extra for OpenClaw usage
- Starting April 4, Anthropic is requiring Claude Code subscribers to pay separately on a pay-as-you-go basis for usage through third-party tools like OpenClaw, rather than drawing from their existing subscription limits.
- The change applies to third-party harnesses broadly, with more platforms expected to be added over time.
- Anthropic’s head of Claude Code cited infrastructure constraints and unsustainable usage patterns from third-party tools as the reason for the change, and noted the company is offering full refunds to subscribers who were unaware of the policy shift.
- The timing is notable given that OpenClaw’s creator, Peter Steinberger, recently joined OpenAI, and OpenClaw continues as an open source project with OpenAI backing. Steinberger publicly stated he attempted to negotiate with Anthropic and only managed to delay the pricing change by one week.
- For developers building on or using AI coding assistants through third-party integrations, this signals a broader industry pattern where AI providers may separate subscription pricing from API-level or harness-level consumption, adding cost complexity for teams relying on open source tooling around proprietary models.
- OpenAI recently shut down its Sora app to reallocate compute resources, reflecting that both major AI providers are actively managing infrastructure capacity as demand from software engineering use cases like Claude Code continues to grow.
14:59 📢 Jonathan – “I understand *why* they’re doing it, because there’s a big difference between somebody having a conversation or somebody doing coding, where you are mostly using cache hits for the majority of the work, versus OpenClaw where the context changes constantly, and making calls every 60 seconds. It is a completely different type of workload. At the same time, I’m paying $200 a month…”
16:38 Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute
- Anthropic has signed a multi-gigawatt TPU capacity agreement with Google and Broadcom, with infrastructure expected to come online starting in 2027. This builds on an existing October 2025 TPU expansion and deepens Anthropic’s reliance on Google Cloud alongside AWS and NVIDIA hardware.
- Anthropic’s run-rate revenue has grown from roughly $9 billion at the end of 2025 to over $30 billion, and the number of enterprise customers spending over $1 million annually has doubled from 500 to more than 1,000 in under two months. The compute expansion is a direct response to this accelerating demand.
- Anthropic continues a multi-cloud hardware strategy, running Claude on AWS Trainium, Google TPUs, and NVIDIA GPUs to match workloads to appropriate chips. Amazon remains the primary cloud and training partner, with ongoing work on Project Rainier.
- Claude is currently the only frontier AI model available across all three major cloud platforms: Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
- This broad availability has practical implications for enterprises already committed to any of the three major cloud providers.
- The majority of new compute will be US-based, extending Anthropic’s November 2025 pledge to invest $50 billion in American AI infrastructure.
- For cloud practitioners, this signals continued long-term capacity constraints driving large-scale, multi-year infrastructure commitments across the industry.
18:24 Project Glasswing: Securing critical software for the AI era
- Anthropic announced Project Glasswing, a coalition including AWS, Google, Microsoft, Apple, Cisco, NVIDIA, and others, built around a new unreleased model called Claude Mythos Preview that is focused specifically on finding and fixing software vulnerabilities in critical infrastructure.
- Mythos Preview has already identified thousands of high-severity vulnerabilities autonomously, including a 27-year-old flaw in OpenBSD, a 16-year-old bug in FFmpeg that survived 5 million automated test runs, and a Linux kernel privilege escalation chain, all of which have since been patched.
- The model will not be generally available, but partners can access it via the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry at $25 per million input tokens and $125 per million output tokens, after an initial period covered by $100M in Anthropic usage credits (a quick cost sketch follows this list).
- Anthropic is donating $4M to open-source security organizations, including Alpha-Omega, OpenSSF through the Linux Foundation, and the Apache Software Foundation, to help maintainers respond to vulnerabilities the model surfaces.
- The initiative signals a shift in how AI safety and capability tradeoffs are being handled in practice, with Anthropic planning to test new cybersecurity safeguards on an upcoming Claude Opus model before considering any broader deployment of Mythos-class capabilities.
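For a sense of scale, here is a quick back-of-the-envelope cost sketch in Python. Only the per-token rates come from the announcement; the scan sizes are invented purely for illustration.

```python
# Announced Mythos Preview rates: $25 per 1M input tokens, $125 per 1M output.
INPUT_RATE = 25 / 1_000_000    # dollars per input token
OUTPUT_RATE = 125 / 1_000_000  # dollars per output token

def scan_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollars for one analysis pass at the announced rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical scan: ~50M tokens of source code in, ~2M tokens of findings out.
print(f"${scan_cost(50_000_000, 2_000_000):,.2f}")  # -> $1,500.00
```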
20:44 📢 Justin – “…is it probably really great at finding stuff? Is it really good at chaining things together to find these attacks? Yes. Is it as scary as they make it out to be? Maybe, maybe not. I don’t know; time will tell. I’m not going to be spending money on Mythos tokens to find out, but I am curious to see what people are coming out with now that it’s out in the wild.”
AWS
21:56 Announcing managed daemon support for Amazon ECS Managed Instances
- ECS Managed Daemons lets platform engineers deploy and update monitoring, logging, and tracing agents independently from application teams, eliminating the need to coordinate task definition changes or service redeployments across hundreds of services.
- Daemons are guaranteed to start before application tasks and drain last, ensuring operational tooling like the CloudWatch Agent is always available throughout the application lifecycle, including during rolling updates.
- A new daemon_bridge network mode keeps daemon containers isolated from application networking while still allowing communication, and daemons support privileged container access and host filesystem mounts for deep system-level visibility.
- Each instance runs exactly one daemon copy shared across all application tasks on that instance, which optimizes resource utilization and allows CPU and memory parameters to be managed centrally without rebuilding AMIs or modifying application task definitions.
- The feature is available now in all AWS regions at no additional charge beyond standard compute costs for the daemon tasks themselves, and can be configured through the ECS console or the new managed daemons API (a hypothetical spec sketch follows this list).
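To make the moving parts concrete, here is a hypothetical daemon spec written as a Python dict. The field names are guesses based on the capabilities described above, not the actual managed daemons API; check the ECS documentation for the real shape.

```python
# Hypothetical managed daemon spec, purely illustrative of the concepts above:
# daemon_bridge networking, privileged access, host mounts, and centrally
# managed CPU/memory. Real API fields may differ.
cloudwatch_daemon = {
    "daemonName": "cloudwatch-agent",        # one copy per instance
    "image": "public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest",
    "networkMode": "daemon_bridge",          # isolated from app networking
    "privileged": True,                      # deep system-level visibility
    "mountPoints": [
        {"sourcePath": "/proc", "containerPath": "/host/proc", "readOnly": True},
    ],
    "cpu": 128,    # managed centrally; no app task-definition changes needed
    "memory": 256,
}
```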
22:55 📢 Jonathan – “What a useful feature!”
23:36 Amazon SES Mail Manager adds new features for enhanced security and email processing
- Amazon SES Mail Manager now supports optional STARTTLS configuration, allowing legacy systems that lack full STARTTLS support to still connect to Mail Manager without requiring a full infrastructure overhaul.
- Mutual TLS (mTLS) adds certificate-based authentication at the Ingress Endpoint level, giving organizations a stronger identity verification layer for inbound email connections beyond standard encryption.
- Two new rule actions expand email processing flexibility: Invoke Lambda lets you trigger custom code directly from rule sets for advanced routing or transformation logic (a minimal handler sketch follows this list), while the Bounce action sends RFC-compliant SMTP rejection responses back to sending servers.
- These features are available now across most SES Mail Manager regions, with the notable exceptions of Middle East (UAE) and Middle East (Bahrain), so customers in those regions will need to wait for expansion.
- Pricing for SES Mail Manager follows existing SES usage-based pricing, so the cost impact of these new features will depend on Lambda invocation volume and overall email processing scale rather than any new flat fees.
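As a rough illustration of the Invoke Lambda rule action, a minimal handler might look like the following; the event fields are assumptions, since the announcement doesn’t document the payload shape.

```python
import json

def handler(event, context):
    """Minimal Mail Manager Invoke Lambda sketch. The event fields below
    ("envelope", "subject") are assumptions for illustration; check the
    SES Mail Manager docs for the actual rule-action payload."""
    sender = event.get("envelope", {}).get("from", "")
    subject = event.get("subject", "")

    # Example transformation logic: route invoices to a dedicated workflow.
    route = "invoices" if "invoice" in subject.lower() else "default"
    print(json.dumps({"sender": sender, "route": route}))
    return {"route": route}
```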
24:38 📢 Justin – “They’ve actually added a lot of features to Mail Manager recently; the fact that it now can handle bounce protection and handles all of that stuff that you used to have to build your own toil for, it’s nice that that stuff is now there.”
26:07 Amazon Bedrock Guardrails supports cross-account safeguards with centralized control and management
- Amazon Bedrock Guardrails now supports cross-account safeguards in general availability, letting organizations enforce a single guardrail policy across all AWS accounts and organizational units from a central management account, covering every Bedrock model invocation automatically.
- There are two enforcement levels: organization-level enforcement applies guardrails via AWS Organizations policies to all member accounts and OUs, while account-level enforcement applies guardrails to all Bedrock inference calls within a single account, giving teams flexibility to layer controls (a per-request sketch follows this list).
- A notable configuration option lets admins choose between Comprehensive mode, which enforces guardrails on all content regardless of caller tagging, and Selective mode, which only applies guardrails to content that callers explicitly tag, useful for mixed workloads with pre-validated and user-generated content.
- One practical gotcha worth flagging: specifying an incorrect guardrail ARN in the policy doesn’t fail gracefully; it blocks all Bedrock model inference for the affected accounts, so ARN accuracy is critical before attaching policies to production OUs.
- The feature is available now across all AWS commercial and GovCloud regions where Bedrock Guardrails is supported, with pricing tied to each enforced guardrail based on its configured safeguards per the Amazon Bedrock pricing page.
- Automated Reasoning checks are not supported with this capability.
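For a sense of what centralized enforcement takes off individual teams’ plates, here is a minimal sketch of attaching a guardrail per request via the Bedrock Converse API; the guardrail ARN and model ID are placeholders. With organization-level enforcement, equivalent checks apply even when a caller omits this configuration.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Attach a guardrail to a single Converse call; IDs are placeholders.
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize our refund policy."}]}],
    guardrailConfig={
        "guardrailIdentifier": "arn:aws:bedrock:us-east-1:111122223333:guardrail/EXAMPLE",
        "guardrailVersion": "1",
        "trace": "enabled",  # include the guardrail's evaluation in the response
    },
)
print(response["output"]["message"]["content"][0]["text"])
```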
27:02 📢 Justin – “If you are not careful, you can lock yourself out of your domain.”
28:09 AWS Cost Explorer launches Natural Language Query capabilities powered by Amazon Q
- AWS Cost Explorer now supports natural language queries powered by Amazon Q Developer, letting users ask plain-English questions like “Show me my top spending services this month” and receive both written insights and automatically updated charts, filters, and groupings simultaneously.
- The feature supports conversational follow-up questions with maintained context, meaning users can move from a quick cost check to a detailed investigation without switching tools or manually reconfiguring visualizations.
- When Amazon Q pulls from additional datasets beyond raw cost and usage data, such as pricing catalogs or anomaly detection, those results appear in a separate artifacts panel rather than the main Cost Explorer view, which is a useful distinction to understand when interpreting outputs.
- This is available at no additional charge across all commercial AWS Regions today, making it accessible without budget justification for teams already using Cost Explorer.
- The practical impact is that non-technical stakeholders, like finance or product teams, can now query AWS spend directly without needing to understand Cost Explorer’s filter and grouping mechanics, potentially reducing the bottleneck on cloud or DevOps teams for routine cost reporting.
28:44 📢 Justin – “I did play with this, because I was curious, and I’ve done a lot of really cool things with AI for cost management recently. It’s not very good. Like most Amazon Q things, it’s not great.”
31:38 Building real-time conversational podcasts with Amazon Nova 2 Sonic
- Amazon Nova 2 Sonic is a speech-to-speech model available through Amazon Bedrock that handles real-time conversational AI with support for seven languages and a 1 million token context window, making it practical for voice-first applications like customer support and interactive learning.
- The AWS blog post demonstrates a proof-of-concept podcast generator that uses two Nova Sonic instances to simulate a host-and-expert dialogue, streaming audio in real time using a Flask and AsyncIO architecture with RxPy for reactive event processing.
- A notable technical detail is the stage-aware content filtering system, which distinguishes between SPECULATIVE and FINAL generation stages to eliminate duplicate audio chunks and prevent artifacts, using a combination of interruption markers, text deduplication, and audio hash fingerprinting (sketched after this list).
- The architecture captures audio at 16kHz PCM input and returns synthesized speech at 24kHz PCM output through a bidirectional event stream brokered by Amazon Bedrock, with the blog noting that PyAudio is suitable for server-side demos, but production deployments should use Web Audio API or WebRTC for browser clients.
- Practical use cases beyond podcasting include multilingual content localization, ecommerce product commentary, and enterprise training content, with pricing tied to Amazon Bedrock consumption-based rates rather than a fixed subscription, so costs scale with actual usage volume.
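The blog’s actual filter code isn’t reproduced here, but the audio-hash fingerprinting idea is straightforward to sketch. A minimal reconstruction, assuming raw PCM chunks and an interruption signal:

```python
import hashlib

class AudioChunkFilter:
    """Suppress replayed audio: hash each PCM chunk and drop any chunk whose
    bytes were already played, e.g. a FINAL-stage re-send of audio that
    already streamed during the SPECULATIVE stage."""

    def __init__(self) -> None:
        self._played: set[str] = set()

    def accept(self, pcm_chunk: bytes) -> bool:
        """Return True if the chunk should be played, False if it's a dupe."""
        fingerprint = hashlib.sha256(pcm_chunk).hexdigest()
        if fingerprint in self._played:
            return False
        self._played.add(fingerprint)
        return True

    def reset(self) -> None:
        """Clear state on an interruption marker so a fresh utterance that
        legitimately repeats earlier audio isn't suppressed."""
        self._played.clear()

# Usage: gate each streamed chunk before handing it to audio output.
f = AudioChunkFilter()
assert f.accept(b"\x00\x01\x02") is True   # first time: play it
assert f.accept(b"\x00\x01\x02") is False  # re-send: drop it
```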
32:32 📢 Matt – “Still not out of a podcasting job yet. Got it.”
33:40 Launching S3 Files, making S3 buckets accessible as file systems
- Amazon S3 Files lets you mount any general-purpose S3 bucket as a native NFS v4.1 file system on EC2, ECS, EKS, and Lambda, meaning you can use standard file commands like ls, cp, and echo while changes sync back to S3 within minutes (see the sketch after this list).
- This eliminates the longstanding tradeoff between S3’s durability and cost versus a file system’s interactive capabilities.
- Under the hood, S3 Files is built on EFS and delivers approximately 1ms latency for active data, with intelligent pre-fetching and byte-range reads to minimize unnecessary data transfer and costs.
- Files not on high-performance storage are served directly from S3 to maximize throughput for large sequential reads.
- The feature is positioned specifically for workloads where multiple compute resources need shared, concurrent access to the same data without duplication, including agentic AI systems using file-based Python tools and ML training pipelines. It supports NFS close-to-open consistency for collaborative workloads.
- Pricing is based on data stored in the file system, small file reads, write operations, and S3 requests during synchronization, so costs will vary significantly by access pattern and workload type.
- Full pricing details are on the S3 pricing page, and the service is available now in all commercial AWS regions.
- AWS is careful to position S3 Files alongside rather than replacing EFS and FSx, noting FSx remains the better choice for on-premises NAS migrations, HPC workloads with Lustre, and workloads requiring NetApp ONTAP or Windows File Server compatibility.
- Last Week in AWS blog: S3 Is Not a Filesystem (But Now There’s One In Front of It)
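Because the mount is plain NFS, ordinary file APIs work against it. A minimal sketch, assuming a placeholder mount point:

```python
import os
import shutil

MOUNT = "/mnt/my-bucket"  # placeholder path where the S3 Files share is mounted

# The "ls" equivalent: list objects as ordinary directory entries.
print(os.listdir(MOUNT))

# Write a file; per the announcement, changes sync back to S3 within minutes.
with open(os.path.join(MOUNT, "notes.txt"), "w") as f:
    f.write("hello from NFS\n")

# The "cp" equivalent.
shutil.copy(os.path.join(MOUNT, "notes.txt"),
            os.path.join(MOUNT, "notes-copy.txt"))
```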
35:32 📢 Jonathan – “In other possibly related news, the NetApp stock price is down from $104 to $96 in the past 24 hours, because that’s basically what NetApp tiered storage does.”
GCP
39:37 Unifying real-time and async inference with GKE Inference Gateway
- GKE Inference Gateway now supports both real-time and async inference workloads on the same shared GPU/TPU accelerator pool, eliminating the need to maintain separate clusters for each traffic type.
- This addresses a common infrastructure inefficiency where real-time clusters sit idle during off-peak hours while async jobs run on underutilized secondary hardware.
- The async component works by integrating a Batch Processing Agent with Cloud Pub/Sub, where latency-tolerant requests are pulled from a queue and routed to the Inference Gateway as lower-priority “sheddable” traffic that fills unused compute cycles between real-time spikes (see the sketch after this list).
- Testing showed that without the Async Processor Agent, unmanaged multiplexing of low-priority requests caused a 99% message drop rate, while using the agent resulted in 100% of latency-tolerant requests being served during available capacity windows.
- This demonstrates that the priority enforcement mechanism is doing meaningful work, not just theoretical traffic shaping.
- The project is open source and available on GitHub at github.com/llm-d-incubation/llm-d-async, meaning teams can use it across multiple cloud environments rather than being locked into GKE specifically. Pricing would follow standard GKE and Pub/Sub usage costs with no separate charge for the gateway component itself.
- The next development phase will add deadline-aware scheduling, letting users set soft completion windows for batch jobs so the system can make more informed decisions about when to process filler traffic relative to real-time demand.
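The agent’s core loop is easy to sketch: pull from Pub/Sub, forward to the gateway as sheddable traffic, and ack only on success so shed requests are redelivered rather than lost. The subscription, endpoint, and priority header below are placeholders, not the real llm-d-async configuration.

```python
from google.cloud import pubsub_v1
import requests

GATEWAY = "http://inference-gateway.default.svc/v1/completions"  # placeholder

subscriber = pubsub_v1.SubscriberClient()
subscription = subscriber.subscription_path("my-project", "async-inference")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    resp = requests.post(
        GATEWAY,
        data=message.data,                    # the queued inference request
        headers={"x-priority": "sheddable"},  # assumed low-priority marker
        timeout=600,
    )
    if resp.ok:
        message.ack()    # served during an available capacity window
    else:
        message.nack()   # shed under load; redeliver instead of dropping

streaming_pull = subscriber.subscribe(subscription, callback=callback)
streaming_pull.result()  # block and process until cancelled
```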
40:48 📢 Jonathan – “…that’s very cool; it especially works in the interest of the cloud vendors now, who can maximize their utilization of the GPUs. There’s still a lot of CPUs sitting there idle though, like 60 to 70% CPU idle while those GPUs are full on.”
41:04 Improve coding agents’ performance with Gemini API Docs MCP and Agent Skills
- Google released two tools to address a core limitation of coding agents: outdated API knowledge due to training data cutoffs.
- The Gemini API Docs MCP connects agents to live Gemini API documentation via the Model Context Protocol, while Gemini API Developer Skills adds best-practice patterns and SDK guidance.
- Using both tools together shows measurable improvements in evals, achieving a 96.3% pass rate with 63% fewer tokens per correct answer compared to standard prompting.
- The token reduction is worth noting for developers concerned about cost and latency in agentic workflows.
- The MCP server is accessible at gemini-api-docs-mcp.dev and works with any MCP-compatible coding agent, making it broadly applicable beyond just Google-native tooling. Setup documentation is available at ai.google.dev/gemini-api/docs/coding-agents.
- This approach of pairing a live documentation server with a skills layer is a practical pattern that other API providers could adopt, and it highlights a growing need for real-time context injection as AI coding tools become more common in developer workflows.
Azure
44:09 Public Preview: Rule impact analysis on Azure Network Watcher
- Azure Network Watcher now offers a public preview feature called Rule Impact Analysis, which lets network admins simulate the effect of security admin rules before actually applying them to their environment, reducing the risk of unintended connectivity disruptions.
- The feature is particularly useful for teams managing Azure Virtual Network Manager security configurations, as it helps identify rule conflicts and validate that connectivity requirements are met before deployment.
- This addresses a common operational pain point where applying network security rules in production environments can cause outages or unexpected behavior that is difficult to roll back quickly.
- Target users are network and security engineers in organizations with complex Azure networking topologies who need a safer change management process for security policy updates.
- The feature is currently in public preview, which typically means no additional cost beyond standard Network Watcher pricing, though customers should verify final pricing at general availability via the Azure pricing calculator at azure.microsoft.com/pricing.
44:35 📢 Justin – “2026 and we’re still dealing with rule conflicts and firewalls.”
48:01 Azure VDI Storage Benchmark: Premium SSD vs Standard SSD Performance and Cost Breakdown
- GO-EUC’s benchmark research comparing Premium SSD and Standard SSD for Azure VDI workloads found that Premium SSD delivers up to 8 times higher IOPS and 80-90% lower latency than Standard SSD, with the performance gap widening as disk size increases.
- Standard SSD shows a fixed performance ceiling of roughly 850-980 IOPS regardless of disk size, while Premium SSD scales from about 1800 IOPS at 128GB up to 8100 IOPS at 2048GB, making disk sizing a meaningful architectural lever only for Premium SSD.
- The cost comparison is less straightforward than it appears because Standard SSD carries transaction fees that can push its total cost close to Premium SSD pricing under heavy VDI workloads, making Premium SSD a more predictable cost option despite its higher base price.
- The 2048GB Premium SSD at $284.94 per month emerges as the recommended sweet spot, since moving to 4096GB costs $545.10 with only marginal performance gains, and at 2500-seat scale that sizing decision translates to over $7.8 million in annual cost difference (worked out below).
- The research used synthetic DiskSpd testing rather than real user load simulation, so results reflect maximum disk capabilities under controlled conditions and may differ from production environments, with GO-EUC noting a load simulation follow-up is planned.
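The headline number checks out; reproducing the arithmetic from the figures quoted above:

```python
# Reproducing the article's cost math from the numbers quoted above.
p2048 = 284.94   # Premium SSD 2048GB, dollars per month
p4096 = 545.10   # Premium SSD 4096GB, dollars per month
seats = 2500

annual_delta = (p4096 - p2048) * seats * 12
print(f"${annual_delta:,.0f}")   # -> $7,804,800, i.e. "over $7.8 million"
```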
49:25 📢 Justin – “It’s not Microsoft, and it’s not something Microsoft paid for; it was done independently, so I approve.”
Emerging Clouds
51:26 Now Available: DigitalOcean Cloud Security Posture Management (CSPM)
- DigitalOcean has launched a native Cloud Security Posture Management tool that continuously evaluates resources like Droplets and Databases for misconfigurations without requiring agents or third-party tools, making it accessible to smaller teams without dedicated security staff.
- The tool is built directly into the DigitalOcean dashboard and API, addressing a common pain point where security visibility requires separate tooling and context switching across platforms.
- Unlimited free scans are available to all DigitalOcean customers, with advanced rules, automated guidance, and API integrations on upgraded plans, lowering the barrier to entry for basic security posture monitoring.
- A feature called Security Advisor adds an AI layer that summarizes findings and surfaces high-priority risks, helping teams focus on the most impactful issues first and reducing alert fatigue.
- This offering is positioned toward startups and SMBs running production workloads, including AI inference, who may lack the resources to implement enterprise-grade security tooling but still need consistent visibility into infrastructure risk.
52:23 📢 Matt – “It’s definitely a nice feature to give the general developer or security person that might not know the intricacies of DigitalOcean a ‘here’s a red flag, go look at this.’”
After Show
54:04 How Microsoft Vaporized a Trillion Dollars
- The author, a senior Microsoft engineer who rejoined Azure Core in May 2023, discovered on his first day that a 122-person org was seriously planning to port large portions of Windows to a tiny, low-power ARM chip on the Azure Boost accelerator card — a plan he immediately recognized as physically impossible given the hardware constraints.
- Nobody at Microsoft could explain why up to 173 agents were needed to manage each Azure node, what they all did, or how they interacted — a sprawl that created enormous fragility in the system orchestrating VMs for OpenAI, government clouds, and other mission-critical workloads.
- After the elimination of dedicated testers in 2014 and a talent exodus of original Azure architects, much of the org was staffed by junior engineers with 1–2 years of experience, led by managers without deep systems backgrounds, creating a persistent gap in senior technical leadership.
- The node management stack suffered millions of unattributed crashes per month, memory leaks, resource leaks, and “zombie VMs,” with each monthly release introducing more bugs than it fixed and most rollouts ending in panicked rollbacks.
- A publicly exposed web server (WireServer) running on the secure host OS held unencrypted tenant data from multiple customers in shared memory caches — a serious security liability in a hostile multi-tenancy environment — while crashing 300,000–500,000 times per month fleet-wide.
- Despite public claims at Ignite conferences from 2023–2025 that key components had been offloaded to Azure Boost and rewritten in Rust, the author states that as of late 2024, zero of 64 identified work items had been completed, and work hadn’t started on roughly 60 of them.
- “Digital escort sessions” — where $18/hour employees executed commands on production nodes under direction from overseas support staff, including from China — became routine, with nearly 200 JIT access requests per day observed over a two-month period, directly contradicting the original “no human touch” design vision.
- The author proposed an incremental componentization strategy to modernize the node stack from first principles — including a cross-platform component model, a new message bus, and security-hardened caches — but lower-level management responded with defensiveness and the org eventually terminated his employment.
- The consequences materialized: OpenAI signed an $11.9B deal with CoreWeave in March 2025 and later a $300B deal with Oracle, the Secretary of Defense publicly cited “a breach of trust” with Microsoft, and Microsoft’s stock dropped over 30% from its late-October 2025 peak, erasing more than a trillion dollars in market cap.
- The author escalated his concerns in formal letters to the Cloud + AI EVP (November 2024), the CEO (January 2025), and the Board of Directors — all sent before the public unraveling — and received no acknowledgment, reply, or request for clarification from any of them.
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter or Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod
