337: AWS Discovers Prices Can Go Both Ways, Raises GPU Costs 15 Percent

January 15, 2026 00:52:01

 Welcome to episode 337 of The Cloud Pod, where the forecast is always cloudy! Justin, Matt, and Ryan have hit the recording studio to bring you all the latest in cloud and AI news, from acquisitions and price hikes to new tools that Ryan somehow loves but also hates? We don’t understand either… but let’s get started! 

Titles we almost went with this week

  • 🧑‍🔬Prompt Engineering Our Way Into Trouble
  • 💻The Demo Worked Yesterday, We Swear
  • ⚖️It Scales Horizontally, Trust Us
  • ✍️Responsible AI But Terrible Copy (Marketing Edition)

General News 

00:58 Watch ‘The Thinking Game’ documentary for free on YouTube

  • Google DeepMind is releasing “The Thinking Game” documentary for free on YouTube starting November 25, marking the fifth anniversary of AlphaFold.
  • The feature-length film provides behind-the-scenes access to the AI lab and documents the team’s work toward artificial general intelligence over five years.
  • The documentary captures the moment when the AlphaFold team learned they had solved the 50-year protein folding problem in biology, a scientific achievement that recently earned Demis Hassabis and John Jumper the Nobel Prize in Chemistry
  • This represents one of the most significant practical applications of deep learning to fundamental scientific research.
  • The film was produced by the same award-winning team that created the AlphaGo documentary, which chronicled DeepMind’s earlier achievement in mastering the game of Go. For cloud and AI practitioners, this offers insight into how Google DeepMind approaches complex AI research problems and the development process behind their models.
  • While this is primarily a documentary release rather than a technical product announcement, it provides context for understanding Google’s broader AI strategy and the research foundation underlying its cloud AI services. The AlphaFold model itself is available through Google Cloud for protein structure prediction workloads.

01:54 📢 Justin – “If you’re not into technology, don’t care about any of that, and don’t care about AI and how they built all the AI models that are now powering the world of LLMs we have, you will not like this documentary.” 

04:22 ServiceNow to buy Armis in $7.7 billion security deal • The Register

  • ServiceNow is acquiring Armis for $7.75 billion to integrate real-time security intelligence with its Configuration Management Database, allowing customers to identify vulnerabilities across IT, OT, and medical devices and remediate them through automated workflows. 
  • The deal is expected to close in the second half of 2026 and aims to triple ServiceNow’s current $1 billion annual security revenue.
  • The acquisition represents a strategic data play when combined with ServiceNow’s recent purchase of Data.World, giving the company both massive volumes of security asset data from Armis and the governance tools to make that data searchable and usable with AI. 
  • This combination enhances ServiceNow’s CMDB capabilities by an order of magnitude, according to Forrester analysts.
  • ServiceNow has completed six acquisitions this year, including Armis, Veza for identity access management, and Data.World for data governance, signaling an aggressive expansion strategy focused on security and data management. 
  • The company’s integration approach will be critical as customers watch how well these separate platforms merge into ServiceNow’s unified platform.
  • The deal positions ServiceNow to eliminate the patchwork of security tools organizations currently use by embedding security capabilities directly into its AI platform. 
  • Armis brings 950 employees, $340 million in annual recurring revenue, and recognition as a Gartner leader in cyber-physical systems protection.
  • Despite Salesforce entering the ITSM market, analysts assess ServiceNow maintains a five-year development lead in the space, though successful integration of multiple acquisitions remains the key challenge for maintaining that advantage.

05:49 📢 Ryan – “Is this security tooling that you use for analysis or threat hunting? Or is this something that they’re adding to their existing tooling, so it’s more of an integration?” 

📝Listener Note: If you have any idea what this company does, let us know! 

Cloud Tools

08:38 TOON vs. JSON | DigitalOcean

  • TOON (Token Oriented Object Notation) is a new data format designed to replace JSON in LLM prompts, claiming to reduce input token usage by approximately 40% while maintaining or improving accuracy. 
  • The format works by eliminating verbose JSON syntax and repeated tokens, converting structured data into a more compact representation that LLMs can still interpret effectively.
  • DigitalOcean released a Python library (toon-python) that automatically converts JSON datasets to TOON format before sending them to LLM endpoints. In their testing example, a JSON dataset using 172 tokens was reduced to 71 tokens in TOON format (59% reduction) while producing identical query results across multiple model providers, including Mistral 3.
  • TOON is specifically designed for input context containing structured data from databases or other sources, not for replacing plain text prompts or LLM outputs. Studies show that converting plain text instructions to structured formats like JSON doesn’t consistently improve accuracy, so TOON’s value proposition is primarily for applications already using JSON-formatted datasets in their prompts.
  • The format has limitations, including a lack of proven effectiveness for model outputs, potential compatibility issues with models that haven’t been trained on TOON examples, and the need for application-specific testing to verify accuracy and token savings. Function calling, parsing, and other use cases requiring JSON outputs should continue using JSON rather than attempting TOON conversions.
  • For cost-conscious LLM applications processing large structured datasets, TOON represents a practical optimization that could reduce token costs by 40% without requiring changes to model architecture or training. The token savings become more significant at scale, particularly for applications making frequent API calls with substantial context data (a rough sketch of the idea follows this list).
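To make the savings mechanism concrete, here is a minimal, hand-rolled Python sketch that flattens a list of uniform JSON records into a TOON-style tabular block (one header row, then comma-separated value rows). This is an illustration of the idea only, not DigitalOcean’s toon-python library; the real TOON syntax and the library’s API may differ, so check the official repo before relying on it.

```python
import json

def to_toon_like(records: list[dict], name: str = "items") -> str:
    """Flatten a list of uniform, flat JSON records into a compact,
    TOON-style tabular block: one header row, then one CSV-like row
    per record. Illustrative only -- not the official toon-python library."""
    if not records:
        return f"{name}[0]:"
    fields = list(records[0].keys())
    lines = [f"{name}[{len(records)}]{{{','.join(fields)}}}:"]
    for rec in records:
        lines.append("  " + ",".join(str(rec[f]) for f in fields))
    return "\n".join(lines)

data = [
    {"id": 1, "name": "droplet-a", "region": "nyc3", "vcpus": 2},
    {"id": 2, "name": "droplet-b", "region": "sfo2", "vcpus": 4},
]

json_text = json.dumps(data)            # verbose: braces, quotes, repeated keys
toon_text = to_toon_like(data, "droplets")
print(toon_text)
print(f"chars: JSON={len(json_text)}, TOON-like={len(toon_text)}")
```

The point of the comparison is simply that repeated keys, braces, and quotes disappear; measure real token counts with your model’s tokenizer before assuming a 40% saving for your own data.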

09:16 📢 Justin – “I’d almost argue that TOON is more of what I would have wanted; very simple comma-separated values… so maybe LLMs will finally solve all my JSON complaints…but maybe not.” 

10:40 Google 2025 recap: Research breakthroughs of the year

  • Google released Gemini 3 Pro in November 2025 and Gemini 3 Flash in December 2025, with Gemini 3 Pro topping the LMArena Leaderboard and achieving 23.4% on the MathArena Apex benchmark.
  • Gemini 3 Flash delivers Pro-grade reasoning at Flash-level latency and cost, continuing Google’s trend where each generation’s Flash model surpasses the previous generation’s Pro model in quality while being substantially cheaper and faster.
  • The company introduced several specialized AI models, including Nano Banana Pro for native image generation and editing, Veo 3.1 for video generation, and Imagen 4 for image creation. 
  • Google also launched developer tools like Google Antigravity for AI-assisted software development and Jules, an asynchronous coding agent that acts as a collaborative partner for developers.
  • Google’s AlphaFold celebrated its 5th anniversary with over 3 million researchers across 190+ countries using the Nobel Prize-winning protein folding system, including 1 million users in low and middle-income countries. 
  • New AI tools for genomics include AlphaGenome for genome understanding and DeepSomatic for identifying genetic variants in tumors, moving beyond sequencing to the interpretation of complex genomic data.
  • Google’s quantum computing work achieved recognition with Googler Michel Devoret receiving the 2025 Nobel Prize in Physics, while the Quantum Echoes algorithm demonstrated progress toward real-world quantum applications. 
  • The company also introduced Ironwood, a new TPU designed for inference workloads using the AlphaChip design method, and launched WeatherNext 2, which generates weather forecasts 8x faster with up to 1-hour resolution covering flood predictions for 2 billion people across 150 countries.
  • Google formed the Agentic AI Foundation with other AI labs to establish open standards for agentic AI interoperability and announced Model Context Protocol support for Google services. 
  • The company also partnered with the US Department of Energy’s 17 national laboratories on the Genesis project to transform scientific research and expand educational AI initiatives with school districts like Miami-Dade County.

11:52 Meta acquires intelligent agent firm Manus, capping a year of aggressive AI moves

  • Meta acquired Singapore-based AI agent firm Manus for over $2 billion, bringing on board a company that claims $125 million in revenue run rate just eight months after launching its general-purpose AI agent. 
  • Manus will continue operating its subscription service while its team joins Meta to enhance automation across consumer products like Meta AI assistant and business tools.
  • Manus offers AI agents capable of executing complex tasks, including market research, coding, and data analysis, having processed over 147 trillion tokens and supported 80 million virtual computers to date. 
  • The platform provides both free and paid subscription tiers and has already been tested by Microsoft in Windows 11 PCs for tasks like creating websites from local files.
  • The acquisition represents Meta’s continued strategy of acquiring specialized AI startups to accelerate its AI capabilities and Llama large language model development. 
  • This follows Meta’s $14.3 billion investment in Scale AI in June and its acquisition of AI-wearables startup Limitless earlier this month, demonstrating an aggressive talent and technology acquisition approach.
  • Manus originated as a product of Chinese startup Butterfly Effect before relocating its headquarters from Beijing to Singapore in June, backed by investors including Tencent, HongShan Capital Group, and Benchmark, which led a $75 million Series B round. The company maintains strategic partnerships with Chinese tech firms, including Alibaba’s Qwen AI team, despite its geographic shift.

13:04 📢 Ryan – “You know, the upside, if they’ve just been around for 8 months, they don’t have the terrible tech debt that all these other firms have…they have 8 months of it.” 

AWS

15:59 Security Hub CSPM automation rule migration to Security Hub | AWS Security Blog

  • AWS has split Security Hub into two services: the new Security Hub with enhanced capabilities using the Open Cybersecurity Schema Framework (OCSF), and Security Hub CSPM, which continues as a separate service focused on cloud security posture management. 
  • The schema change from AWS Security Finding Format (ASFF) to OCSF means existing automation rules need migration to work with the new service.
  • AWS released an open-source Python migration tool on GitHub that automatically discovers Security Hub CSPM automation rules, transforms them to OCSF schema, and generates CloudFormation templates for deployment (a minimal inventory sketch follows this list). 
  • The tool handles Regional differences intelligently, supporting both home Region deployments where rules apply across linked Regions and Region-by-Region deployments for unlinked Regions.
  • Not all automation rules can be fully migrated due to schema differences between ASFF and OCSF. The tool generates a migration report identifying rules that cannot be migrated or are only partially migrated, and creates all new rules in a disabled state by default so administrators can validate them before enabling.
  • The migration tool preserves the original order of automation rules, which matters when multiple rules operate on the same findings or fields. 
  • For organizations using a delegated administrator account with AWS Organizations, rules must be created in that account’s home Region, and the tool is designed to work with this model while also supporting single-account deployments.
  • This migration capability is included in the Security Hub essentials plan at no additional cost beyond standard Security Hub pricing.
  • Organizations should review the ASFF to OCSF field mapping tables in the documentation before migration, as some criteria fields, like ComplianceAssociatedStandardsId and ProductName, have no OCSF equivalents and require manual rule redesign.
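Before running AWS’s migration tool, it can help to snapshot what you currently have. Below is a minimal inventory sketch using boto3’s Security Hub ListAutomationRules call; it assumes you run it in the delegated administrator (or standalone) account’s home Region, and it does not perform any ASFF-to-OCSF transformation – that part is handled by the open-source tool.

```python
import boto3

# Inventory existing Security Hub CSPM automation rules before migration.
# Assumes credentials for the delegated administrator (or standalone) account
# and the home Region where the rules were created.
securityhub = boto3.client("securityhub", region_name="us-east-1")

rules, token = [], None
while True:
    kwargs = {"MaxResults": 100}
    if token:
        kwargs["NextToken"] = token
    resp = securityhub.list_automation_rules(**kwargs)
    rules.extend(resp.get("AutomationRulesMetadata", []))
    token = resp.get("NextToken")
    if not token:
        break

# Print rule order and status so you can compare against the migration report.
for rule in sorted(rules, key=lambda r: r.get("RuleOrder", 0)):
    print(f'{rule.get("RuleOrder")}\t{rule.get("RuleStatus")}\t{rule.get("RuleName")}')
```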

18:21 📢 Matt – “The problem I always have with CSPMs – and this is a larger rant or conversation we can have – is there’s no interoperability. So if you have a CSPM and you want to then set up a GRC tool, or your other security tool can also run it, there’s no interoperability. So you then have to acknowledge things in three different spots, and there’s no single source of truth.”

20:08 Proactive Amazon EKS monitoring with Amazon CloudWatch Operator and AWS Control Plane metrics | Containers

  • EKS clusters running version 1.28 and above now automatically send control plane metrics to CloudWatch at no extra cost, covering API server health, scheduler performance, and etcd database status. 
  • The new CloudWatch Observability Operator add-on extends this with Container Insights and Application Signals for deeper visibility into workloads and applications without code changes.
  • The enhanced monitoring addresses common operational challenges like detecting pod scheduling bottlenecks through metrics such as scheduler_pending_pods and scheduler_schedule_attempts_UNSCHEDULABLE, which help identify under-resourced worker nodes. API server throttling issues become visible through apiserver_request_total_429 metrics, showing when the default 600 in-flight request limit is approached.
  • Critical infrastructure components like admission webhooks, which power AWS Load Balancer Controller and IRSA functionality, can now be monitored for failures and latency issues. The apiserver_admission_webhook_rejection_count metric helps catch silent webhook failures that could prevent deployments, with CloudWatch Log Insights providing correlated log data for troubleshooting.
  • The etcd database monitoring is particularly important since EKS has an 8 GB recommended limit, and exceeding it makes clusters read-only. CloudWatch alarms can trigger at 80 percent capacity (6.4 GB) using the apiserver_storage_size_bytes metric, giving teams time to clean up unnecessary resources before hitting the limit (see the alarm sketch after this list).
  • Application Signals provides automatic instrumentation for Java applications with pre-built dashboards tracking traffic, latency, and availability at a 5 percent sampling rate. 
  • The feature integrates with CloudWatch anomaly detection using machine learning to identify unusual patterns in metrics like node_cpu_utilization without manual threshold configuration.
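As a concrete follow-on to the etcd sizing guidance above, here is a minimal boto3 sketch that alarms at roughly 80 percent of the 8 GB recommendation using the apiserver_storage_size_bytes metric. The namespace and dimension names (ContainerInsights, ClusterName) are assumptions about how Container Insights publishes these metrics; confirm them against what actually appears in your CloudWatch console before wiring the alarm to notifications.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm at ~80% of the 8 GB etcd recommendation (6.4 GB), per the article.
# Namespace and dimension names are assumptions -- confirm them in your account.
cloudwatch.put_metric_alarm(
    AlarmName="eks-etcd-storage-80-percent",
    Namespace="ContainerInsights",            # assumed namespace
    MetricName="apiserver_storage_size_bytes",
    Dimensions=[{"Name": "ClusterName", "Value": "my-cluster"}],  # placeholder cluster name
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=6.4 * 1024**3,                  # 6.4 GiB in bytes
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmDescription="etcd database size is approaching the 8 GB EKS limit",
)
```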

21:15 📢 Ryan – “I like this, except for the fact that it’s an operator…I don’t understand why this isn’t just configuration options in your cluster.” 

21:59 Amazon ECS Managed Instances now supports Amazon EC2 Spot Instances

  • ECS Managed Instances now supports EC2 Spot capacity, allowing customers to run fault-tolerant containerized workloads at up to 90% discount compared to On-Demand pricing while AWS handles all infrastructure management. 
  • You configure a new capacityOptionType parameter as spot or on-demand in your capacity provider settings (an illustrative fragment follows this list).
  • This extends ECS Managed Instances beyond its existing capabilities of automatic provisioning, dynamic scaling, and cost-optimized task placement. AWS still handles the infrastructure operations through AWS-controlled access in your account, but now you can choose between spot and on-demand capacity types alongside existing options for GPU, network-optimized, and burstable instance families.
  • The feature is available in all AWS Regions where ECS Managed Instances currently operate. Pricing includes both the spot EC2 instance costs and an additional management fee for the compute provisioning service, though specific management costs are not disclosed in the announcement.
  • This targets customers running stateless or fault-tolerant containerized applications like batch processing, CI/CD pipelines, or web services that can handle interruptions. 
  • The combination of managed infrastructure and spot pricing addresses a common challenge where teams want cost savings from spot instances but lack resources to manage the complexity of spot interruptions and capacity management.
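For orientation only, here is a hypothetical sketch of where the new setting sits in a capacity provider definition. The only field name taken from the announcement is capacityOptionType (spot or on-demand); every other key and value below is a placeholder, so treat the ECS Managed Instances documentation as authoritative for the real request shape.

```python
# Hypothetical sketch of an ECS Managed Instances capacity provider config.
# Only "capacityOptionType" comes from the announcement ("spot" | "on-demand");
# every other field name below is a placeholder -- check the ECS API reference
# for the actual create-capacity-provider request structure.
managed_instances_provider = {
    "name": "batch-spot-provider",                # placeholder name
    "managedInstancesProvider": {                 # placeholder structure
        "capacityOptionType": "spot",             # from the announcement
        "infrastructureRoleArn": "arn:aws:iam::123456789012:role/ecsInfraRole",  # placeholder
    },
}

# A fault-tolerant service (batch jobs, CI/CD runners) would reference this
# capacity provider in its capacity provider strategy and let ECS handle
# Spot interruptions and replacement capacity.
print(managed_instances_provider)
```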

24:07 Enhance Amazon EKS network security posture with DNS and admin network policies | Containers

  • Amazon EKS now supports DNS-based and Admin network policies, allowing teams to control pod traffic using stable domain names instead of constantly changing IP addresses. 
  • This eliminates the operational overhead of maintaining IP allowlists for AWS services, on-premises systems, and third-party APIs while providing centralized policy management across multiple namespaces.
  • Admin network policies operate in two tiers with hierarchical enforcement that cannot be overridden by namespace-level policies, enabling platform teams to enforce mandatory security controls like blocking access to the EC2 Instance Metadata Service at 169.254.169.254 (a sketch of such a policy follows this list). 
  • The policies use label-based segmentation to apply security standards across multiple namespaces simultaneously, reducing the need for per-namespace policy management.
  • DNS-based policies are available in EKS Auto mode clusters version 1.29 and later, while Admin policies work in both EKS Auto mode and EC2-based clusters running VPC CNI version 1.21.1 or later. 
  • The feature removes the need for third-party network policy tools and integrates with existing Kubernetes NetworkPolicy resources for defense-in-depth security.
  • The policy evaluation order follows a strict hierarchy: Admin tier Deny rules take precedence over everything, followed by Admin Allow rules, then namespace-scoped policies, and finally Baseline tier policies. 
  • This ensures security teams can enforce organization-wide controls while still allowing application teams flexibility for namespace-specific requirements.
  • Real-world applications include multi-tenant environments where different applications need controlled access to specific AWS services like S3 or DynamoDB using patterns like *.s3.amazonaws.com, and hybrid cloud scenarios where workloads access on-premises databases through stable DNS names that remain valid even as underlying infrastructure changes.
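To make the IMDS example from the list above concrete, here is a sketch of an AdminNetworkPolicy manifest built as a Python dict and written out as JSON (kubectl accepts JSON as well as YAML). It follows the upstream policy.networking.k8s.io/v1alpha1 shape with a networks egress peer; the exact API version and the fields EKS supports are assumptions to verify against the EKS documentation for your cluster version.

```python
import json

# Sketch of a cluster-wide deny rule blocking pod egress to the EC2 Instance
# Metadata Service (169.254.169.254), using the upstream AdminNetworkPolicy API.
# Field support on EKS is an assumption to verify against the EKS docs.
policy = {
    "apiVersion": "policy.networking.k8s.io/v1alpha1",
    "kind": "AdminNetworkPolicy",
    "metadata": {"name": "deny-imds"},
    "spec": {
        "priority": 10,                      # lower number = evaluated earlier
        "subject": {"namespaces": {}},       # applies to pods in all namespaces
        "egress": [
            {
                "name": "block-instance-metadata",
                "action": "Deny",            # cannot be overridden by namespace policies
                "to": [{"networks": ["169.254.169.254/32"]}],
            }
        ],
    },
}

with open("deny-imds.json", "w") as f:
    json.dump(policy, f, indent=2)

# Apply with: kubectl apply -f deny-imds.json
```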

24:17 📢 Justin – “Thank you, Jesus.” 

27:46 📢 Ryan – “If you are a traditional engineer listening to our show, this is an example of something where you can take your skillset and add a ton of value.” 

28:00 AWS raises GPU prices 15% on a Saturday • The Register

  • AWS increased prices for EC2 Capacity Blocks for ML by approximately 15 percent over the weekend, with p5e.48xlarge instances jumping from $34.61 to $39.80 per hour in most regions. 
  • This marks a departure from AWS’s two-decade pattern of price reductions and represents one of the first straight increases to a line item not tied to regulatory requirements.
  • Capacity Blocks allow customers to reserve guaranteed GPU capacity for ML training jobs from one day to several weeks in advance with locked-in rates paid upfront. 
  • AWS attributes the increase to supply and demand patterns for this quarter, reflecting the global GPU shortage driven by increased AI workload demand across the industry.
  • The price increase creates complications for customers with Enterprise Discount Programs, as their percentage discounts remain the same, but absolute costs rise by 15 percent. 
  • This gives competitors like Azure and GCP a direct talking point for enterprise sales conversations, though whether they can absorb the demand remains uncertain given industry-wide GPU constraints.
  • The change establishes a precedent that could extend to other resource-constrained services, particularly RAM-intensive offerings that touch nearly every AWS service. 
  • The timing and execution on a Saturday with minimal announcement suggest AWS is testing customer response to price increases after conditioning the market to expect only decreases.
  • This affects primarily enterprise customers running serious ML workloads with budgets in the millions, as Capacity Block pricing targets teams that cannot afford training run interruptions. 
  • The broader concern is whether this signals a shift in AWS’s pricing strategy across other services where supply constraints or cost increases exist.

29:31 📢 Matt – “I don’t think it’s a broader concern; but I think it’s the first real time you’re seeing a dramatic increase, and it’s been a fear for many companies for many years…what if they raise the prices and there’s nothing we can do because we’re already there? And they’re doing it, and there’s not much you CAN do.” 

30:51 EC2 Capacity Manager now includes Spot interruption metrics

  • EC2 Capacity Manager adds three new Spot interruption metrics at no additional cost across all commercial AWS regions. 
  • The metrics track total Spot instance count, interruption counts, and interruption rates across regions, availability zones, and accounts to help optimize Spot placement strategies.
  • The new visibility helps customers make data-driven decisions about Spot instance diversification by identifying patterns in interruptions. 
  • Organizations can use this data to determine which availability zones or instance types experience fewer interruptions and adjust their Spot strategies accordingly.
  • This enhancement integrates with existing Spot placement score functionality to provide a complete picture of Spot capacity management. 
  • Customers can now correlate predicted availability scores with actual interruption data to validate and refine their capacity planning decisions.
  • The metrics are particularly valuable for organizations running large-scale Spot fleets where even small improvements in interruption rates translate to meaningful cost savings. 
  • By tracking interruption rates over time, teams can measure the effectiveness of their diversification strategies and identify opportunities to expand into more stable capacity pools.

31:11 📢 Justin – “Or…you could just make this a service I could subscribe to.”

GCP

32:50 Looker self-service Explores, tabbed dashboards, custom themes | Google Cloud Blog

  • Y’all can thank Ryan if you’re not into this particular story. Hit him up on Slack and let him know your thoughts. 
  • Looker now allows users to upload CSV and spreadsheet files directly into the platform through a drag-and-drop interface in the new self-service Explores feature, currently in Public Preview. 
  • This bridges the gap between governed data models and ad-hoc analysis by letting users combine local files with existing Looker data while maintaining administrator oversight on uploads and permissions.
  • The new tabbed dashboard feature helps organize complex dashboards into logical sections with automatic filter propagation across tabs, reducing visual clutter by showing only relevant filters per view. 
  • Users can share specific tab URLs and export entire multi-tab dashboards as single PDF documents, making it easier to present cohesive data narratives.
  • Internal dashboard theming is now available in Public Preview, enabling organizations to customize tile styles, colors, fonts, and formatting to match corporate branding within the Looker application. 
  • Administrators can create reusable theme templates and set default themes across entire instances to ensure consistency.
  • A new content certification flow helps distinguish between ad-hoc experiments and vetted data sources, addressing governance concerns when users upload their own datasets. 
  • This feature works alongside administrator controls to maintain data quality standards while enabling self-service capabilities.
  • These features are available starting with Looker version 25.20 and can be enabled through the Admin Labs page, with no specific pricing changes announced as they appear to be included in existing Looker subscriptions.

34:06 📢 Ryan – “For everyone that has to supply you with pretty graphs and pictures, this is very important. It is very difficult to sort of modify and work with existing data sets in any BI tool, and so this is another knob that you can put. And I could use something like this for just uploading a very easy CSV of like product names or usernames or something that’s just a list, versus having to parse that out of a very large data set, which may have a combination of structured and unstructured data or just bad schema adherence. And so this is sort of a nice tool for being able to create those types of things.” 

35:28 Optimizing AlloyDB AI text-to-SQL accuracy | Google Cloud Blog

  • AlloyDB AI’s natural language API, currently in preview, enables developers to build agentic applications that translate natural language questions into SQL queries with near-100% accuracy. 
  • The system uses descriptive context like table descriptions, prescriptive context including SQL templates and facets for complex conditions, and a value index to disambiguate database-specific terms that foundation models wouldn’t recognize.
  • The API addresses a critical business need where 80-90% accuracy isn’t sufficient, particularly in industries like real estate search and retail, where poor query interpretation directly impacts conversions and revenue. 
  • Users can iteratively improve accuracy through a hill-climbing approach, starting with out-of-the-box capabilities and progressively adding context to handle nuanced questions like “homes near good schools” that require specific business logic for terms like “near” and “good.”
  • The system provides explainability features that show users what the API understood their question to mean, allowing agents and end users to verify the interpretation even when accuracy isn’t perfect. 
  • This transparency helps mitigate the impact of occasional misinterpretations while the system approaches 100% accuracy for specific use cases.
  • Integration options include MCP Toolbox for Databases for developers writing AI tools or Gemini Enterprise for no-code agentic programming, allowing conversational applications that combine web knowledge with database queries. The technology works across structured, unstructured, and multimodal data using AlloyDB’s vector search, text search, and AI operators like AI.IF for semantic conditions.
  • Google plans to expand this natural language capability beyond AlloyDB to a broader set of Google Cloud databases, though specific timelines and pricing details for the preview or general availability weren’t disclosed in the announcement.

36:43 📢 Justin – “Natural language query – I am here for it.” 

37:56 New Enhanced Tool Governance in Vertex AI Agent Builder | Google Cloud Blog

  • Google introduces enhanced tool governance for Vertex AI Agent Builder through Cloud API Registry integration, allowing administrators to centrally manage and curate approved tools across their organization while developers access them via a new ApiRegistry object in the Agent Development Kit. 
  • This addresses the duplicate work problem where developers previously built tools separately for each agent and gives enterprises better control over what data and APIs their AI agents can access.
  • The Agent Development Kit now supports Gemini 3 Pro and Flash models with full TypeScript compatibility, plus improved state management features including automatic recovery from failures, human-in-the-loop pause and resume capabilities, and conversation rewind functionality. 
  • The new Interactions API integration provides consistent multimodal input/output handling across agents, while A2UI enables agents to pass UI components directly to applications without the security risks of executable code.
  • Agent Engine sessions and memory bank reach general availability, powered by Google Cloud AI Research’s topic-based approach for managing both short-term and long-term agent memory across interactions. 
  • The service expands to seven additional regions globally, with runtime pricing reduced and billing for additional Agent Engine services beginning January 28, 2026 (specific pricing details available in documentation).
  • Customer implementations show practical benefits: Burns & McDonnell uses Agent Builder to transform project data into real-time intelligence, Payhawk reduced expense submission time by over 50 percent through Memory Bank’s context retention, and Gurunavi projects a 30 percent improvement in user experience for their restaurant discovery app by remembering user preferences and patterns.
  • The platform now includes Vertex AI Agent Garden with one-click deployment of curated agent samples and an Agent Starter Pack providing production-ready templates for building, testing, and deploying agents. 
  • Apigee integration allows organizations to transform existing managed APIs into custom MCP servers, bringing multi-cloud tools into a centralized catalog through Cloud API Registry.

38:47  📢 Ryan – “This just goes to show how early we are in this ecosystem. Companies are just starting to sort of get wise that they’ve got a whole bunch of developers using these platforms, and they’re all kind of doing their own things and separate little silos and there’s very little ability to share or get any kind of optimization with those central resources… I do think that this is a good thing.” 

39:55 Introducing VM Extensions Manager | Google Cloud Blog

  • Google launches VM Extensions Manager in preview to centralize and automate the installation and lifecycle management of OS agents across Compute Engine fleets. 
  • The service eliminates manual scripting and startup script dependencies by providing policy-driven control that can reduce operational overhead from months to hours, according to Google.
  • The preview supports three critical extensions at launch: Cloud Ops Agent for telemetry collection, Agent for SAP for monitoring SAP workloads, and Agent for Compute Workloads for workload evaluation. 
  • Administrators can pin specific extension versions or let the system automatically deploy the latest releases, with more extensions planned for future support.
  • VM Extensions Manager offers two rollout speeds for global policies: SLOW mode executes zone-by-zone deployments over 5 days by default to minimize risk, while FAST mode enables immediate fleet-wide updates for urgent security patches. 
  • Zonal policies at the project level are available now, with global policies and organization or folder-level policies coming in the following months.
  • The service integrates directly into the existing compute.googleapis.com API without requiring new API enablement or discovery, allowing administrators to start creating policies immediately through the Cloud Console or gcloud CLI. Documentation is available here

42:18  📢 Matt – “I like that they released both of those day one – both slow and fast mode.” 

43:23 Cloud SQL for MySQL introduces optimized writes | Google Cloud Blog

  • Cloud SQL for MySQL Enterprise Plus edition now includes optimized writes, a feature that automatically tunes five different MySQL parameters and configurations based on real-time workload metrics to improve write performance. 
  • The feature is enabled by default on all Enterprise Plus instances and requires no manual intervention or configuration changes.
  • Google reports up to 3x better write throughput compared to the standard Enterprise edition, with reduced latency, particularly beneficial for write-intensive OLTP workloads. 
  • Performance gains vary based on machine configuration, and the feature complements the existing SSD-backed data cache that provides up to 3x higher read throughput.
  • The optimized writes feature works by automatically adjusting MySQL flags, data handling, and parameters in response to instance and workload characteristics. 
  • Customers can benchmark the improvements using sysbench by comparing the Enterprise edition, Enterprise Plus without optimized writes, and Enterprise Plus with optimized writes enabled.
  • Existing Cloud SQL instances can upgrade to the Enterprise Plus edition in-place to access optimized writes, though specific pricing details for the Enterprise Plus tier are not provided in the announcement. 
  • The feature targets organizations running write-heavy database workloads that previously required manual MySQL tuning and optimization efforts.

Azure 

43:23 Microsoft announces acquisition of Osmos to accelerate autonomous data engineering in Fabric – The Official Microsoft Blog

  • Microsoft acquires Osmos to bring agentic AI capabilities to Fabric for autonomous data engineering workflows. Osmos uses AI agents to automate data preparation tasks that typically consume most of data teams’ time, transforming raw data into analytics-ready assets in OneLake without manual intervention.
  • The acquisition addresses a common enterprise challenge where organizations have abundant data but lack efficient ways to make it actionable. 
  • Osmos will integrate into Microsoft Fabric’s unified data platform, allowing AI agents to handle data connection, preparation, and transformation tasks that currently require significant manual effort and technical expertise.
  • The Osmos team joins Microsoft’s Fabric engineering organization to advance autonomous data operations within the existing Fabric ecosystem. 
  • This builds on Fabric’s existing capabilities around OneLake, Power BI, and unified data analytics by adding intelligent automation for data engineering workflows.
  • No pricing details or availability timeline were announced, though Microsoft indicates integration updates will be shared through the Microsoft Fabric Blog. The acquisition targets organizations spending excessive resources on data preparation rather than analysis, particularly those already invested in the Fabric ecosystem.

45:32  📢 Ryan – “As long as they deliver on the promise. There’s been solutions that make the same promise – not with AI… and it just never works the way it should. Drives me nuts that it’s so failure prone. As long as the AI and Agentic add to these things, that’s fantastic.” 

46:38 Microsoft’s strategic AI datacenter planning enables seamless, large-scale NVIDIA Rubin deployments | Microsoft Azure Blog

  • Azure is deploying NVIDIA’s next-generation Rubin platform at scale, with infrastructure already designed to handle its power, cooling, and networking requirements. 
  • Microsoft’s Fairwater datacenters in Wisconsin and Atlanta can accommodate Rubin’s 50 petaflops per chip and 3.6 exaflops per rack without retrofitting, representing a five-times performance jump over GB200 systems.
  • The deployment leverages Azure’s systems approach where compute, networking, storage, and infrastructure work as an integrated platform. 
  • Key technical enablements include support for sixth-generation NVLink with 260 TB/s bandwidth, ConnectX-9 1,600 Gb/s networking, HBM4 memory thermal management, and pod exchange architecture for rapid hardware servicing without extensive rewiring.
  • Azure’s track record includes operating the world’s largest commercial InfiniBand deployments and being first to deploy both GB200 and GB300 NVL72 platforms at scale. 
  • The company’s multi-year collaboration with NVIDIA on co-design means Rubin integrates directly into existing infrastructure, enabling faster customer deployments compared to competitors who need infrastructure upgrades.
  • Microsoft’s regional superfactory approach differs from other hyperscalers’ single megasite strategy, allowing more predictable global rollout of new AI capabilities. 
  • This modular design combined with Azure Boost offload engines, liquid cooling systems, and optimized orchestration through CycleCloud and AKS aims to maximize GPU utilization and deliver better performance per dollar at cluster scale.

Oracle 

46:38 Oracle is Set to Power on New Data Center in Michigan

  • Oracle is building a new data center in Saline Township, Michigan specifically to serve OpenAI’s infrastructure needs, marking another major cloud capacity expansion for AI workloads. 
  • The facility will use closed-loop non-evaporative cooling systems that consume water comparable to an average office building rather than millions of gallons daily like traditional evaporative systems.
  • The project includes a 17-year power agreement with DTE Energy where Oracle pays 100% of energy costs including new transmission lines and an onsite substation, with Michigan law prohibiting utilities from passing data center costs to existing ratepayers. 
  • Oracle claims its large customer contribution to DTE’s fixed costs will reduce overall energy costs for other customers by approximately $300 million annually by 2029-2030.
  • The facility will create 2,500 union construction jobs and 450 permanent on-site positions plus an estimated 1,500 jobs across Washtenaw County, with construction scheduled to begin in Q1 2026. The project includes $8 million annually for local schools, $1.6 million yearly in direct tax revenue for Saline Township, and over $14 million in community benefits.
  • Oracle is developing only 250 of 575 acres, with the remaining land protected as open space, farmland, wetlands, and woodlands, including 47.5 acres in a conservation easement. 
  • This represents Oracle’s 148th data center with 64 more under construction globally, though the company provides no specific pricing or service details for customers beyond OpenAI.

52:15 📢 Ryan – “But are you trading the water concern for the high energy costs?”

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod

