256: Begun, The Custom Silicon Wars Have

April 24, 2024 00:40:5946.9 mb 0 Comments

tcp.fm

256: Begun, The Custom Silicon Wars Have

00:00 / 00:40:59

256 custom silicon

74 / 100

Powered by Rank Math SEO

Welcome to episode 256 of the Cloud Pod podcast – where the forecast is always cloudy! This week your hosts, Justin and Matthew are here this week to catch you up on all the news you may have missed while Google Next was going on. We’ve got all the latest news on the custom silicon hot war that’s developing, some secret sync, drama between HashiCorp and OpenTofu, and one more Google Next recap – plus much more in today’s episode. Welcome to the Cloud!

Titles we almost went with this week:

🍻I have a Google Next sized hangover
🎼Claude’s Magnificent Opus now on AWS
➡️US-EAST-1 Gets called Reliable; how insulting
🛩️The cloud pod flies on a g6

A big thanks to this week’s sponsor:

Check out Sonrai Securities’ new Cloud Permission Firewall. Just for our listeners, enjoy a 14 day trial at www.sonrai.co/cloudpod

General News

Today, we get caught up on the other Clouds from last week, and other news (besides Google, that is.) Buckle up.

04:11 OpenTofu Project Denies HashiCorp’s Allegations of Code Theft

After our news cutoff before Google Next, Hashicorp issued a strongly worded Cease and Desist letter to the OpenTofu project, accusing that the project has “repeatedly taken code Hashi provided under the BSL and used it in a manner that violates those license terms and Hashi’s intellectual properties.”
It notes that in some instances, OpenTofu has incorrectly re-labeled Hashicorp’s code to make it appear as if it was made available by Hashi, originally under a different license.
Hashi gave them until April 10th to remove any allegedly copied code from the OpenTofu repo, threatening litigation if the project failed to do so.
OpenTofu struck back – and they came with receipts!
They deny that any BSL licensed code was incorporated into the OpenTofu repo, and that any code they copied came from the MPL-Licensed version of terraform.
“The OpenTofu team vehemently disagrees with any suggestions that it misappropriated, mis-sourced or misused Hashi’s BSL code. All such statements have zero basis in facts” — Open Tofu Team
OpenTofu showed how the code they accused was lifted from the BSL code, was actually in the MPL version, and then copied into the BSL version from an older version by a Hashi Engineer.
Anticipating third party contributions might submit BSL terraform code unwittingly or otherwise, OpenTofu instituted a “taint team” to compare Terraform and Open Tofu Pull requests.
If the PR is found to be in breach of intellectual property rights, the pull request is closed and the contributor is closed from working on that area of the code in the future.
Matt Asay, (from Mongo) writing for Infoworld, dropped a hit piece when the C&D was filed, but then issued a retraction on his opinion after reviewing the documents from the OpenTofu team.

06:32 📢 Matthew – “It’s gonna be interesting to see, you know, general common ideas of where Terraform should go – are going to be coming on both of these platforms, and when you copy or if this is a good feature that Hashi Corp released and Open Tofu wants that feature – like you can’t just pull the codes. Do you rewrite it from scratch? Right, so then you rewrite it from scratch, but it does the same thing. So you’re kind of in that gray area where they’re going to look the same.”

8:50 Secrets sync now available on Vault Enterprise to manage secrets sprawl

When not making false allegations against OpenTofu, Hashi is releasing some interesting updates to Vault Enterprise 1.16.
Secrets Sync, now generally available, is a new feature that helps you manage secret sprawl by centralizing the governance and control of secrets that are stored within other secret managers.
Hashi claims that secret management doesn’t live up to its full potential unless it is centralized and managed on one platform.
Secret syncs lets users manage multiple external secret managers, which are called destinations in Vault. Supporting AWS Secrets Manager, Google Cloud Secrets Manager, Microsoft Azure Key Vault, Github Actions and Vercel (What no chef! :-O)
Engineering and security teams can generate, update, delete, rotate and revoke secrets from vault’s user interface, API, or CLI and have those changes sync to and from external secret managers to be used by your cloud-hosting applications.

10:28 📢 Justin – “It also basically solves one of the big challenges for your multi-cloud, because you’d have to set a vault and replicate your vaults between all these different cloud providers. So you’d have them local and high late, you know, low latency to your app. Now by just leveraging the, you’re basically leveraging the cloud provider as a caching layer for your vault. And that can be, you can choose whatever cloud you want to put it into. And that can be your primary location for it, which is really handy.”

13:41 Intel details Gaudi 3 at Vision 2024 — new AI accelerator sampling to partners now, volume production in Q3

Intel made a bunch of announcements at Vision 2024, including deep dive details on its new Gaudi 3 AI processors, which they claim offer up to 1.7x the training performance, 50% better inference, and 40% better efficiency than Nvidia’s market-leading h100 processors, but for significantly less money.
Intel also announced that its Datacenter cpu portfolio has a new name, with the new Granite Rapids and Sierra Forest chips now branded Xeon 6.
They are also working on new AI NIC ASIC for the Ultra ethernet consortium-compliant networking, an AI NIC chiplet that will be used in future XPU and Gaudi 3 processors – as well as to external customers through Intel Foundry.
Intels Gaudi 3 is the third generation of the Gaudi Accelerator, which was the product of their 2b dollar acquisition of Habana labs in 2019.
Gaudi Accelerators will enter production and GA in Q3 2024 for OEM systems, as well as available in the Intel developer cloud.
There are two form factors with the OAM HL 325L being the common mezzanine form factor found in high performance GPU based systems. This accelerator has 128gb of HBM2e, providing 3.7 TB/s of bandwidth. It also has twenty-four 200 GBPS ethernet NICS.
The OAM module has a 900W TDP and is rated for 1,835 TFLOPS of FP8 performance. The OAMs are deployed in groups of 8 per server node and you can scale up to 1024 Nodes.
They also have a Gaudi 3 PCIe Dual slot add-in card with 600W TDP. It has 128GB of HBMeE and twenty-four 200 gbps ethernet NCIS.
Intel claims the PCIe card has the same peak 1,835 TFLOPS of FP8 but interesting with 300W lower TDP. However, the scaling is more limited and limited to groups of 4.
Compared to the H100 the Intel Guadi was 1.7x faster to train the LLama2-13B model and 1.4X faster with the GPT-3-175B model.

16:10 📢 Matthew – “If we’re going to keep improving a lot of these LLMs and make them have more data that we’re building in on and not just, you know, and optimize in other ways, we got to start to make these things be more efficient.”

AWS

16:39 CEO Andy Jassy’s 2023 Letter to Shareholders

Andy’s annual letter to shareholders was dropped during Google Next, and overall it was interesting.
Revenue for 2023 grew 12% year over year.
Bunch of stuff on the store, delivery speeds, price, and advertising were all touched on, but obviously we’re only interested in the cloud info.
They started seeing substantial cost optimization inside companies trying to save money in an uncertain economy.
- Much of this optimization was catalyzed by AWS helping customers use the cloud more efficiently and leverage more powerful, price-performance AWS capabilities like Graviton chips, S3 intelligent tiering and savings plans.
- This has had a direct impact on short term revenue, but was best for customers, appreciated and should bode well for customers and AWS long term. By end of year they saw cost optimization attenuating, new deals accelerating and customers renewing at larger commitments over longer periods of times and migrations growing again
Lots of recaps of announcements and a strong mention of the Power of Gen AI and Q and customer excitement
He goes on a long spread talking about building primitives, it delivers speed, but also requires patience.
Overall, some of the things he didn’t mention were more impactful overall to the company, so take that for what you will.

17:54 What Amazon’s Shareholder Letter Didn’t Say

The information rightly calls out what he didn’t talk about in the letter:
- Antitrust lawsuit from US, New legislation in Europe that is constraining Amazon and other tech companies.
- He also highlights strong customer focus, but then the information points out how they have been sued by a prime video subscriber where they heavy-handed inserted Ads into the paid subscriptions.
The information summarizes that Andy has a deft hand at managing the company, but lacks the skills at crafting sparkly letters. (A good copywriter could have helped here.)

18:30 📢 Matthew – “It doesn’t surprise me that, you know, the cloud providers… want you to make money and… they really do want you to leverage it in an effective way. Because otherwise you’re just going to leave. So I think that with the uncertain economy, this was more and more on people’s minds about cost savings and optimizations and everything.”

20:31 Announcing general availability of Amazon EC2 G6 instances

Today we are announcing the general availability of Amazon EC2 G6 instances powered by NVIDIA L4 tensor core GPUs.
G6 Instances can be used for a wide range of graphics intensive and machine learning use cases.
G6 instances deliver up to 2x higher performance for learning inference and graphic workloads compared to Amazon EC2 G4dn instances.
Customers can use G6 instances for deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization as well as graphics workload.

22:10 AWS KMS announces more flexible automatic key rotation

AWS KMS is announcing new options for automatic key rotations.
- You can now customize the frequency of rotation period between 90 days to 7 years as well as invoke key rotation on demand for customer managed KMS keys.
- Lastly, you can now see the history of all previous rotations for any KMS key that has been rotated.
We also introduce new pricing for KMS automatic key rotation.
Previously each rotation would cost $1/month per rotation to a KMS customer managed key. Now KMS keys can be rotated automatically or on demand, the first and second rotation of a key adds $1/month, but this price increase is capped at the second rotation and all rotations after your second rotation are not billed.
For customers that have keys with 3 or more rotations, all of the keys will see a price reduction to $3 a month starting the first week of May 2024.

23:50 Tackle complex reasoning tasks with Mistral Large, now available on Amazon Bedrock

Last month, AWS told us about Mistral AI models Mistral 7b and Mixtral 8x7b on Bedrock. Now they are bringing Mistral Large to Bedrock.
Mistral Large is ideal for complex tasks that require substantial reasoning capabilities, or ones that are highly specialized, such as Synthetic text generation or code generation.
In addition, at the Paris summit they released Bedrock in the Paris region. So now, when you’re visiting DIsneyland Paris you’re still good to go. 👍

24:40 Irish power crunch could be prompting AWS to ration compute resources

We’ve talked about it before, but this is the first time a major cloud provider has publicly acknowledged the issue.
Amazon is apparently restricting resources users can spin up in the Ireland region, and directing customers to other AWS regions across Europe instead.
Energy consumed by DC is a growing concern.
You cannot spin up GPU nodes in AWS Dublin as those locations are maxed out power wise.
When Amazon was pushed for a statement they responded “Ireland remains core to our global infrastructure strategy, and we will continue to work with customers to understand their needs, and help them to scale and grow their business”
Grid operator estimates that datacenter power draw will reach 25.7% of the national energy consumption by 2026.

25:40 📢 Matthew – “25% of a whole country’s power is a lot. And this is one of their oldest regions. So I feel like we looked this up at one time on the show in real time; and this was one of the first European regions. So it doesn’t surprise me that it’s a little bit more resource constrained and whatnot, where some of the other ones that were probably built with higher specifications 10 years later, had a better idea what the clouds were doing than when they built it out for the first time.”

27:13 Anthropic’s Claude 3 Opus model is now available on Amazon Bedrock

Conductor Claude’s Opus is now available on Amazon Bedrock.
Opus is the most intelligent Claude 3 model, with best in market performance on highly complex tasks. It can navigate open-ended prompts and sign-unseen scenarios with remarkable fluency and human-like understanding, leading the frontier of general intelligence.
If you haven’t been paying attention (or even if you have and just forgot) that means we now have Haiku, Sonnet, and Opus all available on Bedrock.

28:29 📢 Justin – “I do look forward to tooling coming out to help you figure out which model is the best for your workload, and help you kind of figure that out because otherwise it’s, you know, a lot of costs, a lot of expense trying to work out which models are the best ones for you and lots of test runs.”

28:46 Amazon CloudWatch Internet Weather Map – View and analyze internet health

Fiber and cables and outages, OH MY!
The internet is a crazy place full of BGP, Dark Fiber, undersea cables, overland cables, etc., all controlled by numerous carriers, universities and governments around the world.
When something on the internet goes bump in the night, it affects your customers or sites, and you want to be able to quickly localize and understand the issue as quickly as possible.
Amazon Cloudwatch is pleased to introduce the internet weather map to help.
Built atop a collection of global monitors operated by AWS, you get a broad, global view of internet weather, with the ability to zoom in and understand performance and availability issues that affect a particular city.
While these types of things exist on the internet, I like one that is managed by Amazon and available in cloudwatch.
Amazon is considering additional features to add and wants feedback, but some of the ideas they’re thinking about are to display causes of certain types of outages such as DDOS, BGP Route Leaks, and issues with route interconnects. Adding a view that is specific to chosen ISPs and displaying the impact to public SaaS applications. Justin is completely on board with this.

31:47 US-EAST-1 region is not the cloudy crock it’s made out to be, claims AWS EC2 boss

Dave Brown, global VP for Compute and Networking at AWS, spoke with the register at the Sydney Summit and defended US-east-1, the cloud giant’s first region, which has had more than its fair share of outages.
Brown argued that the region’s age doesn’t mean it’s less resilient than any other AWS facility, and spans hundreds of data centers but didn’t elaborate further.
Because the region is so big, it’s a natural target for early efforts and therefore experiences early failures
US-East Sandbox-1 is called that for a reason…

GCP

33:58 All 218 things we announced at Google Cloud Next ‘24 – a recap

218 (minus 10 customer case studies and 14 partner things – so we think it’s actually 194, but cloud math.)
Justin has had a chance to check out some of the videos now as well, and he has some *thoughts.*
- Developer Keynote is great, I was most impressed with Gemini Cloud Assist that helps you manage your GCP environment, the demo showed troubleshooting a load balancer issue and identifying the issue quickly in configuration.
Things from the document worth checking out:
- Google Vids in workspaces creates AI powered video creation app
- Using LLM Gmail now now block 20% more spam
- You can setup messaging interoperability from Google Chat with Slack and Teams via their partner Mio
- C3 Bare Metal in addition to the N5 and C4 VM’s i mentioned last week
- New X4 Memory Instances (interest form here)
- Z4 Vms are designed for storage dense workloads
- Hyperdisk storage pools advanced capacity allows you to buy pools of storage and share across multiple systems
- Google Cloud networking got the gemini cloud assist, Model as a service endpoint using PSC, Cloud Load balancing and App Hub to allow model creators to model service endpoints to which application developers need to connect.
- Lots of cloud load balancing capabilities coming for Inference
- Cloud Service is a fully managed service mesh that combines traffic directors control plane and googles open-source Istio based service mesh, Anthos Service Mesh
- Cross Cloud Networking capabilities:
  - Private Service Connect Transitivity over Network Connectivity Center, Identity based authorization with MTLS integrates the Identity-aware proxy with their internal app load balancer to support ZTN including client side and soon back-end mutual TLS
  - In-line network data loss prevention in preview, will integrate symantec DLP into cloud load balancers and secure web proxy using service extensions.
  - PSC is now Fully integrated into cloud sql
- Database studio, part of Gemini in databases, brings SQL generation and summarization capabilities to their rich SQL editor in the google cloud console, as well as an AI driven chat interface.
- Database Center allows operators to manage an entire fleet of databases through intelligent dashboards that proactively assess availability, data protection, data security, and compliance.
- Database Migration Service added assistive code conversion for stored procedures
- Bigtable data boost, a pre-ga offering, delivers a high performance, workload isolated on demand processing of transaction data, without disrupting operational workloads.
- Duet AI rebranded to Gemini Code Assist, now with full codebase awareness, new code transformation capabilities and more.
- Snyk is now integrated into Gemini Code Assist

37:35 Google Cloud offers new AI, cybersecurity, and data analytics training to unlock job opportunities

In response to Biden’s executive order on AI, and because it drives more revenue for Google says the Cynic in me.
Google is releasing new Generative AI courses on Youtube and Google Cloud Skills Boost from introductory level to advanced. Once you complete the hands-on training, you can show off your new skill badges to employers from Introductory (no cost), intermediate and Advanced levels.
Google says there are over 505k open entry level roles related to cloud cyber security analysts and 725k open roles related to cloud data analysts, which is why they are launching their new Growth with Google Career Certificates for Data Analytics and Cybersecurity.

Azure

38:36 Advancing science: Microsoft and Quantinuum demonstrate the most reliable logical qubits on record with an error rate 800x better than physical qubits

Microsoft announces a major achievement for the Quantum ecosystem, Microsoft and Quantinuum demonstrated the most reliable logical qubits on record.
By applying Microsoft’s breakthrough qubit-virtualization system, with error diagnostics and correction, to Quantinuums ion trap, they ran more than 214,000 individual experiments without a single error.
Furthermore, they demonstrated more reliable quantum computation by performing error diagnostics and corrections on logical qubits without destroying them.
- This moves them out of current noisy intermediate scale quantum (NISQ) level to level 2 resilient quantum computing.
This is a huge accomplishment, and if you want to know what the error rate has to do with anything, check out the Monday Night Live talk from Re:Invent where they talked about the large challenge in quantum computing around errors and scale.
It’s a big breakthrough, and we definitely expect to see other providers start to copy similar capabilities.
Key takeaway: error rates bad.

Small Cloud Providers

40:22 Major data center power failure (again): Cloudflare Code Orange tested

Apparently Cloudflare’s data center lost power again – the same one that lost power last time (which we covered.)
Cloudflare is pretty pleased though, as the pain and issues the datacenter caused by losing power were significantly less painful.
This resulted in an internal project called Code Orange (they borrowed the idea from Google) when they have an existential threat to their business, they declare code yellow or red. Theirs is orange.. So it makes sense.
5 months after the first failure they were able to test code orange.
Unlike in November, they knew right away they had lost power.
They also knew after an internal cut test in february, how their systems should react.
At 14:58 UTC the PDX01 scenter lost power and their systems kicked into gear, by 15:05 UTC their API and dashboards were operating normally, with 0 human intervention. Our primary focus over the past few months has been to make sure that customers would still be able to configure and operate their cloud flare service.
14 Services were down for 6 hours or more on NOvember 2nd, this time all of these services were up and running in minutes.
Not everything was made as resilient services like analytics were still impacted, as they had not completed their code orange work yet.
We’re going to blame Amazon for stealing all the power.

28:29 📢 Matthew- “Six months ago from being down for many hours and… having six minutes of automated downtime, you know, an automatic rollover six minutes. I probably clicked the button, got distracted by a shiny object and came back to it six minutes later and didn’t even think about that. So it’s amazing what they were able to get through in six months, you know, and I’m sure Bravo to every engineer, software dev dev ops, whoever was involved with doing all that. Cause that’s an impressive feat.”

Closing

And that is the week in the cloud! Go check out our sponsor, Sonrai and get your 14 day free trial. Also visit our website, the home of the Cloud Pod where you can join our newsletter, slack team, send feedback or ask questions at theCloud Pod.net or tweet at us with hashtag #theCloud Pod

Leave a Reply Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.