Welcome to episode 235 of the Cloud Pod podcast – where the forecast is always cloudy! This week a full house is here for your listening pleasure! Justin, Jonathan, Matthew, and Ryan are talking about cyberattacks, attacks on vacations (aka Looker for mobile) and introducing a whole new segment just for AI. You’re welcome, SkyNet.
Titles we almost went with this week:
- 😑AI is worth investing in – says leading AI service provider, Microsoft
- 🙄Join The Cloud Pod for the ‘AI Worth Investing In’ Eye-Roll Extravaganza
- 🌊The Cloud Pod: Breaking News – Microsoft Discovers Water is Wet, AI Worth Investing In
- 🦾Jonathan finally wins the point for predicting ARM instances in Google Cloud
- 🏖️Looker for Mobile: Ruining vacations one notification at a time
- 🌨️Microsoft helps bring cloud costs into FOCUS
- 🛣️Focus only on the path forward… not the path behind you.
- 🚘GPT-4 Turbo… just be glad it’s not Ultra GPT-4
- 🐦I can only flinch at the idea of Finch
- 🤖The Cloud Pod finally accepts AI is the future
A big thanks to this week’s sponsor:
Foghorn Consulting provides top-notch cloud and DevOps engineers to the world’s most innovative companies. Initiatives stalled because you have trouble hiring? Foghorn can be burning down your DevOps and Cloud backlogs as soon as next week.
New Segment – AI is Going Great!
- You may be shocked to find out that there is value in AI for your business!
- To help you understand, Microsoft paid IDC to make a study that provides unique insights into how AI is being used to drive economic impact for organizations.
- 2000 business leaders and decision makers from around the world participated in the survey.
- 71% of respondents say their companies are already using AI, and 22% said within 12 months they would be using it.
- 92% of AI deployments take 12 months or less
- Organizations are realizing a return on their AI investment within 14 months
- For every $1 a company invests in AI, it realizes an average return of $3.50.
- 52% report that a lack of skilled workers is their biggest barrier to implementing and scaling AI. We assume that means prompt engineers or model builders.
- IDC projects that generative AI will add nearly $10 trillion to global GDP over the next 10 years.
- Key areas where businesses are finding value:
- Enrich employee experiences
- Reinvent customer engagement
- Reshape business processes
- Bend the curve on innovation.
02:33 📢 Ryan – “There were some questions that they didn’t ask that I wanted them to, like how many respondents are already using AI but wish they weren’t, or how many months do you think it will take before they realize how expensive this is?”
03:55 📢 Jonathan – “I think it’s funny that they specifically targeted for making AI related decisions in the business, so I’m not surprised that so many companies were already using it – because they had a person in that role already; and I’m curious about the 7% who said they wouldn’t be using it.”
- OpenAI held their first developer conference, where they announced the newest model GPT-4 Turbo.
- The best thing about it is that its training data is newer than Jan 2022, so it will no longer tell you that Nancy Pelosi is still the speaker of the house. Although it’s still wrong, as it was updated as of April 2023… when Kevin McCarthy was the speaker. But it’s **less** wrong, so win?
- The new GPT-4 Turbo supports a 128K context window, and input and output tokens are 3x and 2x less expensive, respectively.
- You no longer have to tell OpenAI whether you want to use GPT-3.5, GPT-4, internet browsing, plugins or DALL-E 3; it will automatically detect what to use based on the purpose of your prompt, so it’s easier and more intuitive. These updates are available to ChatGPT Plus users.
- They also announced Copyright Shield, which means OpenAI will defend its customers and pay the costs incurred if they face copyright infringement claims.
- OpenAI launched a platform to develop your own custom version of GPT.
- This is an evolved version of ChatGPT plugins, with more capabilities.
- OpenAI will also be creating a GPT store, so users can create and browse custom GPTs, much like an app store.
09:28 📢 Justin – “On the surface to me this all seems very natural, progressional, but people on Twitter (X?) were losing their minds; ‘oh my god this is the future of AI!’ and I’m like, ok – off the Kool-Aid folks.”
Side note – we are interested in your thoughts on whether or not this truly does herald the *future of AI* (insert sci-fi-sounding voice here).
10:57 📢 Jonathan – “I’m really enjoying the generative search results from Google – I don’t know if everyone is getting those or not – but I’ve got this Google opinion rewards thing, and every time I do a search it pops up a thing that asks my opinion on the results that it gave, and that stuff is up to date, so they’re either spending an absolute fortune on the back end constantly retraining the model, or something else that we don’t know yet but that really changed search for me completely.”
- ML Ops is now LLM Ops…. and we’re moving along.
- I don’t remember this happening a year ago, but a year ago AWS announced Finch, their command line developer tool for building, running and publishing Linux containers on macOS.
- The community has been growing around this and now is at the 1.0 milestone. I never tried it as I’m satisfied with Podman…
- Runfinch is the new website for the project. I’m curious to see if it gets popular.
- The big difference between the two is that Podman uses CRI-O libs, whereas Finch uses containerd.
- Also, Podman uses QEMU, while Finch uses Lima.
16:05 📢 Justin – “So if you want more containerd because you’re ECS or you’re doing EKS, Finch may be a better choice for you, versus Podman will be your more generic for any different thing you can do on a Windows.”
- AWS is announcing Amazon EC2 Capacity Blocks for ML, a new Amazon EC2 usage model that further democratizes ML by making it easy to access GPU instances to train and deploy ML and generative AI models. With EC2 Capacity Blocks, you can reserve hundreds of GPUs colocated in EC2 UltraClusters designed for high performance ML workloads, using an Elastic Fabric Adapter (EFA) network in a petabit-scale non-blocking network to deliver the best network performance available in Amazon EC2.
- This is an innovative new way to schedule GPUs: you reserve the capacity for a future date, for just the amount of time you require.
- They are currently available for EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs in the Ohio region.
- Think of this similarly to a hotel room reservation. With a hotel reservation, you specify the date and duration you want your room and the size of beds you’d like.
17:58 📢 Justin – “I’m ok with the idea of picking a date I want to do it, but to know when my job is going to end seems suspect; I’ve been watching a hack-a-thon happen this week around AI, and people have been building ML models – and they’re not super complicated models – and it takes HOURS in some cases. So how would you ever predict how long it’s going to take if you don’t know…”
18:35 📢 Jonathan – “Yeah I think it’s a sign that Amazon’s resources are just massively constrained; they must be so busy and to monetize things properly to stop people from moving off to other clouds they have this scheduling option so you can now guarantee availability.”
- Effective mid-2024, newly released EC2 instance types will only use version 2 of the EC2 Instance Metadata Service (IMDSv2).
- Amazon is taking a series of steps to make IMDSv2 the default choice in the management console.
- In February 2024, they will create a new API function that will allow you to control the use of IMDSv1 as the default at the account level.
- IMDSv2 is the more secure option to access the internal instance metadata and eliminates a potential attack vector in your instances if they are breached/hacked.
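For anyone who hasn't looked at what IMDSv2 changes in practice, here is a minimal sketch of its session-oriented flow: a PUT request fetches a short-lived token, and every metadata read then carries that token in a header. The endpoint and header names are the documented IMDS ones; the helper function names and the default timeout are our own assumptions for illustration, and `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

# Link-local address of the EC2 Instance Metadata Service.
IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1 of IMDSv2: a PUT that asks for a session token (helper name is hypothetical)."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2 of IMDSv2: a GET that presents the session token (helper name is hypothetical)."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps against the live service; only meaningful on an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, `fetch_metadata("instance-id")` would return the instance ID. Under IMDSv1 the second request worked without any token, which is the gap the token handshake is meant to close.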
- Looker Studio enables millions of users to bring their data to life with insightful dashboards and visualizations, connecting to more than 1,000 data sources and a host of community-sourced report templates.
- Looker Studio Pro expands on this self-service business intelligence platform with enterprise capabilities, including team content management and Google Cloud support.
- Today, Google is bringing Looker Studio Pro to your mobile devices through a new application available for Android on Google Play and for iOS on the App Store, enabling you to view reports and get real-time data about your business from anywhere.
- Justin’s executive self is in love with this.
18:35 📢 Ryan – “This just proves to me that executives don’t do any real work.”
- Google is thrilled to announce the preview release of its first VM family based on the Arm architecture, Tau T2A. Powered by Ampere Altra Arm-based processors, T2A VMs deliver exceptional single-threaded performance at a compelling price.
- Tau T2A comes in multiple predefined VM shapes, with up to 48 vCPUs per VM, and 4 GB of memory per vCPU.
- They offer up to 32 Gbps networking bandwidth and a wide range of network attached storage options, making Tau T2A VMs suitable for scale-out workloads including web servers, containerized microservices, data-logging processing, media transcoding, and Java applications.
- You can also use this with Kubernetes Engine and Batch.
23:37 📢 Justin – “And Jonathan, I am happy to award you a point for the Google Next Conference 3 ½ years ago for this finally being delivered!”
- Vertex AI Search, which was made generally available in August, has new capabilities today.
- It adds new customization and expanded grounding and compliance capabilities so customers can develop even more powerful and secure search, chat and personalized recommendation applications.
- The new generative AI features address the needs of organizations, especially large enterprises, that want to more deeply customize AI-driven search:
- Customizable Answers
- Search Tuning
- DIY search engines with Vector Search and Vertex AI embeddings
- One of the concerns enterprises have about generative AI is that it can be prone to hallucinations; with the new grounding in enterprise data or in selected public data sets, you can help ensure the answers are more accurate.
25:46 📢 Jonathan – “I think hallucinations are going to get solved pretty quickly, actually. Because if you think about a person, if I ask a person, if I ask Justin about what he knows about RV maintenance, he may know that he knows nothing. And he can say that he knows nothing about it, and he won’t just make something up. But if I ask somebody else, they may make something up. It’s what people do. It’s the way people behave. I think we need to figure out how to train models to know what they don’t know, not just what they *do* know.”
- Azure has a great post about its planned FOCUS support in Cost Management once the FOCUS 1.0 release ships later this year. (We should note that AWS has also gotten onto the FOCUS bandwagon.)
- FOCUS is a groundbreaking initiative (strong words) to define a common format for billing data that empowers organizations to better understand cost and usage patterns and optimize spending and performance across multiple cloud, SaaS, and even on-premises service offerings.
- “At Walmart, we spend a lot of our time not only normalizing data across different clouds, but we’re also constantly reacting to changing SKUs and services in areas like Storage, Compute, and AI/ML. One of the most significant outcomes of FOCUS isn’t just that we’re aiming to simplify and standardize on a common specification, it’s the conversations that are starting on best practices – How should we all think about amortization for committed and reserved instances? What are our standard values for service categories? It’s much more than just a conversation about a few fields. It’s a discussion that will help define best practices and standards for a cloud computing market that continues to expand into new areas like SaaS, IoT, and Gen AI. We’re discussing standards today that will be the foundation of how we talk about cost decades from now. It’s exciting.” – Tim O’Brien, Senior Director of Engineering, Cloud Cost Management at Walmart Global Tech.
- You can give FOCUS a test run today by taking advantage of Microsoft’s FOCUS sample Power BI Report.
28:17 📢 Justin – “You can also get access to a report, the Microsoft FOCUS Sample Power BI Report, which I’m excited to learn exists, because I was just talking to our finance guy, and he was saying Microsoft Azure’s billing is horrendous and he hates everything about it…and so I was like well this might solve part of his problem! Oh, Matt says no. Nevermind.”
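To make the "common format for billing data" idea concrete, here is a rough sketch of what normalizing two providers' billing rows into one shape could look like. This is not the FOCUS specification: the `NormalizedCharge` fields are only loosely modeled on the kind of columns a common schema defines, and the input keys (`product_name`, `unblended_cost`, `meterCategory`, `cost`, and so on) are hypothetical stand-ins for whatever each provider's export actually contains.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedCharge:
    """One billing line in a shared shape (field names are illustrative, not the spec)."""
    provider_name: str
    service_name: str
    billed_cost: float
    billing_currency: str


def from_aws(row: dict) -> NormalizedCharge:
    # Input keys are hypothetical placeholders for an AWS billing export.
    return NormalizedCharge("AWS", row["product_name"], float(row["unblended_cost"]), row["currency"])


def from_azure(row: dict) -> NormalizedCharge:
    # Input keys are hypothetical placeholders for an Azure cost export.
    return NormalizedCharge("Microsoft", row["meterCategory"], float(row["cost"]), row["billingCurrency"])


def total_by_provider(charges: list[NormalizedCharge]) -> dict[str, float]:
    """Once rows share a shape, cross-cloud rollups are a single loop."""
    totals: dict[str, float] = defaultdict(float)
    for charge in charges:
        totals[charge.provider_name] += charge.billed_cost
    return dict(totals)


charges = [
    from_aws({"product_name": "Amazon EC2", "unblended_cost": "12.5", "currency": "USD"}),
    from_aws({"product_name": "Amazon S3", "unblended_cost": "7.5", "currency": "USD"}),
    from_azure({"meterCategory": "Virtual Machines", "cost": "30", "billingCurrency": "USD"}),
]
totals = total_by_provider(charges)
```

The point of a shared specification is that the two `from_*` adapters stop being something every finance team writes and maintains for itself.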
32:28 And You Get a Cyber Attack, and You Get a Cyber Attack… Or How Our Mortgage Lender Got Owned
Do you have your mortgage with Mr. Cooper? Bummer. A cyberattack, most likely ransomware, is to blame for them being unable to process mortgage payments from Halloween until very recently. We’re all REALLY excited to learn the extent of our information that is now available on the dark web. We’ll keep you updated on any information that comes out.
Hacked – again. Wait, no. A support breach? Didn’t this happen last year? Password keychains for the win!
Is it REALLY the data center’s fault – or is it really YOUR fault? We decide.
And that is the week in the cloud! We would like to thank our sponsor, Foghorn Consulting. Check out our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback or ask questions at thecloudpod.net, or tweet at us with hashtag #theCloudPod