What does a flying golden retriever say about Google?
Samsung defects to Bing? Plus LLMs from AmazonBasics and more
Welcome to the second issue of The Inference Times. In this issue we have news of a potential defection by Samsung, flying golden retrievers, new LLM offerings from AmazonBasics and more.

Front page:
1) Samsung defection prompts new Google Search.
Google employees are shocked, shocked! that Apple and Samsung could defect from the lucrative deals that make Google the default search engine on the two companies’ phones. Losing these exclusives to Microsoft could mean a hit of nearly $25 billion to the revenue Google earns from those deals.
The news is sparking furious activity within Google to add new AI features to Search under project Magi, and even to introduce a wholly new search engine, the NY Times reports.
Google has been slow to ship competitive capabilities. Bard is weak, exists only in limited release and still has no integration with Search. For all the talk, Google has introduced no major changes to its core products in response to OpenAI. Why?
The truth is Microsoft has much more flexibility to iterate with Bing than Google has with Search. As a nearly-giddy Satya Nadella pointed out: Microsoft just needs to take a few users from Google. Because search is so profitable, even a small share gain massively boosts Microsoft’s gross margins… Google, on the other hand, needs to protect all of its users and all of its gross margins.
Any sweeping product change from Google that reduces the revenue per search or significantly increases the cost of results would mean a massive earnings and revenue hit to the company.
Now that a huge chunk of Google’s top-line revenue is also at risk, we may start to see more drastic changes from the slumbering giant.
2) Does a flying golden retriever portend the decline of Google? Sundar’s 60 Minutes interview.
When a tech company resorts to PR to solve a product gap, you know they’re sweating it. Another warning sign? When they’re demoing proof-of-concept video from a code notebook rather than a consumer-facing product release.
I will say that the text-to-video demo of a golden retriever with wings is pretty darn cute.

The 60 Minutes interview with Sundar Pichai is evocative of the extended interviews Sam Altman has been giving, and speaks to a desire to recapture some of the mindshare OpenAI has won even as Google works furiously to deliver the products that will allow it to compete.
3) If you can’t beat ‘em, embrace and extend ‘em: LLMs from AmazonBasics.

Amazon just announced Bedrock, its stab at staying relevant in the generative AI space.
Amazon wrapped models from Stability (Stable Diffusion) and Anthropic (Claude), lightly integrated them with existing Amazon products like SageMaker, and is selling them in a serverless model (as opposed to requiring customers to spin up their own GPU instances). Oh, Amazon also announced its own horse in the LLM race, Titan, but it won’t be ready to race for a few more months.
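For flavor, here is a minimal sketch of what that serverless consumption looks like from the customer side: one API call to a hosted model, no GPU instances to manage. The client name, model ID and request schema below are illustrative assumptions rather than documented specifics.

```python
# Rough sketch of the serverless model: call a hosted foundation model via an
# API instead of managing GPU instances yourself. The model ID and request
# schema below are assumptions for illustration, not verified documentation.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "\n\nHuman: Why is serverless inference convenient?\n\nAssistant:",
    "max_tokens_to_sample": 200,
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",  # hypothetical example model ID
    contentType="application/json",
    accept="application/json",
    body=body,
)

print(json.loads(response["body"].read())["completion"])
```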
OpenAI was the first mover, validated a massive TAM in the space, continues to ship aggressively and maintains a death-grip on the state of the art: that’s got to scare Amazon execs.
Without a viable foundation model of its own, the next best thing for Amazon is to make it easier for customers to consume other vendors’ models on AWS and promise more soon.
4) The first AI-generated song that slaps.
We’ve seen a few AI generated songs, but this is the first one that’s really good.
"Listen to this AI generated song featuring Drake & The Weeknd. It goes so damn hard. It's by "Ghostwriter977" on TikTok and it's blowing up on socials + streaming platforms. UMG, which controls around 1/3 of the global music market, has already asked streaming platforms to ban… twitter.com/i/web/status/1…"
— Roberto Nickson (@rpnickson), Apr 16, 2023
5) Elon’s plans for those 10,000 GPUs.
It looks like Elon got over his profound concerns about AI development and has big plans for the 10,000 GPUs he’s rumored to have purchased, as Financial Times reported. He’s hiring folks away from Google Brain, DeepMind and other labs and soliciting investments from SpaceX and Tesla.
It would be pretty wild if an entirely separate AI lab managed to build products competitive with OpenAI’s from a standing start faster than Google manages to put on its shoes.
6) Hands, text and high res: Stability releases Stable Diffusion XL.
Stability’s latest image-generation release is a big improvement. The big changes include better text rendering inside images, higher-resolution output and improved photorealism. In my use, it’s improved by a lot, but well… let’s say it’s still a far cry from Midjourney 🫳🤦‍♂️. You can try it for yourself here.

7) Apocalypse delayed.
Sam Altman quashed rumors that GPT-5 is being trained right now. Training incrementally larger models has a significant cost, and it’s likely there are lower-hanging commercialization priorities, for instance around plugin capabilities. Could we see multi-step, agent-like properties built into ChatGPT?
🔧 Tool Time:
Question answering with embeddings. OpenAI released a really good overview of how to leverage embeddings for question answering (a minimal sketch of the pattern appears after this list).
Augie - AI-generated social media videos. Gated behind a signup 🙁
BabyAGI as OpenAI plugin. Won’t help you if you don’t have plugin access, but for those who do you’ll get the multi-step autonomous agent experience right inside ChatGPT.
godmode.space and cognosys.ai, two clean interfaces for AI agents. The former is nice in that it allows human-in-the-loop feedback so you can correct the agent if it goes off the rails, but it is a little slower. Good demo video here.
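The basic pattern from the embeddings guide: embed your documents, embed the question, rank by cosine similarity, and stuff the best matches into the prompt. The sketch below is a rough illustration of that flow, assuming the pre-1.0 openai Python client; the documents and question are placeholders, not content from OpenAI’s guide.

```python
# Minimal sketch of question answering with embeddings (illustrative only).
# Assumes OPENAI_API_KEY is set and the pre-1.0 openai Python client.
import numpy as np
import openai

docs = [
    "The Inference Times is a weekly AI newsletter.",        # placeholder docs
    "Bedrock is Amazon's managed service for foundation models.",
]

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in resp["data"]])

doc_vecs = embed(docs)

question = "What is Bedrock?"
q_vec = embed([question])[0]

# Cosine similarity ranks documents by relevance to the question.
scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
context = docs[int(np.argmax(scores))]

# Hand the best-matching document to the chat model as grounding context.
answer = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"},
    ],
)
print(answer["choices"][0]["message"]["content"])
```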
🧪 Research:
GPT-4 passes neurosurgery exam. The paper documents GPT-4 receiving an 82% on a neurosurgery board question bank. Stunning improvement from GPT-3.5 at 62% and Bard at 44%. Oh, and hallucinations were 2% compared to Bard at 57%!
Neural radiance fields indistinguishable from a drone fly-through. (Neural radiance fields are an ML method for reconstructing a 3D scene from a sparse set of 2D images.) The paper is noteworthy for getting superior image quality while training 22x faster than alternative methods. Demo video. You can geek out more with a Twitter thread from one of the authors.
Faster, cheaper alternative to diffusion models? This paper covers an alternative to diffusion models that achieves SOTA for one-step generation. Currently, diffusion-based models like Stable Diffusion and DALL·E require a 50+ step sampling process, which is slow and expensive. Consistency models take an alternative, one-step approach to image generation (sketched below).
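To make the cost difference concrete, here is a hypothetical sketch of the two sampling loops; denoise_step and consistency_fn stand in for trained networks and are not real APIs from the paper.

```python
# Hypothetical comparison of sampling cost: many network calls vs. one.
import numpy as np

def sample_diffusion(denoise_step, steps=50, shape=(64, 64, 3)):
    # Diffusion: start from pure noise and refine it over many steps.
    x = np.random.randn(*shape)
    for t in reversed(range(steps)):
        x = denoise_step(x, t)   # one network call per step -> 50+ calls total
    return x

def sample_consistency(consistency_fn, shape=(64, 64, 3)):
    # Consistency model: a single call maps noise straight to an image.
    x = np.random.randn(*shape)
    return consistency_fn(x)     # one network call total
```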
Animated children’s drawings from human video. A clever paper from FAIR takes as input a video of a dancing person and a low-fi children’s drawing, then animates the children’s drawing. Demo video and repo.
The best part of the entire paper is the taxonomy of drawing morphologies they need to be robust to.

Bonus editorial - why is it such a pain to secure cloud GPUs?
If you've tried to play with generative AI on cloud GPUs, you've likely had a miserable time of it. Why is that??
There are a few really good reasons... but they all amount to one conclusion: GPUs are a lousy business for the clouds. Let's count the ways...
On the supply side:
There's only one vendor for the best chips: NVIDIA. It knows it's the only game in town, so it isn't obliged to provide the deep volume discounts the clouds are used to getting.
Even worse, the chips come encumbered with restrictive licensing designed to keep NVIDIA's margins high.
Little differentiation:
Since all the cutting edge training happens on the same NVIDIA GPUs with similar interconnect technology, cloud vendors have to compete on price for these undifferentiated products.
Oh, and since pretty much all the training happens against NVIDIA's libraries or open-source libraries that talk to them, there's little room for vendor differentiation at the service and product level.
Higher usage = less profitable
Most cloud workloads are like airline seats: you can oversell capacity, trusting a certain percentage won't get used.
Not so for ML training (where the model 'learns' from a dataset): resource consumption stays pegged at 100% for the duration of the job. Inference workloads (using the pre-trained model to generate predictions) have more variable resource demands, but they're also much less compute intensive.
Loyalty > margins
Once the cloud majors get their allotment of the cutting edge GPUs, they're largely incentivized to provide access to the scarce resource to their largest customers. GPU access is worth more in loyalty than it is in profit margin.
There's also a big downside fraud risk!
GPU instances are a prime target for crypto miners. If miners steal credentials to a cloud account that can spin up lots of GPU instances, they can run up $20k bills, which AWS and GCP are typically obliged to write off.
All these factors contribute to making it a pain to get access to GPU resources.