The Generative Edge Week 20
Google wants you to know they are all-in on AI, the music industry is about to be disrupted, Google's new PaLM 2 LLM comes in exciting flavours, and more out of Google IO 2023.
Welcome to week 20 of The Generative Edge. Today’s edition will be all about last week’s Google IO. Here is the gist in 5 bullet points:
Google unveils MusicLM, an AI tool for music generation, and PaLM 2, an advanced language model.
PaLM 2 comes in four sizes, from the mobile-friendly Gecko to the largest model, Unicorn.
Google IO 2023 showcases AI in various products and new B2C/B2B applications.
AI-powered updates include Duet AI for Google Workspace, Immersive View for Routes, and AI-Powered Google Search.
Anthropic releases a large language model with a gigantic context size.
And for the details, let’s hop right in:
MusicLM
AI is poised to disrupt a lot of sectors, and we’ve talked about image, video and voice generation before. Now, AI-powered music generation is gaining momentum and impact (Spotify recently had to delete thousands of fraudulent AI-generated songs), and many novel approaches are being tried, such as using diffusion models (famously used by generative image models) for music generation.
Last week, Google announced its own entry into that pantheon of AI-powered music generation tools: MusicLM.
MusicLM is a model that generates high-fidelity music from text descriptions such as:
A rising synth is playing an arpeggio with a lot of reverb. It is backed by pads, sub bass line and soft drums. This song is full of synth sounds creating a soothing and adventurous atmosphere. It may be playing at a festival during two songs for a buildup.
The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.
Slow tempo, bass-and-drums-led reggae song. Sustained electric guitar. High-pitched bongos with ringing tones. Vocals are relaxed with a laid-back feel, very expressive.
More curated examples: google-research.github.io/seanet/musiclm/examples/
Website: https://aitestkitchen.withgoogle.com/experiments/music-lm
As with so many AI tools, there is a waitlist - sign up if you want to try this at some point.
The music industry is about to experience some serious disruption and it might not be ready for it.
PaLM 2
Google was famously taken by surprise by the popularity of OpenAI’s ChatGPT and subsequently GPT-4, and has been scrambling a bit to show the world that it can also be a significant product player in this space (it is without a doubt at the top of the game when it comes to research).
Google recently unveiled PaLM 2, its latest and allegedly most advanced model. It was designed to enhance the logic, math, and coding capabilities of Bard, Google’s own AI chatbot, and aims to reduce the number of hallucinations (i.e., inaccurate information) that occurred with previous models.
PaLM 2 is available in four sizes: Gecko, Otter, Bison, and Unicorn, with Gecko being the smallest model and Unicorn the largest.
Top highlight: Gecko, which is lightweight, runs on mobile devices, is fast enough for interactive applications, and works offline
Future LLMs will be able to run mobile-native, offline and always-on
Other variants:
Sec-PaLM: Fine-tuned PaLM 2 for security, detects malicious scripts, resolves threats
Med-PaLM: Fine-tuned PaLM 2 for medical knowledge, expert-level performance on exams
You can enter the waitlist here: https://developers.generativeai.google/
Google IO 2023 - AI, AI and some more AI
Google wants everyone (especially their investors) to know that they are all-in on generative AI, not just for internal products but directly via B2C/B2B. Last week’s Google IO was therefore chock-full of AI announcements, and we’ll quickly run down some of the more interesting ones.
As mentioned above: PaLM 2
Google Bard Updates: Generates code, debugs in 20 languages, connects to Google Lens and Maps data
Duet AI for Google Workspace: AI enhancements to Gmail, Docs, Sheets, Slides, "Help Me Write" feature
Immersive View for Routes: Google Maps feature for visualizing landmarks and stops
AI-Powered Google Search: Conversational responses, multifaceted query handling
Bard Extensions: Integration with third-party apps, e.g., Adobe Firefly for AI-generated images
… and what else?
Hugging Face releases an experimental API to support AI agents (we’ve talked about agents before), and Anthropic presents an LLM with a gigantic 100k-token context.
And that’s it for this week!
Find all of our updates on our Substack at thegenerativeedge.substack.com, get in touch via our office hours if you want to talk professionally about Generative AI and visit our website at contiamo.com.
Have a wonderful week everyone!
Daniel
Generative AI engineer at Contiamo