Grooving, suing and explaining your work

New Google model, copyright questions and LLM explainability

Welcome back to the Enterprise AI Playbook, Issue 6. Here are the successes, cautionary tales and deep dives from this week.

Successful launches - A Gemma 2 in the rough

Google announced its new open source LLM, Gemma 2, which performs comparably to Llama 3 on benchmarks and in the LMSYS Chatbot Arena. The model is available to download for commercial use, comes in 9B and 27B variants, and can be deployed on GCP.

Gemma 2 benchmarks from Google

The 27B parameter model is quite impressive: it beats the original GPT-4 on a number of tasks and runs on accessible hardware. The technical report also covers the use of LLM distillation, a fairly new area of research in which a smaller model is trained to reproduce the outputs of a larger LLM, in this case one kept within Google’s walls.

Gemma 2 beats out GPT-4-0314 on regular prompts and is slightly worse on hard prompts
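For readers who have not met distillation before, here is a minimal sketch of the textbook idea: the student model is penalized for diverging from the teacher's probability distribution over next tokens. This is an illustration under our own assumptions, not Google's training code; the function names, the temperature parameter and the toy logits are all hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-probabilities from raw logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token position (generic form, assumed)."""
    log_p = log_softmax(teacher_logits, temperature)  # teacher distribution
    log_q = log_softmax(student_logits, temperature)  # student distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Toy logits over a three-token vocabulary (made-up numbers).
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
print(distillation_loss(teacher, student))  # positive: the student disagrees
print(distillation_loss(teacher, teacher))  # zero: the distributions match
```

The loss is zero only when the student matches the teacher exactly, which is what lets a small model inherit behavior from a much larger one.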

Cautionary Tales - Big Three sue Big Two

Last week, Sony Music, Universal Music Group and Warner Records filed lawsuits against AI music generators Udio and Suno, asking for damages of up to $150,000 per song. The lawsuits are the latest development in the AI space, where each domain now has its own growing list of cases: news, image generation, code copilots and now AI music. They continue to push one question to the courts: what qualifies as fair use? These cases will set major precedents for the future but, given their magnitude, will likely be tied up in court for many years to come.

In addition, major players in the space have voiced strong opinions. In a recent CNBC interview, the CEO of Microsoft's AI org argued that all material on the open internet is fair game for training, describing it as "freeware."

“I think that with respect to content that’s already on the open web, the social contract of that content since the ‘90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” if you like, that’s been the understanding.” - Mustafa Suleyman

The question of IP will continue to evolve, but as we saw in the last issue, it has not yet impacted Gen AI usage patterns. Frontier LLM training data is also kept secret, so it will remain challenging to audit what material has been used for training.

Full YouTube interview and full read

Deep dive - LLM explainability continues to evolve

A recent paper from UCLA does a strong job visualizing how LLMs learn patterns through in-context learning. Looking at a few different features, they compare how combinations affect predictions between two classes.

Comparing LLMs for feature spaces

Traditional machine learning focused on creating separable classes, but these patterns get blurrier for LLMs given their structure and open-ended prompting options.

Traditional ML models for feature separability

What becomes interesting is how these decision regions evolve as more in-context examples are added. The visualizations show how well LLMs generalize with additional examples, and how much the order of those examples matters.

Impact of few shot learning
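To try this kind of probing on your own model, one possible approach is to build a few-shot prompt from labeled points, query the model across a grid of inputs, and look at where the predicted label flips. The sketch below is our own illustration, not the paper's code: the prompt format, build_prompt, probe_boundary and the ask_model callable (a stand-in for whatever LLM client your team uses) are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, float, str]  # (feature x1, feature x2, class label)

def build_prompt(examples: Sequence[Example], query: Tuple[float, float]) -> str:
    """Format labeled examples plus one unlabeled query as a few-shot prompt."""
    lines = [f"Input: {x1:.2f} {x2:.2f}\nLabel: {label}" for x1, x2, label in examples]
    lines.append(f"Input: {query[0]:.2f} {query[1]:.2f}\nLabel:")
    return "\n".join(lines)

def probe_boundary(
    examples: Sequence[Example],
    ask_model: Callable[[str], str],  # hypothetical: prompt in, label text out
    steps: int = 5,
    low: float = 0.0,
    high: float = 1.0,
) -> List[List[str]]:
    """Query the model on a steps x steps grid and return the predicted labels."""
    axis = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    return [
        [ask_model(build_prompt(examples, (x1, x2))).strip() for x1 in axis]
        for x2 in axis
    ]

def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call: splits on the first feature of the query."""
    query_line = prompt.strip().split("\n")[-2]
    return "A" if float(query_line.split()[1]) < 0.5 else "B"

shots = [(0.1, 0.2, "A"), (0.9, 0.8, "B")]
for row in probe_boundary(shots, fake_model, steps=3):
    print(" ".join(row))
```

Running probe_boundary with the same examples in a different order, or with more of them, and comparing the grids is a cheap way to see the sensitivity to example count and ordering described above.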

This research into explainability will help us better understand the impact of prompting and how LLMs generalize, and it gives teams new tools to test models for resilience to production-type variance. Full paper link.

Question to ask your team

What is the deployment time difference for adding a new open source vs proprietary model to our use case?

Until next week, Denys - Enterprise AI @ Voiceflow