Meta’s New AI Model Leaks
By Adam Pease
Meta recently announced a series of new AI models in its LLaMA large language model family. Originally, the computational ‘weights’ needed to run the models were available only to select researchers, but they have now leaked onto the public Internet. This post discusses the implications of the leak.
What Is LLaMA?
Meta surprised the Internet with its LLaMA model paper, which announced a new set of AI models much smaller than those powering well-known applications like ChatGPT, but comparable in performance. OpenAI’s GPT-3.5, which powers ChatGPT, is widely reported to be a massive 175-billion-parameter machine learning model, which requires specialized Azure compute clusters to run.
By contrast, Meta’s smaller LLaMA models can be run on consumer-grade hardware, yet still perform competently on several important machine learning benchmarks, which puts them in the running with ChatGPT. While the models were originally intended only for academic research, a GitHub pull request by one user included a torrent link to download the model weights, releasing them online.
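To make the hardware gap concrete, here is a rough back-of-the-envelope sketch of how much memory it takes just to hold a model's weights. It uses the 175-billion-parameter figure cited above; the 7-billion and 65-billion sizes are the smallest and largest models described in the LLaMA paper, and the bytes-per-parameter values (16-bit and 4-bit storage) are illustrative assumptions rather than details from this post. The estimate ignores activations and other runtime overhead, so real requirements are higher.

```python
# Rough estimate of the memory needed to store model weights only.
# Assumption: memory = parameter count x bytes per parameter; runtime
# overhead (activations, caches) is deliberately ignored.

BYTES_PER_GB = 1e9  # decimal gigabytes, for easy mental arithmetic


def weight_memory_gb(num_parameters: float, bytes_per_parameter: float) -> float:
    """Return the approximate size of the weights in gigabytes."""
    return num_parameters * bytes_per_parameter / BYTES_PER_GB


# Parameter counts: 175B is the figure cited in this post; the LLaMA
# sizes are taken from the LLaMA paper (assumption: not stated in this post).
MODEL_SIZES = {
    "175B model": 175e9,
    "LLaMA 65B": 65e9,
    "LLaMA 7B": 7e9,
}

# Illustrative storage precisions (assumptions for this sketch).
PRECISIONS = {
    "16-bit": 2.0,
    "4-bit": 0.5,
}


def print_estimates() -> None:
    for model_name, num_parameters in MODEL_SIZES.items():
        for precision_name, bytes_per_parameter in PRECISIONS.items():
            size = weight_memory_gb(num_parameters, bytes_per_parameter)
            print(f"{model_name:>12} @ {precision_name:>6}: ~{size:.1f} GB")


if __name__ == "__main__":
    print_estimates()
```

Under these assumptions, a 175-billion-parameter model needs roughly 350 GB for its weights at 16-bit precision, far beyond any single consumer GPU, while a 7-billion-parameter model needs about 14 GB, or about 3.5 GB if the weights are stored at 4 bits each.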
A Stable Diffusion for Language Models
The Internet has been abuzz, with some describing the model leak as a ‘Stable Diffusion’ moment for large language models. Stable Diffusion, the open-source text-to-image model that unseated OpenAI’s DALL-E 2 last year, has led to a surge of collective software development that has rapidly moved the text-to-image space forward. Because OpenAI’s GPT models are closed, open source developers have not had the same opportunity to experiment in the language generation space, and many wonder if LLaMA’s leak will open that door.
It remains too early to say whether open source developers will be able to take advantage of LLaMA to build something like a small, open alternative to ChatGPT, but the possibility is there. The leak also raises questions about AI safety. Can researchers who develop advanced AI models really count on those models remaining proprietary? One side effect of the news may be that organizations like Meta take greater steps to secure access to the models they release.
The leak of Meta’s new models suggests that it is hard for large AI research organizations to control access to the models they are building. It may also provoke a renaissance in open source language model development akin to what text-to-image saw last year. In any case, it has important implications for AI safety and the future growth of the market.