Aragon Research

Stability AI Releases Its First Large Language Model: StableLM

By Adam Pease

 

Large language models (LLMs) have taken the world by storm since the release of ChatGPT, but they come on the heels of last year’s revolution in text-to-image models, which showed that AI could be powerfully leveraged to generate images from scratch. Stability AI, the company behind Stable Diffusion, by far the most popular text-to-image model, has now waded into LLM territory with the release of StableLM.

What Is StableLM?

StableLM is a new language model trained by Stability AI. Like most model releases, it comes in several sizes: 3 billion and 7 billion parameter versions, with 15 billion and 30 billion parameter versions slated for release. Parameter count roughly tracks model capacity and compute requirements, which suggests that StableLM could be matched to a variety of devices depending on resource constraints.
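To make the link between parameter count and hardware concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the memory needed to hold the weights at a given numeric precision. The function name and the precision table are illustrative assumptions, not anything published by Stability AI, and real deployments also need room for activations and runtime overhead.

```python
# Rough estimate of the memory needed just to store model weights.
# Assumption: a dense model where every parameter uses the same precision.
# Activations, attention caches, and framework overhead are ignored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "fp16") -> float:
    """Return the approximate GiB required to hold the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Sizes mentioned in the article, in billions of parameters.
    for billions in (3, 7, 15, 30):
        row = ", ".join(
            f"{dtype}: {weight_memory_gib(billions * 1e9, dtype):6.1f} GiB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"{billions:>2}B params -> {row}")
```

By this estimate, the smaller versions could plausibly fit on a single consumer GPU at reduced precision, while the larger ones would likely call for data-center hardware.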

The new models are trained on a dataset that builds on The Pile, which combines large volumes of internet data and has been shown to be a powerful dataset for training LLMs. Additionally, Stability leverages the techniques pioneered by Stanford researchers in their Alpaca language models, which extend a base model through fine-tuning to make it more responsive to user instructions and more natural to communicate with.
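For readers unfamiliar with this style of fine-tuning, the sketch below shows how an instruction and its desired response might be joined into a single training text. The template wording follows the no-input prompt format released with Stanford Alpaca, as best I recall it. StableLM's own tuned models may use a different layout, and the helper name `build_training_text` is hypothetical.

```python
# Illustrative only: formats one instruction/response pair in an
# Alpaca-style layout. This is not StableLM's confirmed prompt format.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_training_text(instruction: str, response: str) -> str:
    """Join an instruction and its target response into one training string."""
    prompt = ALPACA_TEMPLATE.format(instruction=instruction.strip())
    return prompt + response.strip()


if __name__ == "__main__":
    example = build_training_text(
        "Summarize what a context window is in one sentence.",
        "It is the maximum amount of text a model can consider at once.",
    )
    print(example)
```

A base model fine-tuned on many such pairs learns to continue the text after "### Response:" with an answer, which is what makes it feel more conversational.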

The Language Model Wars

When it released Stable Diffusion last year, Stability AI’s only serious competition was DALL-E 2, OpenAI’s closed-source text-to-image model. By offering a high-quality, open-source alternative, Stability AI was able to breathe life into the emerging generative content market and catalyze a frenzy of development efforts and new business model concepts.

Now, it steps into a more competitive space. As many of tech’s big players chase the ‘lightning in a bottle’ quality of ChatGPT, Stability will need to demonstrate that its model can match the quality users have come to expect from market-leading language models. It is too early to tell how StableLM will fare, and the next several weeks will doubtless bring considerable benchmarking to compare this new offering with other language models on the market. One promising point in its favor, though, is the model’s 4,096-token context window, which lets it take more preceding text into account at once than many other open-source alternatives.
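To illustrate why the size of the context window matters in practice, the following sketch trims a chat history so that it fits within a 4,096-token budget, keeping the most recent messages. The whitespace-based `count_tokens` is a deliberately crude stand-in for a real tokenizer, and the function names and the reply reserve of 256 tokens are assumptions made for this example.

```python
CONTEXT_WINDOW = 4096  # token limit cited in the article


def count_tokens(text: str) -> int:
    # Hypothetical stand-in: counts whitespace-separated words.
    # A real tokenizer would usually report a different, often higher, count.
    return len(text.split())


def trim_history(
    messages: list[str],
    reserve_for_reply: int = 256,
    window: int = CONTEXT_WINDOW,
) -> list[str]:
    """Keep the newest messages that fit in the window, in original order."""
    budget = window - reserve_for_reply
    kept: list[str] = []
    used = 0
    # Walk backwards so the most recent messages are kept first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept


if __name__ == "__main__":
    history = ["first question " * 2000, "a short follow-up", "the latest message"]
    print(trim_history(history))
```

With a larger window, fewer of the older messages have to be dropped, which is the practical sense in which a longer context helps the model "remember" a conversation.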

Bottom Line

Like many others, Stability AI is diving into the large language model market. Previously, it energized the open-source community around generative content and drove innovation; it remains to be seen whether Stability can bring the same transformative power to LLMs.



This blog is part of the Content AI blog series by Aragon Research’s Analyst, Adam Pease.

Missed the previous installments? Catch up here:

Blog 1: RunwayML Foreshadows the Future of Content Creation

Blog 2: NVIDIA Enters the Text-to-Image Fray

Blog 3: Will OpenAI’s New Chatbot Challenge Legacy Search Engines?

Blog 4: Adobe Stock Accepts Generative Content and Meets Backlash

Blog 5: OpenAI Makes a Move for 3D Generative Content with Point-E

Blog 6: ChatGPT and the Problem of Detecting AI-Generated Content

Blog 7: Content AI: Voice AI Takes a Step Forward

Blog 8: AI in the Courtroom: Are Robot Lawyers the Future of Law?

Blog 9: GitHub Copilot and the Legality of Generative Content

Blog 10: Google Steps into the Chat AI Ring with Bard, Anthropic Investment

Blog 11: Exploring Google Bard’s Botched Demo

Blog 12: Meta AI Is Working at the Intersection of Robotics and Generative AI

Blog 13: Meta’s New AI Model Leaks

Blog 14: Students in China Use ChatGPT from Behind the Firewall

Blog 15: OpenAI’s ChatGPT API Will Transform Application Experiences

Blog 16: Microsoft Announces Copilot X, GPT-4 Integration

Blog 17: BloombergGPT Brings Generative AI to Finance
