The Open(ish) Rally of Large Language Models

Meta’s release of LLaMA-1 ignited a competitive yet cooperative spirit in the AI world, as institutions and independent groups alike race to contribute openly available large language models (LLMs). The result has been a series of notable releases, each with a different emphasis.

The Emergence of LLM Contenders

Several institutions have stepped up to the challenge, releasing the weights of LLMs that push the boundaries of what’s openly available:

MosaicML’s MPT-30B and TII UAE’s Falcon-40B

MosaicML’s MPT-30B has drawn wide attention, while TII UAE’s Falcon-40B marks an exciting entry from a new player in the LLM field. Falcon-40B was quickly made open source. Its successor, Falcon-180B, was trained on very little code and was not evaluated on coding benchmarks, which points to a focus on non-coding tasks.

Together’s RedPajama and EleutherAI’s Pythia

RedPajama stands out for its ambition to replicate LLaMA-1 in full, so that a fully open-source version is available. Meanwhile, EleutherAI’s Pythia, a suite of models released to support LLM research, reflects the community’s collaborative spirit.

The Niche of Specialization

The open-source community has been particularly agile, fine-tuning the smaller versions of LLaMA on specialized datasets and applying the results to a wide range of downstream applications. Smaller models have also advanced in their own right: Mistral AI’s 7B model has quickly been recognized as a powerhouse among them.

Fine-tuning for the Future

Parameter-efficient fine-tuning methods such as LoRA (Low-Rank Adaptation), originally developed by Microsoft, let practitioners customize these LLMs for specific use cases, including chat applications, by training a small set of added weights instead of the full model. A prime example of chat-focused adaptation is LMSys’s Vicuna, a LLaMA model fine-tuned on user-shared conversations with ChatGPT.
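
To make the low-rank idea concrete, here is a minimal sketch in plain Python of how a LoRA-style update can be applied to a single weight matrix. It assumes the common formulation in which the frozen weight W is combined with two small trainable matrices, B and A, scaled by alpha / r. The function names, the tiny layer sizes, and the alpha value are illustrative assumptions, not taken from any particular library or from the models discussed above.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    w: frozen base weight, shape (d_out, d_in)
    a: trainable matrix, shape (r, d_in)
    b: trainable matrix, shape (d_out, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update, shape (d_out, d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical toy sizes; real layers are far larger.
d_out, d_in, rank = 6, 8, 2
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable
B = [[0.0] * rank for _ in range(d_out)]                              # trainable, zero-initialized

full_params = d_out * d_in            # values updated by full fine-tuning
lora_params = rank * (d_in + d_out)   # values updated by the adapter

W_eff = lora_effective_weight(W, A, B, alpha=16)
print(full_params, lora_params)
```

Because B starts at zero in this sketch, the effective weight equals the original W before any training, so the adapted model begins from the base model's behavior. Only A and B would be updated during fine-tuning, and the gap between the two parameter counts widens as the layer dimensions grow relative to the rank.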

A Collaborative Ecosystem

This blossoming of LLMs is more than a race; it is a collective effort toward a future in which AI can be used more openly and efficiently. As new models emerge and existing ones are refined, collaboration and openness are setting a precedent for innovation and accessibility.

Stay with us as we continue to explore the evolving landscape of LLMs, where the fusion of competition and cooperation fosters an environment ripe for breakthroughs and widespread AI advancement.

Scott Felten