AI EDUCATION: What Is a LLaMA?


Each week we find a new topic for our readers to learn about in our AI Education column. 

What do llamas have to do with artificial intelligence? We usually associate them with inside jokes in video games and on old Monty Python’s Flying Circus reruns. 

As it turns out, there’s a host of powerful, open-source AI models that have adopted llama as their name. Welcome to another edition of AI Education, where we’ll be discussing Meta’s LLaMA family of artificial intelligence models. 

First launched publicly in 2023, the LLaMA family of models, or the LLaMAs for short, are large language models (LLMs), the name being an acronym for Large Language Model Meta AI. They're similar to other open-source large language models, with chatbots being the most common interface, but they also offer multi-modal capabilities. The LLaMAs can be used for translation and multilingual writing, image generation, coding, mathematical calculations, text generation and answering questions. 

Most importantly, as open-source software, they're generally freely available for both academic and commercial uses, which, of course, means they're also widely used and serve as the foundation for other AI technologies in development. While other providers' most powerful models, like OpenAI's GPT-4o and Anthropic's Claude, are proprietary and closed source, LLaMAs are not. 

What Is LLaMA? 

Readers may be asking themselves where they can access these digital llamas—and do they spit? We’ll answer the second question first: of course they don’t spit, they’re software. For most of us, a LLaMA is just another large language model. As to where we can find them, chances are great that we’ve all encountered them already. The AI capabilities on Meta platforms—like Facebook, Oculus, Instagram, Messenger and WhatsApp—are powered by Llama models. Llama is also available as a web application and via API. 

LLaMAs are mostly trained on web content—web pages, archived pages, Wikipedia and freely available public domain publications. They're also trained on the output of previous and contemporary iterations of AI. Otherwise, LLaMAs are trained no differently from any other large language model: by being taught to correctly predict the next word in a sequence of text. 
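To make the "predict the next word" idea concrete, here is a deliberately tiny sketch, not Meta's code or anything like the real architecture. It predicts the next word by simply counting which word follows which in a toy corpus; an LLM learns the same objective, only with a neural network and trillions of words instead of a counter.

```python
from collections import Counter, defaultdict

# Toy training text (hypothetical example, not real training data).
corpus = "the llama eats grass . the llama sleeps . the cat sleeps".split()

# Count which word follows each word in the training text.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequently seen next word, or None if unseen."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "llama": it followed "the" twice, "cat" only once
```

A real model replaces the counting table with billions of learned parameters, which is what lets it generalize to sequences it has never seen.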

This is where we get the generic LLaMAs we're accessing via Meta's web application. The real value of using a LLaMA, however, is in the ability of businesses, institutions and developers to further train, customize and build upon what Meta is offering to create something personalized, unique and new. Because of their wide-open nature, LLaMAs are like AI Swiss Army knives. While in reality the LLaMAs are not quite as powerful as some competitors, they are highly adaptable. 

Not only that, but older versions of the models, like Llama 3, are streamlined enough to run on some personal computers and devices, allowing some developers to work outside of cloud and network environments. LLaMAs can then be trained on whatever data the developers or end users want, to automate and perform nearly any task. Thus, many developers have become dependent on the various Llama models. Perhaps this is why Meta CEO Mark Zuckerberg has been consistent in his commitment to keeping the LLaMAs open. 

The Latest LLaMA 

Llama 4, launched to the public in April of this year, comes in a few different iterations, including Llama 4 Maverick, intended to compete with the latest versions of Google's Gemini and OpenAI's ChatGPT, and Llama 4 Scout, a streamlined version of the model intended for more targeted deployments. Not yet released are Llama 4 Behemoth, a heavier version of Llama 4 that was still in training as of the launch of Scout and Maverick, and Llama 4 Reasoning. 

Scout and Maverick were built by "distillation," the process of transferring knowledge and capability from larger, more sophisticated models to smaller, more efficient ones. Meta distilled Llama 4 Behemoth in its mostly trained state to train Scout and Maverick, which means they were developed to come up with the same responses and answers as Behemoth while using fewer resources (computing power and electricity). 
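The core idea of distillation can be sketched in a few lines. All of the numbers below are hypothetical for illustration; in practice the student model is trained, across an enormous dataset, to make its output probabilities match the teacher's. One common way to measure that match is KL divergence, which is what this sketch computes:

```python
import math

# Hypothetical next-word probabilities from a large "teacher" model
# (Behemoth-like) and two candidate "student" models (Scout-like).
teacher   = {"grass": 0.7, "hay": 0.2, "rocks": 0.1}
student_a = {"grass": 0.6, "hay": 0.3, "rocks": 0.1}  # close to the teacher
student_b = {"grass": 0.1, "hay": 0.2, "rocks": 0.7}  # far from the teacher

def kl_divergence(p, q):
    """KL(p || q): how much distribution q diverges from distribution p."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

# Distillation training repeatedly nudges the student's parameters to
# push this loss toward zero, so it mimics the teacher's answers with
# far fewer parameters.
print(kl_divergence(teacher, student_a))  # small loss: good student
print(kl_divergence(teacher, student_b))  # large loss: poor student
```

The student never sees the teacher's internals, only its output distributions, which is why a much smaller network can absorb much of a larger one's behavior.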

Like some of the models from French competitor Mistral AI, Llama 4 models use a mixture-of-experts architecture: while they have tens or hundreds of billions of parameters available to complete tasks, at any given time they're only using a fraction of them. Llama 4 models split parameters among smaller models called experts, which are active only when being used, reducing energy costs. Behemoth is reported to have trillions of parameters, with a relatively small number active at any given time. 
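A minimal sketch of the routing idea, assuming a toy setup that is not Meta's architecture: a "router" scores every expert for each input but activates only the top-k of them, so the rest of the model's parameters do no work on that input.

```python
calls = []  # records which experts actually ran

def make_expert(idx, scale):
    def expert(x):
        calls.append(idx)  # stand-in for a full neural sub-network firing
        return x * scale
    return expert

experts = [make_expert(i, float(i + 1)) for i in range(4)]
centers = [0.0, 1.0, 2.0, 3.0]  # toy "preferences" a real router would learn

def moe_forward(x, top_k=2):
    # Router: score each expert by how well it matches this input.
    scores = [-abs(x - c) for c in centers]
    chosen = sorted(range(len(experts)),
                    key=lambda i: scores[i], reverse=True)[:top_k]
    # Only the chosen experts compute; the others stay idle.
    return sum(experts[i](x) for i in chosen) / top_k

result = moe_forward(2.2)     # routes to experts 2 and 3 only
print(result, sorted(calls))  # 2 of the 4 experts did any work
```

Scaled up, this is how a model can hold trillions of parameters while spending the compute (and electricity) of a much smaller one on each token.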

Unlike previous versions, the Llama 4 models are too large to be run on home or personal computers—but they are easily run on Azure, AWS or Google Cloud. Meta claims that Llama 4 is capable of outperforming GPT-4o in some reasoning metrics. 

The Uncertain Future of the LLaMAs 

AI models are so new and, until recently, have been operating in such a blue-ocean environment that it was difficult to imagine a major AI model, let alone an entire family of models, having a clouded future—but that may be the case for the LLaMAs. Reports from earlier this summer, including a story in The New York Times, have alluded to high-level conversations at Meta about potentially moving away from the LLaMA family of models. 

Rumor has it that not only is Meta considering looking towards other model providers for building its technology, leaning more towards closed ecosystems and away from the open source movement, but that the company may divest from the LLaMAs entirely. Doing so would further shift Meta from developing its own technology in house and towards forging AI-oriented partnerships or hunting for more acquisitions in the AI space. 

While these conversations are ongoing, the company appears to still be full-speed ahead on LLaMA releases in the near term. The loss of Meta support would deal a serious blow to the future of the LLaMAs, but given their popularity, it's hard to imagine the entire family of models simply becoming defunct. There will probably be some future for the LLaMAs; it's just difficult to say right now what it will look like.