Top Large Language Models that Made Noise this Year

Large language models form the bedrock of AI-powered chatbots, so the AI race now under way is essentially a race to build a better large language model.

December 20, 2023 03:42 pm | Updated 03:42 pm IST

FILE PHOTO: The AI race that has taken over is essentially a race to build a better large language model | Photo Credit: Reuters

Although large language models have been around for a while, the term fully entered mainstream consciousness this year. They form the bedrock of AI-powered chatbots, so the AI race now under way is essentially a race to build a better large language model.

As the name suggests, large language models are advanced artificial intelligence models that perform natural language processing, or NLP, tasks. They ingest large amounts of textual data and can then do a lot with language: they can understand words and respond to questions, translate from one language to another, generate text on a subject they were trained on, analyse a chunk of submitted text, or summarise a wordy document.

But as chatbots themselves evolve, large language models are also becoming multimodal, meaning they are trained on different types of data such as images, text and audio.

Let’s take a look at the most prominent large language models that were released this year:

  • GPT-4: Released in March this year, GPT-4 is the latest AI model from Sam Altman-led OpenAI. The LLM is the current benchmark and the successor to GPT-3, which was widely regarded as the previous benchmark among LLMs. The model can process both text and images. While details of how the model was trained are shrouded in secrecy, a Semafor report stated that GPT-4 had more than a trillion parameters, roughly six times as many as GPT-3's 175 billion. It is well known that the model was fine-tuned using Reinforcement Learning from Human Feedback, or RLHF, in which people compare and rank different AI-generated results from the same prompt. These rankings are then fed back into the model to train it. It is theorised that this is what has helped set GPT-4 apart in quality from other large language models. Compared with other large language models, GPT-4 is regarded as hallucinating the least while also being the most creative. In November, the company released an upgraded version called GPT-4 Turbo, whose training data extends to April of this year. Turbo can handle longer prompts and is cheaper than GPT-4.
  • Google Gemini: A couple of weeks ago, Google finally released its answer to GPT-4, Google Gemini. The large language model differs from its counterparts in that it was trained to be multimodal from scratch. Gemini is able to process text, images, audio and video. Gemini comes in three sizes: Nano, which runs on-device on the Google Pixel 8 Pro; Pro, the mid-tier version and the underlying LLM for Google's chatbot, Bard; and Ultra, the largest and most capable version. While Gemini Ultra will only be released sometime next year, Google has claimed that it surpasses the performance of GPT-4. According to Google, Ultra exceeded the existing state-of-the-art results on 30 of 32 widely used benchmark tests. Once Ultra is out, a new version of Bard called Bard Advanced will be available to the public. Gemini Pro, on the other hand, sits between GPT-4 and GPT-3.5 in terms of performance. Google has also promised the most rigorous safety checks, saying it has employed the “best-in-class adversarial testing techniques” to identify safety issues before Gemini is deployed in the real world.

  • GPT-3.5: When OpenAI casually released ChatGPT in November last year, the LLM underpinning the chatbot was widely known as GPT-3.5. It is effectively a less capable precursor to GPT-4, which OpenAI had described as being “10 times more powerful than the GPT-3.5.” As expected, GPT-3.5 is less sophisticated in more ways than one: it can only handle text, hallucinates more, and can take care of coding tasks and other queries only as long as they are not too complex. GPT-3.5 powers the free version of ChatGPT and does not have access to the internet, unlike ChatGPT Plus, which runs on GPT-4.
  • Llama 2: Instead of releasing an AI chatbot, Meta AI released an LLM package called Llama in February, which researchers could request access to. A few months later, in July, Meta released a second series of models under the Llama 2 family. The LLM became instantly popular in the community because it was free for research and commercial purposes. There were three versions of the model in varying degrees of ability: a 7-billion parameter one, a 13-billion parameter one and a 70-billion parameter one. In terms of performance, Llama 2 trails GPT-4 and even Google’s PaLM 2, and is significantly behind GPT-4 in computer programming. In the weeks following Llama’s release, it became apparent that once developers got their hands on the base model, they could easily optimise it into a more powerful and personalised model comparable to those built behind closed doors at tech giants.
  • PaLM 2: At the Google I/O developer conference held in May, Google launched PaLM 2, or Pathways Language Model 2, to rival GPT-4. Before the release of Gemini, PaLM 2 was Google’s most powerful LLM. Google has not disclosed the model's size, but for comparison, PaLM 1 is a 540-billion parameter model. The model boasts improved reasoning capabilities and was trained on an expansive dataset covering more than 100 languages. The earlier version of Bard was based on PaLM 2.
  • Claude 2: Founded by siblings and former OpenAI employees Dario and Daniela Amodei, Anthropic was formed to develop powerful AI systems that take safety seriously. Released in July this year, its latest LLM, Claude 2, impressed developers because of its huge context length, i.e., the amount of text a model can take into account when coming up with an output. In the case of Claude 2, an entire book could be fed into the model. Later, in November, a new version called Claude 2.1 was released with an even longer context length, which surpassed that of GPT-4. It could also make use of external tools and APIs, such as calculators or databases.
  • Mistral 7B: Mistral AI, a relatively unknown Paris-based startup, has taken a different route than usual. Instead of building a massive LLM to compete with the likes of GPT-4 or Gemini, it builds niftier LLMs. In September, the startup released Mistral 7B and made it freely available to developers. A week ago, the startup quietly dropped a torrent link to another new LLM called Mixtral 8x7B. Described as a “scaled-down” version of GPT-4, the model is again free to use. Like Meta AI, Mistral is a valuable contributor to the open-source movement, and it has set itself in direct competition with Meta AI’s Llama 2 models. (Members of the team behind Mistral had previously worked at Meta AI.)
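
For readers curious about the human-feedback step mentioned under GPT-4, where people rank several AI-generated answers to the same prompt, here is a minimal Python sketch of how one such ranking could be turned into "preferred versus rejected" pairs for training. It is purely illustrative and is not OpenAI's actual pipeline, which has not been made public. The function name ranking_to_pairs and the sample prompt and answers are hypothetical.

```python
from itertools import combinations


def ranking_to_pairs(prompt, ranked_responses):
    """Turn one human ranking into pairwise preference examples.

    ranked_responses is assumed to be ordered best first, as judged by a
    human rater. Each returned tuple is (prompt, preferred, rejected).
    This is an illustrative assumption, not a documented OpenAI format.
    """
    # combinations() keeps the input order, so the first element of every
    # pair is always the response the rater ranked higher.
    return [
        (prompt, better, worse)
        for better, worse in combinations(ranked_responses, 2)
    ]


# Hypothetical example: a rater ranked three answers to the same prompt.
pairs = ranking_to_pairs(
    "Explain what a large language model is.",
    ["clear and accurate answer", "vague answer", "incorrect answer"],
)
for prompt, preferred, rejected in pairs:
    print(f"preferred: {preferred!r} over {rejected!r}")
```

A ranking of three answers yields three such pairs. In RLHF-style training as it is commonly described, pairs like these are typically used to train a separate scoring model, which is then used to steer the language model towards answers people prefer.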