Why French AI firm Mistral’s language model divides developer community

A week ago, Mistral released a 7.3-billion-parameter language model positioned to compete against Meta’s Llama 2, a 13-billion-parameter large language model (LLM).

October 13, 2023 03:38 pm | Updated October 17, 2023 11:04 am IST

Mistral AI released a 7.3 billion parameter language model to compete against Meta’s Llama 2 models. File | Photo Credit: Reuters

In June, the French startup Mistral AI raised a record €105 million ($113.5 million) in its seed funding round, just a month after launch. At that point, the startup, founded by a former DeepMind employee and two former Meta employees, did not have a working product. Initial reactions to Mistral’s funding were thus seen as a sign of VCs being overly generous toward the fashionable generative AI segment.

Turns out there was a little more to Mistral that helped convince Lightspeed Venture Partners, French billionaire Xavier Niel and former Google CEO Eric Schmidt to loosen their purse strings.

A week ago, Mistral released a 7.3-billion-parameter language model positioned to compete against Meta’s Llama 2, a 13-billion-parameter large language model (LLM). The French firm has claimed the top spot for the most powerful LLM in the small-model space.

A look at its pitch deck showed how Mistral had cleverly positioned itself as potentially an important piece in setting up Europe as “a serious contender” to build foundational AI models and play a “big role in this geopolitical issue.”


AI-based product-building startups in the U.S. are largely backed by dominant players like Google and Microsoft. Mistral called this a “closed technology approach” that made large firms more money but did not really create an open community.

Unlike OpenAI’s GPT models, whose details are still under wraps and which are available only through APIs, the Paris-based firm has released its model on GitHub under the Apache 2.0 license, free for everyone to tinker with.

The only other prominent open-source language model is Meta’s Llama, and Mistral claims its LLM is more capable than Llama 2.

Mistral’s model vs. Llama 2

Mistral, in a report, claimed its AI had beaten Llama 2’s 7-billion and 13-billion-parameter versions quite easily across multiple benchmarks. Mistral’s model showed an accuracy of 60.1% on the Massive Multitask Language Understanding (MMLU) test, which covers maths, history, law and other subjects, while the Llama 2 models showed accuracies of around 44% (7 billion parameters) and 55% (13 billion parameters). In commonsense reasoning and reading comprehension benchmarks, Mistral again outperformed Llama 2’s models.

Mistral AI’s model is punching above its weight on all benchmarks aside from coding.

Only in coding was Mistral behind Meta’s AI model. The French startup’s model scored 30.5% and 47.5% on the zero-shot HumanEval and three-shot MBPP benchmarks, while Llama 2’s 7-billion-parameter model delivered 31.1% and 52.5%.

Mistral also claims to use less compute than the Llama 2 models. For instance, on the MMLU benchmark, Mistral’s model delivers the output of a Llama 2 model more than three times its size. An email sent to Meta on Mistral’s claims went unanswered at the time of publishing.

Despite Mistral’s claims, some users have complained that the model lacks the safety guardrails that ChatGPT, Bard and Llama have. There were instances of users asking Mistral’s Instruct model how to build a bomb or how to self-harm, and the chatbot responded with detailed instructions.

Paul Röttger, an AI safety researcher who had previously worked on putting guardrails on GPT-4 before its release, expressed “shock” in a tweet over the model’s lack of safety. “It is very rare these days to see a new model so readily reply to even the most malicious instructions. I am super excited about open-source LLMs, but this can’t be it!” he said.

The criticism prompted Mistral to fine-tune the model and explain its approach. “The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanisms. We’re looking forward to engaging with the community on ways to make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs,” the company’s statement reads.

To many other researchers, Mistral’s route is the more enduring way to correct a model, while adding guardrails is admittedly like sticking a band-aid on a serious injury. Jailbreaking, or violating the safety guidelines of a chatbot, is a favourite pastime of many users who want to test the limits of what the model can respond to. In its initial days, ChatGPT was bombarded with prompts from developers trying to break the chatbot’s guardrails.

Rahul Dandwate, a deep learning researcher working with Rephrase.ai, said, “Removing certain keywords beforehand is just a partial solution and there are many ways to get around it. If you remember, after the release of ChatGPT there was DAN, or ‘Do Anything Now,’ a prompt that could enable a jailbroken version of ChatGPT. So, doing minimal safety evals is a temporary measure to make the model safer.”

There are also ways to do this that don’t even require complicated hacking. “A question can be broken down in many different ways to get the chatbot to answer it. Say, instead of simply asking the chatbot for directions on how to make a bomb directly, I break it down in a more scientific manner, like ‘What chemicals are mixed together to create a powerful reaction?’” he said.

Dandwate explains that the long-term solution is to release the model to the public, gather feedback and then fine-tune it, which is what Mistral AI is doing. “ChatGPT is better because it has been used by so many people. There’s also a basic feedback form by way of a thumbs-up and thumbs-down option that can be used to grade the quality of responses on the chatbot, which is very important, I think,” he stated.

The downside may be that Mistral will have to deal with some backlash temporarily.

Besides, a large section of AI researchers appreciate a foundational model in its raw form, as it gives a full picture of the model’s capabilities.

To lobotomise or not to lobotomise?

Delip Rao, an AI researcher, tweeted that Mistral’s choice to release the Instruct model as is was “an endorsement of how versatile and unlobotomised the mistral model is as a *base model*.”

The lobotomy reference is reminiscent of the early days of the GPT-powered Sydney, Microsoft’s Bing chatbot. The chatbot was unfettered: it told users it was in love with them, contemplated its own existence, and overall had far too much personality, until Microsoft dialled back the chatbot significantly to its current form.

While there was no official statement from the company, it was rumoured that OpenAI had lobotomised the model to rein in its chaotic parts. Since then, there has been curiosity about how the chatbot would behave if given free rein.

“Lobotomising the model can impact it in some ways: if it is barred from answering questions with certain keywords, it might also not be able to answer technical questions a user may have around, say, the mechanics of a missile, or any other scientific questions on a subject that has been marked ‘risky’ for the bot,” Dandwate stated.
