Facebook owner Meta on Friday put its latest AI chatbot to the test. The social media firm released the code, model weights, datasets and model cards to the scientific community to help it understand the model’s potential and limitations. The chatbot is open to users in the U.S. for now.
BlenderBot 3 is said to be capable of searching the web and holding a virtual chat on any topic. The conversational AI model is designed to improve its skills by learning from the feedback it gets through online discussions.
The third iteration of Facebook’s BlenderBot is a 175-billion-parameter model, making it comparable in size to OpenAI’s GPT-3, which was pre-trained with roughly the same number of parameters.
Meta understood early on that language models perform better when they learn from chatting with people: the feedback users provide during conversation helps the model improve its responses.
In the case of BlenderBot 3, Facebook found that its chat AI sometimes mimicked and generated “unsafe, biased or offensive remarks”. The social media firm claimed that despite building safeguards into the model, it “can still make rude or offensive comments, which is why we are collecting feedback that will help make future chatbots better.”
Facebook has built a user feedback mechanism into the bot. Users can click a thumbs-up or thumbs-down icon to indicate whether the model responded correctly or not. The thumbs-down button lets the user explain why they disliked the message: whether it was off-topic, nonsensical, rude, spam-like or something else.
The California-based company first made its conversational AI open source in April 2020, the culmination of years of research in natural language processing (NLP). The first BlenderBot had only 9.4 billion parameters. At that point, Google’s Meena model was the most advanced chatbot. That conversational agent, a 2.6-billion-parameter model, was pre-trained on 8.5 times more data than OpenAI used to train its GPT-2 model.
Two years and three months on, Facebook trails behind Google’s language models. In December, the search giant introduced the Generalist Language Model (GLaM), which was trained on a dataset of 1.6 trillion tokens. Google claims the model’s performance is comparable to GPT-3, with “significantly improved learning efficiency across 29 public NLP benchmarks in seven categories.”
The race for larger language models continues.