The story so far: Large tech firms are competing to develop and implement AI-based language models. Google is developing a model that can support the 1,000 most spoken languages of the world, the company revealed at its AI event this month. To investigate the potential of such a vast project, Google is also working on an AI language model which can support around 400 languages. The company claimed this model had the “largest language coverage” among existing speech models.
However, bringing these models into daily use raises questions about their intended use cases and underlying goals.
What is the purpose of the new language model?
Google doesn’t have a specific use case for the language model. However, the end goal is to enable Google users to experience better searches, more accurate auto-generated captions, natural online translation, and faster calculations, according to a report by The Verge, a tech outlet.
The project is under development and researchers are now collecting linguistic data to train the model.
How is this initiative different from Google’s other models?
While many other existing AI language models are deployed for business or research, Google’s 1,000 languages initiative aims to improve AI language models as a whole for diverse use cases.
Google’s plan is to build one gigantic model for the 1,000 languages so that both widely used and rarer languages can co-exist, interact, and grow together.
What are AI language models used for?
Through AI language models, companies aim to automate manual processes, generate new insights based on existing data, and reduce reliance on human labour in fields like translation, customer service, or computation.
Today, website chatbots offer round-the-clock help, services like Google Translate instantly decode foreign languages, and AI-based art generators let non-artists create prize-winning digital paintings, while the same underlying technology powers price predictors for investors.
These services all depend on the deep learning that AI-based models undergo. Big Tech companies, with massive volumes of user data and content on their servers, are in the perfect position to use these resources to train such models. APIs then let other companies, clients, and customers integrate these models and adapt them for their own purposes.
What other language models exist today?
AI research firm OpenAI built GPT-3 (Generative Pre-trained Transformer 3), a set of models named Davinci, Curie, Babbage, and Ada that can generate “natural” text responses and perform tasks such as classification, simple summarisation, address correction, and answering questions.
Meta is also working on AI-based language translation. Facebook AI claims the M2M-100 model to be the first multilingual translation model that does not use English as the default language when it translates directly between 100 languages. It is also open source.
The Facebook parent is also focusing on AI-based translation not just for text but for primarily oral languages like Hokkien. However, it has not revealed a timeline for its AI language models and projects.
Google is also looking to collect data for languages that are widely spoken but do not have a strong online presence.