Microsoft recently received an exclusive license to use OpenAI’s GPT-3 (Generative Pre-trained Transformer) language model in its own products and services. The model uses deep learning to create human-like text in real time.
The third-generation model built by San Francisco-based AI research company OpenAI is available to developers via the OpenAI application programming interface (API), which can be used to develop applications and services.
GPT-3 and its applications
GPT-3 has 175 billion machine learning (ML) parameters, over 100 times more than its predecessor, GPT-2, which had 1.5 billion. The parameters are the values the model adjusts during training, and they determine how it approaches a specific task. Their number also serves as a rough measure of the model's skill and complexity.
The latest generation has been designed to understand inputs in English and to generate output with minimal interactions or adjustments from a user.
To generate text, the model just needs a description of the task in simple English, along with a few examples for it to comprehend and start working.
GPT-3 can write (including long-form generative text), translate, comprehend text, answer closed-book questions, reason through common tasks, and write code.
Training and learning
The examples provided with a task help to "program" the API, though success generally varies with the complexity of the task, according to OpenAI.
The API’s performance can be improved for certain tasks by training the model on a dataset of examples provided by the user, or by having it learn from human feedback.
GPT-3 was trained on an AI supercomputer in Microsoft's cloud using various datasets, which consist of text posted or uploaded on the internet.
The internet data includes a version of the Common Crawl dataset, an expanded version of the WebText dataset, two internet-based book databases, and English-language Wikipedia.
For contextual learning, the model can predict an answer in three settings: with only a description of the task and no examples, with just one example of the task, or with a few examples.
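The three settings described above differ only in how many worked examples are placed in the text given to the model. The following is a minimal sketch of that idea in Python, not OpenAI's own code: the `build_prompt` helper, the "Input:"/"Output:" labels, and the translation examples are all assumptions made for illustration, and no API call is made.

```python
from typing import List, Tuple


def build_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a plain-text prompt: task description, worked examples, new input.

    Hypothetical helper for illustration; the label format is an assumption.
    """
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


TASK = "Translate English to French."
EXAMPLES = [
    ("cheese", "fromage"),
    ("sea otter", "loutre de mer"),
    ("good morning", "bonjour"),
]

zero_shot = build_prompt(TASK, [], "thank you")           # description only
one_shot = build_prompt(TASK, EXAMPLES[:1], "thank you")  # one example
few_shot = build_prompt(TASK, EXAMPLES, "thank you")      # a few examples

if __name__ == "__main__":
    print(few_shot)
```

In each case the resulting string would be sent to the model as ordinary text, and the model's continuation after the final "Output:" would be read as its answer.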
Limitations and possibilities
According to OpenAI, GPT-3 has a tendency to state incorrect information confidently, and it tends to give reasonable output when the inputs resemble those in its training data.
The training data consists mostly of English text, meaning the model is best suited for classifying, searching, summarising, or generating text in that language.
So, when working with inputs in languages other than English, or in dialects of English that are not well represented in the training data, the model’s performance may vary.
The API can be integrated into an existing product, or used to develop an entirely new application. As it provides a general-purpose “text in, text out” interface, users could try it on virtually any English-language task.
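One way to picture a general-purpose “text in, text out” interface is as a single function from a string to a string, with each task expressed purely through the wording of the input. The sketch below is an assumption-laden illustration, not the OpenAI API: `TextModel`, `summarise`, `classify_sentiment`, and the stand-in `echo_model` are hypothetical names, and the stand-in does no language modelling at all.

```python
from typing import Callable

# Hypothetical type: any function that takes text and returns text.
TextModel = Callable[[str], str]


def summarise(model: TextModel, text: str) -> str:
    """Express summarisation as a prompt and pass it through the model."""
    prompt = f"Summarise the following text in one sentence.\n\nText: {text}\nSummary:"
    return model(prompt)


def classify_sentiment(model: TextModel, text: str) -> str:
    """Express classification as a prompt and pass it through the same model."""
    prompt = f"Label the sentiment as Positive or Negative.\n\nText: {text}\nSentiment:"
    return model(prompt)


def echo_model(prompt: str) -> str:
    """Stand-in for a real model: returns the last line of the prompt unchanged."""
    return prompt.splitlines()[-1]


if __name__ == "__main__":
    print(summarise(echo_model, "GPT-3 has 175 billion parameters."))
    print(classify_sentiment(echo_model, "I enjoyed this article."))
```

Swapping `echo_model` for a function that calls a real language model would leave the task functions unchanged, which is the sense in which one interface can serve many tasks.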
OpenAI also expects the model to be used by researchers to better understand the behaviours, capabilities, biases, and constraints of large-scale language models.