
MIT and Google researchers discover how GPT-3 can learn new tasks with minimal data

Scientists from the Massachusetts Institute of Technology (MIT), Google Research, and Stanford University have published a study showing how large language models such as GPT-3 can learn new tasks from only a small number of examples, without requiring any new training data. According to the researchers, these large neural networks can contain smaller, simpler linear models within them. This means the large model can effectively run a simple learning algorithm that trains one of these internal linear models to solve a particular task, using only information already contained in the large model's parameters. If the results of this study are confirmed, it will be an important step toward understanding how in-context learning works.
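To make the hypothesis concrete, here is a minimal sketch of what the researchers suggest happens implicitly inside the large model: given a prompt of example input-output pairs, a small linear model is fit to those examples alone, with no weight updates anywhere, and is then used to answer a query. The explicit least-squares fit below is an illustrative stand-in, not the study's actual code, and all names in it are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Prompt": a handful of in-context demonstrations drawn from an unknown
# linear task the model has never been trained on.
true_w = np.array([2.0, -1.0])
X_examples = rng.normal(size=(5, 2))   # five demonstration inputs
y_examples = X_examples @ true_w       # their labels

# The hypothesized implicit step: fit a small linear model from the
# in-context examples alone -- no gradient updates to any large-model weights.
w_fit, *_ = np.linalg.lstsq(X_examples, y_examples, rcond=None)

# "Query": a new input appended to the prompt, answered with the fitted model.
x_query = np.array([0.5, 1.5])
print("prediction:", x_query @ w_fit)  # ~ 2*0.5 - 1*1.5 = -0.5
print("recovered weights:", w_fit)
```

Under this view, "learning from a few examples" is just the large model simulating such a fit internally as it processes the prompt.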

Related research is being conducted by a team from MIT and the MIT-IBM Watson AI Lab. They have developed a technique that allows a model to quantify its uncertainty more efficiently, using less computing power than existing techniques and no additional data. Because the approach involves no retraining or modification of the model itself, it can be applied in most settings: a simple complementary model is attached to the base machine-learning model and helps it estimate uncertainty.
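The general pattern described, a frozen base model plus a small companion that estimates uncertainty from its outputs, can be sketched as follows. The features, models, and dataset here are illustrative assumptions rather than the authors' actual method: the companion is simply trained to predict whether the base model's answer is correct.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_base, X_aux, y_base, y_aux = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Base model: trained once, then left untouched (no retraining, no changes).
base = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
base.fit(X_base, y_base)

# Companion model: given the base model's class probabilities on held-out
# data, it learns to predict whether the base model's answer was correct.
probs_aux = base.predict_proba(X_aux)
correct = (base.predict(X_aux) == y_aux).astype(int)
companion = LogisticRegression().fit(probs_aux, correct)

# At inference time, the companion turns base-model outputs into an
# uncertainty estimate without touching the base model itself.
probs_new = base.predict_proba(X_aux[:5])
print("P(base model is correct):", companion.predict_proba(probs_new)[:, 1])
```

Because the companion only consumes the base model's outputs, it adds little computational overhead and works with the model as a black box, which is what makes this style of uncertainty estimation broadly applicable.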

