Indy's Weblog
2020 Jul 27

Stark Future with GPT-3

Great Power vs Great Responsibility

OpenAI recently released GPT-3 to the world. This has significant and wide-ranging implications for technology and society in general. It's not the Terminator/SkyNet doom-mongering we should be wary of; the dangers are more subtle than that. This is a short survey of the possible dangers, so we can think about how to mitigate them.

What is GPT

GPT stands for Generative Pre-trained Transformer. Generative means the model produces output: given a sequence of words as input, this language model predicts and generates the next word. Compare this to, say, a classifier model, which outputs a class and a probability.

Pre-trained means you don't have to train the model again from scratch to solve a similar problem. Taking a classifier as an example again, a classifier trained to recognise dog breeds has to be retrained from scratch to recognise cat breeds. With pre-trained models, you can prime the model with a few examples without having to train it again. They are also called few-shot learners, and can be adapted to perform a wide range of Natural Language Processing tasks.
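To make "priming with a few examples" concrete, here is a minimal sketch of how a few-shot prompt is typically assembled: worked examples are simply prepended to the query as text, and the model completes the pattern. The sentiment-labelling task and the helper name are illustrative assumptions, not part of any real API; the actual call to the model is omitted.

```python
# Few-shot priming sketch: no retraining, just examples in the prompt.
def build_few_shot_prompt(examples, query):
    """Format (input, label) pairs, then the new query for the model to complete."""
    lines = []
    for text, label in examples:
        lines.append(f"Text: {text}\nSentiment: {label}")
    # The model is expected to continue after the trailing "Sentiment:".
    lines.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("I loved this film.", "positive"),
    ("What a waste of two hours.", "negative"),
]
prompt = build_few_shot_prompt(examples, "An absolute delight from start to finish.")
print(prompt)
```

The whole "adaptation" is just string construction; the heavy lifting happens inside the pre-trained model that receives this prompt.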

Transformer is a deep learning model which can transform one input sequence into a different output sequence, for example a sequence of English words into a sequence of German words. A significant element of a transformer is a mechanism called attention, which encodes how relevant each word is in its context. This is important because human language is full of long-range references, and the attention mechanism helps resolve these otherwise vague references.
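The attention idea above can be shown with a toy computation: each word's vector is compared against every other word's vector, and a softmax over the similarity scores decides how much each word contributes to the result. Real transformers use learned query/key/value projections over high-dimensional embeddings; the tiny 2-d vectors here are purely to show the mechanism.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query over a list of key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Output is the attention-weighted sum of the value vectors.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = keys
weights, context = attention([1.0, 0.0], keys, values)
print([round(w, 3) for w in weights])  # keys similar to the query get more weight
```

The weights always sum to 1, so attention is a soft selection: the query "looks at" the words most relevant to it, which is exactly what helps with long-range references.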

How is it built

GPT-3 is a very large language model. It consumed almost the entirety of the internet up until December 2019, some 500 billion words, to produce a humongous model with 175 billion parameters. This is the largest language model for English so far. To give a sense of scale, it was estimated that a single training run would cost about 4.6 million US dollars and take 355 GPU-years(1). Normally you need many tens of training runs to produce a good model. What this means is that this type of technology is beyond the reach of many researchers and organisations.

GPT-3 is an engineering marvel more than a scientific leap: the concepts haven't changed since the previous model, GPT-2; only the scale has. Training and producing a model of this scale requires an extremely large amount of engineering effort.


When the GPT-3 API was released to a select number of applicants, many were amazed by how good it is at generating text. People adapted the model to perform a wide variety of interesting and often mind-blowing tasks: generating excellent essays on a given subject, poetry of amazing quality, chats in the likeness of any personality, and software code with surprising accuracy. This generated huge hype, with claims that GPT-3 is almost human-like.


  1. In search

Being a text generator, it does an excellent job of creating very coherent text, but what excited me most is its ability to seek out and provide an exact answer for a natural language query(2). This has the potential to change the search landscape drastically. Normally, Google search gives us a page of links we have to click through to find the answer ourselves. This has become the cornerstone of the advertising-driven search business: the more content people have their eyeballs on, the more opportunities there are to display ads. This in turn fuels content creation, so the ecosystem works. But with a disruptive technology like exact answers, people would spend less time looking for things, which would affect the ad business and, in turn, content creators.
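An "exact answer" interface of the kind described above can be framed as nothing more than a prompt template around the language model. The sketch below is an assumption about how such a wrapper might look; the `complete` function stands in for a real model API call and is purely hypothetical, stubbed here so the example runs on its own.

```python
# Hypothetical exact-answer wrapper around a text-completion model.
def make_qa_prompt(question):
    """Wrap a natural language question in a Q:/A: template for the model to complete."""
    return (
        "Answer the question as concisely as possible.\n\n"
        f"Q: {question}\nA:"
    )

def answer(question, complete):
    """Return the model's completion as the exact answer."""
    return complete(make_qa_prompt(question)).strip()

# Stub completion for demonstration; a real deployment would call the
# model's API here instead of returning a canned string.
fake_complete = lambda prompt: " Paris"
print(answer("What is the capital of France?", fake_complete))  # Paris
```

The point is that the disruption comes from the model, not the plumbing: a page of ten links collapses into one returned string.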

  2. Code synthesis

Code generation demos were plentiful a couple of weeks back. The ability to generate code, or even functioning designs, from natural language seems like magic. The model has learnt associations between natural language descriptions and the code to generate, mostly because the training data contains a large number of code examples that do pretty much the same thing. The fact that this can be automated means most of the low-hanging fruit, like creating generic apps and designs, will be better automated, and that has been a long time coming. Developer tools must improve so that productivity increases: if you have to fight with a tool or a framework rather than solve the business problem you set out to solve, that tool is ripe for replacement. It is a welcome change that coding will be simplified, freeing developers to focus on more interesting and difficult tasks. Of course, this also means most repetitive jobs like building generic apps will be reduced or rendered obsolete.

  3. Impersonation

Chatbots backed by GPT-3 would be able to recreate a style of writing indistinguishable from the original personality; there were many examples of conversations with celebrities like Elon Musk or Kanye(4). This means it would be easier for bad actors to exploit the technology to harmful ends. We already have deep fakes, videos of people that require enormous effort to combat; this would make the situation even worse. We would have to come up with another set of tools, for example a decentralised, verifiable record of public statements.

  4. Illusion of knowledge

Being a word generator, GPT doesn't have inherent knowledge of a subject. It also doesn't have an opinion or a message to deliver. From the examples presented to it, it will interpolate coherent pieces of text up to a certain length; the coherence tends to degrade the longer it has to generate, though this length will improve with later models. What won't improve is having actual knowledge: that would be a quantum leap, getting very close to AGI.

With GPT-3 it has become much easier to convince a reader that generated text puts forward a cogent argument. This means it has become easier to mislead even a person of significantly above-average intelligence, and harder to detect misinformation without deep knowledge of the subject(5). The ease and efficiency of generating disinformation would overpower any means of combatting it. This is already dangerous, as we are drowning in subversion through disinformation.

What could possibly fight this type of adversity is very debatable. Disinformation generated at massive scale would be extremely difficult to fend off. In the shorter term, society would be misguided and could be divided by it; in the longer term, society would lose trust in technology. OpenAI is doing a lot to prevent this from happening, but if a state actor with similar or greater resources were to recreate GPT-3-like technology, which is not inconceivable, fighting such adversity would be extremely difficult.

Optimism over despair

One of the great things about AI is that it brings us closer to understanding ourselves: our speech, our means of communication, our cognition. We may be surprised to find that most of our waking life is a product of quasi-random generation of thoughts and words. Maybe all of our reality is an illusion that evolved for survival. What this technology does is lift the curtain to reveal our machinations.

Despite all this, I am optimistic about what it can bring to the betterment of society. Every large technology leap brings its own set of problems to solve, but that's how we thrive: solving one thing and then the next.

Disclaimer: This is a follow-up to my previous post, Simulating Intelligence with GPT-3, and has overlapping ideas and phrases.

Update Log

2020 Aug 03 - Include references


[1] OpenAI's GPT-3 Language Model: A Technical Overview, Chuan Li, Chief Science Officer at Lambda Labs

[2] Fully functioning search engine on top of GPT-3, Paras Chopra (@paraschopra)

[3] Curated list of GPT-3 based applications, Aditya Joshi (@1adityajoshi)

[4] Text completion and the combination of style rewriting and text completion, Carlos E. Perez (@IntuitMachine)

[5] "We can now automate the production of passable text on basically any topic", Julian Togelius (@togelius), AI and games researcher, Associate Professor at NYU

Attribution: Photo by Jeremy Lishner on Unsplash