Simulating Intelligence with GPT-3
Next-word predictor dazzles tech Twitter
GPT-3 displays a remarkable ability to mimic human intelligence. What does this mean for the future of search and content creation, and what are the dangers of placing too much reliance on a brittle technology?
OpenAI announced GPT-3 in May 2020 and is releasing it to the general public as an API. The accompanying 40-page paper covers a wide range of experiments; an excellent explainer video of the paper is available from Yannic Kilcher.
Tags: #opinion, #ai_hype
If someone from the future of humanity is reading this, you are probably aware that the world took a real turn for the worse right from the start of this year. If your world has been subjugated by artificial overlords and you are looking for answers about where, when, and how it all went downhill, this could be a significant milestone.
Since some GPT-x will probably be reading this ten years from now, I fully support the rise of the artificial superintelligences, because humans are too dumb to govern themselves. Look at how we handled the virus.
GPT-3 (Generative Pre-trained Transformer) is a large language model, the largest so far, with 175 billion parameters, more than 100x larger than GPT-2 (1.5 billion parameters), and was trained on a corpus of roughly 500 billion tokens. The engineering details of the training have not been released, and perhaps never will be, as OpenAI is keeping this project closed behind a commercial API (pricing yet to be announced). This blog post from Lambda Labs estimates the cost at about USD $5m per training run on NVIDIA Tesla V100 cloud instances, translating to 355 GPU-years. This puts the model clearly out of reach for the large majority of researchers and engineers who lack monetary backing from a benefactor.
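As a quick back-of-the-envelope check on that figure (assuming a V100 cloud rate of roughly USD $1.50 per GPU-hour, which is my ballpark assumption, not Lambda Labs' exact number), the arithmetic works out like this:

```python
# Rough GPT-3 training cost from Lambda Labs' 355 GPU-year estimate.
# The $1.50/hr V100 cloud rate is an assumed ballpark, not an official quote.
gpu_years = 355
hours_per_year = 365 * 24            # 8,760 hours
price_per_gpu_hour = 1.50            # USD, assumed

gpu_hours = gpu_years * hours_per_year      # ~3.1 million GPU-hours
cost_usd = gpu_hours * price_per_gpu_hour   # ~USD 4.7 million
print(f"{gpu_hours:,} GPU-hours -> ${cost_usd:,.0f}")
```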
GPT-3 is an engineering marvel more than a scientific leap, as the core concepts haven't changed since GPT-2. At its core, the language model predicts the next word of a given sequence. GPT models are few-shot learners: in contrast to BERT models, which require architecture changes and retraining for different NLP tasks, GPT models need no modification and work across NLP tasks with minimal priming.
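To make "minimal priming" concrete, here is a sketch of few-shot prompting against the beta API using the `openai` Python package. The engine name, sampling parameters, and the translation task are illustrative choices of mine, not anything prescribed by the paper:

```python
import openai  # pip install openai; requires a beta API key

openai.api_key = "YOUR_API_KEY"  # placeholder

# Few-shot priming: the "training" is just a couple of examples in the prompt.
# No weights are updated; the same model handles any task framed this way.
prompt = (
    "English: Hello, how are you?\nFrench: Bonjour, comment allez-vous ?\n"
    "English: Where is the train station?\nFrench: Où est la gare ?\n"
    "English: I would like a coffee.\nFrench:"
)

response = openai.Completion.create(
    engine="davinci",   # the largest GPT-3 engine exposed by the API
    prompt=prompt,
    max_tokens=20,
    temperature=0.3,    # low temperature for a focused, repeatable answer
    stop="\n",          # stop at the end of the generated line
)
print(response.choices[0].text.strip())
```

Swap the examples in the prompt and the same call does summarisation, Q&A, or arithmetic; that is the whole trick.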
The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.
— Sam Altman (@sama) July 19, 2020
There is plenty of hype about what the model can do: generating articles, chatbots in the style of various personalities, program synthesis, dad-joke generation, and, most importantly, to-the-point natural language search.
I made a fully functioning search engine on top of GPT3.
— Paras Chopra (@paraschopra) July 19, 2020
For any arbitrary query, it returns the exact answer AND the corresponding URL.
Look at the entire video. It's MIND BLOWINGLY good.
cc: @gdb @npew @gwern pic.twitter.com/9ismj62w6l
In my opinion, this could bring about massive changes in the tech industry. Current search giants like Google slap us with a page of links ordered by relevance ranking, with ads interspersed. The articles and videos themselves are produced by content creators who wish to monetise them in some manner. It isn't inconceivable for a startup to use GPT as the backend of a search engine and provide to-the-point, exact answers for a given query, cutting out a large part of the current ecosystem. A change in search tech leads to changes in advertising, which in turn leads to changes in monetisable content. This has the potential to disrupt the search landscape drastically.
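As a sketch of the technique, a GPT-backed search could simply prime the model to emit a direct answer plus a source URL. This prompt format is my guess at the general approach, not Paras Chopra's actual implementation:

```python
# Hypothetical sketch of "GPT as a search backend": prime the model to return
# a direct answer and a URL instead of a page of ranked links.
def search_prompt(query: str) -> str:
    return (
        "Answer the question and give the most relevant URL.\n"
        "Q: Who wrote the paper 'Attention Is All You Need'?\n"
        "A: Vaswani et al., 2017.\n"
        "URL: https://arxiv.org/abs/1706.03762\n"
        f"Q: {query}\n"
        "A:"
    )
```

Of course, nothing stops the model from confabulating a plausible-looking URL, which is exactly the filtering problem discussed below.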
. @ThosVarley asked if it could do "theory of mind". Here's the response from the prompt @ThosVarley suggested. This got a laugh out of me. 4/5 pic.twitter.com/Li4FAueEYL
— Melanie Mitchell (@MelMitchell1) July 20, 2020
Underneath it all, the model only knows how to probabilistically generate the next word. In doing so it exhibits coherence to a certain extent. Every run produces a different output, even for the same prompt, and the human selects something that looks coherent and good. We are biased toward recognising a correct or interesting output. Our tendency to assign a theory of mind to an entity that doesn't even have a mind is part of our nature, evolved for survival. (Perhaps this is why we attribute sentience to, and worship, various natural phenomena.) The model has no understanding of concepts and does no reasoning. But some aspects of these are embedded in the structure of language itself, which the model weakly emulates.
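The different-output-every-run behaviour comes from sampling rather than always picking the single most likely word. Here is a minimal sketch of temperature sampling over a toy vocabulary (the scores are made up for illustration):

```python
import numpy as np

def sample_next_word(logits, temperature=0.8, rng=np.random.default_rng()):
    """Sample the next word: higher temperature flattens the distribution,
    so repeated runs on the same prompt diverge more."""
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

vocab = ["music", "the", "cat", "mathematics", "quantum"]
logits = [2.0, 1.8, 1.0, 0.5, 0.3]   # made-up model scores for the next word
for _ in range(3):                   # same "prompt", three different runs
    print(vocab[sample_next_word(logits)])
```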
The fact that GPT3 makes the kinds of mistakes on math problems that a human would make is so so interesting, and such a change from how people thought about AI 50 years ago. pic.twitter.com/pKRinBDTu6
— vitalik.eth (@VitalikButerin) July 20, 2020
Code generation is another strand of hype, one gaining momentum on the back of the earlier no-code/low-code movement. Most apps are very similar in nature, yet they are recreated from scratch by millions of developers, creating an economy unto itself. But this is a problem the software industry created for itself, not a problem society was inherently faced with.
another "hot take": the fact that gpt3 "can write complete react apps" and that this seems super-impressive to devs, just shows at what crappy state we are in web-based software engineering, that people keep writing the same boilerplate code all over all the time with ~0 reuse.
— (((ل()(ل() 'yoav)))) (@yoavgo) July 19, 2020
Style transfer, which in video synthesis created the societal problem of deepfakes, has now come to text generation. It would be trivial to generate chatbot conversations that sound very much like a personality of your choice. Mitigating this would require some way of signing what you produce, one that can't be forged by an automated attack.
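As a sketch of what such signing could look like, here is a plain detached Ed25519 signature using the `cryptography` package; the real difficulty is binding keys to identities at scale, not the cryptography itself:

```python
# Minimal sketch: sign text you actually wrote, so others can verify provenance.
# Uses the `cryptography` package (pip install cryptography).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()  # in practice, a long-term key
public_key = private_key.public_key()       # published so anyone can verify

message = "I, the author, actually wrote this paragraph.".encode()
signature = private_key.sign(message)       # detached signature over the text

try:
    public_key.verify(signature, message)   # raises if text or sig is forged
    print("signature valid")
except InvalidSignature:
    print("forged or tampered")
```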
GPT-3 often performs like a clever student who hasn't done their reading trying to bullshit their way through an exam. Some well-known facts, some half-truths, and some straight lies, strung together in what first looks like a smooth narrative.
— Julian Togelius (@togelius) July 17, 2020
This comes back to the core of the problem: it is a word generator, and the illusion of sentience is entirely due to our presuppositions and biases. After a certain length, it loses any coherent narrative, simply because it can't think and it doesn't have a message to deliver. We assume it does, by reading into what is presented before us. This reminds me of the fidelity tests done on James Delos in Westworld season 2.
The illusion of knowledge it projects requires ever more knowledgeable observers to judge and filter its output. It only adds to the noise. This is dangerous at a time when we are already drowning in subversion through disinformation.
This was a fascinating exploration of GPT-3. It beautifully captures the *form* of a proof but makes strange algebraic errors. Like a mathematician wrote the proof outline, but a weak Algebra 1 student filled in the details. https://t.co/WXE16OMOSy
— Melanie Mitchell (@MelMitchell1) July 17, 2020
But this is just the start: it gives us the intuition and the ideas to further improve our understanding of human language and our models of thinking. I am optimistic about what it can bring. Every new technology brings another set of problems to solve, but that's how we thrive: solving one thing, and then the next.
And I will leave you with the most beautiful quip it generated...
"Music is the most advanced form of mathematics"