Elon, GPT-3, And the A.I. Bonanza

What do Neuralink and OpenAI have in common? They are both on a quest to synthesize intelligence, one through a wet lab approach and the other through a dry lab approach. Neuralink [wet lab] is trying to interact with the brain and learn straight from the source through implantable brain-machine interfaces. OpenAI [dry lab] is looking to emulate brain functions in silicon. The OpenAI researchers want to democratize access to safe artificial intelligence. Both companies were co-founded by the real-life version of Tony Stark, the genius inventor and alter ego of the Marvel superhero known as Iron Man. As showcased by SpaceX and Tesla, the visionary Elon Musk keeps guiding us toward a technologically advanced reality.

OpenAI has made the news lately with the release of its GPT-3 model. With this release, OpenAI demonstrates its prowess in building AI with real language skills. The GPT-3 language model was released earlier this month to accolades claiming it can write sequels to "The Lord of the Rings" or even prescribe medicine.


What do we know about GPT-3? 

High-performance computing and big data are the driving forces behind recent advances in AI applied to language understanding. We can now simulate the laws of complex universes using vast amounts of data. With the right deep learning models, we can even train AI that learns to mimic the human-reasoning pathways embedded in that data. This kind of AI that surfaces hidden truths has become a mirror for our society, as it highlights social biases promoted over generations. After all, robots reflect their creators. If we want to birth safe AI, today is our chance to improve, individually and collectively.

The name of the game in AI language research is to train ever-larger language models on ever more data. Language models like GPT-3 are trained on vast amounts of data: GPT-3 has 175 billion parameters and was trained on hundreds of gigabytes of Internet text. GPT-3 is available as a limited release, and Kevin Lacker was one of the lucky few to get access to the giant machinery. Kevin shared an excellent analysis of where GPT-3 does a good job and where it falls short. GPT-3 often performs like a clever student who hasn't done the reading, trying to bullshit their way through an exam: some well-known facts, some half-truths, and some outright lies strung together in what at first looks like a smooth narrative. You wouldn't want to rely on this particular student to unearth the next string theory.
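If you're curious what those probes look like in practice, here is a minimal sketch of Lacker-style sanity checks, assuming access to the limited-beta OpenAI completion API through its `openai` Python client; the engine name, prompts, and API key are illustrative, not a prescription.

```python
# A minimal sketch of Lacker-style sanity checks against GPT-3.
# Assumes limited-beta access to the OpenAI completion API; the
# engine name and prompts are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # granted during the limited release

# One well-known fact and one nonsense probe, in the spirit of Lacker's tests.
prompts = [
    "Q: Who was president of the United States in 1801?\nA:",
    "Q: How many eyes does my foot have?\nA:",
]

for prompt in prompts:
    response = openai.Completion.create(
        engine="davinci",   # the largest GPT-3 engine in the beta
        prompt=prompt,
        max_tokens=16,
        temperature=0.0,    # keep the output as deterministic as possible
    )
    print(prompt.splitlines()[0])
    print("->", response["choices"][0]["text"].strip())
```

The telling part is the second prompt: a model that memorizes rather than reasons will confidently answer the nonsense question instead of pushing back on it.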

Information ≠ Knowledge

Memorizing information does not help you solve complex questions on math tests. Like life, math places you in front of new situations that can only be cracked by fitting entirely new variables into familiar equations. This game of life and math requires reasoning. A system that only memorizes the laws of the universe will fall short when asked to reason. GPT-3 can generate occasionally impressive results from memory, but it stumbles when asked to create logical content.
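To make the distinction concrete, here is a toy contrast between the two regimes; it is purely illustrative and says nothing about how GPT-3 works internally.

```python
# Toy contrast between memorization and reasoning (illustrative only).

# "Information": a lookup table that memorized a handful of answers.
memorized = {"2+2": "4", "3+5": "8", "10+7": "17"}

def recall(question: str) -> str:
    # Can only answer what it has literally seen before.
    return memorized.get(question, "plausible-sounding guess")

def reason(question: str) -> str:
    # "Knowledge": applies the underlying rule to unseen inputs.
    a, b = question.split("+")
    return str(int(a) + int(b))

print(recall("2+2"), reason("2+2"))          # 4 4
print(recall("123+456"), reason("123+456"))  # plausible-sounding guess 579
```

The lookup table shines on what it has seen and bluffs on everything else, which is exactly the exam-room behavior described above.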

Here are some theoretical and practical reasons why GPT-3 falls short of our expectations.

  1. Reasoning can't be an afterthought in the design of AI solutions. Systems like GPT-3 will perform poorly in any domain where past experience is sparse: you can't memorize the unknowns. In the world of Google search, it makes little sense to test one's ability to memorize facts and much more sense to test one's ability to structure and apply knowledge.

  2. Giant AI models have a substantial memory and compute footprint: at 16-bit precision, 175 billion parameters alone occupy roughly 350 GB before a single request is served. Not many individuals or companies can afford to spin up production environments that sustain something on the scale of GPT-3. If OpenAI is keen on democratizing access to AI, this approach will surely not enable most stakeholders to engage with, design, or further contribute to AI.

  3. It is a challenge to find representative training data for specialized domains that are not well indexed on the Internet. If we want AI to speak like doctors, the National Library of Medicine will not suffice as a training ground.

  4. AI performs best when refined for a specific problem space. Humbler language models like BERT already give great results (>90% F1) on knowledge extraction tasks, and it is hard for novel language models to improve on them and bridge the gap from 90% to 100% F1. However, language tasks like summarization have more uncharted territory and could experience a renaissance. GPT-3 could be a powerful tool if applied to the right pain point (see the sketch after this list).
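As a point of comparison for item 4, here is what one of those "humbler" specialized models looks like in practice: a minimal sketch of extractive question answering with a BERT-style model fine-tuned on SQuAD, using the Hugging Face transformers library. The model name and example text are illustrative, not the specific systems benchmarked above.

```python
# A "humble" specialized model at work: extractive question answering
# with a BERT-style model fine-tuned on SQuAD.
# Requires `pip install transformers`; the model name is illustrative.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)

context = (
    "OpenAI released the GPT-3 language model in 2020. The model has "
    "175 billion parameters and was trained on Internet text."
)
result = qa(question="How many parameters does GPT-3 have?", context=context)
print(result["answer"], round(result["score"], 3))
```

A model like this is orders of magnitude smaller than GPT-3, runs on a laptop, and is competitive on the narrow task it was refined for.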

Despite its shortcomings, GPT-3 makes for great press.

AI in industry

I've learned from my experience building AI solutions at small startups and large enterprises that changes to the underlying AI stack can bring diminishing returns; framing the problem better is what yields high-fidelity results. Today, we are committed to developing AI that is hyper-focused on a particular task and vertically integrated into a business solution. Humans do not excel at everything either, but they perform excellently once they specialize.

GPT-3 shouldn't mislead people into believing that language models are capable of understanding or grasping meaning. If general AI is our destination, memorization won't get us there. Language models could be a helpful starting point, but there's a long journey ahead.
