In Three Words, ‘Deep Learning Worked’

OpenAI chief Sam Altman recently published a blog post titled “The Intelligence Age,” in which he says, “It is possible that we will have superintelligence in a few thousand days (!); it may take longer, but I’m confident we’ll get there.”

Altman credited ‘deep learning’ as the driving force behind AI’s rapid progress, saying that humanity has discovered an algorithm capable of learning from massive datasets with increasing precision. He said, “In three words: deep learning worked.”

“In 15 words: deep learning worked, got predictably better with scale, and we dedicated increasing resources to it,” he quipped. He believes AI can help solve complex problems in areas such as climate change, space colonisation, and fundamental physics.

“That’s really it; humanity discovered an algorithm that could really, truly learn any distribution of data (or really, the underlying “rules” that produce any distribution of data). To a shocking degree of precision, the more compute and data available, the better it gets at helping people solve hard problems,” said Altman.

While speaking at Cypher 2024, Vignesh Subrahmaniam, group manager of data science at Intuit, discussed deep learning and generative AI, expanding on Altman’s thoughts.

Subrahmaniam began by quoting Scottish philosopher David Hume: “By this means, all knowledge degenerates into probability”. This observation, made nearly 300 years ago, captures the essence of how generative AI functions, as it relies on probability to generate new data based on existing data.

At its core, generative AI learns the probability laws that govern data generation, simulating data in a way that mimics human patterns. Subrahmaniam explained this concept by asking the audience how they would simulate new data points based on a provided dataset. The key, he revealed, lies in understanding the underlying probability distribution of the original data.

“Generative AI essentially learns the probability laws that drive human or natural data,” he explained. For instance, it can generate new images of cats from an existing dataset or compose new music based on a sample. “By providing the text of a song or a picture, generative AI can create something new that aligns with human expectations,” Subrahmaniam added.
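
To make that two-step recipe concrete, consider a minimal sketch, assuming a simple Gaussian model in Python: estimate the probability distribution behind some observed data, then sample from it to “generate” new data. Real generative models learn far richer distributions over images, audio, and text, but the principle is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Existing data": 1,000 observations from some unknown process.
data = rng.normal(loc=5.0, scale=2.0, size=1000)

# Step 1: learn the probability law behind the data. Here we simply
# assume the data is Gaussian and estimate its two parameters.
mu, sigma = data.mean(), data.std()

# Step 2: generate new data points by sampling the learned distribution.
new_points = rng.normal(loc=mu, scale=sigma, size=10)
print(f"learned mu={mu:.2f}, sigma={sigma:.2f}")
print("simulated points:", np.round(new_points, 2))
```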

The Pillars of Machine Learning 

Subrahmaniam explained that the three pillars forming the foundation of machine learning are gradient descent, automatic differentiation, and stochastic approximation.

He said that gradient descent, formalised in Larry Armijo’s 1966 paper “Minimization of Functions Having Lipschitz Continuous First Partial Derivatives”, remains a key optimisation technique in machine learning models today. “It’s the basis for minimising functions with continuous derivatives,” he explained.
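
As a sketch of how that works in practice (the quadratic objective and constants below are illustrative assumptions, not taken from Armijo’s paper or the talk), gradient descent with an Armijo-style backtracking line search repeatedly steps against the gradient, shrinking the step until it yields a sufficient decrease:

```python
import numpy as np

def f(x):
    """Toy objective: a simple convex quadratic, minimised at (3, -1)."""
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2

def grad_f(x):
    """Analytic gradient of f."""
    return np.array([2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)])

def gradient_descent(x, steps=100, alpha0=1.0, beta=0.5, c=1e-4):
    for _ in range(steps):
        g = grad_f(x)
        alpha = alpha0
        # Armijo condition: shrink the step until it produces a
        # "sufficient decrease" proportional to the squared gradient norm.
        while f(x - alpha * g) > f(x) - c * alpha * g @ g:
            alpha *= beta
        x = x - alpha * g
    return x

print(gradient_descent(np.array([0.0, 0.0])))  # approaches [3, -1]
```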

Automatic differentiation, a technique that allows the exact calculation of derivatives of complex functions, lets researchers improve models without suffering from approximation errors.
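
A minimal way to see why automatic differentiation is exact rather than approximate is forward mode with dual numbers. The toy class below is a sketch of that idea only; production frameworks such as PyTorch or JAX use far more sophisticated machinery, typically reverse mode.

```python
import math

class Dual:
    """Dual number a + b*eps with eps**2 = 0; b carries the exact derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # The product rule falls out of (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def sin(x):
    # Chain rule applied alongside the primitive operation.
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x * sin(x)] at x = 2.0, exact to machine precision:
x = Dual(2.0, 1.0)   # seed the derivative with 1
y = x * sin(x)
print(y.val, y.dot)  # value, and exact derivative sin(2) + 2*cos(2)
```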

Finally, stochastic approximation enables large-scale parallel computation of these derivatives across millions of processors, allowing models to scale to new heights.

“That is the reason why GPUs are so powerful. The technology we have enables the parallelisation of the computation of these derivatives at scale, through multiple simulations. That is essentially why we can ‘learn’ extraordinarily complex functions at scale,” said Subrahmaniam.
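
The sketch below shows stochastic approximation in its most familiar form, minibatch stochastic gradient descent on a made-up linear-regression problem (the data and hyperparameters are our assumptions for illustration). Each update uses a noisy but unbiased gradient estimated from a small random batch, and it is exactly this kind of repeated, independent arithmetic that GPUs parallelise at scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + 1 plus noise.
X = rng.uniform(-1, 1, size=(10_000, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.1, size=10_000)

w, b, lr, batch = 0.0, 0.0, 0.1, 64
for step in range(2000):
    idx = rng.integers(0, len(X), size=batch)  # random minibatch
    xb, yb = X[idx, 0], y[idx]
    err = (w * xb + b) - yb
    # Noisy but unbiased gradient estimates of the squared-error loss.
    w -= lr * 2.0 * np.mean(err * xb)
    b -= lr * 2.0 * np.mean(err)

print(f"w = {w:.2f}, b = {b:.2f}")  # should approach 2 and 1
```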

What’s Next? 

One of the most exciting developments in AI today, according to Subrahmaniam, is the rise of language models. “Human language is essentially a sequence of tokens, and language models assign probabilities to sentences or sequences of words,” he explained. 

These models, such as those used for text generation, have changed the way we interact with AI. By generating the next word in a sentence or filling in missing words, AI systems can now produce coherent, human-like text. 
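
As a drastically simplified illustration of that token-by-token view (a toy example of ours, not one from the talk), the bigram model below counts which word follows which in a tiny corpus and then generates text by repeatedly sampling the next word. Modern language models replace the counting with deep neural networks trained on vast corpora, but the probabilistic framing is the same.

```python
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word (bigram counts).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(word):
    """Sample the next token in proportion to observed bigram frequency."""
    words, weights = zip(*follows[word].items())
    return random.choices(words, weights=weights)[0]

# Generate a short continuation, one token at a time.
random.seed(0)
out = ["the"]
for _ in range(6):
    out.append(next_word(out[-1]))
print(" ".join(out))
```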

However, Subrahmaniam believes that to reach ‘Superintelligence,’ we need a fundamentally different mathematical model that relates to human reasoning. “We don’t have that yet. Right now, it’s generated. So it’s called generative AI. It’s not called reasoning AI,” he concluded.
