Amid growing fears that the AI boom is turning into a bubble, a widely respected scientist and pioneer of deep learning has now warned of another bubble taking place: the humanoid robot race.
Yann LeCun, the chief AI scientist at Meta, has warned that most robotics companies do not know how to develop the intelligence required to make humanoid robots useful and are focused, instead, on building the hardware.
“There is a large number of robotics companies that have been created over the last few years building humanoid robots. The big secret of the industry is that none of those companies has any idea how to make those robots smart enough to be useful or I should say, smart enough to be generally useful,” LeCun said. He was speaking at the inaugural MIT Generative AI Impact Consortium (MGAIC) Symposium at the Massachusetts Institute of Technology in Cambridge, Massachusetts, United States.
The symposium aims to highlight MIT’s commitment to shaping the generative AI landscape through interdisciplinary research and impactful collaborations with industry.
“We can train those robots for particular tasks, maybe in manufacturing and things like this. But your domestic robot, there is a bunch of breakthroughs that need to arrive in AI before that’s possible,” LeCun added. He further argued that the future of these companies, which have successfully raised billions of dollars in investment, essentially depends on “whether we’re going to make progress, significant progress, towards those kinds of world model planning-type architectures.”
LeCun’s remarks reflect a sobering assessment of several research-level bottlenecks that need to be addressed in order to kick off the decade of robotics. The generative AI race has similarly drawn cautionary comments, with experts pointing out that challenges like continual learning must be addressed to achieve artificial general intelligence, or AGI.
“They don’t have continual learning. You can’t just tell them something and they’ll remember it. They’re cognitively lacking and it’s just not working. It will take about a decade to work through all of those issues,” Andrej Karpathy, an OpenAI co-founder and AI/ML researcher, said on a recent podcast episode that went viral on social media.
Much like AGI, the timeline for rolling out humanoid robots on a commercial scale has become a subject of debate. LeCun believes that present-day large language models are not capable of powering humanoid robots. “First of all, we’re missing something big, that we need AI systems to learn from natural, high-bandwidth sensory data like video. We’re never going to get to human-level intelligence by just training on text,” he said at the MIT event.
“A four-year-old has seen as much data through vision as the biggest LLMs trained on all the publicly available text,” he added. Instead, the 65-year-old French researcher has expressed confidence in something known as a ‘world model’ to make robots smarter.
What is a world model?
A world model is an AI system that can learn from high-bandwidth video and sensory input to build an internal understanding of the physical world.
“Given a representation of the state of the world at time T, and given an action that an agent would imagine taking, can you predict the state of the world resulting from taking this action? That’s a world model,” LeCun said. Highlighting his own research on non-generative, self-supervised architectures like V-JEPA (Video Joint Embedding Predictive Architecture), which are trained to predict what will happen next in a video, LeCun said, “Those systems basically can show that they’ve learned a little bit of common sense.”
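LeCun’s definition can be illustrated with a very small sketch. The grid world, the State class, the ACTIONS table and the plan function below are all hypothetical and hand-written for illustration; a real world model such as V-JEPA is learned from video and is far more complex. The sketch only shows the shape of the idea: a function that takes the state at time T and an imagined action, and returns the predicted next state, which an agent can then use to try out action sequences before acting.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class State:
    """Toy stand-in for 'a representation of the state of the world'."""
    x: int
    y: int


# Hypothetical action set for a toy grid world (not from the article).
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def world_model(state: State, action: str) -> State:
    """Predict the state at time T+1 from the state at time T and an imagined action.

    Here the dynamics are hard-coded; in a real system they would be learned.
    """
    dx, dy = ACTIONS[action]
    return State(state.x + dx, state.y + dy)


def plan(start: State, goal: State, horizon: int = 4):
    """Imagine action sequences of increasing length and return the first
    one whose predicted end state equals the goal, or None if none is found."""
    for length in range(horizon + 1):
        for sequence in product(ACTIONS, repeat=length):
            state = start
            for action in sequence:
                state = world_model(state, action)  # imagined, not executed
            if state == goal:
                return list(sequence)
    return None


if __name__ == "__main__":
    print(world_model(State(0, 0), "right"))
    print(plan(State(0, 0), State(2, 1)))
```

The planner never acts in the world while searching; it only queries the model, which is the role LeCun describes for a world model.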
“If you show them a video where something impossible occurs, like an object spontaneously disappears or changes shape or something, the prediction error goes through the roof. And so they can tell you something really unusual occurred that I don’t understand. That’s a first sign of a self-supervised learning system,” he added. According to LeCun, world models can be used “to get a robot to accomplish a task zero shot. You don’t have to train it to accomplish this task. There’s no training whatsoever. No RL. The training is completely self-supervised.”
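The “prediction error goes through the roof” behaviour can be sketched with a toy example. This is not how V-JEPA works internally (it predicts in a learned representation space from video); the constant-velocity predictor, the one-dimensional positions and the threshold below are assumptions chosen purely to show how a large gap between prediction and observation can flag an implausible event.

```python
def predict_next(prev: float, curr: float) -> float:
    """Toy predictor: assume the object keeps moving at constant velocity."""
    return curr + (curr - prev)


def prediction_errors(positions):
    """Absolute error between predicted and observed position,
    for every frame from the third one onward."""
    errors = []
    for t in range(2, len(positions)):
        predicted = predict_next(positions[t - 2], positions[t - 1])
        errors.append(abs(predicted - positions[t]))
    return errors


def surprising_frames(positions, threshold: float = 1.0):
    """Indices of frames whose prediction error exceeds the threshold.

    The threshold value is a hypothetical choice for this sketch.
    """
    errors = prediction_errors(positions)
    return [t + 2 for t, error in enumerate(errors) if error > threshold]


if __name__ == "__main__":
    plausible = [0.0, 1.0, 2.0, 3.0, 4.0]    # smooth motion
    impossible = [0.0, 1.0, 2.0, 9.0, 10.0]  # object abruptly jumps
    print(surprising_frames(plausible))
    print(surprising_frames(impossible))
```

In the second sequence the frame after the jump is flagged as well, because the simple predictor extrapolates the jump as if it were real motion.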
Who is Yann LeCun?
Known as one of the three godfathers of AI, LeCun is a French computer scientist who has expertise in various fields like machine learning, computational neuroscience, computer vision, and mobile robotics.
LeCun has a PhD in computer science from Sorbonne University. He is also currently a professor at New York University. His work on convolutional networks and deep learning has transformed how machines see and learn, and how they listen and understand the world.
In 2018, LeCun won the Turing Award, often described as the Nobel Prize of computing, along with Geoffrey Hinton and Yoshua Bengio.
