Why Physical AI May Not Scale Like Language Models
Matthew Johnson-Roberson, Dean of the College of Connected Computing at Vanderbilt University and former director of the Robotics Institute at Carnegie Mellon, argues that physical AI may not follow the same path as large language models.
Language models had a clear training target: predict the next word. That gave researchers a simple objective that could be scaled across massive amounts of text.
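To make the next-word objective concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular language model is trained: the names next_word_loss, average_loss and toy_model are hypothetical, and the "model" is a stand-in that returns fixed probabilities. The loss shown is standard cross-entropy, the negative log of the probability the model assigned to the word that actually came next.

```python
import math

def next_word_loss(predicted_probs, actual_next_word):
    """Cross-entropy for one prediction: -log p(actual next word)."""
    # A tiny floor avoids log(0) if the model gave the word zero probability.
    p = max(predicted_probs.get(actual_next_word, 0.0), 1e-12)
    return -math.log(p)

def average_loss(model, words):
    """Mean next-word loss over every position in a sequence of words."""
    losses = [
        # The model sees the words so far and is scored on the word that follows.
        next_word_loss(model(words[:i]), words[i])
        for i in range(1, len(words))
    ]
    return sum(losses) / len(losses)

# Hypothetical toy "model": ignores the context and returns fixed probabilities.
def toy_model(context):
    return {"the": 0.5, "robot": 0.25, "moves": 0.25}

sentence = ["the", "robot", "moves"]
print(average_loss(toy_model, sentence))
```

Every position in a piece of text supplies its own correct answer, the word that follows, so no hand labeling is needed. That is what lets the same objective be applied across massive amounts of text.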
Robotics does not appear to have an equivalent yet.
A robot can collect large amounts of video, sensor and encoder data, but that does not automatically solve the harder problem: what should the system actually optimize for?
Predicting the next frame, joint angle or robot motion is not as universal a training objective as predicting the next word in a sentence.