Tech

Scale AI Dives Into Physical AI

ANYbotics AG robotic dog. Image: Creative Commons.

Like the rest of Silicon Valley, Scale AI has quickly become a big-time defense contractor since, well, AI is all the rage in the Pentagon right now. On Wednesday, the AI training data startup—which Meta took a 49 percent stake in for a cool $15B in June—announced its expanded physical AI data engine platform for autonomy and robotics companies. 

Scaling fast: Scale AI, founded in 2016 by Alexandr Wang (no, not the designer, for all the confused fashionistas), has built a bit of an empire in the AI training world. Their platforms are used by some small organizations like OpenAI, Meta, Microsoft, Google, Nvidia, and, of course, the Department of Defense. Sorry, War. Please don’t make us do push-ups. 

Here’s what Scale AI does:

  • They prepare data for AI by collecting, cleaning, labeling, and organizing massive datasets (text, images, video, sensor data) so machine learning models can actually learn from them.
  • They also test AI models by providing tools to evaluate, benchmark, and fine-tune AI systems, including generative AI models.
  • For the defense folks, they’ve won more than a few $100M+ contracts with the Pentagon to help apply and build AI applications for intel analysis, planning, and autonomous systems, including a $250M ceiling BPA with the Joint Artificial Intelligence Center in 2022 and a $99M Army R&D contract in August. Cha-ching. 

Getting physical: Scale AI’s physical AI offerings started out focused on processing millions of hours of sensor training data for autonomous vehicle companies, the company says, and have now expanded to robotics. 

That’s important, because “for robotics, there is no preexisting repository of physical interactions to reference,” the company said in a statement. “Unlike text or images, robotic manipulation data can’t be scraped from the web. It must be collected, one interaction at a time, in the real world.” 

Building the robo-brain: Scale AI says they’ve collected more than 100,000 production hours of data, gathered at their prototyping lab and from contributors worldwide using data collection robots and human demonstrations, to build an annotated physical AI data repository for autonomy and robotics companies. For the dear Tectonic readers building robotic and autonomous systems, they’re pitching to you. 

Now, Scale isn’t the only company building training data for autonomous systems and robotics. Nvidia announced in March that it’s developing the “world’s largest” open-source physical AI dataset for advanced robotics and autonomous vehicle development. Applied Intuition has also carved out a nice niche in synthetic data, simulation, and sensor modeling for autonomous systems. 

Scale AI’s main differentiator is that their physical AI training and curation data isn’t synthetic. It’s collected from real-world physical robots, which the company says makes it higher-fidelity and available at the volume and consistency that massive robotics training datasets require. 

Since the robotics and physical autonomy hype doesn’t look likely to fade anytime soon, the Zuck’s $15B bet on Scale AI could start to pay off with a few more bigly Pentagon contracts and defense tech customers pushing to make all the unmanned systems the military has heart-eyes for.