Facebook Uses Billions Of Instagram Images To Train AI Algorithms

Aiming to better understand objects in images, Facebook announced yesterday, May 2nd, at its annual F8 developer conference that it trains its artificial intelligence on billions of public Instagram images.

Facebook says the approach, which culls images tagged with publicly available hashtags, is a way to collect billions of pictures and train software on them without the need for human workers to laboriously analyze and annotate the data.

While other image recognition benchmarks may rely on millions of photos that human beings have pored through and annotated personally, Facebook had to find methods it could apply at scale to clean up what users had submitted. The largest of the tests used 3.5 billion Instagram images spanning 17,000 hashtags, a volume that even Facebook doesn’t have the resources to supervise carefully.

The result is a training system that produced algorithms which, Facebook says, beat top-of-the-line industry benchmarks.

Facebook explained how it used billions of public Instagram photos, which users had annotated with hashtags, to train its image recognition models.
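To make the idea of hashtags standing in for human labels concrete, here is a minimal sketch in Python. It is not Facebook's pipeline: the function names, the sample captions, the minimum-count filter, and the choice to split the label weight evenly across an image's hashtags are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches "#word" and captures the word part (illustrative pattern).
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return lowercase hashtags, de-duplicated, in order of first appearance."""
    seen = []
    for tag in HASHTAG_RE.findall(caption.lower()):
        if tag not in seen:
            seen.append(tag)
    return seen


def build_vocabulary(captions, min_count=2):
    """Keep hashtags used in at least `min_count` captions; map each to an index."""
    counts = Counter(tag for c in captions for tag in extract_hashtags(c))
    kept = sorted(tag for tag, n in counts.items() if n >= min_count)
    return {tag: i for i, tag in enumerate(kept)}


def caption_to_target(caption, vocab):
    """Build a label vector: each in-vocabulary hashtag gets weight 1/k (assumed scheme)."""
    tags = [t for t in extract_hashtags(caption) if t in vocab]
    target = [0.0] * len(vocab)
    for t in tags:
        target[vocab[t]] = 1.0 / len(tags)
    return target


# Hypothetical captions standing in for public posts.
captions = [
    "Morning walk #Dog #park",
    "#dog #DOG nap time",
    "Sunset at the #park",
    "#selfie",
]
vocab = build_vocabulary(captions)
targets = [caption_to_target(c, vocab) for c in captions]
```

In this toy run, rare hashtags fall out of the vocabulary, and a photo tagged with two surviving hashtags splits its label weight between them. A real system would feed such targets, alongside the image pixels, into a deep learning model.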

The company relied on hundreds of graphics processing units (GPUs) running around the clock to analyze the data, and was ultimately left with deep learning models that beat industry benchmarks, the best of which achieved 85.4 percent accuracy on ImageNet.

The models trained on this data will be broadly useful to Facebook, but image recognition could also bring users better search and accessibility tools, as well as strengthen Facebook’s efforts to combat abuse on its platform.

Facebook is only extracting object-based data at the moment, and it’s not necessarily trying to draw inferences about user behavior from the contents of photos. But as we know from Facebook’s facial recognition system, which automatically tags photos, the company does see value in being able to understand who users are with and where they are in the world.

Facebook is building these AI systems primarily to help it scale its moderation efforts. In addition to hiring 20,000 new human moderators for its platform, Facebook is increasingly looking to automation as it grapples with Russian election interference, the Cambridge Analytica data privacy scandal, and other hard questions about how to moderate the content on its platform and keep bad actors from abusing its tools.

Facebook’s Chief Technology Officer, Mike Schroepfer, said onstage at F8, “Until very recently we often had to rely on reactive reports. We had to wait for something bad to be spotted by someone and do something about it.”

Schroepfer added that the bulk of the moderation is being handled by AI, which is helping the company screen for and scrub its platform of terrorist propaganda, nudity, violence, spam, and hate speech. “This is why we are so focused on core AI research. We require new breakthroughs, and we require new technologies to solve problems all of us want to solve.”