IARPA Needs More Training Data for Video Surveillance Algorithms


The data would improve the tech’s ability to link together footage shot across a broad geographic space, allowing it to better track and identify potential targets.

The intelligence community’s research arm wants to train algorithms to track people across sprawling video surveillance networks, and it needs more data to do it.

The Intelligence Advanced Research Projects Activity is recruiting teams to build bigger, better datasets to train computer vision algorithms that would monitor people as they move through urban environments. The training data would improve the tech’s ability to link together footage from a large network of security cameras, allowing it to better track and identify potential targets.

Computer vision is a type of artificial intelligence that allows computers to interpret images and videos. Many law enforcement and public safety organizations already use the tech to investigate crimes, monitor critical infrastructure and secure major events that could be targets for terrorists. An early version of the tech was used to identify the perpetrators of the Boston Marathon bombing in 2013, for instance, and its popularity has only grown in the years since.

But according to IARPA, the data used to train algorithms today is fairly narrow, which limits the tech’s ability to dissect the wide range of situations it would see in the real world. With the new datasets, officials aim to improve the training process and enable computer vision systems to connect footage shot from cameras positioned across a broad geographic area.

“Further research in the area of computer vision within multi-camera video networks may support post-event crime scene reconstruction, protection of critical infrastructure and transportation facilities, military force protection, and in the operations of National Special Security Events,” IARPA officials wrote in the solicitation.

Under the solicitation, selected vendors would compile roughly 960 hours of video footage covering a variety of environments and scenarios.

The dataset must include footage from at least 20 different security cameras with “varying positions, views, resolutions and frame rates” scattered across roughly 2.5 acres of “urban or semi-urban space.” The videos would be shot at all hours of the day and in different weather conditions, and include pedestrians, moving vehicles, street signs and other “distractors.”

The footage must also include at least 200 test subjects behaving in different ways across the camera network. Ultimately, these are the people the algorithms would focus on to sharpen their identification and tracking skills.
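
For readers who want the headline numbers in one place, the minimums described above can be written as a simple checklist. The Python sketch below is illustrative only: the `DatasetSummary` structure and the `unmet_requirements` function are hypothetical names, not anything from the solicitation, and it assumes the roughly 960 hours refers to total footage across all cameras.

```python
from dataclasses import dataclass

# Thresholds taken from the figures reported in this article. The solicitation
# says "roughly" 960 hours, so treat that number as approximate.
MIN_HOURS = 960      # assumed to mean total footage across all cameras
MIN_CAMERAS = 20
MIN_SUBJECTS = 200


@dataclass
class DatasetSummary:
    """Hypothetical summary of a candidate dataset (not an IARPA format)."""
    total_hours: float
    camera_count: int
    subject_count: int


def unmet_requirements(summary: DatasetSummary) -> list[str]:
    """Return a description of each reported minimum the dataset falls short of."""
    problems = []
    if summary.total_hours < MIN_HOURS:
        problems.append(f"footage: {summary.total_hours} of {MIN_HOURS} hours")
    if summary.camera_count < MIN_CAMERAS:
        problems.append(f"cameras: {summary.camera_count} of {MIN_CAMERAS}")
    if summary.subject_count < MIN_SUBJECTS:
        problems.append(f"subjects: {summary.subject_count} of {MIN_SUBJECTS}")
    return problems


if __name__ == "__main__":
    # Made-up example values, chosen only to show a shortfall.
    candidate = DatasetSummary(total_hours=500, camera_count=24, subject_count=150)
    for problem in unmet_requirements(candidate):
        print("Short on", problem)
```

The sketch covers only the three numeric minimums. Requirements such as varied weather, time of day and the presence of “distractors” would need a human review or a richer description of each clip.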

Interested vendors must respond to the solicitation by May 17.