Monday, June 2, 2014 - 4:00pm to 4:50pm
KEC 1001

Speaker Information

Rob Hess
Software Development Engineer
Flickr Vision group


With the ubiquity of high-quality cell-phone cameras, nearly everyone can now take good photographs.  And they do.  Lots of them.  This trend makes the need for effective photo organization and retrieval systems increasingly apparent, while at the same time making it increasingly unlikely that photographers will manually organize or annotate the tens, hundreds, or even thousands of photos they each take every day.

In this talk, I will describe the large-scale computer vision system we've developed at Flickr to help our users more easily find the photos they're looking for without relying on manual organization or annotations.  This system, called Autotags, is based on features computed using deep convolutional neural nets and can currently recognize about a thousand things in images.  Using large-scale distributed computing technologies like Storm and Hadoop, Autotags can process the 10 million daily Flickr uploads in real time, and it can process the entire Flickr corpus of 10 billion images in a few days.  The tags applied to photos by Autotags have made possible vast improvements in the quality of Flickr's image search and have opened the door to many other novel ways to organize photos and engage users.

Speaker Bio

Rob Hess is a Software Development Engineer in the Flickr Vision group, where he works on applying large-scale distributed computing technologies to problems in computer vision.  He received a PhD in Computer Science in 2012 from Oregon State University, where he studied computer vision and its application to American football under Alan Fern.  Before joining Flickr, he worked as a Research Scientist at IQ Engines, a computer vision startup focused on image retrieval and personal photo organization that was acquired by Yahoo in 2013.