
OPLIN 4cast #414: Image search research

Posted in 4cast

How many times have you had a library patron say, “I once read a really good book, it had a red cover with a bicycle on the front [or some other cover description] — can you find that for me again?” Requests like that, which essentially ask you to find a described image (the book cover), don’t just happen in libraries anymore. As the content of the Internet continues to shift from text to graphics, accurately searching for images based on a general description becomes increasingly important to some of the biggest Internet companies. In the past couple of weeks, researchers at both Google and Yahoo (owner of Flickr) have posted some interesting news about their recent work to improve image searching.

  • A picture is worth a thousand (coherent) words: Building a natural description of images (Google Research Blog | Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan)  “But accurately describing a complex scene requires a deeper representation of what’s going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing to form a complete image description approach. But what if we instead merged recent computer vision and language models into a single jointly trained system, taking an image and directly producing a human readable sequence of words to describe it?”
  • Image search, analysis emerge as powerful tools, privacy threat (eWeek | Mike Elgan)  “In a nutshell, these systems identify objects in a photograph—say, a boy, a dog, a ball, a tree, a park, a bird, some clouds and so on—then use sophisticated artificial intelligence to understand that the boy is throwing the ball for the dog to chase in a park and that the bird isn’t involved in the main action of the photo. Combine this technology with face recognition and anyone with access (which will be everyone) will be able to search the Web for people doing things or involved with or associated with some activity.”
  • Science powering product: Yahoo Weather (Yahoo Labs | David A. Shamma, Jia Li, Lyndon Kennedy, and Bart Thomée)  “But even more difficult than finding a stunning photo that accurately reflects the weather in a given location is the challenge of finding what the Flickr community believes is an interesting weather photo. A little while before we set out to surface our one million photos, we made an observation about how people designate photos on Flickr as ‘favorites.’ Something as simple as favorites and likes on social network sites are rich social signals that can be used to surface themes of images.”
  • Finding an image with an image and other feats of computer vision (Ars Technica | Megan Geuss)  “Yahoo’s efforts to make photo search better has a simple mantra: ‘more relevant photos for users, not just the most popular photos,’ as Li put it. To that extent, Flickr tries to improve general search while also improving search relevance within a person’s likely-massive online photo album. Shamma noted that batch upload and the gigabytes and terabytes of storage offered to customers at relatively cheap prices have changed how we photograph things. Accordingly, storage and recall of photographs has to adapt to fit the morphing definition of photography. ‘The practice of photography is changing very quickly, using photos for communication has been growing,’ Shamma said.”
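The Google researchers quoted above describe collapsing separate vision and language models into one jointly trained system that reads an image and emits a word sequence. As a purely illustrative sketch of that shape (this is not Google’s code: the vocabulary, the stand-in image feature vector, and the transition weights below are all invented for the example, where a real system would learn them jointly from data), greedy word-by-word decoding against an image embedding looks roughly like this:

```python
# Toy sketch of an encoder-decoder image captioner with greedy decoding.
# All values here are hypothetical and hand-picked for illustration.

VOCAB = ["a", "dog", "chases", "ball", "<end>"]

# Stand-in for a vision encoder's output: an embedding of the image.
image_features = [0.9, 0.1, 0.4]

# Stand-in for learned decoder weights: for each (previous word, next word)
# transition, a weight vector to be scored against the image features.
WEIGHTS = {
    ("<start>", "a"):   [1, 0, 0],
    ("<start>", "dog"): [0, 1, 0],
    ("a", "dog"):       [1, 0, 0],
    ("a", "ball"):      [0, 0, 1],
    ("dog", "chases"):  [1, 0, 0],
    ("dog", "<end>"):   [0, 1, 0],
    ("chases", "ball"): [1, 0, 0],
    ("ball", "<end>"):  [1, 0, 0],
}

def decode(features, weights, vocab, max_len=10):
    """Greedily emit words: at each step, pick the next word whose
    transition weights score highest against the image features."""
    caption, prev = [], "<start>"
    for _ in range(max_len):
        scores = {
            word: sum(f * w for f, w in zip(features, weights[(prev, word)]))
            for word in vocab if (prev, word) in weights
        }
        prev = max(scores, key=scores.get)
        if prev == "<end>":
            break
        caption.append(prev)
    return " ".join(caption)

print(decode(image_features, WEIGHTS, VOCAB))  # -> a dog chases ball
```

The point of the sketch is only the structure: one pass from image features straight to “a human readable sequence of words,” with no separate hand-built description pipeline in between.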

Articles from Ohio Web Library: