Google Open Sources Powerful Image Recognition Tool

by Ostatic Staff - Oct. 03, 2016

On the artificial intelligence front, there is a true renaissance going on right now, and it includes a slew of new open source tools, many of which are likely to give rise to businesses built around them. For example, Google recently open sourced a program called TensorFlow. It’s based on the same internal toolset that Google has spent years developing to support its AI software and other predictive and analytics programs. You can find out more about TensorFlow at its site, and you might be surprised to learn that it is the engine behind several Google tools you may already use, including Google Photos and the speech recognition found in the Google app.

Now, Google has open sourced its "Show and Tell" algorithm, which developers can purportedly use to recognize objects in photos with up to 93.9 percent accuracy and to help automate smart photo captioning. It's based on TensorFlow, and here are the details.

As Fossbytes noted:

 In 2014, the Google Brain team started working on a system that could analyze an image and write a caption for it. The system could analyze what was happening in the image. At that time, their image classification model Inception V1 enabled the system to achieve an accuracy of 89.6%. Months followed and the image classification model was upgraded to Inception V2 in 2015 enabling 91.8% accuracy.

The improved system can detect multiple objects in an image along with their characteristics and write a more relevant caption. 

The code base is on GitHub. 

 According to the project description:

 Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing....The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively.
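To make the training objective in that description concrete, here is a minimal sketch in Python. It is not Google's implementation: the function name and the per-word probabilities are hypothetical, invented only for illustration. It assumes a captioning model that, for a given image, emits a probability distribution over candidate words at each step, and it scores a caption as the sum of the log-probabilities of its words. Training "to maximize the likelihood of the target description sentence" means adjusting the model so that this score is as high as possible for the human-written caption.

```python
import math

def caption_log_likelihood(step_probs, caption):
    """Sum of log p(word_t | image, earlier words) over the caption.

    step_probs: one dict per position, mapping candidate word -> probability
                (hypothetical model output for a single image).
    caption:    the list of words whose likelihood we want to score.
    """
    if len(step_probs) != len(caption):
        raise ValueError("need exactly one distribution per caption word")
    total = 0.0
    for probs, word in zip(step_probs, caption):
        total += math.log(probs[word])
    return total

# Made-up distributions a model might produce for a photo of a running dog.
step_probs = [
    {"a": 0.7, "the": 0.3},
    {"dog": 0.6, "cat": 0.4},
    {"runs": 0.5, "sits": 0.5},
]

target = caption_log_likelihood(step_probs, ["a", "dog", "runs"])
other = caption_log_likelihood(step_probs, ["the", "cat", "sits"])
```

In this toy setup the caption "a dog runs" scores higher than "the cat sits" because the model assigns its words more probability. A real system would compute these distributions with a neural network and sum the score over every image and caption pair in the training set.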

 This technology is just one of the more interesting artificial intelligence tools to go open source recently. Here are more examples:

We reported on how H2O.ai, formerly known as 0xdata, is advancing its machine learning toolset, and the company is entirely open source-focused. We recently caught up with Oleg Rogynskyy, VP of Marketing & Growth at H2O, for an interview.

Meanwhile, Facebook is open sourcing its machine learning system designed for artificial intelligence (AI) computing at a large scale. It's based on Nvidia hardware. And, IBM announced that its proprietary machine learning program known as SystemML will be freely available to share and modify through the Apache Software Foundation.

And, Yahoo has released its key artificial intelligence (AI) software under an open source license. The company previously developed a library called CaffeOnSpark to perform a popular type of AI called “deep learning” on the big troves of data found in its Hadoop file system. Now CaffeOnSpark is becoming available for community use under an open source Apache license on GitHub.

As WIRED notes:

"CaffeOnSpark is based on deep learning, a branch of artificial intelligence particularly useful in helping machines recognize human speech, or the contents of a photo or video. Yahoo, for example, uses it to improve search results on Flickr by determining the contents of different photos. Instead of relying on the descriptions and keywords entered by the people who upload photos to the site, Yahoo teaches its computers to recognize certain characteristics of a photo, such as specific colors or even objects and animals."

CaffeOnSpark works with x86 chips or graphics processing units (GPUs). It can be run on cloud infrastructure or within data centers. Among many uses for it at Yahoo, it has helped make connections for content recommendations.

 "In 2016, every company will want to get on the machine-learning bandwagon," said Monte Zweben, co-founder and CEO of Splice Machine and executive chairman of RocketFuel, in a recent interview. "But without the right people, many won’t have the expertise to do it. Expect to see the development of turnkey databases that allow developers to build predictive models without having a Ph.D."

Notably, Microsoft has open sourced the artificial intelligence framework it uses to power speech recognition in its Cortana digital assistant and Skype Translator applications. The framework is called CNTK, and it can help machines do things like understand speech and determine logical connections between photos.

Microsoft released its Computational Network Toolkit (CNTK) as an open source project on GitHub, and developers are likely to leverage it to advance deep learning networks.

Ray Kurzweil's site has also rounded up many of the top machine learning and artificial intelligence breakthroughs of recent times.