More Open AI and Machine Learning Toolsets Arrive

by Ostatic Staff - Dec. 02, 2016

Recently, in an article for TechCrunch, Spark Capital's John Melas-Kyriazi weighed in on how startups can leverage artificial intelligence and machine learning to advance their businesses or even give birth to brand new ones. On a related note, some very powerful artificial intelligence and machine learning engines have recently been open sourced. Quite a few of them have been tested and hardened at Google, Facebook, Microsoft and other companies, and some of them may represent business opportunities.

Just recently, two new open source entries on this front have emerged, and they are worth investigating. Here are the details.

Health Catalyst has created a repository of healthcare-focused open source machine learning software, with an eye toward encouraging the healthcare industry to tap into the power of AI and machine learning. According to an announcement:

"The packages are designed to streamline healthcare machine learning. They do this by including functionality specific to healthcare, as well as simplifying the workflow of creating and deploying models. We believe that machine learning is too helpful and important to be handled solely by full-time data scientists. These packages are a humble attempt at machine learning democratization in a realm that needs it most—healthcare."

"[The] packages provide an easy way to create models on your data," the announcement adds. "This includes linear and random forest models, ways to handle missing data, guidance on feature selection, proper performance metrics, and easy database connections."

"The use of machine learning and predictive analytics has been limited in healthcare because of challenges in programming solutions, typically by highly trained data scientists, mostly in the nation’s top academic medical centers," notes Health Data Management.

Meanwhile, Google has gathered some compelling AI and machine learning demonstrations and placed them in its Google AI Experiments showcase. Through the showcase, you can experiment in an open source way.

"With all the exciting A.I. stuff happening, there are lots of people eager to start tinkering with machine learning technology," Google engineers said. "A.I. Experiments is a showcase for simple experiments that let anyone play with this technology in hands-on ways, through pictures, drawings, language, music, and more."

Interested in more AI and machine learning technology from the open source world?

Google has open sourced a program called TensorFlow. It’s based on the same internal toolset that Google has spent years developing to support its AI software and other predictive and analytics programs. You can find out more about TensorFlow at its site, and you might be surprised to learn that it is the engine behind several Google tools you may already use, including Google Photos and the speech recognition found in the Google app.

According to Google, TensorFlow could help speed up processes ranging from drug discovery to processing astronomy-related data sets.

Additionally, we reported on how H2O, formerly known as 0xdata, has announced a new $20 million funding round. The money will go toward advancing its machine learning toolset, and the company is entirely open source-focused. We recently caught up with Oleg Rogynskyy, VP of Marketing & Growth at H2O, for an interview.

Meanwhile, Facebook is open sourcing its machine learning system designed for artificial intelligence (AI) computing at a large scale. It's based on Nvidia hardware. IBM, for its part, announced that its proprietary machine learning program, SystemML, will be freely available to share and modify through the Apache Software Foundation.

Yahoo has also released its key artificial intelligence (AI) software under an open source license. The company previously developed a library called CaffeOnSpark to perform a popular type of AI called “deep learning” on the big troves of data found in its Hadoop file system. Now CaffeOnSpark is becoming available for community use under an open source Apache license on GitHub.

As WIRED notes:

"CaffeOnSpark is based on deep learning, a branch of artificial intelligence particularly useful in helping machines recognize human speech, or the contents of a photo or video. Yahoo, for example, uses it to improve search results on Flickr by determining the contents of different photos. Instead of relying on the descriptions and keywords entered by the people who upload photos to the site, Yahoo teaches its computers to recognize certain characteristics of a photo, such as specific colors or even objects and animals."

CaffeOnSpark works with x86 chips or graphics processing units (GPUs). It can be run on cloud infrastructure or within data centers. Among many uses for it at Yahoo, it has helped make connections for content recommendations.

"In 2016, every company will want to get on the machine-learning bandwagon," said Monte Zweben, co-founder and CEO of Splice Machine and executive chairman of RocketFuel, in a recent interview. "But without the right people, many won’t have the expertise to do it. Expect to see the development of turnkey databases that allow developers to build predictive models without having a Ph.D."

Notably, Microsoft has open sourced the artificial intelligence framework it uses to power speech recognition in its Cortana digital assistant and Skype Translate applications. The framework is called CNTK, and it can help machines do things like understand speech and determine logical connections between photos.

Microsoft released its Computational Network Toolkit (CNTK) as an open source project on GitHub, and developers are likely to leverage it to advance deep learning networks.

Ray Kurzweil's site has also rounded up many of the top machine learning and artificial intelligence breakthroughs of recent times.