IBM and Other Tech Titans Raise Commitments to Apache Spark

by Ostatic Staff - Jun. 08, 2016

Folks everywhere in the Big Data and Hadoop communities are becoming increasingly interested in Apache Spark, an open source data analytics cluster computing framework originally developed in the AMPLab at UC Berkeley, and we've covered updates coming out this week from Spark Summit. IBM had previously announced a major commitment to Apache Spark, billing it as "potentially the most important new open source project in a decade that is being defined by data." Now, the company has launched a promising cloud-based development environment for Spark.

Meanwhile, Microsoft is preparing to increase its commitment to the open-source Apache Spark big-data processing engine this week at the Spark Summit in San Francisco. Microsoft said it will be integrating its HDInsight, Cortana Intelligence Suite, Power BI and Microsoft R Server with Spark. In a couple of months, R Server for HDInsight will become generally available. It will include Spark integration for the cloud as well as an on-premises version of HDInsight.

IBM has launched Data Science Experience, a cloud-based development environment for Apache Spark that could help data scientists work more efficiently with developers to build smarter apps. According to TechRepublic:

"The Data Science Experience is available through IBM's Cloud Bluemix platform, and it provides curated data sets, open source toolsets, and a collaborative space. In theory, it will allow data scientists to provide developers with better insights and data-driven models to be used in application development."

Microsoft is gearing up to deliver R Server for Hadoop on-premises, which will offer support for Microsoft R and Spark’s native execution frameworks. In a blog post, Microsoft said: "Combining R Server with Spark gives users the ability to run R functions over thousands of Spark nodes letting you train your models on data 1000 times larger and 100 times faster than was possible with open source R and nearly two times faster than Spark’s own MLLib."

This week, we are seeing some of the biggest commitments to Spark yet, and the whole ecosystem surrounding Spark is likely to move forward thanks to the money that tech giants like Big Blue and Microsoft are putting into it.