One of the two Hadoop platforms from the merging firms will win out. What will it mean for users?
A proposed $5.2 billion merger of Cloudera and Hortonworks won’t be without headaches for organizations that have standardized on one of the two Hadoop platforms the companies provide.
Cloudera CEO Tom Reilly, who will remain CEO of the combined entity, revealed during a conference call this week that sometime after the merger closes in the first quarter of 2019, the combined company will deliver a Unity release of Hadoop made up of the best elements of the two existing distributions. That process will include, for example, deciding which elements to keep from the Impala SQL engine that Cloudera built and which from the Hive platform that Hortonworks uses to query Hadoop using SQL.
“We’re going to pick from the best of both,” says Reilly.
Overall, Reilly maintains that the two distributions as they stand today are complementary. For example, Hortonworks has lately been focusing its efforts on streaming data from the network edge into Hadoop to drive Internet of Things (IoT) applications. Cloudera, on the other hand, has been focusing much of its research and development on optimizing Hadoop for public cloud distributions. The combined entity will provide a common distribution of Hadoop and associated technologies, which Reilly describes as an enterprise data cloud, that can be deployed anywhere from the network edge to artificial intelligence (AI) applications in the cloud. That platform will most commonly be deployed across a hybrid IT environment spanning multiple clouds and on-premises IT environments.
However, until the merger closes, both companies will continue to compete for customers for their respective Hadoop distributions, says Reilly.
Reilly predicts that the combined company will generate revenues of over $1 billion by 2020. Cloudera values the total addressable market for its offerings at over $83 billion, which includes a broad range of databases and analytics applications.
Savings from rationalizing sales, marketing and development efforts around a single distribution should make the combined entity more profitable than either company would likely be on its own, says Reilly. Those savings will also enable the combined entity to bring new innovations to market faster, he adds.
Cloudera today generates $411 million in annual revenue, compared to $309 million for Hortonworks. Cloudera has over 1,300 customers, compared to over 1,400 for Hortonworks. A total of 927 of the customers spend more than $100,000 with either company.
In general, Reilly notes there are only a few companies running both distributions of Hadoop, so he doesn’t envision the merger being that disruptive to operations. Many companies today are making a compromise when choosing between Cloudera and Hortonworks that they will no longer need to make, adds Reilly.
Of course, more than a few IT organizations have struggled with building and maintaining Hadoop clusters. That issue has enabled alternatives to Hadoop for processing Big Data, such as the open source Apache Storm project and the BigQuery cloud service provided by Google, to gain market traction. Both options provide an alternative to relying on the Hadoop Distributed File System (HDFS) for storing massive amounts of data. Additionally, as some organizations move to develop AI applications, they are opting to work with raw data outside of a Hadoop environment.
Naturally, it’s too early to say with certainty to what degree Cloudera and Hortonworks will be better together. But the one thing that is clear is that come 2019 a lot of IT organizations are about to find out.