Hortonworks Provides Needed Visibility in Apache Kafka

As Apache Kafka-driven projects become more complex, Hortonworks aims to simplify them with its new Streams Messaging Manager (SMM).

Written By
Michael Vizard
Aug 28, 2018

About the only thing harder than setting up a real-time streaming analytics application based on open source Apache Kafka software is arguably managing and securing it. To address those issues, Hortonworks has unveiled Streams Messaging Manager (SMM), an open source monitoring and management tool for Kafka environments.

The goal is to give IT operations teams the visibility into Kafka environments that they lack today for want of tools, says Hortonworks’ Jamie Engesser, vice president of product management. Those teams are unable to see, for example, who is employing Kafka to publish services, who is consuming those services or where there might be bottlenecks, says Engesser.

“There’s no visibility,” he says.
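
One common way to spot the kind of bottleneck Engesser describes is consumer lag: the gap between the latest offset written to a Kafka partition and the offset a consumer group has committed. The sketch below illustrates that idea only. It is not how SMM works, and the topic names, offsets, threshold and the `consumer_lag` and `bottlenecks` helpers are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot: (topic, partition) -> offset. In a real cluster these
# numbers would come from Kafka itself; here they are made up for illustration.
Offsets = Dict[Tuple[str, int], int]


def consumer_lag(end_offsets: Offsets, committed: Offsets) -> Offsets:
    """Return per-partition lag: latest written offset minus committed offset."""
    lag = {}
    for tp, end in end_offsets.items():
        # A partition with no committed offset is treated as fully unread.
        lag[tp] = max(end - committed.get(tp, 0), 0)
    return lag


def bottlenecks(lag: Offsets, threshold: int) -> List[Tuple[Tuple[str, int], int]]:
    """List partitions whose lag exceeds the threshold, worst first."""
    slow = [(tp, n) for tp, n in lag.items() if n > threshold]
    return sorted(slow, key=lambda item: item[1], reverse=True)


end_offsets = {("orders", 0): 1500, ("orders", 1): 1480, ("clicks", 0): 90000}
committed = {("orders", 0): 1495, ("orders", 1): 1480, ("clicks", 0): 42000}

lag = consumer_lag(end_offsets, committed)
print(bottlenecks(lag, threshold=1000))
```

With this made-up data, only the `clicks` partition is flagged, because its consumers have fallen far behind what producers have written.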

He notes that monitoring and management tools are needed not just to optimize tuning, replication and synchronization but also to track the lineage of data from the network edge to the cloud. As compliance mandates become more challenging to meet around the world, IT operations teams need to be able to audit where data has been streaming.

SMM addresses that issue by providing monitoring and management tools that can reach all the way out to the network edge, says Engesser. In fact, he says that 30 percent of the support revenue being generated by Hortonworks now stems from distributed big data applications that reach out to the network edge.

Apache Kafka external support needed

Engesser says Hortonworks is betting that, as the complexity challenges surrounding Hadoop and Kafka become more apparent, it’s only a matter of time before IT organizations look to rely more on external support.

Hortonworks is also moving to make it simpler to manage data flows across instances of the Hadoop distribution it curates. A release of version 3.2 of Hortonworks DataFlow adds tighter integration with version 3.0 of the Hortonworks Data Platform (HDP). Additional capabilities in the release include enhanced resiliency to smooth workflows across large clusters, support for version 3.0 of Apache Hive data warehouse software and more granular control over multitenant environments using Kerberos keytab isolation.

In general, IT organizations are being challenged to manage data at scale at a time when investment in artificial intelligence (AI) applications is starting to escalate. The machine and deep learning algorithms that drive the models on which those AI applications depend require access to massive amounts of data to train them. Engesser says big data platforms such as Hadoop provide a mechanism to not only centrally manage that data but also score data being collected via the open source TensorFlow machine learning framework at the edge of the network.

There’s no doubt that IT operations issues surrounding big data are becoming more challenging with each passing day. Data is now flowing in and out of modern data warehouses to help drive a new generation of real-time analytics applications. Instead of analyzing a sample of the data an organization collects, it’s now feasible not just to analyze all the data an organization owns, but also to stream external data into the data warehouse and correlate it against multiple sources.

The paradox behind managing all that data to drive AI models is that the sheer volume of data that needs to be managed will inevitably require IT administrators to rely more on AI technologies to manage it at scale.
