Beyond Kafka: Capturing the Data-in-motion Industry Pulse


Fully participating in today’s data-in-motion world requires that a data streaming platform be part of every company’s architecture.

Jay Kreps, Confluent CEO, got to the heart of the matter from the get-go. In his opening keynote, he reflected on the transition of the real-time data-in-motion industry: “It was no longer just about Kafka; there was now an entire ecosystem around stream processing.” The conference brought that ecosystem to the forefront. The shift was apparent in the broad range of session topics and the more than 50 data-in-motion solution providers in the expo hall.

Kreps noted that until recently, data streaming and stream processing were deployed in specialized niche applications that never became mainstream. That is certainly not the case today. Getting from those early days to the present has been the journey of both Apache Kafka and Apache Flink in the world of stream processing. “The question you could ask is, where do we go next now that we have solved some of the fundamental problems? What do we do with this? How does it evolve in the world?” asked Kreps.

What’s next?

Kreps noted that the journey continues today. It is the journey towards ubiquity. “How does this turn into something that’s everywhere?” he asked. “That’s what’s happening now.”

He expanded on that point: “We’re going from a world where data streaming was an ingredient in certain applications, certain applications that had to have real time where it was worth all the effort of getting this stuff working together. But as it becomes easier as this technology is more and more ubiquitous, data streaming really becomes a platform, and it becomes something that has a fundamental role in the architecture of companies. And that’s what we’re going to talk about through the rest of these keynotes.”

Specifically, he noted they were going to talk about the big data mess. Every company is affected. The issue is how companies can change and, in particular, how they can get data to flow in a simpler and better way. A data streaming platform can make this happen. “To make this a reality and make this part of the architecture of every company, it’s really important that the technology evolves,” he said.

During the keynote, Kreps brought in Shaun Clowes, Confluent’s Chief Product Officer, to continue the train of thought. “I’d like to explore this idea in a little bit more detail, and I’m going to frame it across two key questions that are preventing enterprises from really capturing the value of all of their data,” said Clowes. Those questions are:

(1) How do you integrate data across all of your applications and systems to deliver customer and employee experiences that make sense?

And (2) how do you embed data and analytics into all of your business units and departments to make better business decisions?

He delved into why these questions are so important today. “You can think about data in the enterprise as existing across two major estates,” he said. “The operational estate, where it’s used to run the business, and the analytical estate, where it’s used to understand the business.”

He noted that traditionally, those two estates existed as technology silos. The operational side had a relatively limited number of big applications, such as ERP and CRM, along with mainframes and operational databases. If a business wanted to make them work together to deliver an experience, it would wire them together with code. In contrast, the analytical estate was often just one big data warehouse sitting in a rack in a data center somewhere. “All you had to do was wire together a bunch of ETL to suck all the data out of the operational databases and push them into the data warehouse,” he said.

The world, however, keeps growing more complex. On the operational side, businesses keep introducing new applications, such as SaaS apps and new microservices, to power new business processes or deliver new experiences. Those applications come with their own operational databases. So, businesses still have to figure out how to wire them together with all of the other operational systems to deliver a coherent end-to-end experience.

And the analytical estate is getting more complex, too. It’s not just one data warehouse anymore. Now, businesses have data lakes, cloud data warehouses, and a variety of analytical and reporting tools as well.

“It’s incredibly complex, expensive, and tedious work to try and wire all of this stuff together. But it’s not simply just a matter of complexity either,” he said. “It’s also a matter of increasing demands for timeliness.”

This has all led to the concept of data products. “The power of data products is really profound because what we’re actually doing here is we are escaping the vicious cycle that led us to the data mess in the first place where we are constantly wrestling with data infrastructure,” he said. “Instead of doing that, we’re now focusing on a virtuous cycle where data sets are more than the sum of their parts, and they’re available everywhere across your organization to drive insights and operations.”

With that approach, data is available everywhere it is needed. It can be joined and used to power new business processes without anyone needing to know where it originated. Data products are an asset that drives real business value.

He noted that taking this approach delivers great value. “Imagine all of the most important entities of your business, the nouns and verbs, if you like to think about it that way,” he said. “But instead of them being dead rows in a database somewhere that occasionally gets queried, they are living data products ready to be combined and reshaped to be mixed and matched to solve new problems.”

Making it all happen

He noted that if businesses want to make data products possible at scale, they are going to need more than just streaming. “We need a data streaming platform,” said Clowes. “You heard Jay talk about it. Let me walk you through the key aspects of a data streaming platform.”

He noted that “a data streaming platform obviously starts with streaming. Streaming data is moving in real time to power the needs of the modern real-time world. Data isn’t static or passive; it’s constantly moving from wherever it was created to wherever it needs to go. It’s also connected. It can reach every system everywhere, whether that system was built to be stream native or not, and bring the data into one larger streaming whole. It’s governed because while data is always incredibly valuable, that’s not the case if you’re not sure that it’s secure, trustworthy, and understandable.”

Additionally, data must be capable of being exposed for reuse throughout the enterprise. Finally, a business must be able to process the data because “while the data itself is always valuable, it tends to get dramatically more valuable if it’s mixed and remixed with other data to drive increasing business context around that piece of data.”
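To make the idea of “mixing” one stream with other data a little more concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use Kafka, Flink, or any Confluent API, and the `Order` and `EnrichedOrder` types, the `enrich_orders` function, and the sample records are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Order:
    """A hypothetical order event arriving on a stream."""
    order_id: str
    customer_id: str
    amount: float


@dataclass(frozen=True)
class EnrichedOrder:
    """The same order, with customer context attached."""
    order_id: str
    customer_name: str
    region: str
    amount: float


def enrich_orders(
    orders: Iterable[Order],
    customers: Dict[str, Dict[str, str]],
) -> Iterator[EnrichedOrder]:
    """Join each order event with the latest known customer record.

    Orders whose customer is unknown are skipped in this sketch; a real
    pipeline would more likely route them somewhere for later handling.
    """
    for order in orders:
        customer = customers.get(order.customer_id)
        if customer is None:
            continue
        yield EnrichedOrder(
            order_id=order.order_id,
            customer_name=customer["name"],
            region=customer["region"],
            amount=order.amount,
        )


# Hypothetical sample data standing in for a customer table and an order stream.
customers = {"c1": {"name": "Acme", "region": "EMEA"}}
orders = [Order("o1", "c1", 120.0), Order("o2", "c9", 35.5)]

enriched = list(enrich_orders(orders, customers))
for item in enriched:
    print(item)
```

The generator handles one event at a time rather than a finished batch, which is the basic shape of a stream-to-table join. In a stream processing engine, the customer lookup would itself be kept up to date from a stream of changes instead of being a fixed dictionary.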

Addressing these key aspects ensures that a business can confidently build on top of its data streaming platform.

He noted that Confluent has been building towards this for several years. “We have a suite of over 75 different managed connectors and the ability to load your own custom connectors to bring all data from any application into the streaming whole. We have our global streaming network and streaming technologies that make streaming ubiquitous everywhere you need it to go. So, real-time data-in-motion can pulse like a central nervous system. We have the ability to govern the data, apply data quality rules, and be sure that the data is trustworthy, reliable, and discoverable.”

Les Yeamans

About Les Yeamans

Les Yeamans is founder and Executive Editor of RTInsights and CDInsights. He is a business entrepreneur with over 25 years of experience developing and managing successful companies in the enterprise software, financial services, strategic consulting, and Internet publishing markets. Before founding RTInsights, Les founded and led an Internet portal company specializing in the application of critical enterprise technologies, including BPM, event-driven architectures, and event processing. When that company was acquired by TechTarget, Les became Associate Publisher, managing a group of websites. Previously, Les had founded an enterprise software business called ezBridge, which provided fault-tolerant, guaranteed-delivery transaction messaging on 10 different hardware platforms. This product was licensed to IBM as the initial code base for IBM MQSeries (renamed WebSphere MQ and later IBM MQ), which was co-developed and co-marketed with IBM. Les was also co-founder of the Message Oriented Middleware Association (MOMA). Les has worked extensively as an analyst and consultant for end users and vendors in this growing market. Prior to ezBridge, Les raised venture capital for development and marketing of PowerBase, the industry-leading database software package for the IBM PC. He started his career consulting at Accenture, providing end-user IT solutions. Les has an MBA from the University of Michigan and a bachelor's degree from the State University of New York at Binghamton. He is based in New Rochelle, NY.
