Gartner: Real-time stream processing is not the future. It’s been underway for maybe as much as 20 years. However, we’re only halfway through the revolution.
Event stream processing is a bigger revolution than many think. That was the opening statement from Roy Schulte, Distinguished VP Analyst, Gartner, in his talk on Real-Time Stream Processing and Data in Motion at the Gartner Data & Analytics Summit.
He noted that event stream processing makes new kinds of context data available in real time to enhance situation awareness and improve decisions. “The information that is processed by real-time streaming is a different mix than what you see in most BI and data science,” he said. “Most importantly, though, the role real-time streaming plays in your business is different.”
He gave an example to put the difference, and the impact that real-time stream processing has on a business, into perspective. “What’s the difference between postal service paper mail and a text message?” he asked.
The difference is one can be real time and one is not. The postal service is a batch process. It comes once a day on schedule. It’s got a really long delivery time, taking days or weeks for a letter to get to its destination. “If you want a hard copy and you’re talking about something that doesn’t change fast, fine, go ahead and use a mail service,” said Schulte. “But if you want to process information about what is happening now, then you might want to send a text message.”
He posed a second question. “What’s the difference between a picture and a video?” he asked. A picture is a snapshot at one point in time. It’s pretty compact because you’re only sending one image. A picture is very valuable. As the saying goes, it’s worth a thousand words.
In contrast, a video has a lot more information value. A video can show you different aspects and different angles of an object. It can also show motion and acceleration. It can do this because it conveys a lot more data. It’s conveying typically 30 or 60 images per second. So, if you need or want more information, you use a video, not a picture.
He then tried to put the two concepts, real time and streaming, into context. “I can send you a picture in real time. I take a picture, send it 2000 miles, and you know what’s going on right this minute,” he said. “But you don’t have a lot of richness.” Conversely, he noted that you could record a video and play it a month later. So, it is no longer real time, but it has a lot of rich information.
“When you combine those two, that’s what we’re talking about when you’re talking about real-time streaming,” said Schulte. “We’re talking about streaming information about what is happening now, and we’re talking about a lot of information.” Most real-time streaming in business is not images. It’s not videos. It involves real-time streaming of other kinds of data.
The key takeaway is that stream processing combines the principles of text messages and video. Streams are event-driven to make information current. And they consist of time-series data to make information more complete.
See also: Gartner Keynote: A Franchise Model for Growing Analytics
Transitioning to real-time stream processing
When companies first started using IT for business purposes, most of what they did was transaction processing and record keeping. And that’s still what business applications do today. You have order management systems in retail that deal with transaction information from customer orders. You have a banking system that handles deposits and withdrawals. In such applications, there are transactions all day long.
Most BI and analytics efforts are based on a copy of the transaction data. The data is pulled out of a production system, put through a data engineering pipeline, and something is done to it with a BI or reporting tool. A business may also build some models on that data.
Schulte noted that today, the mix of information that businesses have for real-time streaming tends to skew a little differently. It includes transaction data, but instead of getting the transaction data once in a batch, businesses now get it streaming out of the system all day long.
But most of the data for real-time stream processing is not transaction data; it’s context data. It’s just information about something that’s happening. That information could be something about a customer interaction, such as information from a click stream, or it could be a transcript of what the customer said to the contact center.
However, the biggest volume of real-time streaming data is from machines. There is a lot of IoT data. “It’s coming off of control systems, it’s coming off of sensors, it’s coming off your smartphone,” he said. “It’s data coming from the real world. And turns out that machines can type data much faster than we can.” So, a lot more data is being generated by machines.
There is also streaming data that’s coming from data brokers. That includes such things as news feeds, weather feeds, traffic feeds, market data feeds, and so forth. These data sources produce a lot of observational and context data, not transaction data.
How does that impact analysis? That streaming data is packaged differently. It comes in as a continuous, unbounded sequence of data. Typically, it’s coming in 24/7, but it might only come in during the workday.
When an application processes streaming data, there’s no end-of-file marker. There’s no way of knowing where it ends. Each data record is an event. Typically, it has a timestamp on it. And it provides an observation of what happened at that moment.
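To make the idea of an unbounded, timestamped event stream concrete, here is a minimal sketch in Python. It is not taken from Schulte's talk: the `Event` type, the `windowed_average` function, and the choice of a trailing time window are all illustrative assumptions. The sketch also assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """One record in the stream (hypothetical shape)."""
    timestamp: float  # seconds; assumed to be non-decreasing across events
    value: float      # the observation made at that moment


def windowed_average(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, average over the trailing window) for each event.

    The function never waits for an end-of-file marker. It emits a result
    as each event arrives and keeps only the events inside the window,
    so it can run over a stream that never ends.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    window: Deque[Event] = deque()
    total = 0.0
    for event in events:
        window.append(event)
        total += event.value
        # Evict events that have fallen out of the trailing window.
        while window[0].timestamp <= event.timestamp - window_seconds:
            total -= window.popleft().value
        yield event.timestamp, total / len(window)


# Usage: a short, finite list stands in for the live stream here.
readings = [Event(0.0, 10.0), Event(1.0, 20.0), Event(2.0, 30.0), Event(10.0, 40.0)]
for ts, avg in windowed_average(readings, window_seconds=5.0):
    print(ts, avg)
```

Because the function consumes any iterable lazily, the same code would work if `readings` were replaced by a generator that reads from a sensor or a message queue and never terminates.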
See also: Using Streaming, Pipelining, and Parallelization to Build High Throughput Apps
Adding real time to the mix
On a typical eCommerce website, when somebody logs in and navigates around looking for information about products, the retailer has a couple of seconds to generate the next best offer or the next best action for that customer.
The way most systems work is to take data that’s old. For example, a system might take data from the customer master file. The retailer might look at the customer’s purchase history and the information in its campaign management system and present an offer to the person.
A retailer could enhance its offers using real-time data and real-time streaming. For instance, a retailer might use information about the person’s location from their smartphone, IP address, or whatever cell tower they’re connected to. Or they could look at the information coming in from what customers put in their carts.
Additionally, “you might have information from a contact center log,” said Schulte. “Or you can look at what the person has tweeted or posted on Facebook.” Using that additional information, an algorithm generating the next best offer can be smarter. It can take more information into account, including the current situation.
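As a rough illustration of how an offer algorithm might combine old data with real-time context, consider the Python sketch below. Everything in it is a hypothetical assumption made for illustration: the `CustomerProfile`, `SessionContext`, and `Offer` types, the event dictionary format, and the scoring weights. It does not describe any real retailer's system or anything Schulte presented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class CustomerProfile:
    """Historical ("old") data, e.g. from a customer master file."""
    past_categories: Set[str]


@dataclass
class SessionContext:
    """Real-time context, updated as streaming events arrive."""
    cart_categories: Set[str] = field(default_factory=set)
    city: str = ""


@dataclass(frozen=True)
class Offer:
    name: str
    category: str
    city: Optional[str] = None  # None means the offer is not location-specific


def apply_event(context: SessionContext, event: Dict[str, str]) -> None:
    """Fold one streaming event into the session context (assumed event format)."""
    if event["type"] == "cart_add":
        context.cart_categories.add(event["category"])
    elif event["type"] == "location":
        context.city = event["city"]


def score(offer: Offer, profile: CustomerProfile, context: SessionContext) -> int:
    """Score an offer; the weights are arbitrary illustrative choices."""
    points = 0
    if offer.category in profile.past_categories:
        points += 1  # matches purchase history (old data)
    if offer.category in context.cart_categories:
        points += 2  # matches what is in the cart right now
    if offer.city is not None and offer.city == context.city:
        points += 1  # matches the customer's current location
    return points


def next_best_offer(
    offers: List[Offer], profile: CustomerProfile, context: SessionContext
) -> Offer:
    """Return the highest-scoring offer; ties go to the earliest in the list."""
    return max(offers, key=lambda offer: score(offer, profile, context))
```

With only the historical profile, the function favors offers that match past purchases. As cart and location events are applied through `apply_event`, the same call can return a different offer that reflects the current situation.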
Real-time stream processing is not the future; it’s here now
Schulte concluded this part of his presentation, noting that real-time streaming is “not our future. It’s been underway for maybe as much as 20 years.” However, he thinks we’re only halfway through the revolution.
“We still have many, many applications where real-time streams would make that application smarter, but businesses are not yet using real-time streams,” he said. “That’s a job for us to finish the task over the next 20 years.”
He noted the other takeaway is that life is event-driven, and business is event-driven. “Many of the decisions you have to make must be made in real time,” he said. Making a decision in real time with old data is often not the best decision. Businesses need a combination of old data plus new data, and that new data is probably coming in through some streaming mechanism. “The more your analytics are real time, the more valuable your application will be,” he said.