A rock-solid business use case and data integration are pressing issues.
The number of use cases for streaming analytics continues to expand.
In ecommerce, for example, recommendation engines are designed to ingest data from clickstreams, product databases, and sales records to come up with an offer for a customer.
For the Internet of Things, real-time machine data can be used for predictive maintenance, manufacturing optimization, energy management, industrial safety, and even customer experience management.
Then there’s real-time transaction data to assist in fraud prevention, social media data for brand or crisis management, and weather data for use in supply chain management.
According to TDWI, 36 percent of businesses surveyed are planning to use real-time event streams in the next three to five years, adding to the 23 percent that are already doing so. Use of IoT data, which by its nature usually involves real-time data capture, is expected to grow to 39 percent.
Before embarking on a streaming analytics project, however, it’s helpful to keep in mind some best practices, as TDWI outlined in a report:
1. What’s the business value?
IoT projects in particular are notorious for cost overruns due to the complexity of setting up communications between different machines and different networks. The entire IoT industry, in fact, has a huge challenge with interoperability.
Meanwhile, a sophisticated recommendation engine with machine learning may be useful only for ecommerce outfits with enough inventory and enough users to make product recommendations worthwhile.
Companies should think in terms of specific goals, such as increasing revenue, streamlining production, lowering costs, or improving customer experience. If increasing revenue is the goal, for example, businesses may need to determine how quickly a project can deliver value versus the costs of the analytics investment. A customer experience project, on the other hand, may look to metrics such as ratings improvement and return customers as a way of evaluating success.
2. Analyze and enrich multiple streams of data
Often the business use case appears only when multiple streams of data are analyzed. In healthcare, for example, multiple sensors may be recording a patient’s heart rate, blood pressure, and oxygen levels. In the Internet of Things, sensors may be recording the temperature, pressure, and vibration of a particular piece of machinery, and a combination of readings may be necessary to optimize production or carry out preventive maintenance.
For data enrichment, TDWI uses the example of a mapping app, which might use geo-data to identify nearby restaurants or hotels. That data might be enriched with brand names, addresses, or restaurant ratings, and a real-time coupon might be offered. It’s necessary, however, that “data enrichment doesn’t slow down real-time processing,” the group advises.
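As a rough illustration of combining streams, the sketch below keeps the latest value from each sensor on a machine and raises a flag only when several readings are high at once. The `Reading` type, the sensor names, and the limits in `LIMITS` are all hypothetical; real limits would come from the equipment or from historical data.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    machine_id: str
    sensor: str    # e.g. "temperature", "pressure", "vibration"
    value: float

# Hypothetical limits, chosen only for this example.
LIMITS = {"temperature": 90.0, "vibration": 7.0}

class MachineState:
    """Holds the most recent value seen on each sensor stream for one machine."""

    def __init__(self):
        self.latest = {}

    def update(self, reading):
        self.latest[reading.sensor] = reading.value

    def needs_maintenance(self):
        # True only when every monitored sensor is above its limit at once.
        # A sensor that has not reported yet counts as "not above the limit".
        return all(
            sensor in self.latest and self.latest[sensor] > limit
            for sensor, limit in LIMITS.items()
        )

def process(readings):
    """Consume interleaved readings; return machine ids that triggered a flag."""
    states = {}
    flagged = []
    for reading in readings:
        state = states.setdefault(reading.machine_id, MachineState())
        state.update(reading)
        if state.needs_maintenance():
            flagged.append(reading.machine_id)
    return flagged
```

A high temperature alone does not trigger anything here; the flag appears only once the vibration stream also crosses its limit, which is the kind of signal a single stream cannot provide.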
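One common way to keep enrichment from slowing the stream is to load reference data into memory ahead of time, so each event needs only a dictionary lookup rather than a remote call. The sketch below assumes that approach; the `PLACES` table, its fields, and the `place_id` key are invented for illustration.

```python
# Hypothetical reference data, loaded into memory once before the stream starts.
PLACES = {
    "p1": {"name": "Cafe Uno", "rating": 4.5},
    "p2": {"name": "Hotel Dos", "rating": 3.9},
}

def enrich(event, reference=PLACES):
    """Return a copy of the event with any known reference fields attached."""
    extra = reference.get(event["place_id"], {})
    return {**event, **extra}

def enrich_stream(events):
    """Lazily enrich events one at a time as they arrive."""
    for event in events:
        yield enrich(event)
```

Events with an unknown `place_id` pass through unchanged instead of blocking the pipeline.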
3. Make local decisions
Factories may be composed of different manufacturing floors, though it’s not necessary for analytics purposes that the entire data stream from each floor be analyzed. TDWI advises documenting the connectivity architecture of the entire network and identifying the points that need to be monitored as well as the devices that need to be controlled.
In edge computing use cases, for example, sub-second analytics are commonly run on a single edge device for emergency control situations (think of automatic braking in a connected car). Other kinds of analytics may be run on the gateway, which collects data from a subset of edge devices, and in the cloud, which may collect data from all the gateways.
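The three tiers can be sketched as three small functions, each working on a wider slice of data than the last. This is only a toy model: the function names, the distance readings, and the `BRAKE_DISTANCE_M` threshold are assumptions made for the example, not part of any real vehicle or gateway API.

```python
BRAKE_DISTANCE_M = 5.0  # hypothetical threshold, for illustration only

def edge_decision(distance_m):
    """Runs on the device itself: an immediate decision with no network round trip."""
    return "BRAKE" if distance_m < BRAKE_DISTANCE_M else "OK"

def gateway_summary(readings_by_device):
    """Runs on a gateway: summarizes readings from its subset of edge devices."""
    all_values = [v for values in readings_by_device.values() for v in values]
    return {
        "devices": len(readings_by_device),
        "readings": len(all_values),
        "min_distance_m": min(all_values),
    }

def cloud_rollup(summaries):
    """Runs in the cloud: combines the summaries sent up by every gateway."""
    return {
        "gateways": len(summaries),
        "devices": sum(s["devices"] for s in summaries),
        "min_distance_m": min(s["min_distance_m"] for s in summaries),
    }
```

Only the summaries travel upward in this sketch, so the time-critical decision never waits on the gateway or the cloud.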
4. Filter data, but don’t throw it out
The sheer volume of data flowing from an IoT or ecommerce application may overload the infrastructure or network of many businesses. Enter filters, which aim to cut down on the overload by passing along only the data that is relevant to the use case. Streaming analytics is also performed only on slices of time-series data, to see whether an alert or automated decision should be triggered.
But historical data has tremendous value. In predictive maintenance, for example, historical readings from a variety of sensors can be used to create a model of when a part might fail, and then that model can be embedded into the stream. The same goes for building-energy management: to predict when a thermostat can be raised or lowered in response to a variety of signals, such as electricity prices and solar power production, the historical data first needs to be collected.
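A minimal sketch of this pattern follows: every event is archived, only relevant events enter a sliding window, and the window is compared against a level learned from past data. The "model" here is deliberately simple (mean plus a multiple of the standard deviation) and stands in for whatever a real project would train; the event fields and the in-memory `archive` list are likewise assumptions.

```python
from collections import deque
from statistics import mean, pstdev

def fit_threshold(history, k=2.0):
    """Learn an alert level from archived readings: mean plus k standard deviations."""
    return mean(history) + k * pstdev(history)

def run_stream(events, threshold, window_size=3):
    """Archive every event, analyze only a sliding window of relevant ones."""
    archive = []                        # stands in for durable storage
    window = deque(maxlen=window_size)  # the current slice of time-series data
    alerts = []
    for event in events:
        archive.append(event)           # keep everything, relevant or not
        if event["sensor"] != "temperature":
            continue                    # filter: not needed for this use case
        window.append(event["value"])
        if len(window) == window_size and mean(window) > threshold:
            alerts.append(event["ts"])
    return alerts, archive
```

The filtered-out events still end up in the archive, so they remain available for building a better model later.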
5. Don’t forget integration
When choosing a streaming analytics provider, it’s of course crucial that their software integrate with the tools the business uses for data ingestion and persistence. It’s also crucial to consider integration with software used for enterprise reporting and business intelligence. A typical analytics environment is hybrid and often entails batch processing, real-time processing, and interactive SQL querying.
“Consider a solution that allows convergence of stream, batch, and interactive processing to get a complete picture of your business, both historical and real time,” TDWI advises.