ScaleOut’s integration of its real-time, digital twin streaming service running in the Azure cloud and Microsoft’s Azure Digital Twins platform lets Azure Digital Twins users leverage in-memory computing and extend their use of digital twins to provide real-time analytics at scale.
Many application areas for logistics, security, healthcare, manufacturing, and more require extensive analysis of streaming data from sensors and IoT devices. In many cases, storing this data and analyzing it at a later time is not fast enough. Increasingly, this data must be analyzed in real-time so appropriate actions can be taken in the moment.
RTInsights recently sat down with Dr. William Bain, CEO and founder of ScaleOut Software, to discuss the challenges of implementing real-time streaming analytics, the benefits of using digital twins, and how in-memory computing can help deliver real-time performance. Here is a summary of our conversation.
RTInsights: Given the large and growing volume of IoT streaming data that needs to be processed, what are the key problems/challenges with implementing real-time streaming analytics?
Bain: With the wide proliferation of intelligent IoT devices within complex systems, the need for streaming analytics has increased dramatically. Streaming analytics needs to be able to sift through torrents of incoming data, identify issues, and provide answers fast so that managers can respond effectively to problems and capture opportunities. They need up-to-the-minute analytics results to keep their complex systems running smoothly.
RTInsights: How are people tackling this problem today?
Bain: Most streaming analytics systems only do a rudimentary amount of analysis in real time, that is, as the data flows in. Typically, they persist incoming data in log files or databases for offline analysis, for example, by querying a database or by running batch analytics code using big data techniques.
The limitation of these techniques is that they do not provide the fast responses needed to maximize situational awareness for the managers of complex systems, such as a telematics system tracking thousands of trucks in a fleet. As a result, important issues may emerge without real-time analysis and never receive timely, effective responses.
RTInsights: How can digital twins help?
Bain: Digital twins enable application developers to perform deeper introspection on incoming telemetry as it arrives while also simplifying their code. The digital twin model is a software technique that originated in the field of product lifecycle management, and it provides a compelling way to organize analytics code.
In streaming analytics, it can be used to track the relevant, dynamically evolving state of each data source, typically a physical device, and analyze telemetry from that data source as it arrives. By maintaining state information and focusing on a single data source, a digital twin’s analytics code can quickly find important patterns in the telemetry that point to emerging issues needing attention and action. A digital twin can also encapsulate machine learning algorithms to assist it in detecting patterns of interest that would otherwise be difficult to identify.
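The core idea, a twin that holds per-source state and analyzes each message in that context, can be sketched in a few lines. This is a minimal illustration only; the class and field names are hypothetical and do not reflect ScaleOut's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class EngineTwin:
    """Tracks the evolving state of one engine and analyzes its telemetry."""
    truck_id: str
    recent_temps: list = field(default_factory=list)
    alerts: list = field(default_factory=list)

    def on_message(self, temp_c: float) -> None:
        # Keep a short rolling window of state for this one data source.
        self.recent_temps = (self.recent_temps + [temp_c])[-5:]
        # Pattern detection uses the twin's history, not just one reading.
        if len(self.recent_temps) == 5 and all(
            b > a for a, b in zip(self.recent_temps, self.recent_temps[1:])
        ):
            self.alerts.append(f"{self.truck_id}: temperature rising steadily")

twin = EngineTwin("truck-17")
for t in [88.0, 90.5, 92.1, 95.0, 97.3]:
    twin.on_message(t)
print(twin.alerts)  # one alert: five steadily rising readings
```

Because each twin sees only its own data source, the pattern-matching logic stays simple even when the platform hosts thousands of such twins.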
RTInsights: What role does in-memory computing play?
Bain: In-memory computing technology has been developed over the last several decades to provide fast, scalable data storage with integrated data-parallel computing. It provides an ideal platform for hosting digital twins that are performing streaming analytics. Its object-oriented data storage matches the digital twin model, and it can process incoming messages with extremely low latency while scaling transparently to handle many thousands or even millions of data sources.
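To make the scaling mechanism concrete, here is a rough sketch of how an in-memory data grid might partition digital twins by data-source key so that message processing spreads across servers. The structure is a hypothetical simplification, not ScaleOut's actual implementation.

```python
class Partition:
    """One server's share of the grid: an in-memory map of twin objects."""
    def __init__(self):
        self.twins = {}  # object-oriented, in-memory storage

    def dispatch(self, key, message, handler):
        state = self.twins.setdefault(key, {"count": 0})
        handler(state, message)  # process in place, no database round trip

def make_grid(n_partitions):
    return [Partition() for _ in range(n_partitions)]

def route(grid, key, message, handler):
    # Hashing the data-source key picks the partition that owns its twin,
    # so adding partitions transparently spreads the load.
    grid[hash(key) % len(grid)].dispatch(key, message, handler)

def count_handler(state, message):
    state["count"] += 1

grid = make_grid(4)
for i in range(100):
    route(grid, f"sensor-{i % 10}", {"v": i}, count_handler)
total = sum(len(p.twins) for p in grid)
print(total)  # 10 distinct data sources tracked across 4 partitions
```

Keeping each twin's state pinned in memory on its owning partition is what makes per-message latency low: every message is handled where the state already lives.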
RTInsights: How does this use of digital twins differ from other uses, such as for product life cycle management (PLM)?
Bain: In PLM, digital twins are used to model the behavior of their respective data sources to aid in designing complex systems like aircraft, wind turbines, or large manufacturing plants. In streaming analytics, they can be used to track dynamic state information about every data source, run analytics code, and use machine learning to identify important patterns and create alerts for managers.
RTInsights: Can you describe some applications that would benefit from the use of digital twins for real-time streaming analytics?
Bain: Sure. Many applications need to track telemetry from large populations of data sources and quickly identify issues. Consider a telematics application that tracks a fleet of vehicles to assist dispatchers. This application might use digital twins to find issues with truck engines or with cargo, such as refrigerated goods carried within trucks, or to detect lost or fatigued drivers, and much more.
Security applications, both physical and cyber, need to detect unauthorized or unsafe entries into restricted areas in a corporate environment, or in the case of cybersecurity, to detect intrusions within a large and complex network infrastructure.
Healthcare applications that track medical devices, such as smartwatches and cardiac monitors, need to be able to detect emerging medical issues for their respective patients based on specific knowledge of each patient’s medical history and current condition.
Credit card fraud detection systems need to be able to analyze credit card transactions for potential fraudulent behavior and respond while transactions are still in progress based on information about both the details of the transaction and the credit card holder’s history.
E-commerce shopping applications need to be able to analyze shopping behavior based on each shopper’s clickstream and known preferences to make in-the-moment product suggestions from the current product inventory.
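These scenarios share one pattern: a twin per data source that scores each incoming event against that source's own history. Taking the fraud-detection case as an illustration, a per-cardholder twin might look like the following sketch (the names and the simple scoring rule are illustrative assumptions, not a production fraud model).

```python
class CardholderTwin:
    """Holds one cardholder's history so each transaction is scored in context."""
    def __init__(self, home_country):
        self.home_country = home_country
        self.avg_amount = None  # running average of this holder's spending

    def score(self, amount, country):
        """Return True if the transaction looks suspicious for this holder."""
        suspicious = (
            country != self.home_country
            and self.avg_amount is not None
            and amount > 5 * self.avg_amount  # large vs. this holder's norm
        )
        # Update the per-holder running average after scoring.
        self.avg_amount = (amount if self.avg_amount is None
                           else 0.9 * self.avg_amount + 0.1 * amount)
        return suspicious

twin = CardholderTwin("US")
r1 = twin.score(40.0, "US")    # establishes a baseline
r2 = twin.score(45.0, "US")    # near the holder's average
r3 = twin.score(900.0, "RU")   # large charge from abroad
print(r1, r2, r3)  # False False True
```

Because the twin carries the cardholder's context, the decision can be made while the transaction is still in progress, which is exactly what a post-hoc database query cannot do.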
RTInsights: What does your integration with Azure Digital Twins provide?
Bain: We have created an exciting integration of our real-time, digital twin streaming service running in the Azure cloud and Microsoft’s Azure Digital Twins platform. This integration enables Azure Digital Twins (ADT) users to take advantage of our in-memory computing platform and extend their use of digital twins to provide real-time analytics at scale. They can use our platform to build digital twin models that incorporate fast streaming analytics and then automatically generate Azure Digital Twin models within the ADT platform.
Our in-memory computing platform can process incoming messages and update state information stored within Azure Digital Twins. We also provide automatic connectivity to the Azure IoT Hub, so that message processing does not have to be manually implemented by ADT users.
RTInsights: What are the benefits for Azure Digital Twins users and for the Azure IoT ecosystem?
Bain: This integration provides important new benefits in simplifying streaming analytics code while dramatically increasing performance by both reducing latency and boosting scalability. Application developers can create their real-time streaming analytics code in one place, within our in-memory computing platform, and gain immediate access to Azure Digital Twin state information.
As I mentioned previously, we also provide connectivity to Azure IoT Hub and to other data sources within the Azure IoT ecosystem. So, for example, our platform lets applications quickly and easily update the state of multiple Azure digital twins in the ADT hierarchy instead of having to create specialized and more complex code with serverless functions for this purpose.
RTInsights: How does your integration of machine learning into this digital twin solution add value for streaming analytics?
Bain: We have integrated Microsoft’s ML.NET machine learning library into our digital twin streaming service so that each digital twin can independently run an ML model to analyze telemetry from its data source. This provides important new capabilities for analyzing telemetry in real time.
For example, the various parameters of an air compressor can be continuously assessed using a trained ML model to detect anomalies and predict possible failures. Writing analytics code to perform this function can be difficult since the interactions between parameters are often subtle and complex. However, ML algorithms can be easily trained to detect anomalies like this. This gives application developers an important new tool for performing streaming analytics.
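Setting the ML.NET specifics aside, the idea behind the air compressor example can be sketched language-neutrally. A trained ML model would stand in for the simple per-parameter z-score check used here; the class, thresholds, and sample readings are illustrative assumptions.

```python
import math

class CompressorTwin:
    """One compressor's twin: learns its normal ranges, then flags outliers."""
    def __init__(self, training):
        # "Train" on normal readings: per-parameter mean and std deviation.
        n = len(training)
        cols = list(zip(*training))
        self.means = [sum(col) / n for col in cols]
        self.stds = [
            math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
            for col, m in zip(cols, self.means)
        ]

    def is_anomaly(self, reading, threshold=3.0):
        # Flag if any parameter strays far from its learned normal range.
        return any(
            abs(x - m) / s > threshold
            for x, m, s in zip(reading, self.means, self.stds)
        )

# Columns: [pressure_psi, temperature_c, vibration_mm_s]
normal = [[100, 60, 1.0], [102, 61, 1.1], [98, 59, 0.9], [101, 60, 1.0]]
twin = CompressorTwin(normal)
ok = twin.is_anomaly([101, 60, 1.05])   # within normal range
bad = twin.is_anomaly([100, 95, 1.0])   # temperature spike
print(ok, bad)  # False True
```

A real ML model improves on this by learning the interactions between parameters, so it can flag combinations that look normal parameter-by-parameter but are anomalous jointly.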