Businesses today need to be able to easily build and scale analytics applications that can analyze data when it is created or when events occur.
For years, businesses have run their operations, interacted with customers, and tried to defend against attacks and fraud using business intelligence derived by applying analytics to historical data. They’d identify patterns, search for anomalies, and make decisions or take actions based on their findings.
Unfortunately, more is needed in today’s fast-paced business world. Businesses must be in a position to act quickly on emerging opportunities, meet the rapidly changing needs and preferences of their customers, mitigate potential disruptions before they impact operations, and stop malicious activities before they inflict any damage.
All of these functions require an ability to act on information as it happens (or as close to when it happens as possible). That has ushered in a need for real-time analytics, which makes use of a variety of data sources, including events and streaming data.
Many industries and applications can benefit from the application of real-time analytics at scale. For example:
- An online retailer might use a customer’s live clickstream data, combined with that customer’s past purchases and shopping preferences, to make highly personalized, real-time offers while the customer is visiting the site.
- A telecom company or communications service provider might use real-time information about network traffic loads, customer usage patterns, and more to dynamically avoid bottlenecks and outages and meet customer performance expectations.
- A financial services firm might use real-time analysis of transactions, mixed with a customer’s activity history, to spot fraud and halt a transaction before it takes place.
Many more use cases exist. The important point is that real-time analytics is analytics applied to the real-time data constantly generated by the systems and services on which modern businesses run.
Obstacles to enabling real-time analytics at scale
The insights from real-time analytics help businesses address the complex nature of modern operations, which requires situational awareness and a real-time understanding of the interplay between elements and the performance of the many disparate systems, applications, and services.
Regardless of the industry or application, there are some common challenges that have prevented businesses from taking full advantage of real-time analytics. For one, the traditional approach to analytics has mostly focused on BI derived from information stored in legacy databases and more modern data warehouses. A retailer, for example, might mine its customer database to make seasonal offers to select groups. Today, to stay competitive, that retailer would need to make on-the-fly, personalized offers as a customer interacts with the business. Traditional methods simply cannot accommodate such operations.
What has prevented businesses from using and scaling real-time analytics is the difficulty of accommodating three factors:
- Latency: This is a measure of how quickly an application or service can request data and get a result.
- Freshness: This is a measure of the delay between when data is created through a real-world event and when it is available to act upon.
- Concurrency: This is a matter of supplying low-latency access to fresh data no matter how many applications or people are accessing it at the same time.
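To make the three factors concrete, here is a minimal Python sketch of how each might be measured. Everything in it is an illustrative assumption rather than part of any particular product: the `Event` record, the helper names, and `fake_query`, which merely stands in for a real analytics query.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Event:
    created_at: float    # when the real-world event happened (epoch seconds)
    available_at: float  # when the event became queryable (epoch seconds)


def freshness_seconds(event: Event) -> float:
    """Freshness: delay between event creation and availability for querying."""
    return event.available_at - event.created_at


def timed_call(query) -> float:
    """Latency: time from issuing one query to getting its result."""
    start = time.perf_counter()
    query()
    return time.perf_counter() - start


def latencies_under_load(query, clients: int) -> list[float]:
    """Concurrency: issue the same query from several clients at once
    and record the latency each client observes."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(lambda _: timed_call(query), range(clients)))


def fake_query() -> int:
    # Hypothetical stand-in for a real analytics query.
    time.sleep(0.01)
    return 42


event = Event(created_at=100.0, available_at=100.5)
print("freshness (s):", freshness_seconds(event))
print("worst latency with 8 clients (s):",
      max(latencies_under_load(fake_query, clients=8)))
```

In a real system, the interesting question is whether the worst observed latency stays low as the number of clients grows while freshness stays small.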
Why is there a problem supporting these basic requirements of real-time applications?
To start, while data warehouses are often the centerpiece of modern data architectures, they do not support low-latency ingestion and querying. Furthermore, traditional application DBs like Postgres and MySQL do not scale very well for analytics use cases. And newer OLAP DBs like ClickHouse, Druid, and Pinot can work for real-time analytics, but they’re difficult to set up and configure, which slows teams down on the way to production.
An additional element that has prevented the use of real-time analytics at scale is the siloed skills and missions of different team members. Data teams have generally been asked to focus on things like data modeling for BI and machine learning. They do not focus on application development. And backend developers are not comfortable building the ETL pipelines needed for real-time analytics.
What’s possible when those challenges are overcome
A solution that addresses these challenges would need certain features and characteristics. Key elements would include an SQL-based solution that works with raw data on the fly, without the need for ETL pipelines. The solution should also operate orders of magnitude faster than relational databases and traditional data warehouses. And it should be able to determine the best schema for the data and ingest millions of rows per second.
Using such a solution enables many application use cases. Some examples include:
- Operational intelligence: Allows a business to surface business data in real-time dashboards so the entire organization can make data-driven decisions.
- Real-time personalization: Allows a business to make sense of user behavior as it happens, so it can take action and meet its customers’ needs and preferences.
- In-product analytics: Provides a business with easy access to data that helps it understand how a product is enabling value in real time.
- Anomaly detection and alerts: Helps a business identify data deviations and respond to them in real time to prevent downtime or loss of revenue.
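As one illustration of the last use case, the sketch below flags values in a stream that deviate sharply from a recent window. It is a simple rolling z-score approach chosen for illustration, not a method prescribed here; the class name, window size, threshold, and sample data are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


class RollingAnomalyDetector:
    """Flags values far from the mean of the most recent observations."""

    def __init__(self, window: int = 20, threshold: float = 3.0):
        self.values = deque(maxlen=window)  # recent history only
        self.threshold = threshold          # allowed distance in std deviations

    def observe(self, value: float) -> bool:
        """Return True if value deviates strongly from the recent window."""
        is_anomaly = False
        if len(self.values) >= 2:  # stdev needs at least two points
            mu = mean(self.values)
            sigma = stdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly


# Hypothetical metric stream, e.g. requests per second.
detector = RollingAnomalyDetector(window=10, threshold=3.0)
stream = [100, 102, 99, 101, 100, 98, 103, 100, 500, 101]
alerts = [v for v in stream if detector.observe(v)]
print(alerts)  # [500]
```

Because each value is checked as it arrives, an alert can fire the moment the deviation appears rather than after a batch job runs.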
A final word
Batch processing of data is no longer a competitive advantage. Businesses today need to be able to easily build and scale analytics applications that can analyze data when it is created or when events occur. These applications must deliver insights in under a second so that actions can be taken as situations change. Additionally, the applications must support low latency and high concurrency. Such applications will help businesses position themselves for the modern world.