The ability to perform real-time analytics to drive personalization is critical to making customers happy. Let’s look at some of the tools and use cases.
Today’s app users are notoriously impatient; not only do they demand new, delightful experiences that engage and inspire, they demand them at lightning speeds.
Most of us take it for granted, for example, that Expedia will present on-the-spot companion car rental offers to go with the flights we just booked, Chase Bank will identify fraudulent activity on our credit cards before a charge is authorized, and Spotify will always know the perfect song to play next without missing a beat.
But while users may have the luxury of taking the real-time nature of these personalized experiences for granted, application developers do not. The ability to instantaneously perform the analytics that fuel real-time personalization is critical not only in avoiding the ire of impatient users but, often, in generating the follow-on business. After all, there’s little point in presenting car rental suggestions (based on a resource-consuming analysis of a user’s travel itinerary, attributes, and past and present buying behavior) if the user has already left the app.
Real-world Use Cases for Real-time Analytics
Apps employ real-time analytics to create all manner of ingenious personalized user experiences, but the majority of them fall into one of three common use case categories.
Probably the most common use of real-time analytics is in presenting recommendations to users. At their essence, recommendations are predictions. Their intention is to anticipate (i.e. predict) that which will keep a user engaged.
Within a recommendations system, an application basically says, “Because of X, we recommend Y.” X can represent something as simple as a static user attribute such as gender, age, or marital status. Or, X can represent a complicated algorithm that takes into account everything known about a user, from their demographics, preferences, geolocation, interests, correlations to other users, past habits… all the way to their most recent behavior a few (milli)seconds earlier.
Amazon is recognized as a pioneer of personalized recommendations, but other platforms make extensive use of recommendations as well. For example, Facebook optimizes our activity feeds by prioritizing posts we’re most likely to engage with, Netflix is quick to fill the post-binge void with insightful viewing suggestions, and Google relentlessly haunts us with ads from websites we soon come to wish we’d never visited.
#2: Fraud Detection
The analytics behind fraud detection focus on identifying unusual behavior. As with recommendations, fraud detection is also, at its essence, a form of prediction. Expectations (i.e. predictions) of future user activity are established based on a compilation of user attributes, past behaviors, correlations, and other data points. If user activity falls outside these parameters of expected behavior, an exception is flagged and additional safeguards are triggered.
Banks use fraud detection techniques to prevent credit card fraud. Charges that notably diverge from usual spending habits—whether in amount, location, or purpose—are put on hold until the credit card holder confirms activity. Other types of fraud such as gaming fraud or identity fraud can also be thwarted through fraud detection measures. For example, many apps including Twitter, Uber, Gmail, and PayPal (to name a tiny handful) recognize login attempts from new devices and deploy additional identity authentication procedures as a result.
The requirement that fraud detection takes place before the fraudulent activity is concluded highlights the extreme value—and challenges—of real-time analytics. Vast amounts of streaming transaction data must be analyzed against vast amounts of stored user data in order to verify the validity of every single transaction (of which there are potentially thousands occurring simultaneously). And it all needs to happen in real time.
#3: Interactive Reporting
Interactive reporting (i.e. the ability for a user to query a database using user-defined criteria) is another form of real-time analytics. The dating app Match uses interactive reporting to allow its millions of users to filter dating profiles by a countless number of data point combinations. Search engines use interactive reporting to return search results.
With the potential for very large volumes of results (in the case of search engines, one query can return millions of pages), interactive reporting systems typically strive to rank results by user relevance, thereby injecting an element of prediction into the process. And just as with recommendations and fraud detection, interactive reporting can leverage any number of user data points to predict and prioritize that which will be most relevant to the user.
Optimizing Your Database for Real-Time Analytics
The increasing availability of cheap compute capacity, data processing tools, and learning frameworks is making it easier than ever to incorporate analytics into applications. However, analytics cannot come at the cost of application performance or responsiveness. Even a fraction of a second of additional latency has been shown to negatively impact engagement.
Customer-facing applications, in particular, have a high bar to meet. User expectations hover at less than 100 milliseconds response time. With Internet latencies at approximately 50 milliseconds round trip, application processing, data access, and response must all occur within 50 milliseconds. To achieve this at scale, databases must be capable of delivering sub-millisecond response times under conditions of any load, while simultaneously offering highly personalized user experiences.
It’s a complicated balancing act, but one that can be pulled off with a database that is optimized for real-time analytics. As such, here are several database features that can make a big difference when bringing real-time analytics to your application.
NoSQL Architecture: If your app deals with unstructured data such as texts, JSON, events, actions, posts, video, email, etc., there’s no compelling reason to endure the high latencies that typically accompany the cross-table joins and heavy queries of relational databases. NoSQL databases, due to their lightweight data models and ability to scale across multiple servers, can achieve much higher performance than traditional relational databases in the fast-moving landscape of real-time analytics.
In-memory Architecture: A step further than NoSQL is in-memory NoSQL. In-memory architecture delivers faster data access than disk-based architecture, which is critical when capturing real-time events at scale and subsequently rendering personalized experiences based on these events.
Flexible Data Structures: NoSQL offers more flexibility than relational databases, but not all NoSQL is created equal. When it comes to real-time analytics, true performance boosts occur when an app can take advantage of data structures that have been purpose-built to support the objective at hand. For example, sorted sets offer highly efficient means for performing the sorting and ranking operations often used within recommendations systems.
Sorted sets, sets, hashes, lists, strings, bitmap, and hyperlog are all examples of data structures designed to not only more elegantly store variably structured data, but also perform complex analytics on the data via built-in operations. The more data structures and built-in operations you have at your disposal for current and future analytics needs, the more network and computing overhead you can eliminate, while also radically simplifying application development.
Native Support for Machine Learning Models: All three of the use cases mentioned previously benefit tremendously from an app’s ability to continuously hone its predictive capabilities through machine learning. Sophisticated machine learning models are deep and have high levels of accuracy, but this proficiency comes with a large footprint and real-time update requirements. Serving such models at scale can lead to an explosion in infrastructure requirements unless your database natively understands these models. Databases that natively store, serve, execute, and update machine learning models achieve the scale and speed required for inline processing, with very few resources.
Active-Active Replication: Under active-active replication, all database instances are available for read and write operations and are bidirectionally replicated.To gain local latencies in such scenarios CRDTs (conflict-free replication datatypes) provide the most elegant solution. For globally distributed applications that must perform real-time analytics on events happening simultaneously across multiple regions and data centers, databases built on active-active architecture significantly enhance performance.
Personalized and Performant: Next-generation applications must embrace analytics in order to generate engaging and inspiring personalized experiences. And they must do it without sacrificing a drop of performance. Fortunately, with the right database optimizations, these two objectives are not mutually exclusive.