Real-time situational awareness means examining relationships. Enter the graph database.
“In government, speed can sometimes really matter, as a lot of areas involve situational awareness in real time. You need to be able to make decisions quickly.”
The statement comes from John Suhy, CTO at PureThink, a provider of security application solutions to a number of U.S. federal, state, DoD and intelligence agencies.
Nowhere is the need for situational awareness more relevant than in identifying criminality, or in critical situations such as terrorism. The good news is that graph database-based approaches are helping to address the problem.
That’s because they have three strengths that make them ideally suited to opening up real-time situational awareness. First, graphs are easy to query and navigate, even for non-developers. Second, they can navigate highly complicated and connected datasets, thus helping users shine a spotlight on “unknowns.” Third, they are able to interrogate these large volumes of highly connected data at lightning speed.
The rise of NoSQL ways of working with data
Graphs are a big part of a new generation of database technologies that can manage large datasets: the NoSQL family, which began as in-house solutions developed by the social web giants but has gradually become mainstream.
NoSQL (“Not only SQL”) includes the key-value store, the column family database, the document database and the graph database. A fifth technology, Big Data databases oriented at large-scale batch analytics, has also come out of the same research push. Each has different strengths but all are aimed at harnessing large volumes of data. That matters since we are all generating more and more data every day, a data trail that’s creating a major challenge for governments looking to gain actionable insights into issues, of which security is only one.
While classical business database software, the RDBMS (relational database management system), still has an important role to play in the government context, it is poorly suited to this class of data problem. Relational databases are adept at managing transactional and analytical requirements and are easy to set up, access and extend, but they struggle with the large amounts of connected data that agencies now need to manage.
However, while other NoSQL databases can power all sorts of big data work, graph databases are built for tasks that require examining the connections between people, places and events in the real world, because they can handle both data volume and data connectedness.
Graph databases are navigated and searched by following relationships. This type of storage, navigation and search is just not that easy to do with relational databases, as they are constructed from rigidly defined tables, so it’s less easy to follow connections wherever they may lead.
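To make "following relationships" concrete, here is a minimal sketch in plain Python. It is not how any particular graph database is implemented, and every entity name in it is invented for illustration. It simply shows the idea of starting at one record and walking outward along its connections, rather than joining fixed tables.

```python
from collections import deque

# Hypothetical toy data: each key is an entity and each value lists the
# entities it is directly related to. All names are invented.
RELATIONSHIPS = {
    "person:alice": ["phone:555-0100", "account:A1"],
    "phone:555-0100": ["person:alice", "person:bob"],
    "account:A1": ["person:alice", "company:shellco"],
    "person:bob": ["phone:555-0100"],
    "company:shellco": ["account:A1", "person:carol"],
    "person:carol": ["company:shellco"],
}


def neighbours_within(graph, start, max_hops):
    """Return {entity: hop_count} for everything reachable within max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Do not expand nodes that already sit at the hop limit.
        if seen[node] == max_hops:
            continue
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


# Everything within two relationships of Alice.
result = neighbours_within(RELATIONSHIPS, "person:alice", 2)
```

In this toy dataset the walk reaches Bob through a shared phone and the company through a shared account, while Carol, three hops away, stays outside the two-hop limit.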
A set of convincing graph database use cases is emerging
So how does graph technology uncover “unknowns”? Let’s consider one noteworthy example of graph databases being used to do just that: the Panama Papers investigation, which exposed the internal operations of the offshore law firm Mossack Fonseca. The 2.6 terabytes of data obtained by a German newspaper mark the largest data investigation in history, and it was powered by graph technology. Only graph database technology could unlock the secrets that reporters knew were there.
According to a lead researcher on this project, “It wasn’t until we picked up [graph database technology] that we started to really grasp the potential of the data. And the reaction we started to get from colleagues when we put the data there? ‘Oh my God, this is magic!’”
Finally, graph databases can carry out data interrogations at high speed.
As Suhy from PureThink comments: “It’s amazing how fast graph is compared to using relational… In one project, we were trying to find some specific fraud patterns using [Big Data database leader] Hadoop. We’d have an hour or even two before we’d get back results – and even then, only an analyst would ultimately be able to understand what was going on. The problem is that in that hour-long window, a fraudulent event could have already happened, making our initial Hadoop solution only good for looking at past scenarios.
“We decided to test the same data in a graph database. The results came back instantly. We were amazed.”
The door to situational awareness has opened
U.S. Immigration and Customs Enforcement, with the help of relationship connections visualized in real time, could work on individual cases of potential interest to border control in a very immediate way. One way is to track activity such as social media more closely.
In our conversations with customers and prospects in the public sector, we are finding all this is just the start. How about a system so clued-up about a population’s cellphone use that it can in seconds track back anyone who phones through a bomb warning, lighting their network up on a law enforcer’s screen?
Another example where graph’s potential is being explored: Consider being able to closely follow the money trail between people and their bank accounts in order to stop tax evaders, white-collar criminals and terrorists by unraveling their webs of deception, webs too complex for RDBMS approaches to manage. In each of these scenarios, it’s a graph database that is the tool policy makers and switched-on agencies are turning to.
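The money-trail idea can be sketched the same way. The example below is an illustrative assumption rather than any agency's or vendor's method: the transfer records and names are invented, and a simple breadth-first search stands in for the path queries a graph database would run.

```python
from collections import deque

# Hypothetical transfer records as (from, to) pairs. All names are invented.
TRANSFERS = [
    ("person:dave", "account:D1"),
    ("account:D1", "account:X9"),
    ("account:X9", "company:offshore_ltd"),
    ("company:offshore_ltd", "account:Z3"),
    ("account:Z3", "person:erin"),
    ("person:dave", "account:D2"),
]


def build_graph(edges):
    """Turn (from, to) pairs into a directed adjacency mapping."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph


def money_trail(graph, source, target):
    """Return the shortest directed path from source to target, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


graph = build_graph(TRANSFERS)
trail = money_trail(graph, "person:dave", "person:erin")
```

Here the trail runs from Dave through two accounts and an intermediary company before reaching Erin. In a relational design each hop would typically be another join, which is why chains of unknown length are awkward to express there.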
So what’s the conclusion for the real-time security practitioner? As business processes become faster, the window for identifying criminality and responding to critical situations is becoming narrower, increasing the need for real-time solutions.
And because traditional technologies, while still suitable for certain types of prevention, are not designed to detect these elaborate networks, graph databases offer a distinctive ability to uncover important criminal patterns in real time. They can help build just the right platform for the situational awareness that the U.S. public sector needs.
Case studies: compliance and anti-fraud