How to Use Machine Learning to Manage User Access

Capital One is working to scale machine learning methodologies to manage both external and internal user access.

Written By

Kat Campise

Sep 15, 2017

3 minute read

As machine learning methods continue to improve, so does their potential usefulness across industries. Healthcare and finance, for example, are particularly vulnerable to breaches in data security. Social Security numbers, names, addresses and financial information are the data goldmine hackers seek to find and exploit.

While technologies such as blockchain are quickly evolving to make it nearly impossible for data to be hacked, their complexity and limitations mean they can’t yet handle the millions of transactions that occur, particularly in the finance sector.

Machine learning and the human factor

Experian recently released a report stating that 66 percent of companies surveyed admitted that employees are “the weakest link in their efforts to create a strong security posture.” In other words, organizations face not only serious external threats but also a distinct internal risk from employees who cause a data breach, whether through carelessness or on purpose.

The more data breaches that occur, the less confidence consumers have in the organizations that store sensitive data. Capital One is one organization working to scale machine learning methodologies to manage both external and internal user access. Recently, Jon Austin, a machine learning engineer from Capital One, gave a demonstration at the Qubole Data Platforms Conference of how the enterprise is using machine learning to manage administrative access on the back end.

“Start with the problem you are trying to solve,” Austin said, while swiftly clicking through an array of graphics depicting edges and nodes. “Nodes are your individuals who have access to the data, and the edges are the relationships between the nodes.” The primary feature Austin emphasized in his training data was how frequently each user accessed areas of the Capital One database. Using the Jaccard index, which measures the overlap between two sets as the size of their intersection divided by the size of their union, Austin and his team could then measure the similarities among users. This would further help determine the who, what, where, and when of access.
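To make the Jaccard index concrete, here is a minimal Python sketch. It is not Capital One's implementation: the user names, resource names and the `similarity_edges` helper are hypothetical, and it assumes each user is represented simply by the set of database areas they have accessed (access frequency is ignored here).

```python
from itertools import combinations


def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: each user (node) maps to the set of database areas accessed.
access = {
    "dana": {"customers", "transactions", "fraud_scores"},
    "eli": {"customers", "transactions"},
    "fay": {"hr_records"},
}


def similarity_edges(access_by_user: dict) -> dict:
    """Return {(user_a, user_b): similarity} for every pair of users (the edges)."""
    return {
        (u, v): jaccard_index(access_by_user[u], access_by_user[v])
        for u, v in combinations(sorted(access_by_user), 2)
    }


edges = similarity_edges(access)
for pair, score in edges.items():
    print(pair, round(score, 2))
```

In this toy data, Dana and Eli share two of three areas and score about 0.67, while Fay shares nothing with either and scores 0. Users whose access looks unlike that of their peers are natural candidates for review.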

How machine learning identifies user patterns

But how can this help user access management? As Austin explained, machine learning models can be trained to recognize normal vs. abnormal (or risky) user access patterns. For example, if data scientist Dana usually logs in and initiates her pull requests from her work computer between 8 a.m. and 5 p.m. Pacific time, the machine learning algorithm recognizes this as a typical pattern. However, if she’s traveling for a conference and tries to access the database from her laptop outside the normal timeframe, even if using a VPN, the machine learning algorithm will flag this as abnormal and deny Dana access.
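The Dana scenario can be illustrated with a deliberately simple stand-in for a trained model. The sketch below is an assumption for illustration only: the `AccessProfile` class and device names are hypothetical, it just memorizes which devices and hours of the day it has seen, and it assumes timestamps are already in the user's usual time zone. A production system would use a real statistical or machine learning model.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AccessProfile:
    """Hypothetical per-user baseline: devices and hours of day seen in history."""
    devices: set = field(default_factory=set)
    hours: set = field(default_factory=set)

    def learn(self, device: str, when: datetime) -> None:
        # Record one historical access as "normal".
        self.devices.add(device)
        self.hours.add(when.hour)

    def is_abnormal(self, device: str, when: datetime) -> bool:
        # Flag anything from an unseen device or at an unseen hour.
        return device not in self.devices or when.hour not in self.hours


# Train on a made-up work week: work computer, accesses starting 8 a.m. to 4 p.m.
profile = AccessProfile()
for day in range(11, 16):
    for hour in range(8, 17):
        profile.learn("work-pc", datetime(2017, 9, day, hour))

print(profile.is_abnormal("work-pc", datetime(2017, 9, 18, 10)))  # False
print(profile.is_abnormal("laptop", datetime(2017, 9, 18, 23)))   # True
```

The late-night laptop request is flagged because both the device and the hour fall outside the learned baseline.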

Of course, these edge-case scenarios can be accounted for with further training and test sets. Dana could also notify the administrator that she’ll be requesting access from a different location and time zone, so that her access privileges can be modified to accommodate the shift.

The primary takeaway here is to continuously update the machine learning model to adapt to new situations, but also to discern outlier scenarios and alert administrators. From that point, the administrator can immediately act so Dana can continue with her work. Or, if Dana is not traveling and this is a possible data breach threat, then the organization is alerted before any damage is done.
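The exception-and-alert flow described above could be sketched as follows. This is an illustrative assumption, not a description of any real system: `TravelException`, `triage` and the returned action strings are hypothetical names, and the `flagged` argument stands in for whatever the anomaly model outputs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TravelException:
    """Hypothetical record of an administrator-approved window of unusual access."""
    user: str
    start: datetime
    end: datetime


def triage(user: str, when: datetime, flagged: bool, exceptions: list) -> str:
    """Decide what to do with a request the model has (or has not) flagged."""
    if not flagged:
        return "allow"
    # A flagged request inside an approved window is treated as expected.
    for exc in exceptions:
        if exc.user == user and exc.start <= when <= exc.end:
            return "allow"
    # Otherwise hold the request and let a human decide.
    return "alert_admin"


approved = [TravelException("dana", datetime(2017, 9, 18), datetime(2017, 9, 22))]

print(triage("dana", datetime(2017, 9, 19, 23), True, approved))  # allow
print(triage("dana", datetime(2017, 10, 1, 23), True, approved))  # alert_admin
```

Keeping the final decision with an administrator matches the article's point that the model surfaces outliers while a person takes the action.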

Machine learning is certainly still in the fine-tuning stages, and humans still need to act as intermediaries, taking decisive action on the information that algorithms provide. Machine learning can, however, shorten the window between unauthorized access and its detection, before it turns into a disastrous data breach.

Human beings are perpetually penetration testing. Machine learning won’t change that fundamental aspect of human nature. But it’s a handy prevention tool that deserves serious consideration in managing the intricacies of user access.