3 Inconvenient Truths about AI and ML


To bridge the gap between the data we’re collecting and the way organizations interface with it, we need to address some uncomfortable realities.

As we step into the next decade, there’s a growing sense – almost an inevitable momentum – that we’re headed towards a golden age of AI. Over the past year, we’ve witnessed incredible advances in applying artificial intelligence techniques to image recognition, language processing, planning, and information retrieval. We’re seeing practical applications of machine learning that improve everyday activities. There are more amusing applications, too, including one team teaching AI how to craft puns.

This is a future I’ve been researching and investing in for the past five years, starting at Berkeley and continuing through our work at the Stanford DAWN project. Our goal is to democratize AI by making it dramatically easier to design, build, deploy, and manage AI- and ML-powered applications. And we’ve seen huge success in a few very targeted domains.

However – particularly in the world of business – it feels like we’re “not quite there yet” when it comes to finding meaningful enterprise ML and AI applications. There’s a growing sentiment that many applications of enterprise ML are too bespoke, require extensive consulting investment, and are at risk of never showing a positive ROI. If we’re going to bridge the gap between the kinds of data we’re collecting at scale and the way analysts, business leaders, and organizations interface with it, we need to address some uncomfortable realities.

We believe there are three inconvenient truths about enterprise ML that are at the root of this challenge. The good news is that each of these challenges is surmountable with the right focus.

Training Data is Scarce

One of the most valuable investments ever made in training data is the ImageNet project, a publicly available set of over 14M categorized and labeled images. Thanks to this investment from Fei-Fei Li and the ImageNet team, researchers and deep learning enthusiasts have been able to improve image classification accuracy dramatically.

However, gathering this kind of labeled data at scale can be very demanding. Particularly for tasks involving sensitive data or limited domain expertise, data is difficult or even impossible to come by. For example, the collection and labeling of DICOM medical image scans is challenging for privacy reasons, and it’s even harder to find experts who can credibly identify and label tumors, tears, and abnormalities. These are really valuable tasks, but it’s an open question whether it’s feasible to get enough data to train on effectively.

Even more common, less sensitive enterprise use cases encounter this challenge. Most enterprise data is siloed and cannot be shared easily across teams or externally to generate training labels. And, in many situations, the data itself can change dramatically from week to week or month to month. A great example of this is retail fashion – inventory, styles, sizing, and merchandising trends shift rapidly, and having to re-label these data sets on a continuous basis is an arduous if not impossible task.

Deep Networks Don’t Help Much with Structured Data

What’s more, deep networks don’t help much with model accuracy for structured data use cases. This is particularly relevant for businesses, as most enterprise information is structured, tabular data. According to IDG, the average enterprise stores more than twice as much structured data as existed on the entire internet when Google launched.

A recent paper from Google on “Scalable and accurate deep learning with electronic health records” illustrates this principle clearly. Focused on prediction accuracy for healthcare outcomes, the paper demonstrates dramatic results from novel deep learning techniques. But, if you dig deeper into the findings, the data shows that simpler approaches like logistic regression perform almost as well.

In other words, we’re not quite at the point where the investment required to train a deep net on structured data delivers a significant ROI above and beyond other techniques. As a result, most organizations will be better off finding ways to scale up more traditional (and explainable) methods of data analysis and exploration.
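One practical way to act on this is to establish a simple baseline before investing in a deep net. The sketch below is only an illustration under stated assumptions: it trains a plain logistic regression with batch gradient descent on synthetic two-column tabular data, using nothing beyond the Python standard library. The data generator, column meanings, and hyperparameters are all made up for the example and are not taken from the Google paper.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_rows(n, rng):
    """Synthetic tabular rows (assumption): label follows a noisy linear score."""
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = rng.gauss(0, 1)
        score = 2.0 * x1 - 1.0 * x2 + rng.gauss(0, 0.5)
        rows.append(([x1, x2], 1 if score > 0 else 0))
    return rows


def predict(x, w, b):
    # Predicted probability that the label is 1.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(rows, epochs=200, lr=0.1):
    """Fit weights and bias with full-batch gradient descent on log loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = predict(x, w, b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def accuracy(rows, w, b):
    # Fraction of rows where the thresholded prediction matches the label.
    correct = sum(int((predict(x, w, b) >= 0.5) == (y == 1)) for x, y in rows)
    return correct / len(rows)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_rows = make_rows(400, rng)
test_rows = make_rows(200, rng)
weights, bias = train_logistic(train_rows)
baseline_accuracy = accuracy(test_rows, weights, bias)
print(f"logistic regression baseline accuracy: {baseline_accuracy:.3f}")
```

The held-out accuracy printed here is the number a more expensive model would have to beat by a meaningful margin to justify its cost, and the learned weights remain directly inspectable.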

AutoML is Not a Panacea

AutoML is gaining a lot of attention as the next major advance for enterprise ML. While automating key steps of the data science process can increase the pace of model creation, automation is not a panacea for enterprise ML. There’s still a long way to go before AutoML models reach the level of accuracy needed for real-world success.

And beyond the creation and deployment of these models, there are large gaps in our ability to monitor, manage, and diagnose the results these models produce. A seminal paper in this area from a team at Google correctly declares that post-deployment, “we find it is common to incur massive ongoing maintenance costs in real-world ML systems.” Until this is addressed, the practical application of ML and deep-learning systems in the enterprise will be delayed.

Three Strategies to Effectively Utilize ML in the Enterprise

So, what can we do in response? By taking each of these truths in turn, we can identify a few key principles that accelerate the adoption and effectiveness of machine learning in the enterprise.

  • First, take advantage of the data we already have. It is possible to get meaningful results from our models and tools faster. Looking across industries, most companies are not putting the data they’re collecting on a daily basis to use. Most estimates, including recent surveys by Forrester, MicroStrategy, and Hitachi, indicate that two-thirds or more of the data collected by businesses goes unused. There’s a huge advantage for companies that can shift their existing data stores from passive to active assets.
  • Second, focus on augmentation, not automation. A great example of a high-impact workflow is the analysis of performance metrics. We’re helping people rapidly perform real-time diagnosis of why their most important metrics are changing, using machine learning and large-scale data explanation. By focusing on augmenting the skills and capacity of these experts in the business, we can help teams find meaningful facts buried deep in their operational data.
  • Finally, put models in production quickly. The old adage, “the perfect is the enemy of the good,” is especially true in this arena. By putting simple, effective models into production and setting the expectation that user feedback is welcome and useful, we can rapidly iterate and use that feedback to further train and refine the system.
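To make the second strategy a little more concrete, here is a deliberately simple sketch of metric diagnosis: given one metric broken out by segment for two periods, rank the segments by how much each one moved. This is an illustrative assumption about what a first pass could look like, not a description of any particular product or of the machine learning approach mentioned above. The function name, the region segments, and the revenue figures are all hypothetical.

```python
def rank_segment_contributions(before, after):
    """Return (segment, delta) pairs sorted by absolute change, largest first.

    `before` and `after` map a segment name to that segment's share of an
    additive metric (for example, revenue). A segment missing from one
    period is treated as contributing zero in that period.
    """
    segments = set(before) | set(after)
    deltas = {s: after.get(s, 0.0) - before.get(s, 0.0) for s in segments}
    # Sort by magnitude of change; break ties alphabetically for stable output.
    return sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


# Hypothetical weekly revenue by region (toy numbers for illustration only).
last_week = {"north": 120.0, "south": 95.0, "east": 80.0, "west": 60.0}
this_week = {"north": 118.0, "south": 60.0, "east": 84.0, "west": 61.0}

ranked = rank_segment_contributions(last_week, this_week)
total_change = sum(this_week.values()) - sum(last_week.values())

print(f"total change: {total_change:+.1f}")
for segment, delta in ranked:
    print(f"{segment:>6}: {delta:+.1f}")
```

Even a breakdown this basic keeps the analyst in the loop: it points to where to look first and leaves the judgment about why to the person who knows the business.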

With the right investments, it’s possible to avoid widespread disillusionment with AI applications in business. There are valuable, practical, and feasible applications for enterprise ML. It’s why I’m excited about the possibility of “off the shelf machine learning” and making new tools accessible to more people in the organization.

Peter Bailis

About Peter Bailis

Peter Bailis is the founder and CEO of Sisu, the fastest and most comprehensive diagnostic platform for structured data. Peter is also an assistant professor of Computer Science at Stanford University, where he co-leads Stanford DAWN, a research project focused on making it dramatically easier to build machine learning-enabled applications. He received his Ph.D. from UC Berkeley in 2015, for which he was awarded the ACM SIGMOD Jim Gray Doctoral Dissertation Award, and his A.B. from Harvard College in 2011, both in Computer Science.
