Companies must rein in big data to ensure their industrial AI systems are delivering the right insights at the right time.
Industrial companies have a lot of data — perhaps more than they actually need — but their artificial intelligence efforts still tend to fall way short of expectations. To be successful, they need to ensure that the data they’re feeding into their industrial AI systems is well-vetted and appropriate.
That’s the call from a recent McKinsey report, which recommends ways for companies to rein in big data so that their AI systems deliver the right insights at the right time. “Many companies in heavy industry have spent years building and storing big data but have yet to unlock its full value,” the McKinsey analyst team, led by Jay Agarwal, reports. McKinsey estimates that more than 75% of industrial companies have piloted some form of AI, yet fewer than 15% have realized meaningful, scalable impact. That’s because there isn’t enough operational insight into, or oversight of, the data being directed into these AI systems.
The key to factory AI success is the availability of reliable historian data, or what Agarwal and his co-authors call “smart data.” To get there, companies need to “adapt their big data into a form that is amenable to AI, often with far fewer variables and with intelligent, first principles–based feature engineering.” By re-engineering for smart data, as well as introducing appropriate training, companies can increase their returns by 5% to 15%, they estimate.
The McKinsey team recommends the following steps to ensure a fit between the data and the industrial AI systems that consume it:
Define the process. “Outline the steps of the process with experts and plant engineers, sketching out physical changes (such as grinding and heating) and chemical changes (such as oxidation and polymerization). Identify critical sensors and instruments, along with their maintenance dates, limits, units of measure, and whether they can be controlled.”
Enrich the data. “Raw process data nearly always contain deficiencies. Thus, creating a high-quality dataset should be the focus, rather than striving for the maximum number of observables for training. Teams should be aggressive in removing non-steady-state information, such as the ramping up and down of equipment, along with data from unrelated plant configurations or operating regimes.”
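To make the idea of removing non-steady-state data concrete, here is a minimal Python sketch. It is not taken from the McKinsey report: the function name `steady_state_mask`, the trailing-window approach, the window size, the tolerance, and the temperature trace are all illustrative assumptions. It keeps a reading only when the last few readings barely vary, which discards the ramp-up and ramp-down.

```python
from statistics import pstdev


def steady_state_mask(values, window=5, max_std=1.0):
    """Return one boolean per reading: True where the trailing window is steady.

    A reading counts as steady when the population standard deviation of the
    last `window` readings (including itself) is at most `max_std`.
    The first `window - 1` readings lack a full window and are marked False.
    """
    mask = []
    for i in range(len(values)):
        if i + 1 < window:
            mask.append(False)
            continue
        recent = values[i + 1 - window : i + 1]
        mask.append(pstdev(recent) <= max_std)
    return mask


# Hypothetical furnace temperature trace: ramp-up, steady operation, ramp-down.
temps = [20, 120, 240, 360, 480, 500, 501, 499, 500, 502, 500, 501, 380, 250, 100]

# window and max_std are assumed values; a real plant would tune them per sensor.
mask = steady_state_mask(temps, window=4, max_std=2.0)
steady = [t for t, keep in zip(temps, mask) if keep]
print(steady)
```

A trailing window is deliberately conservative: the first few readings after a ramp are also dropped, because their window still contains ramp data. That fits the report's advice to be aggressive about removal.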
Reduce the dimensionality. “AI algorithms build a model by matching outputs, known as observables, to a set of inputs, known as features, which consist of raw sensor data or derivations thereof. When combined with the sheer number of sensors available in modern plants, this necessitates a massive number of observations. Instead, teams should pare the features list to include only those inputs that describe the physical process, then apply deterministic equations to create features that intelligently combine sensor information (such as combining mass and flow to yield density).”
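The two moves described here, paring the feature list and adding a deterministic, physics-based feature, might look like the following Python sketch. The sensor names, the sample values, and the `engineer_features` helper are hypothetical. The sketch also reads the report's "mass and flow" example as a mass flow rate divided by a volumetric flow rate, which is one physically consistent way to obtain density.

```python
def engineer_features(row, keep):
    """Build a reduced feature dict from one row of raw sensor readings.

    `keep` lists the raw sensors judged (by process experts) to describe the
    physical process; every other column in `row` is dropped.
    """
    features = {name: row[name] for name in keep}

    # Deterministic, first-principles feature (assumed relation):
    # density [kg/m^3] = mass flow [kg/s] / volumetric flow [m^3/s].
    vol_flow = row["vol_flow_m3_s"]
    features["density_kg_m3"] = (
        row["mass_flow_kg_s"] / vol_flow if vol_flow else None
    )
    return features


# Hypothetical raw row with more sensors than the model needs.
raw_row = {
    "mass_flow_kg_s": 12.0,
    "vol_flow_m3_s": 0.015,
    "inlet_temp_c": 85.0,
    "office_hvac_temp_c": 21.5,  # unrelated to the process, so dropped
    "gate_badge_count": 42,      # unrelated to the process, so dropped
}

features = engineer_features(raw_row, keep=["inlet_temp_c"])
print(features)
```

Here two flow sensors collapse into a single density feature and the unrelated columns disappear, so the model sees two inputs instead of five.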
Focus machine learning on the process at hand. “Overall, the focus should be on creating models that drive plant improvement, as opposed to tuning a model to achieve the highest predictive accuracy. Teams should bear in mind that process data naturally exhibit high correlations. In some cases, model performance can appear excellent, but it is more important to isolate the causal components and controllable variables than to solely rely on correlations.”
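One simple way to act on the warning about correlated process data is to flag strongly correlated variable pairs and note which of them an operator can actually control. The Python sketch below is an illustration only: the variable names, the readings, the 0.95 threshold, and the `controllable` set are assumptions. A high correlation flagged here is a prompt for expert review, not evidence of causation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (spread_x * spread_y)


def correlated_pairs(columns, threshold=0.95):
    """List (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical process readings.
columns = {
    "fuel_valve_pct": [40, 45, 50, 55, 60, 65],
    "burner_temp_c": [802, 851, 899, 952, 1001, 1048],
    "ambient_humidity_pct": [30, 55, 41, 62, 35, 48],
}
controllable = {"fuel_valve_pct"}  # assumed: the only variable operators can set

pairs = correlated_pairs(columns)
for a, b, r in pairs:
    levers = sorted({a, b} & controllable)
    print(f"{a} ~ {b}: r={r}, controllable: {levers or 'none'}")
```

In this made-up data the valve position and burner temperature move together, and only the valve is a lever. A model built to improve the plant would treat the valve as the actionable input, not as two independent predictors.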
Implement and validate the models. “Teams should continuously review model results with experts by examining important features to ensure they match the physical process.”
Build a team. Process experts are in short supply, and companies need to close that gap, because these experts originate and validate the data that will populate AI systems. “Deploying AI in heavy industry requires cross-functional teams made up of operators, data scientists, automation engineers, and process experts. We often find that companies have roles for data science, but they face three main challenges regarding process experts: there is a dearth of process expertise either at a specific facility or across the company; there are sufficient process experts, but they are not comfortable with modern digital or analytical tools; or process experts don’t know how to work effectively on digital teams.”