Time Consumed by Data Prep: Is This a Bad Thing?

Data professionals are spending too much time on data prep, but the quality assurance it provides ensures that projects work with clean data sets.

Written By
Joe McKendrick
Apr 4, 2022

To have a responsive, responsible and accurate artificial intelligence or analytics system, one needs data. The catch is that data scientists and analysts are forced to spend more time on data prep than on creating models and making them valuable to their businesses. This suggests a need for more data engineers and database administrators to handle much of the front-end work that goes into supporting data-driven applications. Importantly, it also means a high degree of teamwork is needed to make data analytics practical.

Ask any data scientist or analyst about the level of support they need to do the jobs they were hired to do. SAS did exactly that, as documented in its recent study of 277 data managers and scientists, which finds that data professionals are spending too much time on data preparation and not enough on model creation. Respondents spend 58% of their time gathering, exploring, managing and cleaning data, which is more than they would prefer.

A typical data science project involves a variety of activities, almost always beginning with preparing data. On average, 11% of data scientists’ or analysts’ time is spent creating computer models. The question is: is this enough?

Data prep may be onerous and take time away from working on business issues, but it’s necessary, the SAS study’s authors point out. “Regardless of your level in the organization, data management will probably take a large share of your time, even with the development of low code/no code tools and AI and machine learning algorithms being written for it,” they write. “The likely reason is that the data you have and how you decide what’s relevant is probably specific to your industry and organization. As is the case for how you approach your model-building, knowing which data is relevant and why has a lot to do with the issues you are trying to solve.”

Data scientist and Data Science Bootcamp Leader Patrick Butler agrees, noting that the whole front-end process of managing and cleaning data “is an intrinsic part of the modeling process.” Without it, “all the modeling that follows is truly just math.” Up-front quality assurance on incoming data is essential for ensuring that training data is built on clean data sets.
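To make the idea of up-front quality assurance concrete, here is a minimal sketch of the kind of check a team might run on incoming records before any modeling begins. It is an illustration only and is not drawn from the SAS study: the function name check_records, the field names and the sample rows are all hypothetical, and real projects would typically apply richer, domain-specific rules.

```python
from typing import Any


def check_records(records: list[dict[str, Any]], required: list[str]) -> dict[str, int]:
    """Count basic quality problems in a list of row dictionaries (hypothetical helper)."""
    seen: set[tuple] = set()
    report = {"rows": len(records), "missing": 0, "duplicates": 0}
    for row in records:
        # A row counts as "missing" if any required field is absent, None, or blank.
        if any(row.get(f) is None or str(row.get(f)).strip() == "" for f in required):
            report["missing"] += 1
        # A row counts as a duplicate if its required-field values were already seen.
        key = tuple(row.get(f) for f in required)
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
    return report


# Made-up sample data, used only to exercise the check.
sample = [
    {"id": 1, "region": "east", "sales": 120.0},
    {"id": 2, "region": "", "sales": 95.5},
    {"id": 1, "region": "east", "sales": 120.0},
]
report = check_records(sample, ["id", "region", "sales"])
print(report)
```

A report like this only flags problems. Deciding which fields are required, and what to do with flagged rows, remains the organization-specific judgment the study’s authors describe.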

Joe McKendrick is RTInsights Industry Editor and an industry analyst focusing on artificial intelligence, digital, cloud and Big Data topics. His work also appears in Forbes and Harvard Business Review. Over the last three years, he served as co-chair for the AI Summit in New York, as well as on the organizing committee for IEEE's International Conferences on Edge Computing. Follow him on Twitter @joemckendrick.

