Data Engineers Spend Two Days Per Week Fixing Bad Data

Data engineers spend 40 percent of their workweek dealing with incidents caused by poor data quality, which they estimate could put 26 percent of an organization’s revenue at risk.

Data engineers are spending two days every workweek fixing issues with data quality, according to a new survey conducted by Wakefield Research and published by data reliability company Monte Carlo.

The 2022 data quality report, which surveyed 300 data professionals, found that it takes four hours on average to detect an incident, and a further nine hours on average to resolve it. In a month, the average data professional has to deal with 61 incidents. 

Few of the data engineers surveyed formally track how long it takes to detect an incident. In the report, Barr Moses, co-founder and CEO of Monte Carlo, noted that more experienced engineers estimated longer detection times.

The number of incidents has also risen rapidly over the past 12 months, with 58 percent of respondents saying it had greatly or somewhat increased. Fewer than 20 percent said data incidents had somewhat or greatly decreased over the same period.

“The first step to improving data quality and trust is measuring it, starting with the number and type of incidents, and setting baselines on response rates and data downtime,” said Shane Murray, CTO at Monte Carlo. 
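As a rough illustration of what such a baseline could look like, the Python sketch below counts incidents by type and averages detection and resolution times. The `Incident` record, the `baseline` function, and the incident type labels are hypothetical. Treating data downtime as the sum of detection and resolution hours is an assumption made for this sketch, not a definition taken from the report.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    kind: str                # hypothetical incident type label
    hours_to_detect: float   # time from occurrence to detection
    hours_to_resolve: float  # time from detection to resolution


def baseline(incidents):
    """Summarize one period's incidents into simple baseline metrics."""
    if not incidents:
        return {"count": 0, "by_kind": {}, "mean_hours_to_detect": 0.0,
                "mean_hours_to_resolve": 0.0, "downtime_hours": 0.0}
    by_kind = {}
    for inc in incidents:
        by_kind[inc.kind] = by_kind.get(inc.kind, 0) + 1
    return {
        "count": len(incidents),
        "by_kind": by_kind,
        "mean_hours_to_detect": mean(i.hours_to_detect for i in incidents),
        "mean_hours_to_resolve": mean(i.hours_to_resolve for i in incidents),
        # Assumption: downtime = detection time + resolution time, summed.
        "downtime_hours": sum(i.hours_to_detect + i.hours_to_resolve
                              for i in incidents),
    }


# The survey averages: 61 incidents a month, 4 hours to detect, 9 to resolve.
month = [Incident("schema_change", 4.0, 9.0) for _ in range(61)]
print(baseline(month)["downtime_hours"])  # 793.0
```

Under that assumption, the survey's averages work out to 61 × (4 + 9) = 793 hours of data downtime a month.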

Teams that ran tests more frequently reported fewer data incidents, although respondents did not see the reduction in downtime as having a meaningful effect on revenue. This may suggest that data professionals see any downtime from data incidents as a serious impediment, regardless of its length.

Data engineers estimate that bad data quality could harm 26 percent of their organization’s revenue. Only five percent of respondents said bad data had no effect on revenue generation.

“This may seem like a shockingly high number for some, but there are multiple surveys that repeatedly reveal the high cost of poor data quality,” said Francisco Alberini, product manager at Monte Carlo. “For example, Gartner finds bad data costs an organization about $13 million a year. The value of data as a business driver is increasing every year, which also makes the cost of data downtime get more expensive.” 

Asked how often bad data affects decision makers and stakeholders, 47 percent of respondents said all or most of the time. Fewer than 25 percent said it rarely or never had an impact.

“For companies to become data driven there needs to be trust in the data and in the data team,” said Mei Tao, product manager at Monte Carlo. “A shift needs to take place where the data team can catch and resolve data incidents before bad data is acted upon.”

Data observability and other metrics to evaluate and filter bad data could reduce the number of incidents that affect an organization’s dataflow, and with it the cost of giving decision makers and stakeholders inaccurate data.
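To make the idea of filtering bad data concrete, here is a minimal Python sketch of a completeness check that keeps incomplete rows away from downstream consumers. The `split_rows` function and the field names are hypothetical and stand in for whatever checks a real observability tool would run. It does not represent Monte Carlo's product.

```python
def split_rows(rows, required_fields):
    """Split rows into (good, bad).

    A row is bad if any required field is missing, None, or an empty string.
    """
    good, bad = [], []
    for row in rows:
        complete = all(row.get(f) not in (None, "") for f in required_fields)
        (good if complete else bad).append(row)
    return good, bad


# Hypothetical order records; the second one is missing its amount.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 5.00},
]
good, bad = split_rows(rows, ["order_id", "amount"])
print(len(good), len(bad))  # 2 1
```

The share of rejected rows can itself be tracked over time as one of the quality metrics.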

David Curry

About David Curry

David is a technology writer with several years’ experience covering all aspects of IoT, from technology to networks to security.
