The Unstructured Data Problem Businesses Can No Longer Ignore

Unstructured data is generally poorly served by traditional data management and analysis tools. Without decisive action, many businesses will not just add to organizational complexity and costs but also miss out on very important opportunities that good data management can deliver.

Written By

Steve Leeper

May 4, 2026

Modern businesses aren’t just data-centric anymore; they are fundamentally data-dependent. Indeed, our collective reliance on digital information now drives decision-making at every turn. To remain agile and competitive, enterprises have to be masters of their numbers.

As if this weren’t challenging enough, the vast majority of enterprise data is unstructured and, as NVIDIA CEO Jensen Huang recently put it, “completely useless to the world” because it cannot be easily queried or indexed.

It’s an important perspective: unstructured data is generally poorly served by traditional data management and analysis tools. Organizations find it hard to carry out some pretty fundamental tasks, such as identifying what data they have, where it resides, how it is used, and whether it has any ongoing value.

The result is that enterprises everywhere spend vast sums collecting and retaining ever-larger volumes of information, but cannot turn its latent value into better bottom-line performance. As Huang went on to explain, “We read it, we put it into our file system, and that’s it.” It’s a remarkable situation that raises the question: Is there any other area of contemporary business practice where this level of inefficiency would be tolerated?

A change of mindset

Clearly, something has to give. Take data visibility, or the lack of it, for example. Without meaningful insight into their data assets, businesses have no reliable way to distinguish between active and inactive data, or to identify what is redundant or no longer needed.

The default response is to retain everything and add more storage as capacity limits are reached. In many ways, it’s an easy decision to make and, from a technology perspective, relatively simple to implement. The downside is that it is hugely expensive and hard to control, with significant amounts of inactive or low-value data often remaining on high-performance, high-cost storage. It is a situation akin to storing valueless trinkets in a bank vault.

Over time, this drives unnecessary infrastructure expansion and increases cost pressure.

A lack of clear ownership and inconsistent handling also adds to already complex governance challenges and significantly increases compliance and security risks. Whichever way you look at it, unstructured data sprawl is bad for business.

This is particularly relevant now that storage costs have become extremely volatile. As a result of all these issues, organizations badly need a change in mindset: away from keeping data at almost any cost and towards a much greater focus on data management.

This approach is based on an understanding of what data exists across the environment, supported by detailed metadata such as age, activity, ownership, and other key attributes. Armed with this insight, the data fog clears, making it possible to distinguish between high-value assets and information that is no longer relevant or in use. This creates the foundation for managing data more confidently and effectively across its lifecycle, including decisions around retention, archiving, and deletion.
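The kind of metadata inventory described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a locally mounted file share; the names FileRecord and scan_tree are hypothetical and do not refer to any particular product. Ownership is reported as a numeric user ID, which is only meaningful on POSIX systems, and last-access times depend on how the file system is mounted, so they should be treated as approximate.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Basic metadata for one file (hypothetical structure for illustration)."""
    path: str
    size_bytes: int
    owner_uid: int
    days_since_modified: float
    days_since_accessed: float


def scan_tree(root, now=None):
    """Walk a directory tree and collect age, activity, and ownership metadata."""
    now = time.time() if now is None else now
    seconds_per_day = 86400.0
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                # Do not follow symlinks, so each entry is counted once.
                st = os.stat(full_path, follow_symlinks=False)
            except OSError:
                # The file vanished or is unreadable; skip it in this sketch.
                continue
            records.append(
                FileRecord(
                    path=full_path,
                    size_bytes=st.st_size,
                    owner_uid=st.st_uid,
                    # Clamp at zero to tolerate small clock differences.
                    days_since_modified=max(0.0, (now - st.st_mtime) / seconds_per_day),
                    days_since_accessed=max(0.0, (now - st.st_atime) / seconds_per_day),
                )
            )
    return records
```

Calling scan_tree on a share's mount point returns one record per file, which can then be grouped by owner or age to see where inactive data is concentrated.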

In practice, the value of data is rarely static. In many environments, the window for active use is relatively short, often around 30–90 days, after which its relevance begins to decline as new information is generated. Moreover, many organizations find that over 60% of their stored data has not been accessed or modified for extended periods, yet remains in place due to a lack of visibility or clear policy direction. This reinforces the need for lifecycle-driven management, where data is continuously assessed and moved, archived, or removed based on defined criteria rather than retained indefinitely by default.
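A lifecycle policy of this kind can be expressed as a small set of age thresholds. The sketch below is illustrative, not a recommended policy: the 90-day figure is taken from the upper end of the active-use window mentioned above, while the archive and deletion thresholds, the tier names, and the LifecyclePolicy, classify, and inactive_share names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LifecyclePolicy:
    """Age thresholds in days since last access or modification."""
    active_days: int = 90      # upper end of the 30-90 day active-use window
    archive_days: int = 365    # hypothetical threshold, chosen for illustration
    delete_days: int = 2555    # hypothetical (~7 years), chosen for illustration


def classify(days_idle, policy=LifecyclePolicy()):
    """Map a file's idle time to a lifecycle action."""
    if days_idle < 0:
        raise ValueError("days_idle must be non-negative")
    if days_idle <= policy.active_days:
        return "active"
    if days_idle <= policy.archive_days:
        return "move_to_lower_tier"
    if days_idle <= policy.delete_days:
        return "archive"
    # Deletion is flagged for review rather than performed automatically.
    return "review_for_deletion"


def inactive_share(idle_days_list, policy=LifecyclePolicy()):
    """Fraction of files that fall outside the active-use window."""
    if not idle_days_list:
        return 0.0
    inactive = sum(1 for d in idle_days_list if d > policy.active_days)
    return inactive / len(idle_days_list)
```

Keeping the thresholds in one policy object means the same criteria can be applied to every environment, and flagging old data for review instead of deleting it outright leaves room for retention obligations to be checked first.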

A final word on addressing the unstructured data problem

Consistency is key because in modern environments, where data is increasingly distributed, a fragmented or piecemeal approach can quickly break down. This is where good governance delivers value because it helps define ownership and, crucially, accountability for data assets.

When governance is neglected, data can very easily become orphaned, with no clear owner or defined purpose, making it much harder to manage effectively. By contrast, consistent management policies ensure that datasets are handled in line with both operational requirements and regulatory obligations. To be effective, these policies must be applied uniformly across all environments and supported by ongoing auditing and monitoring, which confirm that data remains aligned with them.

Without decisive action, many businesses will not just add to organizational complexity and costs but also miss out on very important opportunities that good data management can deliver. Get this right, however, and data dependency can become a truly transformative influence on business success, instead of an administrative headache whose value is rarely fully realized.

Steve Leeper

Steve Leeper is the VP of Product Marketing at Datadobi. He oversees the market development of the company and manages the Presales Sales Engineers team globally. A 30-year veteran of IT, Steve has held a variety of technical and sales roles at Andersen Consulting, Sun Microsystems, and EMC.
