Five Phases of Big Data Projects


First, identify the data and brainstorm a use case. Then make sure everything’s in place to make it work.

Enterprises today have a mass of data to analyze—whether from their own database systems, machines equipped with sensors, real-time business transactions, or ecommerce. Often they will embark on a “Big Data” effort in hopes of achieving a business objective—whether more sales, a better customer experience, reduced fraud, optimized production, or predictive maintenance.

“There is increasing pressure on analytics teams to reduce time to insight and answer questions faster,” according to a recent TDWI paper on “Best Practices for a Successful Big Data Journey.” “Yet big data has introduced many new forms of complexity.”

These forms of complexity include:

  • Varying forms of data, large volumes, and complex blending
  • Analytic complexity, including graph and path analysis and finding hidden patterns
  • Data governance

As enterprises embark on a complex big data analysis effort, they typically go through five phases, according to TDWI:

Phase 1: Ad-Hoc Exploration

In this phase, organizations experiment and learn about their big data needs. The scope of such an initiative is often small and unbudgeted, and the team is often simply trying to understand what data can be analyzed and who can analyze it. Typical problems encountered during this phase include missing or ill-prepared data and a reliance on manual labor for data processing.
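As an illustration of the "missing or ill-prepared data" problem, an exploratory team might start by measuring how often each field is empty. The following is only a minimal sketch: the function name `profile_missing`, the field names, and the sample records are hypothetical and are not taken from the TDWI paper.

```python
from collections import Counter


def profile_missing(records, fields):
    """Return, for each field, the fraction of records where it is absent, None, or blank."""
    missing = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1
    total = len(records)
    return {field: (missing[field] / total if total else 0.0) for field in fields}


# Hypothetical sample records, for illustration only.
sample = [
    {"customer_id": "C1", "region": "EMEA", "last_order": "2017-01-04"},
    {"customer_id": "C2", "region": "", "last_order": None},
    {"customer_id": "C3", "region": "APAC"},
    {"customer_id": "C4", "region": "EMEA", "last_order": "2017-02-11"},
]

report = profile_missing(sample, ["customer_id", "region", "last_order"])
for field, share in report.items():
    print(f"{field}: {share:.0%} missing")
```

A report like this gives the team a first, rough view of which sources need preparation before any analysis is attempted.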

Phase 2: Opportunistic

During this phase, the big data initiative often focuses on one or two department-specific solutions, identified, for example, through a “use-case discovery” workshop. Methods for breaking down data silos become important, requiring cross-functional coordination among IT, analysts, and business departments (see “Building a Data-Driven Business: Why Integration Matters”). Another key effort during this phase is data discovery.

Phase 3: Repeatable

In this phase, companies create a replicable model for big data projects. The project starts to encompass multiple departments. For example, a business might want a 360-degree view of the customer, which might involve coordination between sales, IT, and logistics. During this phase, companies gain a better idea of where data comes from and how various analysts use it, which helps the team achieve repeatability at the data level.

Phase 4: Managed

With a big data program servicing several business units, measurement becomes vital. “Process, quality, and success should each be measured and tracked individually. ROI must also be closely considered and maximized,” TDWI advises. By this phase, a centralized Center of Excellence (CoE) has typically emerged.
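One way to picture tracking process, quality, success, and ROI as separate numbers is sketched below. The metric definitions (on-time pipeline runs for process, validated records for quality, estimated benefit for success) and all figures are assumptions chosen for illustration; TDWI does not prescribe these formulas. ROI is computed with the common definition (benefit − cost) / cost.

```python
from dataclasses import dataclass


@dataclass
class ProjectMetrics:
    """Hypothetical per-project scorecard; every field is an illustrative assumption."""
    name: str
    on_time_runs: int     # process: pipeline runs that finished on schedule
    total_runs: int
    valid_records: int    # quality: records that passed validation
    total_records: int
    benefit: float        # success: estimated business benefit, in currency units
    cost: float           # total project cost, in the same currency units

    def process_score(self) -> float:
        return self.on_time_runs / self.total_runs

    def quality_score(self) -> float:
        return self.valid_records / self.total_records

    def roi(self) -> float:
        # Common ROI definition: net benefit divided by cost.
        return (self.benefit - self.cost) / self.cost


# Made-up numbers for a single business unit.
churn = ProjectMetrics(
    name="churn-model",
    on_time_runs=45, total_runs=50,
    valid_records=9_500, total_records=10_000,
    benefit=150_000.0, cost=100_000.0,
)

print(f"{churn.name}: process={churn.process_score():.0%}, "
      f"quality={churn.quality_score():.0%}, ROI={churn.roi():.0%}")
```

Keeping the three scores separate, rather than folding them into one index, matches the advice to measure each one individually. The sketch assumes nonzero run counts, record counts, and cost.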

Phase 5: Optimized

Data should be flowing in at a regular rate and be processed consistently, with management and governance procedures in place. The business may uncover more use cases, and staff may have acquired specialized skills, including curation, stewardship, discovery and advanced analytics. The CoE, meanwhile, should be well organized.

Overall, a big data analytics project can be quite challenging. Teams have to prove that big data has value from the start, and then overcome common obstacles such as data silos, integration, finding the necessary staff and skills, and the organizational inertia and cross-communication hurdles that hamper almost any large enterprise project.

Given such challenges, it’s little surprise that some companies turn to big-data-as-a-service providers that help businesses build data platforms. Such data operations issues, and DataOps approaches to them, will be discussed during the Data Platforms 2017 conference.


Chris Raphael

About Chris Raphael

Chris Raphael covers fast data technologies and business use cases for real-time analytics. Follow him on Twitter at raphaelc44.
