In this article, we will look at the stages of the big data analytics lifecycle in detail.

The diagram below depicts the nine stages of the big data analytics lifecycle.

Stage 1: Business Case Evaluation

            In this stage, the business requirements are used to evaluate whether the business problem being addressed is really a big data problem.

            The business case should be well written and should state the justification, motivation and goals for carrying out the analysis.

Outcome:

  • A business case is created, assessed and approved
  • The business challenges that the analysis will tackle are identified
  • KPIs are identified to provide assessment criteria and guidance for evaluating the analytic results

Stage 2: Data Identification

            In this stage, the datasets required for the analysis, and their sources, are identified. The required datasets and sources can be internal or external.

            Internal datasets – The required data is compiled and verified from internal sources, such as data marts and operational systems.

            External datasets – The required data is compiled and verified from third-party sources, such as data markets and publicly available datasets.

Outcome:

  • The datasets required for analysis, and their sources, are identified

Stage 3: Data Acquisition and Filtering

            In this stage, data is gathered from all the data sources that were identified during the previous stage. The acquired data is then subjected to:

  • Filtering and removal of corrupt data
  • Removal of data that is unusable for the analysis

Data needs to be persisted once it has been generated.

Outcome:

  • Filtered datasets with the noise removed
  • Data is persisted
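
The filtering and persistence steps of this stage can be sketched in a few lines of Python. This is only an illustration under assumptions: the JSON Lines input, the required fields user_id and amount, and the output file name are hypothetical and would differ in a real pipeline.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical fields that a record needs in order to be usable for the analysis.
REQUIRED_FIELDS = ("user_id", "amount")


def filter_records(raw_lines):
    """Split raw lines into usable records and rejected lines."""
    usable, rejected = [], []
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)  # corrupt: cannot be parsed at all
            continue
        if not isinstance(record, dict) or any(f not in record for f in REQUIRED_FIELDS):
            rejected.append(line)  # parseable, but unusable for this analysis
            continue
        usable.append(record)
    return usable, rejected


def persist(records, path):
    """Write records as JSON Lines so that later stages read from storage."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


raw = [
    '{"user_id": 1, "amount": 25.0}',
    '{"user_id": 2, "amou',            # truncated, so it counts as corrupt
    '{"comment": "no ids here"}',      # valid JSON without the required fields
    '{"user_id": 3, "amount": 12.5}',
]
usable, rejected = filter_records(raw)
out_path = Path(tempfile.gettempdir()) / "acquired_filtered.jsonl"
persist(usable, out_path)
print(len(usable), "usable,", len(rejected), "rejected")
```

Keeping the rejected lines instead of silently dropping them makes it possible to review later whether the filter was too strict.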

Stage 4: Data Extraction

            In this stage, data is extracted and transformed into a format that the underlying big data solution can use for the data analysis.

Outcome:

  • Transformation of data for the purpose of data analysis
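
As a rough illustration of extraction, the sketch below pulls typed fields out of a raw web server log line. The log layout, the regular expression and the field names are assumptions made for this example, not something prescribed by the lifecycle.

```python
import re

# Assumed layout: a common-log-style line. Real sources will differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def extract(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["size"] = 0 if row["size"] == "-" else int(row["size"])
    return row


line = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
row = extract(line)
print(row)
```

The result is a flat record with numeric status and size fields, which is easier for downstream tools to load than the raw text.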

Stage 5: Data Validation & Cleansing

            In this stage, the data is validated to detect invalid or missing values, and cleansing is carried out. Invalid data can skew and falsify analysis results.

Outcome:

  • A validated and cleansed dataset
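
To make the effect of invalid data concrete, here is a minimal sketch of validation and cleansing. The validation rules (a required user_id and an age between 0 and 120) and the sample records are hypothetical.

```python
from statistics import mean


def validate(record):
    """Return a list of problems found in one record (empty means valid)."""
    problems = []
    if record.get("user_id") is None:
        problems.append("missing user_id")
    age = record.get("age")
    if age is None:
        problems.append("missing age")
    elif not 0 <= age <= 120:
        problems.append("age out of range")
    return problems


def cleanse(records):
    """Separate valid records from rejected ones, keeping the reasons."""
    clean, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected


records = [
    {"user_id": 1, "age": 34},
    {"user_id": 2, "age": 29},
    {"user_id": 3, "age": 999},   # invalid value
    {"user_id": 4, "age": None},  # missing value
]
clean, rejected = cleanse(records)

# Compare the average age with and without the invalid record.
raw_mean = mean(r["age"] for r in records if r["age"] is not None)
clean_mean = mean(r["age"] for r in clean)
print(raw_mean, clean_mean)
```

With this sample data, the single invalid age pulls the average from 31.5 up to 354, which shows how one bad value can falsify a result.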

Stage 6: Data Aggregation & Representation

In this stage, multiple datasets are integrated to arrive at a unified view. Data from the different sources needs to be reconciled.

Future data analysis requirements should also be considered at this stage to foster data reusability.

Outcome:

  • Data is aggregated into a unified view, with future data analysis requirements taken into account
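
One way to picture the unified view is a join of two sources that name the same key differently. The sketch below assumes a hypothetical CRM dataset keyed by customer_id and a hypothetical order dataset keyed by cust; the reconciliation rule is simply to match on that key and report whatever cannot be matched.

```python
crm_customers = [
    {"customer_id": 1, "name": "Asha", "city": "Pune"},
    {"customer_id": 2, "name": "Ben", "city": "Leeds"},
]
web_orders = [
    {"cust": 1, "total": 40.0},
    {"cust": 1, "total": 15.5},
    {"cust": 3, "total": 9.0},  # no matching CRM record
]


def unified_view(customers, orders):
    """Aggregate orders per customer and return (view, unmatched orders)."""
    view = {
        c["customer_id"]: {**c, "order_count": 0, "order_total": 0.0}
        for c in customers
    }
    unmatched = []
    for order in orders:
        row = view.get(order["cust"])
        if row is None:
            unmatched.append(order)  # needs reconciliation with the source
            continue
        row["order_count"] += 1
        row["order_total"] += order["total"]
    return list(view.values()), unmatched


view, unmatched = unified_view(crm_customers, web_orders)
for row in view:
    print(row)
print("unmatched:", unmatched)
```

Returning the unmatched orders separately keeps the reconciliation problem visible instead of hiding it inside the aggregate.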

Stage 7: Data Analysis

In this stage, the actual analysis task is carried out. This can be an iterative process that is repeated until an appropriate pattern or correlation is uncovered.

Outcome:

  • Patterns or correlations are identified
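
The iterative nature of this stage can be sketched as a loop that tries candidate variables until one shows a strong correlation with the target. Pearson correlation, the 0.9 threshold and the sample figures are assumptions chosen for the example; a real analysis may use entirely different techniques.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


sales = [10, 12, 15, 19, 24]
candidates = {  # hypothetical explanatory variables
    "day_of_month": [3, 17, 9, 28, 11],
    "ad_spend": [1.0, 1.5, 2.0, 3.0, 4.0],
}
THRESHOLD = 0.9  # assumed cut-off for a "strong" correlation

found = None
for name, values in candidates.items():
    r = pearson(values, sales)
    print(f"{name}: r = {r:.3f}")
    if abs(r) >= THRESHOLD:
        found = (name, r)
        break  # stop iterating once a strong correlation is uncovered
```

With this sample data, day_of_month falls well below the threshold and the loop moves on to ad_spend, which correlates strongly with sales. A correlation found this way still needs domain review before it is treated as meaningful.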

Stage 8: Data Visualization

In this stage, the analysed data is represented using graphical tools so that business users can interpret it. It's important to use the most suitable visualisation technique, keeping the business domain in context.

Outcome:

  • Visual charts

Stage 9: Utilization of Analysis Results

In this stage, it is determined how and where the processed analysis data can be further leveraged. The analysis can provide new insights and patterns that can be used to improve business processes and application system logic.

The results can be used as input for enterprise systems, for alerts, and for the optimisation of business processes.

Outcome:

  • Further uses of the analysis results are identified and leveraged
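
As one possible example of feeding analysis results into alerts, the sketch below derives an alert threshold from historical values and applies it to new readings. The rule used here (mean plus three standard deviations) and the sample numbers are assumptions for illustration only.

```python
from statistics import mean, stdev


def learn_threshold(history, k=3.0):
    """Derive an alert threshold from past values (assumed rule: mean + k * stdev)."""
    return mean(history) + k * stdev(history)


def check(value, threshold):
    """Return an alert message if the value exceeds the threshold, else None."""
    if value > threshold:
        return f"ALERT: {value} exceeds {threshold:.1f}"
    return None


history = [100, 102, 98, 101, 99]  # hypothetical output of the analysis stage
threshold = learn_threshold(history)

for reading in (103, 110):
    print(reading, "->", check(reading, threshold))
```

The threshold comes from the analysis, while the check itself is cheap enough to run inside an operational system.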

I hope you find this article helpful.