Case Study: Enterprise data lake on cloud

1CloudHub helped India’s leading television entertainment network bring its scattered big data into a single source of truth, to make advanced analytics affordable.

Industry: Media
Offering: Cloud Advisory Services
Workload: DC, Servers & Data
Cloud: AWS

Project scope

— Data lake architecture design
— Data transformation and storage in data lake
— Customized reports in Power BI

About the client

The client is a leading media production and broadcasting company and a subsidiary of a global media conglomerate. They have over 30 television channels, a digital business and a movie production business, reaching over 700 million viewers in India.

Business challenge

As part of their digital strategy, our client wanted to optimise user experience across channels — iOS and Android apps, Fire TV, web, and so on — based on user behaviour and preferences. This required a deeper understanding of customer behavioural patterns across platforms.

At the time, they were using Segment to collect around 6.5 billion records (20 TB of raw data) of behavioural data every month from their 30 million online viewers across these channels.
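To put these monthly figures in perspective, the short calculation below derives a few averages from them. It is only a back-of-the-envelope sketch: it assumes decimal terabytes and a 30-day month, neither of which is stated in the case study, and the function name is purely illustrative.

```python
# Figures quoted in the case study
RECORDS_PER_MONTH = 6_500_000_000
VIEWERS_PER_MONTH = 30_000_000

# Assumptions (not stated in the case study)
RAW_BYTES_PER_MONTH = 20 * 10**12       # 20 TB, taken as decimal terabytes
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # a 30-day month


def monthly_profile():
    """Derive rough averages from the quoted monthly totals."""
    return {
        "avg_bytes_per_record": RAW_BYTES_PER_MONTH / RECORDS_PER_MONTH,
        "records_per_viewer": RECORDS_PER_MONTH / VIEWERS_PER_MONTH,
        "records_per_second": RECORDS_PER_MONTH / SECONDS_PER_MONTH,
    }


if __name__ == "__main__":
    for name, value in monthly_profile().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions, each record averages roughly 3 KB, each viewer generates a little over 200 records a month, and the collection pipeline has to absorb about 2,500 records per second on average.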

In order to deliver a user-focussed digital viewing experience, the client needed:

  • Reliable storage, with protection against data corruption and other types of data loss
  • Security against unauthorised data access
  • Ease of finding a single record among billions (through efficient indexing of data)
  • An advanced analytics engine to help them derive and visualise meaningful insights from their high volume and variety of data
  • All of this forming their single source of truth
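One common way to make a single record easy to find among billions is to file records under partitioned storage prefixes, so that a lookup only touches a small slice of the data. The case study does not say how the data was indexed, so the sketch below is an assumption for illustration only: the prefix layout, the `partition_prefix` function and the `timestamp` and `platform` field names are all hypothetical.

```python
from datetime import datetime, timezone


def partition_prefix(event):
    """Return the storage prefix under which an event would be filed.

    Hypothetical layout: one prefix per UTC day and per platform, so a
    query for one day on one platform never scans the other prefixes.
    """
    ts = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
    return f"events/dt={ts:%Y-%m-%d}/platform={event['platform']}/"


if __name__ == "__main__":
    # 1_600_000_000 seconds after the Unix epoch falls on 13 September 2020 (UTC)
    sample = {"timestamp": 1_600_000_000, "platform": "ios"}
    print(partition_prefix(sample))  # events/dt=2020-09-13/platform=ios/
```

The design choice here is to partition on fields that most queries filter by. Date and platform are a guess at what that would be for viewing behaviour data.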

Solution

At 1CloudHub, we set up an enterprise data lake so that all of the client’s data resides in one place, preserving the accuracy and timeliness of the data.

Leveraging our client’s existing mechanism to collect and feed data into the data lake, we created a pipeline with Amazon EMR (Elastic MapReduce) for ETL (Extract, Transform, Load) data crunching and Power BI for self-service visualisation.

Our approach

01. Understand

  • In collaboration with the client’s development team, we outlined the volume, velocity, veracity and variety of data.

02. Define

  • We worked with the client’s business teams and domain experts to define reports in Power BI for the 18 use cases the client had identified.

03. Design

  • We mapped data to corresponding reports and planned data transformation.
  • Based on these, we designed and architected the data lake and pipeline necessary for Power BI.
  • With the client’s sign-off, we deployed the solution on AWS cloud.

04. Transform

  • Once the infrastructure was in place, our data engineering team performed the necessary ETL steps such as cleaning and consolidation to derive value from the raw data.
  • We stored the transformed data in an S3 bucket as Parquet-formatted files.
  • We imported the transformed data as data marts into Amazon Redshift, to be used for Power BI reports.
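The transform steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline that was built: the real work ran on EMR at far larger scale, and the field names (`messageId`, `userId`, `event`, `platform`), the table name, the bucket and the IAM role below are all placeholders assumed for the example. The last function only builds the text of a Redshift `COPY ... FORMAT AS PARQUET` statement and does not connect to anything.

```python
from collections import Counter


def clean(records):
    """Cleaning: drop malformed records and duplicates (by messageId)."""
    seen = set()
    for rec in records:
        # Treat a record as malformed if any required field is missing or empty
        if not (rec.get("messageId") and rec.get("userId") and rec.get("event")):
            continue
        if rec["messageId"] in seen:
            continue
        seen.add(rec["messageId"])
        yield rec


def consolidate(records):
    """Consolidation: count events per (userId, platform)."""
    counts = Counter()
    for rec in records:
        counts[(rec["userId"], rec.get("platform", "unknown"))] += 1
    return counts


def redshift_copy_sql(table, s3_prefix, iam_role_arn):
    """Build a COPY statement that loads Parquet files from S3 into a table."""
    return (
        f"COPY {table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS PARQUET;"
    )


if __name__ == "__main__":
    raw = [
        {"messageId": "a", "userId": "u1", "event": "play", "platform": "ios"},
        {"messageId": "a", "userId": "u1", "event": "play", "platform": "ios"},
        {"messageId": "b", "userId": None, "event": "play", "platform": "web"},
        {"messageId": "c", "userId": "u1", "event": "pause", "platform": "ios"},
    ]
    print(consolidate(clean(raw)))
    print(redshift_copy_sql(
        "mart.viewer_events",                      # hypothetical table
        "s3://example-bucket/transformed/",        # hypothetical bucket
        "arn:aws:iam::123456789012:role/example",  # hypothetical role
    ))
```

In this sample, the duplicate and the record without a user are dropped, leaving two events for user `u1` on iOS.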

05. Completion and reporting

  • We delivered a summary of findings and recommendations for production deployment to bring the proof of concept (PoC) to a meaningful close.

Outcomes

Better

We enabled advanced analytics on up to a year of data, compared with the 3 months of data agreed in the scope, to deliver the meaningful insights the business teams sought.

Faster

We crunched over 12 million records in under an hour, running more than 100 VMs concurrently in a cluster.
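As a rough reading of this figure, the calculation below turns it into a minimum cluster-wide throughput. Because the run took "under an hour" and processed "over" 12 million records, the result is a lower bound rather than a measured rate. The function name is illustrative.

```python
RECORDS_PROCESSED = 12_000_000   # "over 12 million records"
MAX_DURATION_SECONDS = 60 * 60   # "in under an hour"


def min_cluster_throughput(records=RECORDS_PROCESSED,
                           max_seconds=MAX_DURATION_SECONDS):
    """Lower bound on records processed per second across the whole cluster."""
    return records / max_seconds


if __name__ == "__main__":
    print(f"at least {min_cluster_throughput():,.0f} records per second")
```

That works out to at least about 3,300 records per second across the cluster.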

Cheaper

We delivered each report at a cost of $70. At this cost, we delivered an excellent price-to-performance ratio, driven by our use of Spot Fleet instances and the on-demand, pay-as-you-use cloud model.

A similar setup on-premises in a data centre would have cost the client 12,000 times more.
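The arithmetic behind these figures can be sketched as follows. Two assumptions are made here that the case study does not confirm: that the 12,000 times multiplier applies to the $70 per-report cost, and that each of the 18 use cases corresponds to exactly one report.

```python
COST_PER_REPORT_USD = 70       # quoted in the case study
ON_PREM_MULTIPLIER = 12_000    # quoted in the case study
REPORT_COUNT = 18              # assumption: one report per identified use case


def on_prem_equivalent(cost=COST_PER_REPORT_USD, multiplier=ON_PREM_MULTIPLIER):
    """Per-report cost implied for a comparable on-premises setup."""
    return cost * multiplier


def total_cloud_cost(reports=REPORT_COUNT, cost=COST_PER_REPORT_USD):
    """Total cloud cost if every report costs the same."""
    return reports * cost


if __name__ == "__main__":
    print(f"on-premises equivalent per report: ${on_prem_equivalent():,}")
    print(f"cloud cost for {REPORT_COUNT} reports: ${total_cloud_cost():,}")
```

Read this way, the on-premises equivalent comes to $840,000 per report, against $1,260 in total for 18 cloud-delivered reports.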

Looking forward

We are delighted to have helped the client create a centralized, analytics-ready repository for their Big Data and look forward to helping them meet their strategic goals using our cloud capabilities.
