Case Study
Enabling Targeted Marketing using a Self-service Analytical Platform built on AWS, Talend & Domo for NBC Universal
Overview
U.S. E&M revenue is expected to reach $759 billion by 2021, up from $635 billion in 2016, increasing at a CAGR of 3.6 percent and holding steady at the same CAGR as last year. - PwC E&M Outlook forecast, U.S. industry
The M&E value chain has changed. Top-line growth from mature segments is declining, and digital M&E content is growing, albeit at a lower rate. Today's fragmented audiences have a myriad of options and gravitate towards skinnier bundles, which has led to fiercer competition among M&E players for digital wallet share.

Understandably, the client, a U.S. multimedia conglomerate, was looking for ways to monitor how its digital content is consumed, in order to decide where to invest ad dollars and to increase viewership and top-line revenue.

The Challenge

With the client's rapidly growing global network and the slew of digital platforms it leveraged, there was an explosion in data volume and variety.

The existing reporting process was ad hoc, manual, time-consuming, and error-prone, which translated into delayed and incomplete insights. This paved the way for instinct-driven decisions and undermined efforts to increase viewership and stay competitive.

The lack of integrated insights limited key business leaders’ ability to make truly data-driven strategic decisions. A new, scalable analytics platform with a robust automated data integration framework, data governance, and audit processes was envisaged to achieve the following:

  • Enable internal teams (Marketing, Product, Production, Social Media) to shift from instinct-driven to insight-driven decision making, by providing anytime access to all data, processed automatically.
  • Automate data processing and loading at the lowest grain, in order to:
    • Help answer questions such as how many viewers transitioned among digital platforms, and how the number of full episode viewers (FEP) changed.
    • Enable key business decision makers to gain a holistic view with minimal dependence on IT.
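
To make the first question concrete, here is a minimal sketch of how viewer transitions between platforms could be counted from lowest-grain viewing events. The event layout (viewer_id, platform, timestamp) and the function name are assumptions made for illustration; they are not taken from the client's actual data model.

```python
from collections import defaultdict


def platform_transitions(events):
    """Count distinct viewers who moved from one platform to another.

    `events` is an iterable of (viewer_id, platform, timestamp) tuples.
    This schema is hypothetical and only serves the illustration.
    """
    # Group each viewer's events so they can be ordered in time.
    by_viewer = defaultdict(list)
    for viewer_id, platform, timestamp in events:
        by_viewer[viewer_id].append((timestamp, platform))

    # For every consecutive pair of events with different platforms,
    # record the viewer once per (from_platform, to_platform) pair.
    viewers_per_transition = defaultdict(set)
    for viewer_id, visits in by_viewer.items():
        visits.sort()  # chronological order per viewer
        for (_, previous), (_, current) in zip(visits, visits[1:]):
            if previous != current:
                viewers_per_transition[(previous, current)].add(viewer_id)

    return {pair: len(viewers) for pair, viewers in viewers_per_transition.items()}


# Made-up sample events, purely for demonstration.
events = [
    ("v1", "web", 1), ("v1", "mobile_app", 2),
    ("v2", "web", 1), ("v2", "web", 3), ("v2", "ott", 5),
    ("v3", "mobile_app", 2),
]
transitions = platform_transitions(events)
```

Because the count is taken over individual events rather than pre-aggregated totals, the same data can answer follow-up questions (for example, transitions per day) without reprocessing the sources.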

Our Solution

To address these challenges, an elastically scalable analytics platform was built on the AWS Cloud. The key features of the platform are given below:

  • A data lake was designed in AWS S3 and served a dual purpose: a) as a single source of truth, storing data from heterogeneous sources in their native form; b) as a platform to unearth insights from raw semi-structured and unstructured data through Redshift Spectrum.
  • A massively parallel processing (MPP) data warehouse solution using AWS Redshift was designed for expedited insights from structured data. The data was pre-processed and loaded into Redshift to reduce IT dependency for analytics reports.
  • A purpose-built data validation, cleansing, and data integration framework was designed. The Talend Integration tool and custom scripts were used to fetch data from discrete data sources and load OTT service, social impression, and user behaviour data into the data lake and analytics platform for reporting and data science activities.
  • Data lineage documents were prepared in Confluence so that data elements could be mapped easily to their source system(s).
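
The validation and cleansing step described above can be sketched roughly as follows. This is an illustrative outline only, not the framework that was delivered: the required fields, the UTC assumption for timestamps without a zone, the reject counter used as a simple audit figure, and the date-partitioned key layout are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical minimum schema for an incoming record.
REQUIRED_FIELDS = ("viewer_id", "platform", "event_time")


def validate_and_clean(record):
    """Return a cleaned copy of `record`, or None if it fails validation."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            return None  # missing or empty required field
    try:
        event_time = datetime.fromisoformat(record["event_time"].strip())
    except ValueError:
        return None  # timestamp is not ISO 8601
    if event_time.tzinfo is None:
        # Assumption: timestamps without a zone are already in UTC.
        event_time = event_time.replace(tzinfo=timezone.utc)
    return {
        "viewer_id": record["viewer_id"].strip(),
        "platform": record["platform"].strip().lower(),
        "event_time": event_time.astimezone(timezone.utc),
    }


def process_batch(records):
    """Clean a batch and return (cleaned_records, rejected_count) for auditing."""
    cleaned, rejected = [], 0
    for record in records:
        result = validate_and_clean(record)
        if result is None:
            rejected += 1
        else:
            cleaned.append(result)
    return cleaned, rejected


def lake_key(source, cleaned_record, filename):
    """Build a date-partitioned object key (hypothetical layout) for the data lake."""
    day = cleaned_record["event_time"].strftime("%Y/%m/%d")
    return f"raw/{source}/{day}/{filename}"


# Made-up sample batch: one valid record and two that should be rejected.
batch = [
    {"viewer_id": " v1 ", "platform": "Web", "event_time": "2018-03-01T12:30:00"},
    {"viewer_id": "", "platform": "ott", "event_time": "2018-03-01T12:31:00"},
    {"viewer_id": "v3", "platform": "ott", "event_time": "not-a-date"},
]
cleaned, rejected = process_batch(batch)
key = lake_key("social", cleaned[0], "part-0001.json")
```

Returning the rejected count alongside the cleaned records gives an audit process something to reconcile against source row counts, and a date-partitioned key lets query engines that read directly from object storage skip partitions outside the requested date range.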

Key Highlights

Technologies Used: Talend Integration Cloud (with Big Data), custom scripts in Java, Python and Linux Shell, S3 Data Lake, Redshift, Domo, Databricks, Redshift Spectrum, Apache Spark.

Team size – 2 (Onsite), 1 (Offshore)

Cluster details – 8xlarge, M3.xlarge. Current data volume is 5+ TB, with an inflow of approx. 65 GB/day

Project Duration – 6 months (ongoing)

Delivery model – Hybrid

How we worked

The project scope was decided by the client, and the solution was jointly delivered using Agile methodology.

Agilisium worked closely with the client’s team to design and deliver the solution. Daily scrum calls ensured that key stakeholders were apprised of the progress made at every stage. Weekly status reports and project tracking tools were used to provide enhanced visibility.

Results and Benefits
  • 4x faster data integration: Custom Linux shell and Python scripts increased data integration speed by 200%, which translated into $50,000/year in cost savings.
  • 360-degree view is now a few clicks away: With auto-processed data available at the lowest grain from all data sources, slicing and dicing it to unearth insights is just a few clicks away.
  • No more guesswork: With anytime availability of all auto-processed data, internal teams are now able to cut the guesswork out of decision making.
  • Advanced Analytics ready: Data in both S3 and Redshift can be leveraged for downstream scalable predictive analytics, at the speed of thought.
  • Agile Data platform: The Data Lake platform, with its high-performance Parquet file format, provides the flexibility and portability to move from one visualization and analytics platform to another with minimal changes.