Case Study
Niche Solutioning: Rapid streaming data analytics on Redshift for video game developer

Overview

An American video game developer based in Los Angeles, California, faced streaming data challenges with its multiplayer online battle arena game. This hugely popular game serves millions of players every month. The existing data ecosystem could not keep up with the game’s growing data storage and analytics demands. The client approached Agilisium, drawn by its expertise in the data and analytics space and its niche solutioning capabilities in AI and ML, to eliminate data processing bottlenecks and achieve faster streaming data analytics.

The Challenge

The existing data was stored in Vertica (an SQL database) and analyzed for insights via Databricks. However, this system was not actively maintained, which translated into data loss and multi-day data delays. These problems made it far more complex for the analytics team to assess patch outcomes. The ability to maintain, monitor, or derive the quality of player knowledge was also impacted.

Our Solution

  • Gaming session data was pushed to Kinesis Data Streams through 7 streams via Kinesis Agent. The data was converted into JSON and loaded into S3 for immediate raw data availability.
  • The S3 gaming session data was branched out for batch and real-time analytics.
  • Batch: S3 gaming session data was cataloged using Glue crawlers and made queryable through Athena. Using Lambda, this data was transformed and loaded into Redshift for downstream BI reports.
  • Real-time: S3 gaming session data was processed by Kinesis Analytics in real time and pushed downstream via Kinesis Firehose into Redshift. This processing was repeated at 5-minute intervals for near real-time analytics.
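
As a rough illustration of the flow above, the following Python sketch converts a raw session log line into a JSON record and counts events per 5-minute window. The pipe-delimited input format and the field names (`ts`, `player_id`, `match_id`, `event`) are assumptions made for this example, not the client's actual schema. In the solution described, Kinesis Agent and Kinesis Analytics perform this work as managed services.

```python
import json
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 300  # the 5-minute interval mentioned in the case study

# Hypothetical field order of a raw, pipe-delimited session log line.
FIELDS = ("ts", "player_id", "match_id", "event")


def to_json_record(line: str) -> str:
    """Convert one raw log line into a JSON string."""
    values = line.strip().split("|")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    record["ts"] = int(record["ts"])  # assumed to be epoch seconds
    return json.dumps(record)


def window_start(epoch_seconds: int) -> str:
    """Return the ISO-8601 start of the 5-minute tumbling window for a timestamp."""
    floored = epoch_seconds - (epoch_seconds % WINDOW_SECONDS)
    return datetime.fromtimestamp(floored, tz=timezone.utc).isoformat()


def events_per_window(json_records):
    """Count events per (window start, event type) pair."""
    counts = Counter()
    for raw in json_records:
        record = json.loads(raw)
        counts[(window_start(record["ts"]), record["event"])] += 1
    return counts
```

Calling `events_per_window` on a list of strings produced by `to_json_record` yields one counter entry per window and event type, which is the kind of aggregate a near real-time report could read from Redshift.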

To fulfill SocialHi5’s need for a client self-service portal that was also easy to maintain, Agilisium’s 5-member expert team built a custom web application with a heavy focus on visualizing campaign outcomes. In parallel, the team developed a DevOps process to maintain, scale, and operate this portal.

Web Application Architecture


A variety of AWS services and some open source technologies were used to build and run the web application. The web layer used a PHP framework, included a login and authentication system, and used Amazon QuickSight to render its outcome dashboards.

The app layer was built in Python, and the backend services ran as Docker containers on Elastic Container Service (ECS), with Auto Scaling and an Application Load Balancer (ALB) to ensure high availability of the portal. The database used RDS MySQL and ran in a private subnet.

DevOps Process

As mentioned earlier, SocialHi5 required a solution that was easy to maintain, scale, and operate. To that end, Agilisium’s DevOps engineers developed a two-part DevOps process focusing on:

  • CI/CD for web application development
  • Infrastructure provisioning for maintenance

Continuous Integration/Continuous Deployment (CI/CD) Process

All application (web and app tier) maintenance was orchestrated via AWS CodePipeline. AWS CodeCommit, CodeBuild, and CodeDeploy were invoked to automate the enhancement and maintenance of the self-service portal.
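
To make the Source, Build, and Deploy chain concrete, here is a minimal Python sketch that assembles a pipeline definition in the shape AWS CodePipeline expects. It is an illustration under stated assumptions: the function name, its parameters, and the artifact names are hypothetical, and the case study does not describe the actual stage layout. The sketch only builds the dictionary and makes no AWS calls.

```python
def build_pipeline_definition(name, repo, branch, build_project,
                              deploy_app, deploy_group, role_arn, artifact_bucket):
    """Return a CodePipeline definition with Source, Build, and Deploy stages.

    All argument values are supplied by the caller; nothing here is taken
    from the case study.
    """

    def action(action_name, category, provider, configuration, inputs=(), outputs=()):
        # One action per stage keeps the sketch small.
        return {
            "name": action_name,
            "actionTypeId": {
                "category": category,
                "owner": "AWS",
                "provider": provider,
                "version": "1",
            },
            "configuration": configuration,
            "inputArtifacts": [{"name": n} for n in inputs],
            "outputArtifacts": [{"name": n} for n in outputs],
        }

    return {
        "name": name,
        "roleArn": role_arn,
        "artifactStore": {"type": "S3", "location": artifact_bucket},
        "stages": [
            {"name": "Source", "actions": [action(
                "Source", "Source", "CodeCommit",
                {"RepositoryName": repo, "BranchName": branch},
                outputs=["SourceOutput"])]},
            {"name": "Build", "actions": [action(
                "Build", "Build", "CodeBuild",
                {"ProjectName": build_project},
                inputs=["SourceOutput"], outputs=["BuildOutput"])]},
            {"name": "Deploy", "actions": [action(
                "Deploy", "Deploy", "CodeDeploy",
                {"ApplicationName": deploy_app, "DeploymentGroupName": deploy_group},
                inputs=["BuildOutput"])]},
        ],
    }
```

A dictionary like this could be passed as the `pipeline` argument of boto3's `create_pipeline` call, or expressed equivalently in a CloudFormation template.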

CI/CD Process Flow: Web Tier


Infrastructure provisioning

All infrastructure was hosted on an exclusive SocialHi5 Virtual Private Cloud (VPC), to add an extra layer of confidentiality. AWS CloudFormation templates were used to spin up and maintain a host of AWS services utilized for the self-service portal.
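
For readers unfamiliar with CloudFormation, the Python sketch below generates a very small template containing a VPC and one private subnet of the kind the database tier could sit in. The resource names (`PortalVpc`, `PrivateDbSubnet`) and CIDR ranges are assumptions chosen for illustration; the actual templates used for the portal are not described in this case study.

```python
import json


def build_network_template(vpc_cidr="10.0.0.0/16", private_cidr="10.0.1.0/24"):
    """Return a minimal CloudFormation template as a Python dict.

    The CIDR defaults and logical resource names are illustrative only.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative VPC with one private subnet",
        "Resources": {
            "PortalVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": vpc_cidr,
                    "EnableDnsSupport": True,
                    "EnableDnsHostnames": True,
                },
            },
            "PrivateDbSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "PortalVpc"},  # attach the subnet to the VPC above
                    "CidrBlock": private_cidr,
                    "MapPublicIpOnLaunch": False,  # no public IPs in this subnet
                },
            },
        },
        "Outputs": {"VpcId": {"Value": {"Ref": "PortalVpc"}}},
    }


def render_template(template):
    """Serialize the template to JSON text that CloudFormation can read."""
    return json.dumps(template, indent=2)
```

Keeping the template in code or version control means the same network can be recreated or updated through a stack update instead of manual console changes.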

Serverless web application hosting: EC2, ECS, RDS, S3, SSM, VPC, NAT Gateway, ALB with an Auto Scaling group, Lambda, Certificate Manager, and Route 53 were some of the services used to get the portal live.

Security: AWS WAF (Web Application Firewall) was used with cross-site scripting, geo match, and SQL injection rules to protect against common cyber threats, in conjunction with Amazon Inspector.
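
The three rule types named above can be sketched as follows, using the rule structure of the AWS WAFv2 API. This is an assumption-laden illustration: the case study does not say whether WAF Classic or WAFv2 was used, which request fields were inspected, whether the geo match rule blocked or allowed traffic, or which countries it covered. The rule names and the choice to inspect the query string are hypothetical.

```python
def _injection_match():
    """Fresh match settings used by the SQL injection and XSS statements."""
    return {
        "FieldToMatch": {"QueryString": {}},  # assumed inspection target
        "TextTransformations": [{"Priority": 0, "Type": "URL_DECODE"}],
    }


def build_waf_rules(blocked_countries):
    """Return WAFv2-style rule dicts for SQL injection, XSS, and geo match.

    `blocked_countries` is a list of two-letter country codes supplied by
    the caller; blocking (not allowing) is an assumption of this sketch.
    """
    statements = [
        ("BlockSqlInjection", {"SqliMatchStatement": _injection_match()}),
        ("BlockCrossSiteScripting", {"XssMatchStatement": _injection_match()}),
        ("BlockGeoMatch", {"GeoMatchStatement": {"CountryCodes": list(blocked_countries)}}),
    ]
    return [
        {
            "Name": name,
            "Priority": priority,  # lower numbers are evaluated first
            "Statement": statement,
            "Action": {"Block": {}},
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": name,
            },
        }
        for priority, (name, statement) in enumerate(statements)
    ]
```

The returned list has the shape of the `Rules` property of a web ACL, so it could be supplied when creating the web ACL that sits in front of the load balancer.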

Monitoring and logging: CloudWatch, OpsWorks, Config, and Inspector were also used to cover configuration management, logging, and monitoring of the application and infrastructure.


Results and Benefits

  • Near real-time availability of raw data for the purpose-built Spark recommendation engine.
  • Seamless replication of data produced in multiple regions to a central region.