The client is a leader in online marketing and online consumer acquisition platform. The high costs of their existing data warehouse on Teradata prompted the client to search for cost-effective alternates. Due to Agilisium’s expertise in modernizing enterprise data warehouse (EDW), we were tasked with performing a migration fit assessment from Teradata to AWS Redshift.
The existing DW was platformed on Teradata. The storage cost of exploratory data was high, and system performance suffered due to long-running queries. Besides, dependency on traditional DW (Informatica / Custom SQL) platforms leads to an inability to handle Big Data workloads. Given the varied yet in-depth need for data analysis in an online marketing company, the client also wished for enhanced real-time data ingestion & analytics capabilities.
Challenged with a sweeping assessment on 30+ TB of data on a legacy DW, the four-member expert team from Agilisium set the following goals for the assessment
- Assess Redshift and related AWS services (S3 Data lake, Athena, Spectrum, Glue Catalog, EMR, Sagemaker) as a platform that supports innovation
- Assess potential performance uplift, cost-effectiveness, and ease of maintenance through benchmarking exercise
- Upskill client IT team, on Redshift best practices and DB maintenance activities through workshops, as required
Based on the above goals, the team delivered the following assessment via a benchmarking POC
- Conversion of Teradata schema to optimal Redshift equivalent using AWS SCT
- Migrate and validate data from 800+ tables into Redshift using custom utilities and optimize Redshift
- Compress and stage data in Glue Catalog & S3 Data Lake Fast facts
- Run benchmark tests (baseline, concurrency) using candidate ETL, BI, and Data science queries for different server instances and WLM settings
- Executive presentation on cost projections and potential performance uplift
To fulfill SocialHi’5 need for a client self-service portal that was also easy to maintain, Agilisium’s 5-member expert team built a custom web application with a heavy focus on the visualization of campaign outcomes. They also developed in parallel a DevOps process to maintain, scale and operate this portal.
Web Application Architecture
A variety of AWS services and some open source technologies were used to build and run the web application. The web layer used the PHP framework, included a login and authentication system, and used AWS QuickSight to render its outcome dashboards.
The app layer was built on Python, and the backend services were run on Elastic Container Service (ECS) dockers with Auto Scaling and Auto Load Balancing (ALB) to ensure high availability of the portal. The database was run in a private subnet and used RDS MySQL as the database service.
As mentioned earlier, SocialHi5 necessitated that the solution offered was easy to maintain, scale, and operate. To that end, Agilisium’s DevOps engineers developed a 2-part DevOps process focusing on
- CI/CD for web application development
- Infrastructure Provisioning for maintenance.
Continuous Integration/Continuous Deployment (CI/CD Process)
All application (Web & App Tier) maintenance was articulated via AWS’s Code Pipeline. AWS’s Code Commit, Code Deploy, and Code Build services were invoked to automate the enhancement and maintenance of the self-service portal.
CI/CD Process Flow: Web Tier
CI/CD Process Flow: Web Tier
All infrastructure was hosted on an exclusive SocialHi5 Virtual Private Cloud (VPC), to add an extra layer of confidentiality. AWS CloudFormation templates were used to spin up and maintain a host of AWS services utilized for the self-service portal.
Serverless Web application hosting: EC2, ECS, RDS, S3, SSM, VPC, NAT Gateway, ALB with Autoscaling Group, LAMBDA, Certificate Manager, Route53 were some of the services used to get the portal live.
Security: Web Application Firewall (WAF) was used with Cross-site scripting, Geo match, and SQL injection rules to protect from common cyber threats in conjunction with the AWS inspector service.
Monitoring and Logging: CloudWatch, OpsWorks, Config & Inspector services were also invoked to cover configuration management, logging, and monitoring of the application and infrastructure.
- 2.5x performance uplift achieved via minimal code change
- One assessment to check fitment of AWS (Redshift and related AWS services) as a platform that supports innovation