Case Study
Built an immutable data storage solution to simplify reporting and third-party integration for North America’s largest metals service center company

About the Client

Our client is North America’s largest metals service center company. It offers 100,000+ products and value-added processing and serves over 125,000 customers, primarily by providing metals processing, inventory management services, and quick delivery.

Challenges

  • The on-premises legacy data model continually modified and overwrote record data; as a result, the customer had no single source of truth
  • The customer required an immutable data store on which they could run ad hoc queries, reporting, and third-party integrations
  • Need for real-time transactions to data warehouses and low-latency data transfer to operational and analytics users with low production impact
  • Batch loads and periodic reloads with the latest data take time and often consume significant processing power on the source system

Solution Highlights

  • AWS Lambda handles and processes the core data flow: it captures data changes (CDC) in near real time, applies business rules, and stores the results in Aurora RDS (PostgreSQL) for further analysis by the data teams
  • AWS services such as AWS Lambda, AWS Secrets Manager, CloudWatch rules, Aurora RDS (PostgreSQL), and SNS are used to integrate the required technical components
  • Rollback procedures are set in place using AWS CloudFormation: in case of failure, a rollback is performed and the reason for the failure can be checked
  • AWS Lambda and SNS help detect and recover from failed executions: Lambda functions verify the incoming data records to detect failures
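The record verification step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the required field names and the `notify` callback are assumptions made for illustration. In a Lambda function, `notify` could wrap the boto3 SNS client's `publish(TopicArn=..., Message=...)` call.

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical required fields; the real validation rules are not given in the case study.
REQUIRED_FIELDS = ("id", "operation")


def find_problems(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems for one incoming record."""
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if record.get(name) is None]


def split_valid_and_failed(
    records: Iterable[Dict[str, Any]],
    notify: Callable[[str], None],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate valid records from failed ones and send one alert if any failed."""
    valid: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for record in records:
        problems = find_problems(record)
        if problems:
            failed.append({"record": record, "problems": problems})
        else:
            valid.append(record)
    if failed:
        # A single summary message per batch keeps alert volume low.
        notify(json.dumps({"failed_count": len(failed), "failures": failed}, default=str))
    return valid, failed
```

Passing the notifier in as a callable keeps the validation logic testable without AWS credentials. Failed records are returned rather than discarded so they can be retried on a later run.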
Data Processing & Migration
  • AWS Lambda handles and processes the core data flows
  • A CloudWatch rule triggers the Lambda process every 5 minutes
  • Lambda connects to the on-premises SQL Server and extracts CDC data
  • Lambda then applies business rules to the extracted CDC data and stores the processed records in Amazon Aurora tables
  • An EMR Spark-based framework was developed to run a distributed application environment
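To illustrate how extracted CDC rows can feed an immutable store, the sketch below turns each change into a new append-only record instead of overwriting an existing one. It relies on SQL Server's documented `__$operation` codes (1 = delete, 2 = insert, 3 = update before image, 4 = update after image). The `ChangeRecord` shape, the `key_column` parameter, and the whitespace-trimming rule are hypothetical stand-ins, since the case study does not describe the actual business rules or table layout.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional

# SQL Server CDC change types; before-images (3) are intentionally not mapped.
OPERATIONS = {1: "DELETE", 2: "INSERT", 4: "UPDATE"}


@dataclass(frozen=True)
class ChangeRecord:
    """One immutable row destined for an append-only target table (assumed shape)."""
    key: Any
    operation: str
    payload: Dict[str, Any]
    captured_at: datetime


def apply_business_rules(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical rule: trim surrounding whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in payload.items()}


def to_change_records(
    cdc_rows: Iterable[Dict[str, Any]],
    key_column: str,
    captured_at: Optional[datetime] = None,
) -> List[ChangeRecord]:
    """Convert raw CDC rows into append-only change records, preserving order."""
    captured_at = captured_at or datetime.now(timezone.utc)
    records: List[ChangeRecord] = []
    for row in cdc_rows:
        operation = OPERATIONS.get(row["__$operation"])
        if operation is None:
            continue  # skip update before-images and unknown codes
        # Drop CDC metadata columns such as __$operation and __$start_lsn.
        payload = {k: v for k, v in row.items() if not k.startswith("__$")}
        records.append(
            ChangeRecord(row[key_column], operation, apply_business_rules(payload), captured_at)
        )
    return records
```

Because every change becomes a new row, the full history of a record stays queryable, which is what allows the target to serve as a single source of truth for reporting.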
Deployment
  • An automated CI/CD pipeline was created
  • Agilisium follows AWS best practice by using CloudFormation to deploy the code to the various environments (dev, test, and prod) through automation

To fulfill SocialHi5’s need for a client self-service portal that was also easy to maintain, Agilisium’s 5-member expert team built a custom web application with a heavy focus on the visualization of campaign outcomes. In parallel, they developed a DevOps process to maintain, scale, and operate this portal.

Web Application Architecture


A variety of AWS services and some open source technologies were used to build and run the web application. The web layer used a PHP framework, included a login and authentication system, and used Amazon QuickSight to render its outcome dashboards.

The app layer was built on Python, and the backend services ran in Docker containers on Elastic Container Service (ECS), with Auto Scaling and an Application Load Balancer (ALB) to ensure high availability of the portal. The database ran in a private subnet and used RDS MySQL as the database service.

DevOps Process:

As mentioned earlier, SocialHi5 required that the solution be easy to maintain, scale, and operate. To that end, Agilisium’s DevOps engineers developed a 2-part DevOps process focusing on

  • CI/CD for web application development
  • Infrastructure Provisioning for maintenance.

Continuous Integration/Continuous Deployment (CI/CD Process)

All application (Web & App Tier) maintenance was orchestrated via AWS CodePipeline. The AWS CodeCommit, CodeDeploy, and CodeBuild services were invoked to automate the enhancement and maintenance of the self-service portal.

CI/CD Process Flow: Web Tier



Infrastructure provisioning

All infrastructure was hosted on an exclusive SocialHi5 Virtual Private Cloud (VPC), to add an extra layer of confidentiality. AWS CloudFormation templates were used to spin up and maintain a host of AWS services utilized for the self-service portal.

Serverless Web application hosting: EC2, ECS, RDS, S3, SSM, VPC, NAT Gateway, ALB with an Auto Scaling group, Lambda, Certificate Manager, and Route 53 were some of the services used to get the portal live.

Security: Web Application Firewall (WAF) was used with cross-site scripting, geo match, and SQL injection rules, in conjunction with the Amazon Inspector service, to protect against common cyber threats.

Monitoring and Logging: CloudWatch, OpsWorks, Config & Inspector services were also invoked to cover configuration management, logging, and monitoring of the application and infrastructure.


Monitoring & Logging
  • Amazon CloudWatch monitors the status of the ongoing replication tasks
  • Amazon Simple Notification Service (Amazon SNS) is configured to send notifications of errors in the CloudWatch logs for the task
  • Monitored metrics include network throughput, client connections, and I/O for read, write, or metadata operations
Security
  • IAM best practices and principles are followed
  • Least-privilege access is provided
  • Unique non-root credentials are provided
  • Programmatic access for API calls
  • Security groups are defined to restrict traffic
  • All data stores are in private subnets
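As an illustration of the least-privilege principle listed above, the sketch below builds an IAM policy document that grants only two actions on two specific resources. It assumes the Lambda function needs to read one database secret and publish to one alert topic, which the case study does not state explicitly. The ARNs in the usage example are placeholders.

```python
import json
from typing import Any, Dict


def least_privilege_policy(secret_arn: str, topic_arn: str) -> Dict[str, Any]:
    """Build an IAM policy allowing only the two calls this sketch assumes are needed."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadDatabaseSecret",
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            },
            {
                "Sid": "PublishFailureAlerts",
                "Effect": "Allow",
                "Action": ["sns:Publish"],
                "Resource": [topic_arn],
            },
        ],
    }


def has_wildcards(policy: Dict[str, Any]) -> bool:
    """Return True if any statement uses a bare '*' action or resource."""
    for statement in policy["Statement"]:
        if "*" in statement["Action"] or "*" in statement["Resource"]:
            return True
    return False


# Placeholder ARNs for illustration only.
policy = least_privilege_policy(
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-db-credentials",
    "arn:aws:sns:us-east-1:111122223333:example-cdc-alerts",
)
policy_json = json.dumps(policy, indent=2)
```

Naming each resource explicitly, rather than using `*`, means a compromised function can reach only the one secret and the one topic it was granted.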
AWS services used:
  • Amazon Simple Storage Service
  • AWS Lambda
  • Amazon Aurora
  • AWS CodeCommit
  • AWS CodePipeline
  • AWS CloudTrail
  • Amazon CloudWatch
  • Amazon SNS
  • AWS Identity & Access Management
Results and Benefits
  • Near-real-time insights for data teams and dashboards
  • Data propagation and synchronization
  • Better data quality: capture of real-time changes to data records, and metadata management
  • Improved Scalability and Performance of the system
  • Overall TCO is reduced