Case Study
Delivered insights by curating rapidly growing data for a privately held company that offers a range of products and services

About the Client

Our Client is a privately held company that offers a range of high-quality products and services to consumers. They grow, harvest, bottle and market a diverse range of products including fruits, nuts, flowers, water, wines and juices.

Challenges

  • Business users were unable to gather meaningful insights from the data available on-premises in an MS SQL Server database
  • Data was growing at a rapid pace, adding to the challenges of managing the database
  • The on-premises infrastructure could not easily scale to meet the needs of the growing data
  • Cost and performance of the on-premises solution were a concern

Solution Highlights

  • Data migration from on-premises to the cloud
  • The target was to move the data into Amazon Redshift
  • AWS Lambda was chosen to process and handle the data flows
  • AWS DMS was considered, with an intermediate transfer to S3, but the handshake from MS SQL to S3 and from S3 to Redshift was not seamless
  • AWS Glue was considered, but the speed of processing data was a concern
  • Amazon Kinesis Data Firehose was considered, but was expensive due to pricing based on 5 KB blocks of data
Data processing
  • An AWS Lambda function was implemented to convert the CSV files to Parquet and store them back in S3 (a minimal sketch follows this list)
  • An AWS Glue crawler was implemented to catalog the converted Parquet files so they could be read through Amazon Redshift Spectrum
  • Redshift Spectrum was enabled to support ad-hoc queries, and Power BI reports were built to view snapshot data for trend analysis
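
The sketch below illustrates what such a conversion Lambda could look like, assuming an S3 event trigger and that pandas, pyarrow, and s3fs are packaged in a Lambda layer; the target bucket name is a placeholder, not taken from the case study.

```python
import urllib.parse

import pandas as pd  # assumes a Lambda layer bundling pandas + pyarrow + s3fs

# Placeholder output bucket for the converted Parquet files.
TARGET_BUCKET = "example-curated-bucket"


def handler(event, context):
    """Convert each uploaded CSV object to Parquet and write it back to S3."""
    for record in event["Records"]:
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Read the CSV straight from S3 (s3fs resolves the s3:// path).
        df = pd.read_csv(f"s3://{src_bucket}/{src_key}")

        # Write a columnar Parquet copy that the Glue crawler and Redshift Spectrum can pick up.
        dest_key = src_key.rsplit(".", 1)[0] + ".parquet"
        df.to_parquet(f"s3://{TARGET_BUCKET}/{dest_key}", index=False)

    return {"converted": len(event["Records"])}
```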
Data migration
  • A Windows Server instance was set up on Amazon EC2
  • S3 sync was used to move files from on-premises to the Windows Server
  • An Amazon S3 bucket was set up as the landing zone for data transferred from the Windows Server (a sketch of this upload step follows the list)
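
As a rough illustration of the transfer step, the sketch below mirrors what an S3 sync from the Windows Server would do, walking a staging folder with boto3 and uploading each file; the directory and bucket names are hypothetical.

```python
import os

import boto3

s3 = boto3.client("s3")

# Hypothetical staging folder on the EC2 Windows Server and landing bucket.
LOCAL_DIR = r"D:\staging\exports"
LANDING_BUCKET = "example-landing-bucket"


def upload_directory(local_dir: str, bucket: str, prefix: str = "incoming/") -> None:
    """Upload every file under local_dir to s3://bucket/prefix, keeping the folder layout."""
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, local_dir).replace(os.sep, "/")
            s3.upload_file(path, bucket, prefix + rel)


if __name__ == "__main__":
    upload_directory(LOCAL_DIR, LANDING_BUCKET)
```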
Deployment Automation
  • AWS CloudFormation was used to deploy the solution to all environments (Dev, QA, Integration, Production), as sketched below
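
A minimal sketch of pushing one template to each environment with boto3 is shown below; the stack names, template path, and parameter key are assumptions, since the actual templates are not included in this case study.

```python
import boto3

cfn = boto3.client("cloudformation")

# Hypothetical template and environment names; the real pipeline may differ.
TEMPLATE_PATH = "pipeline-stack.yaml"
ENVIRONMENTS = ["dev", "qa", "int", "prod"]

with open(TEMPLATE_PATH) as f:
    template_body = f.read()

for env in ENVIRONMENTS:
    cfn.create_stack(
        StackName=f"data-pipeline-{env}",
        TemplateBody=template_body,
        Parameters=[{"ParameterKey": "Environment", "ParameterValue": env}],
        Capabilities=["CAPABILITY_NAMED_IAM"],  # needed when the template creates IAM roles
    )
    # Wait until the stack finishes creating before moving to the next environment.
    cfn.get_waiter("stack_create_complete").wait(StackName=f"data-pipeline-{env}")
```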

To fulfill SocialHi5’s need for a client self-service portal that was also easy to maintain, Agilisium’s 5-member expert team built a custom web application with a heavy focus on the visualization of campaign outcomes. In parallel, they developed a DevOps process to maintain, scale, and operate this portal.

Web Application Architecture


A variety of AWS services and some open-source technologies were used to build and run the web application. The web layer was built on a PHP framework, included a login and authentication system, and used Amazon QuickSight to render its outcome dashboards.
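
The case study does not show how the dashboards were wired into the PHP pages, but one common pattern is to generate a short-lived embed URL server-side; the sketch below does this with boto3, with the account ID, region, and dashboard ID as placeholders.

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Placeholder identifiers; the portal's real account and dashboard IDs are not public.
ACCOUNT_ID = "123456789012"
DASHBOARD_ID = "campaign-outcomes-dashboard"


def campaign_dashboard_url() -> str:
    """Return a short-lived QuickSight embed URL the web tier can place in an iframe."""
    response = quicksight.get_dashboard_embed_url(
        AwsAccountId=ACCOUNT_ID,
        DashboardId=DASHBOARD_ID,
        IdentityType="IAM",
        SessionLifetimeInMinutes=60,
    )
    return response["EmbedUrl"]
```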

The app layer was built on Python, and the backend services ran as Docker containers on Amazon Elastic Container Service (ECS) with Auto Scaling and an Application Load Balancer (ALB) to ensure high availability of the portal. The database ran in a private subnet on Amazon RDS for MySQL.
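
As a simplified illustration, an app-tier task might reach the private-subnet database by pulling credentials from SSM Parameter Store (one of the services listed later) rather than hard-coding them; the parameter name, endpoint, and schema below are hypothetical.

```python
import boto3
import pymysql  # PyMySQL client, assumed to be packaged with the app-tier image

ssm = boto3.client("ssm")


def get_db_connection():
    """Open a connection to the RDS MySQL instance using a password stored in SSM."""
    # Hypothetical parameter name; the portal's real naming scheme is not documented.
    password = ssm.get_parameter(
        Name="/socialhi5/portal/db_password", WithDecryption=True
    )["Parameter"]["Value"]

    return pymysql.connect(
        host="portal-db.example.internal",  # RDS endpoint inside the private subnet
        user="portal_app",
        password=password,
        database="portal",
        connect_timeout=5,
    )
```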

DevOps Process:

As mentioned earlier, SocialHi5 required a solution that was easy to maintain, scale, and operate. To that end, Agilisium’s DevOps engineers developed a 2-part DevOps process focused on:

  • CI/CD for web application development
  • Infrastructure Provisioning for maintenance.

Continuous Integration/Continuous Deployment (CI/CD Process)

All application (web and app tier) maintenance was orchestrated via AWS CodePipeline. AWS CodeCommit, CodeDeploy, and CodeBuild were invoked to automate the enhancement and maintenance of the self-service portal.
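
For illustration, the sketch below shows how a release could be kicked off and watched from a script using boto3; the pipeline name is a placeholder, and in practice the pipelines would normally be triggered automatically by CodeCommit pushes.

```python
import time

import boto3

codepipeline = boto3.client("codepipeline")

PIPELINE_NAME = "socialhi5-web-tier"  # hypothetical pipeline name


def release_and_wait(pipeline_name: str) -> None:
    """Start a pipeline execution and print its status until it settles."""
    execution_id = codepipeline.start_pipeline_execution(name=pipeline_name)[
        "pipelineExecutionId"
    ]
    while True:
        status = codepipeline.get_pipeline_execution(
            pipelineName=pipeline_name, pipelineExecutionId=execution_id
        )["pipelineExecution"]["status"]
        print(f"{pipeline_name}: {status}")
        if status in ("Succeeded", "Failed", "Stopped", "Superseded"):
            break
        time.sleep(15)
```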

CI/CD Process Flow: Web Tier



Infrastructure provisioning

All infrastructure was hosted in a dedicated SocialHi5 Virtual Private Cloud (VPC) to add an extra layer of isolation. AWS CloudFormation templates were used to spin up and maintain the AWS services underpinning the self-service portal.

Web application hosting: EC2, ECS, RDS, S3, Systems Manager (SSM), VPC, NAT Gateway, ALB with an Auto Scaling group, Lambda, Certificate Manager, and Route 53 were some of the services used to get the portal live.

Security: AWS WAF (Web Application Firewall) was used with cross-site scripting, geo match, and SQL injection rules to protect against common web threats, in conjunction with Amazon Inspector.

Monitoring and logging: CloudWatch, OpsWorks, Config, and Inspector were also used to cover configuration management, logging, and monitoring of the application and infrastructure.


Monitoring & Logging
  • AWS Systems Manager is set up as the configuration management server
  • Patching of servers is handled by AWS Systems Manager
  • Amazon CloudWatch metrics are enabled to track the health of solution components
  • Logging is enabled in AWS Lambda to measure latency (see the sketch after this list)
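
As an example of the latency tracking described above, a Lambda handler can time its own work and publish a custom CloudWatch metric; the namespace and metric name below are assumptions, not values from the case study.

```python
import time

import boto3

cloudwatch = boto3.client("cloudwatch")


def handler(event, context):
    """Process the event, then record how long it took as a custom CloudWatch metric."""
    start = time.time()

    # ... actual processing of the incoming records would happen here ...

    elapsed_ms = (time.time() - start) * 1000.0
    cloudwatch.put_metric_data(
        Namespace="DataPipeline",  # hypothetical namespace
        MetricData=[
            {
                "MetricName": "ProcessingLatency",
                "Value": elapsed_ms,
                "Unit": "Milliseconds",
            }
        ],
    )
```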
Security
  • IAM best practices and principles are followed
  • Least privileged access is provided
  • Unique non-root credentials are provided
  • Programmatic access for API calls
  • Security groups are defined to restrict traffic
  • All data stores are in private subnets
  • AWS KMS is used to encrypt data at rest (see the sketch after this list)
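
The sketch below shows what KMS-backed encryption at rest looks like when writing an object to S3 with boto3; the bucket name and key alias are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and KMS key alias; the production names are not part of the case study.
BUCKET = "example-curated-bucket"
KMS_KEY_ALIAS = "alias/data-pipeline"

s3.put_object(
    Bucket=BUCKET,
    Key="curated/sample.parquet",
    Body=b"...",                      # file contents would go here
    ServerSideEncryption="aws:kms",   # encrypt at rest with a KMS key
    SSEKMSKeyId=KMS_KEY_ALIAS,
)
```

Default bucket encryption or a bucket policy can enforce the same setting so that unencrypted uploads are rejected.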
AWS services used:
  • Amazon Simple Storage Service
  • AWS Lambda
  • Amazon EC2
  • AWS Key Management Service
  • AWS Identity & Access Management
  • AWS CloudFormation
  • AWS CloudTrail
  • Amazon CloudWatch
  • AWS Systems Manager
  • Amazon Simple Notification Service
  • AWS CodePipeline
  • Amazon Redshift
  • AWS Glue
  • AWS X-Ray
  • AWS CodeCommit
Results and Benefits
  • With all the data now available on AWS, The Wonderful Company can move forward with its long-term goal of building Data & Analytics and Data Science services
  • Scalability and elasticity are built into the solution on AWS Cloud
  • Kinesis Firehose was leveraged to create the pipeline
  • With the pay-as-you-go model, the Total Cost of Ownership of the solution is now significantly reduced
  • Optimal performance with AWS Lambda: a 180 MB file containing 3,273,300 records of 60-70 bytes each is processed in about 25 seconds
  • Close to 1.4 billion records are processed per day (see the quick check below)
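
A quick back-of-the-envelope check on the per-file figures stated above, using only the numbers given in this case study:

```python
file_mb = 180          # size of one processed file, in MB
records = 3_273_300    # records in that file
seconds = 25           # observed Lambda processing time

print(f"{records / seconds:,.0f} records/second")  # ~130,932 records/second
print(f"{file_mb / seconds:.1f} MB/second")         # ~7.2 MB/second
```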