Keep up with the latest @ AWS re:Invent!
AWS re:Invent 2019 was an exciting event for Amazon cloud users. With over 77 product launches, feature releases, and services announced, there was something in it for everyone.
With a strong focus on innovation, AWS made a series of announcements, including Graviton2-based instances for general-purpose, memory-intensive, and compute-intensive workloads; SageMaker Studio for managing end-to-end machine learning development workflows; the custom ML chip AWS Inferentia; and services like Amazon CodeGuru and Amazon Augmented AI that offer AI capabilities without requiring ML expertise.
Looking at the range of services launched, it appears that AWS is looking to establish an infrastructure that goes beyond its data centers. With AWS Local Zones placed close to large population centers, AWS Wavelength for 5G edge services, AWS Outposts within private data centers, and the already nice-to-have Echo Dot and Alexa, AWS is creating a cloud platform that is all-pervading and easily adaptable.
With so many services launched, one needs to know what to look for to make the best of AWS’s latest service offerings. Developers and architects from Agilisium who attended the event identified much-needed technical tips from across sessions. Here is a sneak peek at some of those tech tips.
1: Amazon Redshift now supports cross-instance restore, allowing restoration of Redshift snapshots to clusters that are different sizes or running different node types.
2: Amazon Athena now supports Java-based UDFs in queries, built on the Athena Query Federation SDK, enabling customers to perform a range of custom processing such as compression, redaction, and decryption. https://amzn.to/33LdG75
3: Amazon Kinesis Data Firehose has enhanced data security by supporting server-side encryption (SSE) with customer managed CMKs, in addition to the existing AWS-CMK support. https://amzn.to/35Ydkez
4: Amazon Athena adds support for invoking machine learning models in SQL queries. https://amzn.to/35Vul9p
5: Move from server-dependent orchestration tools like Apache Airflow to serverless orchestration using AWS Step Functions, and launch and run your ETL workflows on EMR.
6: Use Amazon Aurora to add machine learning (ML) based predictions to your applications, using a simple, optimized, and secure integration. https://amzn.to/35VuABl
7: Amazon Redshift is now auto-managed through machine learning based optimization: Auto Analyze, distribution/sort key advisors, Auto Vacuum Delete, and Auto Sort.
8: AWS Data Exchange – a service that makes it easy for AWS customers to securely find, subscribe to, and use third-party data in the cloud.
9: AWS Identity and Access Management Access Analyzer is a new feature that makes it simple for security teams and administrators to check that their policies provide only the intended access to resources.
10: With Amazon Redshift materialized views, it’s possible to store the pre-computed results of queries, which significantly improves query performance.
11: Amazon EMR now supports running multiple EMR steps simultaneously, as well as canceling them. This helps increase cluster resource utilization and reduce the time taken to complete your workload.
12: Amazon Redshift now supports geographic data such as latitude and longitude using GEOMETRY-typed data columns. Use cases include finding the nearest place for accommodation, choosing the best location for marketing based on sales, and others.
13: Amazon SageMaker Operators for Kubernetes make it easier for developers and data scientists using Kubernetes to train, tune, and deploy machine learning models in Amazon SageMaker.
14: Amazon Athena now supports ETL of data into an S3 data lake using CTAS and INSERT INTO statements. Subsequent Athena queries return results quicker, which in turn reduces costs significantly. https://amzn.to/2Ri31Oq
15: AWS DataSync adds the ability to schedule data transfers between NFS servers, SMB servers, Amazon S3, and Amazon Elastic File System (Amazon EFS). https://amzn.to/2rUBKXz
16: Amazon Redshift now supports materialized views with incremental refresh capability, dramatically improving repeated query performance for reporting & dashboard workloads. https://amzn.to/2sH6s7a
17: Amazon SageMaker Autopilot removes the guesswork from choosing the right algorithm for a dataset. The API helps to inspect data, choose the appropriate algorithms, and tune hyperparameters. https://amzn.to/36d2BgL
18: Amazon SageMaker Studio removes bottlenecks during machine learning development as it unifies all the required tools for development & debugging from within the studio, eventually speeding up the development process. https://amzn.to/2LnxpmZ
19: Amazon SageMaker Debugger automatically identifies issues while developing models & running training jobs. It helps identify inappropriate parameter values, hyperparameter values and code issues. https://amzn.to/2YgZvVU
20: Losing sleep over data quality affecting your ML model in production? Amazon SageMaker Model Monitor keeps track of data quality and alerts (via Amazon CloudWatch) when there is a drift in the statistics of the data. https://amzn.to/2Rv3jlp
21: If you are an ML engineer who spends a lot of time converting and cleaning data to match the ML model, worry not: Amazon SageMaker can help with pre-processing of datasets. https://amzn.to/2OPsJsc
22: If you are wondering how to keep track of all your ML training experiments & their results, Amazon SageMaker Experiments allows a developer to organize and keep track of experiments and model versions. https://amzn.to/33M26IQ
23: Amazon SageMaker now supports Deep Graph Library, which allows easy implementation of graph neural networks, typically used to train on datasets for social networks and recommender systems. https://amzn.to/2Rm79wS
24: Amazon CodeGuru is a new machine learning service for development teams who want to automate code reviews, identify the most expensive lines of code in their applications, and receive intelligent recommendations on how to fix or improve their code.
25: Amazon S3 Access Points is a feature of S3 that makes it easy to manage access to shared data sets on S3.
26: Announcing the preview of new Amazon EC2 M6g, C6g, and R6g instances, powered by AWS Graviton2 processors that are custom-built using 64-bit Arm Neoverse cores to deliver the best price performance for your cloud workloads. Await the launch soon!
27: Amazon Athena released a new feature that allows users to easily invoke machine learning models directly from SQL queries, simplifying tasks such as anomaly detection, customer cohort analysis & sales predictions.
28: Amazon introduces the EMR runtime for Apache Spark, which is up to 32 times faster than EMR 5.16, with 100% API compatibility with open-source Spark. Now run workloads faster, at lower compute costs, without making any changes to your applications.
29: PartiQL is a multifaceted SQL-compatible query language that can be used to query structured data in relational databases, semi-structured and nested data in open data formats, and even schema-less data in NoSQL or document databases.
30: Amazon Redshift now supports stored procedures, which makes it easier to migrate existing workloads from legacy, on-premises data warehouses. AWS implemented PL/pgSQL stored procedures to maximize compatibility with existing procedures.
31: AWS CloudFormation announces drift detection support in StackSets, which extend CloudFormation stacks by allowing you to manage stack operations across multiple regions and accounts in one operation.
32: AQUA is a new distributed & hardware-accelerated cache that enables Amazon Redshift to run up to 10x faster than any other cloud data warehouse. https://bit.ly/2RglpHD
33: Amazon Detective is a new service that offers SIEM-like capability, simplifying investigation into the root cause of potential security issues or suspicious activities. It automatically collects log data from AWS resources.
34: Amazon Redshift RA3 instances let customers scale compute and storage separately and deliver up to 3x better performance than other cloud data warehouse providers.
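
To make tip 5 more concrete, here is a minimal sketch of what a serverless EMR workflow could look like as an AWS Step Functions state machine definition (Amazon States Language), built as a plain Python dictionary. The function name `build_emr_workflow`, the cluster name, release label, instance types, IAM role names, and the S3 script path are all illustrative assumptions, not values from the sessions; adapt them to your own account before use.

```python
import json


def build_emr_workflow(script_s3_uri: str) -> dict:
    """Return a state machine definition that creates an EMR cluster,
    runs one Spark step on it, and terminates the cluster afterwards.

    All names, sizes, and roles below are placeholder assumptions.
    """
    return {
        "Comment": "Sketch: transient EMR cluster running a single Spark ETL step",
        "StartAt": "CreateCluster",
        "States": {
            "CreateCluster": {
                "Type": "Task",
                # ".sync" makes Step Functions wait until the cluster is ready.
                "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
                "Parameters": {
                    "Name": "etl-cluster",  # hypothetical name
                    "ReleaseLabel": "emr-5.28.0",
                    "Applications": [{"Name": "Spark"}],
                    "ServiceRole": "EMR_DefaultRole",
                    "JobFlowRole": "EMR_EC2_DefaultRole",
                    "Instances": {
                        "KeepJobFlowAliveWhenNoSteps": True,
                        "InstanceGroups": [
                            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
                        ],
                    },
                },
                "ResultPath": "$.cluster",
                "Next": "RunSparkStep",
            },
            "RunSparkStep": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    # Cluster ID comes from the output of the previous state.
                    "ClusterId.$": "$.cluster.ClusterId",
                    "Step": {
                        "Name": "spark-etl",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", "--deploy-mode", "cluster", script_s3_uri],
                        },
                    },
                },
                "ResultPath": "$.step",
                # Terminate the cluster even if the step fails.
                "Catch": [
                    {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error", "Next": "TerminateCluster"}
                ],
                "Next": "TerminateCluster",
            },
            "TerminateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster",
                "Parameters": {"ClusterId.$": "$.cluster.ClusterId"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_emr_workflow("s3://example-bucket/jobs/etl.py")
    print(json.dumps(definition, indent=2))
```

The printed JSON could be pasted into the Step Functions console or passed to the service when creating a state machine. The `Catch` block is a design choice in this sketch so that a failed Spark step still leads to cluster termination rather than leaving a cluster running.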
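
Tip 14 (Athena CTAS and INSERT INTO for ETL) can be illustrated with a small sketch that only builds the SQL text; it does not connect to Athena. The helper names `build_ctas` and `build_insert_into`, and the table, column, and bucket names in the usage example, are hypothetical. Identifiers are interpolated directly into the statement, so this assumes trusted input.

```python
def build_ctas(target, source, location, columns, partition_cols=()):
    """Build an Athena CTAS statement that writes Parquet files to S3.

    Athena expects partition columns to come last in the SELECT list,
    so they are appended after the regular columns.
    """
    select_list = ", ".join(list(columns) + list(partition_cols))
    props = ["format = 'PARQUET'", f"external_location = '{location}'"]
    if partition_cols:
        quoted = ", ".join(f"'{c}'" for c in partition_cols)
        props.append(f"partitioned_by = ARRAY[{quoted}]")
    return (
        f"CREATE TABLE {target}\n"
        f"WITH ({', '.join(props)})\n"
        f"AS SELECT {select_list}\n"
        f"FROM {source}"
    )


def build_insert_into(target, source, columns, where=None):
    """Build an INSERT INTO ... SELECT statement for incremental loads."""
    sql = f"INSERT INTO {target}\nSELECT {', '.join(columns)}\nFROM {source}"
    if where:
        sql += f"\nWHERE {where}"
    return sql


if __name__ == "__main__":
    # Initial load: convert a raw table into partitioned Parquet.
    print(build_ctas(
        "sales_parquet", "sales_raw",
        "s3://example-bucket/curated/sales/",
        ["order_id", "amount"], ["dt"],
    ))
    # Incremental load: append one more day of data.
    print(build_insert_into(
        "sales_parquet", "sales_raw",
        ["order_id", "amount", "dt"], "dt = '2019-12-05'",
    ))
```

Writing the curated table in a columnar format such as Parquet and partitioning it is one common reason later queries scan less data, which is where the speed and cost benefits mentioned in the tip would come from.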