ElasticSearch Management Choices - Is Amazon's Service Right for Your Business?

The demand for scalable, robust, open-source search engines has never been greater as enterprises process and analyze ever-growing streams of big data. One such solution is Elasticsearch, a Java-based, enterprise-grade search engine built on Lucene that delivers fast search responses by querying indexes. Amazon has taken Elasticsearch a step further by providing a fully managed service that is easily deployed and managed.

Amazon ElasticSearch in Depth

The AWS ElasticSearch service removes the challenge of self-deploying, scaling, and managing ElasticSearch clusters by providing services for hardware provisioning, software installation and patching, failure recovery, backups, and monitoring. The service supports ElasticSearch’s open-source APIs and integrates with Logstash, Kibana, and Beats, allowing the use of an organization’s existing code and tools for efficient and secure data insights.

An Amazon ElasticSearch domain is synonymous with an ElasticSearch cluster. AWS domains are clusters that are specified with instance type and count, settings, and storage resources. AWS allows the creation of multiple indices within a domain. AWS ElasticSearch utilizes a blue/green deployment process, which runs two production environments, one live and one idle, and switches between them as changes are made.

Log files, messages, metrics, configuration information, documents, and lists are captured, processed, and loaded into AWS’s service, where the data is fully managed and secured and can be explored with Kibana. Use cases include application monitoring, security information and event management, search results, and infrastructure monitoring. The quickly processed output of search results, analytics, and logs provides organizations with real-time system insights.

Open-Source ElasticSearch Tasks

Despite the simpler setup and management of Amazon’s ElasticSearch service, many users prefer the independent open-source version for its greater flexibility and decreased administration costs. However, this method requires a great deal of time and expertise to deploy and manage. Deployment tasks include configuring the network and developing router and firewall rules for cluster communications. Other critical responsibilities are creating load balancers and automatic scaling groups, automating configuration, and managing security upgrades. A self-managed ElasticSearch platform also requires a 24/7 team to oversee operations and be available to resolve issues.

Self-Managed ElasticSearch Benefits

Both approaches to ElasticSearch present their share of benefits and limitations. Self-managed ElasticSearch carries the following advantages:

1. Cost savings

The unmanaged version of ElasticSearch can be as much as 50% cheaper to install and maintain than the AWS service. This is especially the case when the data volume is under 5 GB and is spread across multiple small clusters.

2. Increased availability of instance types and clusters

Unlike with the AWS service, larger i2 instances can be used, along with the latest c4 and m4 instances. This allows further scaling, which can be more economically feasible. Clusters with more than 20 nodes can also be deployed.

3. Ability to change more index settings

In addition to analysis and shard/replica settings, there is the ability to delay allocation when a node leaves. All index settings can also be changed with the /_settings endpoint. These and other settings in ElasticSearch allow a more finely tuned setup and more efficient resource use.
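As a rough illustration of the /_settings endpoint, the sketch below builds (but does not send) a request that delays shard reallocation after a node leaves. The base URL, the index name, and the helper function name are assumptions made for this example; the setting key `index.unassigned.node_left.delayed_timeout` is the one ElasticSearch documents for delayed allocation.

```python
import json
import urllib.request


def build_delayed_allocation_request(base_url, index, timeout="5m"):
    """Build a PUT request for <index>/_settings without sending it.

    base_url and index are placeholders for this sketch.
    """
    body = {
        "settings": {
            # How long to wait before reallocating shards of a departed node.
            "index.unassigned.node_left.delayed_timeout": timeout
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local cluster and index name, for illustration only.
req = build_delayed_allocation_request("http://localhost:9200", "app-logs")
print(req.get_method(), req.full_url)
# To apply it against a reachable cluster: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the payload before it touches a live cluster.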

4. Cluster-wide settings can be modified

Most settings in ElasticSearch can be changed on a running cluster with the cluster update settings API.
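A minimal sketch of a cluster update settings call might look like the following. The endpoint `/_cluster/settings` and the `persistent`/`transient` scopes are part of the ElasticSearch API; the base URL, the helper name, and the choice of `cluster.routing.allocation.enable` as the example setting are assumptions for illustration.

```python
import json
import urllib.request


def build_cluster_settings_request(base_url, settings, persistent=True):
    """Build a PUT request for /_cluster/settings without sending it.

    Persistent settings survive a full cluster restart; transient ones do not.
    """
    scope = "persistent" if persistent else "transient"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=json.dumps({scope: settings}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example: temporarily allow allocation of primary shards only,
# a common step before restarting a node (hypothetical local cluster).
req = build_cluster_settings_request(
    "http://localhost:9200",
    {"cluster.routing.allocation.enable": "primaries"},
    persistent=False,
)
print(req.full_url, req.data.decode("utf-8"))
# To apply it against a reachable cluster: urllib.request.urlopen(req)
```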

5. Greater access to APIs and plugins

A broader variety of APIs, such as Hot Threads, a useful debugging tool, can be used in the self-managed version of ElasticSearch. Additionally, all plugins are fully compatible with this version.
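For reference, the Hot Threads API is exposed under `/_nodes/hot_threads` and returns a plain-text report. The sketch below only assembles the URL; the base URL and the helper name are assumptions, and the `threads` and `interval` values shown are the API's documented defaults.

```python
from urllib.parse import urlencode


def hot_threads_url(base_url, threads=3, interval="500ms", node_id=None):
    """Return the Hot Threads URL for all nodes, or for one node if node_id is given."""
    path = f"/_nodes/{node_id}/hot_threads" if node_id else "/_nodes/hot_threads"
    query = urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical local cluster; fetch the URL with any HTTP client to read the report.
print(hot_threads_url("http://localhost:9200"))
```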

6. Access to comprehensive monitoring solutions

A collection of dashboard features is available to monitor the performance of ElasticSearch, Kibana, Beats, and Logstash.

The biggest drawback to using self-managed ElasticSearch is the necessity of experienced admins who must ensure security, backups, monitoring, and failure recovery are kept current. Additionally, the added responsibilities of managing your instances can be time-consuming and costly.

AWS ElasticSearch Benefits

The benefits of using AWS ElasticSearch managed services are as follows:

1. Quickly deployed and easily managed

Setup and configuration of clusters in AWS ElasticSearch are efficient and include simplified management tools. Built-in monitoring and alerting help ensure that changes are addressed proactively. Infrastructure can be monitored through log and metric collection to keep servers performing efficiently.

2. Robust security

Amazon VPC provides a secure and isolated network. Additionally, AWS IAM and Amazon Cognito policies manage authentication and access. Logs from routers, applications, and devices can be indexed, searched, and visualized with Kibana to detect security threats.

3. Highly scalable

Clusters are easily scaled using a single API call or through the AWS console. Data can be easily replicated between three Availability Zones in the same region.
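As a sketch of what that single API call could look like, the example below prepares the parameters for the AWS SDK for Python (boto3) operation `update_elasticsearch_domain_config`. The domain name and instance count are hypothetical, and the helper functions are written for this illustration rather than taken from AWS documentation.

```python
def build_scale_request(domain_name, instance_count, instance_type=None, zone_count=None):
    """Build the keyword arguments for update_elasticsearch_domain_config."""
    cluster_config = {"InstanceCount": instance_count}
    if instance_type is not None:
        cluster_config["InstanceType"] = instance_type
    if zone_count is not None:
        # Spread data nodes across several Availability Zones in the region.
        cluster_config["ZoneAwarenessEnabled"] = True
        cluster_config["ZoneAwarenessConfig"] = {"AvailabilityZoneCount": zone_count}
    return {"DomainName": domain_name, "ElasticsearchClusterConfig": cluster_config}


def apply_scale_request(params):
    """Send the update; requires boto3 and configured AWS credentials."""
    import boto3  # third-party SDK, imported lazily so the sketch runs without it

    client = boto3.client("es")
    return client.update_elasticsearch_domain_config(**params)


# Hypothetical domain scaled to six data nodes across three Availability Zones.
params = build_scale_request("example-domain", instance_count=6, zone_count=3)
print(params)
# apply_scale_request(params)  # uncomment to run against a real AWS account
```

Keeping the instance count a multiple of the zone count lets the nodes be distributed evenly across zones.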

4. Built-in integrations

AWS services such as AWS IoT, CloudWatch Logs, and Kinesis Firehose can be seamlessly integrated.

5. Open source API support

New software or coding skills are not required since AWS ElasticSearch supports open-source APIs. From the ELK stack, Logstash is supported for open-source data ingestion and Kibana for data visualization.

AWS ElasticSearch Limitations

Although Amazon’s service brings several advantages to organizations, the following limitations must also be considered before selecting this service.

1. Cost considerations

Rates vary by instance type and region and are paid by the hour. Although there are no upfront fees or minimum usage stipulations, the costs can become excessive with operational scaling. Automated snapshots are stored for no charge, but additional manual ones are subject to extra costs.
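To see how hourly billing grows with scale, a back-of-the-envelope estimate can help. The hourly rate below is a made-up placeholder, not an actual AWS price, and the calculation covers instance hours only, ignoring storage, data transfer, and manual snapshot charges.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 hours per year / 12)


def monthly_instance_cost(hourly_rate, instance_count, hours=HOURS_PER_MONTH):
    """Estimate the monthly instance charge, rounded to cents."""
    return round(hourly_rate * instance_count * hours, 2)


# Hypothetical rate of 0.10 per instance-hour; check current AWS pricing for real figures.
for nodes in (3, 6, 12):
    print(nodes, "nodes:", monthly_instance_cost(0.10, nodes))
```

Because the charge is linear in instance count, quadrupling the number of nodes quadruples the instance bill, which is why costs can climb quickly as a cluster scales.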

2. Limited plug-in availability

While a self-managed cluster can install additional plugins, the AWS service comes prepackaged with a limited set that cannot be modified or expanded.

3. Limited configuration settings

The AWS service does not allow access permission configuration, which could create problems if users change settings on their own. The lack of full control over cluster settings and storage can result in scalability limitations.

4. Day-to-day management is still needed

AWS ElasticSearch still requires oversight by experienced staff to ensure the system is functioning efficiently.

Key Takeaways

ElasticSearch is a powerful, distributed engine that allows companies to store, search, and analyze massive data volumes in near real-time. However, an on-premises ElasticSearch deployment requires staff to install the engine, provide the necessary software, provision infrastructure, and manage the cluster. Additionally, security, software patching, backups, and failure recovery are necessary considerations when running the engine in a self-managed environment. Conversely, Amazon’s managed service provides the resources to deploy, manage, and secure the ElasticSearch engine, allowing company staff to focus on other pertinent issues.

Agilisium delivers innovative cloud-based solutions and services to provide greater data insights that improve company profitability. Contact us for more information on our various managed service offerings.
