OpenSearch Management Choices - Is Amazon's Service Right for Your Business?

The demand for scalable, robust, open-source search engines has never been greater as enterprises process and analyze continuous streams of big data. One such solution is OpenSearch, a Java-based, enterprise-grade search engine built on Lucene that returns fast responses by searching indices. Amazon has taken OpenSearch a step further by offering a fully managed service that is easy to deploy and operate.

Amazon OpenSearch in Depth

The AWS OpenSearch service removes the challenge of self-deploying, scaling, and managing OpenSearch clusters by providing services for hardware provisioning, software installation and patching, failure recovery, backups, and monitoring. The service supports OpenSearch’s open-source APIs and integrates with Logstash, Kibana, and Beats, allowing the use of an organization’s existing code and tools for efficient and secure data insights.

An Amazon OpenSearch domain is equivalent to an OpenSearch cluster. A domain is a cluster defined by its instance type and count, its settings, and its storage resources, and AWS allows multiple indices to be created within a domain. AWS OpenSearch applies changes through a blue/green deployment process, in which the live production environment keeps running while a second, idle environment is prepared alongside it.

Log files, messages, metrics, configuration information, documents, and lists are captured, processed, and loaded into AWS’s service, where they are fully managed and secured and can be explored in Kibana. Use cases include application monitoring, security information and event management, search, and infrastructure monitoring. The rapidly processed output of search results, analytics, and logs gives organizations real-time insight into their systems.

Open-Source OpenSearch Tasks

Despite the greater simplicity of setup and management of Amazon’s OpenSearch service, many users prefer to use the independent open-source version for its greater flexibility and decreased administration costs. However, this method requires a great deal of time and expertise to deploy and manage. These tasks include network configurations as well as developing router and firewall rules for cluster communications. Other critical responsibilities are the creation of load balancers and automatic scaling groups, automating configuration, and managing security upgrades. A self-managed OpenSearch platform also requires a 24/7 team to oversee the operations and be available to resolve issues.

Self-Managed OpenSearch Benefits

Both approaches to OpenSearch present their share of benefits and limitations. Self-managed OpenSearch carries the following advantages:

1. Cost savings

The unmanaged version of OpenSearch can be as much as 50% cheaper to install and maintain than the AWS service. This is especially the case when the data volume is under 5 GB and is spread across multiple small clusters.

2. Increased availability of instance types and clusters

Unlike with the AWS service, larger i2 instances can be used, along with the latest c4 and m4 instances. This allows further scaling, which can be more economical. Clusters with more than 20 nodes can also be deployed.

3. Ability to change more index settings

In addition to analysis and shard/replica settings, there is the ability to delay allocation when a node leaves. All index settings can also be changed with the /_settings endpoint. These and other settings included in OpenSearch allow a more finely tuned setup and more efficient use of resources.

4. Cluster-wide settings can be modified

Most settings in OpenSearch can be changed on a running cluster with the cluster update settings API.

5. Greater access to APIs and plugins

A broader variety of APIs, such as Hot Threads, a useful debugging tool, can be used in the self-managed version of OpenSearch. Additionally, all plugins are fully compatible with this version.

6. Access to comprehensive monitoring solutions

A collection of dashboard features is available to monitor the performance of OpenSearch, Kibana, Beats, and Logstash.
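Items 3 through 5 above all come down to plain HTTP calls against the cluster. The following is a minimal sketch, assuming a self-managed node reachable at http://localhost:9200 without authentication; the address, the helper names, and the index name "app-logs" are hypothetical and only illustrate the shape of the requests. The sketch builds the requests without sending them.

```python
import json
from urllib.request import Request

# Assumption: a self-managed node at this address, with no authentication.
BASE_URL = "http://localhost:9200"


def json_request(method, path, body=None):
    """Build (but do not send) an HTTP request against the cluster."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def delay_allocation(index, timeout="5m"):
    """Item 3: delay shard reallocation after a node leaves, via /_settings."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return json_request("PUT", f"/{index}/_settings", body)


def update_cluster_setting(name, value, persistent=True):
    """Item 4: change a setting on a running cluster via /_cluster/settings."""
    scope = "persistent" if persistent else "transient"
    return json_request("PUT", "/_cluster/settings", {scope: {name: value}})


def hot_threads():
    """Item 5: ask every node for its busiest threads (Hot Threads API)."""
    return json_request("GET", "/_nodes/hot_threads")


if __name__ == "__main__":
    requests_to_show = (
        delay_allocation("app-logs"),  # hypothetical index name
        update_cluster_setting("cluster.routing.allocation.enable", "all"),
        hot_threads(),
    )
    for req in requests_to_show:
        print(req.get_method(), req.full_url, req.data)
    # To actually send one of them: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to review exactly what would change on the cluster before anything is applied.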

The biggest drawback to using self-managed OpenSearch is the necessity of experienced admins who must ensure security, backups, monitoring, and failure recovery are kept current. Additionally, the added responsibilities of managing your instances can be time-consuming and costly.

AWS OpenSearch Benefits

The benefits of using AWS OpenSearch managed services are as follows:

1. Quickly deployed and easily managed

Setup and configuration of clusters in AWS OpenSearch are efficient and include simplified management tools. Built-in monitoring and alerting ensure any changes are addressed proactively. Infrastructure monitoring through log and metric collection helps ensure efficient server performance.

2. Robust security

Domains can run in a secure, isolated network with Amazon VPC. Additionally, AWS IAM and Amazon Cognito policies manage authentication and access. Logs from routers, applications, and devices can be indexed, searched, and visualized with Kibana to detect security threats.

3. Highly scalable

Clusters are easily scaled using a single API call or through the AWS console. Data can be easily replicated between three Availability Zones in the same region.

4. Built-in integrations

AWS services such as AWS IoT, CloudWatch Logs, and Kinesis Firehose can be seamlessly integrated.

5. Open source API support

New software or coding skills are not required since AWS OpenSearch supports open-source APIs. From the ELK stack, Logstash is supported for open-source data ingestion and Kibana for data visualization.
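The "single API call" mentioned under scalability can be sketched as follows. This is a minimal illustration, assuming the boto3 OpenSearch client and its update_domain_config operation; the function name build_scale_request, the domain name "logs-prod", and the instance type are hypothetical. The sketch only builds the parameters, so it runs without AWS credentials.

```python
def build_scale_request(domain_name, instance_type, instance_count, zone_count=3):
    """Build keyword arguments for one scaling call on a managed domain."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if zone_count not in (2, 3):
        raise ValueError("zone_count must be 2 or 3")
    return {
        "DomainName": domain_name,
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            # Spread data nodes across Availability Zones in the region.
            "ZoneAwarenessEnabled": True,
            "ZoneAwarenessConfig": {"AvailabilityZoneCount": zone_count},
        },
    }


if __name__ == "__main__":
    # Hypothetical domain name and instance type, for illustration only.
    params = build_scale_request("logs-prod", "r6g.large.search", 6)
    print(params)
    # With boto3 installed and credentials configured, the call would be:
    #   import boto3
    #   boto3.client("opensearch").update_domain_config(**params)
```

Choosing an instance count that is a multiple of the zone count keeps nodes evenly distributed across the Availability Zones.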

AWS OpenSearch Limitations

Although Amazon’s service brings several advantages to organizations, the following limitations must also be considered before selecting this service.

1. Cost considerations

Rates vary by instance type and region and are paid by the hour. Although there are no upfront fees or minimum usage stipulations, the costs can become excessive with operational scaling. Automated snapshots are stored for no charge, but additional manual ones are subject to extra costs.

2. Limited plug-in availability

While additional plugins can be installed on a self-managed cluster, the AWS service comes prepackaged with a limited set that cannot be modified or expanded.

3. Limited configuration settings

The AWS service does not allow access permission configuration, which could create problems if users change settings on their own. The lack of full control over cluster settings and storage can result in scalability limitations.

4. Day-to-day management is still needed

AWS OpenSearch still requires oversight by experienced staff to ensure the system is functioning efficiently.
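To see how the hourly pricing described under cost considerations grows with scale, a rough estimate multiplies the hourly rate by the number of instances and the hours in a month. The sketch below is an illustration only: the rate of 0.25 per instance-hour is a made-up figure, not an AWS price, and 730 is an assumed average number of hours per month. Storage, data transfer, and snapshot charges are left out.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_instance_cost(hourly_rate, instance_count, hours=HOURS_PER_MONTH):
    """Estimate the monthly instance charge for a domain billed by the hour."""
    if hourly_rate < 0 or instance_count < 0 or hours < 0:
        raise ValueError("rate, count, and hours must be non-negative")
    return round(hourly_rate * instance_count * hours, 2)


if __name__ == "__main__":
    rate = 0.25  # hypothetical rate per instance-hour, not an AWS price
    for count in (3, 6, 12):
        print(count, "instances:", monthly_instance_cost(rate, count))
```

Because the instance charge is linear in the instance count, doubling the cluster doubles this part of the bill, which is why costs can climb quickly as operations scale.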

Key Takeaways

OpenSearch is a powerful, distributed engine that allows companies to store, search, and analyze massive data volumes in near real-time. However, an on-premises OpenSearch deployment requires staff to install the engine, provide the necessary software, provision infrastructure, and manage the cluster. Additionally, security, software patching, backups, and failure recovery are necessary considerations when running the engine in a self-managed environment. Conversely, Amazon’s managed service provides the resources to deploy, manage, and secure the OpenSearch engine, allowing company staff to focus on other pertinent issues.

Agilisium delivers innovative cloud-based solutions and services to provide greater data insights that improve company profitability. Contact us for more information on our various managed service offerings.
