The challenges of self-managing OpenSearch motivate many organizations to migrate data to AWS OpenSearch Service. While this is not a complex undertaking, it requires attention to detail to avoid accidental data corruption or deletion. The following steps and recommendations can help ensure a smooth migration to AWS OpenSearch Service.
What is AWS OpenSearch Service?
AWS OpenSearch Service is a managed service built for analytics and monitoring workloads. With support for 19 versions of hosted Elasticsearch and visualizations powered by Kibana, it provides an easy way to run searches and quickly generate reports. The platform has been widely adopted: tens of thousands of customers run hundreds of thousands of clusters each month, and the service processes trillions of requests monthly.
Why migrate to AWS OpenSearch from self-hosted OpenSearch?
AWS OpenSearch Service is a cost-effective and scalable search solution that makes it easy to set up, manage, and operate hosted search engines. It is fully compatible with the OpenSearch standard, so it can be used as a drop-in replacement for any self-hosted OpenSearch engine. There are many benefits to using AWS OpenSearch, including the following:
Elastic scaling – With AWS OpenSearch, you can quickly scale your search engine up or down to meet changing demand. There is no need to provision or manage any underlying infrastructure.
High availability – AWS OpenSearch is designed to be highly available. It uses multiple Availability Zones so that your search engine remains available even during an infrastructure failure.
Management console – The AWS OpenSearch Management Console provides a simple way to view and manage your search engine. You can monitor search performance, adjust settings, and manage indexing and query settings from the same console.
Low cost – With Reserved Instances and other cost-saving features, the cost of running AWS OpenSearch is often lower than running a self-hosted search engine.
Security – AWS OpenSearch includes built-in authentication, encryption, access control, and other security features to help keep your search engine safe and secure.
Flexible search capabilities – AWS OpenSearch service provides several tools to customize the search experience, including suggestion APIs, document scoring and filtering, and other advanced search features.
Performance – With Elastic Block Store (EBS) volumes, a range of high-performance instance types, and other performance-enhancing features, AWS OpenSearch can deliver faster search results than many self-hosted deployments.
Flexible pricing – AWS OpenSearch offers flexible pricing plans to fit any budget.
Easy integration – Using the AWS OpenSearch API, you can easily integrate your search engine with other services, such as CRM and analytics, for more comprehensive insights into your customers' search behaviour.
Easy setup and high security – Setup is easy, managing resources is simpler still, and security is stronger thanks to integrated tools like AWS Identity and Access Management (IAM). Pricing is also pay-as-you-go: you pay only for the resources you use, with no fixed capacity commitment and no cost for resources that are not in use.
Overall, migrating to AWS OpenSearch can provide scalability, reliability, ease of use, security and cost savings compared to self-hosted solutions. Why wait? Make the switch today!
Planning and Preparation for Migration
Before migrating OpenSearch to AWS, it is essential to assess the existing environment and define the objectives for the migration. These often include addressing costs as well as improving scalability and reliability. The outcome of this first step can help determine which applications to migrate to the cloud service.
The initial assessment and planning phase of the migration consists of the following steps.
1. Assess current cluster data and plan how many shards will be needed
Analyze the indices to determine how much data they hold, then use this information to determine how many shards are needed. Amazon recommends a shard size of between 10 and 50 GB. Shards should be small enough for the instances to handle, because larger shards can complicate OpenSearch's recovery from failures. Conversely, an excessive number of small shards strains the hardware and can cause performance issues and memory errors. Account for any missing indices, and consider adding another shard if the data volume is expected to increase with the rollout of new applications.
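The shard sizing guidance above can be turned into a quick estimate. The following is a minimal sketch, not an official sizing tool: the function name, the 30 GB default target (a midpoint of the 10 to 50 GB range), and the growth factor parameter are illustrative assumptions.

```python
import math


def primary_shard_count(index_size_gb: float,
                        target_shard_gb: float = 30.0,
                        growth_factor: float = 1.0) -> int:
    """Estimate how many primary shards keep each shard near the target size.

    index_size_gb   -- current size of the index's primary data
    target_shard_gb -- desired shard size (assumed default: 30 GB)
    growth_factor   -- hypothetical multiplier for expected growth (1.0 = none)
    """
    if index_size_gb < 0 or target_shard_gb <= 0 or growth_factor <= 0:
        raise ValueError("sizes and growth factor must be positive")
    projected_gb = index_size_gb * growth_factor
    # Always keep at least one shard, even for an empty index.
    return max(1, math.ceil(projected_gb / target_shard_gb))
```

For example, a 200 GB index with the default 30 GB target yields 7 primary shards, and the same index with 50 percent expected growth yields 10.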
2. Determine how many instances are needed in the cluster
If there are more than ten instances, enable dedicated master nodes during AWS ES cluster creation. With fewer than ten instances and no dedicated master, all nodes will be master-eligible. Amazon recommends a minimum of three nodes to avoid a split-brain condition, in which a lapse in communication leaves one cluster with two master nodes. With three dedicated master nodes, use a minimum of two data nodes to support replication.
3. Calculate storage requirements
One of the most common causes of cluster instability is insufficient storage, so make sure the numbers for instance types, instance counts, and storage volumes are accurate. Use the formula: Minimum Storage Requirement = Source Data * (1 + Number of Replicas) * 1.45. If the domain requires more than 1 PB, Amazon offers storage up to 2 PB. It is advisable to verify the costs associated with domains of this size before proceeding.
While calculating storage, consider the type of OpenSearch workload, which is either long-lived indices or rolling indices. Examples of long-lived indices include website, document, and ecommerce search. For these, code is written to process data into one or more indices, which are updated as the source data changes. Rolling indices receive continuously flowing data into temporary indices with defined indexing periods and retention windows. These apply to log analytics, time-series processing, and clickstream analytics.
The number of replicas needed must also be considered. Amazon recommends at least one replica, though more may be needed to improve search performance in read-heavy workloads. Other aspects include OpenSearch indexing overhead, operating system space, and Amazon ES overhead size.
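The storage formula and the rolling-index case above can be worked through in a few lines. This is a minimal sketch: the function names and the sample figures are illustrative assumptions, and the 1.45 multiplier is the overhead factor from the formula in this section.

```python
def minimum_storage_gb(source_data_gb: float,
                       replicas: int = 1,
                       overhead: float = 1.45) -> float:
    """Source Data * (1 + Number of Replicas) * 1.45, in GB."""
    if source_data_gb < 0 or replicas < 0:
        raise ValueError("source data and replica count must be non-negative")
    return source_data_gb * (1 + replicas) * overhead


def rolling_source_data_gb(daily_ingest_gb: float, retention_days: int) -> float:
    """For rolling indices, source data is the daily volume times the retention window."""
    return daily_ingest_gb * retention_days


# Long-lived index: 500 GB of source data with one replica.
search_storage = minimum_storage_gb(500, replicas=1)

# Rolling indices: 20 GB of logs per day kept for 14 days, one replica.
log_storage = minimum_storage_gb(rolling_source_data_gb(20, 14), replicas=1)
```

With these sample inputs the long-lived index needs roughly 1,450 GB and the log workload roughly 812 GB, before any headroom for growth.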
Migrating to AWS
Upon completion of the assessments above, the migration process can begin as outlined below.
1. Create the AWS domain
The AWS domain can be created using either the CLI or the console. In the “Create OpenSearch domain” console wizard, enter the cluster name in step one. In the steps that follow, enter the instance information and storage size. This creates a new, empty AWS OpenSearch cluster.
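As an alternative to the console wizard, the domain can also be created programmatically. The sketch below assumes the boto3 SDK and its `opensearch` client; the domain name, instance types, engine version string, and the gp3 volume type are placeholder assumptions to replace with values from your own sizing. The dedicated-master rule follows the "more than ten instances" guidance from the planning section.

```python
def build_domain_request(domain_name: str,
                         data_nodes: int,
                         instance_type: str,
                         volume_gb: int,
                         engine_version: str = "OpenSearch_2.11",
                         master_type: str = "m6g.large.search") -> dict:
    """Build keyword arguments for the boto3 OpenSearch create_domain call."""
    cluster_config = {
        "InstanceType": instance_type,
        "InstanceCount": data_nodes,
        # Planning rule from this article: dedicated masters above ten instances.
        "DedicatedMasterEnabled": data_nodes > 10,
    }
    if cluster_config["DedicatedMasterEnabled"]:
        cluster_config["DedicatedMasterType"] = master_type  # assumed type
        cluster_config["DedicatedMasterCount"] = 3
    return {
        "DomainName": domain_name,
        "EngineVersion": engine_version,  # assumed version; match your source cluster
        "ClusterConfig": cluster_config,
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp3",      # assumed volume type
            "VolumeSize": volume_gb,  # GB per data node
        },
    }


def create_domain(request: dict) -> dict:
    """Send the request to AWS. Requires boto3 and configured credentials."""
    import boto3  # imported lazily so the builder works without the SDK
    client = boto3.client("opensearch")
    return client.create_domain(**request)
```

Keeping the request builder separate from the AWS call makes it possible to review or log the configuration before anything is created.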
2. Create an S3 bucket
Because data cannot be migrated by connecting the two OpenSearch clusters directly, an AWS S3 bucket must be created first to hold the snapshots. In the access policy, grant list, read, and write permissions to a new user. The policies can be created using the console or more advanced command options. The bucket must be in the same region as the AWS OpenSearch domain to facilitate the repository registration. Access to the bucket can be verified from the CLI using the user's access key and secret key.
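The list, read, and write permissions described above might look like the following IAM policy document, built here in Python so it can be printed or passed to an SDK call. This is a sketch: the bucket name is a placeholder, and `s3:DeleteObject` is an addition beyond the permissions named in this article, included on the assumption that snapshot cleanup needs to remove old files.

```python
import json


def snapshot_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy granting list, read, and write access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading and writing apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# "my-opensearch-snapshots" is a hypothetical bucket name.
policy_json = json.dumps(snapshot_bucket_policy("my-opensearch-snapshots"), indent=2)
```

The resulting JSON can be pasted into the IAM console or attached to the user or role that will write the snapshots.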
3. Register the S3 bucket as a snapshot repository
To move data between the two clusters, both OpenSearch instances must have the same S3 bucket registered as a snapshot repository. The repository must be registered before any snapshot or restore operation can be performed. The steps for the self-hosted instance are described in the OpenSearch snapshot documentation. Registering the bucket on the AWS-hosted OpenSearch domain requires a user, a role, and policies configured in AWS IAM. From there, the process is similar to the self-hosted one, except that the request must be signed and the OpenSearch role must be specified in the message body. Tools such as Postman or Boto can also be used for this process.
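The signed registration request for the AWS domain can be sketched as follows. This assumes the third-party `requests` and `requests-aws4auth` packages alongside boto3 for credentials; the endpoint, repository name, bucket, and role ARN are placeholders, and the body layout (type `s3` with `bucket`, `region`, and `role_arn` settings) reflects the snapshot repository settings as I understand them, so verify it against the current AWS documentation.

```python
def repository_request(endpoint: str, repo_name: str, bucket: str,
                       region: str, role_arn: str) -> tuple:
    """Build the URL and JSON body for registering an S3 snapshot repository."""
    url = f"{endpoint.rstrip('/')}/_snapshot/{repo_name}"
    body = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": region,
            # The IAM role the OpenSearch domain assumes to reach the bucket.
            "role_arn": role_arn,
        },
    }
    return url, body


def register_repository(url: str, body: dict, region: str) -> dict:
    """Send the signed PUT request. Needs AWS credentials and network access."""
    import boto3
    import requests
    from requests_aws4auth import AWS4Auth

    creds = boto3.Session().get_credentials()
    # "es" is the signing service name used for OpenSearch Service domains.
    auth = AWS4Auth(creds.access_key, creds.secret_key, region, "es",
                    session_token=creds.token)
    response = requests.put(url, auth=auth, json=body)
    response.raise_for_status()
    return response.json()
```

On the self-hosted cluster the same body is typically sent without signing and without the role, since access there comes from the access key and secret key configured for the S3 repository plugin.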
4. Restoring from S3 to AWS ES
After identifying the snapshots to migrate, restore them using either HTTP basic authentication with the master user credentials or AWS authentication with IAM credentials. Before restoring, confirm that all snapshots taken from the self-hosted OpenSearch cluster have been uploaded by listing the contents of the repository from the AWS domain.
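The listing and restore calls can be sketched as below, using HTTP basic authentication with the master user. The endpoint, repository, snapshot, and index names are placeholders, the `requests` package is an assumed dependency, and excluding global state is a cautious default chosen here rather than something this article prescribes.

```python
def list_snapshots_url(endpoint: str, repo_name: str) -> str:
    """URL that lists every snapshot in a registered repository."""
    return f"{endpoint.rstrip('/')}/_snapshot/{repo_name}/_all"


def restore_request(endpoint: str, repo_name: str, snapshot: str,
                    indices: list) -> tuple:
    """Build the URL and JSON body for restoring selected indices."""
    url = f"{endpoint.rstrip('/')}/_snapshot/{repo_name}/{snapshot}/_restore"
    body = {
        # Comma-separated index names or patterns to restore.
        "indices": ",".join(indices),
        # Assumed default: skip cluster-wide state from the source cluster.
        "include_global_state": False,
    }
    return url, body


def restore_with_basic_auth(url: str, body: dict, user: str, password: str) -> dict:
    """Send the restore request with master user credentials."""
    import requests  # imported lazily; requires network access to the domain
    response = requests.post(url, auth=(user, password), json=body)
    response.raise_for_status()
    return response.json()
```

A GET request to the listing URL with the same credentials returns the snapshots available in the repository, which is one way to confirm the upload before restoring.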
5. Finalize the migration
After verifying that all indices were restored as anticipated, reindex those that need to change. An example would be reindexing daily indices into monthly ones. Reindexing status can be checked using the _tasks API. Other tasks include pointing the Logstash OpenSearch output at the AWS ES domain and adjusting retention scripts as necessary. Finally, configure clients to use the new Amazon ES endpoint and configure new IAM roles.
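The daily-to-monthly example can be expressed as a _reindex request body. This sketch assumes daily indices named `<prefix>-YYYY.MM.DD` (the Logstash-style convention), which may not match your naming scheme; the function names and the choice to run the reindex asynchronously are also assumptions.

```python
def monthly_reindex_body(prefix: str, year: int, month: int) -> dict:
    """Reindex all daily indices of one month into a single monthly index."""
    monthly_index = f"{prefix}-{year:04d}.{month:02d}"
    return {
        # Matches e.g. logstash-2023.01.01 through logstash-2023.01.31.
        "source": {"index": f"{monthly_index}.*"},
        "dest": {"index": monthly_index},
    }


def reindex_url(endpoint: str) -> str:
    """Start the reindex as a background task instead of blocking the request."""
    return f"{endpoint.rstrip('/')}/_reindex?wait_for_completion=false"


def reindex_tasks_url(endpoint: str) -> str:
    """URL for checking running reindex operations through the _tasks API."""
    return f"{endpoint.rstrip('/')}/_tasks?detailed=true&actions=*reindex"
```

The body is sent as a POST to the reindex URL, and the tasks URL can then be polled with GET until the operation no longer appears. The daily indices should be deleted only after the document counts in the monthly index have been verified.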
Organizations migrating to Amazon’s OpenSearch service stand to gain numerous and significant benefits over the self-managed OpenSearch service. While OpenSearch is a highly popular, open-source search and analytics engine, deployment and management can be tedious and time-consuming. By following the above guidelines, the migration process to Amazon OpenSearch can be efficiently completed without data loss or corruption.
Agilisium is an AWS advanced consulting partner providing digital transformation expertise to businesses across several sectors. Contact us for more information on how our services and solutions can improve your business’ productivity.