Modifying Elasticsearch: Practical Examples and Things to Learn
One of our customers had an unusual problem: out of the box, Elasticsearch could not migrate indexes without interrupting search. As a result, the company's engineers had to shut down search for several hours each time they needed to migrate. We made the system available 24/7, reduced the incidence of human error, improved migration speed, and reduced maintenance costs.
Elasticsearch is a full-text search engine built on the Lucene library. Thanks to its mature ecosystem and easy infrastructure integration, Elasticsearch has become a market leader. It is used by major companies including Microsoft, eBay, Amazon, GitHub, and Netflix.
Indexing/Reindexing is the process of adding information about documents and files to the search engine's database so that you can later perform full-text searches on relevant, indexed information.
A large company that had integrated Elasticsearch into its infrastructure faced many challenges. As the volume of information in the company's services grew, schema migration became increasingly time-consuming and expensive. Fast and accurate data reindexing was a critical factor in solving this problem.
Problem #1: Reindexing the data takes several hours, rendering the search function unavailable to employees.
Negative impact: The search engine becomes inaccessible to users, leading to dissatisfaction with the company's internal services.
Problem #2: As the business grows, the cost of technical support for the search engine increases exponentially.
Negative impact: DevOps engineers are spending more time on reindexing and data migration issues.
Both issues are related to the unique characteristics of the Elasticsearch system. It lacks tools for managing migrations and data validation, and reindexing requires additional customization. All of these issues have a negative impact on the search system, which hurts both users and the business.
Solving the problem of migration control
As business services evolve, the schema of the relational database changes, and those changes must be migrated quickly to the search system. A tool is needed to migrate changes, track versions, and monitor processes. The problem is that Elasticsearch does not include an integrated tool for this.
We have developed a utility called MigrationTool that includes flexible settings, scripting and profile support, and the ability to add functionality in the future. When launched, the tool checks for the latest version of changes in Elasticsearch and initiates the migration process if newer versions are available.
The solution ensures that all index versions are automatically migrated to the search engine one at a time, bypassing any previously applied versions. This process takes place without the involvement of engineers, who would otherwise spend more time on it and occasionally make mistakes.
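The version check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MigrationTool's actual code: the `Migration` class, `pending_migrations`, and `run_migrations` are hypothetical names, and the last applied version is passed in as a plain argument, whereas the real tool reads it from Elasticsearch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Migration:
    """One versioned change to the search schema (hypothetical structure)."""
    version: int
    description: str
    apply: Callable[[], None]


def pending_migrations(
    migrations: List[Migration], applied_version: Optional[int]
) -> List[Migration]:
    """Return migrations newer than the applied version, in ascending order."""
    ordered = sorted(migrations, key=lambda m: m.version)
    if applied_version is None:  # nothing applied yet: run everything
        return ordered
    return [m for m in ordered if m.version > applied_version]


def run_migrations(
    migrations: List[Migration], applied_version: Optional[int]
) -> Optional[int]:
    """Apply pending migrations one at a time and return the new latest version."""
    current = applied_version
    for migration in pending_migrations(migrations, applied_version):
        migration.apply()
        current = migration.version  # a real tool would persist this after each step
    return current


if __name__ == "__main__":
    log: List[int] = []
    migrations = [
        Migration(3, "add keyword subfield", lambda: log.append(3)),
        Migration(1, "create index", lambda: log.append(1)),
        Migration(2, "add date field", lambda: log.append(2)),
    ]
    print(run_migrations(migrations, applied_version=1))  # 3
    print(log)  # [2, 3]
```

Recording the version after every step, rather than once at the end, means an interrupted run can resume from the last migration that succeeded.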
Before: As the company grew, the cost of maintaining the search engine increased. The DevOps team had to divert their attention from critical projects to oversee Elasticsearch migrations. The process often ran into problems that required additional resources to resolve.
After: MigrationTool monitors versions and performs migrations on its own. The cost of maintaining the system and the number of bugs introduced have decreased. Engineers are no longer distracted from critical projects.
Solving data reindexing problems
Changes to the master data in the relational database, the index structure, or its static settings can break the search or destroy the search information.
The problem can be solved by versioning additional index fields. Another way to avoid data loss is to create new versions of indexes with modified structures. We have developed a utility with the working name ReindexTool that minimizes the number of errors and automatically updates the search service, accounting for the selected criteria and modes.
The tool allows you to reindex without shutting down the search engine. For example, suppose you want to migrate an index from version 1.0 to version 2.0. Users keep reading from the old index while, in the background, we create a version 2.0 index where the changes are written. When the migration is complete, we switch reading from version 1.0 to version 2.0. Search does not stop for a second.
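One common way to implement this kind of switch in Elasticsearch is an index alias: clients always query the alias, and a single `POST /_aliases` request moves it from the old index to the new one atomically. The article does not say whether ReindexTool works this way, so treat the sketch below as an assumption. The alias name `products`, the `_v<N>` naming scheme, and both helper functions are hypothetical; the code only builds the request body and sends nothing.

```python
import json
from typing import Dict, List


def versioned_index(alias: str, version: int) -> str:
    """Build an index name such as 'products_v2' (naming scheme is an assumption)."""
    return f"{alias}_v{version}"


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> Dict[str, List[dict]]:
    """Body for POST /_aliases that repoints the alias in one atomic request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    body = alias_swap_actions(
        alias="products",
        old_index=versioned_index("products", 1),
        new_index=versioned_index("products", 2),
    )
    # Send this JSON to POST /_aliases once the new index is fully populated.
    print(json.dumps(body, indent=2))
```

Because both actions travel in one request, readers never see a moment when the alias points at no index.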
Before: Reindexing often caused errors that broke the system and removed important data from the index. During reindexing, search information was partially unavailable to users.
After: ReindexTool automated the processes and minimized the number of errors. The search is now available 24/7 because the reindexing occurs in the background.
Reindexing Optimization
When the number of documents increased to 30-50 million, reindexing took up to 5 hours. We have added several settings to ReindexTool that reduce the load on the search engine and speed up the process:
- The number of index replicas before and after reindexing.
- The time interval at which updated index content becomes searchable.
- The ability to run parallel processes to retrieve data from the database.
- The ability to batch data and submit it for indexing in batches.
- The ability to restrict reindexing to a specific time period or to other filters.
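The first two settings in the list correspond to the Elasticsearch index settings `number_of_replicas` and `refresh_interval`, both of which can be changed on a live index through the `_settings` endpoint. The sketch below shows one plausible way to use them together with batching. The concrete values (no replicas and refresh disabled during the load, one replica and a 1s refresh afterwards) are assumptions rather than the values used in the project, and the helper names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def bulk_load_settings() -> Dict[str, dict]:
    """Body for PUT /<index>/_settings before reindexing (assumed values)."""
    # No replicas and no periodic refresh: less work per indexed document.
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def serving_settings(replicas: int = 1, refresh_interval: str = "1s") -> Dict[str, dict]:
    """Body for PUT /<index>/_settings after reindexing (assumed values)."""
    return {"index": {"number_of_replicas": replicas, "refresh_interval": refresh_interval}}


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` items, ready to be sent as one bulk request."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


if __name__ == "__main__":
    print(bulk_load_settings())
    for batch in chunked(range(5), 2):
        print(batch)  # [0, 1] then [2, 3] then [4]
    print(serving_settings())
```

Restoring the replica count and refresh interval after the load matters as much as lowering them: until that happens, the new index has no redundancy and newly written documents do not become searchable.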
Before: Elasticsearch's out-of-the-box settings slowed the reindexing process, which, at one point, took up to 5 hours.
After: Tuning Elasticsearch's settings reduced the reindexing time to an average of 60 minutes.
Conclusions and metrics
As an organization grows, the search engine may struggle to keep up with the rapidly expanding data set. This deficiency can lead to longer reindexing times, errors, higher maintenance costs, and user dissatisfaction.
With the right approach and experience with search engines, you can make Elasticsearch more efficient. With just two tools and a few tweaks, Sibedge:
- Reduced reindexing time by a factor of five.
- Automated migrations and freed up DevOps engineers.
- Ensured 24/7 search availability for users.
- Minimized the number of errors in the search engine.
- Reduced technical support costs for Elasticsearch.
Organizations of all sizes can learn from the outcomes of these use cases. Regardless of the complexity of the infrastructure and the size of the document database, any search engine can be made faster, more autonomous, and more stable.
Metrics improvement table:
| Number of documents (pcs.) | Search unavailability (min.) | Reindexing time (min.) | Maintenance cost per migration ($)** |
* When using a strategy with two indexes but in fully manual mode.
** 1 hour of DevOps engineer work = $675.