Modifying Elasticsearch: Practical Examples and Things to Learn

One of our customers had an unusual problem: out of the box, Elasticsearch could not migrate indexes without interrupting search. As a result, the company's engineers had to shut down search for several hours each time they needed to migrate. We made the system available 24/7, reduced the incidence of human error, improved migration speed, and reduced maintenance costs.

Reference

Elasticsearch is a full-text search engine built on the Lucene library. Thanks to its mature ecosystem and easy infrastructure integration, Elasticsearch has become a market leader. Several major companies use it, including Microsoft, eBay, Amazon, GitHub, and Netflix.

Indexing/Reindexing is the process of adding information about documents and files to the search engine's database so that you can later perform full-text searches on relevant, indexed information.

Major problems

A large company that had integrated Elasticsearch into its infrastructure faced many challenges. As the volume of information in the company's services grew, schema migration became increasingly time-consuming and expensive. Fast and accurate data reindexing was a critical factor in solving this problem.

Problem #1
Reindexing the data takes several hours, rendering the search function unavailable to employees.
Negative impact
The search engine becomes inaccessible to users, leading to dissatisfaction with the company's internal services.

Problem #2
As the business grows, the cost of technical support for the search engine keeps increasing.
Negative impact
DevOps engineers are spending more time on reindexing and data migration issues.

Both issues stem from how Elasticsearch works. It has no built-in tools for managing migrations or validating data, and reindexing requires additional customization. All of this degrades the search system, which hurts both users and the business.

Solving the problem of migration control

As business services evolve, the schema in the relational database changes, and those changes must be migrated quickly to the search system. A tool is needed to migrate changes, track versions, and monitor processes. The problem is that Elasticsearch does not include an integrated tool for this.

We developed a utility called MigrationTool that includes flexible settings, scripting and profile support, and the ability to add functionality in the future. When launched, the tool checks which version of the changes was last applied in Elasticsearch and starts the migration process if newer versions are available.

The solution ensures that all index versions are automatically migrated to the search engine one at a time, skipping any versions that were already applied. This process takes place without the involvement of engineers, who would otherwise spend more time on it and occasionally make mistakes.
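The version check described above can be sketched roughly as follows. This is a minimal illustration and not MigrationTool's actual code: the `Migration` class, the `apply` callback, and the `save_version` hook are hypothetical names, and the sketch assumes versions are plain increasing integers.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Migration:
    version: int                # assumed: monotonically increasing integer
    name: str
    apply: Callable[[], None]   # hypothetical callback that changes the index


def pending_migrations(migrations: Iterable[Migration], applied_version: int) -> List[Migration]:
    """Return migrations newer than the applied version, oldest first."""
    return sorted(
        (m for m in migrations if m.version > applied_version),
        key=lambda m: m.version,
    )


def run_migrations(
    migrations: Iterable[Migration],
    applied_version: int,
    save_version: Callable[[int], None],
) -> int:
    """Apply pending migrations one at a time and record each success."""
    for migration in pending_migrations(migrations, applied_version):
        migration.apply()
        # Persist progress after every step so a failure does not replay
        # migrations that already succeeded.
        save_version(migration.version)
        applied_version = migration.version
    return applied_version
```

In a real tool, `save_version` would presumably write the version number somewhere durable, for example a dedicated document in Elasticsearch, so that the next launch knows where to resume.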

Before
As the company grew, the cost of maintaining the search engine increased. The DevOps team had to divert their attention from critical projects to oversee Elasticsearch migrations. The process often ran into problems that required additional resources to resolve.
After
MigrationTool monitors versions and performs migrations on its own. The cost of maintaining the system and the number of bugs introduced have decreased. Engineers are no longer distracted from critical projects.

Solving data reindexing problems

Changes to the master data in the relational database, the index structure, or its static settings can break the search or destroy the search information.

The problem can be solved by versioning additional index fields. Another way to avoid data loss is to create new versions of indexes with modified structures. We have developed a utility with the working name ReindexTool that minimizes the number of errors and automatically updates the search service, accounting for the selected criteria and modes.

The tool lets you reindex without shutting down the search engine. For example, suppose you want to migrate an index from version 1.0 to version 2.0. Users keep reading from the old index while, in the background, we create a version 2.0 index and write the changes to it. When the migration is complete, we switch reads from version 1.0 to version 2.0. Search is never interrupted.
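One common way to implement this kind of switch in Elasticsearch is an index alias: the application always searches through the alias, and the alias is repointed once the new index is ready. The article does not say whether ReindexTool uses aliases, so treat the sketch below as an assumption. The alias and index names are hypothetical; the request body follows the shape accepted by the Elasticsearch `_aliases` endpoint, which applies all actions in one request atomically.

```python
from typing import Dict, List


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> Dict[str, List[dict]]:
    """Build a body for `POST /_aliases` that repoints `alias` in one step.

    Both actions travel in a single request, so readers of the alias never
    see a moment where it points at no index.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: "products" is the alias the application searches,
# "products_v1" and "products_v2" are the old and new index versions.
body = alias_swap_actions("products", "products_v1", "products_v2")
```

The resulting `body` would be sent to the cluster only after the version 2.0 index has been fully populated and checked.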

Before
Reindexing often caused errors that broke the system and removed important data from the index. During reindexing, search information was partially unavailable to users.
After
ReindexTool automated the processes and minimized the number of errors. The search is now available 24/7 because the reindexing occurs in the background.

Reindexing Optimization

When the number of documents increased to 30-50 million, reindexing took up to 5 hours. We have added several settings to ReindexTool that reduce the load on the search engine and speed up the process:

  • The number of index replicas before and after reindexing.
  • The time interval at which updated index content becomes searchable.
  • The ability to run parallel processes to retrieve data from the database.
  • The ability to split data into batches and submit each batch for indexing.
  • The ability to limit reindexing to a specific time period or to data matching other filters.
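The first two items and the batching item in the list above map onto standard Elasticsearch index settings and a simple chunking helper. The sketch below is illustrative only: the function names are hypothetical and the values are assumptions, not ReindexTool's actual configuration. `index.number_of_replicas` and `index.refresh_interval` are real dynamic index settings, and a refresh interval of `"-1"` disables periodic refresh.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def bulk_load_settings() -> Dict[str, dict]:
    """Settings to apply before reindexing: no replicas, no periodic refresh."""
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def serving_settings(replicas: int = 1, refresh_interval: str = "1s") -> Dict[str, dict]:
    """Settings to restore after reindexing (defaults here are assumptions)."""
    return {"index": {"number_of_replicas": replicas, "refresh_interval": refresh_interval}}


def in_batches(rows: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` rows so each list can go into one bulk request."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each settings dictionary would be sent to the index's `_settings` endpoint, once before loading and once after. Writing without replicas and without refresh avoids copying and reopening segments on every batch, which is typically where much of the speedup comes from.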

Before
Elasticsearch's out-of-the-box settings slowed the reindexing process, which, at one point, took up to 5 hours.
After
After tweaking Elasticsearch's settings, reindexing time dropped to an average of 60 minutes.

Conclusions and metrics

As an organization grows, the search engine may struggle to keep up with the rapidly expanding data set. This deficiency can lead to longer reindexing times, errors, higher maintenance costs, and user dissatisfaction.

With the right approach and experience with search engines, you can make Elasticsearch more efficient. With just two tools and a few tweaks, Sibedge:

  • Reduced reindexing time by up to a factor of five (from 5 hours to 60 minutes on the largest indexes).
  • Automated migrations and freed up DevOps engineers.
  • Ensured 24/7 search availability for users.
  • Minimized the number of errors in the search engine.
  • Reduced technical support costs for Elasticsearch.

Organizations of all sizes can learn from the outcomes of these use cases. Regardless of the complexity of the infrastructure and the size of the document database, any search engine can be made faster, more autonomous, and more stable.

Metrics improvement table:

Number of      Search unavailability (min.)   Reindexing time (min.)   Maintenance cost per migration ($)**
documents      Before         After           Before      After        Before      After
100K-1M        30 (60*)       0               30          10           675         169
1M-10M         90 (90*)       0               90          20           1012        169
10M-100M       300 (120*)     0               300         60           1350        169

* When using a strategy with two indexes but in fully manual mode.
** 1 hour of DevOps engineer work = $675.
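As a quick check on the figures in the table, the short script below recomputes the reindexing speedup and the maintenance cost reduction for each document range. The numbers are copied from the table; the helper names are only for illustration.

```python
# (before, after) pairs copied from the metrics table, keyed by document range.
REINDEX_MINUTES = {"100K-1M": (30, 10), "1M-10M": (90, 20), "10M-100M": (300, 60)}
COST_USD = {"100K-1M": (675, 169), "1M-10M": (1012, 169), "10M-100M": (1350, 169)}


def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' value is than the 'before' value."""
    return before / after


def reduction_percent(before: float, after: float) -> float:
    """Relative decrease from 'before' to 'after', in percent."""
    return 100 * (before - after) / before


for size, (before, after) in REINDEX_MINUTES.items():
    cost_before, cost_after = COST_USD[size]
    print(
        f"{size}: reindexing {speedup(before, after):.1f}x faster, "
        f"cost down {reduction_percent(cost_before, cost_after):.0f}%"
    )
```

The speedup ranges from 3x on the smallest indexes to 5x on the largest, so the factor-of-five improvement applies to the 10M-100M range. The cost per migration falls by roughly 75% to 87%, depending on the range.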