Implementation Of A Full Text Search Using Elasticsearch

  • Post published: August 28, 2015

vteam #352 was working on a project to build a web application for managing meetings. The application gives its users a platform to manage, structure, record and follow up on their meetings. Users can create the minutes, agenda or follow-up of a meeting, where they can take notes, record decisions and assign tasks during the meeting. The application also acts as a task manager that helps the organizer follow up on his/her tasks. It has task and meeting workspaces where users can list their tasks and meetings.

Our client wanted us to improve the performance of loading workspaces. The client also required a full text search engine integrated into the application so that users could search for keywords across their complete agenda, including comments and attachment names.

Challenge:

The challenge was to improve the performance of loading workspaces while also integrating a search engine that fulfilled all the text search requirements.

Solution:

Initially, Sphinx, an open source search engine, was integrated. At that time Sphinx was a batch indexer, whereas vteam #352 needed a real-time search engine. Sphinx also required all fields to be defined before the data was indexed.

To resolve the issues with Sphinx, we benchmarked our custom solution with Solr and Elasticsearch. We finally decided to move forward with Elasticsearch because of the following features:

  • Its real-time performance was found to be better than that of Solr
  • It provides a RESTful API endpoint
  • It provides a full Query DSL based on JSON to define queries
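
To illustrate the JSON-based Query DSL mentioned in the list above, here is a minimal sketch of a full text query body. The article does not show the team's actual queries, so the function name build_search_query and the field names (title, notes, comments, attachment_names) are assumptions chosen to match the requirement of searching agendas, comments and attachment names.

```python
import json


def build_search_query(keywords,
                       fields=("title", "notes", "comments", "attachment_names"),
                       size=20):
    """Return a Query DSL body matching `keywords` across several text fields.

    The field names are hypothetical; a real index would use its own names.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            # multi_match runs one full text query against several fields
            "multi_match": {
                "query": keywords,
                "fields": list(fields),
            }
        },
    }


if __name__ == "__main__":
    body = build_search_query("budget review")
    # This JSON document is what would be sent in the body of a request
    # to the index's _search REST endpoint.
    print(json.dumps(body, indent=2))
```

Because the query is plain JSON sent over HTTP, any client that can issue a POST request could submit it, which is the practical benefit of the RESTful endpoint listed above.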

Elasticsearch uses Lucene for full text search, which offers some of the most powerful full text search capabilities of any open source product. Elasticsearch does not require a schema to be defined up front, and the criteria used for searching text can be changed at any time. It is also more dynamic: data can easily move around the cluster as its nodes come and go.
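
The schema-free behaviour can be sketched with a request body for the bulk indexing endpoint, where documents with different fields are indexed without declaring those fields first. This is only an illustration under assumptions: the index name, document fields and the helper build_bulk_payload are hypothetical, and older Elasticsearch versions also expected a type in the action line.

```python
import json


def build_bulk_payload(index_name, documents):
    """Build a newline-delimited JSON body for the bulk API.

    `documents` is an iterable of (doc_id, doc) pairs. Each document is
    preceded by an action line telling Elasticsearch where to index it.
    """
    lines = []
    for doc_id, doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two documents do not share the same set of fields; no field
    # definitions are supplied ahead of time.
    docs = [
        ("1", {"title": "Kickoff", "notes": "Agree on scope"}),
        ("2", {"title": "Retro", "attachment_names": ["plan.pdf"]}),
    ]
    print(build_bulk_payload("meetings", docs), end="")
```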

Conclusion:

As a result, we improved the application’s performance and implemented full text search, which also allowed us to move other filters to Elasticsearch queries. The application’s performance improved 10x compared to the previous implementation. It now takes only 1-2 seconds to load the workspaces or filter Tasks and Meetings.
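
As a rough sketch of what moving filters into Elasticsearch queries can look like, the example below combines an optional keyword search with exact-match filters in a single query body. The filter fields (assignee, status), the text fields and the function build_workspace_query are assumptions for illustration. It uses the bool query with a filter clause found in current Elasticsearch versions, which may differ from the syntax of the version the team used.

```python
def build_workspace_query(keywords=None, assignee=None, status=None):
    """Combine optional full text search with exact-match filters.

    All field names here are hypothetical placeholders.
    """
    filters = []
    if assignee is not None:
        filters.append({"term": {"assignee": assignee}})
    if status is not None:
        filters.append({"term": {"status": status}})

    if keywords:
        # Scored full text part of the query
        must = [{"multi_match": {"query": keywords,
                                 "fields": ["title", "notes", "comments"]}}]
    else:
        # No keywords given: match everything and rely on the filters
        must = [{"match_all": {}}]

    # Clauses under "filter" restrict the results without affecting scoring.
    return {"query": {"bool": {"must": must, "filter": filters}}}


if __name__ == "__main__":
    print(build_workspace_query("budget", assignee="alice", status="open"))
```

With a structure like this, the same request can serve both a workspace listing (filters only) and a keyword search within that listing.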