Mike's Blog

Review: Intro to ElasticSearch - Pluralsight Courses

April 08, 2020

Searching and Analyzing Data with ES: Getting Started

Designing Schema for ElasticSearch

Overview

These two courses by Janani Ravi provide an introduction to ElasticSearch:

  • what is ElasticSearch
  • when is ElasticSearch appropriate for your application/platform
  • how to interact with ElasticSearch
  • how to configure ES indices and fields
  • best practices when designing an ES index

The courses also cover the history of search as well as the search algorithms that make ES performant.

Overall, I found the courses to be a great introduction to ES as a technology, and the demos were helpful, concise and informative in showing how to configure and interact with an ES cluster.

Key Takeaways

  • ES is a schemaless, document-based data store that is optimized for search.
  • The “Index” is the key component of ElasticSearch and is analogous to a collection in MongoDB and conceptually similar to a table in a relational database.
  • ES indices are distributed amongst various nodes in a cluster. This is to say an ES index is “sharded” across multiple nodes.
  • ES shards are replicated across nodes to provide high availability.
  • You can interact with an ES cluster via a REST API. Querying an ES cluster can be done with search parameters in the URL or in the request body. IMO the latter is preferable as the request body is in JSON format and is easy to read.
  • By default, ES infers the type of an index field based on the incoming JSON object. This is generally not desirable, as you’ll want more control over these fields; you can instead specify upfront which fields, and of which type, make up the index. This is similar to defining a database table in an RDBMS.
  • ES shines when it comes to text search. By default, a string field will be indexed by both a “full text search” strategy and a “keyword” strategy. The former splits the text into tokens and normalizes them for easy text search. The latter is an exact match against the provided search string.
  • Querying an ES index seems very robust, including pattern matching, boolean logic, subdocument search, etc.
  • While ES documents can have pointers to other documents and have parent-child relationships, my impression based on these tutorials was that this is really not what ES should be used for. Its main purpose is performant search queries, and joining documents and normalized data significantly hampers performance.
  • The author introduces a nifty UI tool for ES cluster monitoring called ElasticSearch Head.
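
To make the takeaways about explicit field types and request-body queries a bit more concrete, here is a minimal sketch in Python. It is my own illustration, not code from the courses. The index name, field names, and cluster URL are all hypothetical, and the mapping format assumes ES 7 or later (older versions nest fields under a mapping type). The script only builds and prints the JSON bodies; the send() helper is there to show how they could be posted to a running cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local cluster on the default port
INDEX = "blog_posts"              # hypothetical index name


def build_mapping():
    """Declare field types up front instead of letting ES infer them."""
    return {
        "mappings": {
            "properties": {
                # Full-text field, with a keyword sub-field for exact matching
                "title": {
                    "type": "text",
                    "fields": {"raw": {"type": "keyword"}},
                },
                # Exact-match only: not tokenized or normalized
                "author": {"type": "keyword"},
                "published": {"type": "date"},
            }
        }
    }


def build_query(words, author):
    """Request-body search: full-text match on title, exact filter on author."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": words}}],
                "filter": [{"term": {"author": author}}],
            }
        }
    }


def send(method, path, body):
    """Send a JSON body to the cluster. Requires a running ES instance."""
    req = urllib.request.Request(
        f"{ES_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
    print(json.dumps(build_query("elasticsearch intro", "Mike Joyce"), indent=2))
    # With a cluster running, the same bodies could be sent like this:
    # send("PUT", f"/{INDEX}", build_mapping())
    # send("POST", f"/{INDEX}/_search", build_query("elasticsearch intro", "Mike Joyce"))
```

The "match" clause goes through the full-text analysis described above, while the "term" filter on a keyword field only hits on an exact string, which is why the two are split between "must" and "filter" here.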

Mike Joyce
My name is Mike and this is my blog. I write about technology, work and, at times, my personal life.