Setup a Multi-Tier Elasticsearch Cluster: Configuring Hot, Warm, Cold, and Frozen Nodes

Created
Oct 30, 2024 06:15 AM
Tags
DevOps
A multi-tier Elasticsearch cluster with hot, warm, cold, and frozen nodes is a way to manage large volumes of data while balancing storage cost against performance.
This guide walks you through setting up an example of such a cluster using Docker Compose.

Prerequisites

Before we begin, make sure you have the following:
  1. Docker and Docker Compose installed on your system
  2. A basic understanding of Docker Compose and Elasticsearch
  3. Sufficient system resources (we’re going to deploy four Elasticsearch containers with JVM heap sizes ranging from 256 MB to 2 GB, about 3.75 GB of heap in total)

Hot-Warm-Cold-Frozen Architecture 101

Cluster nodes can be organized into pools. One pool with fast hardware serves the most frequently searched data, and further pools, each with a progressively less powerful hardware profile, serve requests for data that is searched less often.
In many use cases data is searched less and less as time goes by. Time series data such as logs, metrics, and transactions is a good example: it becomes less interesting to users as the days, weeks, and years pass.
Key Concepts:
  • Use cases on time series data (financial transactions, observability, history)
  • Background problem of different hardware profiles and the number of nodes
  • Elasticsearch node roles
  • Elasticsearch data tiers
  • Index allocation
  • Index lifecycle management (ILM) and migration between tiers
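
To make the lifecycle idea more concrete, here is a minimal sketch in Python that builds the JSON body for an ILM policy (the kind of body sent with PUT _ilm/policy/<name>). The age thresholds, the rollover limits, and the snapshot repository name my-snapshot-repo are assumptions chosen for illustration and do not come from this guide. The frozen phase relies on a searchable snapshot, so a snapshot repository would have to be registered before such a policy could be used.

```python
import json


def build_ilm_policy(warm_after="7d", cold_after="30d", frozen_after="90d",
                     delete_after="365d",
                     snapshot_repository="my-snapshot-repo"):
    """Return a request body for an ILM policy with one phase per tier.

    All thresholds and the repository name are hypothetical defaults.
    """
    return {
        "policy": {
            "phases": {
                # New data is written to the hot tier; roll over to a new
                # backing index once it is a day old or a shard grows large.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": "1d",
                            "max_primary_shard_size": "50gb",
                        }
                    }
                },
                # Entering the warm or cold phase lets ILM migrate the index
                # to the matching data tier.
                "warm": {
                    "min_age": warm_after,
                    "actions": {"set_priority": {"priority": 50}},
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {"set_priority": {"priority": 0}},
                },
                # The frozen phase mounts the index from a snapshot.
                "frozen": {
                    "min_age": frozen_after,
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": snapshot_repository
                        }
                    },
                },
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


# Print the body so it can be inspected or pasted into an API client.
print(json.dumps(build_ilm_policy(), indent=2))
```

The function only builds and prints the document; it does not contact a cluster, so the values can be adjusted freely before anything is applied.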

Cluster Overview

In this multi-tier setup:
  • Hot Nodes: handle the most recent, frequently queried data with high-performance storage and more memory
  • Warm Nodes: store older data that is less frequently accessed but still occasionally needed
  • Cold Nodes: archive data that is rarely accessed and can use lower-cost storage
  • Frozen Nodes: house historical data that’s rarely searched, using minimal resources and the cheapest storage
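
Elasticsearch decides which tier an index lives on through the index setting index.routing.allocation.include._tier_preference, an ordered list of tiers to try. The sketch below, with hypothetical helper names, shows how that value could be derived for the hot, warm, and cold tiers, falling back to a warmer tier when the preferred one has no nodes. The frozen tier is left out on purpose because it holds partially mounted searchable snapshots rather than regular indices.

```python
# Tiers from fastest to slowest; a colder tier falls back to warmer ones.
TIER_ORDER = ["data_hot", "data_warm", "data_cold"]

TIER_PREFERENCE_SETTING = "index.routing.allocation.include._tier_preference"


def tier_preference(target):
    """Return the tier preference string for a target tier.

    Example: "data_cold" -> "data_cold,data_warm,data_hot".
    """
    if target not in TIER_ORDER:
        raise ValueError(f"unsupported tier for a regular index: {target}")
    idx = TIER_ORDER.index(target)
    # Preferred tier first, then each warmer tier as a fallback.
    return ",".join(reversed(TIER_ORDER[: idx + 1]))


def allocation_settings(target):
    """Return a body suitable for PUT <index>/_settings."""
    return {TIER_PREFERENCE_SETTING: tier_preference(target)}


for tier in TIER_ORDER:
    print(tier, "->", tier_preference(tier))
```

In practice ILM usually sets this value during phase transitions, so setting it by hand is mostly useful for experiments or one-off moves.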

Step-by-Step Guide

Step 1: Define Node Configuration

In a Docker Compose setup, each type of node is defined as a service. We’ll assign node roles and storage preferences using environment variables and Elasticsearch configuration options.
Create a new docker-compose.yml file and define services for hot, warm, cold, and frozen nodes.
# discovery.type=single-node would make every container its own one-node
# cluster, so the nodes discover each other via seed hosts instead.
# Security is disabled to keep the demo simple; do not do this in production.
services:
  # Hot node: also master-eligible (so a master can be elected) and holds
  # the content tier for regular, non-time-series indices
  es-hot:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.0.0
    container_name: es-hot
    environment:
      - node.name=es-hot
      - node.roles=master,data_hot,data_content
      - cluster.name=multi-tier-cluster
      - discovery.seed_hosts=es-warm,es-cold,es-frozen
      - cluster.initial_master_nodes=es-hot
      - xpack.security.enabled=false
      - ES_JAVA_OPTS=-Xms2g -Xmx2g
    volumes:
      - hot_data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"

  # Warm node
  es-warm:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.0.0
    container_name: es-warm
    environment:
      - node.name=es-warm
      - node.roles=data_warm
      - cluster.name=multi-tier-cluster
      - discovery.seed_hosts=es-hot
      - xpack.security.enabled=false
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
    volumes:
      - warm_data:/usr/share/elasticsearch/data

  # Cold node
  es-cold:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.0.0
    container_name: es-cold
    environment:
      - node.name=es-cold
      - node.roles=data_cold
      - cluster.name=multi-tier-cluster
      - discovery.seed_hosts=es-hot
      - xpack.security.enabled=false
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    volumes:
      - cold_data:/usr/share/elasticsearch/data

  # Frozen node
  es-frozen:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.0.0
    container_name: es-frozen
    environment:
      - node.name=es-frozen
      - node.roles=data_frozen
      - cluster.name=multi-tier-cluster
      - discovery.seed_hosts=es-hot
      - xpack.security.enabled=false
      - ES_JAVA_OPTS=-Xms256m -Xmx256m
    volumes:
      - frozen_data:/usr/share/elasticsearch/data

volumes:
  hot_data:
  warm_data:
  cold_data:
  frozen_data:
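
Once the containers are running, it can help to confirm that each node reports the tier role it was given. The following sketch queries the _cat/nodes API and maps the role letters to tier names. It assumes the API is reachable over plain HTTP without authentication at http://localhost:9200 (the port published by es-hot); with security enabled, the request would need credentials and HTTPS. The function names are hypothetical.

```python
import json
import urllib.request

# Role abbreviations used by the _cat/nodes "node.role" column.
ROLE_LETTERS = {"h": "hot", "w": "warm", "c": "cold", "f": "frozen"}


def fetch_cat_nodes(base_url="http://localhost:9200"):
    """Fetch node names and role letters as a list of dicts."""
    url = f"{base_url}/_cat/nodes?format=json&h=name,node.role"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def tiers_by_node(rows):
    """Map each node name to the sorted list of data tiers it serves."""
    return {
        row["name"]: sorted(
            ROLE_LETTERS[letter]
            for letter in row["node.role"]
            if letter in ROLE_LETTERS
        )
        for row in rows
    }
```

Calling tiers_by_node(fetch_cat_nodes()) against the running cluster should return one entry per container, for example es-warm mapped to the warm tier. If a node is missing from the result, it has most likely not joined the cluster.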