Introduction

Hello! In this post, I’ll guide you through monitoring your Elasticsearch cluster using OpenTelemetry. The best part is that we won’t need to modify any Elasticsearch configuration or install additional exporters, as the OpenTelemetry Collector Contrib distribution already includes an Elasticsearch receiver we can use. We’ll also gather logs and system metrics, building on what I’ve shown you in previous articles.

OTEL Version

Make sure you’re running OTEL 0.79.0 or later, as this version adds support for dynamic indexes in the Elasticsearch exporter, which will come in handy later on.

Filelog Receiver for Logs

First, let’s set up the filelog receiver for handling logs. In the configuration, you’ll need to specify which files to read and how to extract the timestamp. Additionally, we’ll parse the log messages into separate fields in Elasticsearch.

filelog/elastic:
  include:
  - /var/log/elasticsearch/*.json
  include_file_name: false
  include_file_path: true
  operators:
  - type: json_parser
    timestamp:
      layout: '2006-01-02T15:04:05.999Z'
      parse_from: attributes["@timestamp"]
      layout_type: gotime
  - field: attributes["elasticsearch.index.suffix"]
    type: add
    value: elasticsearch
  - parse_from: body
    parse_to: attributes
    type: json_parser

In this configuration, we start by specifying the files we want to read. Then, we utilize operators to perform specific tasks:

  1. The first operator parses the timestamp from the log entry. We specify the timestamp’s layout and the JSON field that holds it.

  2. The second operator adds an extra field named "elasticsearch.index.suffix" with the value elasticsearch. This field is crucial for the Elasticsearch exporter, which reads it and uses it in the index name, ensuring logs are routed to the right index.

  3. Finally, the last operator parses the JSON message body and creates an attribute for each field found within it.

This configuration streamlines log processing and ensures that the Elasticsearch index is correctly formed, making it easier to manage and analyze the log data.
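To make the operators concrete, here is a hypothetical Elasticsearch JSON log line and the attributes the pipeline produces from it. The field names are illustrative; the exact fields depend on your Elasticsearch version:

```yaml
# Input line, as written by Elasticsearch's JSON logger (illustrative):
#   {"@timestamp":"2023-06-01T12:00:00.123Z","log.level":"INFO",
#    "message":"cluster health status changed","elasticsearch.cluster.name":"demo"}
#
# Resulting log record attributes after the operators run:
attributes:
  "log.level": "INFO"                            # from the json_parser operator
  "message": "cluster health status changed"
  "elasticsearch.cluster.name": "demo"
  "elasticsearch.index.suffix": "elasticsearch"  # added by the `add` operator
# The record's timestamp is set from @timestamp by the timestamp parser.
```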

Metrics Receiver

Now that the logs are set up, let’s move on to metrics. We’ll use the Elasticsearch receiver to fetch Elasticsearch metrics. In the configuration, specify the endpoint, username, password, and the metrics you want to collect.

elasticsearch:
  nodes: ["_local"]
  skip_cluster_metrics: false
  indices: ["_all"]
  endpoint: https://localhost:9200
  username: ELASTIC_USERNAME
  password: ELASTIC_PASSWORD
  collection_interval: 10s
  tls:
    insecure_skip_verify: true
  metrics:
    elasticsearch.index.documents:
      enabled: true
    elasticsearch.index.operations.merge.docs_count:
      enabled: true
    elasticsearch.index.segments.count:
      enabled: true
    elasticsearch.index.segments.size:
      enabled: true

The configuration is straightforward. I have Otel running on every Elasticsearch node, and the endpoint is set to localhost. As my Elasticsearch cluster has authentication enabled, I need to specify the username and password for access.

  • nodes: ["_local"]: This setting tells Otel to fetch metrics only for the local Elasticsearch node.
  • indices: ["_all"]: It instructs Otel to gather data about each index in the cluster.

In the metrics dictionary, I can specify additional metrics that I want to receive. However, not all metrics are enabled by default. Since I’m using these metrics in my Grafana dashboard, I have explicitly enabled them to ensure they are collected and displayed correctly.

If you want to explore which metrics are disabled by default, check the Elasticsearch receiver’s documentation in the opentelemetry-collector-contrib repository.
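One caveat: the configuration above sets insecure_skip_verify: true, which disables TLS certificate verification. If you have the cluster’s CA certificate available, you can verify the connection properly instead. The path below is an assumption; Elasticsearch’s security auto-configuration typically writes the CA under its config directory, so adjust it to your setup:

```yaml
elasticsearch:
  endpoint: https://localhost:9200
  tls:
    # Verify the server certificate against the cluster's CA instead of
    # skipping verification. Example path; adjust to where your CA lives.
    ca_file: /etc/elasticsearch/certs/http_ca.crt
```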

Hostmetrics Receiver

For instance-level metrics like CPU and memory, we’ll use the hostmetrics receiver. Configure it with the desired metrics.

hostmetrics:
  collection_interval: 10s
  scrapers:
    cpu:
    disk:
    filesystem:
    load:
    memory:
    network:
    paging:
    process:
      mute_process_exe_error: true
      mute_process_io_error: true

Exporters

Now let’s move on to the exporters section. Here, replace the endpoints and credentials with your own. Set logs_index to the prefix of the index name (logs- in our case); with logs_dynamic_index enabled, OTEL appends the value of the elasticsearch.index.suffix attribute we added earlier, so our logs land in the logs-elasticsearch data stream.

exporters:
  elasticsearch:
    endpoints:
    - ELASTIC_ENDPOINT
    logs_dynamic_index:
      enabled: true
    logs_index: logs-
    password: ELASTIC_PASSWORD
    user: ELASTIC_USER
  prometheusremotewrite:
    auth:
      authenticator: sigv4auth
    endpoint: PROMETHEUS_ENDPOINT
    resource_to_telemetry_conversion:
      enabled: true
    timeout: 30s
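Note that the sigv4auth authenticator referenced by prometheusremotewrite is provided by an extension, which must be declared and enabled separately, roughly like this (the region value is an assumption; set it to yours). If you’re not sending metrics to an AWS-managed Prometheus, you can drop the auth block entirely:

```yaml
extensions:
  sigv4auth:
    region: us-east-1   # assumption: replace with your AWS region

service:
  extensions: [sigv4auth]
```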

Pipeline

Finally, create the OTEL pipelines, which tie everything together: each pipeline takes data from its receivers, runs it through processors, and sends it to its exporters.

service:
  pipelines:
    metrics:
      receivers:
        - elasticsearch
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheusremotewrite
    logs/elasticsearch:
      exporters:
        - elasticsearch
      processors:
        - batch
      receivers:
        - filelog/elastic

Complete OTEL Config

Here’s your complete OTEL configuration, including all the sections we’ve covered.

receivers:
  filelog/elastic:
    include:
    - /var/log/elasticsearch/*.json
    include_file_name: false
    include_file_path: true
    operators:
    - type: json_parser
      timestamp:
        layout: '2006-01-02T15:04:05.999Z'
        parse_from: attributes["@timestamp"]
        layout_type: gotime
    - field: attributes["elasticsearch.index.suffix"]
      type: add
      value: elasticsearch
    - parse_from: body
      parse_to: attributes
      type: json_parser

  elasticsearch:
    nodes: ["_local"]
    skip_cluster_metrics: false
    indices: ["_all"]
    endpoint: https://localhost:9200
    username: ELASTIC_USERNAME
    password: ELASTIC_PASSWORD
    collection_interval: 10s
    tls:
      insecure_skip_verify: true
    metrics:
      elasticsearch.index.documents:
        enabled: true
      elasticsearch.index.operations.merge.docs_count:
        enabled: true
      elasticsearch.index.segments.count:
        enabled: true
      elasticsearch.index.segments.size:
        enabled: true

  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      disk:
      filesystem:
      load:
      memory:
      network:
      paging:
      process:
        mute_process_exe_error: true
        mute_process_io_error: true

processors:
  batch:

exporters:
  elasticsearch:
    endpoints:
      - ELASTIC_ENDPOINT
    logs_dynamic_index:
      enabled: true
    logs_index: logs-
    password: ELASTIC_PASSWORD
    user: ELASTIC_USER
  prometheusremotewrite:
    auth:
      authenticator: sigv4auth
    endpoint: PROMETHEUS_ENDPOINT
    resource_to_telemetry_conversion:
      enabled: true
    timeout: 30s

service:
  pipelines:
    metrics:
      receivers:
        - elasticsearch
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheusremotewrite
    logs/elasticsearch:
      exporters:
        - elasticsearch
      processors:
        - batch
      receivers:
        - filelog/elastic

Conclusion

With this configuration in place, you should now be able to effectively monitor your Elasticsearch cluster with OpenTelemetry. Happy monitoring!