Introduction
Hello! In this post, I’ll guide you through the process of monitoring your Elasticsearch cluster using OpenTelemetry. The best part is that we won’t need to modify any Elasticsearch configuration or install additional exporters, as the OpenTelemetry Collector Contrib distribution already includes an Elasticsearch receiver we can use. We’ll also gather logs and system metrics, building on what I’ve shown you in previous articles.
OTEL Version
Make sure you have OTEL 0.79.0 or later installed, as it allows the use of dynamic indexes in the Elasticsearch exporter, which will come in handy later on.
Filelog Receiver for Logs
First, let’s set up the filelog receiver for handling logs. In the configuration, you’ll need to specify which files to read and how to extract the timestamp. Additionally, we’ll parse the log messages into separate fields in Elasticsearch.
filelog/elasticsearch:
  include:
    - /var/log/elasticsearch/*.json
  include_file_name: false
  include_file_path: true
  operators:
    # Parse the JSON log line and take the record timestamp from @timestamp.
    - type: json_parser
      timestamp:
        parse_from: attributes["@timestamp"]
        layout_type: gotime
        layout: '2006-01-02T15:04:05.999Z'
    # Add the suffix the Elasticsearch exporter uses to build the index name.
    - type: add
      field: attributes["elasticsearch.index.suffix"]
      value: elasticsearch
    # Parse the JSON body into individual attributes.
    - type: json_parser
      parse_from: body
      parse_to: attributes
In this configuration, we start by specifying the files we want to read. Then, we utilize operators to perform specific tasks:

- The first operator extracts the accurate timestamp from the log entry. This is achieved by specifying the timestamp’s layout and identifying the JSON field that holds it.
- The second operator adds an extra field named elasticsearch.index.suffix to the log entry and sets its value to elasticsearch. This field plays a crucial role for the Elasticsearch exporter, which reads the value and uses it in the index name, ensuring proper indexing.
- The last operator reads the JSON message and creates an attribute for each field found within the JSON data.

This configuration streamlines log processing and ensures that the Elasticsearch index is correctly formed, making it easier to manage and analyze the log data.
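To make the operator chain concrete, here’s a sketch of what happens to a single line. The input entry and its field values are made up for illustration, not taken from a real cluster:

# Illustrative input line from /var/log/elasticsearch/*.json:
#   {"@timestamp": "2023-06-01T12:00:00.123Z", "log.level": "INFO", "message": "started"}
# Log record attributes after the three operators run:
attributes:
  "@timestamp": "2023-06-01T12:00:00.123Z"   # also promoted to the record timestamp
  log.level: INFO
  message: started
  elasticsearch.index.suffix: elasticsearch   # added by the add operator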
Metrics Receiver
Now that the logs are set up, let’s move on to metrics. We’ll use the Elasticsearch receiver to fetch Elasticsearch metrics. In the configuration, specify the endpoint, username, password, and any additional metrics you want to collect.
elasticsearch:
  endpoint: https://localhost:9200
  username: ELASTIC_USERNAME
  password: ELASTIC_PASSWORD
  nodes: ["_local"]
  indices: ["_all"]
  skip_cluster_metrics: false
  collection_interval: 10s
  tls:
    insecure_skip_verify: true   # skip certificate validation, e.g. for self-signed certs
  metrics:
    elasticsearch.index.documents:
      enabled: true
    elasticsearch.index.operations.merge.docs_count:
      enabled: true
    elasticsearch.index.segments.count:
      enabled: true
    elasticsearch.index.segments.size:
      enabled: true
The configuration is straightforward. I have Otel running on every Elasticsearch node, and the endpoint is set to localhost. As my Elasticsearch cluster has authentication enabled, I need to specify the username and password for access.
- nodes: ["_local"]: this setting tells Otel to fetch metrics only for the local Elasticsearch node.
- indices: ["_all"]: it instructs Otel to gather data about each index in the cluster.

In the metrics dictionary, I can specify additional metrics that I want to receive. However, not all metrics are enabled by default. Since I’m using these metrics in my Grafana dashboard, I have explicitly enabled them to ensure they are collected and displayed correctly.
If you want to explore which metrics are disabled by default, you can find that information in the Elasticsearch receiver’s documentation in the opentelemetry-collector-contrib repository.
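The same metrics dictionary also works in the other direction: a metric that is enabled by default can be switched off if you don’t need it. A minimal sketch (the metric name below is a hypothetical example; check the receiver’s documentation for the names available in your version):

elasticsearch:
  metrics:
    elasticsearch.cluster.health:
      enabled: false   # hypothetical example: turn off a default-on metric you don't use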
Hostmetrics Receiver
For instance-level metrics like CPU and memory, we’ll use the hostmetrics receiver. Configure it with the scrapers you need.
hostmetrics:
  collection_interval: 15s
  scrapers:
    cpu:
    disk:
    filesystem:
    load:
    memory:
    network:
    paging:
    process:
      mute_process_exe_error: true
      mute_process_io_error: true
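By default, the process scraper reports every process on the host. If that’s noisier than you want, it can be narrowed with the scraper’s include filter; a minimal sketch, assuming Elasticsearch shows up under the java process name on your hosts:

process:
  include:
    names: ["java"]     # assumption: Elasticsearch runs as the java process
    match_type: strict
  mute_process_exe_error: true
  mute_process_io_error: true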
Exporters
Now let’s move on to the exporters section. Here, you’ll replace the endpoints and credentials with your own. Specify the logs_index, and with logs_dynamic_index enabled, OTEL will automatically append the value of the elasticsearch.index.suffix attribute to it, so in our case the log records land in the logs-elasticsearch data stream.
exporters:
  elasticsearch:
    endpoints:
      - ELASTIC_ENDPOINT
    user: ELASTIC_USER
    password: ELASTIC_PASSWORD
    logs_index: logs-
    logs_dynamic_index:
      enabled: true
  prometheusremotewrite:
    endpoint: PROMETHEUS_ENDPOINT
    auth:
      authenticator: sigv4auth
    resource_to_telemetry_conversion:
      enabled: true
    timeout: 30s
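One thing to watch out for: the prometheusremotewrite exporter references the sigv4auth authenticator (I use it to sign requests to an AWS-managed Prometheus workspace), and the collector won’t start unless that extension is also declared and enabled under service. A minimal sketch, assuming a us-east-1 workspace (replace the region with your own); it’s included in the complete config below:

extensions:
  sigv4auth:
    region: us-east-1   # assumption: the region of your Prometheus workspace

service:
  extensions:
    - sigv4auth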
Pipeline
Create the OTEL pipeline, which defines where the data is collected from, how it is processed, and which exporter it is sent to.
service:
  pipelines:
    metrics:
      receivers:
        - elasticsearch
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheusremotewrite
    logs/elasticsearch:
      receivers:
        - filelog/elasticsearch
      processors:
        - batch
      exporters:
        - elasticsearch
Complete OTEL Config
Here’s your complete OTEL configuration, including all the sections we’ve covered.
receivers:
  filelog/elasticsearch:
    include:
      - /var/log/elasticsearch/*.json
    include_file_name: false
    include_file_path: true
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes["@timestamp"]
          layout_type: gotime
          layout: '2006-01-02T15:04:05.999Z'
      - type: add
        field: attributes["elasticsearch.index.suffix"]
        value: elasticsearch
      - type: json_parser
        parse_from: body
        parse_to: attributes
  elasticsearch:
    endpoint: https://localhost:9200
    username: ELASTIC_USERNAME
    password: ELASTIC_PASSWORD
    nodes: ["_local"]
    indices: ["_all"]
    skip_cluster_metrics: false
    collection_interval: 10s
    tls:
      insecure_skip_verify: true
    metrics:
      elasticsearch.index.documents:
        enabled: true
      elasticsearch.index.operations.merge.docs_count:
        enabled: true
      elasticsearch.index.segments.count:
        enabled: true
      elasticsearch.index.segments.size:
        enabled: true
  hostmetrics:
    collection_interval: 15s
    scrapers:
      cpu:
      disk:
      filesystem:
      load:
      memory:
      network:
      paging:
      process:
        mute_process_exe_error: true
        mute_process_io_error: true

processors:
  batch:

exporters:
  elasticsearch:
    endpoints:
      - ELASTIC_ENDPOINT
    user: ELASTIC_USER
    password: ELASTIC_PASSWORD
    logs_index: logs-
    logs_dynamic_index:
      enabled: true
  prometheusremotewrite:
    endpoint: PROMETHEUS_ENDPOINT
    auth:
      authenticator: sigv4auth
    resource_to_telemetry_conversion:
      enabled: true
    timeout: 30s

extensions:
  sigv4auth:   # required by the prometheusremotewrite auth setting

service:
  extensions:
    - sigv4auth
  pipelines:
    metrics:
      receivers:
        - elasticsearch
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheusremotewrite
    logs/elasticsearch:
      receivers:
        - filelog/elasticsearch
      processors:
        - batch
      exporters:
        - elasticsearch
Conclusion
With this configuration in place, you should now be able to effectively monitor your Elasticsearch cluster with OpenTelemetry. Happy monitoring!