Configuring ElastAlert Alerts and Exposing Metrics to Prometheus

ElastAlert is an alerting tool written in Python and built on Elasticsearch. This post covers some basic configuration and how to expose metrics to Prometheus.

Quick Start

  • Start an Elasticsearch instance
  docker run -p 9200:9200 -d -e ES_JAVA_OPTS="-Xms256m -Xmx256m" elasticsearch:5.5.0-alpine

Docker is used here to start an Elasticsearch instance quickly. For how to install Docker, see the Docker installation documentation.

  • Create a folder named rules and add the file example.yaml
  name: esa_twitter_message_timeout
  type: frequency
  index: ppp
  num_events: 10
  timeframe:
    hours: 4
  query_delay:
    minutes: 5
  filter:
  - term:
      user_p: "1.0"
  ignore_email: true
  alert:
  - "email"
  email:
  - "elastalert@example.com"

query_delay is set so that a search does not run before the logs have arrived in Elasticsearch.
ignore_email: skips sending the email.
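The effect of query_delay can be pictured as shifting the query window back in time. The sketch below is a simplified illustration, not ElastAlert's actual code: the function name `query_window` is hypothetical, the timestamp is arbitrary, and it assumes the window simply ends query_delay before the current time and spans one timeframe.

```python
from datetime import datetime, timedelta


def query_window(now, timeframe, query_delay):
    """Return (start, end) of the window a delayed query would cover.

    Simplified assumption: the window ends query_delay before `now`
    and spans one timeframe.
    """
    end = now - query_delay
    start = end - timeframe
    return start, end


# timeframe and query_delay are the values from example.yaml;
# the timestamp itself is arbitrary.
now = datetime(2020, 1, 1, 12, 0, 0)
start, end = query_window(now, timedelta(hours=4), timedelta(minutes=5))
print(start.isoformat(), end.isoformat())
```

With a 5 minute delay, documents that reach Elasticsearch up to 5 minutes late still fall inside the window that is searched.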

  • Start ElastAlert
  docker run -p 8000:8000 \
  -e ELASTICSEARCH_HOST=192.168.99.100 \
  -e ELASTALERT_INDEX=s102 \
  -v $(pwd)/rules:/opt/rules \
  hand/elastalert_prometheus:v0.1

ELASTALERT_INDEX specifies the index in which ElastAlert stores its own data.

  • Edit config.yaml
  # This is the folder that contains the rule yaml files
  # Any .yaml file will be loaded as a rule
  rules_folder: example_rules
  run_every:
    seconds: 10
  buffer_time:
    seconds: 10
  es_host: 192.168.99.100
  es_port: 9200
  writeback_index: elastalert_status
  alert_time_limit:
    seconds: 20

The time units here can be minutes, seconds, days, and so on.
rules_folder: the folder in which the rule files are stored.
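Each time setting in config.yaml is a small mapping from a unit to a number. To my understanding ElastAlert converts these into Python `datetime.timedelta` objects, which would make the accepted units the `timedelta` keyword arguments (weeks, days, hours, minutes, seconds). A minimal sketch of that conversion follows; the helper name `to_timedelta` is hypothetical.

```python
from datetime import timedelta


def to_timedelta(setting):
    """Convert a mapping such as {"seconds": 10} into a timedelta."""
    return timedelta(**setting)


# The same values as in config.yaml, written as Python dicts.
config = {
    "run_every": {"seconds": 10},
    "buffer_time": {"seconds": 10},
    "alert_time_limit": {"seconds": 20},
}
parsed = {name: to_timedelta(value) for name, value in config.items()}
print(parsed["alert_time_limit"].total_seconds())  # 20.0

# Several units can be combined in one setting.
print(to_timedelta({"hours": 1, "minutes": 30}))  # 1:30:00
```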

  • Open docker_ip:8000 to view the metrics
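To check the metrics from a script rather than a browser, the page can be fetched and parsed. The sketch below assumes the container serves the standard Prometheus text exposition format at the root of docker_ip:8000. The metric name in the sample is made up, since this post does not list the metric names the image exports.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped. The
    series key keeps its label set verbatim, e.g. 'name{a="b"}'.
    Optional trailing timestamps are not handled in this sketch.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The sample value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url):
    """Download and parse the metrics page, e.g. http://<docker_ip>:8000/."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Hypothetical sample; the real metric names depend on the image.
sample = """\
# HELP example_alert_total Hypothetical counter
# TYPE example_alert_total counter
example_alert_total{rule="esa_twitter_message_timeout"} 3.0
"""
print(parse_metrics(sample))
```

`fetch_metrics` is defined but not called here, so the example runs without a live container.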

Configuration Examples

  • Exact match
  name: esa_cpu0_term
  type: frequency
  index: cpu_0
  num_events: 10000
  timeframe:
    seconds: 10
  filter:
  - term:
      user_p: "1.0"
  ignore_email: true
  alert:
  - "email"
  email:
  - "elastalert@example.com"

  • Text contains
name: esa_log_warn_count
type: frequency
index: xy_log
num_events: 10000
timeframe:
  seconds: 10
filter:
- match_phrase:
    log: "level=warn"
ignore_email: true
labels:
  xxx : "yyy"
  zzz : "laddlda"
  ppp : $.key
alert:
- "email"
email:
- "elastalert@example.com"

In labels, $.x means the value is taken from field x.
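One way to picture the $.x lookup is the minimal sketch below. It assumes that a label value starting with `$.` is replaced by the named field of the matched document and that any other value is used literally. The function `resolve_labels` and the sample document are hypothetical; the image's real implementation is not shown in this post.

```python
def resolve_labels(labels, document):
    """Return labels with "$.field" values replaced by document fields."""
    resolved = {}
    for name, value in labels.items():
        if isinstance(value, str) and value.startswith("$."):
            # Look up the field named after "$."; empty string if missing.
            resolved[name] = str(document.get(value[2:], ""))
        else:
            resolved[name] = str(value)
    return resolved


labels = {"xxx": "yyy", "zzz": "laddlda", "ppp": "$.key"}
document = {"key": "level=warn", "log": "level=warn msg=disk"}  # made-up match
print(resolve_labels(labels, document))
```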

  • Aggregation query

  • Fields of type text need the fielddata property enabled manually

PUT xxx.application-*/_mapping/es
{
  "properties": {
    "YOUR_FIELD": {
      "type":     "text",
      "fielddata": true
    }
  }
}

name: esa_log_grep_warn_count
type: metric_aggregation
index: xy_log
num_events: 10000
timeframe:
  seconds: 10
filter:
- match_phrase:
    log: "level=warn"
doc_type: logwarn
min_threshold: 0
query_key: log
metric_agg_key: log
metric_agg_type: value_count
aggregation_key: "log"
ignore_email: true
labels:
  xxx : "yyy"
  zzz : "laddlda"
  ppp : $.key
alert:
- "email"
email:
- "elastalert@example.com"

Here $.key retrieves the key value produced by the aggregation.
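The rule above counts values of the log field (value_count) per distinct value of query_key, and $.key then refers to the key of each resulting group. The sketch below imitates that grouping in plain Python without Elasticsearch. The sample documents and the function `count_by_key` are hypothetical, and each whole field value is treated as one bucket, whereas an analyzed text field may be bucketed by token instead.

```python
from collections import Counter


def count_by_key(documents, query_key):
    """Group documents by query_key and count them, one bucket per value."""
    counts = Counter(doc[query_key] for doc in documents if query_key in doc)
    # Mimic the shape of aggregation buckets: a key plus its count.
    return [{"key": key, "value": count} for key, count in sorted(counts.items())]


documents = [  # made-up log entries
    {"log": "level=warn disk"},
    {"log": "level=warn disk"},
    {"log": "level=warn net"},
]
for bucket in count_by_key(documents, "log"):
    # A label written as $.key would receive bucket["key"] here.
    print(bucket["key"], bucket["value"])
```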

VinkDong
