Monitoring component: kafka-exporter
GitHub repository: imduffy15/kafka_exporter (Kafka exporter for Prometheus)
Start the exporter:
docker run -d \
--restart=on-failure:5 \
--name kafka_exporter \
-v /etc/localtime:/etc/localtime \
-p 9308:9308 \
danielqsj/kafka-exporter:v1.2.0 \
--kafka.server=172.30.0.11:9092
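Once the container is running, it can help to confirm that the exporter is actually serving Kafka metrics before wiring it into Prometheus. The snippet below is a minimal sketch: has_metric is a hypothetical helper name (not part of kafka_exporter), and it only checks that a sample line for the given metric appears in the text-format output read from stdin.

```shell
# has_metric NAME: read Prometheus text-format metrics on stdin and succeed
# if at least one sample line for NAME is present.
# "# HELP" / "# TYPE" comment lines do not count, because they start with "#".
has_metric() {
  grep -Eq "^${1}[{ ]"
}
```

A possible use, assuming the exporter is reachable at the address used as the scrape target later in this note: curl -s http://192.168.0.39:9308/metrics | has_metric kafka_brokers && echo "exporter OK"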
Integrating kafka_exporter with Prometheus
vim prometheus.yml
# Kafka monitoring (add under scrape_configs)
  - job_name: 'kafka-172.30.0.11'
    scrape_interval: 10s
    static_configs:
      - targets: ['192.168.0.39:9308']
        labels:
          kafka_ip: 'kafka-172.30.0.11'
Restart the Prometheus container for the change to take effect.
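After the restart, one way to confirm that the new job is being scraped is to query the up metric through the Prometheus HTTP API. The sketch below assumes Prometheus listens on localhost:9090 (an assumption, not stated in this note); target_is_up is a hypothetical helper that inspects the JSON body of an instant query on stdin with a plain regular expression rather than a real JSON parser.

```shell
# target_is_up: read the JSON body of an instant query for up{...} on stdin
# and succeed if any returned sample carries the value "1".
# This is a regex check, not a JSON parser; it only looks for "value":[<ts>,"1"].
target_is_up() {
  grep -Eq '"value"[[:space:]]*:[[:space:]]*\[[^]]*,[[:space:]]*"1"[[:space:]]*\]'
}
```

A possible use: curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="kafka-172.30.0.11"}' | target_is_up && echo "target up"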
Grafana dashboard ID: 7589
https://grafana.com/grafana/dashboards/7589
Alert rules:
# cat rules/kafka-export-alert-rules.yaml
groups:
- name: Kafka consumer lag alerts
  rules:
  - alert: Kafka consumer lag
    expr: sum(kafka_consumergroup_lag{topic!="sop_free_study_fix-student_wechat_detail"}) by (consumergroup, topic) > 1000
    for: 3m
    labels:
      severity: warning
      status: severe
    annotations:
      summary: "Kafka consumer lag"
      description: "{{$.Labels.consumergroup}}##{{$.Labels.topic}}: consumer lag above 1000 for 3 minutes (current: {{$value}})"
  # Note: this series is absent while the exporter cannot be scraped, so the
  # expression below returns nothing in that case; up == 0 for the job covers it.
  - alert: kafka-exporter down
    expr: kafka_exporter_build_info < 1
    for: 3m
    labels:
      severity: warning
      status: severe
    annotations:
      summary: "kafka-exporter down"
      description: "kafka-exporter down {{$.Labels.instance}}"
  - alert: kafka server down
    expr: kafka_brokers < 1
    for: 3m
    labels:
      severity: warning
      status: severe
    annotations:
      summary: "kafka server down"
      description: "kafka server down {{$.Labels.job}}"
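To get a feel for what the consumer lag rule evaluates, the same aggregation can be approximated directly on the exporter's /metrics output. This is only a rough sketch under a few assumptions: lag_by_group_topic is a hypothetical helper, it expects kafka_consumergroup_lag samples with consumergroup and topic labels as used in the rule, and it does not apply the rule's topic exclusion.

```shell
# lag_by_group_topic: read Prometheus text-format metrics on stdin and print
# "consumergroup topic total_lag", summing kafka_consumergroup_lag over
# partitions (roughly what sum(...) by (consumergroup, topic) does).
lag_by_group_topic() {
  awk '
    /^kafka_consumergroup_lag[{]/ {
      group = ""; topic = ""
      # Extract the label values; the offsets skip the label name, "=" and quotes.
      if (match($0, /consumergroup="[^"]*"/)) group = substr($0, RSTART + 15, RLENGTH - 16)
      if (match($0, /topic="[^"]*"/))         topic = substr($0, RSTART + 7,  RLENGTH - 8)
      sum[group " " topic] += $NF   # last field is the sample value
    }
    END { for (k in sum) print k, sum[k] }
  ' | sort
}
```

A possible use, listing only the pairs above the rule's threshold: curl -s http://192.168.0.39:9308/metrics | lag_by_group_topic | awk '$3 > 1000'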
For monitoring multiple nodes, see this article:
prometheus监控kafka (Monitoring Kafka with Prometheus), 蝎's blog on CSDN