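Introduction to Prometheus
Official Prometheus website: https://prometheus.io/
Prometheus is a monitoring, alerting, and time-series database suite written in Go. It was the second project to graduate from the CNCF and is widely used in container and microservice environments; it is generally the standard choice for monitoring Kubernetes. Its main advantages are: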
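- A powerful multi-dimensional data model: labels make it possible to query data along multiple dimensions
- Data is stored in a time-series database. The built-in TSDB uses local storage and can store on the order of ten million samples per second; when large volumes of historical data must be retained, Prometheus can also be connected to a third-party time-series database
- Third-party components such as Grafana provide rich graphical dashboards
- Modular components
- Monitoring targets are discovered through static file configuration or dynamic discovery, after which data collection is automatic
- PromQL, a powerful query language
- A large number of official and third-party exporters for collecting different kinds of metrics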
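The Prometheus architecture is shown in the figure below:
The main components are:
- Prometheus Server: the main service; it accepts client requests and collects, stores, and queries monitoring metrics
- Alertmanager: sends alert notifications
- Prometheus targets (exporters): collect monitoring metrics and expose them to Prometheus Server
- PushGateway: a proxy for metric collection; target hosts push data to the Pushgateway, and Prometheus Server then pulls it from there
- Grafana: a third-party web UI for displaying monitoring data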
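The push path described for PushGateway can be sketched in shell roughly as follows. This is only an illustration under stated assumptions: the address pushgateway.example.org:9091, the job name, the metric name, and the helper names format_sample and push_sample are all hypothetical; only the /metrics/job/<job> path comes from the Pushgateway's HTTP API.

```shell
# Assumption: a Pushgateway is reachable at this hypothetical address.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://pushgateway.example.org:9091}"

# Format one sample in the Prometheus text exposition format: "<name> <value>".
format_sample() {
  printf '%s %s\n' "$1" "$2"
}

# Push a single sample under a job name. The Pushgateway groups metrics
# by the /metrics/job/<job> path segment.
push_sample() {
  job="$1"; name="$2"; value="$3"
  format_sample "$name" "$value" |
    curl --silent --show-error --data-binary @- \
      "${PUSHGATEWAY_URL}/metrics/job/${job}"
}

# Example call (not executed here because it needs a running Pushgateway):
# push_sample nightly_backup backup_duration_seconds 42
```

Prometheus Server would then scrape the Pushgateway like any other target, which is why short-lived batch jobs are the usual fit for this component.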
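Deploying Prometheus
A Prometheus monitoring environment can currently be deployed in several ways, including apt packages, docker-compose, binary installation, and the k8s Operator. Two of them are covered below: binary installation and the k8s Operator.
Deploying Prometheus in a k8s cluster with the Operator
The Operator approach uses pre-written YAML files to deploy Prometheus Server, Alertmanager, Grafana, node-exporter, and the other components into an existing k8s cluster in one batch
Prometheus Operator project: https://github.com/prometheus-operator/kube-prometheus
Download the project code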
wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/tags/v0.12.0.tar.gz
tar xvf v0.12.0.tar.gz && cd kube-prometheus-0.12.0/
Edit the Prometheus and Grafana Service manifests and change their type to NodePort
root@master-01:~/kube-prometheus-0.12.0# cat manifests/prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.41.0
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 39090
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
  selector:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: ClientIP
root@master-01:~/kube-prometheus-0.12.0# cat manifests/grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 9.3.2
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 33000
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
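One detail worth checking before applying these Services: Kubernetes allocates NodePorts from 30000-32767 by default, and the values 39090 and 33000 used above fall outside that range. They are only accepted if the API server was started with a wider --service-node-port-range. The following sketch (the helper name in_default_nodeport_range is made up for illustration) checks a port against the default range:

```shell
# Default Service NodePort range, unless kube-apiserver sets
# --service-node-port-range to something else.
NODEPORT_MIN=30000
NODEPORT_MAX=32767

# Succeeds (exit status 0) when the given port is inside the default range.
in_default_nodeport_range() {
  [ "$1" -ge "$NODEPORT_MIN" ] && [ "$1" -le "$NODEPORT_MAX" ]
}

for port in 39090 33000; do
  if in_default_nodeport_range "$port"; then
    echo "nodePort $port fits the default range"
  else
    echo "nodePort $port needs an extended --service-node-port-range"
  fi
done
```

On a cluster that keeps the default range, picking values such as 30090 and 30300 instead would avoid the need to change the API server configuration.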
Edit the Prometheus and Grafana NetworkPolicy manifests to allow access from endpoints outside the cluster; otherwise the Prometheus and Grafana web pages cannot be reached
root@master-01:~/kube-prometheus-0.12.0# cat manifests/prometheus-networkPolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.41.0
  name: prometheus-k8s
  namespace: monitoring
spec:
  egress:
  - {}
  ingress:
  - from: [] # from set to an empty list
    ports:
    - port: 9090
      protocol: TCP
    - port: 8080
      protocol: TCP
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: grafana
    ports:
    - port: 9090
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
root@master-01:~/kube-prometheus-0.12.0# cat manifests/grafana-networkPolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 9.3.2
  name: grafana
  namespace: monitoring
spec:
  egress:
  - {}
  ingress:
  - from: [] # from set to an empty list
    ports:
    - port: 3000
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: grafana
      app.kubernetes.io/name: grafana
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
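Replace the two images in the manifests that are hosted on registry.k8s.io with copies on Docker Hub, so that image pulls do not fail because of network problems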
# Before replacement
root@master-01:~/kube-prometheus-0.12.0# grep -r registry.k8s.io manifests/*
manifests/kubeStateMetrics-deployment.yaml: image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
manifests/prometheusAdapter-deployment.yaml: image: registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0
# After replacement
root@master-01:~/kube-prometheus-0.12.0# grep -r v5cn manifests/*
manifests/kubeStateMetrics-deployment.yaml: image: v5cn/kube-state-metrics:v2.7.0
manifests/prometheusAdapter-deployment.yaml: image: v5cn/prometheus-adapter:v0.10.0
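The text shows the manifests before and after the change but not the edit itself. One way to do it is a sed substitution, sketched here on scratch copies so that it is safe to run anywhere. The scratch directory and the single-line sample files are assumptions for illustration; against the real checkout the same sed expression would be pointed at the two files under manifests/. Whether the v5cn mirror still hosts these tags has not been verified here.

```shell
# Work in a scratch directory; the sample files contain only the image line.
workdir="$(mktemp -d)"
cat > "$workdir/kubeStateMetrics-deployment.yaml" << 'EOF'
        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.7.0
EOF
cat > "$workdir/prometheusAdapter-deployment.yaml" << 'EOF'
        image: registry.k8s.io/prometheus-adapter/prometheus-adapter:v0.10.0
EOF

# Rewrite registry.k8s.io/<project>/<image> to v5cn/<image>, keeping the tag.
# -i.bak edits in place and leaves a .bak backup next to each file.
sed -i.bak -E 's#registry\.k8s\.io/[^/]+/#v5cn/#' "$workdir"/*.yaml

# Show the resulting image lines.
grep -h 'image:' "$workdir"/*.yaml
```

Keeping the .bak copies makes it easy to diff or revert the change before running kubectl apply.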
Run the deployment
kubectl apply --server-side -f manifests/setup
# Wait until the resources created in the previous step are all established
kubectl wait \
  --for condition=Established \
  --all CustomResourceDefinition \
  --namespace=monitoring
kubectl apply -f manifests
Wait until all Pods are ready
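Rather than polling kubectl get pods by hand, the wait can be scripted with kubectl wait. The sketch below assumes the monitoring namespace used by the manifests; the wrapper name wait_for_monitoring_pods and the KUBECTL override variable are illustrative additions, not part of kube-prometheus.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command
# without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Block until every Pod in the monitoring namespace reports Ready,
# or fail once the timeout (default 300s) expires.
wait_for_monitoring_pods() {
  "$KUBECTL" wait --for=condition=Ready pods --all \
    --namespace=monitoring --timeout="${1:-300s}"
}

# Example call (needs a cluster, so it is left commented out):
# wait_for_monitoring_pods 600s && kubectl -n monitoring get pods
```

A non-zero exit status from the wrapper means at least one Pod did not become Ready in time, which is usually the cue to inspect it with kubectl describe.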
Open Prometheus and Grafana to verify access; the default Grafana username and password are admin/admin
To remove the Prometheus environment, run the following command
kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
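Binary deployment of Prometheus
Each Prometheus component is deployed separately as its own module: Prometheus Server, Grafana, Alertmanager, node-exporter, and the other monitoring components are each installed from binaries on a dedicated server or run in a dedicated container
Deploying prometheus-server
Download the prometheus-server package from the official site: https://prometheus.io/download/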
wget https://github.com/prometheus/prometheus/releases/download/v2.37.5/prometheus-2.37.5.linux-amd64.tar.gz
tar xvf prometheus-2.37.5.linux-amd64.tar.gz -C /usr/local
ln -s /usr/local/prometheus-2.37.5.linux-amd64 /usr/local/prometheus
/usr/local/prometheus/prometheus -h # lists the startup flags Prometheus accepts
Files shipped with Prometheus
root@prometheus-server-01:~# ll /usr/local/prometheus-2.37.5.linux-amd64/
total 206468
drwxr-xr-x 5 3434 3434 4096 Feb 3 2023 ./
drwxr-xr-x 11 root root 4096 Feb 3 2023 ../
drwxr-xr-x 2 3434 3434 4096 Dec 9 13:06 console_libraries/
drwxr-xr-x 2 3434 3434 4096 Dec 9 13:06 consoles/
drwxr-xr-x 4 root root 4096 Feb 3 13:17 data/ # time-series database storage directory
-rw-r--r-- 1 3434 3434 11357 Dec 9 13:06 LICENSE
-rw-r--r-- 1 3434 3434 3773 Dec 9 13:06 NOTICE
-rwxr-xr-x 1 3434 3434 109779661 Dec 9 12:49 prometheus* # server binary
-rw-r--r-- 1 3434 3434 934 Dec 9 13:06 prometheus.yml # configuration file
-rwxr-xr-x 1 3434 3434 101601052 Dec 9 12:52 promtool* # utility for validating configuration files, checking metrics data, and so on
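Since promtool is described as the tool for validating configuration files, a quick check before starting the server might look like the sketch below. The scratch directory and the minimal configuration (Prometheus scraping itself on localhost:9090 every 15s) are assumptions for illustration; the promtool path follows the /usr/local/prometheus symlink used in this walkthrough, and the check is skipped when the binary is not there.

```shell
# Write a minimal configuration to a scratch directory.
workdir="$(mktemp -d)"
cat > "$workdir/prometheus.yml" << 'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
EOF

# Validate the file with promtool if it is installed at the assumed path.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"
if [ -x "$PROMTOOL" ]; then
  "$PROMTOOL" check config "$workdir/prometheus.yml"
else
  echo "promtool not found at $PROMTOOL; skipping validation"
fi
```

Running the same check against /usr/local/prometheus/prometheus.yml after every edit catches syntax errors before they reach the running server.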
Create the service file
cat > /lib/systemd/system/prometheus-server.service << EOF
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/usr/local/prometheus/
# --web.enable-lifecycle enables configuration hot reloading
# --storage.tsdb.retention.time=720h keeps 30 days of data (it replaces the deprecated --storage.tsdb.retention flag)
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.enable-lifecycle --storage.tsdb.retention.time=720h

[Install]
WantedBy=multi-user.target
EOF
Start Prometheus Server
systemctl daemon-reload
systemctl start prometheus-server.service
systemctl status prometheus-server.service
systemctl enable prometheus-server.service
Open the Prometheus web UI to verify access
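Besides opening the web UI, the server can be probed from the command line through its management endpoints, and because the unit file passes --web.enable-lifecycle, a configuration reload can be triggered over HTTP as well. The sketch below assumes the server listens on localhost:9090; the helper names and the CURL override variable are illustrative.

```shell
# Assumption: Prometheus runs on this host and listens on its default port.
PROM_URL="${PROM_URL:-http://localhost:9090}"
# CURL can be overridden (for example CURL=echo) to preview the requests.
CURL="${CURL:-curl}"

# Liveness and readiness probes exposed by Prometheus Server.
prom_healthy() { "$CURL" --silent --fail "${PROM_URL}/-/healthy"; }
prom_ready()   { "$CURL" --silent --fail "${PROM_URL}/-/ready"; }

# Ask the server to re-read prometheus.yml without a restart.
# This endpoint only works when --web.enable-lifecycle is set.
prom_reload()  { "$CURL" --silent --fail --request POST "${PROM_URL}/-/reload"; }

# Example calls (need a running server, so they are left commented out):
# prom_healthy && prom_ready
# prom_reload
```

With --fail, each helper returns a non-zero exit status on an HTTP error, so the functions can be chained in scripts.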
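Deploying node-exporter
node-exporter collects monitoring metrics from the host and listens on port 9100 by default
Download the package
wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz
tar xvf node_exporter-1.5.0.linux-amd64.tar.gz -C /usr/local
ln -sv /usr/local/node_exporter-1.5.0.linux-amd64 /usr/local/node_exporter
/usr/local/node_exporter/node_exporter -h # lists the startup flags node_exporter accepts
Create the service file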
cat > /lib/systemd/system/node-exporter.service << EOF
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
ExecStart=/usr/local/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target
EOF
Start the service
systemctl daemon-reload
systemctl start node-exporter.service
systemctl status node-exporter.service
systemctl enable node-exporter.service
Verify access
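For Prometheus Server to collect from the new exporter, the host has to appear as a scrape target in prometheus.yml. The sketch below generates such a job entry into a scratch file. The target address 192.0.2.10:9100 is a placeholder from the documentation address range, and the job name node-exporter and the scratch file are likewise assumptions; the resulting fragment is meant to be placed under the scrape_configs: key of /usr/local/prometheus/prometheus.yml, followed by a reload of the server.

```shell
# Hypothetical target; replace it with the host that runs node_exporter.
NODE_TARGET="${NODE_TARGET:-192.0.2.10:9100}"

# Generate the job entry in a scratch file. The two-space indent matches
# list items nested under scrape_configs: in prometheus.yml.
workdir="$(mktemp -d)"
cat > "$workdir/node-exporter-job.yml" << EOF
  - job_name: node-exporter
    static_configs:
      - targets: ['${NODE_TARGET}']
EOF

cat "$workdir/node-exporter-job.yml"
```

More hosts can be added to the same targets list, which keeps all node-level metrics under a single job label.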
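Deploying Grafana
Grafana is a visualization component: it accepts requests from the client browser, queries monitoring data from Prometheus, and displays the results in the browser. Like Zabbix, Grafana relies on templates to query and display data; templates can be built by hand or imported
Grafana website: https://grafana.com/
Download page: https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1
Download Grafana
Follow the installation steps from the official site
apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/enterprise/release/grafana-enterprise_9.3.6_amd64.deb
dpkg -i grafana-enterprise_9.3.6_amd64.deb
systemctl start grafana-server.service
systemctl enable grafana-server.service
Verify access
Grafana listens on port 3000 by default; the username and password are admin/admin
Configure the Grafana data source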