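I. Lab environment
1. Kubernetes environment
Version: v1.26.5
Installation reference: "Installing a Kubernetes (K8s) cluster from binaries (containerd-based): a from-scratch tutorial (with certificates)"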
Hostname | IP | OS version | Installed services |
---|---|---|---|
master01 | 10.10.10.21 | rhel7.5 | nginx, etcd, api-server, scheduler, controller-manager, kubelet, proxy |
master02 | 10.10.10.22 | rhel7.5 | nginx, etcd, api-server, scheduler, controller-manager, kubelet, proxy |
master03 | 10.10.10.23 | rhel7.5 | nginx, etcd, api-server, scheduler, controller-manager, kubelet, proxy |
node01 | 10.10.10.24 | rhel7.5 | nginx, kubelet, proxy |
node02 | 10.10.10.25 | rhel7.5 | nginx, kubelet, proxy |
master-lb | 10.10.10.30 | VIP | |
2. Prometheus + Grafana environment
Setup reference: "Prometheus+Grafana monitoring system"
Hostname | IP | OS version |
---|---|---|
jenkins | 10.10.10.10 | rhel7.5 |
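3. Prometheus deployment options
- Prometheus inside the cluster monitoring Kubernetes
  - Prometheus is deployed in the cluster itself, for example in a monitoring namespace. Every namespace gets a default ServiceAccount automatically, and its API token and the cluster CA certificate are mounted into the pod, so these credentials do not have to be supplied by hand. (Since v1.24 the token is a projected, short-lived token rather than an auto-created Secret.) Prometheus still needs RBAC permissions for the resources it discovers.
- Prometheus outside the cluster monitoring Kubernetes
  - Prometheus is deployed on a virtual machine outside the cluster, so the API server address, the CA certificate and a token must be specified manually in prometheus.yml. In this article Prometheus runs outside the cluster, on the jenkins host.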
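For the external mode, the API address, CA and token end up in a scrape job roughly like the one written by the script below. This is only a sketch under assumptions that this article does not state: the API server is assumed to be reachable at https://10.10.10.30:6443 (the VIP from the host table; the port is a guess), and /etc/prometheus/k8s/ca.crt and /etc/prometheus/k8s/token are hypothetical paths for a copy of the cluster CA and a ServiceAccount token. The script only writes the fragment to a scratch file so that it can be reviewed before anything is merged into prometheus.yml.

```shell
#!/bin/sh
# Sketch: write a scrape job for a Prometheus running OUTSIDE the cluster.
# Assumptions (not from the article): API port 6443 and the two file paths below.
API_SERVER="https://10.10.10.30:6443"     # VIP from the host table; port assumed
CA_FILE="/etc/prometheus/k8s/ca.crt"      # hypothetical path to the cluster CA
TOKEN_FILE="/etc/prometheus/k8s/token"    # hypothetical path to a ServiceAccount token
OUT="${TMPDIR:-/tmp}/k8s-external-scrape.yml"

# Unquoted EOF so that the shell variables above are expanded into the YAML.
cat > "$OUT" <<EOF
scrape_configs:
  - job_name: "kubernetes-nodes"
    scheme: https
    tls_config:
      ca_file: ${CA_FILE}
    authorization:
      credentials_file: ${TOKEN_FILE}
    kubernetes_sd_configs:
      - role: node
        api_server: ${API_SERVER}
        tls_config:
          ca_file: ${CA_FILE}
        authorization:
          credentials_file: ${TOKEN_FILE}
EOF

echo "wrote $OUT"
```

The credentials appear twice on purpose: once for service discovery against the API server and once for the scrape requests themselves. The ServiceAccount behind the token would also need RBAC permission to list nodes and read their metrics.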
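4. Version compatibility
The kube-state-metrics README lists which release supports which Kubernetes version; this article pairs kube-state-metrics v2.9.2 with Kubernetes v1.26.5:
https://github.com/kubernetes/kube-state-metrics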
II. Configure kube-state-metrics
https://github.com/kubernetes/kube-state-metrics/tree/v2.9.2/examples/standard
1. Download the manifests
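One way to fetch the five manifests is to pull them from the v2.9.2 tag of the upstream repository. The sketch below assumes the raw.githubusercontent.com view of the examples/standard directory linked above and uses a hypothetical helper name, ksm_manifest_urls. It only prints the URLs, so nothing is downloaded until the output is piped to curl.

```shell
#!/bin/sh
# Sketch: list the raw URLs of the five standard manifests for a given tag.
# ksm_manifest_urls is a hypothetical helper name, not part of kube-state-metrics.
KSM_TAG="v2.9.2"
KSM_BASE="https://raw.githubusercontent.com/kubernetes/kube-state-metrics"

ksm_manifest_urls() {
    for f in cluster-role-binding cluster-role deployment service-account service; do
        echo "${KSM_BASE}/${KSM_TAG}/examples/standard/${f}.yaml"
    done
}

ksm_manifest_urls
```

To download into the current directory, the list can be piped to curl, for example `ksm_manifest_urls | xargs -n 1 curl -fsSLO`. The listings that follow have evidently been edited after download (bitnami image, NodePort Service), so the upstream files will not match them byte for byte.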
[root@master01 kube-state-metrics]# ls
cluster-role-binding.yaml cluster-role.yaml deployment.yaml service-account.yaml service.yaml
[root@master01 kube-state-metrics]# cat cluster-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.9.2
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
[root@master01 kube-state-metrics]# cat cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.9.2
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - serviceaccounts
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  - ingressclasses
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs:
  - list
  - watch
[root@master01 kube-state-metrics]# cat deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.9.2
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 2.9.2
    spec:
      automountServiceAccountToken: true
      containers:
      - image: bitnami/kube-state-metrics:2.9.2
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        name: kube-state-metrics
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 8081
          name: telemetry
        readinessProbe:
          httpGet:
            path: /
            port: 8081
          initialDelaySeconds: 5
          timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 65534
          seccompProfile:
            type: RuntimeDefault
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics
[root@master01 kube-state-metrics]# cat service-account.yaml
apiVersion: v1
automountServiceAccountToken: false
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.9.2
  name: kube-state-metrics
  namespace: kube-system
[root@master01 kube-state-metrics]# cat service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.9.2
  name: kube-state-metrics
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: 8080
    nodePort: 32080
    protocol: TCP
  - name: telemetry
    port: 8081
    targetPort: 8081
    nodePort: 32081
    protocol: TCP
  selector:
    app.kubernetes.io/name: kube-state-metrics
2. Install kube-state-metrics
The Service exposes both ports through a NodePort (32080 for metrics, 32081 for telemetry).
[root@master01 kube-state-metrics]# kubectl apply -f ./
[root@master01 kube-state-metrics]# kubectl get po -n kube-system -o wide | grep kube-state-metrics
kube-state-metrics-57ddc8c4ff-krsh2 1/1 Running 0 9m5s 10.0.3.1 master02 <none> <none>
[root@master01 kube-state-metrics]# kubectl get svc -n kube-system | grep kube-state-metrics
kube-state-metrics NodePort 10.97.38.90 <none> 8080:32080/TCP,8081:32081/TCP 9m17s
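Rather than polling `kubectl get po` by hand, the rollout can be awaited explicitly. A minimal sketch, assuming kubectl is already configured on master01 as in the transcript above; wait_for_ksm is a hypothetical helper name and the 120-second default timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: block until the kube-state-metrics Deployment is fully rolled out.
# wait_for_ksm is a hypothetical helper; the default timeout is arbitrary.
wait_for_ksm() {
    timeout="${1:-120s}"
    kubectl -n kube-system rollout status deployment/kube-state-metrics \
        --timeout="$timeout" || return 1
    # Show where the pod landed and which NodePorts the Service exposes.
    kubectl -n kube-system get pods,svc \
        -l app.kubernetes.io/name=kube-state-metrics -o wide
}
```

Calling `wait_for_ksm 60s` after `kubectl apply -f ./` returns non-zero if the Deployment does not become ready within the given time, which makes the step usable in a script.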
3. Test the result
The pod was scheduled on master02, i.e. 10.10.10.22. The health endpoint answers on the Service's cluster IP:
[root@master01 kube-state-metrics]# curl http://10.97.38.90:8080/healthz -w '\n'
OK
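/healthz only proves that the process is alive. To confirm that metrics are actually being generated, the /metrics endpoint on the http-metrics port can be filtered for a known series such as kube_pod_info. The sketch below defines a hypothetical helper, count_samples, and runs it against a small hand-written sample in Prometheus text format; the sample lines are illustrative and were not captured from this cluster.

```shell
#!/bin/sh
# Sketch: count samples of one metric in Prometheus text-format output.
# count_samples is a hypothetical helper; it reads the exposition text on stdin.
count_samples() {
    metric="$1"
    # Skip "# HELP" / "# TYPE" comment lines, then match "name{" or "name ".
    grep -v '^#' | grep -c -E "^${metric}(\{| )"
}

# Illustrative sample only (not real output from the cluster in this article).
SAMPLE='# HELP kube_pod_info Information about pod.
# TYPE kube_pod_info gauge
kube_pod_info{namespace="kube-system",pod="kube-state-metrics-abc"} 1
kube_pod_info{namespace="default",pod="demo"} 1
kube_pod_status_phase{namespace="default",pod="demo",phase="Running"} 1'

printf '%s\n' "$SAMPLE" | count_samples kube_pod_info   # prints 2
```

Against the live Service the same helper would be fed from curl, for example `curl -s http://10.10.10.22:32080/metrics | count_samples kube_pod_info`; a result of 0 would suggest that the exporter is up but cannot list pods.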
III. Configure Prometheus
1. Edit prometheus.yml
[root@jenkins ~]# cat Prometheus/prometheus.yml
  - job_name: "kube-state-metrics"
    static_configs:
      - targets: ["10.10.10.22:32080"]
  - job_name: "kube-state-telemetry"
    static_configs:
      - targets: ["10.10.10.22:32081"]
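Before restarting, it can be worth validating the edited file. promtool ships alongside Prometheus and has a `check config` subcommand. The sketch below writes the two jobs into a standalone scratch file under a scrape_configs key (in the real prometheus.yml they are appended to the existing scrape_configs list) and validates it only if promtool is on the PATH. The scratch file name is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: write the two scrape jobs to a scratch file and validate them.
CFG="${TMPDIR:-/tmp}/ksm-scrape-check.yml"   # scratch path, chosen for this example

# Quoted 'EOF' so that the YAML is written exactly as shown.
cat > "$CFG" <<'EOF'
scrape_configs:
  - job_name: "kube-state-metrics"
    static_configs:
      - targets: ["10.10.10.22:32080"]
  - job_name: "kube-state-telemetry"
    static_configs:
      - targets: ["10.10.10.22:32081"]
EOF

# promtool is optional here: skip the check when it is not installed.
if command -v promtool >/dev/null 2>&1; then
    promtool check config "$CFG"
else
    echo "promtool not found; skipped validation of $CFG"
fi
```

With the containerised setup used here, the same check can be run inside the container, for example `docker exec prometheus promtool check config /etc/prometheus/prometheus.yml`, assuming the configuration is mounted at the image's default path.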
2. Restart Prometheus
[root@jenkins ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a0497377cd82 grafana/grafana-enterprise "/run.sh" 13 days ago Up 3 minutes 0.0.0.0:3000->3000/tcp grafana
3e0e4270bd92 prom/prometheus "/bin/prometheus --c…" 13 days ago Up 3 minutes 0.0.0.0:9090->9090/tcp prometheus
[root@jenkins ~]# docker restart 3e0e4270bd92
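A full container restart works, but Prometheus can also re-read its configuration in place when it receives SIGHUP. A sketch, assuming the container is named prometheus as in the `docker ps` output above; reload_prometheus is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: reload the Prometheus configuration without restarting the container.
# reload_prometheus is a hypothetical helper; the default name comes from `docker ps`.
reload_prometheus() {
    container="${1:-prometheus}"
    # SIGHUP makes Prometheus re-read its configuration file.
    docker kill --signal=HUP "$container" || return 1
    # Print the most recent log lines so that a failed reload is visible.
    docker logs --tail 5 "$container"
}
```

A reload keeps the in-memory state and avoids a scrape gap, whereas `docker restart` is the simpler choice when command-line flags have changed as well.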
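3. Log in and check the result
Open the Prometheus web UI at http://10.10.10.10:9090 and confirm under Status > Targets that the kube-state-metrics and kube-state-telemetry jobs are UP.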
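The target status can also be checked from the command line through the Prometheus HTTP API, whose /api/v1/targets endpoint reports the health of each target. The sketch below assumes Prometheus listens on 10.10.10.10:9090 (from the host table and the port mapping shown by `docker ps`) and uses a hypothetical helper, count_up_targets, demonstrated on a trimmed, hand-written response rather than real output.

```shell
#!/bin/sh
# Sketch: count targets reported as healthy in a /api/v1/targets response.
# count_up_targets is a hypothetical helper; it reads the JSON on stdin.
count_up_targets() {
    # Each active target carries a "health" field; count those equal to "up".
    grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Trimmed, hand-written response for illustration (not captured from a real server).
SAMPLE_JSON='{"status":"success","data":{"activeTargets":[
{"labels":{"job":"kube-state-metrics"},"health":"up"},
{"labels":{"job":"kube-state-telemetry"},"health":"up"},
{"labels":{"job":"other"},"health":"down"}]}}'

printf '%s\n' "$SAMPLE_JSON" | count_up_targets   # prints 2
```

Against the live server this would be `curl -s http://10.10.10.10:9090/api/v1/targets | count_up_targets`. The text match is deliberately crude; a JSON-aware tool would be the safer choice if per-job results are needed.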
IV. Configure Grafana
Recommended dashboard templates (grafana.com dashboard IDs): 13332, 13824, 14518
1. Import a template