K8s: Monitoring Hosts and a K8s Cluster with Prometheus

devtools/2024/9/23 20:06:24/


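1 ) Setting up Prometheus

  • Create the ServiceAccount
    • Gives Prometheus role-based access, so that not everyone can read the cluster data
  • Create the ConfigMap that holds the Prometheus configuration
    • Defines a set of scrape jobs that collect monitoring data at each level (nodes, endpoints, services, pods, cAdvisor)
  • Create the ConfigMap that holds the alerting rules
    • For example, when memory usage stays above 75%, the alert is given labels such as severity
    • The rules can also trigger events
  • Create the default username and password (the manifest stores them in a Secret named grafana)
    • The default username/password is admin/admin: echo "YWRtaW4=" | base64 -d (on macOS, base64 -D)
  • Deploy the Prometheus Server Deployment
    • Exposes container port 9090 and mounts the 2 ConfigMaps
    • The original Docker image is prom/prometheus:v1.7.0
  • Deploy the Prometheus Server Service
    • Once deployed, Prometheus can be reached from outside the cluster
    • at the path :30091/graph (the Service is a NodePort on port 30091)
    • Use the drop-down menu to view data along different dimensions
    • Prometheus's built-in reports are fairly limited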
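
The default credentials in the list above are stored base64-encoded, as Kubernetes Secret values are. As a minimal sketch, the value `YWRtaW4=` can be decoded in Python to confirm that it is `admin`. The helper names `decode_secret_value` and `encode_secret_value` are hypothetical and are not part of the original setup.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode a base64-encoded Kubernetes Secret value into text."""
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(encoded, validate=True).decode("utf-8")


def encode_secret_value(plain: str) -> str:
    """Encode text the way a Secret's `data` field expects it."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    print(decode_secret_value("YWRtaW4="))  # prints: admin
```

The same round trip can be used to produce a different password for the Secret: encode the new value and paste the result into the `data` field.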

2 ) One-step deployment

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: prometheus-core
  namespace: monitoring
data:
  prometheus.yaml: |
    global:
      scrape_interval: 10s
      scrape_timeout: 10s
      evaluation_interval: 10s
    rule_files:
      - "/etc/prometheus-rules/*.rules"
    scrape_configs:

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L37
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          #- source_labels: [__address__]
          #  regex: '(.*):10250'
          #  replacement: '${1}:10255'
          #  target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L79
      - job_name: 'kubernetes-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: (.+)(?::\d+);(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L119
      - job_name: 'kubernetes-services'
        metrics_path: /probe
        params:
          module: [http_2xx]
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
            action: keep
            regex: true
          - source_labels: [__address__]
            target_label: __param_target
          - target_label: __address__
            replacement: blackbox
          - source_labels: [__param_target]
            target_label: instance
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            target_label: kubernetes_name

      # https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml#L156
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: (.+):(?:\d+);(\d+)
            replacement: ${1}:${2}
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
          - source_labels: [__meta_kubernetes_pod_container_port_number]
            action: keep
            regex: 9\d{3}

      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
---
apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: prometheus-rules
  namespace: monitoring
data:
  cpu-usage.rules: |
    ALERT NodeCPUUsage
      IF (100 - (avg by (instance) (irate(node_cpu{name="node-exporter",mode="idle"}[5m])) * 100)) > 75
      FOR 2m
      LABELS {
        severity="page"
      }
      ANNOTATIONS {
        SUMMARY = "{{$labels.instance}}: High CPU usage detected",
        DESCRIPTION = "{{$labels.instance}}: CPU usage is above 75% (current value is: {{ $value }})"
      }
  instance-availability.rules: |
    ALERT InstanceDown
      IF up == 0
      FOR 1m
      LABELS { severity = "page" }
      ANNOTATIONS {
        summary = "Instance {{ $labels.instance }} down",
        description = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.",
      }
  low-disk-space.rules: |
    ALERT NodeLowRootDisk
      IF ((node_filesystem_size{mountpoint="/root-disk"} - node_filesystem_free{mountpoint="/root-disk"} ) / node_filesystem_size{mountpoint="/root-disk"} * 100) > 75
      FOR 2m
      LABELS {
        severity="page"
      }
      ANNOTATIONS {
        SUMMARY = "{{$labels.instance}}: Low root disk space",
        DESCRIPTION = "{{$labels.instance}}: Root disk usage is above 75% (current value is: {{ $value }})"
      }
    ALERT NodeLowDataDisk
      IF ((node_filesystem_size{mountpoint="/data-disk"} - node_filesystem_free{mountpoint="/data-disk"} ) / node_filesystem_size{mountpoint="/data-disk"} * 100) > 75
      FOR 2m
      LABELS {
        severity="page"
      }
      ANNOTATIONS {
        SUMMARY = "{{$labels.instance}}: Low data disk space",
        DESCRIPTION = "{{$labels.instance}}: Data disk usage is above 75% (current value is: {{ $value }})"
      }
  mem-usage.rules: |
    ALERT NodeSwapUsage
      IF (((node_memory_SwapTotal-node_memory_SwapFree)/node_memory_SwapTotal)*100) > 75
      FOR 2m
      LABELS {
        severity="page"
      }
      ANNOTATIONS {
        SUMMARY = "{{$labels.instance}}: Swap usage detected",
        DESCRIPTION = "{{$labels.instance}}: Swap usage is above 75% (current value is: {{ $value }})"
      }
    ALERT NodeMemoryUsage
      IF (((node_memory_MemTotal-node_memory_MemFree-node_memory_Cached)/(node_memory_MemTotal)*100)) > 75
      FOR 2m
      LABELS {
        severity="page"
      }
      ANNOTATIONS {
        SUMMARY = "{{$labels.instance}}: High memory usage detected",
        DESCRIPTION = "{{$labels.instance}}: Memory usage is above 75% (current value is: {{ $value }})"
      }
---
apiVersion: v1
kind: Secret
data:
  admin-password: YWRtaW4=
  admin-username: YWRtaW4=
metadata:
  name: grafana
  namespace: monitoring
type: Opaque
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-core
  namespace: monitoring
  labels:
    app: prometheus
    component: core
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: core
  template:
    metadata:
      name: prometheus-main
      labels:
        app: prometheus
        component: core
    spec:
      serviceAccountName: prometheus-k8s
      containers:
      - name: prometheus
        #image: prom/prometheus:v1.7.0
        image: prom/prometheus:v1.7.0
        args:
          - '-storage.local.retention=12h'
          - '-storage.local.memory-chunks=500000'
          - '-config.file=/etc/prometheus/prometheus.yaml'
          - '-alertmanager.url=http://alertmanager:9093/'
        ports:
        - name: webui
          containerPort: 9090
        resources:
          requests:
            cpu: 500m
            memory: 500M
          limits:
            cpu: 500m
            memory: 500M
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: rules-volume
          mountPath: /etc/prometheus-rules
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-core
      - name: rules-volume
        configMap:
          name: prometheus-rules
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
    component: core
  annotations:
    prometheus.io/scrape: 'true'
spec:
  type: NodePort
  ports:
    - port: 9090
      protocol: TCP
      name: webui
      nodePort: 30091
  selector:
    app: prometheus
    component: core
---
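
The `kubernetes-endpoints` job in the manifest above rewrites `__address__` with a relabel rule whose regex is `(.+)(?::\d+);(\d+)`. The sketch below imitates that single rule in Python to show what it does. It is an illustration only, not Prometheus code. It assumes that source label values are joined with `;` and that the regex has to match the whole joined string, which is how the Prometheus relabeling documentation describes the `replace` action. The function name `relabel_address` and the sample addresses are made up for the example.

```python
import re

# Regex copied from the kubernetes-endpoints job; the replacement is "$1:$2".
ADDRESS_REGEX = re.compile(r"(.+)(?::\d+);(\d+)")


def relabel_address(address: str, port_annotation: str) -> str:
    """Imitate the `replace` action that rewrites __address__."""
    # The two source label values are joined with ";" before matching.
    joined = f"{address};{port_annotation}"
    # Relabel regexes are anchored at both ends, so use fullmatch.
    match = ADDRESS_REGEX.fullmatch(joined)
    if match is None:
        # No match: the replace action leaves the target label unchanged.
        return address
    # Equivalent of the replacement "$1:$2".
    return f"{match.group(1)}:{match.group(2)}"


if __name__ == "__main__":
    # Discovered address plus a prometheus.io/port annotation of 9102.
    print(relabel_address("10.0.0.5:8080", "9102"))  # prints: 10.0.0.5:9102
```

With the regex as written in the manifest, the port group `(?::\d+)` is mandatory, so an address that carries no port (for example `10.0.0.5`) does not match and is left as it was. The same holds when the service has no port annotation.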
