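Create the ServiceAccount (run these steps on the master node of the k8s cluster)
The ServiceAccount must be granted permissions so that Prometheus has enough access to the k8s cluster to collect information from the other nodes.
# Create a ServiceAccount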
[root@master ~]# kubectl create serviceaccount monitor -n monitor
serviceaccount/monitor created
# Bind the monitor ServiceAccount to a ClusterRole through a ClusterRoleBinding
[root@master prometheus]# kubectl create clusterrolebinding monitor-clusterrolebinding -n monitor --clusterrole=cluster-admin --serviceaccount=monitor:monitor
clusterrolebinding.rbac.authorization.k8s.io/monitor-clusterrolebinding created
The cluster-admin ClusterRole carries full administrator rights, so this ServiceAccount can now access every resource in the k8s cluster.
Relabeler - The playground for Prometheus relabeling rules
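Kubernetes: auto-discovering node_exporter and cAdvisor with the node role
global:
  scrape_interval: 15s      # how often targets are scraped
  scrape_timeout: 10s       # how long a scrape may take before it times out (default 10s)
  evaluation_interval: 1m   # how often alerting rules are evaluated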
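Prometheus rejects a configuration whose scrape_timeout is larger than its scrape_interval, so it can be worth checking the two values before loading a config. The following is only a minimal sketch of that check; the helper names to_seconds and timeout_fits_interval are made up for illustration, and the parser only understands simple single-unit durations such as the ones used above.

```python
import re

# Multipliers for simple single-unit durations such as "15s" or "1m".
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(duration: str) -> float:
    """Convert a single-unit duration string to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def timeout_fits_interval(scrape_interval: str, scrape_timeout: str) -> bool:
    """Return True when the timeout does not exceed the interval."""
    return to_seconds(scrape_timeout) <= to_seconds(scrape_interval)


# The values from the global section above pass the check.
print(timeout_fits_interval("15s", "10s"))
# A timeout of 1m with a 15s interval would not.
print(timeout_fits_interval("15s", "1m"))
```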
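scrape_configs: configures the data sources, called targets. Each group of targets is named with a job_name, and targets are defined either statically or through service discovery.
k8s service discovery offers several roles. With the node role, Prometheus uses the HTTP port provided by the kubelet to discover every node in the cluster.
Relabeling: the discovered targets have to be relabeled before they can be scraped. Because the metrics come from node_exporter, the default port 10250 must be rewritten to 9100, otherwise no node data is collected.
relabel_configs:
# Relabel the target address
- source_labels: [__address__]   # original label to match against: the target address
  regex: '(.*):10250'            # match addresses of the form ip:10250
  replacement: '${1}:9100'       # keep the captured ip (${1}) and append port 9100
  target_label: __address__      # label that receives the new address
  action: replace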
[root@master ~]# netstat -tpln | grep 10250
tcp6 0 0 :::10250 :::* LISTEN 482/kubelet
[root@master prometheus]# netstat -tpln | grep 9100
tcp6 0 0 :::9100 :::* LISTEN 22132/node_exporter
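To see what the replace rule does to an address, here is a small Python sketch that mimics it. It is a simplification written for this post, not Prometheus code: relabel_replace is a hypothetical helper, it handles a single source label only, and the IP 192.168.0.10 is an invented example. It relies on the fact that Prometheus anchors relabel regexes at both ends, which re.fullmatch reproduces.

```python
import re


def relabel_replace(labels, source_label, regex, replacement, target_label):
    """Apply a simplified single-source 'replace' relabel rule to a label dict."""
    value = labels.get(source_label, "")
    # Relabel regexes are fully anchored, hence fullmatch.
    match = re.fullmatch(regex, value)
    result = dict(labels)
    if match is None:
        return result  # no match: labels are left unchanged
    # Translate the ${1} reference into Python's \g<1> group syntax.
    py_replacement = re.sub(r"\$\{(\d+)\}", r"\\g<\1>", replacement)
    result[target_label] = match.expand(py_replacement)
    return result


# Hypothetical node address as discovered through the kubelet port.
discovered = {"__address__": "192.168.0.10:10250"}
rewritten = relabel_replace(
    discovered, "__address__", r"(.*):10250", "${1}:9100", "__address__"
)
print(rewritten["__address__"])  # 192.168.0.10:9100
```

An address that does not end in :10250 fails the match and is passed through untouched, which is why the rule is safe to apply to every discovered target.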
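labelmap: for every label whose name matches the regex below, the value is copied to a new label named after the captured group, so the node labels are kept on the scraped metrics.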
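scrape_configs:
# scrape_configs configures the data sources (targets); each job is named with job_name.
# Targets are defined either statically or through service discovery.
- job_name: 'kubernetes-node'
  kubernetes_sd_configs:   # use k8s service discovery
  - role: node             # the node role discovers every node through the kubelet's HTTP port
  relabel_configs:
  # Relabel the target address
  - source_labels: [__address__]   # original label to match against: the target address
    regex: '(.*):10250'            # match addresses of the form ip:10250
    replacement: '${1}:9100'       # keep the captured ip (${1}) and append port 9100
    target_label: __address__      # label that receives the new address
    action: replace
  - action: labelmap               # copy labels whose names match the regex below
    regex: __meta_kubernetes_node_label_(.+)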
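The labelmap action is easier to follow with a concrete label set. The sketch below imitates it in Python under simplifying assumptions: labelmap is a hypothetical helper, the replacement defaults to the first capture group, and the sample labels are invented for illustration.

```python
import re


def labelmap(labels, regex, replacement=r"\1"):
    """Copy the value of every label whose name matches regex to a new label.

    The new label name is built from the replacement, which by default is
    the first capture group of the regex.
    """
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match:
            result[match.expand(replacement)] = value
    return result


# Hypothetical labels attached to a discovered node target.
discovered = {
    "__address__": "192.168.0.10:9100",
    "__meta_kubernetes_node_label_kubernetes_io_hostname": "node1",
}
mapped = labelmap(discovered, r"__meta_kubernetes_node_label_(.+)")
print(mapped["kubernetes_io_hostname"])  # node1
```

Labels that start with a double underscore are dropped once relabeling finishes, so without this copy step the node labels would not appear on the stored series.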
node
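The node role discovers one target per cluster node, with the address defaulting to the Kubelet's HTTP port. The target address defaults to the first existing address of the Kubernetes node object, in the address type order NodeInternalIP, NodeExternalIP, NodeLegacyHostIP, and NodeHostName.
Available meta labels:
__meta_kubernetes_node_name: the name of the node object.
__meta_kubernetes_node_label_<labelname>: each label from the node object.
__meta_kubernetes_node_labelpresent_<labelname>: true for each label from the node object.
__meta_kubernetes_node_annotation_<annotationname>: each annotation from the node object.
__meta_kubernetes_node_annotationpresent_<annotationname>: true for each annotation from the node object.
__meta_kubernetes_node_address_<address_type>: the first address for each node address type, if it exists.
In addition, the instance label for the node is set to the node name as retrieved from the API server.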
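The address selection order described above can be pictured with a few lines of Python. This is only an illustration of the ordering rule: pick_node_address and the input shape (a dict from address type to a list of addresses) are assumptions made for the example, and the addresses are invented.

```python
# Address types in the order in which they are tried.
ADDRESS_ORDER = ["NodeInternalIP", "NodeExternalIP", "NodeLegacyHostIP", "NodeHostName"]


def pick_node_address(addresses):
    """Return the first existing address following ADDRESS_ORDER, or None."""
    for address_type in ADDRESS_ORDER:
        candidates = addresses.get(address_type, [])
        if candidates:
            return candidates[0]
    return None


# A node that reports both an internal IP and a hostname: the internal IP wins.
print(pick_node_address({"NodeHostName": ["node1"], "NodeInternalIP": ["192.168.0.11"]}))
# A node that only reports a hostname falls back to it.
print(pick_node_address({"NodeHostName": ["node2"]}))
```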
# Scrape cAdvisor data: container resource usage is read from the kubelet's
# /metrics/cadvisor endpoint, reached through the API server proxy.
- job_name: 'kubernetes-node-cadvisor'
  kubernetes_sd_configs:
  - role: node
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
[root@master ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 45h
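With the node-role service discovery above, the final scrape URL is scheme + __address__ + __metrics_path__:
- node_exporter: ${1}:9100 + /metrics, where ${1} is the node IP captured from the original address
- cadvisor: kubernetes.default.svc:443 + /api/v1/nodes/${1}/proxy/metrics/cadvisor, where ${1} is the node name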
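Putting the three parts together, the two jobs end up scraping URLs like the ones printed below. This is just a sketch of the string assembly: scrape_url is a hypothetical helper, 192.168.0.10 is an invented node IP, and the node job is assumed to use the default http scheme since it sets none.

```python
def scrape_url(scheme, address, metrics_path):
    """Assemble the URL that is scraped: scheme + __address__ + __metrics_path__."""
    return f"{scheme}://{address}{metrics_path}"


# node_exporter: address rewritten to port 9100, default metrics path.
print(scrape_url("http", "192.168.0.10:9100", "/metrics"))

# cAdvisor: every node is reached through the API server proxy, so only the
# metrics path differs per node (node1 is used as the node name here).
node_name = "node1"
print(scrape_url(
    "https",
    "kubernetes.default.svc:443",
    f"/api/v1/nodes/{node_name}/proxy/metrics/cadvisor",
))
```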
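Kubernetes: auto-discovering the API server with the endpoints role
Each service discovery role exposes a different set of source labels.
The k8s service discovery role used here is endpoints.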
- job_name: 'kubernetes-apiserver'
  kubernetes_sd_configs:
  - role: endpoints
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  # __meta_kubernetes_namespace: namespace of the endpoints object
  # __meta_kubernetes_service_name: name of the service the endpoints object belongs to
  # __meta_kubernetes_endpoint_port_name: name of the endpoint port
  # Keep only the endpoints in the default namespace whose service name is
  # kubernetes and whose port name is https.
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
The next job is a dedicated service discovery job. It scrapes only endpoints whose Service carries the "prometheus.io/scrape: true" annotation, and it uses the other prometheus.io annotations on the Service to set the target's scheme, metrics path, and port.
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only endpoints whose Service declares the annotation
  # "prometheus.io/scrape: true". An annotation is a key/value pair, so the
  # source label names the key and regex holds the value true. When the value
  # matches, keep retains the target; everything else is dropped.
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # Set the scheme from the prometheus.io/scheme annotation. If the annotation
  # value matches the regex (http or https), it is written to __scheme__.
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  # Some applications expose metrics on a path other than the default /metrics.
  # Declare it on the Service, for example "prometheus.io/path = /mymetrics",
  # and this rule assigns that path to __metrics_path__. The annotation key must
  # agree with the source label: if the Service uses
  # prometheus.io/app-metrics-path: '/metrics', the source label here has to be
  # __meta_kubernetes_service_annotation_prometheus_io_app_metrics_path.
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Use the port declared on the Service with "prometheus.io/port = <port>".
  # The host part of the address is joined with that port and written back to
  # __address__, so Prometheus scrapes the declared port combined with
  # __metrics_path__.
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  # Add extra information as labels.
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
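The keep action with several source labels can be tried out in Python. The sketch below is a simplified imitation, not Prometheus itself: keep is a hypothetical helper that joins the source label values with the default separator ";" and applies the fully anchored regex to the joined string.

```python
import re


def keep(labels, source_labels, regex, separator=";"):
    """Return True if the target survives a simplified 'keep' relabel rule."""
    # Missing labels count as empty strings before joining.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, joined) is not None


SOURCE_LABELS = [
    "__meta_kubernetes_namespace",
    "__meta_kubernetes_service_name",
    "__meta_kubernetes_endpoint_port_name",
]

apiserver = {
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_service_name": "kubernetes",
    "__meta_kubernetes_endpoint_port_name": "https",
}
coredns = {
    "__meta_kubernetes_namespace": "kube-system",
    "__meta_kubernetes_service_name": "coredns",
    "__meta_kubernetes_endpoint_port_name": "metrics",
}

print(keep(apiserver, SOURCE_LABELS, r"default;kubernetes;https"))  # True
print(keep(coredns, SOURCE_LABELS, r"default;kubernetes;https"))    # False
```

The same helper covers the single-label case: a Service without the prometheus.io/scrape annotation yields an empty string, which does not match the regex true, so its endpoints are dropped.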
[root@master ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-867b49865c-f6qbh 1/1 Running 2 45h 10.233.96.13 node2 <none> <none>
coredns-867b49865c-m9hx4 1/1 Running 2 45h 10.233.90.9 node1 <none> <none>
[root@master ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 45h
[root@master ~]# kubectl get ep -n kube-system
NAME ENDPOINTS AGE
coredns 10.233.90.9:53,10.233.96.13:53,10.233.90.9:53 + 3 more... 45h
[root@master ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 52d
kube-scheduler-prometheus-discovery ClusterIP None <none> 10259/TCP 36d
kube-state-metrics ClusterIP 10.233.15.152 <none> 8080/TCP 47d
[root@master ~]# kubectl get svc coredns -o yaml -n kube-system
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"prometheus.io/port":"9153","prometheus.io/scrape":"true"},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","k8s-app":"kube-dns","kubernetes.io/cluster-service":"true","kubernetes.io/name":"coredns"},"name":"coredns","namespace":"kube-system"},"spec":{"clusterIP":"10.233.0.3","ports":[{"name":"dns","port":53,"protocol":"UDP"},{"name":"dns-tcp","port":53,"protocol":"TCP"},{"name":"metrics","port":9153,"protocol":"TCP"}],"selector":{"k8s-app":"kube-dns"}}}
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
The coredns Service carries the annotation prometheus.io/scrape: "true", so the keep rule of the kubernetes-service-endpoints job retains its endpoints. Endpoints of Services without that annotation are dropped. The prometheus.io/port: "9153" annotation tells Prometheus which port to scrape.
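For CoreDNS, the address rewrite rule turns the discovered endpoint address into the metrics address by swapping in the port from the prometheus.io/port annotation. The Python sketch below imitates that one rule with the values shown above; rewrite_address is a hypothetical helper, and the $1:$2 replacement is written in Python's \1:\2 group syntax.

```python
import re

# Regex taken from the relabel rule: host, optional port, separator, annotated port.
ADDRESS_PORT_RE = r"([^:]+)(?::\d+)?;(\d+)"


def rewrite_address(address, annotated_port):
    """Join host and annotated port the way the replace rule does.

    The two source label values are joined with ";" before matching. If the
    annotation is missing, the regex does not match and the address is kept.
    """
    joined = f"{address};{annotated_port}"
    match = re.fullmatch(ADDRESS_PORT_RE, joined)
    if match is None:
        return address
    return match.expand(r"\1:\2")


# CoreDNS endpoint discovered on the DNS port, annotation prometheus.io/port: "9153".
print(rewrite_address("10.233.90.9:53", "9153"))  # 10.233.90.9:9153
# Without the annotation the original address is left alone.
print(rewrite_address("10.233.90.9:53", ""))      # 10.233.90.9:53
```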
[root@master prometheus]# curl 10.233.90.9:9153/metrics | head -n 10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.14.1",revision="1766568",version="1.6.9"} 1
# HELP coredns_cache_hits_total The count of cache hits.
# TYPE coredns_cache_hits_total counter
coredns_cache_hits_total{server="dns://:53",type="denial"} 1
# HELP coredns_cache_misses_total The count of cache misses.
# TYPE coredns_cache_misses_total counter
coredns_cache_misses_total{server="dns://:53"} 6
# HELP coredns_cache_size The number of elements in the cache.
100 12115 0 12115 0 0 4249k 0 --:--:-- --:--:-- --:--:-- 5915k
curl: (23) Failed writing body (123 != 2048)
As the output shows, the endpoints role of service discovery also lets Prometheus scrape the metrics that CoreDNS exposes.
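The curl output above is in the Prometheus text format: comment lines start with #, and each sample line is a metric name, optional labels in braces, and a value. As a rough illustration of how such a line breaks down, here is a deliberately simplified parser. parse_sample is a hypothetical helper written for this post; it ignores optional timestamps and other corner cases that a real client library handles.

```python
import re

# name, optional {labels}, value. Simplified on purpose.
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$"
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Return (name, labels, value) for a sample line, or None for other lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # HELP/TYPE comments and blank lines carry no sample
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'coredns_cache_hits_total{server="dns://:53",type="denial"} 1'
)
print(name, labels["type"], value)  # coredns_cache_hits_total denial 1.0
```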
Prometheus: complete YAML files
Prometheus configuration file
[root@master prometheus]# cat prometheus-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
Prometheus Deployment file
[root@master prometheus]# cat prometheus-deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.2.1
        imagePullPolicy: IfNotPresent
        command:
        - prometheus
        - --config.file=/etc/prometheus/prometheus.yml
        - --storage.tsdb.path=/prometheus
        - --storage.tsdb.retention=720h
        - --web.enable-lifecycle
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus/prometheus.yml
          name: prometheus-config
          subPath: prometheus.yml
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
          items:
          - key: prometheus.yml
            path: prometheus.yml
            mode: 0644
      - name: prometheus-storage-volume
        persistentVolumeClaim:
          claimName: prometheus
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus
  namespace: monitor
spec:
  storageClassName: "managed-nfs-storage"
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
The following annotation in the pod template marks the Prometheus server itself as not to be scraped:
prometheus.io/scrape: 'false'