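1. Configure Kubernetes API access
For a Prometheus server running outside the cluster to use kubernetes_sd_configs for service discovery, it must be able to reach the Kubernetes API server and hold sufficient permissions.
1.1 Create a Kubernetes ServiceAccount and grant permissions
First, create a ClusterRole, a ServiceAccount, and a ClusterRoleBinding in the cluster so that Prometheus can query the Kubernetes API for service discovery.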
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - nodes/proxy
  - services
  - services/proxy   # needed to scrape services through the API server proxy
  - endpoints
  - pods
  - pods/proxy       # needed to scrape pods through the API server proxy
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  - networking.k8s.io   # ingresses moved here; extensions/v1beta1 was removed in Kubernetes 1.22
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: prom
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prom
1.2 Obtain credentials for the Kubernetes API server
- Use kubectl to get the ServiceAccount token.
Before Kubernetes 1.24:
kubectl -n prom get secret $(kubectl -n prom get sa/prometheus -o jsonpath="{.secrets[0].name}") -o jsonpath="{.data.token}" | base64 --decode
Kubernetes 1.24 and later:
Create a short-lived token (it expires):
kubectl create token prometheus -n prom
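The token printed by kubectl create token is a JWT, so its expiry can be read from the payload without contacting the cluster. The sketch below assumes a POSIX shell with cut, tr, and base64 available; jwt_payload is a hypothetical helper name, not part of kubectl, and it does not verify the signature.

```shell
# jwt_payload TOKEN: print the JSON payload (second dot-separated segment) of a JWT.
# Hypothetical helper for illustration only; the signature is NOT verified.
jwt_payload() (
  # JWTs use base64url: map '_' and '-' back to the standard base64 alphabet.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the '=' padding; restore it so base64 accepts the input.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 --decode
)
```

For example, jwt_payload "$(kubectl create token prometheus -n prom --duration=24h)" prints the claims, including exp as a Unix timestamp. The --duration flag requests a lifetime; the API server may issue a token with a different one.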
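Create a long-lived token
Tokens issued by kubectl create token are short-lived by default. To get a token that does not expire, create a Secret of type kubernetes.io/service-account-token that is bound to the ServiceAccount. The steps are as follows.
1. Create the ServiceAccount
Make sure a ServiceAccount named **prometheus** exists. If it does not, create it:
kubectl create serviceaccount prometheus -n prom
2. Create a Secret bound to the ServiceAccount
Next, create a **Secret** for this ServiceAccount. Kubernetes populates it with a long-lived token.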
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-token
  namespace: prom
  annotations:
    kubernetes.io/service-account.name: "prometheus"
type: kubernetes.io/service-account-token
Save the YAML above as **prometheus-token-secret.yaml** and apply it:
kubectl apply -f prometheus-token-secret.yaml
3. Retrieve the long-lived token
After the Secret is applied, Kubernetes automatically generates a long-lived token for the **prometheus** ServiceAccount. Retrieve it with:
kubectl get secret prometheus-token -n prom -o go-template='{{.data.token | base64decode}}'
The command prints a long string, which is the token.
Store the token in a file:
mkdir -p /etc/prometheus/token
kubectl get secret prometheus-token -n prom -o go-template='{{.data.token | base64decode}}' > /etc/prometheus/token/prometheus_bearer_token
- Note the address of the Kubernetes API server:
kubectl cluster-info
Create the CA file:
mkdir -p /etc/prometheus/certs
kubectl get configmap -n kube-system kube-root-ca.crt -o jsonpath='{.data.ca\.crt}' > /etc/prometheus/certs/ca.crt
chown prometheus:prometheus /etc/prometheus/certs/ca.crt
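2. Configure kubernetes_sd_configs in Prometheus
In the Prometheus configuration file prometheus.yml, configure kubernetes_sd_configs with the credentials obtained above to scrape the metrics that exporters expose. Because Prometheus runs outside the Kubernetes cluster, it cannot easily reach addresses inside it. Exposing the exporters through a LoadBalancer or an Ingress is possible, but often unsuitable: both load-balance requests across backends, whereas scraping has to reach every instance individually. The configuration below therefore discovers targets through the API server and scrapes them through the API server's proxy endpoints.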
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  scrape_timeout: 10s
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# Alertmanager configuration
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - localhost:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
- rules/alert-rules-*.yml
- rules/record-rules-*.yml

scrape_configs:
  # Prometheus itself.
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ["localhost:9090"]

  # Kubernetes resource objects: nodes, pods, services, endpoints, ingresses, and so on.
  - job_name: 'kube-state-metrics'
    scheme: https
    metrics_path: /api/v1/namespaces/prom/services/kube-state-metrics:8080/proxy/metrics
    #metrics_path: /api/v1/namespaces/prom/services/kube-state-metrics:http/proxy/metrics
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: pod
      # Credentials used for service discovery.
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    # Credentials used for the scrape itself.
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    # Send every scrape to the API server instead of the pod IP.
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # Note: the three rules below override the scheme, path, and address set above
    # whenever the corresponding prometheus.io/* annotation is present on the pod.
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
      action: replace
      target_label: __scheme__
      regex: (https?)
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2

  # Kubernetes API servers.
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443' # must be set when Prometheus runs outside the cluster
      role: endpoints
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https

  # kubelet cAdvisor.
  - job_name: 'kubernetes-cadvisor'
    honor_timestamps: true # use the timestamps returned by the target
    metrics_path: /metrics
    scheme: https
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: node
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace
    - source_labels: [__meta_kubernetes_node_name]
      separator: ;
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      action: replace

  # Pods.
  - job_name: 'k8s-pods-metrics'
    scheme: https
    kubernetes_sd_configs:
    - api_server: 'https://139.196.12.198:6443'
      role: pod
      bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
      tls_config:
        ca_file: /etc/prometheus/certs/ca.crt
    bearer_token_file: /etc/prometheus/token/prometheus_bearer_token
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
    relabel_configs:
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 139.196.12.198:6443
      action: replace
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
      action: replace
      target_label: __metrics_path__
      replacement: /api/v1/namespaces/${1}/pods/${2}/proxy/metrics
      regex: (.+);(.+)

  # Alertmanager.
  - job_name: 'alertmanager'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    metrics_path: /metrics
    static_configs:
    - targets:
      - localhost:9093
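To see what the replace rule in the k8s-pods-metrics job produces, it can help to emulate it outside Prometheus. Prometheus joins the source labels with the separator (; by default), matches the result against the fully anchored regex, and substitutes the capture groups into the replacement. The sketch below imitates that one rule with sed; relabel_pod_path is a hypothetical helper, and sed's extended regular expressions are only an approximation of the RE2 syntax Prometheus uses.

```shell
# relabel_pod_path NAMESPACE POD: emulate the rule that builds __metrics_path__
# from __meta_kubernetes_namespace and __meta_kubernetes_pod_name.
relabel_pod_path() (
  # Join the source label values with the default separator ';'.
  joined="$1;$2"
  # Anchored match of (.+);(.+); print the rewritten path only when it matches.
  # In Prometheus a non-matching replace rule leaves the label unchanged;
  # this sketch prints nothing in that case.
  printf '%s\n' "$joined" |
    sed -nE 's#^(.+);(.+)$#/api/v1/namespaces/\1/pods/\2/proxy/metrics#p'
)
```

Calling relabel_pod_path prom my-pod yields the proxy path for the pod my-pod in namespace prom (both names are placeholders), which is the path Prometheus requests from the API server address set by the first rule.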
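2.1 Optional: filter targets with relabel_configs
You can use relabel_configs to further filter or modify the discovered targets. For example, the rule below keeps only targets whose Kubernetes node label name has the value node-exporter. With the pod or service roles, match on __meta_kubernetes_pod_label_<labelname> or __meta_kubernetes_service_name instead.
relabel_configs:
- source_labels: [__meta_kubernetes_node_label_name]
  action: keep
  regex: node-exporter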
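One detail of the keep action is easy to miss: Prometheus anchors the regex at both ends, so node-exporter matches only that exact value. The sketch below imitates this with grep -x, which also requires a whole-line match; relabel_keep is a hypothetical helper, and grep's extended regular expressions only approximate RE2.

```shell
# relabel_keep VALUE REGEX: succeed (exit 0) if REGEX matches the whole VALUE,
# mirroring how a keep rule decides whether a target survives.
relabel_keep() {
  printf '%s\n' "$1" | grep -Eqx -- "$2"
}

# Try a few label values against the regex from the rule above.
for value in node-exporter node-exporter-2 kube-state-metrics; do
  if relabel_keep "$value" 'node-exporter'; then
    echo "keep $value"
  else
    echo "drop $value"
  fi
done
```

Only the exact value node-exporter is kept here; to keep node-exporter-2 as well, the regex would have to be widened, for example to node-exporter.*.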
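3. Verify the configuration
- Make sure the Prometheus configuration file is syntactically valid, then restart the Prometheus service.
- In the Prometheus web UI (http://<prometheus-server>/targets), check that the node-exporter targets were discovered.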
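The same check can be scripted. Prometheus serves the target list as JSON at /api/v1/targets, where each active target carries a health field. The sketch below counts the targets reported as up; count_up_targets is a hypothetical helper, and it assumes the response is compact JSON without whitespace between keys and values, which is worth confirming against your server's output.

```shell
# count_up_targets: read a /api/v1/targets response on stdin and print
# how many targets report "health":"up".
count_up_targets() {
  # grep -o prints one line per match; wc -l counts them.
  grep -o '"health":"up"' | wc -l | tr -d ' '
}
```

A possible invocation is curl -s "http://<prometheus-server>/api/v1/targets" | count_up_targets. Before restarting, the configuration syntax can be checked with promtool check config /etc/prometheus/prometheus.yml, which ships with Prometheus.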
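To check a kubelet's cAdvisor endpoint by hand, query it directly with the token (replace the placeholder with the real token; -k skips TLS verification):
curl -k -H "Authorization: Bearer xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" https://10.0.0.100:10250/metrics/cadvisor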
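The curl command above talks to the kubelet on port 10250 directly, which an external Prometheus often cannot do. A closer match to what the scrape jobs request is the API server proxy path. The helpers below only build those paths, following the patterns used in prometheus.yml; the function names and the APISERVER and NODE variables are placeholders introduced for this sketch.

```shell
# Build API-server proxy paths in the same shape as the scrape configuration.
node_cadvisor_path() {
  printf '/api/v1/nodes/%s/proxy/metrics/cadvisor\n' "$1"
}
service_metrics_path() {
  # Arguments: namespace, service name, port (number or port name).
  printf '/api/v1/namespaces/%s/services/%s:%s/proxy/metrics\n' "$1" "$2" "$3"
}

# Example invocation (not executed here); APISERVER and NODE must be set first:
# curl --cacert /etc/prometheus/certs/ca.crt \
#   -H "Authorization: Bearer $(cat /etc/prometheus/token/prometheus_bearer_token)" \
#   "${APISERVER}$(node_cadvisor_path "${NODE}")"
```

Using --cacert with the CA file saved earlier verifies the API server certificate instead of skipping verification with -k.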