Installing Prometheus on Kubernetes to Monitor Cluster State

Introduction to Prometheus

Prometheus is a monitoring system originally built at SoundCloud. It became a community open-source project in 2012 and has a very active developer and user community. To underline its open-source nature and independent governance, Prometheus joined the CNCF in 2016 as the second hosted project after Kubernetes.
Official website: https://prometheus.io/

Prometheus Features

  • Multi-dimensional data model: time series identified by a metric name and key/value label pairs
  • Built-in time series database (TSDB)
  • PromQL: a flexible query language that leverages the multi-dimensional data for complex queries (see the sketch after this list)
  • Pull-based collection of time series data over HTTP (exporters)
  • Push-based collection also supported via the PushGateway component
  • Targets discovered through service discovery or static configuration
  • Multiple graphing modes and dashboard support
  • Usable as a data source for Grafana
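
As a small taste of PromQL, the query below sums the up metric by job over the HTTP API, i.e. it counts healthy scrape targets per job. It assumes the prometheus.od.com ingress that is set up later in this article; any reachable Prometheus server works:

curl -G 'http://prometheus.od.com/api/v1/query' \
  --data-urlencode 'query=sum(up) by (job)'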

Prometheus Architecture

Exporters (custom exporters can be developed)

  • Expose an HTTP interface
  • Define metrics and their labels (dimensions)
  • Organize the monitoring data in a fixed structure
  • Collected as time series (see the sample below)
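
The exposition format itself is plain text, one sample per line in the form metric_name{label="value"} value. Once the node-exporter installed later in this article is running, you can see it directly; the numbers in the comments are illustrative:

curl -s localhost:9100/metrics | grep '^node_cpu' | head -3
# node_cpu_seconds_total{cpu="0",mode="idle"} 172034.3
# node_cpu_seconds_total{cpu="0",mode="iowait"} 51.2
# node_cpu_seconds_total{cpu="0",mode="irq"} 0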

Prometheus Server

  • Retrieval (the data collector)
  • TSDB (the time series database)
  • Configuration (static_config, kubernetes_sd, file_sd)
  • HTTP Server

Grafana

  • A wide variety of plugins
  • Data sources (Prometheus)
  • Dashboards (PromQL)

Alertmanager

  • rules.yml (PromQL) — a minimal example follows
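
As a minimal sketch of what rules.yml contains, the heredoc below writes a single alerting rule and validates it with promtool (which ships with Prometheus). The file name, threshold, and labels are illustrative assumptions; wiring up Alertmanager itself is out of scope for this article:

cat > rules.yml <<'EOF'
groups:
- name: example
  rules:
  - alert: TargetDown
    expr: up == 0            # PromQL: any scrape target that is down
    for: 5m                  # must stay down for 5 minutes before firing
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} of job {{ $labels.job }} is down"
EOF
promtool check rules rules.yml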

Prometheus vs. Zabbix

| Prometheus | Zabbix |
| --- | --- |
| Backend written in Go; customization is relatively easy | Backend written in C with a PHP web UI; customization is very hard |
| Better suited to cloud environments, with especially good Kubernetes support | Better suited to monitoring physical machines and VMs |
| Metrics stored in a time series database, so new aggregations over existing data are easy | Metrics stored in a relational database, making it hard to derive new dimensions from existing data |
| Its own UI is relatively weak and much configuration lives in config files, but Grafana covers the graphing | Mature graphical interface; nearly all configuration can be done through the UI |
| Supports larger clusters and is faster | Cluster size tops out at around 10,000 nodes |
| Rapid growth since 2015, an active community, and a widening range of use cases | A longer history, with off-the-shelf solutions for many monitoring scenarios |

Prometheus Component Installation and Deployment

Installing the Exporters

On the ops host k8s-dns.boysec.cn:

Purpose: monitor the basic running state of the Kubernetes cluster.

Download the kube-state-metrics image:

docker pull quay.io/coreos/kube-state-metrics:v1.9.7
docker tag quay.io/coreos/kube-state-metrics:v1.9.7 harbor.od.com/public/kube-state-metrics:v1.9.7
docker push harbor.od.com/public/kube-state-metrics:v1.9.7

Prepare the resource manifests

[root@k8s-dns ~]# mkdir /var/k8s-yaml/kube-state-metrics
[root@k8s-dns ~]# cd /var/k8s-yaml/kube-state-metrics

vim /var/k8s-yaml/kube-state-metrics/deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  labels:
    grafanak8sapp: "true"
    app: kube-state-metrics
  name: kube-state-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      grafanak8sapp: "true"
      app: kube-state-metrics
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        grafanak8sapp: "true"
        app: kube-state-metrics
    spec:
      containers:
      - name: kube-state-metrics
        image: harbor.od.com/public/kube-state-metrics:v1.9.7
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http-metrics
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
      serviceAccountName: kube-state-metrics

vim /var/k8s-yaml/kube-state-metrics/rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system

Apply the resource manifests

Run on any Kubernetes worker node:

[root@k8s-node01 ~]# kubectl apply -f http://k8s-yaml.od.com/kube-state-metrics/rbac.yaml
[root@k8s-node01 ~]# kubectl apply -f http://k8s-yaml.od.com/kube-state-metrics/deployment.yaml
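
To confirm kube-state-metrics came up and is serving metrics, check the pod and curl its endpoints from a node. The pod IP below is a placeholder; substitute the one kubectl prints:

[root@k8s-node01 ~]# kubectl -n kube-system get pods -o wide -l app=kube-state-metrics
[root@k8s-node01 ~]# curl -s <POD_IP>:8080/healthz
[root@k8s-node01 ~]# curl -s <POD_IP>:8080/metrics | head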

Installing node-exporter

Purpose: monitor worker-node resources.
It collects machine-level metrics from physical machines, VMs, cloud hosts, and so on, including CPU, memory, disk, network, and open-file counts.

Download the image

docker pull prom/node-exporter:v1.0.0
docker tag prom/node-exporter:v1.0.0 harbor.od.com/public/node-exporter:v1.0
docker push harbor.od.com/public/node-exporter:v1.0

Prepare the resource manifests

[root@k8s-dns ]# mkdir /var/k8s-yaml/node-exporter
[root@k8s-dns ]# cd /var/k8s-yaml/node-exporter

vim /var/k8s-yaml/node-exporter/daemonset.yaml

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    daemon: "node-exporter"
    grafanak8sapp: "true"
spec:
  selector:
    matchLabels:
      daemon: "node-exporter"
      grafanak8sapp: "true"
  template:
    metadata:
      name: node-exporter
      labels:
        daemon: "node-exporter"
        grafanak8sapp: "true"
    spec:
      volumes:
      - name: proc
        hostPath:
          path: /proc
          type: ""
      - name: sys
        hostPath:
          path: /sys
          type: ""
      containers:
      - name: node-exporter
        image: harbor.od.com/public/node-exporter:v1.0
        imagePullPolicy: IfNotPresent
        args:
        - --path.procfs=/host_proc
        - --path.sysfs=/host_sys
        ports:
        - name: node-exporter
          hostPort: 9100
          containerPort: 9100
          protocol: TCP
        volumeMounts:
        - name: sys
          readOnly: true
          mountPath: /host_sys
        - name: proc
          readOnly: true
          mountPath: /host_proc
      hostNetwork: true

Apply the resource manifests

Run on any Kubernetes worker node:

[root@k8s-node01 ~]# kubectl apply -f http://k8s-yaml.od.com/node-exporter/daemonset.yaml

Check the node-exporter listening port:

[root@k8s-node01 ~]# netstat -lntup |grep 9100
tcp6 0 0 :::9100 :::* LISTEN 19339/node_exporter

Check the metrics endpoint

curl localhost:9100/metrics

The cadvisor service

Purpose: real-time monitoring and performance-data collection for both the node and its containers, covering CPU usage, memory usage, network throughput, and filesystem usage.

docker pull google/cadvisor:v0.28.3
docker tag google/cadvisor:v0.28.3 harbor.od.com/public/cadvisor:v0.28.3
docker push harbor.od.com/public/cadvisor:v0.28.3

Prepare the resource manifests

[root@k8s-dns ~]# mkdir /var/k8s-yaml/cadvisor
[root@k8s-dns ~]# cd /var/k8s-yaml/cadvisor

vim /var/k8s-yaml/cadvisor/deployment.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: kube-system
  labels:
    app: cadvisor
spec:
  selector:
    matchLabels:
      name: cadvisor
  template:
    metadata:
      labels:
        name: cadvisor
    spec:
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: cadvisor
        image: harbor.od.com/public/cadvisor:v0.28.3
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker
          readOnly: true
        ports:
        - name: http
          containerPort: 4194
          protocol: TCP
        readinessProbe:
          tcpSocket:
            port: 4194
          initialDelaySeconds: 5
          periodSeconds: 10
        args:
        - --housekeeping_interval=10s
        - --port=4194
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /data/docker

Apply the resource manifests

Fix the cgroup symlink on the worker nodes: this version of cAdvisor reads CPU accounting from /sys/fs/cgroup/cpuacct,cpu, while the host mounts it as cpu,cpuacct, so remount cgroupfs read-write and add a symlink.
On all worker nodes:

mount -o remount,rw /sys/fs/cgroup/
ln -s /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu

Run on any Kubernetes worker node:

kubectl apply -f http://k8s-yaml.od.com/cadvisor/deployment.yaml
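
Because cadvisor runs with hostNetwork, every node now serves container metrics on port 4194; a quick sanity check from any node:

curl -s localhost:4194/metrics | grep '^container_cpu' | head -3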

Installing blackbox-exporter

Purpose: monitor whether business containers are alive.

blackbox_exporter is one of the exporters officially provided by Prometheus; it collects probe data over HTTP, DNS, TCP, and ICMP.

docker pull prom/blackbox-exporter:v0.15.1
docker tag prom/blackbox-exporter:v0.15.1 harbor.od.com/public/blackbox-exporter:v0.15.1
docker push harbor.od.com/public/blackbox-exporter:v0.15.1

Prepare the resource manifests

[root@k8s-dns ~]# mkdir /var/k8s-yaml/blackbox-export
[root@k8s-dns ~]# cd /var/k8s-yaml/blackbox-export

vim /var/k8s-yaml/blackbox-export/deployment.yaml

kind: Deployment
apiVersion: apps/v1
metadata:
  name: blackbox-exporter
  namespace: kube-system
  labels:
    app: blackbox-exporter
  annotations:
    deployment.kubernetes.io/revision: "21"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blackbox-exporter
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      volumes:
      - name: config
        configMap:
          name: blackbox-exporter
          defaultMode: 420
      containers:
      - name: blackbox-exporter
        image: harbor.od.com/public/blackbox-exporter:v0.15.1
        imagePullPolicy: IfNotPresent
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --log.level=info
        - --web.listen-address=:9115
        ports:
        - name: blackbox-port
          containerPort: 9115
          protocol: TCP
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 50Mi
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
        readinessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3

vim /var/k8s-yaml/blackbox-export/configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: blackbox-exporter
  name: blackbox-exporter
  namespace: kube-system
data:
  blackbox.yml: |-
    modules:
      http_2xx:
        prober: http
        timeout: 2s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2"]
          valid_status_codes: [200,301,302]
          method: GET
          preferred_ip_protocol: "ip4"
      tcp_connect:
        prober: tcp
        timeout: 2s

vim /var/k8s-yaml/blackbox-export/svc.yaml

kind: Service
apiVersion: v1
metadata:
  name: blackbox-exporter
  namespace: kube-system
spec:
  selector:
    app: blackbox-exporter
  ports:
  - name: blackbox-port
    protocol: TCP
    port: 9115

vim /var/k8s-yaml/blackbox-export/ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: blackbox-exporter
  namespace: kube-system
spec:
  rules:
  - host: blackbox.od.com
    http:
      paths:
      - path: /
        backend:
          serviceName: blackbox-exporter
          servicePort: blackbox-port

Configure DNS resolution

vi /var/named/chroot/etc/od.com.zone 
...
blackbox A 10.1.1.50

Apply the resource manifests

Run on any Kubernetes worker node:

kubectl apply -f http://k8s-yaml.od.com/blackbox-export/configmap.yaml
kubectl apply -f http://k8s-yaml.od.com/blackbox-export/deployment.yaml
kubectl apply -f http://k8s-yaml.od.com/blackbox-export/svc.yaml
kubectl apply -f http://k8s-yaml.od.com/blackbox-export/ingress.yaml

Open in a browser:
http://blackbox.od.com/
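
You can also exercise a probe module by hand through the /probe endpoint. The sketch below runs the http_2xx module against the exporter's own ingress; the target is only illustrative, and probe_success 1 means the probe passed:

curl -s 'http://blackbox.od.com/probe?module=http_2xx&target=blackbox.od.com' | grep '^probe_success'
# probe_success 1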

Deploying the Prometheus server

Download the image

docker pull prom/prometheus:v2.22.0
docker tag prom/prometheus:v2.22.0 harbor.od.com/infra/prometheus:v2.22.0
docker push harbor.od.com/infra/prometheus:v2.22.0

Prepare the Prometheus configuration files

mkdir -pv /data/nfs-volume/prometheus/{etc,prom-db}
cd /data/nfs-volume/prometheus/etc/

cp /opt/certs/ca.pem .
cp -a /opt/certs/client.pem .
cp -a /opt/certs/client-key.pem .

The Prometheus configuration file

vim /data/nfs-volume/prometheus/etc/prometheus.yml

global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
- job_name: 'etcd'
  tls_config:
    ca_file: /data/etc/ca.pem
    cert_file: /data/etc/client.pem
    key_file: /data/etc/client-key.pem
  scheme: https
  static_configs:
  - targets:
    - '10.1.1.100:2379'
    - '10.1.1.110:2379'
    - '10.1.1.120:2379'
- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  - role: endpoints
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'kubernetes-kubelet'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __address__
    replacement: ${1}:10255
- job_name: 'kubernetes-cadvisor'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __address__
    replacement: ${1}:4194
- job_name: 'kubernetes-kube-state'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
  - source_labels: [__meta_kubernetes_pod_label_grafanak8sapp]
    regex: .*true.*
    action: keep
  - source_labels: ['__meta_kubernetes_pod_label_daemon', '__meta_kubernetes_pod_node_name']
    regex: 'node-exporter;(.*)'
    action: replace
    target_label: nodename
- job_name: 'blackbox_http_pod_probe'
  metrics_path: /probe
  kubernetes_sd_configs:
  - role: pod
  params:
    module: [http_2xx]
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_blackbox_scheme]
    action: keep
    regex: http
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_blackbox_port, __meta_kubernetes_pod_annotation_blackbox_path]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+);(.+)
    replacement: $1:$2$3
    target_label: __param_target
  - action: replace
    target_label: __address__
    replacement: blackbox-exporter.kube-system:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'blackbox_tcp_pod_probe'
  metrics_path: /probe
  kubernetes_sd_configs:
  - role: pod
  params:
    module: [tcp_connect]
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_blackbox_scheme]
    action: keep
    regex: tcp
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_blackbox_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __param_target
  - action: replace
    target_label: __address__
    replacement: blackbox-exporter.kube-system:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'traefik'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    action: keep
    regex: traefik
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
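
Prometheus only reads this file at startup (or on SIGHUP), and a malformed file keeps the server from starting, so validate it after every edit. A sketch using promtool, which ships inside the image, once the pod created below is running:

POD=$(kubectl -n infra get pod -l app=prometheus -o jsonpath='{.items[0].metadata.name}')
kubectl -n infra exec $POD -- promtool check config /data/etc/prometheus.yml
# reload the configuration without restarting the pod:
kubectl -n infra exec $POD -- kill -HUP 1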

Prepare the resource manifests

[root@k8s-dns ]# mkdir /var/k8s-yaml/prometheus
[root@k8s-dns ]# cd /var/k8s-yaml/prometheus

vim /var/k8s-yaml/prometheus/deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "5"
  labels:
    name: prometheus
  name: prometheus
  namespace: infra
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 7
  selector:
    matchLabels:
      app: prometheus
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      nodeName: k8s-node01
      containers:
      - name: prometheus
        image: harbor.od.com/infra/prometheus:v2.22.0
        imagePullPolicy: IfNotPresent
        command:
        - /bin/prometheus
        args:
        - --config.file=/data/etc/prometheus.yml
        - --storage.tsdb.path=/data/prom-db
        - --storage.tsdb.retention=72h
        - --storage.tsdb.min-block-duration=10m
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: data
        resources:
          requests:
            cpu: "1000m"
            memory: "1.5Gi"
          limits:
            cpu: "2000m"
            memory: "3Gi"
      imagePullSecrets:
      - name: harbor
      securityContext:
        runAsUser: 0
      serviceAccountName: prometheus
      volumes:
      - name: data
        nfs:
          server: k8s-dns
          path: /data/nfs-volume/prometheus

vim /var/k8s-yaml/prometheus/rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
  namespace: infra
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: infra

vim /var/k8s-yaml/prometheus/svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: infra
spec:
  ports:
  - port: 9090
    protocol: TCP
    targetPort: 9090
  selector:
    app: prometheus

vim /var/k8s-yaml/prometheus/ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: traefik
  name: prometheus
  namespace: infra
spec:
  rules:
  - host: prometheus.od.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus
          servicePort: 9090

Configure DNS resolution

vi /var/named/chroot/etc/od.com.zone 
...
prometheus A 10.1.1.50

Apply the resource manifests

Run on any Kubernetes worker node:

kubectl apply -f http://k8s-yaml.od.com/prometheus/rbac.yaml
kubectl apply -f http://k8s-yaml.od.com/prometheus/deployment.yaml
kubectl apply -f http://k8s-yaml.od.com/prometheus/svc.yaml
kubectl apply -f http://k8s-yaml.od.com/prometheus/ingress.yaml

Open in a browser:
http://prometheus.od.com
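
Besides the Status → Targets page in the web UI, the HTTP API can confirm that the scrape jobs came up; this one-liner counts targets by health:

curl -s 'http://prometheus.od.com/api/v1/targets' | grep -o '"health":"[a-z]*"' | sort | uniq -c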

Adding probe targets to blackbox

Blackbox monitors whether a business service is alive.

Add the following annotations to the dubbo-demo-service pod controller and restart its pods; monitoring then takes effect.

TCP

"annotations": {
"blackbox_port": "20880",
"blackbox_scheme": "tcp"
}

HTTP

"annotations": {
"blackbox_path": "/hello?name=health",
"blackbox_port": "8080",
"blackbox_scheme": "http"
}

JVM

"annotations": {
"prometheus_io_scrape": "true",
"prometheus_io_port": "12346",
"prometheus_io_path": "/"
}
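
Instead of editing the controller JSON by hand, the same annotations can be applied with kubectl patch, which updates the pod template and triggers the restart in one step. A sketch for the TCP case; the deployment name dubbo-demo-service comes from this environment, and the app namespace is an assumption:

kubectl -n app patch deployment dubbo-demo-service \
  -p '{"spec":{"template":{"metadata":{"annotations":{"blackbox_scheme":"tcp","blackbox_port":"20880"}}}}}'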

A quick test: monitoring traefik-ingress with Prometheus

Add the following annotations to the traefik pod controller and restart its pods; monitoring then takes effect.

Editing the JSON:

"annotations": {
"prometheus_io_scheme": "traefik",
"prometheus_io_path": "/metrics",
"prometheus_io_port": "8080"
}

Editing the YAML:

annotations:
  prometheus_io_path: /metrics
  prometheus_io_port: '8080'
  prometheus_io_scheme: traefik

Modify the DaemonSet
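
A patch sketch; the DaemonSet name traefik-ingress and the kube-system namespace are assumptions from a typical deployment, so adjust them to yours:

kubectl -n kube-system patch daemonset traefik-ingress \
  -p '{"spec":{"template":{"metadata":{"annotations":{"prometheus_io_scheme":"traefik","prometheus_io_path":"/metrics","prometheus_io_port":"8080"}}}}}'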

Check the result in Prometheus