Prometheus Monitoring Powerhouse: Kubernetes (Part 1)

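Manually deploy Prometheus and Alertmanager clusters as StatefulSets in Kubernetes, using a StorageClass to persist data.
This article uses a StorageClass for persistence and builds a federated Prometheus cluster from StatefulSets. There are many options for data persistence, such as Thanos, M3DB, InfluxDB, and VictoriaMetrics; choose one according to your own needs. The details of data persistence are covered later in the series.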

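To deploy a Prometheus that is reachable from outside the cluster, first create the Namespace that Prometheus runs in, then create the RBAC rules Prometheus uses, and create a ConfigMap to hold the Prometheus configuration file.
Next, create a Service bound to a fixed cluster IP, create the StatefulSet that runs the stateful Prometheus Pods, and finally create an Ingress so that Prometheus can be reached through an external domain name.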

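If your Kubernetes version is fairly old, you can upgrade it to make testing easier: the sealos automated deployment tool brings up a highly available cluster with a single command. Whether to deploy kuboard as well depends on your own needs.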

Environment

My local environment was deployed with sealos in one step, mainly to make testing easier.

OS            Kubernetes  HostName          IP             Service
Ubuntu 18.04  1.17.7      sealos-k8s-m1     192.168.1.151  node-exporter prometheus-federate-0
Ubuntu 18.04  1.17.7      sealos-k8s-m2     192.168.1.152  node-exporter grafana alertmanager-0
Ubuntu 18.04  1.17.7      sealos-k8s-m3     192.168.1.150  node-exporter alertmanager-1
Ubuntu 18.04  1.17.7      sealos-k8s-node1  192.168.1.153  node-exporter prometheus-0 kube-state-metrics
Ubuntu 18.04  1.17.7      sealos-k8s-node2  192.168.1.154  node-exporter prometheus-1
Ubuntu 18.04  1.17.7      sealos-k8s-node3  192.168.1.155  node-exporter prometheus-2

Label the master and worker nodes

prometheus

kubectl label node sealos-k8s-node1 k8s-app=prometheus
kubectl label node sealos-k8s-node2 k8s-app=prometheus
kubectl label node sealos-k8s-node3 k8s-app=prometheus

federate

kubectl label node sealos-k8s-m1 k8s-app=prometheus-federate

alertmanager

kubectl label node sealos-k8s-m2 k8s-app=alertmanager
kubectl label node sealos-k8s-m3 k8s-app=alertmanager
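The labeling commands above can also be driven from one place, which helps when the node list changes. The following is a minimal sketch rather than part of the original setup: the function names `label_nodes` and `label_monitoring_nodes` and the `KUBECTL` variable are assumptions introduced here for illustration, while the node names and label values are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Apply the k8s-app label to every node given after the label value.
# Usage: label_nodes <label-value> <node>...
label_nodes() {
  app="$1"
  shift
  for node in "$@"; do
    # --overwrite keeps the command repeatable when the label already exists.
    "$KUBECTL" label node "$node" "k8s-app=$app" --overwrite
  done
}

# Label all nodes with the roles used in this article.
label_monitoring_nodes() {
  label_nodes prometheus sealos-k8s-node1 sealos-k8s-node2 sealos-k8s-node3
  label_nodes prometheus-federate sealos-k8s-m1
  label_nodes alertmanager sealos-k8s-m2 sealos-k8s-m3
}
```

Calling `label_monitoring_nodes` applies all six labels. Afterwards, `kubectl get nodes -l k8s-app=prometheus` should list the three Prometheus nodes.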

Create the deployment directories

mkdir -p /data/manual-deploy/ && cd /data/manual-deploy/
mkdir alertmanager grafana ingress-nginx kube-state-metrics node-exporter prometheus

Deploy Prometheus

Create the StorageClass manifest for Prometheus

cat prometheus-data-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-lpv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Create the PV manifests for the Prometheus StorageClass. Each PV is pinned to the node it will be scheduled on.

On each node where Prometheus will be scheduled, create the data directory and set its ownership

mkdir -p /data/prometheus
chown -R 65534:65534 /data/prometheus

cat prometheus-federate-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-lpv-0
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-lpv
  local:
    path: /data/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-node1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-lpv-1
spec:
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-lpv
  local:
    path: /data/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-node2
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-lpv-2
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-lpv
  local:
    path: /data/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-node3
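Because the three PV definitions differ only in name, capacity, and node, they can also be generated from a template. This is a rough sketch under that assumption, not the author's method: `render_pv` and `render_all_pvs` are hypothetical helper names, and the field values are copied from the manifests above.

```shell
#!/bin/sh
# Hypothetical helper: print one local PersistentVolume manifest.
# Usage: render_pv <index> <node-hostname> <capacity>
render_pv() {
  cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-lpv-$1
spec:
  capacity:
    storage: $3
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-lpv
  local:
    path: /data/prometheus
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - $2
EOF
}

# Print the three PVs used in this article, one per Prometheus node.
render_all_pvs() {
  render_pv 0 sealos-k8s-node1 10Gi
  render_pv 1 sealos-k8s-node2 20Gi
  render_pv 2 sealos-k8s-node3 10Gi
}
```

The output can be reviewed first and then applied with `render_all_pvs | kubectl apply -f -`.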
Create the RBAC manifest for Prometheus.

cat prometheus-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1 # API version
kind: ClusterRole # resource type
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: # resources Prometheus may read
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus # custom name
  namespace: kube-system # namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef: # the role to bind
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin # full cluster access; the prometheus ClusterRole above is the narrower option
subjects: # accounts that receive the role
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system
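Once the RBAC objects are applied, the permissions of the ServiceAccount can be checked by impersonating it with `kubectl auth can-i`. The sketch below is only an illustration: the function name `check_prometheus_rbac` and the `KUBECTL` variable are assumptions introduced here, and the resources checked are the ones listed in the ClusterRole above.

```shell
#!/bin/sh
# Hypothetical helper: KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Ask the API server whether the prometheus ServiceAccount may read
# the resources that the scrape jobs depend on.
check_prometheus_rbac() {
  sa="system:serviceaccount:kube-system:prometheus"
  for resource in nodes services endpoints pods; do
    "$KUBECTL" auth can-i list "$resource" --as="$sa"
  done
  # Non-resource URL from the ClusterRole rules.
  "$KUBECTL" auth can-i get /metrics --as="$sa"
}
```

Each check prints `yes` or `no`. With the binding to cluster-admin shown above, every line is expected to be `yes`.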
Create the ConfigMap manifest for Prometheus.

cat prometheus-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
      external_labels:
        cluster: "01"
    scrape_configs:
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      metric_relabel_configs:
      - action: replace
        source_labels: [id]
        regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
        target_label: rkt_container_name
        replacement: '${2}-${1}'
      - action: replace
        source_labels: [id]
        regex: '^/system\.slice/(.+)\.service$'
        target_label: systemd_service_name
        replacement: '${1}'
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?