k8s集群安装prometheus

安装准备

k8s集群：最好3节点，单节点也可以。

[root@k8s-master01 charts]# kubectl get nodes -o wide
NAME           STATUS   ROLES    AGE    VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION           CONTAINER-RUNTIME
k8s-master01   Ready    master   359d   v1.18.15   192.168.3.180   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.5
k8s-master02   Ready    master   359d   v1.18.15   192.168.3.181   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.5
k8s-master03   Ready    master   359d   v1.18.15   192.168.3.182   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.5
k8s-node01     Ready    node     357d   v1.18.15   192.168.3.183   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.5
k8s-node02     Ready    node     357d   v1.18.15   192.168.3.184   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.55

k8s集群已安装helm和nginx-ingress

[root@k8s-master01 ~]# helm version
version.BuildInfo{Version:"v3.5.0", GitCommit:"32c22239423b3b4ba6706d450bd044baffdcf9e6", GitTreeState:"clean", GoVersion:"go1.15.6"}
[root@k8s-master01 ~]# helm list|grep nginx
ingress-nginx   default         1               2021-02-02 19:35:03.810016289 +0800 CST deployed        ingress-nginx-3.22.0    0.43.0
[root@k8s-master01 ~]# kubectl get svc|grep nginx
ingress-nginx-controller             ClusterIP   10.110.65.147    <none>        80/TCP,443/TCP   359d
ingress-nginx-controller-admission   ClusterIP   10.106.137.172   <none>        443/TCP          359

k8s集群已安装存储类，并设置为默认存储

[root@k8s-master01 ~]# kubectl get storageclass
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  359d

这里我们选择用helm方式安装，我们选择的chart地址：https://prometheus-community.github.io/helm-charts/kube-prometheus-stack , 这个chart默认在 https://prometheus-community.github.io/helm-charts 上。

helm安装prometheus

配置helm仓库

[root@k8s-master01 ~]# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

helm安装prometheus

时下最新的版本是30.1.0，该版本默认安装有坑，暂时不考虑，我们这里我们选择安装版本13.4.1

[root@k8s-master01 ~]# kubectl create ns monitor
[root@k8s-master01 ~]# helm install prometheus prometheus-community/kube-prometheus-stack  --version=13.4.1 -n monitor
NAME: prometheus
LAST DEPLOYED: Fri Jan 28 07:49:38 2022
NAMESPACE: monitor
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace monitor get pods -l "release=prometheus"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
[root@k8s-master01 charts]#

通过上述命令，将以默认的配置在kubernetes中部署prometheus。默认情况下，chart会安装alertmamanger、grafana、prometheus、prometheus-operator、metrics这5个pod，另外会根据k8s集群节点数安装等数量的node-exporter。

确认安装状态

查看创建的pod

[root@k8s-master01 ~]# kubectl get po -n monitor
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          6m12s
prometheus-grafana-7d988fdbfd-rxkgp                      2/2     Running   0          6m25s
prometheus-kube-prometheus-operator-66c48678d6-fb7kv     1/1     Running   0          6m25s
prometheus-kube-state-metrics-c65b87574-848nt            1/1     Running   0          6m25s
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   0          6m12s
prometheus-prometheus-node-exporter-ltpm9                1/1     Running   0          6m25s
prometheus-prometheus-node-exporter-m5dsw                1/1     Running   0          6m25s
prometheus-prometheus-node-exporter-pl95j                1/1     Running   0          6m25s
prometheus-prometheus-node-exporter-qfflh                1/1     Running   0          6m25s
prometheus-prometheus-node-exporter-s7hx4                1/1     Running   0          6m25s

查看创建的svc

[root@k8s-master01 ~]# kubectl get svc -n monitor
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   13m
prometheus-grafana                        ClusterIP   10.105.120.240   <none>        80/TCP                       13m
prometheus-kube-prometheus-alertmanager   ClusterIP   10.102.236.253   <none>        9093/TCP                     13m
prometheus-kube-prometheus-operator       ClusterIP   10.106.0.53      <none>        443/TCP                      13m
prometheus-kube-prometheus-prometheus     ClusterIP   10.109.139.214   <none>        9090/TCP                     13m
prometheus-kube-state-metrics             ClusterIP   10.97.203.125    <none>        8080/TCP                     13m
prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     13m
prometheus-prometheus-node-exporter       ClusterIP   10.106.174.214   <none>        9100/TCP                     13m

查看创建的pv，pvc

[root@k8s-master01 ~]# kubectl get pv,pvc -n monitor   
No resources found in monitor namespace.

查看创建的ingress

[root@k8s-master01 ~]# kubectl get ingress -n monitor   
No resources found in monitor namespace.

启用ingress和持久化存储

默认安装情况下，该chart并没有对外开放访问接口，且没有启用持久化存储。我们需要设置一些参数，以启用ingress和持久化存储。

helm install prometheus prometheus-community/kube-prometheus-stack  --version=13.4.1 \
--set alertmanager.ingress.enabled=true \
--set alertmanager.ingress.hosts[0]=alertmanager.k8s.com \
--set alertmanager.ingress.paths[0]=/ \
--set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.resources.requests.storage=5Gi \
--set grafana.ingress.enabled=true \
--set grafana.ingress.hosts[0]=grafana.k8s.com \
--set grafana.ingress.paths[0]=/ \
--set prometheus.ingress.enabled=true \
--set prometheus.ingress.hosts[0]=prometheus.k8s.com \
--set prometheus.ingress.paths[0]=/ \
--set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=5Gi \
-n monitor

查看创建的pv，pvc

[root@k8s-master01 ~]# kubectl get pv,pvc -n monitor
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                                                                    STORAGECLASS   REASON   AGE
persistentvolume/pvc-5af2ea16-f8ec-4a8b-afc5-4f3dfe5787f4   5Gi        RWO            Delete           Bound    monitor/alertmanager-prometheus-kube-prometheus-alertmanager-db-alertmanager-prometheus-kube-prometheus-alertmanager-0   local-path              21m
persistentvolume/pvc-725f81b5-13ae-4f81-902d-9fccf6532663   5Gi        RWO            Delete           Bound    monitor/prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0           local-path              3m29s

NAME                                                                                                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/alertmanager-prometheus-kube-prometheus-alertmanager-db-alertmanager-prometheus-kube-prometheus-alertmanager-0   Bound    pvc-5af2ea16-f8ec-4a8b-afc5-4f3dfe5787f4   5Gi        RWO            local-path     22m
persistentvolumeclaim/prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0           Bound    pvc-725f81b5-13ae-4f81-902d-9fccf6532663   5Gi        RWO            local-path     4m11s

查看创建的ingress

[root@k8s-master01 ~]# kubectl get ingress -n monitor      
NAME                                      CLASS    HOSTS                  ADDRESS         PORTS   AGE
prometheus-grafana                        <none>   grafana.k8s.com        10.110.65.147   80      5m13s
prometheus-kube-prometheus-alertmanager   <none>   alertmanager.k8s.com   10.110.65.147   80      5m13s
prometheus-kube-prometheus-prometheus     <none>   prometheus.k8s.com     10.110.65.147   80      5m13s

该chart也支持其他设置，更多配置信息可以通过以下命令查看

helm show values prometheus-community/kube-prometheus-stack  --version=13.4.1

访问prometheus相关系统

ip域名映射处理

在访问prometheus、grafana和alertmanage系统之前，我们需要在使用客户端访问电脑建立ip和相关域名的映射关系如下：

192.168.3.210	grafana.k8s.com	alertmanager.k8s.com prometheus.k8s.com

其中192.168.3.210这个IP可以通过下面这条命令获得

[root@k8s-master01 ~]# kubectl get configmap cluster-info -n kube-public -o yaml|grep server
        server: https://192.168.3.210:6444

提示：win10或win7电脑中维护ip和相关域名的文件位置在 C:\Windows\System32\drivers\etc\hosts。

访问prometheus

访问 http://prometheus.k8s.com

访问grafana

访问 http://grafana.k8s.com

默认用户名admin，密码是prom-operator，其中密码也可以通过以下命令获得

[root@k8s-master01 ~]# kubectl get secrets -n monitor prometheus-grafana  -o yaml|grep admin-password|grep -v '{}'|awk '{print $2}'|base64 -d
prom-operator[root@k8s-master01 ~]#

输入用户名和密码，登录后进入首页如下

从首页我们点击左边栏的Dashboards按钮

显示页面如下

点开上图“Default”文件夹，我们可以看到已经内置了很多面板如下，感兴趣的可以点开看看

访问alertmanager

访问 http://alertmanager.k8s.com

总结

到这里，我们的监控prometheus系统就算是部署成功了，k8s集群部署了prometheus监控，就像给我们开发者安装了火眼金睛，从此k8s集群所有的服务器节点状态、所有的CPU、内存、存储资源、网络使用情况全都可以一览无遗；另外，我们除了可以实时监控所有服务器节点运行的pod、job状态，还能够基于prometheus进行扩展，实现更多的监控场景。

如何正确使用k8s集群，让我们先从安装监控系统prometheus开始。

目录CONTENT