Deploying Prometheus on a Single CentOS Host
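This document is archived. It is not guaranteed to still be valid and is provided for reference only.
Environment
- Operating system: CentOS Linux release 7.9.2009 (Core)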
- Prometheus Version: 2.25.0
Prometheus
Download the package
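A minimal sketch of this step, assuming the upstream GitHub release naming and the /usr/local/prometheus install path used by the service unit later in this document. The helper names `prometheus_release_url` and `install_prometheus` are hypothetical.

```shell
#!/bin/sh
# Version taken from the environment section; override through the environment if needed.
PROM_VERSION="${PROM_VERSION:-2.25.0}"

# Hypothetical helper: print the release tarball URL for a given version.
prometheus_release_url() {
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.linux-amd64.tar.gz\n' "$1" "$1"
}

# Hypothetical helper: download, unpack and move into place (run as root).
install_prometheus() {
    url=$(prometheus_release_url "$PROM_VERSION")
    wget -q "$url" -O "/tmp/prometheus-$PROM_VERSION.tar.gz" || return 1
    tar -xzf "/tmp/prometheus-$PROM_VERSION.tar.gz" -C /tmp || return 1
    mv "/tmp/prometheus-$PROM_VERSION.linux-amd64" /usr/local/prometheus
}

# Only the URL is printed here; call install_prometheus explicitly to install.
prometheus_release_url "$PROM_VERSION"
```

Keeping the URL construction separate from the install step makes it easy to check the link in a browser before anything is written to /usr/local.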
Add to the PATH environment variable
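A minimal sketch of this step, assuming Prometheus was unpacked to /usr/local/prometheus (the path used by the service unit later in this document). The helper `add_path_entry` is hypothetical; it appends the export line only if it is not already present, so running it twice does not duplicate the entry.

```shell
#!/bin/sh
# Hypothetical helper: append an export line to a profile file only once.
add_path_entry() {
    dir=$1
    profile=$2
    line="export PATH=$dir:\$PATH"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}

# Demonstrated on a scratch file; on a real host the second argument would be /etc/profile.
demo_profile=$(mktemp)
add_path_entry /usr/local/prometheus "$demo_profile"
add_path_entry /usr/local/prometheus "$demo_profile"   # second call changes nothing
cat "$demo_profile"
```

After editing /etc/profile, run `source /etc/profile` or log in again for the change to take effect in the current shell.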
Run as a service
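In the example above, Prometheus listens on 127.0.0.1:9091. That address cannot be reached from outside the host, and by default Prometheus applies no encryption or authentication. This section demonstrates access through an nginx virtual host that proxies requests and requires a username and password. Many other tools can do the same job; they are not covered here and can be found with a search engine such as Baidu.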
    yum install httpd-tools -y
    htpasswd -c /etc/prometheus/.auth admin   # interactively create the admin account and password

    vim /usr/local/nginx/conf/vhost/prometheus.conf   # add a prometheus.conf virtual host to nginx with the content below

    server {
        listen 1800;
        server_name prometheus.treesir.pub;
        charset utf-8;

        location / {
            auth_basic "Prometheus";
            auth_basic_user_file /etc/prometheus/.auth;
            proxy_pass http://127.0.0.1:9091;
        }
    }

    [root@Myvps ~]# nginx -t
    nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
    nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful
    [root@Myvps ~]# nginx -s reload   # reload the configuration
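To check the proxy from a script rather than a browser, the request has to carry an `Authorization` header. The sketch below shows what HTTP Basic authentication sends. The helper name `basic_auth_header` and the password `secret` are placeholders, not values from this setup.

```shell
#!/bin/sh
# Hypothetical helper: build the header that `curl -u user:pass` would send.
basic_auth_header() {
    # printf avoids the trailing newline that echo would add before encoding
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header admin secret

# Against the virtual host above, either form should be accepted once nginx is reloaded:
#   curl -u admin:secret http://prometheus.treesir.pub:1800/
#   curl -H "$(basic_auth_header admin secret)" http://prometheus.treesir.pub:1800/
```

A request without credentials should be answered with 401, which is a quick way to confirm that `auth_basic` is active. Basic authentication only encodes the credentials, so plain HTTP still exposes them on the wire.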
- Register as a systemd service

    chown prometheus:prometheus -R /var/lib/prometheus/

    cat > /usr/lib/systemd/system/prometheus.service << EOF
    [Unit]
    Description=Prometheus
    Documentation=https://prometheus.io/
    After=network.target

    [Service]
    Type=simple
    User=prometheus
    ExecStart=/usr/local/prometheus/prometheus \
      --config.file=/etc/prometheus/prometheus.yml \
      --storage.tsdb.path=/var/lib/prometheus \
      --web.external-url=http://prometheus.treesir.pub:1800/prometheus \
      --storage.tsdb.retention.time=168h \
      --web.enable-lifecycle \
      --storage.tsdb.no-lockfile \
      --web.route-prefix=/prometheus \
      --web.listen-address=0.0.0.0:19091 \
      --web.console.templates=/etc/prometheus/consoles \
      --web.console.libraries=/etc/prometheus/console_libraries
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload   # make systemd pick up the new unit file
    systemctl start prometheus && systemctl enable prometheus && systemctl status prometheus
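Because the unit passes `--web.enable-lifecycle`, the configuration can be reloaded over HTTP instead of restarting the service. With `--web.route-prefix=/prometheus`, the endpoint sits under that prefix. A minimal sketch, where the helper names `prometheus_reload_url` and `reload_prometheus` are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: build the lifecycle reload URL from listen address and route prefix.
prometheus_reload_url() {
    addr=$1
    prefix=${2%/}          # drop a trailing slash so the parts join cleanly
    printf 'http://%s%s/-/reload\n' "$addr" "$prefix"
}

# Hypothetical helper: ask a running Prometheus to re-read its configuration.
reload_prometheus() {
    curl -fsS -X POST "$(prometheus_reload_url "$1" "$2")"
}

# Prints the URL for the unit above; call reload_prometheus to actually send the request.
prometheus_reload_url 127.0.0.1:19091 /prometheus
```

Calling `reload_prometheus 127.0.0.1:19091 /prometheus` on the host should trigger the reload; a configuration error is reported in the response and the old configuration stays active.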
Dropping unnecessary labels (reference configuration)
Alertmanager installation and deployment
- Start Alertmanager

    # start in the background and log to /tmp/alertmanager.log
    nohup alertmanager --config.file=/etc/alertmanager/alertmanager.yml --web.external-url=http://hdkj.alertmanager.com >/tmp/alertmanager.log 2>&1 &
- Run as a service at boot

    cat > /usr/lib/systemd/system/alertmanager.service <<EOF
    [Unit]
    Description=alertmanager
    Documentation=https://github.com/prometheus/alertmanager
    After=network.target

    [Service]
    Type=simple
    User=root
    ExecStart=/usr/local/alertmanager/alertmanager \
      --config.file=/etc/alertmanager/alertmanager.yml \
      --web.external-url=http://alertmanager.treesir.pub \
      --web.listen-address=127.0.0.1:9093
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF
- prometheus dingtalk webhook

    wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v1.4.0/prometheus-webhook-dingtalk-1.4.0.linux-amd64.tar.gz
    tar -xzf prometheus-webhook-dingtalk-1.4.0.linux-amd64.tar.gz   # unpack before moving the directory
    mv prometheus-webhook-dingtalk-1.4.0.linux-amd64 /usr/local/webhook-dingtalk

    # append to /etc/profile so the PATH change is permanent
    echo 'export PATH=/usr/local/webhook-dingtalk:$PATH' >> /etc/profile

    mkdir -p /etc/webhook-dingtalk
    cp /usr/local/webhook-dingtalk/config.example.yml /etc/webhook-dingtalk/config.yml   # create the configuration file
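Add a custom robot in DingTalk and choose the signed-secret security option.
In the default configuration file, replace the token with the one just generated. If signing is enabled, also put the signing secret in the secret: field.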
Test run
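As shown, the message was sent successfully. However, the payload we sent does not match the values the alert template expects, so the message was not rendered.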
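A hand-made test request renders properly only if it is shaped like an Alertmanager notification. The sketch below writes a payload in Alertmanager's webhook format (version 4). The alert name, labels, annotations and timestamps are made-up sample values, the helper `write_test_alert` is hypothetical, and the target URL assumes the `webhook1` target on port 8060 used in the Alertmanager configuration in this document.

```shell
#!/bin/sh
# Hypothetical helper: write a sample Alertmanager-style notification to the given file.
write_test_alert() {
    cat > "$1" <<'EOF'
{
  "version": "4",
  "groupKey": "{}:{alertname=\"TestAlert\"}",
  "status": "firing",
  "receiver": "webhook",
  "groupLabels": {"alertname": "TestAlert"},
  "commonLabels": {"alertname": "TestAlert", "team": "node"},
  "commonAnnotations": {"summary": "Test alert sent by hand"},
  "externalURL": "http://alertmanager.treesir.pub",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "TestAlert", "team": "node"},
      "annotations": {"summary": "Test alert sent by hand"},
      "startsAt": "2021-03-01T00:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://prometheus.treesir.pub:1800/prometheus/graph"
    }
  ]
}
EOF
}

payload=$(mktemp)
write_test_alert "$payload"

# Send it to the running webhook:
#   curl -H 'Content-Type: application/json' -d @"$payload" http://127.0.0.1:8060/dingtalk/webhook1/send
```

With a payload of this shape the template should have the fields it iterates over (`alerts`, `labels`, `annotations`), which is what the hand-written test message lacked.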
Run webhook-dingtalk as a service at boot
- dingtalk

    cat > /etc/alertmanager/alertmanager.yml << EOF
    global:
      resolve_timeout: 5m

    route:
      receiver: webhook
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      group_by: [alertname]
      routes:
      - receiver: webhook
        group_wait: 10s
        match:
          team: node

    receivers:
    - name: webhook
      webhook_configs:
      - url: http://127.0.0.1:8060/dingtalk/webhook1/send
        send_resolved: true
    EOF
Run Alertmanager as a service at boot

    # test run in the foreground
    alertmanager \
      --config.file=/etc/alertmanager/alertmanager.yml \
      --web.external-url=http://alertmanager.treesir.pub \
      --web.listen-address="127.0.0.1:9093"

    cat > /usr/lib/systemd/system/alertmanager.service <<EOF
    [Unit]
    Description=alertmanager
    Documentation=https://github.com/prometheus/alertmanager
    After=network.target

    [Service]
    Type=simple
    User=root
    ExecStart=/usr/local/alertmanager/alertmanager \
      --config.file=/etc/alertmanager/alertmanager.yml \
      --web.external-url=http://alertmanager.treesir.pub \
      --web.listen-address=127.0.0.1:9093
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload   # make systemd pick up the new or changed unit file
    systemctl enable alertmanager.service
    systemctl start alertmanager.service
    systemctl status alertmanager.service
Configure node_exporter on Linux
Common expressions
Persisting custom query data
Service discovery
- File-based

    cd /etc/prometheus
    mkdir -pv targets/{linux_nodes,docker_nodes,win_nodes}

    [root@localhost prometheus]# cat prometheus.yml
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - 192.168.8.131:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      - "rules/node_alerts.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      - job_name: 'Prometheus'
        static_configs:
        - targets: ['192.168.8.122:9090']
          labels:
            instance: '192.168.8.122:9090'

      - job_name: 'linux_node'
        file_sd_configs:
          - files:
            - targets/linux_nodes/*.json
            refresh_interval: 1m

      - job_name: 'docker'
        file_sd_configs:
          - files:
            - targets/docker_nodes/*.json
            refresh_interval: 1m

      - job_name: 'win_node'
        file_sd_configs:
          - files:
            - targets/win_nodes/*.json
            refresh_interval: 1m

      - job_name: 'alertmanager'
        static_configs:
        - targets: ['192.168.8.131:9093']
          labels:
            instance: '192.168.8.131:9093'

    [root@localhost prometheus]# cat targets/docker_nodes/docker_nodes.json
    [{
      "targets": [
        "192.168.8.122:9999",
        "192.168.8.131:9999"
      ]
    }]

    [root@localhost prometheus]# cat targets/linux_nodes/linux_nodes.json
    [{
      "targets": [
        "192.168.8.131:9100",
        "192.168.8.122:9100"
      ]
    }]

    [root@localhost prometheus]# cat targets/win_nodes/yangzun_node.json
    [{
      "targets": [
        "192.168.8.66:9182"
      ]
    }]

    [root@localhost prometheus]# promtool check config prometheus.yml
    Checking prometheus.yml
      SUCCESS: 1 rule files found

    Checking rules/node_alerts.yml
      SUCCESS: 3 rules found

    # send SIGHUP so that Prometheus reloads its configuration
    /usr/sbin/lsof -n -P -t -i :9090 | xargs kill -HUP

    # The targets can also be written in YAML. file_sd picks the parser by file
    # extension, so the file needs a .yml or .yaml name and a matching files glob.
    # cat /etc/prometheus/targets/nodes/demo.yml
    - targets:
      - "192.168.20.172:8080"
      - "192.168.20.173:8080"
      - "192.168.20.174:8080"
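Target files like the ones above are often generated by a script rather than edited by hand. A minimal sketch, assuming the JSON layout shown in this section; the helper `write_targets_json` is hypothetical. It writes to a temporary name first and then renames, so a reader of the directory never sees a half-written file.

```shell
#!/bin/sh
# Hypothetical helper: write a file_sd target group from a list of host:port arguments.
write_targets_json() {
    out=$1
    shift
    {
        printf '[{\n  "targets": ['
        sep=''
        for t in "$@"; do
            printf '%s\n    "%s"' "$sep" "$t"
            sep=','
        done
        printf '\n  ]\n}]\n'
    } > "$out.tmp"
    # Rename into place once the content is complete.
    mv "$out.tmp" "$out"
}

# Demonstrated in a scratch directory; on the server the path would be
# /etc/prometheus/targets/linux_nodes/linux_nodes.json.
demo=$(mktemp -d)
write_targets_json "$demo/linux_nodes.json" 192.168.8.131:9100 192.168.8.122:9100
cat "$demo/linux_nodes.json"
```

Prometheus re-reads the matched files on change and at every `refresh_interval`, so no reload signal is needed after the file is replaced.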
Configuring DingTalk alerts in Alertmanager (reference link)
Configure black-box monitoring
Download: https://github.com/prometheus/blackbox_exporter
- Run blackbox_exporter at boot

    cat > /usr/lib/systemd/system/blackbox_exporter.service <<EOF
    [Unit]
    Description=blackbox_exporter
    Documentation=https://github.com/prometheus/blackbox_exporter
    After=network.target

    [Service]
    Type=simple
    User=root
    ExecStart=/usr/local/bin/blackbox_exporter --config.file=/etc/exporter/blackbox.yml --web.listen-address=192.168.8.122:9115
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload && systemctl start blackbox_exporter && systemctl status blackbox_exporter   # start
    systemctl enable blackbox_exporter   # enable at boot

    lsof -i :9115
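Once the exporter is listening, a probe can be triggered by hand through its `/probe` endpoint before any scrape job is written. A minimal sketch, assuming the listen address from the unit above and the `http_2xx` module from the default blackbox.yml; the helper `blackbox_probe_url` and the probed site are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: build a /probe URL from exporter address, module and target.
blackbox_probe_url() {
    printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

blackbox_probe_url 192.168.8.122:9115 http_2xx https://prometheus.io

# Fetch the printed URL with curl and look for the probe_success metric:
#   curl -s "$(blackbox_probe_url 192.168.8.122:9115 http_2xx https://prometheus.io)" | grep probe_success
```

A value of 1 for `probe_success` means the module's checks passed for that target. Targets that contain `&` or `?` would need URL-encoding, which this sketch does not do.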
- Start with Docker

    mkdir -p /application/black-box-exporter/config

    wget -O /application/black-box-exporter/config/blackbox.yml https://raw.githubusercontent.com/prometheus/blackbox_exporter/master/blackbox.yml

    docker run -d \
      -p 9115:9115 --name blackbox_exporter \
      --restart always \
      --net=host \
      -v /application/black-box-exporter/config:/config \
      prom/blackbox-exporter:master \
      --config.file=/config/blackbox.yml \
      --web.external-url=/black-box
Configure php-fpm_exporter
- Add as a systemd service and enable at boot

    # create the unit file
    cat > /usr/lib/systemd/system/php-fpm-exporter.service <<EOF
    [Unit]
    Description=php-fpm-exporter
    Documentation=https://github.com/hipages/php-fpm_exporter
    After=network.target

    [Service]
    Type=simple
    User=root
    ExecStart=/usr/local/bin/php-fpm-exporter --addr 0.0.0.0:9190 --endpoint http://127.0.0.1:9010/status
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload && systemctl start php-fpm-exporter && systemctl status php-fpm-exporter
    systemctl enable php-fpm-exporter   # enable at boot

    lsof -i :9190   # the exporter listens on 9190, as set by --addr
win_exporter installation and configuration
Start Grafana