Installing Alertmanager and Configuring DingTalk and Email Alerts

邹阳
2023-12-01

I. Installing and configuring Alertmanager

1. Create the package directory
[bigdata@tsp3dev01 ~]$ mkdir -p /software/alertmanager
2. Download the package
  1. Download URL: https://github.com/prometheus/alertmanager/releases/download/v0.22.1/alertmanager-0.22.1.linux-amd64.tar.gz
  2. Upload the package to /software/alertmanager for later use
3. Extract into the installation directory
[bigdata@tsp3dev01 ~]$ sudo tar xzvf /software/alertmanager/alertmanager-0.22.1.linux-amd64.tar.gz -C /usr/local/
4. Change ownership
[bigdata@tsp3dev01 ~]$ cd /usr/local/
[bigdata@tsp3dev01 local]$ sudo chown -R bigdata:bigdata /usr/local/alertmanager-0.22.1.linux-amd64
5. Create the symlink
[bigdata@tsp3dev01 local]$ sudo ln -s /usr/local/alertmanager-0.22.1.linux-amd64 /usr/local/alertmanager
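6. Configure the service

Create the data directory. The path must match --storage.path in the unit file, and the service user (bigdata) needs write access to it:

sudo mkdir -p /data/alertmanager/data
sudo chown -R bigdata:bigdata /data/alertmanager

Create the unit file with vim /usr/lib/systemd/system/alertmanager.service and give it the following content: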

[Unit]
Description=Alertmanager
After=network.target

[Service]
Type=simple
User=bigdata
ExecStart=/usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml --storage.path=/data/alertmanager/data
Restart=on-failure

[Install]
WantedBy=multi-user.target
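7. Start the service
systemctl daemon-reload                  # reload unit files after creating or editing the service
systemctl enable alertmanager.service    # enable at boot
systemctl start alertmanager.service     # start the service
systemctl status alertmanager.service    # check the status
systemctl restart alertmanager.service   # restart the service

8. View the systemd logs
[bigdata@tsp3dev01 alertmanager]$ sudo journalctl -u alertmanager.service -f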
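9. Problems encountered and fixes

When starting the service, network.target turned out not to be active, and the network interface configuration file was empty.

[bigdata@tsp3dev01 alertmanager]$ sudo vim /etc/sysconfig/network-scripts/ifcfg-ens33

Add the following content. The file name suffix should match the interface given in NAME and DEVICE; the command above edits ifcfg-ens33 while the content names ens160, so adjust both to the actual interface name: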

TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens160
UUID=e16d7a36-61b5-4e7d-9a29-ada939b11762
DEVICE=ens160
ONBOOT=yes
IPADDR=10.6.215.39
PREFIX=24
GATEWAY=10.6.215.254
DNS1=10.10.10.241
DNS2=10.10.10.242
IPV6_PRIVACY=no

Start the network service:

systemctl enable network    # enable at boot
systemctl start network     # start the service
systemctl status network    # check the status

10. Check the web UI

Open http://10.6.215.39:9093 in a browser; the Alertmanager page should load.


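11. Configuration
global:   # global settings such as SMTP (email), API URLs, WeChat
route:    # root of the routing tree
  receiver:              # name of one of the receivers defined in the top-level receivers section (same level as route)
  group_by: []           # label names used for grouping; alerts with the same values for these labels fall into one group (see the labels set in the alerting rules)
  continue: false        # whether an alert that matches this route is also checked against the following sibling routes
  match: {labelname: labelvalue, labelname1: labelvalue1}   # equality matchers on labels; every listed label must match for the alert to enter this route
  match_re: {labelname: regex}   # same as match, but the values are regular expressions
  group_wait: 30s        # how long to wait after the first alert of a new group before sending, so that alerts of the same group go out in one notification; default 30s
  group_interval: 5m     # how long to wait before notifying about new alerts added to a group that has already been notified; default 5m. Only meaningful when the alerts in a group affect the same service; poorly chosen grouping can delay alerts
  repeat_interval: 4h    # how long to wait before re-sending a notification for alerts that are still firing and unchanged; default 4h. Choose the interval according to alert severity
  routes:                # list of child routes; each entry takes the same fields as the parent route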
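
The interplay of group_wait, group_interval and repeat_interval can be sketched as a small decision function. This is a simplified model based on my reading of the documented behavior, not Alertmanager code: the first notification for a new group goes out after group_wait, and at each later group_interval tick a notification is sent if the group changed or if repeat_interval has elapsed since the last one. The class and method names are hypothetical.

```java
import java.time.Duration;

// Simplified, hypothetical model of the per-group timers (not Alertmanager code).
class GroupTimingSketch {

    // Delay between the first alert of a new group and its first notification.
    static Duration firstNotificationDelay(Duration groupWait) {
        return groupWait;
    }

    // Decision taken at each group_interval tick after the first notification.
    static boolean shouldNotify(boolean groupChanged,
                                Duration sinceLastNotification,
                                Duration repeatInterval) {
        if (groupChanged) {
            return true; // alerts were added to or resolved in the group
        }
        // Unchanged group: repeat only once repeat_interval has elapsed.
        return sinceLastNotification.compareTo(repeatInterval) >= 0;
    }

    public static void main(String[] args) {
        Duration repeat = Duration.ofHours(4);
        System.out.println(firstNotificationDelay(Duration.ofSeconds(30)));
        System.out.println(shouldNotify(true, Duration.ofMinutes(5), repeat));
        System.out.println(shouldNotify(false, Duration.ofMinutes(5), repeat));
        System.out.println(shouldNotify(false, Duration.ofHours(4), repeat));
    }
}
```

Under this model, a short group_interval with a long repeat_interval keeps new alerts prompt while limiting repeats of unchanged ones.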

Example that sends alerts to both DingTalk and email:

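global:
  # an alert without an end time is treated as resolved once it has not been re-sent for this long
  resolve_timeout: 1m

  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '1051872464@qq.com'
  smtp_auth_username: '1051872464@qq.com'
  # placeholder: use the authorization code of your own mailbox
  smtp_auth_password: 'your-smtp-authorization-code'
  smtp_require_tls: false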
route:
  # default receiver
  receiver: 'webhook'
  # wait 10s after the first alert of a new group so that alerts of the same group go out together
  group_wait: 10s
  # wait before notifying about new alerts added to a group that has already been notified
  group_interval: 10s
  # interval between repeated notifications for an alert that is still firing; limits how often the same alert is re-sent
  repeat_interval: 1h
  # labels used for grouping
  group_by: [alertname]
  routes:
  - receiver: webhook
    group_wait: 10s
    match:
      alertname: dingding_alertname
  - receiver: mail
    group_wait: 10s
    match:
      # team: mail
      alertname: mail_alertname
receivers:
- name: 'webhook'
  webhook_configs:
  - url: http://localhost:8060/dingtalk/ops_dingding/send
- name: 'mail'
  email_configs:
  - to: 'zhangjian201@faw.com.cn'
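
How this route tree picks a receiver can be sketched as follows. It is a minimal illustration, not Alertmanager code, and the class and method names are hypothetical. It assumes the documented semantics: a match block applies only if every listed label matches, child routes are tried in order, and an alert that matches no child falls back to the top-level receiver.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of receiver selection for the example configuration (not Alertmanager code).
class RouteSelectionSketch {

    // A match block applies only if every listed label has exactly the given value.
    static boolean matches(Map<String, String> matchers, Map<String, String> labels) {
        for (Map.Entry<String, String> m : matchers.entrySet()) {
            if (!m.getValue().equals(labels.get(m.getKey()))) {
                return false;
            }
        }
        return true;
    }

    // Child routes are tried in order; the first match wins because continue defaults to false.
    static String selectReceiver(Map<String, String> labels) {
        Map<String, Map<String, String>> children = new LinkedHashMap<>();
        children.put("webhook", Map.of("alertname", "dingding_alertname"));
        children.put("mail", Map.of("alertname", "mail_alertname"));
        for (Map.Entry<String, Map<String, String>> child : children.entrySet()) {
            if (matches(child.getValue(), labels)) {
                return child.getKey();
            }
        }
        return "webhook"; // route.receiver, the default of the example configuration
    }

    public static void main(String[] args) {
        System.out.println(selectReceiver(Map.of("alertname", "mail_alertname")));
        System.out.println(selectReceiver(Map.of("alertname", "DiskRunningFull", "route", "WEBHOOK")));
    }
}
```

With this configuration, only the alertname label decides the route. An alert named DiskRunningFull matches neither child route and therefore goes to the default webhook receiver, whatever its route label says.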
12. Test
curl -XPOST -H "Content-Type: application/json" http://localhost:9093/api/v1/alerts -d '
[
  {
    "labels": {
       "alertname": "DiskRunningFull",
       "dev": "sda1",
       "instance": "中文测试",
       "route": "WEBHOOK"
     },
     "annotations": {
        "info": "The disk sda1 is running full",
        "summary": "please check the instance example1"
      }
  }
]
'
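
The same test alert can also be prepared from Java with the JDK's built-in HTTP client (Java 11+). The sketch below only builds the request and does not send it. It targets /api/v2/alerts, on the assumption that a newer Alertmanager is in use: as far as I know the v1 API used above still works in v0.22.1 but was removed in later releases (0.27). The class name and helper are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: build (but do not send) a POST request for the Alertmanager v2 alerts endpoint.
class AlertRequestSketch {

    static HttpRequest build(String host, int port, String jsonArray) {
        return HttpRequest.newBuilder(URI.create("http://" + host + ":" + port + "/api/v2/alerts"))
                .header("Content-Type", "application/json")
                // BodyPublishers.ofString encodes the body as UTF-8.
                .POST(HttpRequest.BodyPublishers.ofString(jsonArray))
                .build();
    }

    public static void main(String[] args) {
        // The endpoint expects a JSON array, even for a single alert.
        String body = "[{\"labels\":{\"alertname\":\"DiskRunningFull\",\"dev\":\"sda1\"},"
                + "\"annotations\":{\"summary\":\"please check the instance example1\"}}]";
        HttpRequest request = build("localhost", 9093, body);
        System.out.println(request.method() + " " + request.uri());
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```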

II. DingTalk alerts with Alertmanager

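1. Download prometheus-webhook-dingtalk
# download the package (run this from the software directory)
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v0.3.0/prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz

# extract into the installation directory (/opt, matching the ExecStart path in the unit file)
sudo tar -zxf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz -C /opt

# create the symlink
sudo ln -s /opt/prometheus-webhook-dingtalk-0.3.0.linux-amd64 /opt/prometheus-webhook-dingtalk
2. Get the DingTalk robot webhook URL
Add a custom robot in the DingTalk group settings and copy its webhook URL, which has the form https://oapi.dingtalk.com/robot/send?access_token=<token>.
3. Configure the service
vim /etc/systemd/system/prometheus-webhook-dingtalk.service
# add the following content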
[Unit]
Description=prometheus-webhook-dingtalk
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/opt/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --ding.profile=ops_dingding=<your DingTalk robot webhook URL>

[Install]
WantedBy=multi-user.target
4. Start the prometheus-webhook-dingtalk service
systemctl daemon-reload 

systemctl start prometheus-webhook-dingtalk 

ss -tnl | grep 8060
5. Test
curl -H "Content-Type: application/json" -d '{ "version": "4", "status": "firing", "description":"description_content"}' http://localhost:8060/dingtalk/ops_dingding/send

III. Sending alerts via HTTP POST

1. Alert service

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.qisi.pojo.AlertManagerData;
import lombok.extern.slf4j.Slf4j;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

import java.io.IOException;

/**
 * Alert service that posts alerts to the Alertmanager HTTP API.
 */
@Slf4j
public class AlertManagerService implements ISendService {

    // Per-instance endpoint; a static field would be overwritten by every new instance.
    private final String uri;

    private final CloseableHttpClient httpClient = HttpClientBuilder.create().build();

    public AlertManagerService(String host, int port) {
        this.uri = String.format("http://%s:%d/api/v1/alerts", host, port);
    }

    public void send(AlertManagerData data) {
        // The API expects a JSON array of alerts, even for a single alert.
        JSONArray arr = new JSONArray();
        arr.add(data);
        String body = JSONObject.toJSONString(arr);

        HttpPost httpPost = new HttpPost(uri);
        // ContentType.APPLICATION_JSON sets "application/json; charset=UTF-8" on the entity.
        httpPost.setEntity(new StringEntity(body, ContentType.APPLICATION_JSON));
        httpPost.setHeader("Accept", "application/json");

        // try-with-resources closes the response and releases the connection.
        try (CloseableHttpResponse response = httpClient.execute(httpPost)) {
            int statusCode = response.getStatusLine().getStatusCode();
            if (statusCode >= 200 && statusCode < 300) {
                log.info("Code:{},Send Alert Data Succeed:{}", statusCode, body);
            } else {
                log.error("Code:{},Send Alert Data Failed:{}", statusCode, body);
            }
        } catch (IOException e) {
            log.error("Send Alert Data Failed:{}", body, e);
        }
    }

    public void close() throws IOException {
        httpClient.close();
    }

}

2. Alert payload entity class

import lombok.AllArgsConstructor;
import lombok.Data;

/**
 * Alert payload entity; holds the data sent to Alertmanager.
 */
@Data
@AllArgsConstructor
public class AlertManagerData {

    private Labels labels;
    private Annotations annotations;

    public enum Route {
        webhook, mail
    }

    @Data
    @AllArgsConstructor
    public static class Labels {

        private String alertname;
        private String dev;
        private String instance;
        private Route route;

    }

    @Data
    @AllArgsConstructor
    public static class Annotations {

        private String info;
        private String summary;

    }
}
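
For reference, the JSON that this entity is meant to produce has the same shape as the curl test payload: an array with one object holding labels and annotations. The JDK-only sketch below builds that shape by hand, without Lombok or fastjson. The class and method names are hypothetical, and the escaping is deliberately minimal (quotes and backslashes only), so it is an illustration of the wire format rather than a replacement for a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical JDK-only sketch of the alert payload shape: [{"labels":{...},"annotations":{...}}]
class AlertPayloadSketch {

    // Minimal string escaping: backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders a flat string map as a JSON object, keeping the map's iteration order.
    static String object(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    // Wraps a single alert in the array that the alerts endpoint expects.
    static String payload(Map<String, String> labels, Map<String, String> annotations) {
        return "[{\"labels\":" + object(labels) + ",\"annotations\":" + object(annotations) + "}]";
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("alertname", "DiskRunningFull");
        labels.put("dev", "sda1");
        labels.put("instance", "example1");
        labels.put("route", "webhook");
        Map<String, String> annotations = new LinkedHashMap<>();
        annotations.put("info", "The disk sda1 is running full");
        annotations.put("summary", "please check the instance example1");
        System.out.println(payload(labels, annotations));
    }
}
```

The route label is written here as the plain string "webhook", on the assumption that the Route enum is serialized by name.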