Alerting

This section describes the APIs for alert strategies.

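Field reference

  • name: strategy name
  • nid: ID of the object-tree node the strategy is attached to
  • excl_nid: IDs of child nodes under the attached node to exclude
  • tags: tags of the monitored metric
  • priority: alert level; one of 1, 2, 3
  • alert_dur: alert evaluation window, in seconds
  • enable_stime: time of day at which the strategy starts to take effect
  • enable_etime: time of day at which the strategy stops taking effect
  • enable_days_of_week: days of the week on which the strategy is in effect
  • exprs
    • eopt: comparison operator; one of =, !=, >, >=, <, <=
    • func: alert function; supported values are all, happen, max, min, avg, sum, diff, pdiff, nodata
    • metric: monitored metric
    • params: parameters required by the alert function
    • threshold: threshold used by the alert function
  • recovery_dur: number of seconds the recovered state must last before a recovery event is generated; 0 means the recovery event is generated immediately
  • recovery_notify: 0 sends a recovery notification, 1 does not
  • converge: notification convergence; the first value is the convergence period in seconds, the second is the number of notifications allowed within that period
  • notify_group: groups that receive alert notifications
  • notify_user: users that receive alert notifications
  • callback: callback URL invoked after the alert fires
  • need_upgrade: whether alert escalation is configured; 0 means no, 1 means yes
  • alert_upgrade
    • duration: how long an alert must last before it is escalated, in seconds
    • level: alert level after escalation
    • users: users notified after escalation
    • groups: groups notified after escalation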
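
The converge pair can be read as a rate limit on notifications. The sketch below is one way to reason about it, assuming a sliding window of the given period; the windowing actually used by the server is not specified in this section, and allowed_notifications is a hypothetical helper name.

```python
from collections import deque


def allowed_notifications(event_times, converge):
    """Return the timestamps that would be notified under converge=[period, limit].

    Assumption: a sliding window of `period` seconds; the server may count differently.
    """
    period, limit = converge
    sent = deque()  # timestamps of notifications still inside the window
    result = []
    for ts in sorted(event_times):
        # Forget notifications that are at least `period` seconds old.
        while sent and ts - sent[0] >= period:
            sent.popleft()
        if len(sent) < limit:
            sent.append(ts)
            result.append(ts)
    return result
```

With converge set to [60, 3], alerts at seconds 0, 10, 20, 30 and 70 would notify at 0, 10, 20 and 70 under this assumption; the alert at second 30 is suppressed because three notifications were already sent in the preceding 60 seconds.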

Create a strategy

POST /api/portal/stra

Sample request

  {
    "name": "all always fires",
    "nid": 21,
    "excl_nid": null,
    "priority": 3,
    "alert_dur": 60,
    "exprs": [
      {
        "eopt": "!=",
        "func": "all",
        "metric": "cpu.idle",
        "params": [],
        "threshold": 0
      }
    ],
    "tags": [],
    "recovery_dur": 0,
    "recovery_notify": 1,
    "alert_upgrade": {
      "duration": 60,
      "level": 1,
      "users": [],
      "groups": []
    },
    "converge": [3600, 1],
    "notify_group": [],
    "notify_user": [5],
    "callback": "",
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": [0, 1, 2, 3, 4, 5, 6],
    "need_upgrade": 0,
    "id": 13
  }
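
Before sending a payload like this, it can help to check it against the constraints listed in the field reference. The following is a minimal client-side sketch; validate_strategy is a hypothetical helper, it only covers the constraints this section states explicitly, and the server may enforce more.

```python
VALID_EOPTS = {"=", "!=", ">", ">=", "<", "<="}
VALID_FUNCS = {"all", "happen", "max", "min", "avg", "sum", "diff", "pdiff", "nodata"}
VALID_PRIORITIES = {1, 2, 3}


def validate_strategy(stra):
    """Return a list of problems found in a strategy payload (empty if none)."""
    problems = []
    if stra.get("priority") not in VALID_PRIORITIES:
        problems.append("priority must be 1, 2 or 3")
    for i, expr in enumerate(stra.get("exprs", [])):
        if expr.get("eopt") not in VALID_EOPTS:
            problems.append(f"exprs[{i}].eopt is not a documented operator")
        if expr.get("func") not in VALID_FUNCS:
            problems.append(f"exprs[{i}].func is not a documented function")
    if len(stra.get("converge", [])) != 2:
        problems.append("converge must hold [period_seconds, max_notifications]")
    return problems
```

The response samples in this section use "func": "abs", which is not in the documented function list, so the list may be incomplete and a check like this is best treated as a warning rather than a hard failure.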

Sample response

  {
    "err": "",
    "dat": {"id": 1}
  }
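
The request and response shapes above can be sketched with the Python standard library as follows. The base URL, the Content-Type header, and the helper names are assumptions for illustration; authentication is not covered in this section and is left out.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: replace with your server address


def build_create_request(stra, base_url=BASE_URL):
    """Build (but do not send) the POST request that creates a strategy."""
    return urllib.request.Request(
        url=base_url + "/api/portal/stra",
        data=json.dumps(stra).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


def unwrap(raw):
    """Parse a response body and return `dat`, raising if `err` is non-empty."""
    resp = json.loads(raw)
    if resp.get("err"):
        raise RuntimeError(resp["err"])
    return resp.get("dat")
```

Passing the built request to urllib.request.urlopen and feeding the response body to unwrap would yield {"id": 1} for the sample response above.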

Update a strategy

PUT /api/portal/stra

Sample request

  {
    "id": 1,
    "name": "all always fires",
    "nid": 21,
    "excl_nid": null,
    "priority": 3,
    "alert_dur": 60,
    "exprs": [
      {
        "eopt": "!=",
        "func": "all",
        "metric": "cpu.idle",
        "params": [],
        "threshold": 0
      }
    ],
    "tags": [],
    "recovery_dur": 0,
    "recovery_notify": 1,
    "alert_upgrade": {
      "duration": 60,
      "level": 1,
      "users": [],
      "groups": []
    },
    "converge": [3600, 1],
    "notify_group": [],
    "notify_user": [5],
    "callback": "",
    "enable_stime": "00:00",
    "enable_etime": "23:59",
    "enable_days_of_week": [0, 1, 2, 3, 4, 5, 6],
    "need_upgrade": 0
  }

Sample response

  {
    "err": "",
    "dat": "ok"
  }

Delete strategies

DELETE /api/portal/stra

Sample request

  {
    "ids": [4]
  }

Sample response

  {
    "err": "",
    "dat": "ok"
  }

List all strategies

GET /api/portal/stra?nid=1

  • nid: service-tree node ID; optional. If omitted, all strategies are returned.
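
Because nid is optional, a client has to build the URL with or without the query string. A small sketch, where the base URL and the function name strategy_list_url are assumptions:

```python
from urllib.parse import urlencode


def strategy_list_url(nid=None, base_url="http://127.0.0.1:8000"):
    """Return the list URL, adding ?nid=... only when a node ID is given."""
    url = base_url + "/api/portal/stra"
    if nid is not None:
        url += "?" + urlencode({"nid": nid})
    return url
```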

Sample response

  {
    "dat": [
      {
        "id": 1,
        "name": "io.util above 90%",
        "category": 1,
        "nid": 100,
        "alert_dur": 600,
        "recovery_dur": 120,
        "enable_stime": "00:00",
        "enable_etime": "23:59",
        "priority": 3,
        "callback": "",
        "creator": "root",
        "created": "2019-03-06T16:47:16+08:00",
        "last_updator": "root",
        "last_updated": "2019-03-06T16:47:16+08:00",
        "excl_nid": [99],
        "exprs": [
          {
            "eopt": ">",
            "func": "abs",
            "metric": "qps",
            "params": [3],
            "threshold": 10
          }
        ],
        "tags": [
          {
            "tkey": "host",
            "topt": "=",
            "tval": ["nightingale.host1"]
          }
        ],
        "enable_days_of_week": [0, 1, 2, 3, 4, 5, 6],
        "converge": [60, 3],
        "recovery_notify": 1,
        "notify_group": [1, 3],
        "notify_user": [1, 3],
        "leaf_nids": null,
        "need_upgrade": 1,
        "alert_upgrade": {
          "users": [1, 3],
          "groups": [1, 3],
          "duration": 1000,
          "level": 1
        }
      }
    ],
    "err": ""
  }

Get a single strategy

GET /api/portal/stra/:sid

Sample response

  {
    "dat": {
      "id": 1,
      "name": "io.util above 90%",
      "category": 1,
      "nid": 100,
      "alert_dur": 600,
      "recovery_dur": 120,
      "enable_stime": "00:00",
      "enable_etime": "23:59",
      "priority": 3,
      "callback": "",
      "creator": "root",
      "created": "2019-03-06T16:47:16+08:00",
      "last_updator": "root",
      "last_updated": "2019-03-06T16:47:16+08:00",
      "excl_nid": [99],
      "exprs": [
        {
          "eopt": ">",
          "func": "abs",
          "metric": "qps",
          "params": [3],
          "threshold": 10
        }
      ],
      "tags": [
        {
          "tkey": "host",
          "topt": "=",
          "tval": ["nightingale.host1"]
        }
      ],
      "enable_days_of_week": [0, 1, 2, 3, 4, 5, 6],
      "converge": [60, 3],
      "recovery_notify": 1,
      "notify_group": [1, 3],
      "notify_user": [1, 3],
      "leaf_nids": null,
      "need_upgrade": 1,
      "alert_upgrade": {
        "users": [1, 3],
        "groups": [1, 3],
        "duration": 1000,
        "level": 1
      }
    },
    "err": ""
  }
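
Each exprs entry pairs an operator (eopt) with a threshold. As a rough illustration of how the documented operators map onto comparisons, the sketch below checks an already-aggregated value against one expression. How func and params turn raw points into that value is not described in this section and is out of scope here; expr_matches is a hypothetical helper.

```python
import operator

# The six operators listed for eopt in the field reference.
EOPTS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def expr_matches(expr, value):
    """Return True if `value` satisfies the expression's eopt/threshold pair."""
    return EOPTS[expr["eopt"]](value, expr["threshold"])
```

For the expression in the sample response (eopt ">" and threshold 10), a value of 11 matches and a value of 10 does not.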

List all effective strategies

GET /api/portal/stras/effective?all=1

Sample response

  {
    "dat": [
      {
        "id": 1,
        "name": "io.util above 90%",
        "category": 1,
        "nid": 100,
        "alert_dur": 600,
        "recovery_dur": 120,
        "enable_stime": "00:00",
        "enable_etime": "23:59",
        "priority": 3,
        "callback": "",
        "creator": "root",
        "created": "2019-03-06T16:47:16+08:00",
        "last_updator": "root",
        "last_updated": "2019-03-06T16:47:16+08:00",
        "excl_nid": [99],
        "exprs": [
          {
            "eopt": ">",
            "func": "abs",
            "metric": "qps",
            "params": [3],
            "threshold": 10
          }
        ],
        "tags": [
          {
            "tkey": "host",
            "topt": "=",
            "tval": ["nightingale.host1"]
          }
        ],
        "enable_days_of_week": [0, 1, 2, 3, 4, 5, 6],
        "converge": [60, 3],
        "recovery_notify": 1,
        "notify_group": [1, 3],
        "notify_user": [1, 3],
        "leaf_nids": null,
        "need_upgrade": 1,
        "alert_upgrade": {
          "users": [1, 3],
          "groups": [1, 3],
          "duration": 1000,
          "level": 1
        }
      }
    ],
    "err": ""
  }

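List unrecovered alerts

GET /api/portal/event/cur?limit=10&nodepath=mon.monapi&p=1&stime=1584427852&etime=1584435052

Parameters

  • limit: number of results to return
  • p: page number
  • nodepath: object-tree node path
  • priorities: alert levels
  • stime: start of the time range to filter on
  • etime: end of the time range to filter on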
Sample response

  {
    "dat": {
      "list": [
        {
          "id": 16,
          "sid": 14,
          "sname": "A disk cannot be read or written normally",
          "node_path": "mon.monapi",
          "nid": 21,
          "endpoint": "192.168.1.2",
          "priority": 1,
          "event_type": "alert",
          "category": 1,
          "hashid": 775538174592195020,
          "etime": 1584435048,
          "value": "disk.rw.error: 3",
          "info": " disk.rw.error(all,60s) > 0",
          "tags": "mount=/home",
          "created": "2020-02-21T23:32:31+08:00",
          "detail": [
            {
              "metric": "disk.rw.error",
              "tags": {
                "mount": "/home"
              },
              "points": [
                {
                  "timestamp": 1584435048,
                  "value": 3
                },
                {
                  "timestamp": 1584435028,
                  "value": 3
                },
                {
                  "timestamp": 1584435008,
                  "value": 3
                }
              ]
            }
          ],
          "users": [
            "qinyening"
          ],
          "groups": [],
          "status": [
            "已收敛"
          ],
          "claimants": [
            "qinyening",
            "root"
          ],
          "need_upgrade": 0,
          "alert_upgrade": {
            "users": null,
            "groups": [],
            "duration": 60,
            "level": 1
          }
        }
      ],
      "total": 1
    },
    "err": ""
  }

The status value "已收敛" means "converged".
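
Each event carries the raw points that triggered it under detail. As an illustration of walking that structure, the sketch below picks the most recent point per metric from one event object; latest_points is a hypothetical helper and assumes the field layout shown in the sample.

```python
def latest_points(event):
    """Map each metric in an event's `detail` to its newest (timestamp, value)."""
    latest = {}
    for series in event.get("detail", []):
        points = series.get("points", [])
        if points:
            newest = max(points, key=lambda p: p["timestamp"])
            latest[series["metric"]] = (newest["timestamp"], newest["value"])
    return latest
```

Applied to the event in the sample, this gives disk.rw.error mapped to timestamp 1584435048 and value 3.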

List all alert history

GET /api/portal/event/his?limit=10&nodepath=mon.monapi&p=1&stime=1584427852&etime=1584435052

Parameters

  • limit: number of results to return
  • p: page number
  • nodepath: object-tree node path
  • priorities: alert levels
  • stime: start of the time range to filter on
  • etime: end of the time range to filter on

Sample response

  {
    "dat": {
      "list": [
        {
          "id": 16,
          "sid": 14,
          "sname": "A disk cannot be read or written normally",
          "node_path": "mon.monapi",
          "nid": 21,
          "endpoint": "192.168.1.2",
          "priority": 1,
          "event_type": "alert",
          "category": 1,
          "hashid": 775538174592195020,
          "etime": 1584435048,
          "value": "disk.rw.error: 3",
          "info": " disk.rw.error(all,60s) > 0",
          "tags": "mount=/home",
          "created": "2020-02-21T23:32:31+08:00",
          "detail": [
            {
              "metric": "disk.rw.error",
              "tags": {
                "mount": "/home"
              },
              "points": [
                {
                  "timestamp": 1584435048,
                  "value": 3
                },
                {
                  "timestamp": 1584435028,
                  "value": 3
                },
                {
                  "timestamp": 1584435008,
                  "value": 3
                }
              ]
            }
          ],
          "users": [
            "qinyening"
          ],
          "groups": [],
          "status": [
            "已收敛"
          ],
          "claimants": [
            "qinyening",
            "root"
          ],
          "need_upgrade": 0,
          "alert_upgrade": {
            "users": null,
            "groups": [],
            "duration": 60,
            "level": 1
          }
        }
      ],
      "total": 1
    },
    "err": ""
  }

The status value "已收敛" means "converged".

Create an alert mask

POST /api/portal/maskconf

Sample request

  {
    "btime": 1584436015,
    "etime": 1584439615,
    "cause": "quick mask",
    "metric": "cpu.idle",
    "endpoints": [
      "192.168.1.2"
    ],
    "nid": 28
  }
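
In the sample, btime and etime are Unix timestamps one hour apart. The sketch below builds such a payload from a start time and a duration; build_mask and its parameter names are hypothetical, and the one-hour default simply mirrors the sample.

```python
import time


def build_mask(nid, metric, endpoints, duration=3600, cause="", now=None):
    """Build a mask payload that starts at `now` and lasts `duration` seconds."""
    btime = int(time.time()) if now is None else int(now)
    return {
        "btime": btime,
        "etime": btime + duration,
        "cause": cause,
        "metric": metric,
        "endpoints": list(endpoints),
        "nid": nid,
    }
```

Calling build_mask(28, "cpu.idle", ["192.168.1.2"], cause="quick mask", now=1584436015) yields the same btime, etime, metric, endpoints and nid as the sample request.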

Remove an alert mask

DELETE /api/portal/maskconf/:id

  • id: mask configuration ID

List alert masks

GET /api/portal/node/:id/maskconf

Sample response

  {
    "dat": [
      {
        "id": 6,
        "nid": 28,
        "node_path": "didi.mon.judge",
        "metric": "cpu.idle",
        "tags": "",
        "cause": "quick mask",
        "user": "qinyening",
        "btime": 1584436521,
        "etime": 1584440121,
        "endpoints": [
          "192.168.1.2"
        ]
      }
    ],
    "err": ""
  }
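
A client can use the returned list to check whether a given endpoint and metric are currently masked. The sketch below is one way to do that; active_masks is a hypothetical helper, it ignores the tags field, and it assumes a mask covers the interval from btime up to but not including etime, which this section does not confirm.

```python
def active_masks(masks, endpoint, metric, now):
    """Return the mask entries that cover `endpoint` and `metric` at time `now`."""
    return [
        m
        for m in masks
        if m["btime"] <= now < m["etime"]  # assumed half-open interval
        and m["metric"] == metric
        and endpoint in m["endpoints"]
    ]
```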