Zabbix agent 2 - IT Library (IT文库): free downloads of e-books and documents on programming and IT


White Paper: How to Build an Alerting OnCall Incident Center

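How to build an incident center: one-stop handling of on-call duty, with intelligent noise reduction. Beijing Kuaimao Xingyun Technology Co., Ltd. (北京快猫星云科技有限公司). Foreword: There are many monitoring systems on the market. Leaving commercial software aside, the open-source options alone include Nagios, Zabbix, Open-Falcon, Nightingale, Grafana, Prometheus, Elastalert, and more. Cloud vendors also provide monitoring systems, such as Huawei Cloud's Cloud Monitoring, Tencent Cloud's Cloud Monitoring, and Alibaba Cloud's Cloud Monitoring. Some vendors even offer several disconnected monitoring systems: Alibaba Cloud has not only Cloud Monitoring but also ARMS and SLS. Most companies do not rely on a single monitoring system: network devices may be monitored with Zabbix, Kubernetes with Prometheus (there may be several Kubernetes clusters, and therefore several Prometheus instances) or Nightingale, and logs with Elasta ... Converge all alerts received by the alert center along the time dimension, for example at one-minute granularity: all alerts within one minute converge into one incident, and all alerts in the next minute converge into another. Clearly, the alerts inside one incident may have nothing to do with each other, so this convergence method is not very good. 2. Convergence by time + label: in addition to the time dimension, add a label as a convergence dimension, for example the machine label. Within a given time period, all alerts for machine A converge into one incident and all alerts for machine B converge into another. Or, by service dimension, within a given time period all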
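The two convergence strategies described in this excerpt can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the white paper's implementation: the alert shape (`ts`, `name`, `labels`) and the function name `converge` are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical alert shape: {"ts": epoch seconds, "name": str, "labels": {str: str}}
Alert = dict
IncidentKey = Tuple[int, Optional[str]]


def converge(alerts: List[Alert],
             window_seconds: int = 60,
             label_key: Optional[str] = None) -> Dict[IncidentKey, List[Alert]]:
    """Group alerts into incidents by time bucket, optionally also by one label."""
    incidents: Dict[IncidentKey, List[Alert]] = defaultdict(list)
    for alert in alerts:
        # Alerts whose timestamps fall into the same window share a bucket.
        bucket = alert["ts"] // window_seconds
        # With no label_key, every alert in a bucket lands in the same incident.
        label_value = alert["labels"].get(label_key) if label_key else None
        incidents[(bucket, label_value)].append(alert)
    return dict(incidents)


alerts = [
    {"ts": 0, "name": "cpu_high", "labels": {"host": "A"}},
    {"ts": 30, "name": "disk_full", "labels": {"host": "B"}},
    {"ts": 59, "name": "mem_low", "labels": {"host": "A"}},
    {"ts": 61, "name": "cpu_high", "labels": {"host": "A"}},
]

by_time = converge(alerts)                             # time only
by_time_and_host = converge(alerts, label_key="host")  # time + machine label
```

With time alone, the unrelated alerts from hosts A and B in the first minute end up in one incident; adding the `host` label splits them, which is the improvement the excerpt describes.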

0 credits | 23 pages | 1.75 MB | 1 year ago
Design, Evolution, and Practice of Bilibili's Unified Monitoring System

[Architecture diagram residue: Prometheus HA and Federation deployments across IDCs (IDC1, IDC2; server1 to server3); filtering data to reduce usage cost; per-IDC agent, prometheus, and targets (IDC_1, IDC_2); alert_manager, alerting platform, cache, db, rms, and peripheral systems; rule management API for fetching monitoring targets and generating alert rules.]

0 credits | 34 pages | 650.25 KB | 1 year ago
1.6 Building a Comprehensive Monitoring System with Nightingale's Extensibility

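UlricQin, Nightingale. Many companies already run Nightingale in production and are refining it together. Server01, Server02, Agentd, Agentd, LoadBalance. Three storage options, chosen as needed: 1. standalone Prom; 2. clustered m3db; 3. clustered n9e-tsdb. Nightingale design and implementation: Agentd. Part 4, data collection: the core functions of a monitoring system are data collection, storage, analysis, and display ... different probe machines and target machines, which eases management and knowledge transfer. • An original mechanism that stream-reads logs on the endpoint and extracts metrics with regular expressions; lightweight, easy to use, and non-intrusive to the business. • Built-in collection for a range of databases and middleware as well as network devices, reusing the capabilities of telegraf and datadog-agent. • Support for the statsd UDP protocol, used for APM monitoring and analysis of business applications. Nightingale data collection 01: monitoring data collection with the all-in-one agentd. Nightingale data collection 02: Autoconfig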

0 credits | 40 pages | 3.85 MB | 1 year ago
PromQL: From Beginner to Expert

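255.240.0 broadcast 10.206.15.255 inet6 fe80::5054:ff:fed2:a180 prefixlen 64 scopeid 0x20 ether 52:54:00:d2:a1:80 txqueuelen 1000 (Ethernet) RX packets 457952401 bytes ... is the total number of packets received since the OS started, and the value after TX packets is the total number of packets sent since the OS started. Both are very large values. We usually care less about the current value than about how many packets were received or sent in the last minute, or per second. A monitoring data collector generally runs periodically, for example once every 10 seconds. Each time it collects the number of packets the network card has received or sent, it can only read the current value, just like running ifconfig ... where does the data in ... come from? In fact, Prometheus has a startup flag, --query.lookback-delta=2m, that controls this behavior. If it is set to 2m, Prometheus looks at the data in the 2 minutes between 2022-08-25 15:46:03 and 2022-08-25 15:48:03 and returns the most recent sample. Query types: the mem_available_percent{app="clickhouse"} in the example above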
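The two ideas in this excerpt, turning an ever-growing counter into a per-second rate and picking the most recent sample inside a lookback window, can be sketched in Python. This is a simplified illustration under stated assumptions, not Prometheus's own code: the function names are hypothetical, the rate ignores counter resets and the extrapolation that PromQL's `rate()` performs, and the exact handling of the window boundary is an assumption.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)


def per_second_rate(samples: List[Sample]) -> Optional[float]:
    """Average per-second increase between the first and last sample.

    Simplification: assumes a monotonically increasing counter (no resets).
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    return (v1 - v0) / (t1 - t0)


def instant_value(samples: List[Sample],
                  query_time: float,
                  lookback_seconds: float = 120.0) -> Optional[float]:
    """Return the newest value within the lookback window before query_time."""
    in_window = [s for s in samples
                 if query_time - lookback_seconds < s[0] <= query_time]
    if not in_window:
        return None  # nothing recent enough: the query returns no data
    return max(in_window, key=lambda s: s[0])[1]


# A collector scraping an RX packet counter every 10 seconds (made-up values).
rx_packets = [(100.0, 1000.0), (110.0, 1050.0), (120.0, 1100.0)]

rate = per_second_rate(rx_packets)          # packets per second over 20 s
latest = instant_value(rx_packets, 125.0)   # newest sample within 2 minutes
stale = instant_value(rx_packets, 300.0)    # all samples older than 2 minutes
```

Here `lookback_seconds=120.0` mirrors the `2m` setting mentioned in the excerpt; a query at a time more than two minutes after the last scrape finds no sample and yields nothing.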

0 credits | 16 pages | 2.77 MB | 1 year ago
Prometheus Deep Dive - Monitoring. At scale.

... we can’t get rid of, we go into feature moratorium. 2.3.2 is the first fully stable release in the 2.x train. (Richard Hartmann & Frederic Branczyk, @TwitchiH & @fredbrancz, Prometheus Deep Dive.) Introduction: ACID databases... Atomicity: since 1.x. Consistency: since 1.x. Isolation: will happen within 2.x. Durability: since 2.0. ... io/2017-munich/talks/staleness-in-prometheus-2-0/ Staleness and Isolation in Prometheus 2.0: https://promcon.io/2017-munich/talks/staleness-in-prometheus-2-0/ Social aspects of change: https://promcon

0 credits | 34 pages | 370.20 KB | 1 year ago
4 [Wang Qiong] Evolution of the Container Monitoring Architecture, Wang Qiong, YY Live

Calculating how much memory the metrics need: https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion

0 credits | 23 pages | 2.17 MB | 1 year ago
Intro to Prometheus - With a dash of operations & observability

Prometheus: Introduction, Background, Operations & observability, Outro. Time split: (1) 1/3 Prometheus, (2) 1/3 Observability, (3) 1/3 Questions. Richard Hartmann & Frederic Branczyk, @TwitchiH & @fredbrancz. Intro

0 credits | 19 pages | 63.73 KB | 1 year ago
OpenMetrics - Standing on the shoulders of Titans

1027 1544554800
histogram_bucket{le="1"} 123 # {foo="bar"} 42 1544554800
histogram_bucket{le="2"} 234 # {foo="bar"} 23 1544554799.123
histogram_bucket{le="3"} 345 1544554800 # {foo="bar"} 11
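To make the layout of these sample lines concrete, here is a small Python sketch that splits one line into its metric name, labels, value, optional timestamp, and the optional exemplar after the `#`. It is an illustration only, not a conforming OpenMetrics parser: the names `parse_sample`, `Sample`, and `Exemplar` are hypothetical, and the regular expressions assume label values contain no escaped quotes or braces.

```python
import re
from typing import Dict, NamedTuple, Optional


class Exemplar(NamedTuple):
    labels: Dict[str, str]
    value: float
    timestamp: Optional[float]


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float
    timestamp: Optional[float]
    exemplar: Optional[Exemplar]


# name{labels} value [timestamp] [# {exemplar labels} value [timestamp]]
_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r' (?P<value>\S+)'
    r'(?: (?P<ts>[^\s#]+))?'
    r'(?: # \{(?P<ex_labels>[^}]*)\} (?P<ex_value>\S+)(?: (?P<ex_ts>\S+))?)?$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def _labels(text: Optional[str]) -> Dict[str, str]:
    return dict(_LABEL.findall(text or ""))


def _num(text: Optional[str]) -> Optional[float]:
    return float(text) if text is not None else None


def parse_sample(line: str) -> Sample:
    """Parse one sample line; raise ValueError if it does not match."""
    m = _LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised sample line: {line!r}")
    exemplar = None
    if m["ex_value"] is not None:
        exemplar = Exemplar(_labels(m["ex_labels"]),
                            float(m["ex_value"]), _num(m["ex_ts"]))
    return Sample(m["name"], _labels(m["labels"]),
                  float(m["value"]), _num(m["ts"]), exemplar)


first = parse_sample('histogram_bucket{le="1"} 123 # {foo="bar"} 42 1544554800')
third = parse_sample('histogram_bucket{le="3"} 345 1544554800 # {foo="bar"} 11')
```

In `first` the timestamp belongs to the exemplar, while in `third` it belongs to the sample itself; the position relative to the `#` is what distinguishes the two.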

0 credits | 21 pages | 84.83 KB | 1 year ago
