Prometheus: Basic Concepts, Installation, and Simple Usage

Date: 2023-07-05

Part One: Concepts

1.1 Data model

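Prometheus fundamentally stores all data as time series: streams of timestamped values that belong to the same metric and the same set of labeled dimensions.

Besides stored time series, Prometheus may generate temporary derived time series as the result of queries.

1.1.1 Metric names and labels

Every time series is uniquely identified by its metric name and by optional key-value pairs called labels.

The metric name specifies the general feature of the system being measured (e.g. http_requests_total for the total number of HTTP requests received). It may contain ASCII letters and digits, as well as underscores and colons.

Labels give the Prometheus data model its dimensions: for a given metric name, any combination of labels identifies one particular dimension of that metric (for example, all HTTP requests that used the method POST to /api/tracks). The query language (PromQL) allows filtering and aggregation based on these dimensions. Changing any label value, including adding or removing a label, creates a new time series.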

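All Prometheus metrics share one uniform notation. Given a metric name and a set of labels, a time series is usually written as:

<metric name>{<label name>=<label value>, ...}

The metric name and the label key-value pairs together distinguish one time series from another. Each label name and value pair adds a dimension that can be used for filtering and aggregation.

For example: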

api_http_requests_total{method="POST", handler="/messages"}
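The identity rule behind the notation above can be sketched in a few lines of Python. This is only an illustration, not how Prometheus stores series internally; the helper names series_key and format_series are made up for this example.

```python
from typing import Dict, Tuple

# Hypothetical helper type: a hashable identity for one time series.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def series_key(name: str, labels: Dict[str, str]) -> SeriesKey:
    """Return the metric name plus the label pairs in sorted order.

    Sorting makes the identity independent of the order in which
    the labels were written.
    """
    return name, tuple(sorted(labels.items()))


def format_series(name: str, labels: Dict[str, str]) -> str:
    """Render a series as <metric name>{<label name>="<label value>", ...}."""
    pairs = ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{pairs}}}" if pairs else name


post = series_key("api_http_requests_total",
                  {"method": "POST", "handler": "/messages"})
get = series_key("api_http_requests_total",
                 {"method": "GET", "handler": "/messages"})

# Changing a single label value yields a different series identity.
print(post != get)
print(format_series("api_http_requests_total",
                    {"method": "POST", "handler": "/messages"}))
```

Because the label pairs are part of the key, adding, removing, or changing any label produces a new key, which matches the statement that such a change creates a new time series.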

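1.2 Metric types

1.2.1 Counter (only increases, never decreases)

A Counter is a cumulative metric that represents a single monotonically increasing counter. Its value can only increase, or be reset to zero on restart. It is typically combined with the rate() function to obtain the metric's rate of change over a time window.

For example, you can use a counter to represent the number of requests served, tasks completed, or errors.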
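To make the "reset to zero" behavior concrete, here is a minimal sketch of how a per-second rate can be derived from raw counter samples. It is a simplification: the real rate() function in PromQL also extrapolates to the boundaries of the query window, which this sketch does not attempt. The function names are illustrative only.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def counter_increase(samples: List[Sample]) -> float:
    """Sum the increase between consecutive samples.

    A drop in value is treated as a counter reset: the counter is
    assumed to have restarted from zero, so the new value itself
    counts as the increase for that step.
    """
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# The process restarts between t=15 and t=30, so the counter drops.
samples = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
print(counter_increase(samples))
print(simple_rate(samples))
```

Reading the raw value of a counter is rarely useful on its own; the reset handling is what lets a rate stay meaningful across restarts.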

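1.2.2 Gauge (can go up and down)

A Gauge is a metric that represents a single numerical value that can arbitrarily go up and down. Gauges are typically used for measured values, and a large share of everyday monitoring data is of this type.

Examples include temperature and current memory usage, as well as counts that can rise and fall, such as the number of concurrent requests. Other examples: CPU usage, memory usage, and the number of nodes in a cluster.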

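1.2.3 Histogram and Summary

A Histogram samples observations (usually request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.

// 260 series have at most 1.6382e+06 chunks (this metric records how many chunks Prometheus persists per local-storage series)
prometheus_local_storage_series_chunks_persisted_bucket{le="1.6382e+06"} 260

A Summary is similar to a histogram: it also samples observations (usually request durations and response sizes). Besides providing a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.

// 90% of the sync durations are below 0.014s
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.9"} 0.014

Both types are used to show how data is distributed, and both can track how often events occur or how large they are.
Summary and Histogram both expose a count of events (_count) and a sum of the observed values (_sum).

The difference is that a Histogram lets the server calculate quantiles with the histogram_quantile function, whereas the quantiles of a Summary are defined and computed directly on the client.
As a result, for quantile queries a Summary performs better at PromQL query time, while a Histogram consumes more resources on the server.

Quantiles precomputed by a Summary cannot be meaningfully averaged or aggregated afterwards, so a Summary is generally suited to standalone metrics, such as garbage collection time.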
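The server-side calculation mentioned above can be illustrated with a simplified sketch of bucket interpolation. It assumes cumulative buckets given as (upper bound, count) pairs, a lowest bucket that starts at 0, and observations spread evenly inside each bucket. It is an approximation of the idea behind histogram_quantile, not a reimplementation of the PromQL function, and it skips edge cases the real function handles.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (upper bound "le", cumulative count)


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    # The +Inf bucket holds the total count and is required.
    if not buckets or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[idx]

    if math.isinf(upper):
        # The quantile lies beyond the last finite bound, so the best
        # available answer is that bound itself.
        return buckets[-2][0] if len(buckets) > 1 else math.nan

    lower = buckets[idx - 1][0] if idx > 0 else 0.0
    prev_count = buckets[idx - 1][1] if idx > 0 else 0.0
    in_bucket = count - prev_count
    if in_bucket == 0:
        return lower  # nothing to interpolate over
    # Linear interpolation inside the selected bucket.
    return lower + (upper - lower) * (rank - prev_count) / in_bucket


buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.7, buckets))
```

Because only bucket counts are needed, the counts from several instances can be added together before the quantile is estimated. That is the practical advantage of a Histogram over the client-side quantiles of a Summary, at the price of accuracy that depends on the bucket layout.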

1.3 Jobs and instances

In Prometheus terms, an endpoint you can scrape is called an instance, and it usually corresponds to a single process.

A collection of instances with the same purpose, for example a process replicated for scalability or reliability, is called a job.

For example, an API server job with four instances:

job: api-server
instance 1: 1.2.3.4:5670
instance 2: 1.2.3.4:5671
instance 3: 1.2.3.3:5677
instance 4: 1.2.3.3:5678

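1.4 Automatically generated labels and time series

When Prometheus scrapes a target, it automatically attaches some labels to the scraped time series to identify the target:

job: the configured job name that the target belongs to.
instance: the host:port part of the URL that was scraped.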
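As a small illustration of the two labels above, the following sketch derives them from a job name and a scrape URL using only the Python standard library. The function name target_labels is hypothetical, and the sketch assumes the URL spells out the port explicitly; it does not model relabeling or any other configuration that can change these labels.

```python
from typing import Dict
from urllib.parse import urlsplit


def target_labels(job_name: str, scrape_url: str) -> Dict[str, str]:
    """Build the labels that identify a scraped target.

    'instance' is taken from the host:port part of the scrape URL,
    'job' from the configured job name.
    """
    parts = urlsplit(scrape_url)
    return {"job": job_name, "instance": parts.netloc}


print(target_labels("api-server", "http://1.2.3.4:5670/metrics"))
```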

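For each instance scrape, Prometheus stores a sample in the following time series:

up{job="<job-name>", instance="<instance-id>"}: 1 if the instance is healthy, i.e. reachable, or 0 if the scrape failed.
scrape_duration_seconds{job="<job-name>", instance="<instance-id>"}: the duration of the scrape.
scrape_samples_post_metric_relabeling{job="<job-name>", instance="<instance-id>"}: the number of samples remaining after metric relabeling was applied.
scrape_samples_scraped{job="<job-name>", instance="<instance-id>"}: the number of samples the target exposed.
scrape_series_added{job="<job-name>", instance="<instance-id>"}: the approximate number of new series in this scrape. New in v2.10.

Part Two: Installation (2.6.1)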

Step 1: Download the Prometheus package, version 2.6.1
Link: https://pan.baidu.com/s/1UY2JwofNKN6LbwtS8CxqUg
Extraction code: xs0r

Step 2: Upload the package to the Linux host

Step 3: Create a prometheus folder under the /opt directory

[root@localhost ~]# cd /opt
[root@localhost opt]# ls
apache-maven-3.5.4             elasticsearch-7.6.1                  filebeat                    rh
apache-maven-3.5.4-bin.tar.gz  elasticsearch-analysis-ik-7.4.0      kibana-7.6.1-linux-x86_64
containerd                     elasticsearch-analysis-ik-7.4.0.zip  maven
[root@localhost opt]# mkdir prometheus
[root@localhost opt]# ls
apache-maven-3.5.4             elasticsearch-7.6.1                  filebeat                    prometheus
apache-maven-3.5.4-bin.tar.gz  elasticsearch-analysis-ik-7.4.0      kibana-7.6.1-linux-x86_64   rh
containerd                     elasticsearch-analysis-ik-7.4.0.zip  maven
[root@localhost opt]#

Step 4: Move the archive into the prometheus directory and extract it there

Move:

[root@localhost ~]# mv prometheus-2.6.1.linux-amd64.tar.gz /opt/prometheus/
[root@localhost ~]# ls /opt/prometheus/
prometheus-2.6.1.linux-amd64.tar.gz
[root@localhost ~]#

Extract:

[root@localhost prometheus]# tar xvfz prometheus-2.6.1.linux-amd64.tar.gz
prometheus-2.6.1.linux-amd64/
prometheus-2.6.1.linux-amd64/LICENSE
prometheus-2.6.1.linux-amd64/prometheus
prometheus-2.6.1.linux-amd64/prometheus.yml
prometheus-2.6.1.linux-amd64/consoles/
prometheus-2.6.1.linux-amd64/consoles/prometheus-overview.html
prometheus-2.6.1.linux-amd64/consoles/node-overview.html
prometheus-2.6.1.linux-amd64/consoles/index.html.example
prometheus-2.6.1.linux-amd64/consoles/prometheus.html
prometheus-2.6.1.linux-amd64/consoles/node-disk.html
prometheus-2.6.1.linux-amd64/consoles/node-cpu.html
prometheus-2.6.1.linux-amd64/consoles/node.html
prometheus-2.6.1.linux-amd64/console_libraries/
prometheus-2.6.1.linux-amd64/console_libraries/prom.lib
prometheus-2.6.1.linux-amd64/console_libraries/menu.lib
prometheus-2.6.1.linux-amd64/NOTICE
prometheus-2.6.1.linux-amd64/promtool
[root@localhost prometheus]#

List the extracted files:

[root@localhost prometheus]# cd prometheus-2.6.1.linux-amd64/
[root@localhost prometheus-2.6.1.linux-amd64]# ll
total 92880
drwxr-xr-x. 2 3434 3434       38 Jan 16  2019 console_libraries
drwxr-xr-x. 2 3434 3434      173 Jan 16  2019 consoles
-rw-r--r--. 1 3434 3434    11357 Jan 16  2019 LICENSE
-rw-r--r--. 1 3434 3434     2769 Jan 16  2019 NOTICE
-rwxr-xr-x. 1 3434 3434 57700945 Jan 16  2019 prometheus
-rw-r--r--. 1 3434 3434      926 Jan 16  2019 prometheus.yml
-rwxr-xr-x. 1 3434 3434 37381169 Jan 16  2019 promtool
[root@localhost prometheus-2.6.1.linux-amd64]#

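Part Four: A simple Prometheus deployment

Step 1: Review the configuration file

Go to the Prometheus installation directory and open prometheus.yml. The configuration below makes Prometheus monitor its own health.

Contents of the configuration file: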

[root@localhost prometheus-2.6.1.linux-amd64]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

Prometheus pulls sample data from its targets over HTTP. It can also scrape its own metrics and so monitor its own health.

Step 2: Start Prometheus

# Start Prometheus.
# By default, Prometheus stores its database under ./data (flag --storage.tsdb.path).
# Run the command from the installation directory.
./prometheus --config.file=prometheus.yml

Run it:

[root@localhost prometheus-2.6.1.linux-amd64]# ./prometheus --config.file=prometheus.yml

Step 3: Access test

Step 4: View the exposed metrics

http://192.168.156.132:9090/metrics
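The metrics page can also be read from a script instead of a browser. The sketch below uses only the Python standard library. The helper names fetch_metrics_text and parse_samples are made up for this example, and the parser assumes the simple case where each sample line ends with its value and carries no trailing timestamp.

```python
import urllib.request
from typing import Dict


def fetch_metrics_text(url: str, timeout: float = 5.0) -> str:
    """Download the plain-text metrics page from a scrape endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_samples(text: str) -> Dict[str, float]:
    """Map each 'name{labels}' string to its value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumes no timestamp follows the value.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is everything after the last space on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


example = """\
# HELP prometheus_tsdb_wal_fsync_duration_seconds Duration of WAL fsync.
# TYPE prometheus_tsdb_wal_fsync_duration_seconds summary
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.9"} 0.014
"""
print(parse_samples(example))
```

Against the server started above, the call would be parse_samples(fetch_metrics_text("http://192.168.156.132:9090/metrics")), with the address replaced by that of your own host.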
