
Pulsar Operations and Monitoring

史烈
2023-12-01

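Message queues are used to decouple programs asynchronously.

Queuing: each message is consumed once, in no particular order.

Streaming: messages can be consumed multiple times, in a specific order.

Pulsar supports both models.

Exclusive, Failover: streaming consumption modes

Shared, Key_Shared: queue consumption modes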
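To make the difference between the two queue-style modes more concrete, the sketch below simulates how messages could be spread over consumers. It is a toy model in plain Python, not the Pulsar client API: `dispatch_shared` hands messages out round-robin, while `dispatch_key_shared` picks the consumer from a hash of the message key, so messages with the same key stay on one consumer and keep their relative order. The function names and the CRC32 hash are assumptions made for illustration; the real dispatcher uses its own hashing and also deals with acknowledgements and redelivery.

```python
import zlib
from collections import defaultdict


def dispatch_shared(messages, consumers):
    """Toy model of a Shared subscription: round-robin across consumers."""
    assignment = defaultdict(list)
    for i, (_key, payload) in enumerate(messages):
        assignment[consumers[i % len(consumers)]].append(payload)
    return dict(assignment)


def dispatch_key_shared(messages, consumers):
    """Toy model of Key_Shared: the same key always maps to the same consumer."""
    assignment = defaultdict(list)
    for key, payload in messages:
        slot = zlib.crc32(key.encode("utf-8")) % len(consumers)
        assignment[consumers[slot]].append(payload)
    return dict(assignment)


# Hypothetical sample data: (key, payload) pairs and two consumers.
messages = [("user-a", "a1"), ("user-a", "a2"), ("user-b", "b1"), ("user-b", "b2")]
consumers = ["c1", "c2"]

print(dispatch_shared(messages, consumers))      # {'c1': ['a1', 'b1'], 'c2': ['a2', 'b2']}
print(dispatch_key_shared(messages, consumers))  # a1/a2 share a consumer, as do b1/b2
```

In the Shared result the two messages of `user-a` end up on different consumers, which is why that mode gives no ordering guarantee per key.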

Pulsar features: durability, ordering, delivery guarantees, high throughput, low latency, a unified messaging model, multi-tenancy, geo-replication, and high scalability and availability.

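1. Apache Pulsar core components

Broker: the compute layer. It handles protocol parsing for producer and consumer interactions.

BookKeeper/Bookie: the storage layer. Data is stored in segments.

ZooKeeper: the coordination layer. It manages cluster metadata and provides service discovery (detecting nodes that join or go down).

Component interaction: brokers and bookies both register with ZooKeeper. Brokers handle the read and write requests of producers and consumers and forward them to the bookies.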

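Component ports (defaults, configurable):

Broker: TCP/6650: accepts Pulsar client connections

HTTP/8080: exposes Prometheus metrics and the Pulsar admin API

BookKeeper: TCP/3181: the port brokers use to connect to BookKeeper

HTTP/8000: exposes Prometheus metrics

ZooKeeper: TCP/2181: the port brokers and bookies use to connect to ZooKeeper

HTTP/8000: exposes Prometheus metrics

TCP ports are mainly used for internal communication between components and for client access.

HTTP ports are mainly used to provide the REST API and to expose Prometheus metrics.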
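A quick way to confirm that the components are listening is to try a TCP connection to each port. The following is a minimal sketch that assumes all components run on `localhost` with the default ports listed above; the `DEFAULT_PORTS` labels and the `is_port_open` helper are names made up for this example.

```python
import socket

# Default ports from the list above; the labels are only for display.
DEFAULT_PORTS = {
    "broker (client)": 6650,
    "broker (http)": 8080,
    "bookie": 3181,
    "zookeeper": 2181,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


for name, port in DEFAULT_PORTS.items():
    state = "open" if is_port_open("localhost", port) else "closed"
    print(f"{name:16} {port:5} {state}")
```

An open port only shows that something is listening; it does not prove the process behind it is a healthy Pulsar component.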

Number of nodes in a distributed deployment:

Broker: at least 2

BookKeeper: at least 3

ZooKeeper: an odd number, at least 3
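The odd-number recommendation for ZooKeeper follows from majority quorums: the ensemble stays available only while more than half of its nodes are up. The short calculation below shows that going from 3 to 4 nodes does not add any failure tolerance, while going to 5 does. The helper names are chosen for this example.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node ensemble."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can fail while a majority is still alive."""
    return n - majority(n)


for n in (3, 4, 5):
    print(f"nodes={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
# nodes=3 majority=2 tolerated_failures=1
# nodes=4 majority=3 tolerated_failures=1
# nodes=5 majority=3 tolerated_failures=2
```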

2. Getting started with Pulsar

2.1 Pulsar download

Download page: Apache Pulsar

Tsinghua mirror: Index of /apache/pulsar

2.2 Standalone mode for local development

Start command, foreground: bin/pulsar standalone

Background: bin/pulsar-daemon start standalone

List clusters: bin/pulsar-admin clusters list

List brokers (the argument is the cluster name): bin/pulsar-admin brokers list test

List topics: bin/pulsar-admin topics list public/default

Produce a message from the command line: bin/pulsar-client produce my-topic --messages "hello-pulsar"

Consume messages from the command line: bin/pulsar-client consume my-topic -s "first-subscription"

2.3 Cluster mode

For test environments, Docker is the recommended way to run a cluster.

2.4 Operations tool: Pulsar Manager

wget https://dist.apache.org/repos/dist/release/pulsar/pulsar-manager/pulsar-manager-0.2.0/apache-pulsar-manager-0.2.0-bin.tar.gz
tar -zxvf apache-pulsar-manager-0.2.0-bin.tar.gz
cd pulsar-manager
tar -xvf pulsar-manager.tar
cd pulsar-manager
cp -r ../dist ui

We recommend enabling the following two settings.

1. application.properties

bookie.enable=true
pulsar.peek.message=true

2. bkvm.conf

bookie.enable=true

Run bin/pulsar-manager to start the program.

Initialize the superuser name and password. The request is sent to the Pulsar Manager backend on port 7750, so run it once the service is up and before the first login:

CSRF_TOKEN=$(curl http://localhost:7750/pulsar-manager/csrf-token)
curl \
    -H "X-XSRF-TOKEN: $CSRF_TOKEN" \
    -H "Cookie: XSRF-TOKEN=$CSRF_TOKEN;" \
    -H 'Content-Type: application/json' \
    -X PUT http://localhost:7750/pulsar-manager/users/superuser \
    -d '{"name": "admin", "password": "apachepulsar", "description": "test", "email": "username@test.org"}'

Pulsar UI (admin/apachepulsar): http://localhost:7750/ui/index.html

Bookie UI (admin/admin): http://localhost:7750/bkvm/
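The superuser initialization request from this section can also be sent from a script. The sketch below mirrors the two curl calls using only the Python standard library; it assumes the Pulsar Manager backend is already listening on `localhost:7750`, and the function names are made up for this example. Nothing is sent until `create_superuser()` is called.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7750/pulsar-manager"


def fetch_csrf_token(base_url=BASE_URL):
    """GET the CSRF token, like the first curl call."""
    with urllib.request.urlopen(f"{base_url}/csrf-token") as resp:
        return resp.read().decode("utf-8").strip()


def build_superuser_request(token, name, password, email,
                            description="test", base_url=BASE_URL):
    """Build the PUT request with the same headers and JSON body as the curl call."""
    body = json.dumps({
        "name": name,
        "password": password,
        "description": description,
        "email": email,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/users/superuser",
        data=body,
        method="PUT",
        headers={
            "X-XSRF-TOKEN": token,
            "Cookie": f"XSRF-TOKEN={token};",
            "Content-Type": "application/json",
        },
    )


def create_superuser(name="admin", password="apachepulsar", email="username@test.org"):
    """Fetch a token, send the request, and return the HTTP status code."""
    token = fetch_csrf_token()
    request = build_superuser_request(token, name, password, email)
    with urllib.request.urlopen(request) as resp:
        return resp.status
```

Splitting request construction from sending keeps the payload easy to inspect before anything reaches the server.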

2.5 Monitoring tools: Prometheus & Grafana

2.5.1 Prometheus

Reference template for prometheus.yml: cluster.yml.template in the streamnative/apache-pulsar-grafana-dashboard repository on GitHub (master branch)

#
# Copyright (c) 2018 Sijie. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

---
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).
  external_labels:
    # TODO: replace `<cluster-name>` with the right cluster name. E.g.
    #
    # cluster: test-cluster
    cluster: <cluster-name>

# Load and evaluate rules in these files every 'evaluation_interval' seconds.
# rule_files:

scrape_configs:

  - job_name: "proxy"
    honor_labels: true # don't overwrite job & instance labels
    static_configs:
    - targets:
      # TODO: add the proxies to monitor
      #
      # - 'proxy1:8080'
      # - 'proxy2:8080'
      # - ...

  - job_name: "broker"
    honor_labels: true # don't overwrite job & instance labels
    static_configs:
    - targets:
      # TODO: add the brokers to monitor
      #
      # - 'broker1:8080'
      # - 'broker2:8080'
      # - ...

  - job_name: "bookie"
    honor_labels: true # don't overwrite job & instance labels
    static_configs:
    - targets:
      # TODO: add the bookies to monitor
      #
      # - 'bookie1:8000'
      # - 'bookie2:8000'
      # - ...

  - job_name: "zookeeper"
    honor_labels: true
    static_configs:
    - targets:
      # TODO: add the zookeeper nodes to monitor
      #
      # - 'zookeeper1:8000'
      # - 'zookeeper2:8000'
      # - ...

  - job_name: "node_metrics"
    honor_labels: true # don't overwrite job & instance labels
    static_configs:
    - targets:
      # TODO: add the physical machines to monitor
      #
      # - 'node1:9100'
      # - 'node2:9100'
      # - ...
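Filling in the TODO target lists by hand gets tedious with more than a few nodes. As a rough sketch, the helper below renders one `scrape_configs` entry in the same layout as the template from a list of `host:port` strings. The host names are the placeholder names from the template comments, and `render_job` is a name invented for this example.

```python
def render_job(job_name, targets):
    """Render one scrape job in the same indentation style as the template."""
    lines = [
        f'  - job_name: "{job_name}"',
        "    honor_labels: true",
        "    static_configs:",
        "    - targets:",
    ]
    lines += [f"      - '{target}'" for target in targets]
    return "\n".join(lines)


# Placeholder inventory; replace with the real hosts of your cluster.
inventory = {
    "broker": ["broker1:8080", "broker2:8080"],
    "bookie": ["bookie1:8000", "bookie2:8000", "bookie3:8000"],
    "zookeeper": ["zookeeper1:8000", "zookeeper2:8000", "zookeeper3:8000"],
}

print("scrape_configs:")
for job, hosts in inventory.items():
    print()
    print(render_job(job, hosts))
```

The output can be pasted under `scrape_configs:` in prometheus.yml in place of the commented-out examples.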

After editing the configuration, start Prometheus.

Prometheus UI: http://localhost:9090

2.5.2 Grafana

Run bin/grafana-server to start Grafana.

Grafana UI: http://localhost:3000

Dashboards: GitHub - streamnative/apache-pulsar-grafana-dashboard: Apache Pulsar Grafana Dashboard

In that repository, run the following command:

./scripts/generate_dashboards.sh <prometheus-url> <clustername>

<prometheus-url>: The URL of your Prometheus service, e.g. http://localhost:9090

<clustername>: Your pulsar cluster name.

In Grafana, choose Manage -> Import -> Upload JSON File, then select the JSON file to import from the apache-pulsar-grafana-dashboard/target/dashboards directory.

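2.5.3 Perf stress testing

Pulsar provides a command-line tool for stress testing. Use the following command to produce messages:

  • -r: total number of messages produced per second (across all producers)

  • -n: number of producers

  • -s: size of each message (bytes)

  • the topic name goes last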

bin/pulsar-perf produce -r 100 -n 2 -s 1024 test-perf

# Output fields, from left to right:
# messages produced per second: 87.2
# throughput per second: 0.7 Mbit
# failed messages per second: 0
# mean latency: 5.478 ms
# median latency: 4.642 ms
# 95% of latencies are within 11.263 ms
# 99% of latencies are within 25.802 ms
# 99.9% of latencies are within 43.757 ms
# 99.99% of latencies are within 51.956 ms
# maximum latency: 51.956 ms

... Throughput produced:   87.2  msg/s ---      0.7 Mbit/s --- failure      0.0 msg/s --- Latency: mean:   5.478 ms - med:   4.642 - 95pct:  11.263 - 99pct:  25.802 - 99.9pct:  43.757 - 99.99pct:  51.956 - Max:  51.956
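The Mbit/s column can be sanity-checked against the message rate. The sample figures in this section are consistent with messages per second × message size in bytes × 8 bits, divided by 1024²; this is inferred from the numbers shown here, not taken from the tool's source code. The helper name is made up for this example, and the 1024-byte size comes from the `-s 1024` flag.

```python
def mbit_per_second(msgs_per_second, message_size_bytes):
    """Convert a message rate and message size into Mbit/s (1 Mbit = 1024 * 1024 bits)."""
    return msgs_per_second * message_size_bytes * 8 / (1024 * 1024)


# Producer sample: 87.2 msg/s of 1024-byte messages.
print(round(mbit_per_second(87.2, 1024), 1))     # 0.7

# Consumer sample from the same section: 100.007 msg/s.
print(round(mbit_per_second(100.007, 1024), 3))  # 0.781
```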

Use the following command to consume messages:

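bin/pulsar-perf consume test-perf

# Output fields, from left to right:
# messages consumed per second: 100.007
# throughput per second: 0.781 Mbit
# mean latency: 9.273 ms
# median latency: 9 ms
# 95% of latencies are within 14 ms
# 99% of latencies are within 15 ms
# 99.9% of latencies are within 28 ms
# 99.99% of latencies are within 34 ms
# maximum latency: 34 ms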
... Throughput received: 100.007  msg/s -- 0.781 Mbit/s --- Latency: mean: 9.273 ms - med: 9 - 95pct: 14 - 99pct: 15 - 99.9pct: 28 - 99.99pct: 34 - Max: 34
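When a test runs for a long time, it helps to turn these summary lines into numbers that can be charted or compared. The sketch below parses the two sample lines from this section with regular expressions. It is written against the format shown here, which may differ between Pulsar versions, and `parse_perf_line` is a name made up for this example.

```python
import re

CONSUMER_LINE = (
    "... Throughput received: 100.007  msg/s -- 0.781 Mbit/s --- Latency: mean: 9.273 ms "
    "- med: 9 - 95pct: 14 - 99pct: 15 - 99.9pct: 28 - 99.99pct: 34 - Max: 34"
)


def parse_perf_line(line):
    """Extract rate, throughput and latency figures from one pulsar-perf summary line."""
    rate = re.search(r"Throughput (?:produced|received):\s*([\d.]+)\s+msg/s", line)
    mbit = re.search(r"([\d.]+)\s+Mbit/s", line)
    latency = re.findall(r"(mean|med|[\d.]+pct|Max):\s*([\d.]+)", line)
    return {
        "msg_per_s": float(rate.group(1)),
        "mbit_per_s": float(mbit.group(1)),
        "latency_ms": {name: float(value) for name, value in latency},
    }


stats = parse_perf_line(CONSUMER_LINE)
print(stats["msg_per_s"])            # 100.007
print(stats["latency_ms"]["99pct"])  # 15.0
```

The same function handles the producer line, because the pattern accepts both `produced` and `received`.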

Appendix: Apache Pulsar introductory resources

TGIP CN: GitHub - streamnative/tgip-cn: TGIP-CN (Thank God Its Pulsar) is a weekly live video streaming about Apache Pulsar in Chinese.

Bilibili: StreamNative's channel on Bilibili

WeChat official account: Apache Pulsar, "From Getting Started to Practice" collection (Holiday study pack | Apache Pulsar from getting started to practice)

Official documentation: Pulsar (Index of /docs), BookKeeper (Apache BookKeeper - Apache BookKeeper 4.5.0-SNAPSHOT Documentation)

Examples: https://github.com/streamnative/examples

GitHub - streamnative/psat_exercise_code: pulsar summit asia workshop execise code
