Before installing Kafka, install the Zookeeper service first. For Zookeeper deployment, see: https://blog.csdn.net/qq_48671620/article/details/125154389?spm=1001.2014.3001.5501.
Service | node01 | node02 | node03 |
---|---|---|---|
Zookeeper | follower | leader | follower |
Kafka | kafka | kafka | kafka |
Official Kafka download page: https://kafka.apache.org/downloads
(1) Extract the installation package
tar -zxvf kafka_2.12-3.2.0.tgz -C /opt/soft/
(2) Go to the /opt/soft/kafka_2.12-3.2.0/config directory and edit server.properties, setting the following parameters:
cd /opt/soft/kafka_2.12-3.2.0/config
vim server.properties
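# Globally unique broker ID; must be an integer and must not repeat within the cluster.
broker.id=0
# Number of threads handling network requests
num.network.threads=3
# Number of threads handling disk I/O
num.io.threads=8
# Send buffer size of the socket server, in bytes
socket.send.buffer.bytes=102400
# Receive buffer size of the socket server, in bytes
socket.receive.buffer.bytes=102400
# Maximum size of a single request the socket server will accept, in bytes
socket.request.max.bytes=104857600
# Directory where Kafka stores its data (the message log segments, not application logs).
# It does not need to exist beforehand; Kafka creates it automatically.
# Several disk paths can be configured, separated by ",".
log.dirs=/opt/soft/kafka_2.12-3.2.0/data
# Default number of partitions for a topic when none is specified at creation
num.partitions=1
# Number of threads per data directory used for log recovery at startup and flushing at shutdown
num.recovery.threads.per.data.dir=1
# Replication factor of the internal __consumer_offsets topic (set to 1 here)
offsets.topic.replication.factor=1
# Maximum time a segment file is retained before it is deleted (168 hours = 7 days)
log.retention.hours=168
# Maximum size of a single segment file (1 GiB here)
log.segment.bytes=1073741824
# Interval at which segments are checked for expiry (300000 ms = 5 minutes)
log.retention.check.interval.ms=300000
# Zookeeper cluster connection string; the trailing /kafka keeps all Kafka data under one path in Zookeeper for easier management
zookeeper.connect=node01:2181,node02:2181,node03:2181/kafka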
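After editing, it can help to confirm which values Kafka will actually read from the file. The following is a minimal sketch and not part of the original procedure: `get_prop` is a hypothetical helper name, and it assumes plain `key=value` lines with no spaces around `=` and no line continuations.

```shell
#!/bin/sh
# get_prop FILE KEY
# Print the value of KEY from a Java-style properties file.
# Comment lines are skipped; if KEY appears more than once, the last one wins.
# Exits non-zero when KEY is not present.
get_prop() {
  awk -F= -v key="$2" '
    /^[[:space:]]*#/ { next }
    $1 == key { sub(/^[^=]*=/, ""); value = $0; found = 1 }
    END { if (found) print value; exit !found }
  ' "$1"
}
```

For example, `get_prop /opt/soft/kafka_2.12-3.2.0/config/server.properties zookeeper.connect` should print the connection string ending in `/kafka`.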
(3) Distribute the installation package to the other nodes
ssh_do_scp.sh ~/bin/node.list /opt/soft/kafka_2.12-3.2.0/ /opt/soft/
(4) On node02 and node03, edit the configuration file /opt/soft/kafka_2.12-3.2.0/config/server.properties
Note: broker.id must be unique across the whole cluster.
vim /opt/soft/kafka_2.12-3.2.0/config/server.properties
node01 -> broker.id=0
node02 -> broker.id=1
node03 -> broker.id=2
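Editing each node by hand works for three brokers; the same mapping could also be scripted. This is a sketch under stated assumptions: the helper names `broker_id_for_host` and `set_broker_id` are hypothetical, and the mapping only understands host names of the form `nodeNN` as used in this cluster.

```shell
#!/bin/sh
# Hypothetical helper: derive the broker id from a host name of the form nodeNN
# (node01 -> 0, node02 -> 1, node03 -> 2). Other naming schemes are not handled.
broker_id_for_host() {
  n=$(printf '%s' "${1#node}" | sed 's/^0*//')   # strip "node" and leading zeros
  echo $(( ${n:-0} - 1 ))
}

# set_broker_id FILE ID
# Replace the broker.id line of a properties file with the given id,
# writing to a temporary file first so the original is swapped in one step.
set_broker_id() {
  sed "s/^broker\.id=.*/broker.id=$2/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

On each node this could then be run as `set_broker_id /opt/soft/kafka_2.12-3.2.0/config/server.properties "$(broker_id_for_host "$(hostname)")"`, assuming `hostname` returns the short node name.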
(1) Add the Kafka environment variables to /etc/profile
vim /etc/profile
Add the following:
export KAFKA_HOME=/opt/soft/kafka_2.12-3.2.0
export PATH=$PATH:$KAFKA_HOME/bin
(2) Distribute the profile to the other nodes, then run source on each node.
ssh_do_scp.sh ~/bin/node.list /etc/profile /etc/
source /etc/profile
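Since `source` only affects the shell it runs in, a quick check on each node can confirm that the Kafka scripts are reachable. A minimal sketch; `path_contains` is a hypothetical helper name, not something shipped with Kafka.

```shell
#!/bin/sh
# path_contains DIR
# Succeeds if DIR is one of the colon-separated entries of $PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

For instance, `path_contains "$KAFKA_HOME/bin" || echo "Kafka bin directory is not on PATH"`.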
(1) Start the Zookeeper cluster
[root@node01 ~]# myzookeeper.sh start
-----------------启动node01 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
-----------------启动node02 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
-----------------启动node03 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node01 ~]#
[root@node01 ~]#
[root@node01 ~]# myzookeeper.sh status
-----------------停止node01 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
-----------------停止node02 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
-----------------停止node03 Zookeeper----------------
ZooKeeper JMX enabled by default
Using config: /opt/soft/zookeeper-3.7.1/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
(2) Start Kafka on node01, node02, and node03 in turn.
[root@node01 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
[root@node01 ~]# jps
2704 Jps
2187 QuorumPeerMain
2619 Kafka
[root@node02 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
[root@node02 ~]# jps
2400 Jps
2322 Kafka
1881 QuorumPeerMain
[root@node03 ~]# kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
[root@node03 ~]# jps
1976 Kafka
2059 Jps
1550 QuorumPeerMain
Note: the argument must be the full path to the server.properties file itself, not just its directory.
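The `jps` listings above are checked by eye for a `Kafka` entry. The same check could be scripted as follows; this is a sketch, and `jps_has` is a hypothetical helper name. It reads `jps` output on standard input so it can be combined with `ssh` or used on saved output.

```shell
#!/bin/sh
# jps_has NAME
# Reads `jps` output ("<pid> <main class>") on stdin and succeeds
# if some JVM's main class is exactly NAME.
jps_has() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

Usage could look like `jps | jps_has Kafka && echo "broker is running"`.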
(3) Stop the cluster
[root@node01 ~]# ssh_do_all.sh ~/bin/node.list "/opt/soft/kafka_2.12-3.2.0/bin/kafka-server-stop.sh"
==================== node01 ====================
Connection to node01 closed.
==================== node02 ====================
Connection to node02 closed.
==================== node03 ====================
Connection to node03 closed.
[root@node01 ~]# jpsall
=============== node01 ===============
2187 QuorumPeerMain
2764 Jps
=============== node02 ===============
2322 Kafka
1881 QuorumPeerMain
2445 Jps
=============== node03 ===============
2101 Jps
1976 Kafka
1550 QuorumPeerMain
(1) In the /home/root/bin directory, create the script file mykafka.sh
vim mykafka.sh
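#!/bin/bash
# Start or stop Kafka on every node of the cluster.
# $KAFKA_HOME is expanded locally, so it must be set here and be the same path on all nodes.
hosts=(node01 node02 node03)
case "$1" in
"start")
    for host in "${hosts[@]}"
    do
        echo " --------Starting Kafka on $host-------"
        ssh "$host" "$KAFKA_HOME/bin/kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties"
    done
    ;;
"stop")
    for host in "${hosts[@]}"
    do
        echo " --------Stopping Kafka on $host-------"
        ssh "$host" "$KAFKA_HOME/bin/kafka-server-stop.sh"
    done
    ;;
*)
    echo "Usage: $(basename "$0") {start|stop}"
    exit 1
    ;;
esac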
(2) Add execute permission
chmod +x mykafka.sh
(3) Start the cluster
mykafka.sh start
(4) Stop the cluster
mykafka.sh stop
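Note: when shutting down, wait until the Kafka process has exited on every node before stopping the Zookeeper cluster. Zookeeper holds the Kafka cluster's metadata; if it is stopped first, the brokers can no longer obtain the information they need to shut down, and the Kafka processes have to be killed manually.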
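The waiting step described in the note could be automated with a small polling loop. This is a sketch only: `wait_until_gone` is a hypothetical helper, and the check command passed to it is up to the caller.

```shell
#!/bin/sh
# wait_until_gone TIMEOUT_SECONDS CHECK_CMD [ARGS...]
# Re-runs CHECK_CMD about once per second. Returns 0 as soon as CHECK_CMD
# fails (meaning the process is gone), or non-zero if it still succeeds
# after TIMEOUT_SECONDS.
wait_until_gone() {
  remaining="$1"; shift
  while [ "$remaining" -gt 0 ]; do
    "$@" || return 0          # check failed: nothing left to wait for
    sleep 1
    remaining=$((remaining - 1))
  done
  ! "$@"                      # one last check after the timeout
}
```

On a single node this might be used as `wait_until_gone 60 sh -c 'jps | grep -qw Kafka'` before stopping Zookeeper; covering all three nodes would require wrapping the check in `ssh` for each host.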
(1) View the options of the topic command
[root@node01 ~]# kafka-topics.sh
Create, delete, describe, or change a topic.
Option Description
------ -----------
--alter Alter the number of partitions,
replica assignment, and/or
configuration for the topic.
--at-min-isr-partitions if set when describing topics, only
show partitions whose isr count is
equal to the configured minimum.
--bootstrap-server <String: server to REQUIRED: The Kafka server to connect
connect to> to.
--command-config <String: command Property file containing configs to be
config property file> passed to Admin Client. This is used
only with --bootstrap-server option
for describing and altering broker
configs.
--config <String: name=value> A topic configuration override for the
topic being created or altered. The
following is a list of valid
configurations:
cleanup.policy
compression.type
delete.retention.ms
file.delete.delay.ms
flush.messages
flush.ms
follower.replication.throttled.
replicas
index.interval.bytes
leader.replication.throttled.replicas
local.retention.bytes
local.retention.ms
max.compaction.lag.ms
max.message.bytes
message.downconversion.enable
message.format.version
message.timestamp.difference.max.ms
message.timestamp.type
min.cleanable.dirty.ratio
min.compaction.lag.ms
min.insync.replicas
preallocate
remote.storage.enable
retention.bytes
retention.ms
segment.bytes
segment.index.bytes
segment.jitter.ms
segment.ms
unclean.leader.election.enable
See the Kafka documentation for full
details on the topic configs. It is
supported only in combination with --
create if --bootstrap-server option
is used (the kafka-configs CLI
supports altering topic configs with
a --bootstrap-server option).
--create Create a new topic.
--delete Delete a topic
--delete-config <String: name> A topic configuration override to be
removed for an existing topic (see
the list of configurations under the
--config option). Not supported with
the --bootstrap-server option.
--describe List details for the given topics.
--disable-rack-aware Disable rack aware replica assignment
--exclude-internal exclude internal topics when running
list or describe command. The
internal topics will be listed by
default
--help Print usage information.
--if-exists if set when altering or deleting or
describing topics, the action will
only execute if the topic exists.
--if-not-exists if set when creating topics, the
action will only execute if the
topic does not already exist.
--list List all available topics.
--partitions <Integer: # of partitions> The number of partitions for the topic
being created or altered (WARNING:
If partitions are increased for a
topic that has a key, the partition
logic or ordering of the messages
will be affected). If not supplied
for create, defaults to the cluster
default.
--replica-assignment <String: A list of manual partition-to-broker
broker_id_for_part1_replica1 : assignments for the topic being
broker_id_for_part1_replica2 , created or altered.
broker_id_for_part2_replica1 :
broker_id_for_part2_replica2 , ...>
--replication-factor <Integer: The replication factor for each
replication factor> partition in the topic being
created. If not supplied, defaults
to the cluster default.
--topic <String: topic> The topic to create, alter, describe
or delete. It also accepts a regular
expression, except for --create
option. Put topic name in double
quotes and use the '\' prefix to
escape regular expression symbols; e.
g. "test\.topic".
--topic-id <String: topic-id> The topic-id to describe.This is used
only with --bootstrap-server option
for describing topics.
--topics-with-overrides if set when describing topics, only
show topics that have overridden
configs
--unavailable-partitions if set when describing topics, only
show partitions whose leader is not
available
--under-min-isr-partitions if set when describing topics, only
show partitions whose isr count is
less than the configured minimum.
--under-replicated-partitions if set when describing topics, only
show under replicated partitions
--version Display Kafka version.
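Option | Description |
---|---|
--bootstrap-server <String: server to connect to> | Host name and port of the Kafka broker to connect to |
--topic <String: topic> | Name of the topic to operate on |
--create | Create a topic |
--delete | Delete a topic |
--alter | Alter a topic |
--list | List all topics |
--describe | Show detailed information about a topic |
--partitions <Integer: # of partitions> | Set the number of partitions |
--replication-factor <Integer: replication factor> | Set the replication factor of each partition |
--config <String: name=value> | Override a default configuration value for the topic |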
(2) Create the test topic
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --create --partitions 1 --replication-factor 3 --topic test
Created topic test.
Option notes:
--topic sets the topic name
--replication-factor sets the number of replicas
--partitions sets the number of partitions
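The commands in this section pass a single broker, node01:9092, to `--bootstrap-server`. A comma-separated `host:port` list is generally accepted as well, so that the client can still connect when one broker is down. The helper below is a sketch for building such a list; `bootstrap_servers` is a hypothetical name, and port 9092 is assumed for every host, as in the examples here.

```shell
#!/bin/sh
# bootstrap_servers HOST...
# Join the given host names into "host1:9092,host2:9092,...".
bootstrap_servers() {
  port=9092
  list=""
  for host in "$@"; do
    list="${list:+$list,}$host:$port"   # prepend a comma only after the first entry
  done
  printf '%s\n' "$list"
}
```

For example, `kafka-topics.sh --bootstrap-server "$(bootstrap_servers node01 node02 node03)" --list`.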
(3) List all topics on the current server
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --list
test
(4) Show the details of the test topic
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --describe --topic test
Topic: test TopicId: ay7jWkv3RVSzFpEdXn35oQ PartitionCount: 1 ReplicationFactor: 3 Configs: segment.bytes=1073741824
Topic: test Partition: 0 Leader: 1 Replicas: 1,0,2 Isr: 1,0,2
(5) Change the number of partitions (note: the partition count can only be increased, never decreased)
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --alter --topic test --partitions 3
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --describe --topic test
Topic: test TopicId: ay7jWkv3RVSzFpEdXn35oQ PartitionCount: 3 ReplicationFactor: 3 Configs: segment.bytes=1073741824
Topic: test Partition: 0 Leader: 1 Replicas: 1,0,2 Isr: 1,0,2
Topic: test Partition: 1 Leader: 2 Replicas: 2,1,0 Isr: 2,1,0
Topic: test Partition: 2 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
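In the output above, each of the three brokers leads exactly one partition. For topics with many partitions, the leader distribution could be summarized with a short filter. A sketch, assuming the `--describe` output format shown above; `leaders_per_broker` is a hypothetical helper name.

```shell
#!/bin/sh
# leaders_per_broker
# Reads `kafka-topics.sh --describe` output on stdin and prints
# "<broker id> <number of partitions it leads>", sorted by broker id.
leaders_per_broker() {
  awk '
    /Partition:/ {                       # per-partition lines only
      for (i = 1; i < NF; i++)
        if ($i == "Leader:") count[$(i + 1)]++
    }
    END { for (b in count) print b, count[b] }
  ' | sort -n
}
```

Usage could be `kafka-topics.sh --bootstrap-server node01:9092 --describe --topic test | leaders_per_broker`.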
(6) Delete the topic (left for the reader to try)
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --delete --topic test
[root@node01 ~]# kafka-topics.sh --bootstrap-server node01:9092 --list
[root@node01 ~]#
(1) View the options of the producer command
[root@node01 ~]# kafka-console-producer.sh
Missing required option(s) [bootstrap-server]
Option Description
------ -----------
--batch-size <Integer: size> Number of messages to send in a single
batch if they are not being sent
synchronously. please note that this
option will be replaced if max-
partition-memory-bytes is also set
(default: 16384)
--bootstrap-server <String: server to REQUIRED unless --broker-list
connect to> (deprecated) is specified. The server
(s) to connect to. The broker list
string in the form HOST1:PORT1,HOST2:
PORT2.
--broker-list <String: broker-list> DEPRECATED, use --bootstrap-server
instead; ignored if --bootstrap-
server is specified. The broker
list string in the form HOST1:PORT1,
HOST2:PORT2.
--compression-codec [String: The compression codec: either 'none',
compression-codec] 'gzip', 'snappy', 'lz4', or 'zstd'.
If specified without value, then it
defaults to 'gzip'
--help Print usage information.
--line-reader <String: reader_class> The class name of the class to use for
reading lines from standard in. By
default each line is read as a
separate message. (default: kafka.
tools.
ConsoleProducer$LineMessageReader)
--max-block-ms <Long: max block on The max time that the producer will
send> block for during a send request.
(default: 60000)
--max-memory-bytes <Long: total memory The total memory used by the producer
in bytes> to buffer records waiting to be sent
to the server. This is the option to
control `buffer.memory` in producer
configs. (default: 33554432)
--max-partition-memory-bytes <Integer: The buffer size allocated for a
memory in bytes per partition> partition. When records are received
which are smaller than this size the
producer will attempt to
optimistically group them together
until this size is reached. This is
the option to control `batch.size`
in producer configs. (default: 16384)
--message-send-max-retries <Integer> Brokers can fail receiving the message
for multiple reasons, and being
unavailable transiently is just one
of them. This property specifies the
number of retries before the
producer give up and drop this
message. This is the option to
control `retries` in producer
configs. (default: 3)
--metadata-expiry-ms <Long: metadata The period of time in milliseconds
expiration interval> after which we force a refresh of
metadata even if we haven't seen any
leadership changes. This is the
option to control `metadata.max.age.
ms` in producer configs. (default:
300000)
--producer-property <String: A mechanism to pass user-defined
producer_prop> properties in the form key=value to
the producer.
--producer.config <String: config file> Producer config properties file. Note
that [producer-property] takes
precedence over this config.
--property <String: prop> A mechanism to pass user-defined
properties in the form key=value to
the message reader. This allows
custom configuration for a user-
defined message reader.
Default properties include:
parse.key=false
parse.headers=false
ignore.error=false
key.separator=\t
headers.delimiter=\t
headers.separator=,
headers.key.separator=:
null.marker= When set, any fields
(key, value and headers) equal to
this will be replaced by null
Default parsing pattern when:
parse.headers=true and parse.key=true:
"h1:v1,h2:v2...\tkey\tvalue"
parse.key=true:
"key\tvalue"
parse.headers=true:
"h1:v1,h2:v2...\tvalue"
--request-required-acks <String: The required `acks` of the producer
request required acks> requests (default: -1)
--request-timeout-ms <Integer: request The ack timeout of the producer
timeout ms> requests. Value must be non-negative
and non-zero. (default: 1500)
--retry-backoff-ms <Long> Before each retry, the producer
refreshes the metadata of relevant
topics. Since leader election takes
a bit of time, this property
specifies the amount of time that
the producer waits before refreshing
the metadata. This is the option to
control `retry.backoff.ms` in
producer configs. (default: 100)
--socket-buffer-size <Integer: size> The size of the tcp RECV size. This is
the option to control `send.buffer.
bytes` in producer configs.
(default: 102400)
--sync If set message send requests to the
brokers are synchronously, one at a
time as they arrive.
--timeout <Long: timeout_ms> If set and the producer is running in
asynchronous mode, this gives the
maximum amount of time a message
will queue awaiting sufficient batch
size. The value is given in ms. This
is the option to control `linger.ms`
in producer configs. (default: 1000)
--topic <String: topic> REQUIRED: The topic id to produce
messages to.
--version Display Kafka version.
Option | Description |
---|---|
--bootstrap-server <String: server to connect to> | Host name and port of the Kafka broker to connect to |
--topic <String: topic> | Name of the topic to operate on |
(2) Send messages
[root@node01 ~]# kafka-console-producer.sh --bootstrap-server node01:9092 --topic test
>hello kafka
>hello zookeeper
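The console producer also reads from standard input, so messages can be piped in instead of typed. The help output above lists `parse.key` and a default `key.separator` of a tab; the sketch below formats key/value pairs accordingly. `keyed_messages` is a hypothetical helper name.

```shell
#!/bin/sh
# keyed_messages KEY VALUE [KEY VALUE ...]
# Emit one "key<TAB>value" line per pair of arguments, the layout the
# console producer expects when parse.key=true is set.
keyed_messages() {
  while [ "$#" -ge 2 ]; do
    printf '%s\t%s\n' "$1" "$2"
    shift 2
  done
}
```

This could be used as `keyed_messages user1 hello user2 world | kafka-console-producer.sh --bootstrap-server node01:9092 --topic test --property parse.key=true`.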
(1) View the options of the consumer command
[root@node01 ~]# kafka-console-consumer.sh
This tool helps to read data from Kafka topics and outputs it to standard output.
Option Description
------ -----------
--bootstrap-server <String: server to REQUIRED: The server(s) to connect to.
connect to>
--consumer-property <String: A mechanism to pass user-defined
consumer_prop> properties in the form key=value to
the consumer.
--consumer.config <String: config file> Consumer config properties file. Note
that [consumer-property] takes
precedence over this config.
--enable-systest-events Log lifecycle events of the consumer
in addition to logging consumed
messages. (This is specific for
system tests.)
--formatter <String: class> The name of a class to use for
formatting kafka messages for
display. (default: kafka.tools.
DefaultMessageFormatter)
--from-beginning If the consumer does not already have
an established offset to consume
from, start with the earliest
message present in the log rather
than the latest message.
--group <String: consumer group id> The consumer group id of the consumer.
--help Print usage information.
--include <String: Java regex (String)> Regular expression specifying list of
topics to include for consumption.
--isolation-level <String> Set to read_committed in order to
filter out transactional messages
which are not committed. Set to
read_uncommitted to read all
messages. (default: read_uncommitted)
--key-deserializer <String:
deserializer for key>
--max-messages <Integer: num_messages> The maximum number of messages to
consume before exiting. If not set,
consumption is continual.
--offset <String: consume offset> The offset to consume from (a non-
negative number), or 'earliest'
which means from beginning, or
'latest' which means from end
(default: latest)
--partition <Integer: partition> The partition to consume from.
Consumption starts from the end of
the partition unless '--offset' is
specified.
--property <String: prop> The properties to initialize the
message formatter. Default
properties include:
print.timestamp=true|false
print.key=true|false
print.offset=true|false
print.partition=true|false
print.headers=true|false
print.value=true|false
key.separator=<key.separator>
line.separator=<line.separator>
headers.separator=<line.separator>
null.literal=<null.literal>
key.deserializer=<key.deserializer>
value.deserializer=<value.
deserializer>
header.deserializer=<header.
deserializer>
Users can also pass in customized
properties for their formatter; more
specifically, users can pass in
properties keyed with 'key.
deserializer.', 'value.
deserializer.' and 'headers.
deserializer.' prefixes to configure
their deserializers.
--skip-message-on-error If there is an error when processing a
message, skip it instead of halt.
--timeout-ms <Integer: timeout_ms> If specified, exit if no message is
available for consumption for the
specified interval.
--topic <String: topic> The topic to consume on.
--value-deserializer <String:
deserializer for values>
--version Display Kafka version.
--whitelist <String: Java regex DEPRECATED, use --include instead;
(String)> ignored if --include specified.
Regular expression specifying list
of topics to include for consumption.
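Option | Description |
---|---|
--bootstrap-server <String: server to connect to> | Host name and port of the Kafka broker to connect to |
--topic <String: topic> | Name of the topic to operate on |
--from-beginning | Consume from the beginning of the topic |
--group <String: consumer group id> | Name of the consumer group |
(2) Consume messages
☆ Consume data from the test topic (start the consumer in one terminal, then send messages from a producer in another)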
[root@node01 ~]# kafka-console-consumer.sh --bootstrap-server node01:9092 --topic test
[root@node01 ~]# kafka-console-producer.sh --bootstrap-server node01:9092 --topic test
>hello scala
>hello kafka
[root@node01 ~]# kafka-console-consumer.sh --bootstrap-server node01:9092 --topic test
hello scala
hello kafka
Processed a total of 2 messages
☆ Read all data in the topic, including historical messages.
[root@node01 ~]# kafka-console-consumer.sh --bootstrap-server node02:9092 --from-beginning --topic test
hello kafka
hello zookeeper
hello scala
hello kafka
Processed a total of 4 messages
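Without further options the console consumer keeps running until interrupted. The help output above lists `--max-messages` and `--timeout-ms`, which make it exit on its own. The wrapper below is a sketch: `read_first` and the `KAFKA_CONSOLE_CONSUMER` and `BOOTSTRAP` variables are hypothetical names, with node01:9092 assumed as the default broker.

```shell
#!/bin/sh
# read_first TOPIC COUNT [TIMEOUT_MS]
# Print up to COUNT messages of TOPIC from the beginning, then exit.
# Also exits if no message arrives within TIMEOUT_MS (default 10000).
# KAFKA_CONSOLE_CONSUMER may be overridden, e.g. with "echo" for a dry run.
read_first() {
  "${KAFKA_CONSOLE_CONSUMER:-kafka-console-consumer.sh}" \
    --bootstrap-server "${BOOTSTRAP:-node01:9092}" \
    --topic "$1" --from-beginning \
    --max-messages "$2" --timeout-ms "${3:-10000}"
}
```

For example, `read_first test 4` would read the four messages shown above and return.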