Event loss when Packetbeat parses pcap files, and how to fix it

禄仲渊
2023-12-01

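1. Packet loss when Packetbeat captures from a network interface

When Packetbeat captures traffic from a network interface and decodes it live, it is easy to observe that under heavy traffic (a high rate of concurrent messages) the message counts shown in the Kibana front end come up short: some events are lost. The loss depends on the traffic rate on the interface: the higher the concurrency, the higher the loss rate. This is probably related to Packetbeat's own pipeline mechanism: if the pipeline buffer is full, newly arriving events are dropped by default. In my experiments I have not found a way to avoid the loss entirely; the most that can be done is to lower the message rate, which lowers the loss rate somewhat. This article is mainly about the other case: event loss when Packetbeat parses a pcap file.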
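The hypothesis in the previous paragraph (a full buffer discards new events) can be illustrated with a small simulation. This is only a toy model written for this article, not Packetbeat's actual pipeline: the function name simulate_drops and all of its parameters are hypothetical, and time is reduced to discrete ticks.

```python
from collections import deque


def simulate_drops(arrivals_per_tick, drained_per_tick, capacity, ticks):
    """Count events dropped by a bounded buffer that discards new arrivals when full.

    Toy model only: arrivals_per_tick stands in for the traffic rate,
    drained_per_tick for how fast the output consumes events.
    """
    buffer = deque()
    dropped = 0
    for _ in range(ticks):
        # Producer side: new events arrive during this tick.
        for _ in range(arrivals_per_tick):
            if len(buffer) >= capacity:
                dropped += 1  # buffer full: the new event is discarded
            else:
                buffer.append(1)
        # Consumer side: the output drains a fixed number of events.
        for _ in range(min(drained_per_tick, len(buffer))):
            buffer.popleft()
    return dropped
```

In this model nothing is lost while the arrival rate stays at or below the drain rate, and the number of dropped events grows with the arrival rate once it exceeds the drain rate, which matches the behaviour described above.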

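2. Event loss when Packetbeat parses a pcap file

When Packetbeat captures directly from a network interface, the process blocks and keeps running. When it is pointed at a pcap file instead, it does not block: it stops as soon as the file has been parsed. As a result, if the minimum number of buffered events required before publishing is configured to a non-zero value, and Packetbeat finishes parsing the whole pcap file while the buffer still holds unpublished events, it stops right after parsing and those events are discarded instead of being published. This is why the message counts in the Kibana front end come up short.

The fix is to adjust the Packetbeat configuration file and set the minimum number of events per flush (queue.mem.flush.min_events) to 0. With this setting, any event that enters the buffer is published immediately, so by the time Packetbeat stops there are no unpublished events left in the buffer.

The relevant configuration: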
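To make the mechanism concrete, here is a toy Python model of a batching queue. It is a sketch under stated assumptions, not libbeat's real queue: the names ToyMemQueue, stop_without_draining and replay are hypothetical, and the model ignores flush.timeout, assuming the process exits before the timeout would fire.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemQueue:
    """Toy batching queue; NOT libbeat's actual implementation."""
    flush_min_events: int
    pending: list = field(default_factory=list)
    delivered: list = field(default_factory=list)

    def publish(self, event):
        self.pending.append(event)
        # Hand the batch to the output once enough events have accumulated.
        # With flush_min_events == 0 every event is handed over immediately.
        if len(self.pending) >= max(self.flush_min_events, 1):
            self.delivered.extend(self.pending)
            self.pending.clear()

    def stop_without_draining(self):
        """Model the process stopping at end of file: pending events are lost."""
        lost = len(self.pending)
        self.pending.clear()
        return lost


def replay(total_events, flush_min_events):
    """Feed total_events into the queue, then stop; return (delivered, lost)."""
    queue = ToyMemQueue(flush_min_events)
    for i in range(total_events):
        queue.publish(i)
    lost = queue.stop_without_draining()
    return len(queue.delivered), lost
```

In this model, replay(5000, 2048) returns (4096, 904): two full batches are delivered and the remaining 904 events are still waiting when the process stops. replay(5000, 0) returns (5000, 0), because nothing is ever left waiting in the queue.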

# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
# If this option is not defined, the hostname is used.
#name:

# The tags of the shipper are included in their own field with each
# transaction published. Tags make it easy to group servers by different
# logical properties.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output. Fields can be scalar values, arrays, dictionaries, or any nested
# combination of these.
#fields:
#  env: staging

# If this option is set to true, the custom fields are stored as top-level
# fields in the output document instead of being grouped under a fields
# sub-dictionary. Default is false.
#fields_under_root: false

# Internal queue configuration for buffering events to be published.
queue:
  # Queue type by name (default 'mem')
  # The memory queue will present all available events (up to the outputs
  # bulk_max_size) to the output, the moment the output is ready to serve
  # another batch of events.
  mem:
    # Max number of events the queue can buffer.
    events: 4096

    # Hints the minimum number of events stored in the queue,
    # before providing a batch of events to the outputs.
    # The default value is set to 2048.
    # A value of 0 ensures events are immediately available
    # to be sent to the outputs.
    flush.min_events: 0

    # Maximum duration after which events are available to the outputs,
    # if the number of events stored in the queue is < `flush.min_events`.
    flush.timeout: 1s

  # The disk queue stores incoming events on disk until the output is
  # ready for them. This allows a higher event limit than the memory-only
  # queue and lets pending events persist through a restart.
  #disk:
    # The directory path to store the queue's data.
    #path: "${path.data}/diskqueue"

    # The maximum space the queue should occupy on disk. Depending on
    # input settings, events that exceed this limit are delayed or discarded.
    #max_size: 10GB

    # The maximum size of a single queue data file. Data in the queue is
    # stored in smaller segments that are deleted after all their events
    # have been processed.
    #segment_size: 1GB

    # The number of events to read from disk to memory while waiting for
    # the output to request them.
    #read_ahead: 512

    # The number of events to accept from inputs while waiting for them
    # to be written to disk. If event data arrives faster than it
    # can be written to disk, this setting prevents it from overflowing
    # main memory.
    #write_ahead: 2048

    # The duration to wait before retrying when the queue encounters a disk
    # write error.
    #retry_interval: 1s

    # The maximum length of time to wait before retrying on a disk write
    # error. If the queue encounters repeated errors, it will double the
    # length of its retry interval each time, up to this maximum.
    #max_retry_interval: 30s

# Sets the maximum number of CPUs that can be executing simultaneously. The
# default is the number of logical CPUs available in the system.
#max_procs: 

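However, the drawback of this configuration is just as obvious: there is no batching, so events are published as soon as they arrive. If the data is stored in Elasticsearch, for example, each event can trigger its own round trip between Packetbeat and Elasticsearch, and CPU usage rises accordingly.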
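The cost of this trade-off is easy to estimate with some arithmetic. The helper below is a hypothetical illustration (bulk_requests is not a Packetbeat or Elasticsearch API): it counts how many output requests are needed for a given number of events, assuming every request carries a fixed number of events.

```python
import math


def bulk_requests(total_events: int, events_per_request: int) -> int:
    """Number of output requests needed if each request carries a fixed batch size."""
    if events_per_request < 1:
        raise ValueError("events_per_request must be at least 1")
    return math.ceil(total_events / events_per_request)
```

For 5000 events, one event per request means 5000 round trips, while batches of 2048 need only 3. In practice the output may still pick up several events at once when it is busy, so one request per event is the worst case rather than a guarantee.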
