3.1.29 vSphere and ESXi Monitoring

Editor: 小牛
2023-12-01

falcon-vsphere

This is an agent for Open-Falcon that monitors the performance metrics of vSphere and of every ESXi host managed by vSphere. Repository: https://github.com/tpinellia/vsphere

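I. Features

  1. Collects from multiple vSphere instances at the same time
  2. vSphere and ESXi metrics can be merged or split, with a custom endpoint prefix or metric prefix
  3. ESXi metrics fall into two groups, basic and extended; extended metrics are optional and configurable
  4. Configuration is hot-reloaded (for example adding a vSphere instance, or adding and removing extended metrics), so the agent does not need to be restarted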

II. Usage

    export WorkDir="$HOME/falcon-vsphere"
    mkdir -p "$WorkDir"
    tar -xzvf falcon-vsphere-x.x.x.tar.gz -C "$WorkDir"
    cd "$WorkDir"
    ./control start

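III. Configuration

The # annotations in the sample below are explanatory only; standard JSON has no comment syntax.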

    {
     "debug": false,
     "extend": "./extend.json",              #list of extended metrics
     "heartbeat": { 
         "enabled": true,                    #whether to enable HBS
         "addr": "127.0.0.1:6030",           #HBS address
         "timeout": 1000                     #timeout
     },
     "transfer": {
         "enabled": true,                    #whether to enable Transfer
         "addrs": [              
             "127.0.0.1:8433"                #Transfer address; more than one can be configured
         ],
         "interval": 60,                     #upload interval
         "timeout": 1000                     #timeout
     },
     "vsphere": [
         {
             "hostname": "VC-1.1.1.1",       #endpoint name reported for this vSphere
             "ip": "1.1.1.1",                #vSphere IP
             "addr": "https://1.1.1.1/sdk",  #vSphere SDK address
             "user":"vsphere1-user",         #vSphere user name
             "pwd":"vsphere1-pwd",           #vSphere password
             "port": 443,                    #vSphere port
             "split": true,                  #whether to split. If true, ESXi metrics are separated from the vCenter (VC) metrics and each ESXi host is reported as its own endpoint; endpointhead can be prepended to the ESXi endpoint name. If false, ESXi metrics are merged with the VC metrics and reported under the VC as one large endpoint; in that case metrichead can be set as a prefix for ESXi metric names to tell VC and ESXi metrics apart
             "endpointhead": "ESXI-",        #takes effect when split is true: prefix for ESXi endpoint names; may be empty
             "metrichead":"esxi.",           #takes effect when split is false: prefix for ESXi metric names, to tell VC and ESXi metrics apart
             "extend": true                  #whether to enable extended metrics
         },
         {
             "hostname": "VC-2.2.2.2",
             "ip": "2.2.2.2",
             "addr": "https://2.2.2.2/sdk",
             "user":"vsphere2-user",
             "pwd":"vsphere2-pwd",
             "port": 443,
             "split": true,
             "endpointhead": "ESXI-",
             "metrichead":"esxi.",
             "extend": true
         }
     ]
    }
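
The effect of split, endpointhead, and metrichead on the reported names can be sketched roughly as follows. This is only an illustration derived from the comments in the sample configuration: the function name `esxi_series_name` is hypothetical, and treating both heads as plain string prefixes is an assumption, not something taken from the agent's source.

```python
def esxi_series_name(vc, esxi_name, metric):
    """Return the (endpoint, metric) pair an ESXi sample would be reported under.

    Assumption: endpointhead and metrichead are applied as plain string prefixes.
    """
    if vc.get("split"):
        # split=true: each ESXi host is its own endpoint, optionally prefixed.
        return vc.get("endpointhead", "") + esxi_name, metric
    # split=false: ESXi metrics are folded into the vCenter endpoint and
    # marked with metrichead so they can be told apart from VC metrics.
    return vc["hostname"], vc.get("metrichead", "") + metric


vc = {"hostname": "VC-1.1.1.1", "split": True,
      "endpointhead": "ESXI-", "metrichead": "esxi."}

print(esxi_series_name(vc, "10.0.0.5", "cpu.busy"))
# ('ESXI-10.0.0.5', 'cpu.busy')
print(esxi_series_name({**vc, "split": False}, "10.0.0.5", "cpu.busy"))
# ('VC-1.1.1.1', 'esxi.cpu.busy')
```

The ESXi name `10.0.0.5` is a made-up value used only for the example.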
    

IV. Metrics

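  1. Basic metrics

| Metric | Description |
|---|---|
| agent.alive | Always reported as 1 |
| agent.power | 1: powered off; 2: powered on; 3: standby; 4: unknown (the host is disconnected or not responding) |
| agent.status | 1: status unknown; 2: the entity is OK; 3: the entity definitely has a problem; 4: the entity might have a problem |
| agent.uptime | Uptime |
| cpu.busy | CPU usage percentage |
| cpu.free.average | Idle CPU (unit: Hz) |
| cpu.total | Total CPU (unit: Hz) |
| cpu.usage.average | CPU in use (unit: Hz) |
| datastore.totalReadLatency.average | Average amount of time for a read operation from the datastore. Total latency = kernel latency + device latency. |
| datastore.totalWriteLatency.average | Average amount of time for a write operation to the datastore. Total latency = kernel latency + device latency. |
| df.bytes.free.percent | Free-space percentage (single datastore) |
| df.bytes.free | Free space (single datastore) |
| df.bytes.total | Total capacity (single datastore) |
| df.bytes.used.Percent | Used-space percentage (single datastore) |
| df.bytes.used | Used space (single datastore) |
| df.statistics.free | Free space (all datastores) |
| df.statistics.free.percent | Free-space percentage (all datastores) |
| df.statistics.total | Total capacity (all datastores) |
| df.statistics.used | Used space (all datastores) |
| df.statistics.used.percent | Used-space percentage (all datastores) |
| mem.memfree | Free memory |
| mem.memfree.percent | Free-memory percentage |
| mem.memtotal | Total memory |
| mem.memused | Used memory |
| mem.memused.percent | Used-memory percentage |
| net.bytesRx.average | Average amount of data received per second. |
| net.bytesTx.average | Average amount of data transmitted per second. |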
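
When writing alert rules or reading raw values, it can help to translate the numeric agent.power and agent.status codes into text. The sketch below does that using the code meanings listed for the basic metrics; the names `AGENT_POWER`, `AGENT_STATUS`, and `describe_agent` are hypothetical helpers, not part of the agent.

```python
# Code meanings for agent.power and agent.status, as listed for the basic metrics.
AGENT_POWER = {1: "powered off", 2: "powered on", 3: "standby", 4: "unknown"}
AGENT_STATUS = {1: "unknown", 2: "ok", 3: "definitely a problem", 4: "possibly a problem"}


def describe_agent(power, status):
    """Turn numeric agent.power / agent.status values into a readable string."""
    return "power={}, status={}".format(
        AGENT_POWER.get(power, "invalid code"),
        AGENT_STATUS.get(status, "invalid code"),
    )


print(describe_agent(2, 2))  # power=powered on, status=ok
print(describe_agent(4, 1))  # power=unknown, status=unknown
```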
  2. Extended metrics

Type definitions

| Name | Description |
|---|---|
| absolute | Represents an actual value, level, or state of the counter. For example, the "uptime" counter (system group) represents the actual number of seconds since startup. The "capacity" counter represents the actual configured size of the specified datastore. In other words, number of samples, samplingPeriod, and intervals have no bearing on an "absolute" counter's value. |
| delta | Represents an amount of change for the counter during the samplingPeriod as compared to the previous interval. The first sampling interval |
| rate | Represents a value that has been normalized over the samplingPeriod, enabling values for the same counter type to be compared, regardless of interval. For example, the number of reads per second. |
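
The difference between a delta value and a rate value can be illustrated with a small calculation. This is a sketch of one reading of the definitions above (a delta is the change across one sampling period, a rate is that change divided by the period length); the function names and the sample numbers are made up for illustration.

```python
def delta(previous, current):
    """Change of a counter across one sampling period."""
    return current - previous


def rate(change, sampling_period_s):
    """Normalize a change over the sampling period, giving a per-second value."""
    if sampling_period_s <= 0:
        raise ValueError("sampling period must be positive")
    return change / sampling_period_s


# Hypothetical read counter that went from 10000 to 11200 during a 20-second period.
reads = delta(10000, 11200)
print(reads)            # 1200
print(rate(reads, 20))  # 60.0
```

Because a rate is already normalized, values sampled at different intervals can be compared directly, whereas delta values are only comparable when the sampling periods match.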
  • hbr metrics

| Metric | Description | Type |
|---|---|---|
| hbr.hbrNetTx.average | Average amount of data transmitted per second | rate |
| hbr.hbrNetRx.average | Kilobytes per second of outgoing host-based replication network traffic (for this virtual machine or host) | rate |
| hbr.hbrNumVms.average | Number of powered-on virtual machines running on this host that currently have host-based replication protection enabled. | absolute |
  • rescpu metrics

| Metric | Description | Type |
|---|---|---|
| rescpu.runav15.latest | CPU running average over 15 minutes | absolute |
| rescpu.actav1.latest | CPU active average over 1 minute | absolute |
| rescpu.actpk15.latest | CPU active peak over 15 minutes | absolute |
| rescpu.actav5.latest | CPU active average over 5 minutes | absolute |
| rescpu.runpk1.latest | CPU running peak over 1 minute | absolute |
| rescpu.maxLimited5.latest | Amount of CPU resources over the limit that were refused, average over 5 minutes | absolute |
| rescpu.actpk1.latest | CPU active peak over 1 minute | absolute |
| rescpu.sampleCount.latest | Group CPU sample count. | absolute |
| rescpu.samplePeriod.latest | Group CPU sample period. | absolute |
| rescpu.maxLimited1.latest | Amount of CPU resources over the limit that were refused, average over 1 minute | absolute |
| rescpu.runpk15.latest | CPU running peak over 15 minutes | absolute |
| rescpu.maxLimited15.latest | Amount of CPU resources over the limit that were refused, average over 15 minutes | absolute |
| rescpu.runav5.latest | CPU running average over 5 minutes | absolute |
| rescpu.runav1.latest | CPU running average over 1 minute | absolute |
| rescpu.actav15.latest | CPU active average over 15 minutes | absolute |
| rescpu.actpk5.latest | CPU active peak over 5 minutes | absolute |
| rescpu.runpk5.latest | CPU running peak over 5 minutes | absolute |
  • storagePath metrics

| Metric | Description | Type |
|---|---|---|
| storagePath.totalReadLatency.average | Average amount of time for a read issued on the storage path. Total latency = kernel latency + device latency. | absolute |
| storagePath.commandsAveraged.average | Average number of commands issued per second on the storage path during the collection interval | rate |
| storagePath.numberReadAveraged.average | Average number of read commands issued per second on the storage path during the collection interval | rate |
| storagePath.write.average | Rate of writing data on the storage path | rate |
| storagePath.maxTotalLatency.latest | Highest latency value across all storage paths used by the host | absolute |
| storagePath.read.average | Rate of reading data on the storage path | rate |
| storagePath.numberWriteAveraged.average | Average number of write commands issued per second on the storage path during the collection interval | rate |
| storagePath.totalWriteLatency.average | Average amount of time for a write issued on the storage path. Total latency = kernel latency + device latency. | absolute |
  • storageAdapter metrics

| Metric | Description | Type |
|---|---|---|
| storageAdapter.read.average | Rate of reading data by the storage adapter | rate |
| storageAdapter.numberReadAveraged.average | Average number of read commands issued per second by the storage adapter during the collection interval | rate |
| storageAdapter.maxTotalLatency.latest | Highest latency value across all storage adapters used by the host | rate |
| storageAdapter.numberWriteAveraged.average | Average number of write commands issued per second by the storage adapter during the collection interval | rate |
| storageAdapter.totalWriteLatency.average | Average amount of time for a write operation by the storage adapter. Total latency = kernel latency + device latency. | absolute |
| storageAdapter.totalReadLatency.average | Average amount of time for a read operation by the storage adapter. Total latency = kernel latency + device latency. | absolute |
| storageAdapter.commandsAveraged.average | Average number of commands issued per second by the storage adapter during the collection interval | rate |
| storageAdapter.write.average | Rate of writing data by the storage adapter | rate |
  • power metrics

| Metric | Description | Type |
|---|---|---|
| power.power.average | Current power usage. | rate |
| power.powerCap.average | Maximum allowed power usage. | rate |
| power.energy.summation | Total energy used since last stats reset. | rate |
  • sys metrics

| Metric | Description | Type |
|---|---|---|
| sys.resourceCpuUsage.average | Amount of CPU used by the Service Console and other applications during the interval. | rate |
| sys.resourceCpuUsage.maximum | Amount of CPU used by the Service Console and other applications during the interval. | rate |
| sys.resourceMemMapped.latest | Memory mapped by the system resource group | absolute |
| sys.resourceMemAllocShares.latest | Memory allocation shares of the system resource group | absolute |
| sys.resourceMemCow.latest | Memory shared by the system resource group | absolute |
| sys.resourceMemShared.latest | Memory saved due to sharing by the system resource group | absolute |
| sys.resourceCpuAct1.latest | CPU active average over 1 minute of the system resource group | absolute |
| sys.resourceCpuUsage.minimum | Amount of CPU used by the Service Console and other applications during the interval. | rate |
| sys.resourceMemZero.latest | Zero filled memory used by the system resource group | absolute |
| sys.resourceCpuAct5.latest | CPU active average over 5 minutes of the system resource group | absolute |
| sys.resourceCpuMaxLimited1.latest | CPU maximum limited over 1 minute of the system resource group | absolute |
| sys.resourceFdUsage.latest | Number of file descriptors used by the system resource group | absolute |
| sys.resourceMemConsumed.latest | Memory consumed by the system resource group | absolute |
| sys.resourceCpuRun5.latest | CPU running average over 5 minutes of the system resource group | absolute |
| sys.uptime.latest | Total time elapsed, in seconds, since last system startup | absolute |
| sys.resourceCpuAllocMin.latest | CPU allocation reservation (in MHz) of the system resource group | absolute |
| sys.resourceCpuAllocShares.latest | CPU allocation shares of the system resource group | absolute |
| sys.resourceCpuMaxLimited5.latest | CPU maximum limited over 5 minutes of the system resource group | absolute |
| sys.resourceMemOverhead.latest | Overhead memory consumed by the system resource group | absolute |
| sys.resourceCpuUsage.none | Amount of CPU used by the Service Console and other applications during the interval. | rate |
| sys.resourceCpuRun1.latest | CPU running average over 1 minute of the system resource group | absolute |
| sys.resourceMemSwapped.latest | Memory swapped out by the system resource group | absolute |
| sys.resourceMemTouched.latest | Memory touched by the system resource group | absolute |
| sys.resourceMemAllocMax.latest | Memory allocation limit (in KB) of the system resource group | absolute |
| sys.resourceMemAllocMin.latest | Memory allocation reservation (in KB) of the system resource group | absolute |
  • net metrics

| Metric | Description | Type |
|---|---|---|
| net.unknownProtos.summation | Number of frames with unknown protocol received during the sampling interval | delta |
| net.packetsRx.summation | Total number of packets received on all virtual machines running on the host. | delta |
| net.received.average | The rate at which data is received across each physical NIC instance on the host. | rate |
| net.errorsRx.summation | Number of packets with errors received during the sampling interval. | delta |
| net.transmitted.average | The rate at which data is transmitted across each physical NIC instance on the host. | rate |
| net.usage.none | Sum of data transmitted and received across all physical NIC instances connected to the host. | rate |
| net.bytesRx.average | Average amount of data received per second. | rate |
| net.multicastTx.summation | Number of multicast packets transmitted during the sampling interval. | delta |
| net.droppedTx.summation | Number of transmitted packets dropped during the collection interval. | delta |
| net.usage.minimum | Sum of data transmitted and received across all physical NIC instances connected to the host. | rate |
| net.multicastRx.summation | Number of multicast packets received during the sampling interval. | delta |
| net.bytesTx.average | Average amount of data transmitted per second. | rate |
| net.broadcastRx.summation | Number of broadcast packets received during the sampling interval. | delta |
| net.packetsTx.summation | Number of packets transmitted across each physical NIC instance on the host. | delta |
| net.errorsTx.summation | Number of packets with errors transmitted during the sampling interval. | delta |
| net.broadcastTx.summation | Number of broadcast packets transmitted during the sampling interval. | delta |
| net.usage.maximum | Sum of data transmitted and received across all physical NIC instances connected to the host. | rate |
| net.usage.average | Sum of data transmitted and received across all physical NIC instances connected to the host. | rate |
| net.droppedRx.summation | Number of received packets dropped during the collection interval. | delta |
  • disk metrics

| Metric | Description | Type |
|---|---|---|
| disk.usage.maximum | Aggregated disk I/O rate. For hosts, this metric includes the rates for all virtual machines running on the host during the collection interval. | rate |
| disk.deviceWriteLatency.average | Average amount of time, in milliseconds, to write to the physical device | absolute |
| disk.queueWriteLatency.average | Average amount of time spent in the VMkernel queue, per SCSI write command, during the collection interval | absolute |
| disk.commands.summation | Number of SCSI commands issued during the collection interval | delta |
| disk.busResets.summation | Number of SCSI-bus reset commands issued during the collection interval | delta |
| disk.numberRead.summation | Number of times data was read from each LUN on the host. | delta |
| disk.deviceLatency.average | Average amount of time, in milliseconds, to complete a SCSI command from the physical device | absolute |
| disk.deviceReadLatency.average | Average amount of time, in milliseconds, to read from the physical device | absolute |
| disk.usage.average | Aggregated disk I/O rate. For hosts, this metric includes the rates for all virtual machines running on the host during the collection interval. | rate |
| disk.numberWriteAveraged.average | Average number of write commands issued per second to the datastore during the collection interval. | rate |
| disk.usage.none | Aggregated disk I/O rate. For hosts, this metric includes the rates for all virtual machines running on the host during the collection interval. | rate |
| disk.totalWriteLatency.average | Average amount of time taken during the collection interval to process a SCSI write command issued by the guest OS to the virtual machine | absolute |
| disk.queueLatency.average | Average amount of time spent in the VMkernel queue, per SCSI command, during the collection interval. | absolute |
| disk.kernelLatency.average | Average amount of time, in milliseconds, spent by VMkernel to process each SCSI command | absolute |
| disk.write.average | Rate at which data is written to each LUN on the host. write rate = # blocksWritten per second x blockSize | rate |
| disk.numberWrite.summation | Number of times data was written to each LUN on the host. | delta |
| disk.commandsAveraged.average | Average number of SCSI commands issued per second during the collection interval | rate |
| disk.totalLatency.average | Average amount of time taken during the collection interval to process a SCSI command issued by the guest OS to the virtual machine. | absolute |
| disk.totalReadLatency.average | Average amount of time taken during the collection interval to process a SCSI read command issued from the guest OS to the virtual machine | absolute |
| disk.kernelWriteLatency.average | Average amount of time, in milliseconds, spent by VMkernel to process each SCSI write command | absolute |
| disk.maxTotalLatency.latest | Highest latency value across all disks used by the host. Latency measures the time taken to process a SCSI command issued by the guest OS to the virtual machine. The kernel latency is the time VMkernel takes to process an IO request. The device latency is the time it takes the hardware to handle the request. | absolute |
| disk.numberReadAveraged.average | Average number of read commands issued per second to the datastore during the collection interval. | rate |
| disk.read.average | Rate at which data is read from each LUN on the host. read rate = # blocksRead per second x blockSize | rate |
| disk.queueReadLatency.average | Average amount of time spent in the VMkernel queue, per SCSI read command, during the collection interval | absolute |
| disk.usage.minimum | Aggregated disk I/O rate. For hosts, this metric includes the rates for all virtual machines running on the host during the collection interval. | rate |
| disk.commandsAborted.summation | Number of SCSI commands aborted during the collection interval | delta |
| disk.maxQueueDepth.average | Maximum queue depth. | absolute |
| disk.kernelReadLatency.average | Average amount of time, in milliseconds, spent by VMkernel to process each SCSI read command | absolute |
  • cpu metrics

| Metric | Description | Type |
|---|---|---|
| cpu.reservedCapacity.average | Total CPU capacity reserved by virtual machines | absolute |
| cpu.usagemhz.average | Sum of the actively used CPU of all powered on virtual machines on a host. The maximum possible value is the frequency of the processors multiplied by the number of processors. For example, if you have a host with four 2GHz CPUs running a virtual machine that is using 4000MHz, the host is using two CPUs completely: 4000 / (4 x 2000) = 0.50 | rate |
| cpu.usage.none | Actively used CPU of the host, as a percentage of the total available CPU. Active CPU is approximately equal to the ratio of the used CPU to the available CPU. available CPU = # of physical CPUs x clock rate. 100% represents all CPUs on the host. For example, if a four-CPU host is running a virtual machine with two CPUs, and the usage is 50%, the host is using two CPUs completely. | rate |
| cpu.coreUtilization.maximum | CPU utilization of the corresponding core (if hyper-threading is enabled) as a percentage during the interval (A core is utilized if either or both of its logical CPUs are utilized). | rate |
| cpu.costop.summation | Time the virtual machine is ready to run, but is unable to run due to co-scheduling constraints | delta |
| cpu.totalCapacity.average | Total CPU capacity reserved by and available for virtual machines | absolute |
| cpu.latency.average | Percent of time the virtual machine is unable to run because it is contending for access to the physical CPU(s) | rate |
| cpu.usage.average | Actively used CPU of the host, as a percentage of the total available CPU. Active CPU is approximately equal to the ratio of the used CPU to the available CPU. available CPU = # of physical CPUs x clock rate. 100% represents all CPUs on the host. For example, if a four-CPU host is running a virtual machine with two CPUs, and the usage is 50%, the host is using two CPUs completely. | rate |
| cpu.utilization.maximum | CPU utilization as a percentage during the interval (CPU usage and CPU utilization might be different due to power management technologies or hyper-threading) | rate |
| cpu.coreUtilization.minimum | CPU utilization of the corresponding core (if hyper-threading is enabled) as a percentage during the interval (A core is utilized if either or both of its logical CPUs are utilized). | rate |
| cpu.wait.summation | Total CPU time spent in wait state. The wait total includes time spent in the CPU Idle, CPU Swap Wait, and CPU I/O Wait states. | rate |
| cpu.swapwait.summation | CPU time spent waiting for swap-in. | delta |
| cpu.ready.summation | Time that the virtual machine was ready, but could not get scheduled to run on the physical CPU during last measurement interval. CPU ready time is dependent on the number of virtual machines on the host and their CPU loads. | delta |
| cpu.utilization.average | CPU utilization as a percentage during the interval (CPU usage and CPU utilization might be different due to power management technologies or hyper-threading) | rate |
| cpu.used.summation | Time accounted to the virtual machine. If a system service runs on behalf of this virtual machine, the time spent by that service (represented by cpu.system) should be charged to this virtual machine. If not, the time spent (represented by cpu.overlap) should not be charged against this virtual machine. | delta |
| cpu.utilization.none | CPU utilization as a percentage during the interval (CPU usage and CPU utilization might be different due to power management technologies or hyper-threading) | rate |
| cpu.idle.summation | Total time that the CPU spent in an idle state | delta |
| cpu.coreUtilization.none | CPU utilization of the corresponding core (if hyper-threading is enabled) as a percentage during the interval (A core is utilized if either or both of its logical CPUs are utilized). | rate |
| cpu.coreUtilization.average | CPU utilization of the corresponding core (if hyper-threading is enabled) as a percentage during the interval (A core is utilized if either or both of its logical CPUs are utilized). | rate |
| cpu.readiness.average | Percentage of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. | rate |
| cpu.usage.maximum | Actively used CPU of the host, as a percentage of the total available CPU. Active CPU is approximately equal to the ratio of the used CPU to the available CPU. available CPU = # of physical CPUs x clock rate. 100% represents all CPUs on the host. For example, if a four-CPU host is running a virtual machine with two CPUs, and the usage is 50%, the host is using two CPUs completely. | rate |
| cpu.usagemhz.none | Host - Sum of the actively used CPU of all powered on virtual machines on a host. The maximum possible value is the frequency of the processors multiplied by the number of processors. For example, if you have a host with four 2GHz CPUs running a virtual machine that is using 4000MHz, the host is using two CPUs completely: 4000 / (4 x 2000) = 0.50 | rate |
| cpu.usage.minimum | Actively used CPU of the host, as a percentage of the total available CPU. Active CPU is approximately equal to the ratio of the used CPU to the available CPU. available CPU = # of physical CPUs x clock rate. 100% represents all CPUs on the host. For example, if a four-CPU host is running a virtual machine with two CPUs, and the usage is 50%, the host is using two CPUs completely. | rate |
| cpu.demand.average | The amount of CPU resources a virtual machine would use if there were no CPU contention or CPU limit | absolute |
| cpu.usagemhz.maximum | Host - Sum of the actively used CPU of all powered on virtual machines on a host. The maximum possible value is the frequency of the processors multiplied by the number of processors. For example, if you have a host with four 2GHz CPUs running a virtual machine that is using 4000MHz, the host is using two CPUs completely: 4000 / (4 x 2000) = 0.50 | rate |
| cpu.utilization.minimum | CPU utilization as a percentage during the interval (CPU usage and CPU utilization might be different due to power management technologies or hyper-threading) | rate |
| cpu.usagemhz.minimum | Host - Sum of the actively used CPU of all powered on virtual machines on a host. The maximum possible value is the frequency of the processors multiplied by the number of processors. For example, if you have a host with four 2GHz CPUs running a virtual machine that is using 4000MHz, the host is using two CPUs completely: 4000 / (4 x 2000) = 0.50 | rate |
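
The worked example in the cpu.usagemhz descriptions (4000 / (4 x 2000) = 0.50) can be reproduced with a few lines of code. This is a minimal sketch; the function name `cpu_usage_fraction` and its parameter names are hypothetical.

```python
def cpu_usage_fraction(usagemhz, num_cpus, mhz_per_cpu):
    """Fraction of host CPU capacity in use: usagemhz / (num_cpus * mhz_per_cpu)."""
    capacity_mhz = num_cpus * mhz_per_cpu
    if capacity_mhz <= 0:
        raise ValueError("CPU capacity must be positive")
    return usagemhz / capacity_mhz


# Host with four 2 GHz CPUs whose virtual machines use 4000 MHz in total.
fraction = cpu_usage_fraction(4000, 4, 2000)
print(fraction)      # 0.5
print(fraction * 4)  # 2.0 -> equivalent to two CPUs fully used
```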
  • mem metrics

| Metric | Description | Type |
|---|---|---|
| mem.llSwapOut.maximum | Amount of memory swapped-out to host cache | absolute |
| mem.swapin.maximum | Sum of swapin values for all powered-on virtual machines on the host. | absolute |
| mem.compressed.average | Amount of memory reserved by userworlds. | absolute |
| mem.overhead.none | Total of all overhead metrics for powered-on virtual machines, plus the overhead of running vSphere services on the host. | absolute |
| mem.heap.maximum | VMkernel virtual address space dedicated to VMkernel main heap and related data | absolute |
| mem.unreserved.none | Amount of memory that is unreserved. Memory reservation not used by the Service Console, VMkernel, vSphere services and other powered on VMs’ user-specified memory reservations and overhead memory. This statistic is no longer relevant to virtual machine admission control, as reservations are now handled through resource pools. | absolute |
| mem.reservedCapacity.average | Total amount of memory reservation used by powered-on virtual machines and vSphere services on the host. | absolute |
| mem.swapoutRate.average | Rate at which memory is being swapped from active memory to disk during the current interval. This counter applies to virtual machines and is generally more useful than the swapout counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. | rate |
| mem.vmmemctl.none | The sum of all vmmemctl values for all powered-on virtual machines, plus vSphere services on the host. If the balloon target value is greater than the balloon value, the VMkernel inflates the balloon, causing more virtual machine memory to be reclaimed. If the balloon target value is less than the balloon value, the VMkernel deflates the balloon, which allows the virtual machine to consume additional memory if needed. | absolute |
| mem.vmfs.pbc.sizeMax.latest | Maximum size the VMFS Pointer Block Cache can grow to | absolute |
| mem.consumed.maximum | Amount of machine memory used on the host. Consumed memory includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. host consumed memory = total host memory - free host memory | absolute |
| mem.swapin.minimum | Sum of swapin values for all powered-on virtual machines on the host. | absolute |
| mem.vmmemctl.maximum | The sum of all vmmemctl values for all powered-on virtual machines, plus vSphere services on the host. If the balloon target value is greater than the balloon value, the VMkernel inflates the balloon, causing more virtual machine memory to be reclaimed. If the balloon target value is less than the balloon value, the VMkernel deflates the balloon, which allows the virtual machine to consume additional memory if needed. | absolute |
| mem.llSwapUsed.maximum | Space used for caching swapped pages in the host cache | absolute |
| mem.sharedcommon.average | Amount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host. Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing: shared - sharedcommon = machine memory (host memory) savings (KB) | absolute |
| mem.active.minimum | Sum of all active metrics for all powered-on virtual machines plus vSphere services (such as COS, vpxa) on the host. | absolute |
| mem.sysUsage.none | Amount of host physical memory used by VMkernel for core functionality, such as device drivers and other internal uses. Does not include memory used by virtual machines or vSphere services. | absolute |
| mem.swapin.average | Sum of swapin values for all powered-on virtual machines on the host. | absolute |
| mem.zero.none | Sum of zero metrics for all powered-on virtual machines, plus vSphere services on the host. | absolute |
| mem.llSwapOut.average | Amount of memory swapped-out to host cache | absolute |
| mem.granted.minimum | Sum of all granted metrics for all powered-on virtual machines, plus machine memory for vSphere services on the host. | absolute |
| mem.granted.none | Sum of all granted metrics for all powered-on virtual machines, plus machine memory for vSphere services on the host. | absolute |
| mem.consumed.minimum | Amount of machine memory used on the host. Consumed memory includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. host consumed memory = total host memory - free host memory | absolute |
| mem.vmmemctl.minimum | The sum of all vmmemctl values for all powered-on virtual machines, plus vSphere services on the host. If the balloon target value is greater than the balloon value, the VMkernel inflates the balloon, causing more virtual machine memory to be reclaimed. If the balloon target value is less than the balloon value, the VMkernel deflates the balloon, which allows the virtual machine to consume additional memory if needed. | absolute |
| mem.consumed.average | Amount of machine memory used on the host. Consumed memory includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. host consumed memory = total host memory - free host memory | absolute |
| mem.active.none | Sum of all active metrics for all powered-on virtual machines plus vSphere services (such as COS, vpxa) on the host. | absolute |
| mem.vmfs.pbc.overhead.latest | Amount of VMFS heap used by the VMFS PB Cache | absolute |
| mem.swapout.average | Sum of swapout metrics from all powered-on virtual machines on the host. | absolute |
| mem.usage.minimum | Percentage of available machine memory: consumed ÷ machine-memory-size | absolute |
| mem.consumed.none | Amount of machine memory used on the host. Consumed memory includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. host consumed memory = total host memory - free host memory | absolute |
| mem.heapfree.none | Free address space in the VMkernel main heap. Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. For informational purposes only: not useful for performance monitoring. | absolute |
| mem.vmfs.pbc.size.latest | Space used for holding VMFS Pointer Blocks in memory | absolute |
| mem.llSwapOut.none | Amount of memory swapped-out to host cache | absolute |
| mem.totalCapacity.average | Total amount of memory reservation used by and available for powered-on virtual machines and vSphere services on the host | absolute |
| mem.sysUsage.minimum | Amount of host physical memory used by VMkernel for core functionality, such as device drivers and other internal uses. Does not include memory used by virtual machines or vSphere services. | absolute |
| mem.granted.average | Sum of all granted metrics for all powered-on virtual machines, plus machine memory for vSphere services on the host. | absolute |
| mem.activewrite.average | Estimate for the amount of memory actively being written to by the virtual machine. | absolute |
| mem.unreserved.average | Amount of memory that is unreserved. Memory reservation not used by the Service Console, VMkernel, vSphere services and other powered on VMs’ user-specified memory reservations and overhead memory. This statistic is no longer relevant to virtual machine admission control, as reservations are now handled through resource pools. | absolute |
| mem.llSwapIn.average | Amount of memory swapped-in from host cache | absolute |
| mem.overhead.maximum | Total of all overhead metrics for powered-on virtual machines, plus the overhead of running vSphere services on the host. | absolute |
| mem.llSwapOut.minimum | Amount of memory swapped-out to host cache | absolute |
| mem.shared.none | Sum of all shared metrics for all powered-on virtual machines, plus amount for vSphere services on the host. The host's shared memory may be larger than the amount of machine memory if memory is overcommitted (the aggregate virtual machine configured memory is much greater than machine memory). The value of this statistic reflects how effective transparent page sharing and memory overcommitment are for saving machine memory. | absolute |
| mem.usage.average | Percentage of available machine memory: consumed ÷ machine-memory-size | absolute |
| mem.llSwapInRate.average | Rate at which memory is being swapped from host cache into active memory | rate |
| mem.sysUsage.average | Amount of host physical memory used by VMkernel for core functionality, such as device drivers and other internal uses. Does not include memory used by virtual machines or vSphere services. | absolute |
| mem.lowfreethreshold.average | Threshold of free host physical memory below which ESX/ESXi will begin reclaiming memory from virtual machines through ballooning and swapping | absolute |
| mem.zero.average | Sum of zero metrics for all powered-on virtual machines, plus vSphere services on the host. | absolute |
| mem.vmfs.pbc.capMissRatio.latest | Trailing average of the ratio of capacity misses to compulsory misses for the VMFS PB Cache | absolute |
| mem.state.latest | One of four threshold levels representing the percentage of free memory on the host. The counter value determines swapping and ballooning behavior for memory reclamation. 0 (high): free memory >= 6% of machine memory minus Service Console memory; 1 (soft): 4%; 2 (hard): 2%; 3 (low): 1%. 0 (high) and 1 (soft): ballooning is favored over swapping. 2 (hard) and 3 (low): swapping is favored over ballooning. | absolute |
| mem.heap.average | VMkernel virtual address space dedicated to VMkernel main heap and related data | absolute |
| mem.heapfree.average | Free address space in the VMkernel main heap. Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. For informational purposes only: not useful for performance monitoring. | absolute |
| mem.granted.maximum | Sum of all granted metrics for all powered-on virtual machines, plus machine memory for vSphere services on the host. | absolute |
| mem.llSwapIn.maximum | Amount of memory swapped-in from host cache | absolute |
| mem.llSwapUsed.none | Space used for caching swapped pages in the host cache | absolute |
| mem.swapout.minimum | Sum of swapout metrics from all powered-on virtual machines on the host. | absolute |
| mem.active.average | Sum of all active metrics for all powered-on virtual machines plus vSphere services (such as COS, vpxa) on the host. | absolute |
| mem.llSwapIn.minimum | Amount of memory swapped-in from host cache | absolute |
| mem.overhead.minimum | Total of all overhead metrics for powered-on virtual machines, plus the overhead of running vSphere services on the host. | absolute |
| mem.llSwapUsed.minimum | Space used for caching swapped pages in the host cache | absolute |
| mem.swapinRate.average | Rate at which memory is swapped from disk into active memory during the interval. This counter applies to virtual machines and is generally more useful than the swapin counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. | rate |
| mem.heapfree.maximum | Free address space in the VMkernel main heap. Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. For informational purposes only: not useful for performance monitoring. | absolute |
| mem.shared.average | Sum of all shared metrics for all powered-on virtual machines, plus amount for vSphere services on the host. The host's shared memory may be larger than the amount of machine memory if memory is overcommitted (the aggregate virtual machine configured memory is much greater than machine memory). The value of this statistic reflects how effective transparent page sharing and memory overcommitment are for saving machine memory. | absolute |
mem.zero.maximumSum of zero metrics for all powered-on virtual machines, plus vSphere services on the host.absolute
mem.swapout.maximumSum of swapout metrics from all powered-on virtual machines on the host.absolute
mem.swapin.noneSum of swapin values for all powered-on virtual machines on the host.absolute
mem.heap.minimumVMkernel virtual address space dedicated to VMkernel main heap and related dataabsolute
mem.decompressionRate.averageRate of memory decompression for the virtual machine.rate
mem.compressionRate.averageRate of memory compression for the virtual machine.rate
mem.shared.minimumSum of all shared metrics for all powered-on virtual machines, plus amount for vSphere services on the host. The host's shared memory may be larger than the amount of machine memory if memory is overcommitted (the aggregate virtual machine configured memory is much greater than machine memory). The value of this statistic reflects how effective transparent page sharing and memory overcommitment are for saving machine memory.absolute
mem.unreserved.maximumAmount of memory that is unreserved. Memory reservation not used by the Service Console, VMkernel, vSphere services and other powered on VMs’ user-specified memory reservations and overhead memory. This statistic is no longer relevant to virtual machine admission control, as reservations are now handled through resource pools.absolute
mem.zero.minimumSum of zero metrics for all powered-on virtual machines, plus vSphere services on the host.absolute
mem.llSwapIn.noneAmount of memory swapped-in from host cacheabsolute
mem.swapused.minimumAmount of memory that is used by swap. Sum of memory swapped of all powered on VMs and vSphere services on the host.absolute
mem.sharedcommon.minimumAmount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host.Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing: shared - sharedcommon = machine memory (host memory) savings (KB)absolute
mem.active.maximumSum of all active metrics for all powered-on virtual machines plus vSphere services (such as COS, vpxa) on the host.absolute
mem.vmfs.pbc.workingSet.latestAmount of file blocks whose addresses are cached in the VMFS PB Cacheabsolute
mem.latency.averagePercentage of time the virtual machine is waiting to access swapped or compressed memoryabsolute
mem.unreserved.minimumAmount of memory that is unreserved. Memory reservation not used by the Service Console, VMkernel, vSphere services and other powered on VMs’ user-specified memory reservations and overhead memory. This statistic is no longer relevant to virtual machine admission control, as reservations are now handled through resource pools.absolute
mem.vmfs.pbc.workingSetMax.latestMaximum amount of file blocks whose addresses are cached in the VMFS PB Cacheabsolute
mem.llSwapUsed.averageSpace used for caching swapped pages in the host cacheabsolute
mem.sharedcommon.maximumAmount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host.Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing:shared - sharedcommon = machine memory (host memory) savings (KB)absolute
mem.swapout.noneSum of swapout metrics from all powered-on virtual machines on the host.absolute
mem.swapused.maximumAmount of memory that is used by swap. Sum of memory swapped of all powered on VMs and vSphere services on the host.absolute
mem.usage.nonePercentage of available machine memory:consumed ÷ machine-memory-sizeabsolute
mem.heapfree.minimumFree address space in the VMkernel main heap.Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. For informational purposes only: not useful for performance monitoring.absolute
mem.heap.noneVMkernel virtual address space dedicated to VMkernel main heap and related dataabsolute
mem.vmmemctl.averageThe sum of all vmmemctl values for all powered-on virtual machines, plus vSphere services on the host. If the balloon target value is greater than the balloon value, the VMkernel inflates the balloon, causing more virtual machine memory to be reclaimed. If the balloon target value is less than the balloon value, the VMkernel deflates the balloon, which allows the virtual machine to consume additional memory if needed.absolute
mem.sysUsage.maximumAmount of host physical memory used by VMkernel for core functionality, such as device drivers and other internal uses. Does not include memory used by virtual machines or vSphere services.absolute
mem.llSwapOutRate.averageRate at which memory is being swapped from active memory to host cacherate
mem.swapused.noneAmount of memory that is used by swap. Sum of memory swapped of all powered on VMs and vSphere services on the host.absolute
mem.overhead.averageTotal of all overhead metrics for powered-on virtual machines, plus the overhead of running vSphere services on the host.absolute
mem.swapused.averageAmount of memory that is used by swap. Sum of memory swapped of all powered on VMs and vSphere services on the host.absolute
mem.shared.maximumSum of all shared metrics for all powered-on virtual machines, plus amount for vSphere services on the host. The host's shared memory may be larger than the amount of machine memory if memory is overcommitted (the aggregate virtual machine configured memory is much greater than machine memory). The value of this statistic reflects how effective transparent page sharing and memory overcommitment are for saving machine memory.absolute
mem.sharedcommon.noneAmount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host.Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing: shared - sharedcommon = machine memory (host memory) savings (KB)absolute
mem.usage.maximumPercentage of available machine memory:consumed ÷ machine-memory-sizeabsolute
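
The sharedcommon and state rows above describe a derived value (shared - sharedcommon) and a four-level enumeration. A minimal Python sketch of both follows. The function names and the sample values in the usage note are hypothetical, chosen only for illustration; they are not part of falcon-vsphere.

```python
# Hypothetical helpers; names are illustrative only and not taken from the agent.

# mem.state.latest levels as listed in the mem.state row of the table.
MEM_STATE_LABELS = {0: "high", 1: "soft", 2: "hard", 3: "low"}


def memory_savings_kb(shared_kb, sharedcommon_kb):
    """Machine memory saved by page sharing: shared - sharedcommon (KB)."""
    return shared_kb - sharedcommon_kb


def preferred_reclamation(state):
    """Return the reclamation technique favored at a given mem.state level."""
    if state not in MEM_STATE_LABELS:
        raise ValueError(f"unknown mem.state value: {state}")
    # 0 (high) and 1 (soft) favor ballooning; 2 (hard) and 3 (low) favor swapping.
    return "ballooning" if state in (0, 1) else "swapping"
```

For example, with made-up readings of 1000 KB for shared and 250 KB for sharedcommon, `memory_savings_kb(1000, 250)` gives 750 KB of savings, and `preferred_reclamation(3)` returns `"swapping"`.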
  • datastore metrics

| Metric name | Description | Type |
| --- | --- | --- |
| datastore.datastoreWriteBytes.latest | Storage DRS datastore bytes written | absolute |
| datastore.datastoreIops.average | Average amount of time for an I/O operation to the datastore or LUN across all ESX hosts accessing it. | absolute |
| datastore.siocActiveTimePercentage.average | Percentage of time Storage I/O Control actively controlled datastore latency | absolute |
| datastore.numberWriteAveraged.average | Average number of write commands issued per second to the datastore during the collection interval | rate |
| datastore.totalWriteLatency.average | Average amount of time for a write operation to the datastore. Total latency = kernel latency + device latency. | absolute |
| datastore.datastoreNormalWriteLatency.latest | Storage DRS datastore normalized write latency | absolute |
| datastore.sizeNormalizedDatastoreLatency.average | Storage I/O Control size-normalized I/O latency | absolute |
| datastore.datastoreVMObservedLatency.latest | The average datastore latency as seen by virtual machines | absolute |
| datastore.datastoreReadLoadMetric.latest | Storage DRS datastore metric for read workload model | absolute |
| datastore.datastoreMaxQueueDepth.latest | Storage I/O Control datastore maximum queue depth | absolute |
| datastore.maxTotalLatency.latest | Highest latency value across all datastores used by the host | absolute |
| datastore.datastoreReadIops.latest | Storage DRS datastore read I/O rate | absolute |
| datastore.datastoreReadBytes.latest | Storage DRS datastore bytes read | absolute |
| datastore.datastoreWriteIops.latest | Storage DRS datastore write I/O rate | absolute |
| datastore.datastoreWriteLoadMetric.latest | Storage DRS datastore metric for write workload model | absolute |
| datastore.datastoreWriteOIO.latest | Storage DRS datastore outstanding write requests | absolute |
| datastore.datastoreReadOIO.latest | Storage DRS datastore outstanding read requests | absolute |
| datastore.numberReadAveraged.average | Average number of read commands issued per second to the datastore during the collection interval | rate |
| datastore.totalReadLatency.average | Average amount of time for a read operation from the datastore. Total latency = kernel latency + device latency. | absolute |
| datastore.read.average | Rate of reading data from the datastore | rate |
| datastore.datastoreNormalReadLatency.latest | Storage DRS datastore normalized read latency | absolute |
| datastore.write.average | Rate of writing data to the datastore | absolute |
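
The metric names in these tables appear to follow a `group.counter.rollup` pattern, where the counter part may itself contain dots (as in `mem.vmfs.pbc.capMissRatio.latest`). The sketch below splits a name on that assumption; `split_metric_name` is a hypothetical helper and is not part of falcon-vsphere.

```python
def split_metric_name(name):
    """Split 'group.counter.rollup' into its three parts.

    Assumes the first token is the group, the last token is the rollup
    (average, latest, maximum, minimum, none, ...), and everything in
    between is the counter name, which may contain dots.
    """
    parts = name.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least three dot-separated parts: {name!r}")
    group, rollup = parts[0], parts[-1]
    counter = ".".join(parts[1:-1])
    return group, counter, rollup
```

With this helper, `split_metric_name("datastore.totalReadLatency.average")` yields the group `datastore`, the counter `totalReadLatency`, and the rollup `average`, which can be handy when filtering or grouping metrics by category.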