(1) Add a user
# useradd tigk
# passwd tigk
Changing password for user tigk.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
(2) Grant privileges
A regular user has full permissions only under its own home directory; other directories require authorization. Root privileges are needed often, so grant them through the sudoers file and use the sudo command.
# Grant write permission on the file
# chmod -v u+w /etc/sudoers
mode of ‘/etc/sudoers’ changed from 0440 (r--r-----) to 0640 (rw-r-----)
Edit the sudoers file with vi /etc/sudoers and add the new user line "tigk ALL=(ALL) ALL":
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
tigk ALL=(ALL) ALL
Revoke the write permission:
# chmod -v u-w /etc/sudoers
mode of ‘/etc/sudoers’ changed from 0640 (rw-r-----) to 0440 (r--r-----)
Create the tigk installation directory:
# su - tigk
$ mkdir /home/tigk/.local
(3) Create directories to hold the TIGK-related files
# mkdir /data/tigk
# chown tigk:tigk /data/tigk
# su - tigk
$ mkdir /data/tigk/telegraf
$ mkdir /data/tigk/influxdb
$ mkdir /data/tigk/kapacitor
Download and unpack the Telegraf tarball (downloaded here to /opt/package, matching the path used below):
$ wget -P /opt/package https://dl.influxdata.com/telegraf/releases/telegraf-1.14.4_linux_amd64.tar.gz
$ tar xf /opt/package/telegraf-1.14.4_linux_amd64.tar.gz -C /home/tigk/.local/
The executable ends up at {telegraf root}/usr/bin/telegraf. The configuration file lives under the unpacked etc directory, or can be generated directly:
Show help: telegraf --help
Generate a configuration file: telegraf config > telegraf.conf
Generate a configuration file with only the cpu, mem, http_listener, and influxdb plugins:
telegraf --input-filter cpu:mem:http_listener --output-filter influxdb config > telegraf.conf
Run: telegraf --config telegraf.conf
Run in the background: nohup telegraf --config telegraf.conf > /dev/null 2>&1 &
$ cd /home/tigk/.local/telegraf/usr/bin
$ ./telegraf --help
$ ./telegraf config > telegraf.conf
$ ./telegraf --input-filter cpu:mem:http_listener --output-filter influxdb config > telegraf.conf
$ mkdir /data/tigk/telegraf/logs
$ mkdir /data/tigk/telegraf/conf
$ cp /home/tigk/.local/telegraf/usr/bin/telegraf.conf /data/tigk/telegraf/conf
$ vim /data/tigk/telegraf/conf/telegraf.conf
Find the [[outputs.influxdb]] section and provide the username and password; also point the agent log file at the logs directory created above:
[[outputs.influxdb]]
urls = ["http://10.0.165.2:8085"]
timeout = "5s"
username = "tigk"
password = "tigk"
[agent]
logfile = "/data/tigk/telegraf/logs/telegraf.log"
Start:
$ cd /home/tigk/.local/telegraf/usr/bin
$ nohup ./telegraf --config /data/tigk/telegraf/conf/telegraf.conf &
The steps below install Telegraf from the official RPM package instead of the tarball.
(1) Download the RPM package
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.14.4-1.x86_64.rpm
(2) Install the RPM package
sudo yum localinstall telegraf-1.14.4-1.x86_64.rpm
(3) Start the service and enable it at boot
systemctl start telegraf.service
systemctl status telegraf.service
systemctl enable telegraf.service
(4) Check the version and edit the configuration file
telegraf --version
Default configuration file location: /etc/telegraf/telegraf.conf
Edit the telegraf configuration file:
vim /etc/telegraf/telegraf.conf
(5) Restart to apply the changes
systemctl restart telegraf.service
(1) Command overview: telegraf -h
$ ./telegraf -h
Telegraf, The plugin-driven server agent for collecting and reporting metrics.
Usage:
telegraf [commands|flags]
The commands & flags are:
config print out full sample configuration to stdout
version print the version to stdout
--aggregator-filter <filter> filter the aggregators to enable, separator is :
--config <file> configuration file to load
--config-directory <directory> directory containing additional *.conf files
--plugin-directory directory containing *.so files, this directory will be
searched recursively. Any Plugin found will be loaded
and namespaced.
--debug turn on debug logging
--input-filter <filter> filter the inputs to enable, separator is :
--input-list print available input plugins.
--output-filter <filter> filter the outputs to enable, separator is :
--output-list print available output plugins.
--pidfile <file> file to write our pid to
--pprof-addr <address> pprof address to listen on, don't activate pprof if empty
--processor-filter <filter> filter the processors to enable, separator is :
--quiet run in quiet mode
--section-filter filter config sections to output, separator is :
Valid values are 'agent', 'global_tags', 'outputs',
'processors', 'aggregators' and 'inputs'
--sample-config print out full sample configuration
--test gather metrics, print them out, and exit;
processors, aggregators, and outputs are not run
--test-wait wait up to this many seconds for service
inputs to complete in test mode
--usage <plugin> print usage for a plugin, ie, 'telegraf --usage mysql'
--version display the version and exit
Examples:
# generate a telegraf config file:
telegraf config > telegraf.conf
# generate config with only cpu input & influxdb output plugins defined
telegraf --input-filter cpu --output-filter influxdb config
# run a single telegraf collection, outputing metrics to stdout
telegraf --config telegraf.conf --test
# run telegraf with all plugins defined in config file
telegraf --config telegraf.conf
# run telegraf, enabling the cpu & memory input, and influxdb output plugins
telegraf --config telegraf.conf --input-filter cpu:mem --output-filter influxdb
# run telegraf with pprof
telegraf --config telegraf.conf --pprof-addr localhost:6060
(2) Command usage
Command | Description |
---|---|
telegraf --help | Show help |
telegraf config > telegraf.conf | Write a full sample configuration template to stdout |
telegraf --input-filter cpu --output-filter influxdb config | Generate a configuration template containing only the cpu input and influxdb output plugins |
telegraf --config telegraf.conf --test | Run one collection with the given configuration file and print the gathered metrics to stdout |
telegraf --config telegraf.conf | Start telegraf with the given configuration file |
telegraf --config telegraf.conf --input-filter cpu:mem --output-filter influxdb | Start telegraf with the given configuration file, enabling only the cpu and mem inputs and the influxdb output |
(3) Configuration file locations
Install method | Default location | Default supplementary config directory |
---|---|---|
Linux RPM package | /etc/telegraf/telegraf.conf | /etc/telegraf/telegraf.d |
Linux tarball | {install dir}/etc/telegraf/telegraf.conf | {install dir}/etc/telegraf/telegraf.d |
(4) How configuration is loaded
By default telegraf loads telegraf.conf plus every *.conf file under /etc/telegraf/telegraf.d; the --config and --config-directory options change this behavior (e.g. telegraf --config /data/tigk/telegraf/conf/telegraf.conf --config-directory /etc/telegraf/telegraf.d). Each input block in the configuration is collected by its own thread, so duplicated input blocks waste resources.
(5) Global tags
Define key = "value" pairs in the [global_tags] section of the configuration file; every collected metric is then tagged with them.
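A minimal sketch (the tag names and values here are illustrative, not from the original setup):
[global_tags]
# Added to every metric collected on this host.
dc = "cn-north-1"
owner = "tigk"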
(6) Agent configuration
The [agent] section configures the collection behavior of the agent for the whole host.
Setting | Description |
---|---|
interval | Data collection interval |
round_interval | Round collection times to the interval. With interval = "10s", collections happen at :00, :10, :20, :30... of every minute |
metric_batch_size | Batch size of metrics sent to outputs |
metric_buffer_limit | Size of the metric buffer kept for each output |
collection_jitter | Maximum random sleep before each collection, so agents do not all collect at the same instant |
flush_interval | Interval between flushes to outputs |
flush_jitter | Maximum random sleep before each flush, to avoid large write spikes when many agents flush together |
precision | Timestamp precision |
logfile | Log file name |
debug | Run in debug mode |
quiet | Run in quiet mode (error messages only) |
hostname | Defaults to os.Hostname(); overrides it if set |
omit_hostname | If true, do not add the host tag to metrics |
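Putting the table together, an [agent] section might look like the following; the values mirror the defaults in the sample telegraf.conf, with the log path taken from the setup above:
[agent]
interval = "10s"          # collect from all inputs every 10 seconds
round_interval = true     # align collections to :00, :10, :20, ...
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
logfile = "/data/tigk/telegraf/logs/telegraf.log"
omit_hostname = false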
(7) Common input plugin settings
Setting | Description |
---|---|
interval | Collection interval for this input; overrides the [agent] setting if present |
name_override | Replace the output measurement name entirely |
name_prefix | Prefix prepended to the measurement name |
name_suffix | Suffix appended to the measurement name |
tags | A map of extra tags to add to the output measurement |
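For instance, a cpu input that overrides the agent interval, renames its measurement, and adds a tag (a sketch; the interval, prefix, and tag are arbitrary):
[[inputs.cpu]]
interval = "30s"        # overrides the [agent] interval for this input only
name_prefix = "dev_"    # points are written to the measurement dev_cpu
[inputs.cpu.tags]
team = "ops"            # extra tag on every point from this input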
(8) Common output plugin settings: none
(9) Measurement filtering; can be defined in input, output, and other plugins
Setting | Description |
---|---|
namepass | Only points whose measurement name matches one of these glob patterns pass |
namedrop | Points whose measurement name matches one of these glob patterns are dropped |
fieldpass | Only fields whose key matches one of these glob patterns pass |
fielddrop | Fields whose key matches are dropped |
tagpass | Only points with a tag matching one of the patterns pass |
tagdrop | Points with a matching tag are dropped |
taginclude | Keep only the tags whose key matches; other tags are removed from the point |
tagexclude | Remove the tags whose key matches from the point |
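A sketch of both filter styles (the glob patterns are illustrative):
[[inputs.disk]]
# tagpass/tagdrop are sub-tables and must come last in the plugin block.
[inputs.disk.tagpass]
fstype = ["ext4", "xfs"]   # pass only points from ext4 and xfs filesystems

[[outputs.influxdb]]
namedrop = ["swap*"]       # drop swap-related measurements before writing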
(10) Typical configuration examples
①Input - System – cpu
# Read metrics about cpu usage
[[inputs.cpu]]
## Whether to report per-cpu stats or not
percpu = true
## Whether to report total system cpu stats or not
totalcpu = true
## If true, collect raw CPU time metrics.
collect_cpu_time = false
## If true, compute and report the sum of all non-idle CPU states.
report_active = false
②Input - System – disk
# Read metrics about disk usage by mount point
[[inputs.disk]]
## By default stats will be gathered for all mount points.
## Set mount_points will restrict the stats to only the specified mount points.
# mount_points = ["/"]
## Ignore mount points by filesystem type.
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
③Input - System – kernel
# Get kernel statistics from /proc/stat
[[inputs.kernel]]
# no configuration
④Input - System – MEM
# Read metrics about memory usage
[[inputs.mem]]
# no configuration
⑤Input - System – netstat
# # Read TCP metrics such as established, time wait and sockets counts.
# [[inputs.netstat]]
# # no configuration
⑥Input - System – processes
# Get the number of processes and group them by status
[[inputs.processes]]
# no configuration
⑦Input - System – system
# Read metrics about system load & uptime
[[inputs.system]]
## Uncomment to remove deprecated metrics.
# fielddrop = ["uptime_format"]
⑧Input - System – ping
# # Ping given url(s) and return statistics
# [[inputs.ping]]
# ## Hosts to send ping packets to.
# urls = ["example.org"]
#
# ## Method used for sending pings, can be either "exec" or "native". When set
# ## to "exec" the systems ping command will be executed. When set to "native"
# ## the plugin will send pings directly.
# ##
# ## While the default is "exec" for backwards compatibility, new deployments
# ## are encouraged to use the "native" method for improved compatibility and
# ## performance.
# # method = "exec"
#
# ## Number of ping packets to send per interval. Corresponds to the "-c"
# ## option of the ping command.
# # count = 1
#
# ## Time to wait between sending ping packets in seconds. Operates like the
# ## "-i" option of the ping command.
# # ping_interval = 1.0
#
# ## If set, the time to wait for a ping response in seconds. Operates like
# ## the "-W" option of the ping command.
# # timeout = 1.0
#
# ## If set, the total ping deadline, in seconds. Operates like the -w option
# ## of the ping command.
# # deadline = 10
#
# ## Interface or source address to send ping from. Operates like the -I or -S
# ## option of the ping command.
# # interface = ""
#
# ## Specify the ping executable binary.
# # binary = "ping"
#
# ## Arguments for ping command. When arguments is not empty, the command from
# ## the binary option will be used and other options (ping_interval, timeout,
# ## etc) will be ignored.
# # arguments = ["-c", "3"]
#
# ## Use only IPv6 addresses when resolving a hostname.
# # ipv6 = false
⑨Input - App – procstat
# [[inputs.procstat]]
# ## PID file to monitor process
# pid_file = "/var/run/nginx.pid"
# ## executable name (ie, pgrep <exe>)
# # exe = "nginx"
# ## pattern as argument for pgrep (ie, pgrep -f <pattern>)
# # pattern = "nginx"
# ## user as argument for pgrep (ie, pgrep -u <user>)
# # user = "nginx"
# ## Systemd unit name
# # systemd_unit = "nginx.service"
# ## CGroup name or path
# # cgroup = "systemd/system.slice/nginx.service"
#
# ## Windows service name
# # win_service = ""
#
# ## override for process_name
# ## This is optional; default is sourced from /proc/<pid>/status
# # process_name = "bar"
#
# ## Field name prefix
# # prefix = ""
#
# ## When true add the full cmdline as a tag.
# # cmdline_tag = false
#
# ## Add PID as a tag instead of a field; useful to differentiate between
# ## processes whose tags are otherwise the same. Can create a large number
# ## of series, use judiciously.
# # pid_tag = false
#
# ## Method to use when finding process IDs. Can be one of 'pgrep', or
# ## 'native'. The pgrep finder calls the pgrep executable in the PATH while
# ## the native finder performs the search directly in a manor dependent on the
# ## platform. Default is 'pgrep'
# # pid_finder = "pgrep"
⑩Input – App – redis
# # Read metrics from one or many redis servers
# [[inputs.redis]]
# ## specify servers via a url matching:
# ## [protocol://][:password]@address[:port]
# ## e.g.
# ## tcp://localhost:6379
# ## tcp://:password@192.168.99.100
# ## unix:///var/run/redis.sock
# ##
# ## If no servers are specified, then localhost is used as the host.
# ## If no port is specified, 6379 is used
# servers = ["tcp://localhost:6379"]
#
# ## specify server password
# # password = "s#cr@t%"
#
# ## Optional TLS Config
# # tls_ca = "/etc/telegraf/ca.pem"
# # tls_cert = "/etc/telegraf/cert.pem"
# # tls_key = "/etc/telegraf/key.pem"
# ## Use TLS but skip chain & host verification
# # insecure_skip_verify = true
⑪Input – App – kafka_consumer
# # Read metrics from Kafka topics
# [[inputs.kafka_consumer]]
# ## Kafka brokers.
# brokers = ["localhost:9092"]
#
# ## Topics to consume.
# topics = ["telegraf"]
#
# ## When set this tag will be added to all metrics with the topic as the value.
# # topic_tag = ""
#
# ## Optional Client id
# # client_id = "Telegraf"
#
# ## Set the minimal supported Kafka version. Setting this enables the use of new
# ## Kafka features and APIs. Must be 0.10.2.0 or greater.
# ## ex: version = "1.1.0"
# # version = ""
#
# ## Optional TLS Config
# # enable_tls = true
# # tls_ca = "/etc/telegraf/ca.pem"
# # tls_cert = "/etc/telegraf/cert.pem"
# # tls_key = "/etc/telegraf/key.pem"
# ## Use TLS but skip chain & host verification
# # insecure_skip_verify = false
#
# ## SASL authentication credentials. These settings should typically be used
# ## with TLS encryption enabled using the "enable_tls" option.
# # sasl_username = "kafka"
# # sasl_password = "secret"
#
# ## SASL protocol version. When connecting to Azure EventHub set to 0.
# # sasl_version = 1
#
# ## Name of the consumer group.
# # consumer_group = "telegraf_metrics_consumers"
#
# ## Initial offset position; one of "oldest" or "newest".
# # offset = "oldest"
#
# ## Consumer group partition assignment strategy; one of "range", "roundrobin" or "sticky".
# # balance_strategy = "range"
#
# ## Maximum length of a message to consume, in bytes (default 0/unlimited);
# ## larger messages are dropped
# max_message_len = 1000000
#
# ## Maximum messages to read from the broker that have not been written by an
# ## output. For best throughput set based on the number of metrics within
# ## each message and the size of the output's metric_batch_size.
# ##
# ## For example, if each message from the queue contains 10 metrics and the
# ## output metric_batch_size is 1000, setting this to 100 will ensure that a
# ## full batch is collected and the write is triggered immediately without
# ## waiting until the next flush_interval.
# # max_undelivered_messages = 1000
#
# ## Data format to consume.
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
# data_format = "influx"
⑫Input – App – exec
# # Read metrics from one or more commands that can output to stdout
# [[inputs.exec]]
# ## Commands array
# commands = [
# "/tmp/test.sh",
# "/usr/bin/mycollector --foo=bar",
# "/tmp/collect_*.sh"
# ]
#
# ## Timeout for each command to complete.
# timeout = "5s"
#
# ## measurement name suffix (for separating different commands)
# name_suffix = "_mycollector"
#
# ## Data format to consume.
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
# data_format = "influx"
⑬Output – influxdb_v2
# # Configuration for sending metrics to InfluxDB
# [[outputs.influxdb_v2]]
# ## The URLs of the InfluxDB cluster nodes.
# ##
# ## Multiple URLs can be specified for a single cluster, only ONE of the
# ## urls will be written to each interval.
# ## ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
# urls = ["http://127.0.0.1:9999"]
#
# ## Token for authentication.
# token = ""
#
# ## Organization is the name of the organization you wish to write to; must exist.
# organization = ""
#
# ## Destination bucket to write into.
# bucket = ""
#
# ## The value of this tag will be used to determine the bucket. If this
# ## tag is not set the 'bucket' option is used as the default.
# # bucket_tag = ""
#
# ## If true, the bucket tag will not be added to the metric.
# # exclude_bucket_tag = false
#
# ## Timeout for HTTP messages.
# # timeout = "5s"
#
# ## Additional HTTP headers
# # http_headers = {"X-Special-Header" = "Special-Value"}
#
# ## HTTP Proxy override, if unset values the standard proxy environment
# ## variables are consulted to determine which proxy, if any, should be used.
# # http_proxy = "http://corporate.proxy:3128"
#
# ## HTTP User-Agent
# # user_agent = "telegraf"
#
# ## Content-Encoding for write request body, can be set to "gzip" to
# ## compress body or "identity" to apply no encoding.
# # content_encoding = "gzip"
#
# ## Enable or disable uint support for writing uints influxdb 2.0.
# # influx_uint_support = false
#
# ## Optional TLS Config for use on HTTP connections.
# # tls_ca = "/etc/telegraf/ca.pem"
# # tls_cert = "/etc/telegraf/cert.pem"
# # tls_key = "/etc/telegraf/key.pem"
# ## Use TLS but skip chain & host verification
# # insecure_skip_verify = false
For example, to collect the running applications in YARN and store them in InfluxDB: ① use the exec input plugin to run a script whose standard output prints lines in the InfluxDB line protocol, and ② have the script call the YARN REST API to fetch the applications that are currently running.
#!/usr/bin/env python2
# Query the YARN ResourceManager REST API for RUNNING applications of type
# "Apache Flink" and print one InfluxDB line-protocol line per application.
import json
import urllib
import httplib

host = "10.0.165.3:8088"
path = "/ws/v1/cluster/apps"
data = urllib.urlencode({"state": "RUNNING", "applicationTypes": "Apache Flink"})
path = path + "?" + data
headers = {"Accept": "application/json"}

conn = httplib.HTTPConnection(host)
conn.request("GET", path, headers=headers)
result = conn.getresponse()
if result.status == 200:
    content = result.read()
    apps = json.loads(content)["apps"]["app"]
    for app in apps:
        # Skip test jobs.
        if "test" in app["name"] or "TEST" in app["name"] or "Test" in app["name"]:
            continue
        # Escape spaces in tag values, as the line protocol requires.
        app["escaped_name"] = app["name"].replace(" ", "\\ ")
        print "APPLICATION.RUNNING,appname=%s,appid=%s field_appname=\"%s\",field_appid=\"%s\"" % (
            app["escaped_name"], app["id"], app["name"], app["id"])
Sample output: APPLICATION.RUNNING,appname=iot_road_traffic,appid=application_1592979353214_0175 field_appname="iot_road_traffic",field_appid="application_1592979353214_0175"
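Each printed line follows the InfluxDB line protocol:
measurement,tag1=v1,tag2=v2 field1="fv1",field2="fv2" [timestamp]
When the timestamp is omitted, as it is here, the metric is stamped with the collection time.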
Configure the exec input plugin as follows:
[[inputs.exec]]
## Commands array; each command must print metrics in the configured data format.
commands = ["python /data/tigk/telegraf/exec/getRunningFlinkJob.py"]
## Timeout for each command to complete.
timeout = "5s"
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
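After saving the configuration, restart telegraf so the new input takes effect (for the tarball install above: nohup ./telegraf --config /data/tigk/telegraf/conf/telegraf.conf & from /home/tigk/.local/telegraf/usr/bin), then verify that points arrive in InfluxDB, e.g. with the InfluxQL query SELECT * FROM "APPLICATION.RUNNING".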