

General

Metric Description Remark
user Time spent in user mode
nice Time spent in user mode with low priority (nice)
system Time spent in system mode
idle Time spent in the idle task
iowait Time waiting for I/O to complete
irq Time servicing interrupts
softirq Time servicing softirqs
steal Stolen time, which is the time spent in other operating systems when running in a virtualized environment
guest Time spent running a virtual CPU for guest operating systems under the control of the Linux kernel
guest_nice Time spent running a niced guest (virtual CPU for guest operating systems under the control of the Linux kernel)
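The counters above come from the first line of /proc/stat and are cumulative clock ticks since boot, so a usage percentage is computed from the delta between two snapshots. A minimal Python sketch (the sample lines and the exact formula are illustrative, not authoritative):

```python
# Sketch: derive overall CPU usage (%) from two snapshots of the
# first 'cpu' line of /proc/stat. Field order follows the table above;
# values are cumulative clock ticks, so usage comes from deltas.
# The sample lines below are made up.

def parse_cpu_line(line):
    """Split a 'cpu ...' line from /proc/stat into its numeric fields."""
    fields = line.split()
    # user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice
    return [int(v) for v in fields[1:]]

def cpu_usage_percent(prev, curr):
    """Usage between two snapshots: 100 * (1 - idle_delta / total_delta)."""
    idle_delta = (curr[3] + curr[4]) - (prev[3] + prev[4])  # idle + iowait
    total_delta = sum(curr) - sum(prev)
    return 100.0 * (1.0 - idle_delta / total_delta)

prev = parse_cpu_line("cpu 1000 10 300 8000 200 5 15 0 0 0")
curr = parse_cpu_line("cpu 1100 10 350 8400 220 5 20 0 0 0")
print(round(cpu_usage_percent(prev, curr), 1))  # -> 27.0
```

On a live system the snapshots would be read from /proc/stat a fixed interval apart.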
Metric Description Remark
total
used
buffers
cached
available total - (used + buffers + cached)
active
inactive
  • Process Status Metrics
Metric Format Description Remark
pid %d The process ID
comm %s The filename of the executable, in parentheses
state %c process state R (Running), S (Sleeping), D (Waiting in uninterruptible disk sleep), Z (Zombie), T (Stopped), X (Dead)
ppid %d The PID of the parent of this process
pgrp %d The process group ID of the process
tty_nr %d The controlling terminal of the process
utime %lu Amount of time that this process has been scheduled in user mode, measured in clock ticks
stime %lu Amount of time that this process has been scheduled in kernel mode, measured in clock ticks
nice %ld The nice value [19 (low priority), -20 (high priority)]
num_threads %ld Number of threads in this process
starttime %llu The time the process started after system boot
vsize %lu Virtual memory size in bytes
rss %ld Resident Set Size: number of pages the process has in real memory
rsslim %lu Current soft limit in bytes on the rss of the process
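Because comm is printed in parentheses and may itself contain spaces, a /proc/<pid>/stat line is easiest to parse by splitting on the last ')'. A small Python sketch with an illustrative (made-up) sample line:

```python
# Sketch: pull a few of the fields above out of a /proc/<pid>/stat line.
# comm is wrapped in parentheses and may contain spaces, so split on the
# LAST ')' rather than naively on whitespace. Field numbers follow man 5 proc.

def parse_proc_stat(line):
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar].strip())
    comm = line[lpar + 1:rpar]
    rest = line[rpar + 1:].split()   # rest[0] is field 3 (state)
    return {
        "pid": pid,
        "comm": comm,
        "state": rest[0],              # field 3
        "ppid": int(rest[1]),          # field 4
        "utime": int(rest[11]),        # field 14, clock ticks
        "stime": int(rest[12]),        # field 15, clock ticks
        "num_threads": int(rest[17]),  # field 20
    }

sample = ("1234 (my worker) S 1 1234 1234 0 -1 4194304 500 0 0 0 "
          "35 12 0 0 20 0 4 0 9876 104857600 2500")
info = parse_proc_stat(sample)
print(info["comm"], info["state"], info["utime"], info["num_threads"])
```

Note that utime and stime are in clock ticks; divide by `os.sysconf('SC_CLK_TCK')` to convert to seconds.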

Syslog

References

Severity Level

Severity Keyword Value Description Remarks
Emergency emerg 0 System is unusable
Alert alert 1 Action must be taken immediately
Critical crit 2 Critical conditions
Error err 3 Error conditions
Warning warning 4 Warning conditions
Notice notice 5 Normal but significant conditions
Informational info 6 Informational messages
Debug debug 7 Debug-level messages
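These severity values appear in the <PRI> field at the start of a syslog packet, encoded as PRI = facility × 8 + severity, so both parts can be recovered with divmod. A short Python sketch:

```python
# Sketch: decode a syslog <PRI> value into (facility, severity keyword)
# using PRI = facility * 8 + severity and the severity table above.

SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]

def decode_pri(pri):
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]

# <165> = facility 20 (local4), severity 5 (notice)
print(decode_pri(165))  # -> (20, 'notice')
```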

rsyslog

References

Configuration

config = {global-directive} , {template} , {output-channel} , {rule};

(* old definition of 'rule' starting with 'selector' is replaced by the new 'rule' starting with more general 'filter' *)
rule = filter , " " , action;

filter = facility-priority-based-filter | property-based-filter | expression-based-filter;

facility-priority-based-filter = selector , { ";" , selector };
selector = facility , { "," , facility } , "." , [ "!" ] , [ "=" ] , priority;

property-based-filter = ":" , property , ", " , compare-operation , " " , string;
property = message-property | system-property | time-property;
message-property = "msg" | "rawmsg" | "rawmsg-after-pri"
                 | "source" | "fromhost" | "fromhost-ip" | "syslogtag" | "programname" | ...;

action = regular-file | named-pipe | console | remote-host;
regular-file = ["-"] , ["?"] , file-name , [ ";" , template-name ];
remote-host = (( "@" , [ "(" , "z" , [1-9] , ")" ] ) | ( "@@" , [ "(" , ( "o" | ( "z" , [1-9] ) ) , ")" ] )) , ip-address , [ ":" , ip-port ];
  • Actions
    • Templates can be used with many actions. If used, the specified template is used to generate the message content (instead of the default template). To specify a template, write a semicolon after the action value immediately followed by the template name.
    • You can have multiple actions for a single selector (or more precisely a single filter of such a selector line).
  • Regular File
Configuration Examples

sysklogd

  • Syntax
facility = "auth" | "authpriv" | "cron" | "daemon" | "ftp" | "kern" | "lpr" | "mail" | "mark" | "news" | "syslog" | "user" | "uucp" | "local0" ... "local7" | "*"
priority = "debug" | "info" | "notice" | "warning" | "err" | "crit" | "alert" | "emerg" | "*" | "none"

(* Multiple facilities may be specified for a single priority pattern in one statement using the comma (",") operator to separate the facilities *)
selector = facility , { "," , facility } , "." , [ "!" ] , [ "=" ] , priority

remote machine = "@", machine name

action = regular file | named pipe | console | remote machine

(* Multiple selectors may be specified for a single action using the semicolon (";") separator. *)
rule = selector , { ";" , selector }, " ", action

Example

# Store critical stuff in critical
#
*.=crit;kern.none   /var/adm/critical

# Kernel messages are stored in the kernel file,
# critical messages and higher ones also go
# to another host and to the console
#
kern.*      /var/adm/kernel
kern.crit     @finlandia
kern.crit     /dev/console
kern.info;kern.!err   /var/adm/kernel-info

# The tcp wrapper logs with mail.info, we display
# all the connections on tty12
#
mail.=info     /dev/tty12

# Write all mail related logs to a file
#
mail.*;mail.!=info   /var/adm/mail

# Log info and notice messages to messages file
#
*.=info;*.=notice;\
mail.none /var/log/messages

# Emergency messages will be displayed using wall
#
*.=emerg      *

# Messages of the priority alert will be directed
# to the operator
#
*.alert      root,joey

*.*       @finlandia
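The facility/priority matching behind rules like the ones above can be approximated in a few lines. This is a simplified sketch (hypothetical helper, not sysklogd's actual implementation): it treats '!' as plain negation, and multi-selector rules (';') and facility lists (',') beyond a simple comma split are omitted.

```python
# Sketch: evaluate one sysklogd-style selector ("facility.priority")
# against a message. Handles '*', 'none', '=' (exactly this priority)
# and a naive '!' negation; a plain priority means "this or higher".

PRIORITIES = ["debug", "info", "notice", "warning",
              "err", "crit", "alert", "emerg"]

def selector_matches(selector, facility, priority):
    fac_pat, prio_pat = selector.split(".")
    if fac_pat != "*" and facility not in fac_pat.split(","):
        return False
    negate = prio_pat.startswith("!")
    prio_pat = prio_pat.lstrip("!")
    if prio_pat == "none":
        return negate  # 'none' selects nothing from these facilities
    if prio_pat.startswith("="):
        match = priority == prio_pat[1:]
    else:
        match = PRIORITIES.index(priority) >= PRIORITIES.index(prio_pat)
    return match != negate

print(selector_matches("kern.crit", "kern", "alert"))  # -> True
print(selector_matches("*.=info", "mail", "notice"))   # -> False
```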

Properties

Category Properties Description Remarks
Message msg the MSG part of the message (aka “the message” ;))
syslogtag TAG from the message
programname the “static” part of the tag, as defined by BSD syslogd
syslogfacility the facility from the message - in numerical form
syslogfacility-text the facility from the message - in text form
syslogseverity severity from the message - in numerical form
syslogseverity-text severity from the message - in text form
Property Replacer Options
Option Description Remarks
uppercase convert property to uppercase only
lowercase convert property text to lowercase only
sp-if-no-1st-sp For any field given, it returns either a single space character or no character at all. A space is returned if (and only if) the first character of the field’s content is NOT a space.

RainerScript

Configuration Objects
Object Description Remarks
global() used to set global configuration parameters
module() used to load plugins
input() the primary means of describing inputs
action() the primary means of describing actions to be carried out
timezone() used to define timezone settings

Modules

Input Modules
Module Title Description Parameters Remarks
imuxsock Unix Socket Input Module accepts syslog messages from applications running on the local system via Unix sockets Socket
imklog Kernel Log Input Module reads messages from the kernel log and submits them to the syslog engine
imtcp TCP Syslog Input Module receives syslog messages via TCP Port
imudp UDP Syslog Input Module receives syslog messages via UDP Port
Output Modules
Module Title Description Parameters Remarks
omfile File Output Module writes messages to files residing inside the local file system
omjournal Systemd Journal Output provides native support for logging to the systemd journal
omuxsock Unix Sockets Output Module supports sending syslog messages to local Unix sockets
omfwd Forwarding Output Module traditional message forwarding via UDP and plain TCP built-in
omhttp HTTP Output Module sends messages over an HTTP REST interface
omelasticsearch Elasticsearch Output Module provides native support for logging to Elasticsearch
omkafka Apache Kafka Output Module implements an Apache Kafka producer, permitting rsyslog to write data to Kafka
Parser Modules
Message Modification Modules
Module Title Description Parameters Remarks
mmjsonparse JSON/CEE Structured Content Extraction Module provides support for parsing structured log messages that follow the CEE/lumberjack spec.

Plugins

Reserved Templates

Template Format Description Remarks
RSYSLOG_TraditionalFileFormat "%timegenerated% %HOSTNAME% %syslogtag%%msg%\n" the old style default log file format
RSYSLOG_FileFormat "%TIMESTAMP% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n" a modern-style logfile format

Command-line

Option Description Remark
-d Turns on debug mode. RSYSLOG_DEBUG, RSYSLOG_DEBUGLOG
-N level Do NOT run in regular mode, just check configuration file correctness. level = 0 | 1 | 2 | 3
-Q Do not resolve hostnames to IP addresses during ACL processing.
-x Disable DNS for remote messages.

Readings

Best Practices

InfluxDB

Features

  • Schemaless Design
    • You can add new measurements, tags, and fields at any time. Note, however, that a field's data type is fixed by its first write: if you attempt to write data with a different type than previously used (for example, writing a string to a field that previously accepted integers), InfluxDB will reject that data.
  • Tag
    • Tags are indexed, so queries that filter on tags are performant.
    • Fields are not indexed.
    • Tag values are always strings and typically store metadata.

References

Typical Queries

  • CPU Usage
SELECT mean("usage_idle") * -1 + 100 FROM "cpu"
WHERE ("system" =~ /^$system$/ AND "env" =~ /^$env$/ AND "host" =~ /^$host$/ AND "cpu" = 'cpu-total') AND $timeFilter
GROUP BY time($__interval) fill(none)
  • Available Memory
SELECT mean("available_percent") FROM "mem"
WHERE ("system" =~ /^$system$/ AND "env" =~ /^$env$/ AND "host" =~ /^$host$/) AND $timeFilter
GROUP BY time($__interval) fill(none)
  • Disk Read/Write Rate (Byte/s)
SELECT derivative(mean("read_bytes"), 1s) AS "Read Rate", derivative(mean("write_bytes"), 1s) AS "Write Rate" FROM "diskio"
WHERE ("system" =~ /^$system$/ AND "env" =~ /^$env$/ AND "host" =~ /^$host$/) AND $timeFilter
GROUP BY time($__interval), "name" fill(none)
  • Network Receive/Send Rate (Byte/s)
SELECT derivative(mean("bytes_recv"), 1s) AS "Receive Rate", derivative(mean("bytes_sent"), 1s) AS "Send Rate" FROM "net"
WHERE ("system" =~ /^$system$/ AND "env" =~ /^$env$/ AND "host" =~ /^$host$/ AND "interface" =~ /^eth|lo/) AND $timeFilter
GROUP BY time($__interval), "interface" fill(none)

Readings

Telegraf

References

Configuration Options

Category Option Type Description Example Remarks
Input interval interval How often to gather this metric.
name_override string Override the base name of the measurement.
name_prefix string Specifies a prefix to attach to the measurement name.
name_suffix string Specifies a suffix to attach to the measurement name.
tags map A map of tags to apply to a specific input's measurements.
Metric Filtering/Selectors namepass string array Only metrics whose measurement name matches a pattern in this list are emitted.
namedrop string array The inverse of namepass: metrics whose measurement name matches a pattern in this list are discarded.
tagpass Only metrics that contain a tag key in the table and a tag value matching one of its patterns are emitted.
tagdrop The inverse of tagpass: if a match is found, the metric is discarded.
Metric Filtering/Modifiers fieldpass string array Only fields with a field key matching one of the patterns are emitted.
fielddrop array of glob pattern strings Fields with a field key matching one of the patterns will be discarded from the metric.
taginclude array of glob pattern strings Only tags with a tag key matching one of the patterns are emitted.
tagexclude array of glob pattern strings Tags with a tag key matching one of the patterns will be discarded from the metric.

Input Plugins

Plugin Description Config Tags Fields
cpu collects standard CPU metrics as defined in man proc percpu, totalcpu, collect_cpu_time, report_active cpu
disk gathers metrics about disk usage mount_points, ignore_fs fstype, device, path, mode free, total, used, used_percent, inodes_free, inodes_total, inodes_used
diskio gathers metrics about disk traffic and timing devices, skip_serial_number, ... reads, writes, read_bytes, write_bytes, read_time, write_time, io_time, weighted_io_time, iops_in_progress name, serial
exec executes the given commands on every interval and parses metrics from their output
kernel gathers info about the kernel that doesn't fit into other plugins N/A boot_time, context_switches, disk_pages_in, disk_pages_out, interrupts, processes_forked, entropy_avail N/A
mem collects system memory metrics N/A active, available, buffered, cached, free, inactive, total, used, available_percent, used_percent
net gathers metrics about network interface and protocol usage interfaces, ignore_protocol_stats bytes_sent, bytes_recv, packets_sent, packets_recv, err_in, err_out, drop_in, drop_out interface
netstat collects TCP connection states and UDP socket counts by using lsof N/A N/A
procstat monitors the system resource usage of one or more processes pid_file, exe, pattern, user, ... pid, cpu_time, cpu_usage, memory_rss, memory_vms, num_fds, num_threads, ... pid, process_name, pidfile, systemd_unit, ...
haproxy gathers statistics using the stats socket or HTTP statistics page of a HAProxy server
logparser streams and parses the given logfiles. files, from_beginning, grok Timestamp modifiers can be used to convert captures to the timestamp of the parsed metric
http_listener_v2 listens for metrics sent via HTTP. service_address, path, methods, tls_allowed_cacerts, ...
socket_listener listens for messages from streaming (tcp, unix) or datagram (udp, unixgram) protocols. service_address, max_connections, tls_allowed_cacerts, tls_cert, tls_key, data_format, ...

Output Plugins

Plugin Description Config Tags Fields
file writes telegraf metrics to files
socket_writer write to a UDP, TCP, or unix socket. address, tls_ca, tls_cert, tls_key, insecure_skip_verify, data_format

Parser Plugins

Plugin Description Fields
json parses a JSON object or an array of objects into metric fields json_string_fields, json_name_key, json_query, json_time_key, json_time_format
grok parses line-delimited data using a regular-expression-like language.
influx

Grok

Category Name Pattern Remarks
Common NUMBER (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))) decimal
INT (?:[+-]?(?:[0-9]+))
POSINT \b(?:[1-9][0-9]*)\b positive integer
NONNEGINT \b(?:[0-9]+)\b non-negative integer
WORD \b\w+\b \w := [a-zA-Z0-9_]
NOTSPACE \S+
DATA .*?
GREEDYDATA .*
Networking IPV4
IPV6
IP (?:%{IPV6}|%{IPV4})
HOSTNAME
IPORHOST (?:%{IP}|%{HOSTNAME})
HOSTPORT %{IPORHOST}:%{POSINT}
Syslog SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME} Syslog Dates: Month Day HH:MM:SS
SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
SYSLOGHOST %{IPORHOST}
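Grok patterns are essentially named regular expressions, and a %{NAME:field} reference expands to a named capture group. A simplified Python sketch of that expansion, using an assumed subset of the patterns from the table above:

```python
# Sketch: expand grok-style %{NAME:field} references into named capture
# groups, then match with plain re. PATTERNS is a small assumed subset
# of the builtin pattern library above.
import re

PATTERNS = {
    "INT": r"(?:[+-]?(?:[0-9]+))",
    "WORD": r"\b\w+\b",
    "NOTSPACE": r"\S+",
    "GREEDYDATA": r".*",
}

def grok_to_regex(expr):
    """Replace each %{NAME:field} (or bare %{NAME}) with its regex."""
    def expand(m):
        name, _, field = m.group(1).partition(":")
        pattern = PATTERNS[name]
        return f"(?P<{field}>{pattern})" if field else pattern
    return re.sub(r"%\{([^}]+)\}", expand, expr)

regex = grok_to_regex(r"%{WORD:prog}\[%{INT:pid}\]: %{GREEDYDATA:msg}")
m = re.match(regex, "sshd[4242]: Accepted publickey for root")
print(m.group("prog"), m.group("pid"))  # -> sshd 4242
```

Real grok implementations also support type suffixes such as :int and :tag, as used in the logparser configurations below.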

Readings

Samples

Basic Configuration

[global_tags]
  system = "systemX"
  env = "dev"

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  logfile = "/var/log/telegraf/telegraf.log"
  debug = false
  quiet = false
  hostname = ""
  omit_hostname = false

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/cpu
[[inputs.cpu]]
  percpu = false
  totalcpu = true
  collect_cpu_time = false
  report_active = false

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/disk
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"]

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/diskio
[[inputs.diskio]]
  devices = ["hd*", "sd*", "xvd*"]

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kernel
[[inputs.kernel]]
  fieldpass = ["boot_time", "context_switches", "interrupts", "processes_forked", "entropy_avail"]

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/mem
[[inputs.mem]]
  fieldpass = ["available", "free", "total", "used", "available_percent", "used_percent"]

# https://github.com/influxdata/telegraf/blob/master/plugins/inputs/net/NET_README.md
[[inputs.net]]
  ignore_protocol_stats = true
  interfaces = ["eth*", "enp0s*", "lo"]
  fieldpass = ["bytes_sent", "bytes_recv", "err_in", "err_out", "drop_in", "drop_out"]

# [[inputs.processes]]

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat
[[inputs.procstat]]
  pattern = "^(haproxy|/opt/ripple/bin/ripple)"
  pid_tag = true
  pid_finder = "pgrep"
  fieldpass = ["cpu_usage", "involuntary_context_switches", "memory_rss", "memory_vms", "nice_priority", "num_fds", "num_threads", "voluntary_context_switches"]

# [[inputs.swap]]

# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/system
[[inputs.system]]
  fieldpass = ["load1", "load15", "load5", "n_users", "n_cpus"]

Configuration with Custom Metric

[[inputs.exec]]
  name_override = "openfiles"
  commands = ["awk '{print $1}' /proc/sys/fs/file-nr"]
  timeout = "5s"
  data_format = "value"
  data_type = "integer"

[[inputs.exec]]
  name_override = "rippled"
  commands = ["bash /etc/telegraf/telegraf_exec_cmd_rippled.sh 127.0.0.1 5505"]
  interval = "10s"
  timeout = "5s"
  data_format = "json"
  json_string_fields = ["build_ver", "complete_ledgers", "server_state", "base_log_level"]
  [inputs.exec.tags]
    service_type="ripple"
    service = "v1"    

[[inputs.exec]]
  name_override = "rippled"
  commands = ["bash /etc/telegraf/telegraf_exec_cmd_rippled.sh 127.0.0.1 5515"]
  interval = "10s"
  timeout = "5s"
  data_format = "json"
  json_string_fields = ["build_ver", "complete_ledgers", "server_state", "base_log_level"]
  [inputs.exec.tags]
    service_type="ripple"
    service = "t1"

Configuration with Log Parsing

# HAProxy Log
[[inputs.logparser]]
  name_override = "proxy-log"
  interval = "5s"
  tagexclude = ["path"]
  fielddrop = []

  files = ["/var/log/haproxy.log"]
  from_beginning = false

  # Telegraf Logparser input plugin : https://github.com/influxdata/telegraf/tree/master/plugins/inputs/logparser
  # Logstash Builtin Grok Pattern: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
  #
  # HAProxy Custom log format: http://cbonte.github.io/haproxy-dconv/1.8/configuration.html#8.2.4
  #
  #   %ci\ %ft\ %b\ %si\ %Tq/%TR/%Tw/%Tc/%Tr/%Ta/%Tt\ %ST\ %B\ %ts\ %sq/%bq\ %hr\ %{+Q}r
  #  
  #   %si  | server_IP                   (target address)  | IP          |
  #   %ST  | status_code                                   | numeric     |
  #   %B   | bytes_read           (from server to client)  | numeric     |
  #
  [inputs.logparser.grok]
    patterns = ['''
      ^%{SYSLOGTIMESTAMP:timestamp:ts-syslog} %{SYSLOGHOST} haproxy/%{WORD:service:tag}(?:\[%{POSINT}\])?: %{IPV4:client_ip} %{NOTSPACE:fe} %{NOTSPACE:be} (%{IPV4:server_ip}|-) %{INT:tq:int}/%{INT:treq:int}/%{INT:tw:int}/%{INT:tc:int}/%{INT:tresp:int}/%{NONNEGINT:ta:int}/%{NONNEGINT:tt:int} %{POSINT:status_code:tag} %{NONNEGINT:read_bytes:int} %{NOTSPACE} %{NOTSPACE:server_q:int}/%{NOTSPACE:be_q:int} {%{_HTTP_HEADER_HOST:header_host:tag}?(?:\|)?%{IPORHOST:x_fw_for}?} "%{_DATA_50:req}%{GREEDYDATA}"$
      ''']
    custom_patterns='''
      _HTTP_HEADER_HOST (\w|\d|\.|\:|-)*
      _DATA_50 .{0,50}
      '''

# rsyslog Log
[[inputs.logparser]]
  ## Get high severity (higher than 'warning') logs from the message Rsyslogd collected.
  name_override = "rsyslog-message"
  interval = "5s"
  tagexclude = ["path"]
  fielddrop = []

  files = ["/var/log/messages"]
  from_beginning = false

  ## template(name="TraditionalWithFacilityAndSeverityFormat" type="string"
  ##   string="%TIMESTAMP% %HOSTNAME% %syslogfacility-text% %syslogseverity-text% %syslogtag%%msg:::sp-if-no-1st-sp% %msg:::drop-last-lf%\n")
  [inputs.logparser.grok]
    patterns = ['''
      ^%{SYSLOGTIMESTAMP:timestamp:ts-syslog} %{SYSLOGHOST} %{WORD:facility} %{_HIGH_SEVERITY:severity:tag} %{SYSLOGPROG:prog}: %{GREEDYDATA:msg}$
      ''']
    ## https://www.rsyslog.com/doc/v8-stable/configuration/sysklogd_format.html#selectors
    custom_patterns='''
      _HIGH_SEVERITY (warning|err|crit|alert|emerg)
      '''

# Rippled Log
[[inputs.logparser]]
  name_override = "rippled-log"
  interval = "5s"
  tagexclude = ["path"]
  fielddrop = []

  files = ["/var/logs/rippled/rippled-v4.log"]
  from_beginning = true

  # Telegraf Logparser input plugin : https://github.com/influxdata/telegraf/tree/master/plugins/inputs/logparser
  # Logstash builtin Grok pattern: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
  #
  # Ripple log sample
  #   2019-Jun-24 03:57:56.361477854 Validations:WRN Val for 575A8D10C5EF97F96743845C877259FAB6B9F2EF411F60E21AEFA91182A18C5B trusted/full from
  #   2019-Jun-12 08:53:38.204341778 Ledger:ERR Missing node in 62DF2BE50EF904BCF18A72180B3210A05861C697412540E1AAEC7F85EC294E96
  #
  [inputs.logparser.grok]
    patterns = ['''
      ^%{_DATA_30:timestampJ:ts-"2006-Jan-02 15:04:05.000000000"} %{_RIPPLE_LOG_PARTITION:category}:%{_RIPPLE_WARN_OR_ERROR:level} %{_DATA_200:msg}%{GREEDYDATA}$
      ''']
    custom_patterns='''
      _DATA_30 .{30}
      _RIPPLE_LOG_PARTITION [^\:]*
      _RIPPLE_WARN_OR_ERROR WRN|ERR
      _DATA_200 .{0,200}
      '''
  [inputs.logparser.tags]
    service = "v4"
   
[[outputs.file]]
  files = ["stdout"]

Configuration for Relay

  • For relay server
[[inputs.socket_listener]]
  service_address = "tcp://host1:9870"
  max_connections = 512
  data_format = "influx"
  • For servers that need to relay their metrics
[[outputs.socket_writer]]
  address = "tcp://host1:9870"
  data_format = "influx"

Prometheus

References

Collectors and Metrics

Collector Metrics References Remarks
meminfo node_memory_MemTotal, node_memory_MemFree, node_memory_Buffers, node_memory_Cached CentOS > /proc/meminfo, SUSE > System Monitoring > Memory /proc/meminfo
netdev node_network_receive_bytes, node_network_transmit_bytes /proc/net/dev

Metrics and Labels

Metric Labels Remarks
node_cpu cpu, mode
node_disk_io_* device
node_network_receive_bytes device

Queries

  • CPU Usage by Machine
100 - avg(irate(node_cpu{job="node",instance="$instance",mode="idle"}[2m]))  * 100
  • CPU Usage by Mode
(avg (irate(node_cpu{job="node",instance="$instance"}[2m]))  BY  (mode)) * 100
  • Memory Usage by Machine
(node_memory_MemTotal{instance="$instance"} - (node_memory_MemFree{instance="$instance"} + node_memory_Buffers{instance="$instance"} + node_memory_Cached{instance="$instance"}))/ node_memory_MemTotal{instance="$instance"} * 100
  • Network Send Rate by Interface
rate(node_network_transmit_bytes{job="node",instance="$instance",device=~"lo|^eth.*"}[2m])
  • Network Receive Rate by Interface
rate(node_network_receive_bytes{job="node",instance="$instance",device=~"lo|^eth.*"}[2m])

Readings

Tips and Tricks

Available configuration flags of Node Exporter

"/usr/bin/prometheus-node-exporter -h" yields the following.

  -auth.pass string
        Password for basic auth.
  -auth.user string
        Username for basic auth.
  -collector.diskstats.ignored-devices string
        Regexp of devices to ignore for diskstats. (default "^(ram|loop|fd|(h|s|v|xv)d[a-z])\\d+$")
  -collector.filesystem.ignored-mount-points string
        Regexp of mount points to ignore for filesystem collector. (default "^/(sys|proc|dev)($|/)")
  -collector.ipvs.procfs string
        procfs mountpoint. (default "/proc")
  -collector.megacli.command string
        Command to run megacli. (default "megacli")
  -collector.netdev.ignored-devices string
        Regexp of net devices to ignore for netdev collector. (default "^$")
  -collector.ntp.server string
        NTP server to use for ntp collector.
  -collector.textfile.directory string
        Directory to read text files with metrics from.
  -collectors.enabled string
        Comma-separated list of collectors to use. (default "diskstats,filesystem,loadavg,meminfo,stat,textfile,time,netdev,netstat")
  -collectors.print
        If true, print available collectors and exit.
  -debug.memprofile-file string
        Write memory profile to this file upon receipt of SIGUSR1.
  -log.level value
        Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal, panic]. (default info)
  -web.listen-address string
        Address on which to expose metrics and web interface. (default ":9100")
  -web.telemetry-path string
        Path under which to expose metrics. (default "/metrics")

Overriding instance label

Candidate 1 : Using relabel_configs

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['10.0.0.1:9100/brick1', '10.0.0.2:9100/brick1']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '^[^/]*/(.*)'
        replacement: '$1'
      - source_labels: [__address__]
        target_label: __address__
        regex: '^([^/]*).*'
        replacement: '$1'
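The two relabel rules can be sanity-checked offline with plain regular expressions. Prometheus anchors relabel regexes (full match), so re.fullmatch mirrors its behavior:

```python
# Sketch: apply the two relabel rules by hand to verify the regexes.
import re

address = "10.0.0.1:9100/brick1"

# rule 1: instance <- the part after '/'
instance = re.fullmatch(r"[^/]*/(.*)", address).group(1)
# rule 2: __address__ <- the part before '/'
scrape_addr = re.fullmatch(r"([^/]*).*", address).group(1)

print(instance, scrape_addr)  # -> brick1 10.0.0.1:9100
```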

Candidate 2 : Using labels for each target

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['10.0.0.1:9100']
        labels:
          instance: "brick1"
          role: "peer"
      - targets: ['10.0.0.2:9100']
        labels:
          instance: "brick1"
          role: "peer"

Companions

WMI Exporter

Grafana

References

Global Built-in Variables

Variable Description Applied For Remarks
$__interval, $interval an interval that can be used to group by time in queries, derived from the selected time range and the width of the graph (in pixels) InfluxDB, MySQL, Postgres calculated automatically
$__interval_ms
$timeFilter, $__timeFilter currently selected time range as an expression which can be used in the WHERE clause for the InfluxDB data source InfluxDB added automatically
$__range Prometheus

Panels

Panel Description Remarks
Status Panel used as a centralized view for the status of components at a glance

Grafana and InfluxDB

  • Built-in Aliases
Alias Description Remarks
$m, $measurement replaced with measurement name
$col replaced with column name
$tag_tagname replaced with the value of the tagname tag

Grafana and Prometheus

Readings

Logstash

References

Configuration

Operators

Category Operator Title Remarks
Comparison ==
!=
<
>
=~
!~
in
not in
Logic and AND
or OR
nand NAND
xor XOR
! NOT Unary

Plugins

Type Name Description Remarks
Input beats Receives events from the Elastic Beats framework
Output elasticsearch Stores logs in Elasticsearch
Filter grok Parses unstructured event data into fields Logstash Grok Patterns
drop Drops all events Options : percentage
date Parses dates from fields to use as the Logstash timestamp for an event
truncate Truncates fields longer than a given length
mutate Performs mutations on fields

Monitoring APIs

API URL Description Types
Node Info curl -XGET 'localhost:9600/_node/<types>' retrieves information about the node pipelines, os, jvm
Plugins Info curl -XGET 'localhost:9600/_node/plugins?pretty' gets information about all Logstash plugins that are currently installed
Node Stats curl -XGET 'localhost:9600/_node/stats/<types>' retrieves runtime stats about Logstash jvm, process, events, pipelines, reloads, os
Hot Threads curl -XGET 'localhost:9600/_node/hot_threads?pretty' gets the current hot threads for Logstash

Readings

Filebeat

References

Inputs/Outputs

Type Name Description Remarks
Input Log Reads lines from log files.
Syslog Reads and parses RFC 3164 (BSD) syslog events and some variants over TCP or UDP.
Output Logstash Sends events directly to Logstash by using the lumberjack protocol, which runs over TCP.

Fields

Group Description Fields Remarks
Beat Contains common beat fields available in all event types. beat.name, beat.hostname, beat.timezone, @timestamp, tags, fields
Host Info collected for the host machine host.name, host.ip, host.mac

ElastAlert

  • https://github.com/Yelp/elastalert
  • Desc. : a simple framework for alerting on anomalies, spikes, or other patterns of interest from data in Elasticsearch
  • License : Apache License 2.0
  • Sources :

AWStats

Ganglia

Graphite

Community content is available under CC-BY-SA unless otherwise noted.