SMI Cluster Level Metrics

CPU Category

node_cpu_seconds_total

Description: This metric gives the total number of seconds the CPUs spent in each mode.

Sample Query: avg(irate(node_cpu_seconds_total{mode=~"irq|softirq"}[1m])) by (instance) * 100

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: cpu

    Label Description: the CPU number

    Example: cpu0, cpu1, etc

  • Label: mode

    Label Description: the CPU mode

    Example: system, user, softirq, irq, idle, iowait, etc
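The sample query for node_cpu_seconds_total relies on irate(), which derives a per-second rate from the last two samples in the range window. The following is a minimal Python sketch of that idea, not the Prometheus implementation. The sample values and the variable names are hypothetical, and the counter-reset handling is a simplification.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) counter samples.

    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        return None
    # If the counter went backwards, assume it was reset and restarted at zero.
    delta = v1 - v0 if v1 >= v0 else v1
    return delta / (t1 - t0)


# Hypothetical irq-mode samples for two CPUs on one instance:
# (unix timestamp, cumulative seconds spent in the mode)
cpu0_irq = [(100.0, 12.0), (115.0, 12.3)]
cpu1_irq = [(100.0, 8.0), (115.0, 8.6)]

rates = [irate(cpu0_irq), irate(cpu1_irq)]
# Equivalent in spirit to: avg(...) by (instance) * 100
percent = sum(rates) / len(rates) * 100
print(round(percent, 2))  # prints 3.0
```

Because the counter is measured in seconds, a rate of 0.03 seconds per second corresponds to 3% of one CPU's time spent in that mode.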

Disk Category

node_disk_bytes_read

Description: This metric gives the total number of bytes read successfully.

Sample Query: sum(irate(node_disk_bytes_read[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0

node_disk_read_time_seconds_total

Description: This metric gives the total number of seconds spent by all reads.

Sample Query: sum(irate(node_disk_read_time_seconds_total[1m])) by (instance) / sum(irate(node_disk_reads_completed_total[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0
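The sample query for node_disk_read_time_seconds_total divides the rate of time spent reading by the rate of completed reads, which yields the average time per read. The arithmetic can be sketched in Python as follows. The function name and the input values are hypothetical and serve only as an illustration.

```python
def avg_latency_seconds(time_rate, ops_rate):
    """Average seconds per completed operation.

    time_rate: seconds spent on I/O per second (rate of the *_time_seconds_total counter)
    ops_rate:  operations completed per second (rate of the *_completed_total counter)
    Returns None when no operations completed, to avoid dividing by zero.
    """
    if ops_rate <= 0:
        return None
    return time_rate / ops_rate


# Hypothetical rates: 0.5 s of read time per second across 250 reads per second
print(avg_latency_seconds(0.5, 250.0))  # prints 0.002
```

The same calculation applies to the write metrics when the write time and write completion counters are used instead.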

node_disk_reads_completed_total

Description: This metric gives the total number of reads completed successfully.

Sample Query: sum(irate(node_disk_reads_completed_total[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0

node_disk_write_time_seconds_total

Description: This metric gives the total number of seconds spent by all writes.

Sample Query: sum(irate(node_disk_write_time_seconds_total[1m])) by (instance) / sum(irate(node_disk_writes_completed_total[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0

node_disk_writes_completed_total

Description: This metric gives the total number of writes completed successfully.

Sample Query: sum(irate(node_disk_writes_completed_total[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0

node_disk_written_bytes_total

Description: This metric gives the total number of bytes written successfully.

Sample Query: sum(irate(node_disk_written_bytes_total[1m])) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: vdb, vdd, sr0

File System Category

node_filesystem_free_bytes

Description: This metric gives the free disk space, in bytes, available on the instance.

Sample Query: sum(node_filesystem_free_bytes{mountpoint="/data"}) by (device, instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: /dev/vda3, /dev/vdb

  • Label: fstype

    Label Description: the file system type

    Example: ext4

  • Label: mountpoint

    Label Description: the file system mount directory

    Example: /data, /rootfs

node_filesystem_size_bytes

Description: This metric gives the total disk space, in bytes, provisioned on the instance.

Sample Query: sum(node_filesystem_size_bytes{mountpoint="/data"}) by (device, instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the disk device

    Example: /dev/vda3, /dev/vdb

  • Label: fstype

    Label Description: the file system type

    Example: ext4

  • Label: mountpoint

    Label Description: the file system mount directory

    Example: /data, /rootfs
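The two file system metrics are often read together: subtracting the free bytes from the total size gives the space in use. The following Python sketch shows that arithmetic for a single mount point. The function name and the byte values are hypothetical, and the result counts any space that is not free as used.

```python
def used_percent(free_bytes, size_bytes):
    """Percentage of a file system that is in use, from free and total bytes."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return (size_bytes - free_bytes) / size_bytes * 100


GIB = 2**30
# Hypothetical /data mount point: 100 GiB provisioned, 25 GiB free
print(used_percent(25 * GIB, 100 * GIB))  # prints 75.0
```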

Load Category

node_load1

Description: This metric gives the 1-minute load average.

Metric Type: Gauge

Data Type: Float

Sample Query: avg(node_load1) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

node_load15

Description: This metric gives the 15-minute load average.

Metric Type: Gauge

Data Type: Float

Sample Query: avg(node_load15) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

node_load5

Description: This metric gives the 5-minute load average.

Metric Type: Gauge

Data Type: Float

Sample Query: avg(node_load5) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

Memory Category

node_memory_MemFree_bytes

Description: This metric gives the free memory, in bytes, available on the node.

Sample Query: sum(node_memory_MemFree_bytes) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

node_memory_MemTotal_bytes

Description: This metric gives the total memory, in bytes, provisioned on the node.

Sample Query: sum(node_memory_MemTotal_bytes) by (instance)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

Network Category

node_network_receive_bytes_total

Description: This metric gives the total number of bytes received over the network device.

Sample Query: sum(irate(node_network_receive_bytes_total[1m])) by (device)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the network device/interface

    Example: ens3, ens4

node_network_transmit_bytes_total

Description: This metric gives the total number of bytes sent over the network device.

Sample Query: sum(irate(node_network_transmit_bytes_total[1m])) by (device)

Labels:

  • Label: instance

    Label Description: the virtual machine/instance

    Example: master-0, control-0, dra-director-1, etc

  • Label: job

    Label Description: the name of the job

    Example: node_exporter

  • Label: device

    Label Description: the name of the network device/interface

    Example: ens3, ens4
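Any of the sample queries in this section can be evaluated as an instant query through the Prometheus HTTP API endpoint /api/v1/query. The Python sketch below only builds the request URL and does not contact a server. The base URL prometheus.example:9090 and the function name are assumptions for illustration; substitute the address of the Prometheus instance in the actual deployment.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API.

    base_url is a placeholder for the Prometheus server address (an assumption,
    not a value defined by this document). The PromQL expression is URL-encoded
    into the 'query' parameter.
    """
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical server address, paired with a sample query from the Network Category
url = instant_query_url(
    "http://prometheus.example:9090",
    "sum(irate(node_network_receive_bytes_total[1m])) by (device)",
)
print(url)
```

URL-encoding matters here because the sample queries contain spaces, parentheses, brackets and, in some cases, double quotes.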