Host Metrics and Dimensions

Table 1 Host metrics

| Metric | Description | Value Range | Unit |
| --- | --- | --- | --- |
| Total CPU cores (cpuCoreLimit) | Total number of CPU cores requested for a measured object | >= 1 | Cores |
| Used CPU cores (cpuCoreUsed) | Number of CPU cores used by a measured object | >= 0 | Cores |
| CPU usage (cpuUsage) | CPU usage of a measured object | 0%-100% | % |
| Available disk space (diskAvailableCapacity) | Disk space that has not been used | >= 0 | MB |
| Total disk space (diskCapacity) | Total disk space | >= 0 | MB |
| Disk read rate (diskReadRate) | Volume of data read from a disk per second | >= 0 | KB/s |
| Disk read/write status (diskRWStatus) | Read/write status of a disk | 0 or 1 (0: read/write; 1: read-only) | None |
| Disk usage (diskUsedRate) | Percentage of the total disk space that is used | >= 0 | % |
| Disk write rate (diskWriteRate) | Volume of data written to a disk per second | >= 0 | KB/s |
| Available physical memory (freeMem) | Available physical memory of a measured object | >= 0 | MB |
| Available virtual memory (freeVirMem) | Available virtual memory of a measured object | >= 0 | MB |
| Physical memory usage (memUsedRate) | Percentage of the total physical memory that is used | 0%-100% | % |
| Host status (nodeStatus) | Host status | 0: Normal; other values: Abnormal | None |
| NTP offset (ntpOffset) | Offset between the local time of the host and the NTP server time. The closer the offset is to 0, the closer the local time of the host is to the NTP server time. | None | ms |
| NTP server status (ntpServerStatus) | Whether the host is connected to the NTP server | 0 or 1 (0: Connected; 1: Not connected) | None |
| NTP sync status (ntpStatus) | Whether the local time of the host is synchronized with the NTP server time | 0 or 1 (0: Synchronized; 1: Not synchronized) | None |
| Processes (processNum) | Number of processes on a measured object | >= 0 | None |
| Downlink rate (recvBytesRate) | Inbound network traffic rate of a measured object | >= 0 | Bytes per second (BPS) |
| Downlink rate (recvPackRate) | Number of data packets received by the NIC per second | >= 0 | Packets per second (PPS) |
| Downlink error rate (recvErrPackRate) | Number of error packets received by the NIC per second | >= 0 | PPS |
| Uplink rate (sendBytesRate) | Outbound network traffic rate of a measured object | >= 0 | BPS |
| Uplink error rate (sendErrPackRate) | Number of error packets sent by the NIC per second | >= 0 | PPS |
| Uplink rate (sendPackRate) | Number of data packets sent by the NIC per second | >= 0 | PPS |
| Total rate (totalBytesRate) | Total inbound and outbound network traffic rate of a measured object | >= 0 | BPS |
| Total physical memory (totalMem) | Total physical memory requested for a measured object | >= 0 | MB |
| Total virtual memory (totalVirMem) | Total virtual memory of a measured object | >= 0 | MB |
| Virtual memory usage (virMemUsedRate) | Percentage of the total virtual memory that is used | 0%-100% | % |
| GPU usage (gpuUtil) | GPU usage of a measured object | 0%-100% | % |
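
The status-type metrics in Table 1 report numeric codes, and the usage metrics are percentages of a total. The following Python sketch shows one way a client could turn the codes into labels and cross-check a disk usage value against the two capacity metrics. The function names `describe_status` and `disk_used_rate` are hypothetical and are not part of any service API. Computing used space as total minus available is an assumption based on the descriptions in the table; the value the service reports for diskUsedRate may be computed differently.

```python
# Code-to-label mappings copied from the Value Range column of Table 1.
STATUS_LABELS = {
    "diskRWStatus": {0: "read/write", 1: "read-only"},
    "ntpServerStatus": {0: "connected", 1: "not connected"},
    "ntpStatus": {0: "synchronized", 1: "not synchronized"},
}


def describe_status(metric: str, value: int) -> str:
    """Return a readable label for a status-type host metric (hypothetical helper)."""
    if metric == "nodeStatus":
        # Table 1: 0 means normal, any other value means abnormal.
        return "normal" if value == 0 else "abnormal"
    labels = STATUS_LABELS[metric]  # KeyError for metrics that are not status codes
    if value not in labels:
        raise ValueError(f"{metric} accepts only 0 or 1, got {value!r}")
    return labels[value]


def disk_used_rate(disk_capacity_mb: float, disk_available_mb: float) -> float:
    """Estimate disk usage (%) from diskCapacity and diskAvailableCapacity (both in MB).

    Assumption: used space = total space - available space.
    """
    if disk_capacity_mb <= 0:
        return 0.0  # avoid division by zero when no capacity is reported
    return 100.0 * (disk_capacity_mb - disk_available_mb) / disk_capacity_mb
```

For example, `describe_status("diskRWStatus", 1)` returns `"read-only"`, and `disk_used_rate(1000, 250)` returns `75.0`.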

Table 2 Dimensions of host metrics

| Dimension | Description |
| --- | --- |
| clusterId | Cluster ID |
| clusterName | Cluster name |
| gpuName | GPU name |
| gpuID | GPU ID |
| hostID | Host ID |
| nameSpace | Cluster namespace |
| nodeIP | Host IP address |
| nodeName | Host name |
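
Dimensions identify which host, cluster, or GPU a metric value belongs to. As an illustration only, the Python sketch below models a metric sample as a value plus a set of dimension key-value pairs and filters samples by dimension. The `HostMetricSample` class and the `select` function are hypothetical names introduced here, not an API of the monitoring service, and the sample values are invented for the example.

```python
from dataclasses import dataclass, field

# Dimension names copied from Table 2.
HOST_DIMENSIONS = frozenset({
    "clusterId", "clusterName", "gpuName", "gpuID",
    "hostID", "nameSpace", "nodeIP", "nodeName",
})


@dataclass
class HostMetricSample:
    """One reported value of a host metric (hypothetical structure)."""
    metric: str                      # for example "cpuUsage" from Table 1
    value: float
    dimensions: dict = field(default_factory=dict)


def select(samples, **wanted):
    """Return the samples whose dimensions match every keyword argument."""
    unknown = set(wanted) - HOST_DIMENSIONS
    if unknown:
        raise KeyError(f"unknown dimension(s): {sorted(unknown)}")
    return [
        s for s in samples
        if all(s.dimensions.get(name) == val for name, val in wanted.items())
    ]


# Invented sample data for illustration.
samples = [
    HostMetricSample("cpuUsage", 42.0, {"clusterName": "c1", "nodeIP": "192.0.2.10"}),
    HostMetricSample("cpuUsage", 7.5, {"clusterName": "c2", "nodeIP": "192.0.2.11"}),
]
in_c1 = select(samples, clusterName="c1")
```

Here `in_c1` contains only the first sample. Passing a keyword that is not listed in Table 2 raises `KeyError`, which catches misspelled dimension names early.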