Monitoring Metrics
You can check the status and available resources of a cluster and learn about its real-time resource consumption through the GaussDB(DWS) monitoring items.
Table 1 describes GaussDB(DWS) monitoring metrics.
Monitored Object  | Metric  | Description  | Value Range  | Monitoring Period (Raw Data)  | 
|---|---|---|---|---|
Cluster Overview  | Cluster Status  | Status of a cluster.  | Normal/Abnormal/Degraded  | 30s  | 
Cluster Overview  | Nodes  | Number of available nodes and total number of nodes (Available/Total) in a cluster.  | >= 0  | 60s  | 
Cluster Overview  | CNs  | Number of CNs in a cluster.  | >= 0  | 60s  | 
Cluster Overview  | Databases  | Number of created databases in a cluster.  | >= 0  | 90s  | 
Resource Consumption  | CPU Usage  | Average real-time CPU usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Resource Consumption  | Memory Usage  | Average real-time memory usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Resource Consumption  | Disk Usage  | Average real-time disk usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Resource Consumption  | Disk I/O  | Average real-time disk I/O of all nodes in a cluster.  | >= 0 KB/s  | 30s  | 
Resource Consumption  | Network I/O  | Average real-time network I/O of all NICs in a cluster.  | >= 0 KB/s  | 30s  | 
Top 5 Time-Consuming Queries  | Query ID  | ID of a query, which is automatically generated by the database.  | >= 0  | 180s  | 
Top 5 Time-Consuming Queries  | SQL Statement  | Query statement executed by a user.  | String  | 180s  | 
Top 5 Time-Consuming Queries  | Execution Time  | Execution time of a query statement (unit: ms).  | >= 0 ms  | 180s  | 
Top 5 Queries with Most Data Written to Disk  | Query ID  | ID of a query, which is automatically generated by the database.  | >= 0  | 180s  | 
Top 5 Queries with Most Data Written to Disk  | SQL Statement  | Query statement executed by a user.  | String  | 180s  | 
Top 5 Queries with Most Data Written to Disk  | Data Written to Disk  | Data to be written to disks after a user runs a statement (unit: MB).  | >= 0 MB  | 180s  | 
Cluster Resource Metrics  | CPU Usage  | Average CPU usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Cluster Resource Metrics  | Memory Usage  | Average memory usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Cluster Resource Metrics  | Disk Usage  | Average usage of all disks in a cluster.  | 0% to 100%  | 30s  | 
Cluster Resource Metrics  | Disk I/O Usage  | Average I/O usage of all disks in a cluster.  | 0% to 100%  | 30s  | 
Cluster Resource Metrics  | Network I/O Usage  | Average I/O usage of all NICs in a cluster.  | 0% to 100%  | 30s  | 
Key Database Metrics  | Cluster Status  | Cluster running status.  | Normal/Degraded/Abnormal  | 30s  | 
Key Database Metrics  | Cluster Abnormal CNs  | Number of abnormal CNs in the cluster.  | >= 0  | 60s  | 
Key Database Metrics  | Cluster Read-only  | Whether the cluster is in the read-only state.  | Yes/No  | 30s  | 
Key Database Metrics  | Concurrent Sessions  | Number of concurrent sessions in a cluster within a specified period.  | >= 0  | 30s  | 
Key Database Metrics  | Concurrent Queries  | Number of concurrent queries in a cluster within a specified period.  | >= 0  | 30s  | 
Node Monitoring-Overview  | Node Name  | Name of a node in a cluster.  | String  | 30s  | 
Node Monitoring-Overview  | CPU Usage  | CPU usage of a host.  | 0% to 100%  | 30s  | 
Node Monitoring-Overview  | Memory Usage  | Memory usage of a host.  | 0% to 100%  | 30s  | 
Node Monitoring-Overview  | Average Disk Usage (%)  | Disk usage of a host.  | 0% to 100%  | 30s  | 
Node Monitoring-Overview  | IP Address  | Service IP address of a host.  | String  | 30s  | 
Node Monitoring-Overview  | Disk I/O  | Disk I/O of a host (unit: KB/s).  | >= 0 KB/s  | 30s  | 
Node Monitoring-Overview  | TCP Protocol Stack Retransmission Rate  | Retransmission rate of TCP packets per unit time.  | 0% to 100%  | 30s  | 
Node Monitoring-Overview  | Status  | Running status of a host.  | Online/Offline  | 30s  | 
Node Monitoring-Disks  | Node Name  | Name of a node in a cluster.  | String  | 30s  | 
Node Monitoring-Disks  | Disk Name  | Name of a disk on a host.  | String  | 30s  | 
Node Monitoring-Disks  | Disk Capacity  | Disk capacity of the host (unit: GB).  | >= 0 GB  | 30s  | 
Node Monitoring-Disks  | Disk Usage  | Disk usage of a host.  | 0% to 100%  | 30s  | 
Node Monitoring-Disks  | Disk Read Rate  | Disk read rate of the host (unit: KB/s).  | >= 0 KB/s  | 30s  | 
Node Monitoring-Disks  | Disk Write Rate  | Disk write rate of the host (unit: KB/s).  | >= 0 KB/s  | 30s  | 
Node Monitoring-Disks  | I/O Wait Time (await, ms)  | Average waiting time for each I/O request (unit: ms).  | >= 0 ms  | 30s  | 
Node Monitoring-Disks  | I/O Service Time (svctm, ms)  | Average processing time for each I/O request (unit: ms).  | >= 0 ms  | 30s  | 
Node Monitoring-Disks  | I/O Utility (util, %)  | Disk I/O usage of a host.  | 0% to 100%  | 30s  | 
Node Monitoring-Network  | Node Name  | Name of a node in a cluster.  | String  | 30s  | 
Node Monitoring-Network  | NIC Name  | Name of the NIC on a host.  | String  | 30s  | 
Node Monitoring-Network  | NIC Status  | NIC status.  | up/down  | 30s  | 
Node Monitoring-Network  | NIC Speed  | Working rate of a NIC (unit: Mbit/s).  | >= 0  | 30s  | 
Node Monitoring-Network  | Received Packets  | Number of packets received by a NIC.  | >= 0  | 30s  | 
Node Monitoring-Network  | Sent Packets  | Number of packets sent by a NIC.  | >= 0  | 30s  | 
Node Monitoring-Network  | Lost Packets Received  | Number of packets lost by a NIC while receiving.  | >= 0  | 30s  | 
Node Monitoring-Network  | Receive Rate  | Number of bytes received by a NIC per unit of time (unit: KB/s).  | >= 0 KB/s  | 30s  | 
Node Monitoring-Network  | Transmit Rate  | Number of bytes sent by a NIC per unit of time (unit: KB/s).  | >= 0 KB/s  | 30s  | 
Database Monitoring  | Database Name  | Name of the database created by a user in a cluster.  | String  | 60s  | 
Database Monitoring  | Usage  | Used capacity of the current database (unit: GB).  | >= 0 GB  | 86400s  | 
Database Monitoring  | Users  | Number of users in the current database.  | >= 0  | 30s  | 
Database Monitoring  | Sessions  | Number of sessions in the current database.  | >= 0  | 30s  | 
Database Monitoring  | Applications  | Number of applications in the current database.  | >= 0  | 30s  | 
Database Monitoring  | Queries  | Number of active queries in the current database.  | >= 0  | 30s  | 
Database Monitoring  | Scanning Rows  | Number of rows returned by full table scan queries in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Index Query Rows  | Number of rows returned by index queries in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Inserted Rows  | Number of rows inserted in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Updated Rows  | Number of rows updated in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Deleted Rows  | Number of rows deleted from the current database.  | >= 0  | 60s  | 
Database Monitoring  | Executed Transactions  | Number of transactions executed in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Transaction Rollbacks  | Number of transactions in the current database that have been rolled back.  | >= 0  | 60s  | 
Database Monitoring  | Deadlocks  | Number of deadlocks detected in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Temporary Files  | Number of temporary files created in the current database.  | >= 0  | 60s  | 
Database Monitoring  | Temporary File Capacity  | Size of temporary files written by the current database (unit: GB).  | >= 0  | 60s  | 
Performance Monitoring  | Cluster CPU Usage  | Average CPU usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Performance Monitoring  | Cluster Memory Usage  | Average memory usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Performance Monitoring  | Cluster Disk Usage  | Average disk usage of all nodes in a cluster.  | 0% to 100%  | 30s  | 
Performance Monitoring  | Cluster Disk I/O  | Average I/O of all disks in a cluster.  | 0% to 100%  | 30s  | 
Performance Monitoring  | Cluster Network I/O  | Average I/O of all NICs in a cluster.  | 0% to 100%  | 30s  | 
Performance Monitoring  | Cluster Status  | Historical trend of the cluster status.  | Normal/Abnormal/Degraded  | 30s  | 
Performance Monitoring  | Cluster Read-only  | Historical trend of the cluster read-only status.  | Yes/No  | 30s  | 
Performance Monitoring  | Cluster Abnormal CNs  | Historical trend of the number of abnormal CNs in the cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Cluster Abnormal DNs  | Historical trend of the number of abnormal DNs in the cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Cluster CPU Usage of DNs  | Average CPU usage of all DNs in a cluster.  | 0% to 100%  | 60s  | 
Performance Monitoring  | Cluster Sessions  | Historical trend of the number of sessions in a cluster.  | >= 0  | 30s  | 
Performance Monitoring  | Cluster Queries  | Historical trend of the number of queries in a cluster.  | >= 0  | 30s  | 
Performance Monitoring  | Cluster Deadlocks  | Historical trend of the number of deadlocks in a cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Cluster TPS  | Average number of transactions per second of all databases in a cluster. Formula: (delta_xact_commit + delta_xact_rollback)/current_collect_rate  | >= 0  | 60s  | 
Performance Monitoring  | Cluster QPS  | Average number of concurrent requests per second of all databases in a cluster. Formula: delta_query_count/current_collect_rate  | >= 0  | 60s  | 
Performance Monitoring  | Database Sessions  | Historical trend of the number of sessions on a single database in a cluster.  | >= 0  | 30s  | 
Performance Monitoring  | Database Queries  | Historical trend of the number of queries on a single database in a cluster.  | >= 0  | 30s  | 
Performance Monitoring  | Database Inserted Rows  | Historical trend of the number of rows inserted into a single database in a cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Database Updated Rows  | Historical trend of the number of updated rows in a single database in a cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Database Deleted Rows  | Historical trend of the number of deleted rows in a single database in a cluster.  | >= 0  | 60s  | 
Performance Monitoring  | Database Capacity  | Historical trend of the capacity of a single database in a cluster.  | >= 0  | 86400s  | 
Live Session  | Session ID  | ID of the current session (query thread ID).  | String  | 30s  | 
Live Session  | User Name  | Name of the user who executes the current session.  | String  | 30s  | 
Live Session  | Database Name  | Name of the database connected to the current session.  | String  | 30s  | 
Live Session  | Session Duration  | Duration of the current session (unit: ms).  | >= 0 ms  | 30s  | 
Live Session  | Application Name  | Name of the application that creates the current session.  | String  | 30s  | 
Live Session  | Queries  | Number of SQL statements executed in the current session.  | >= 0  | 30s  | 
Live Session  | Latest Query Duration  | Duration for executing the previous SQL statement in the current session.  | >= 0 ms  | 30s  | 
Live Session  | Client IP Address  | IP address of the client that initiates the current session.  | String  | 30s  | 
Live Session  | Connected CN  | Connected CN of the current session.  | String  | 30s  | 
Live Session  | Session Status  | Execution status of the current session.  | Running/Idle/Retry  | 30s  | 
Real-Time Query  | Query ID  | Query ID of a current query statement, which is a unique identifier allocated by the kernel to each query statement.  | String  | 30s  | 
Real-Time Query  | User Name  | Name of the user who submits the current query statement.  | String  | 30s  | 
Real-Time Query  | Database Name  | Name of the database corresponding to the current query statement.  | String  | 30s  | 
Real-Time Query  | Application Name  | Name of the application corresponding to the current query statement.  | String  | 30s  | 
Real-Time Query  | Resource Pool  | Name of the resource pool for the current query statement.  | String  | 30s  | 
Real-Time Query  | Submitted  | Timestamp when the current query statement is submitted.  | String  | 30s  | 
Real-Time Query  | Blocking Time  | Waiting time before the current query statement is executed (unit: ms).  | >= 0  | 30s  | 
Real-Time Query  | Execution Time  | Execution time of the current query statement (unit: ms).  | >= 0  | 30s  | 
Real-Time Query  | CPU Time  | Total CPU time spent by the current query statement on all DNs (unit: ms).  | >= 0  | 30s  | 
Real-Time Query  | CPU Time Skew  | CPU time skew of the current query statement among all DNs.  | 0% to 100%  | 30s  | 
Real-Time Query  | Statement  | Query statement that is being executed.  | String  | 30s  | 
Real-Time Query  | Connected CN  | Name of the CN that submits the current query statement.  | String  | 30s  | 
Real-Time Query  | Client IP Address  | IP address of the client that submits the current query statement.  | String  | 30s  | 
Real-Time Query  | Lane  | Lane where the current query statement is located.  | Fast Lane/Slow Lane  | 30s  | 
Real-Time Query  | Query Status  | Query status of the statement that is being executed.  | String  | 30s  | 
Real-Time Query  | Session ID  | Session ID of the current query statement, which is a unique identifier allocated by the kernel to each client connection.  | String  | 30s  | 
Real-Time Query  | Queuing Status  | Status of the current query execution in the database, indicating whether the query is queued in the resource pool.  | Yes/No  | 30s  | 
Historical Query  | Query ID  | Query ID of a query statement, which is a unique identifier allocated by the kernel to each query statement.  | String  | 180s  | 
Historical Query  | User Name  | Name of the user who submits a query statement.  | String  | 180s  | 
Historical Query  | Application Name  | Application name corresponding to a query statement.  | String  | 180s  | 
Historical Query  | Database Name  | Name of the database corresponding to a query statement.  | String  | 180s  | 
Historical Query  | Resource Pool  | Name of the resource pool for the query statement.  | String  | 180s  | 
Historical Query  | Submitted  | Timestamp when a query statement is submitted.  | String  | 180s  | 
Historical Query  | Blocking Time  | Waiting time before the query statement is executed (unit: ms).  | >= 0  | 180s  | 
Historical Query  | Execution Time  | Execution time of the query statement (unit: ms).  | >= 0  | 180s  | 
Historical Query  | CPU Time  | Total CPU time spent by the query statement on all DNs (unit: ms).  | >= 0  | 180s  | 
Historical Query  | CPU Time Skew  | CPU time skew of a query statement executed on all DNs.  | 0% to 100%  | 180s  | 
Historical Query  | Statement  | Query statements to be parsed.  | String  | 180s  | 
Slow Instance Monitoring  | Slow Instance  | Number of slow instances detected at the current time point.  | >= 0  | 240s  | 
Slow Instance Monitoring  | Detected  | Time when a slow instance is detected for the first time.  | String  | 240s  | 
Slow Instance Monitoring  | Node Name  | Name of the node where the slow instance is deployed.  | String  | 240s  | 
Slow Instance Monitoring  | Instance  | Name of an instance.  | String  | 240s  | 
Slow Instance Monitoring  | Slow Node Detections (within 24 hours)  | Number of times that a slow instance is detected within 24 hours.  | >= 0  | 240s  | 
Resource Pool Monitoring  | Resource Pool  | Name of a resource pool in a cluster.  | String  | 120s  | 
Resource Pool Monitoring  | CPU Usage  | Real-time CPU usage of a resource pool.  | 0% to 100%  | 120s  | 
Resource Pool Monitoring  | CPU Resource  | CPU usage quota of a resource pool.  | 0% to 100%  | 120s  | 
Resource Pool Monitoring  | Real-Time Concurrent Short Queries  | Simple concurrency in a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Concurrent Short Queries  | Quota for simple concurrency in a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Real-Time Concurrent Queries  | Real-time complex concurrency in a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Query Concurrency  | Quota for complex concurrency in a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Storage  | Storage resource quota of a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Disk Usage  | Disk usage of a resource pool.  | 0% to 100%  | 120s  | 
Resource Pool Monitoring  | Memory  | Memory quota of a resource pool.  | >= 0  | 120s  | 
Resource Pool Monitoring  | Memory Usage  | Memory usage of a resource pool.  | 0% to 100%  | 120s  | 
Queries Waiting in a Resource Pool  | User  | Name of the user who submitted the waiting query.  | String  | 120s  | 
Queries Waiting in a Resource Pool  | Application  | Name of the application that submitted the waiting query.  | String  | 120s  | 
Queries Waiting in a Resource Pool  | Database  | Name of the database to be queried.  | String  | 120s  | 
Queries Waiting in a Resource Pool  | Queuing Status  | Execution status of a query in the database (CCN/CN/DN).  | String  | 120s  | 
Queries Waiting in a Resource Pool  | Wait Time  | Waiting time of a waiting query (unit: ms).  | >= 0 ms  | 120s  | 
Queries Waiting in a Resource Pool  | Resource Pool  | Resource pool of the waiting query.  | String  | 120s  | 
Queries Waiting in a Resource Pool  | Statement  | Query statement that is waiting.  | String  | 120s  | 
Circuit Breaking Queries  | Query ID  | Query ID of the circuit breaking query statement.  | String  | 120s  | 
Circuit Breaking Queries  | Query Statement  | Query statement that triggered circuit breaking.  | String  | 120s  | 
Circuit Breaking Queries  | Blocking Time  | Blocking time before the query statement triggers circuit breaking (unit: ms).  | >= 0  | 120s  | 
Circuit Breaking Queries  | Execution Time  | Execution time before the query statement triggers circuit breaking (unit: ms).  | >= 0  | 120s  | 
Circuit Breaking Queries  | CPU Time  | Average CPU time consumed by each DN before the query statement triggers circuit breaking (unit: ms).  | >= 0  | 120s  | 
Circuit Breaking Queries  | CPU Skew  | Skew rate of CPU time consumed by each DN before the query statement triggers circuit breaking.  | 0% to 100%  | 120s  | 
Circuit Breaking Queries  | Exception Handling  | Handling method after the query statement triggers circuit breaking.  | Abort/Degrade  | 120s  | 
Circuit Breaking Queries  | Status  | Circuit breaking handling status of a query statement.  | Executing/Completed  | 120s  | 
SQL Tuning  | Query ID  | ID of the current query (query logic ID).  | String  | 180s  | 
SQL Tuning  | Database  | Name of the database where the current query is executed.  | String  | 180s  | 
SQL Tuning  | Schema Name  | Name of the current query schema.  | String  | 180s  | 
SQL Tuning  | User Name  | Name of the user who performs the query.  | String  | 180s  | 
SQL Tuning  | Client  | Name of the client that initiates the current query.  | String  | 180s  | 
SQL Tuning  | Client IP Address  | IP address of the client that initiates the current query.  | String  | 180s  | 
SQL Tuning  | Running Time  | Execution time of the current query (unit: ms).  | >= 0  | 180s  | 
SQL Tuning  | CPU Time  | CPU time of the current query (unit: ms).  | >= 0  | 180s  | 
SQL Tuning  | Scale-Out Started  | Start time of the current query.  | Timestamp  | 180s  | 
SQL Tuning  | Completed  | End time of the current query.  | Timestamp  | 180s  | 
SQL Tuning  | Details  | Details about the current query.  | String  | 180s  | 
INODE  | Inode Usage  | Disk inode usage.  | 0% to 100%  | 30s  | 
SCHEMA  | Schema Usage  | Database schema usage.  | 0% to 100%  | 3600s  |
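
The Cluster TPS and Cluster QPS formulas in Table 1 can be illustrated with a short Python sketch. It is not the GaussDB(DWS) implementation: the function names and the sample counter values are hypothetical, and it assumes that each delta is the difference between two consecutive counter samples and that current_collect_rate is the collection interval in seconds (60s for these metrics). The table does not define these terms precisely.

```python
def cluster_tps(delta_xact_commit: int, delta_xact_rollback: int,
                current_collect_rate: float) -> float:
    """Table 1 formula: (delta_xact_commit + delta_xact_rollback) / current_collect_rate."""
    if current_collect_rate <= 0:
        raise ValueError("current_collect_rate must be positive")
    return (delta_xact_commit + delta_xact_rollback) / current_collect_rate


def cluster_qps(delta_query_count: int, current_collect_rate: float) -> float:
    """Table 1 formula: delta_query_count / current_collect_rate."""
    if current_collect_rate <= 0:
        raise ValueError("current_collect_rate must be positive")
    return delta_query_count / current_collect_rate


# Hypothetical cumulative counters from two consecutive collections.
previous = {"xact_commit": 120_000, "xact_rollback": 300, "query_count": 450_000}
current = {"xact_commit": 125_400, "xact_rollback": 360, "query_count": 468_000}

# Assumption: the collection interval equals the 60s monitoring period.
collect_interval_s = 60

tps = cluster_tps(
    current["xact_commit"] - previous["xact_commit"],
    current["xact_rollback"] - previous["xact_rollback"],
    collect_interval_s,
)
qps = cluster_qps(current["query_count"] - previous["query_count"], collect_interval_s)

print(f"TPS = {tps:.1f}, QPS = {qps:.1f}")  # TPS = 91.0, QPS = 300.0
```

With these sample values, 5,400 commits plus 60 rollbacks over 60 seconds give a TPS of 91, and 18,000 queries over the same interval give a QPS of 300. Both committed and rolled-back transactions count toward TPS, as the formula in the table states.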