Real-time Top SQL

You can query real-time Top SQL in real-time resource monitoring views at different levels. The real-time resource monitoring view records the resource usage (including memory, data spilled to disks, and CPU time) and performance alarm information during job running.

The following table describes the external interfaces of the real-time views.

Table 1 Real-time resource monitoring views

Level

Monitored Node

View

Query level/perf level

Current CN

GS_WLM_SESSION_STATISTICS

All CNs

PGXC_WLM_SESSION_STATISTICS

operator level

Current CN

GS_WLM_OPERATOR_STATISTICS

All CNs

PGXC_WLM_OPERATOR_STATISTICS

Precautions

  • The view level is determined by the resource monitoring level, that is, the resource_track_level configuration.

  • The perf and operator levels affect the values of the query_plan and warning columns in GS_WLM_SESSION_STATISTICS/PGXC_WLM_SESSION_INFO. For details, see SQL Self-Diagnosis.

  • Prefixes gs and pgxc indicate views showing single CN information and those showing cluster information, respectively. Common users can log in to a CN in the cluster to query only views with the gs prefix.

  • When you query this type of views, there will be network latency, because the views obtain resource usage in real time.

  • If an instance fault occurs, some Top SQL statement information may fail to be recorded in real-time resource monitoring views.

  • Top SQL statements are recorded in real-time resource monitoring views as follows:

    • Special DDL statements, such as SET, RESET, SHOW, ALTER SESSION SET, and SET CONSTRAINTS, are not recorded.

    • DDL statements, such as CREATE, ALTER, DROP, GRANT, REVOKE, and VACUUM, are recorded.

    • DML statements are recorded, including:

      • the execution of SELECT, INSERT, UPDATE, and DELETE

      • the execution of EXPLAIN ANALYZE and EXPLAIN PERFORMANCE

      • the use of the query-level or perf-level views

    • The entry statements for invoking functions and stored procedures are recorded. When the GUC parameter enable_track_record_subsql is enabled, some internal statements (except the DECLARE definition statement) of a stored procedure can be recorded. Only the internal statements delivered to DNs for execution are recorded, and the remaining internal statements are filtered out.

    • The anonymous block statement is recorded. When the GUC parameter enable_track_record_subsql is enabled, some internal statements of an anonymous block can be recorded. Only the internal statements delivered to DNs for execution are recorded, and the remaining internal statements are filtered out.

    • The cursor statements are recorded. If a cursor does not read data from the cache but triggers the condition for delivering the statement to a DN for execution, the cursor statement is recorded and the statement and execution plan are enhanced. However, if the cursor reads data from the cache, the cursor statement is not recorded. When a cursor statement is used in an anonymous block or function and the cursor reads a large amount of data from a DN but is not fully used, the monitoring information about the cursor on the DN cannot be recorded due to the current architecture limitation. The With Hold cursor syntax has a special execution logic. It executes queries during transaction committing. If a statement execution error is reported during this period of time, the aborted status of the job cannot be recorded in the TopSQL history table.

    • Statistics are not collected for jobs in the redistribution process.

    • The parameters of a statement with placeholders executed by JDBC are generally specified. However, if the length of the parameter and the original statement exceeds 64 KB, the parameter is not recorded. If the statement is a lightweight statement, it is directly delivered to the DN for execution and the parameter is not recorded.

    • In cluster 8.1.3 and later versions, the TopSQL monitoring at the query and perf levels does not affect the query performance. The default value of the GUC parameter resource_track_cost for resource monitoring of statements has been changed to 0. When you query the TopSQL real-time monitoring view, by default, all statements that are being executed are displayed.

    • In 8.1.3 and later versions, if the GUC parameter enable_track_record_subsql for querying the TopSQL monitoring view is enabled, regardless of whether the substatement monitoring function is enabled in the service statements, you can view the substatement running information in the TopSQL monitoring view.

    • You are advised not to fully enable substatement monitoring in stored procedures, that is, enable_track_record_subsql, in the 8.1.3 cluster version. Because the substatements cannot be filtered by time, fully enabling substatement monitoring may record too many substatements. As a result, archived monitoring tables occupy a large amount of disk space. In the 8.1.3 cluster version, you are advised to enable only the parameters in the corresponding session when querying real-time monitoring information or locating and analyzing some stored procedures. In 8.2.1, the GUC parameter resource_track_subsql_duration is added. The default value is 180 seconds. You can use this parameter to filter substatements to be archived by execution time. The parameter can be adjusted.

    • Due to specification restrictions, the records of the main statements that are not written to disks in the TopSQL history table are delayed. The records are displayed in the TopSQL history table only when the job is delivered next time.

    • The spill_size field at the query level (job monitoring) and operator level (operator monitoring) varies due to the statistical dimension. The spill size at the query level is the statement files spilled to disks, and the spill size at the operator level is the read and write I/O volume of a specific operator at the logical layer.

    • When the GUC parameter enable_stream_operator is set to off, the displayed operator execution information may be inaccurate.

Prerequisites

  • The GUC parameter enable_resource_track is set to on. The default value is on.

  • The GUC parameter resource_track_level is set to query, perf or operator. The default value is query.

  • Jobs whose execution cost estimated by the optimizer is greater than or equal to the value of resource_track_cost are monitored.

  • If the Cgroups function is properly loaded, you can run the gs_cgroup -P command to view information about Cgroups.

  • The GUC parameter enable_track_record_subsql specifies whether to record internal statements of a stored procedure or anonymous block.

In the preceding prerequisites, enable_resource_track is a system-level parameter that specifies whether to enable resource monitoring. resource_track_level is a session-level parameter. You can set the resource monitoring level of a session as needed. The following table describes the values of the two parameters.

Table 2 Setting the resource monitoring level to collect statistics

enable_resource_track

resource_track_level

Query-Level Information

Operator-Level Information

on(default)

none

Not collected

Not collected

query(default)

Collected

Not collected

perf

Collected

Not collected

operator

Collected

Collected

off

none/query/operator

Not collected

Not collected

Procedure

  1. Query for the real-time CPU information in the gs_session_cpu_statistics view.

    SELECT * FROM gs_session_cpu_statistics;
    
  2. Query for the real-time memory information in the gs_session_memory_statistics view.

    SELECT * FROM gs_session_memory_statistics;
    
  3. Query for the real-time resource information about the current CN in the gs_wlm_session_statistics view.

    SELECT * FROM gs_wlm_session_statistics;
    
  4. Query for the real-time resource information about all CNs in the pgxc_wlm_session_statistics view.

    SELECT * FROM pgxc_wlm_session_statistics;
    
  5. Query for the real-time resource information about job operators on the current CN in the gs_wlm_operator_statistics view.

    SELECT * FROM gs_wlm_operator_statistics;
    
  6. Query for the real-time resource information about job operators on all CNs in the pgxc_wlm_operator_statistics view.

    SELECT * FROM pgxc_wlm_operator_statistics;
    
  7. Query for the load management information about the jobs executed by the current user in the PG_SESSION_WLMSTAT view.

    SELECT * FROM pg_session_wlmstat;
    
  8. Query the job execution status of the current user on each CN in the pgxc_wlm_workload_records view (this view is available when the dynamic load function is enabled, that is, enable_dynamic_workload is set to on).

    SELECT * FROM pgxc_wlm_workload_records;