• Data Warehouse Service

dws
  1. Help Center
  2. Data Warehouse Service
  3. Developer Guide
  4. SQL Reference
  5. SQL Syntax
  6. VACUUM

VACUUM

Function

VACUUM reclaims storage space occupied by tables or B-tree indexes. In normal database operation, rows that have been deleted are not physically removed from their table; they remain present until a VACUUM is done. Therefore, it is necessary to execute VACUUM periodically, especially on frequently-updated tables.

With no parameter, VACUUM processes every table in the current database that the current user has permission to vacuum. With a parameter, VACUUM processes only that table.

VACUUM ANALYZE executes a VACUUM and then an ANALYZE for each selected table. This is a handy combination form for routine maintenance scripts.

Plain VACUUM (without FULL) recycles space and makes it available for reuse. This form of the command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained. VACUUM FULL executes wider processing, including moving rows across blocks to compress tables so they occupy minimum number of disk blocks. This form is much slower and requires an exclusive lock on each table while it is being processed.

Precautions

  • To vacuum a table, you must ordinarily be the table's owner or the system administrator. However, database owners are allowed to VACUUM all tables in their databases, except shared catalogs. (The restriction for shared catalogs means that a true database-wide VACUUM can only be executed by the system administrator). VACUUM skips over any tables that the calling user does not have the permission to vacuum.
  • VACUUM cannot be executed inside a transaction block.
  • It is recommended that active production databases be vacuumed frequently (at least nightly), in order to remove dead rows. After adding or deleting a large number of rows, it might be a good idea to execute the VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the query planner to make better choices in planning queries.
  • FULL is recommended only in special scenarios. For example, you wish to physically narrow the table to decrease the occupied disk space after deleting most rows of a table. VACUUM FULL usually shrinks more table size than VACUUM. The FULL option does not clear indexes. You are advised to periodically run the REINDEX statement. Deleting all indexes, executing VACUUM FULL, and rebuilding indexes is usually a faster choice. If the physical space usage does not decrease after you run the command, check whether there are other active transactions (that have started before you delete data transactions and not ended before you run VACUUM FULL). If there are such transactions, run this command again when the transactions quit.
  • VACUUM causes a substantial increase in I/O traffic, which might cause poor performance for other active sessions. Therefore, it is sometimes advisable to use the cost-based VACUUM delay feature.
  • When VERBOSE is specified, VACUUM prints progress messages to indicate which table is currently being processed. Various statistics about the tables are printed as well. However, if the VERBOSE option is specified in VACUUM executed for column-store tables, no output will be displayed.
  • When the option list is surrounded by parentheses, the options can be written in any order. If there are no brackets, the options must be given in the order displayed in the syntax.
  • VACUUM and VACUUM FULL clear deleted tuples after the delay specified by vacuum_defer_cleanup_age.

Syntax

  • Reclaim space and update statistics information, no requirements for key words order.
    VACUUM [ ( { FULL | FREEZE | VERBOSE | {ANALYZE | ANALYSE }} [,...] ) ]
        [ table_name [ (column_name [, ...] ) ] ] [ PARTITION ( partition_name ) ];
  • Only reclaim space, do not update statistics information.
    VACUUM [ FULL [COMPACT] ] [ FREEZE ] [ VERBOSE ] [ table_name ] [ PARTITION ( partition_name ) ];
  • Reclaim space and update statistics information, and require key words order.
    VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { ANALYZE | ANALYSE } [ VERBOSE ] 
        [ table_name [ (column_name [, ...] ) ] ] [ PARTITION ( partition_name ) ];
  • For HDFS tables, move the data in delta table to HDFS storage.
    VACUUM DELTAMERGE [ table_name ];
  • For HDFS tables, delete the empty value partition directory of HDFS table in HDFS storage.
    VACUUM HDFSDIRECTORY [ table_name ];

Parameter Description

  • FULL

    Selects "FULL" vacuum, which can reclaim more space, but takes much longer and exclusively locks the table.

    FULL options can also contain the COMPACT parameter, which is only used for the HDFS table. Specifying the COMPACT parameter improves VACUUM FULL operation performance.

    COMPACT and PARTITION cannot be used at the same time.

    NOTE:

    Using FULL will cause statistics missing. To collect statistics, add the key word ANALYZE to VACUUM FULL.

  • FREEZE

    Is equivalent to executing VACUUM with the vacuum_freeze_min_age parameter set to zero.

  • VERBOSE

    Prints a detailed vacuum activity report for each table.

  • ANALYZE | ANALYSE

    Updates statistics used by the planner to determine the most efficient way to execute a query.

  • table_name

    Indicates the name (optionally schema-qualified) of a specific table to vacuum.

    Value range: The name of a specific table to vacuum. Defaults are all tables in the current database.

  • column_name

    Indicates the name of a specific field to analyze.

    Value range: Indicates the name of a specific field to analyze. Defaults are all columns.

  • PARTITION

    HDFS table does not support PARTITION. COMPACT and PARTITION cannot be used at the same time.

  • partition_name

    Indicates the partition name of a specific table to vacuum. Defaults are all partitions.

  • DELTAMERGE

    Migrates data in the delta table to HDFS only for HDFS tables. On a single DN, if the data volume of the delta table is less than 60,000 rows, the data will not be migrated; and if the data volume is greater than or equal to 60,000 rows, the data will be migrated, and the delta table will be cleared by TRUNCATE.

  • HDFSDIRECTORY

    Deletes the empty value partition directory of HDFS table in HDFS storage for HDFS table.

Examples

-- Create an index on the tpcds.reason table:
CREATE UNIQUE INDEX ds_reason_index1 ON tpcds.reason(r_reason_sk);

-- Do VACUUM to the tpcds.reason table that has indexes:
VACUUM (VERBOSE, ANALYZE) tpcds.reason;

-- Drop an index:
DROP INDEX ds_reason_index1 CASCADE;
DROP TABLE tpcds.reason;