Parallel Data Import

DWS provides a parallel data import function that enables a large amount of data to be imported in a fast and efficient manner. This section describes parameters for importing data in parallel.

raise_errors_if_no_files

Parameter description: Specifies whether to distinguish between the conditions "the number of imported file records is empty" and "the imported file does not exist". If this parameter is set to on and the imported file does not exist, DWS reports the error message "file does not exist".

This parameter is a SUSET parameter. Set it based on instructions provided in Table 1.

Value range: Boolean

  • on indicates that the messages for "the number of imported file records is empty" and "the imported file does not exist" are distinguished when files are imported.
  • off indicates that the messages for "the number of imported file records is empty" and "the imported file does not exist" are the same when files are imported.

Default value: off

partition_mem_batch

Parameter description: To optimize batch insertion into column-store partitioned tables, data is buffered during insertion and then written to disk. partition_mem_batch specifies the number of caches. If the value is too large, a large amount of memory is consumed. If it is too small, batch insertion into column-store partitioned tables slows down.

This parameter is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range: 1 to 65535

Default value: 256

partition_max_cache_size

Parameter description: To optimize batch insertion into column-store partitioned tables, data is buffered during insertion and then written to disk. partition_max_cache_size specifies the size of the data buffer cache. If the value is too large, a large amount of memory is consumed. If it is too small, insertion into column-store partitioned tables slows down.

This parameter is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range:

  • Column-store partitioned table: 4096 to INT_MAX/2. The unit is KB.

Default value: 2GB
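
The value range is given in KB while the default is given in GB, so the following worked arithmetic may help when choosing a value. It is a minimal sketch under stated assumptions: INT_MAX is taken to be the 32-bit signed maximum (2147483647), INT_MAX/2 is taken to truncate, and 2GB is taken to mean 2 × 1024 × 1024 KB. The helper names gb_to_kb and check_cache_size_kb are hypothetical and are not part of DWS.

```python
# Assumption: INT_MAX is the 32-bit signed maximum and INT_MAX/2 truncates.
INT_MAX = 2**31 - 1
MIN_KB = 4096            # documented lower bound, in KB
MAX_KB = INT_MAX // 2    # documented upper bound, in KB

KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def gb_to_kb(gigabytes: int) -> int:
    """Convert GB to KB, the unit used by partition_max_cache_size."""
    return gigabytes * KB_PER_GB


def check_cache_size_kb(value_kb: int) -> int:
    """Return value_kb unchanged if it lies within the documented range."""
    if not MIN_KB <= value_kb <= MAX_KB:
        raise ValueError(
            f"partition_max_cache_size must be {MIN_KB}..{MAX_KB} KB, got {value_kb}"
        )
    return value_kb


default_kb = check_cache_size_kb(gb_to_kb(2))
print(default_kb)           # 2097152: the 2GB default expressed in KB
print(MIN_KB // KB_PER_MB)  # 4: the lower bound expressed in MB
print(MAX_KB)               # 1073741823: one KB short of 1 TB
```

Under these assumptions the usable range runs from 4 MB to just under 1 TB, and the default sits well inside it.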

gds_debug_mod

Parameter description: Specifies whether to enable the debug function of Gauss Data Service (GDS), which helps locate and analyze GDS faults. After the debug function is enabled, the types of packets received or sent by GDS, the GDS peer during command interaction, and GDS session IDs are written to the corresponding logs on cluster nodes. This records the state transitions of the GaussDB state machine and its current state. Enabling this function consumes additional log I/O, which affects log performance and validity. You are advised to enable it only when locating GDS faults.

This parameter is a USERSET parameter. Set it based on instructions provided in Table 1.

Value range:

  • on indicates that the GDS debug function is enabled.
  • off indicates that the GDS debug function is disabled.

Default value: off
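
Because the debug function adds log I/O, one way to keep it limited to a fault investigation is to switch it on and off around the diagnostic work. The following is a minimal sketch, assuming the parameter can be changed at session level with a standard SET statement. That is consistent with its USERSET type but is not confirmed by this section, so refer to Table 1 for the supported methods. run_sql is a hypothetical callable standing in for a database client. In this sketch it only records the statements that would be sent.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def gds_debug(run_sql: Callable[[str], None]) -> Iterator[None]:
    """Enable gds_debug_mod for the duration of a with-block.

    run_sql is a hypothetical callable that sends one SQL statement to
    the current session; it is not part of any DWS client API.
    """
    run_sql("SET gds_debug_mod = on;")
    try:
        yield
    finally:
        # Always restore the documented default, even if the body fails.
        run_sql("SET gds_debug_mod = off;")


# Stand-in for a real connection: record the statements that would be sent.
sent: List[str] = []

with gds_debug(sent.append):
    sent.append("-- run the GDS import being diagnosed here")

print(sent)
```

The try/finally block means the parameter is set back to off even when the import under investigation raises an error, so the extra logging does not outlive the diagnostic session.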