• Data Warehouse Service

dws
  1. Help Center
  2. Data Warehouse Service
  3. Developer Guide
  4. Data Import
  5. Concurrently Importing Data from OBS
  6. Importing Data

Importing Data

Context

Before importing data, you are advised to optimize your design and deployment based on the following best practices, helping maximize system resource utilization and improving data import performance.

  • In most cases, OBS data import performance is limited by concurrent network access rate. Therefore, you are advised to deploy multiple buckets on the OBS server to import data in parallel from buckets, better utilizing DN data transfer.
  • Similar to the single table import, ensure that the I/O performance is greater than the maximum network throughput in the concurrent import.
  • Set Global User Configuration (GUC) parameters raise_errors_if_no_files, partition_mem_batch, and partition_max_cache_size. Specify whether to identify, when data is imported, the situation where the number of records in imported files is zero and the situation where an imported file does not exist. Specify the number and size of data buffers.
  • If a table has an index, the index information is incrementally updated during the import, affecting data import performance. You are advised to delete the index from the target table. You can create index again after the import is complete.

Procedure

  1. Create a table in the DWS database to store the data imported from the OBS. For details about the syntax, see CREATE TABLE.

    The structure of the table must be consistent with that of the fields in the source data file. That is, the number of fields and field types must be the same. In addition, the structure of the target table must be the same as that of the foreign table. The field names can be different.

  2. (Optional) If the target table has an index, the index information is incrementally updated during the import, affecting data import performance. You are advised to delete the index from the target table before the import. You can create index again after the import is complete.
  3. Import data.

    INSERT INTO [Target table name] SELECT * FROM [Foreign table name]
    • If information similar to the following is displayed, the data has been imported: Query the error information table to check whether any data format errors occurred. For details, see Handling Import Errors.
      INSERT 0 20
    • If data fails to be loaded, troubleshoot the problem by following the instructions provided in Handling Import Errors and try again.

Example

For example, create a table named product_info.

DROP TABLE IF EXISTS product_info;
CREATE TABLE product_info
(
    product_price                integer        not null,
    product_id                   char(30)       not null,
    product_time                 date           ,
    product_level                char(10)       ,
    product_name                 varchar(200)   ,
    product_type1                varchar(20)    ,
    product_type2                char(10)       ,
    product_monthly_sales_cnt    integer        ,
    product_comment_time         date           ,
    product_comment_num          integer        ,
    product_comment_content      varchar(200)                   
) 
with (
orientation = column,
compression=middle
) 
DISTRIBUTE BY HASH (product_id);

Run the following statement to import data from the product_info_ext foreign table to the product_info table:

INSERT INTO product_info SELECT * FROM product_info_ext;