• Data Warehouse Service

dws
  1. Help Center
  2. Data Warehouse Service
  3. Developer Guide
  4. Data Export
  5. Exporting Data to OBS in Parallel
  6. Examples

Examples

Exporting a Table

Create two foreign tables and use them to export tables from a database to two buckets in OBS.

  1. Log in to the OBS data server through the management console. On the OBS server, create the buckets /input_data1 and /input_data2 for storing data files, and create data directories /input_data1/data and /input_data2/data, respectively, in the two buckets.
  2. On the DWS database, create the foreign tables tpcds.customer_address_ext1 and tpcds.customer_address_ext2 for the OBS data server to receive data exported from the database.

    OBS and the database are in the same region. The example DWS table to be exported is tpcds.customer_address.

    Export information is set as follows:

    • The source data file directories are /input_data1/data/ and input_data2/data/, so location of tpcds.customer_address_ext1 and tpcds.customer_address_ext2 is set to obs://input_data1/data/ and obs://input_data2/data/, respectively.

    Information about data formats is set based on the detailed data format parameters specified during data export from a database. The parameter settings are as follows:

    • format is set to CSV.
    • encoding is set to UTF-8.
    • delimiter is set to 0E08.
    • quote is set to 0x1b.
    • encrypt is set to 'on'.
    • access_key is set to the AK you have obtained. This parameter is mandatory.
    • secret_access_key is set to the SK you have obtained. This parameter is mandatory.
      NOTE:

      access_key and secret_access_key have been obtained during user creation. Replace the italic part with the actual keys.

    Based on the above settings, the foreign tables are created using the following statements:

    CREATE FOREIGN TABLE tpcds.customer_address_ext1
    (
    ca_address_sk             integer                       ,
    ca_address_id             char(16)                      ,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)
    )
    SERVER gsmpp_server
    OPTIONS(LOCATION 'obs://input_data1/data/',
    FORMAT 'CSV',
    ENCODING 'utf8', 
    DELIMITER E'\x08', 
    QUOTE E'\x1b',
    ENCRYPT 'on',
    ACCESS_KEY 'access_key_value_to_be_replaced',
    SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced' 
    );
    CREATE FOREIGN TABLE tpcds.customer_address_ext2
    (
    ca_address_sk             integer                       ,
    ca_address_id             char(16)                      ,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)
    )
    SERVER gsmpp_server
    OPTIONS(LOCATION 'obs://input_data2/data/',
    FORMAT 'CSV',
    ENCODING 'utf8', 
    DELIMITER E'\x08', 
    QUOTE E'\x1b',
    ENCRYPT 'on',
    ACCESS_KEY 'access_key_value_to_be_replaced',
    SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced'
    );

  3. In DWS, export the data table tpcds.customer_address to the foreign tables tpcds.customer_address_ext1 and tpcds.customer_address_ext2 concurrently.

    INSERT INTO tpcds.customer_address_ext1 SELECT * FROM tpcds.customer_address;
    INSERT INTO tpcds.customer_address_ext2 SELECT * FROM tpcds.customer_address;
    NOTE:

    The design of OBS foreign tables does not allow exporting files to a non-empty path. However, in concurrent export scenarios, multiple files are exported to the same path, causing an error.

    Assume that a user concurrently exports data from the same table to the same OBS foreign table, and that one SQL statement is executed to export data when another SQL statement is being executed and has not generated any file on the OBS server. In this case, certain data is overwritten although both SQL statements are successfully executed. Therefore, you are advised not to concurrently export data to the same OBS foreign table.

Concurrently Exporting Tables

Use the two foreign tables to export tables from the database to two buckets in OBS.

  1. Log in to the OBS data server through the management console. On the OBS server, create the buckets /input_data1 and /input_data2 for storing data files, and create data directories /input_data1/data and /input_data2/data, respectively, in the two buckets.
  2. In DWS, create foreign tables tpcds.customer_address_ext1 and tpcds.customer_address_ext2 for the OBS server to receive exported data.

    OBS and the database are in the same region. Tables to be exported are tpcds.customer_address1 and tpcds.customer_address2.

    Export information is set as follows:
    • The source data file directories are /input_data1/data/ and input_data2/data/, so location of tpcds.customer_address_ext1 and tpcds.customer_address_ext2 is set to obs://input_data1/data/ and obs://input_data2/data/, respectively.

    Information about data formats is set based on the detailed data format parameters specified during data export from DWS. The parameter settings are as follows:

    • format is set to CSV.
    • encoding is set to UTF-8.
    • delimiter is set to 0E08.
    • quote is set to 0x1b.
    • encrypt is set to 'on'.
    • access_key is set to the AK you have obtained. This parameter is mandatory.
    • secret_access_key is set to the SK you have obtained. This parameter is mandatory.
      NOTE:

      access_key and secret_access_key have been obtained during user creation. Replace the italic part with the actual keys.

    Based on the above settings, the foreign tables are created using the following statements:

    CREATE FOREIGN TABLE tpcds.customer_address_ext1
    (
    ca_address_sk             integer               ,
    ca_address_id             char(16)              ,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)
    )
    SERVER gsmpp_server
    OPTIONS(LOCATION 'obs://input_data1/data/',
    FORMAT 'CSV',
    ENCODING 'utf8', 
    DELIMITER E'\x08', 
    QUOTE E'\x1b',
    ENCRYPT 'on',
    ACCESS_KEY 'access_key_value_to_be_replaced',
    SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced'
    );
    CREATE FOREIGN TABLE tpcds.customer_address_ext2
    (
    ca_address_sk             integer               ,
    ca_address_id             char(16)              ,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)
    )
    SERVER gsmpp_server
    OPTIONS(LOCATION 'obs://input_data2/data/',
    FORMAT 'CSV',
    ENCODING 'utf8', 
    DELIMITER E'\x08', 
    QUOTE E'\x1b',
    ENCRYPT 'on',
    ACCESS_KEY 'access_key_value_to_be_replaced',
    SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced' 
    );

  3. In DWS, export the data tables tpcds.customer_address1 and tpcds.customer_address2 in parallel to the foreign tables tpcds.customer_address_ext1 and tpcds.customer_address_ext2, respectively.

    INSERT INTO tpcds.customer_address_ext1 SELECT * FROM tpcds.customer_address1;
    INSERT INTO tpcds.customer_address_ext2 SELECT * FROM tpcds.customer_address2;