• Data Warehouse Service

dws
  1. Help Center
  2. Data Warehouse Service
  3. Developer Guide
  4. Tutorial: Using GDS to Import Data
  5. Step 4: Creating a Foreign Table in the DWS Database

Step 4: Creating a Foreign Table in the DWS Database

  1. Use the SQL client tool to connect to the DWS database.
  2. Create a foreign table based on Table 1.

    DROP FOREIGN TABLE IF EXISTS product_info_ext;
    CREATE FOREIGN TABLE product_info_ext
    (
        product_price                integer        not null,
        product_id                   char(30)       not null,
        product_time                 date           ,
        product_level                char(10)       ,
        product_name                 varchar(200)   ,
        product_type1                varchar(20)    ,
        product_type2                char(10)       ,
        product_monthly_sales_cnt    integer        ,
        product_comment_time         date           ,
        product_comment_num          integer        ,
        product_comment_content      varchar(200)              
    ) 
    SERVER gsmpp_server 
    OPTIONS(
    LOCATION 'gsfs://192.168.0.90:5000/*',
    FORMAT 'CSV' ,
    DELIMITER ',',
    ENCODING 'utf8',
    HEADER 'false',
    FILL_MISSING_FIELDS 'true',
    IGNORE_EXTRA_DATA 'true'
    )
    READ ONLY
    LOG INTO product_info_err 
    PER NODE REJECT LIMIT 'unlimited';

    If the following information is displayed, the foreign table has been created.

    CREATE FOREIGN TABLE
    Table 1 Configurations in the foreign table

    Configuration Item

    Value

    Description

    SERVER

    gsmpp_server

    This parameter is always set to gsmpp_server. You do not need to change it.

    LOCATION

    gsfs://192.168.0.90:5000/*

    This parameter specifies the location of source data files.

    If SSL transmission is used, use the gsfss protocol. In this tutorial, the location is gsfss://192.168.0.90:5000/*.

    FORMATE

    CSV

    This parameter specifies the format of source data files.

    ENCODING

    UTF-8

    This parameter specifies the data encoding.

    DELIMITER

    It is set to a comma (,).

    This parameter specifies the column delimiter.

    HEADER

    false (default value)

    This parameter specifies whether a data file contains a header. This parameter is valid only for data files in CSV format. Data files in Preparing Source Data Files do not contain a header. Therefore, this parameter is set to false.

    FILL_MISSING_FIELDS

    true

    This parameter specifies how to handle the problem that the last column of a row in a source data file is lost during data import. The default value is false or off. true is used in this tutorial.

    • true/on: The last column is set to NULL. No error is reported.
    • false/off: Error "missing data for column "tt"" is reported.
    For example, the last column product_comment_content of the second row in the product_info2.csv source data file is lost. If FILL_MISSING_FIELDS is set to false or off, information similar to the following will be displayed in the error table during data import:
    missing data for column "product_comment_content"

    IGNORE_EXTRA_DATA

    true

    This parameter specifies whether to ignore excessive columns when the number of columns in a source data file exceeds that defined in the foreign table. The default value is false or off. true is used in this tutorial.

    • true/on: The excessive columns of a row are ignored. No error is reported.
    • false/off: Error "extra data after last expected column" is reported.

    For example, the number of columns in the third record in the product_info2.csv source data file is greater than that defined in the foreign table. If IGNORE_EXTRA_DATA is set to false or off, information similar to the following will be displayed in the error table during data import:

    extra data after last expected column

    PER NODE REJECT LIMIT 'value'

    unlimited

    This parameter specifies the allowed number of data format errors on each DN during data import. If the number of errors exceeds the specified value on any DN, data import will fail, an error will be reported, and the system will exit data import.

    It is set to unlimited in this tutorial, indicating that all data format errors during import are allowed.

    READ ONLY

    -

    Syntax defined in a foreign table can be used for both importing data to the DWS cluster and for exporting data from the cluster. To import data to the cluster, use READ ONLY for the foreign table. To export data, use WRITE ONLY.

    WITH error_table_name

    Error table name: product_info_err

    Data format errors during import are recorded in the table specified by product_info_err. You can query this table after the import to obtain error details.