Step 2: Preparing Source Data

You can import data in TEXT, CSV, or FIXED format from a remote server to DWS. This tutorial uses data in CSV format as an example. The procedure is the same for TEXT and FIXED data; only the parameter settings of the foreign tables differ. For details, see Using GDS to Import Data from a Remote Server.

Preparing Source Data Files

To demonstrate how to import multiple files, this tutorial uses the following three CSV data files as an example. Generally, the source data files are exported from a database. In this tutorial, the CSV source data files are manually created.

The sample files here are the same as those used in "Tutorial: Importing Data from OBS to a Cluster." If you have retained those sample files, you can upload them directly to the data server.

  • Data file product_info0.csv

    The file contains the following data:

    100,XHDK-A,2017-09-01,A,2017 Shirt Women,red,M,328,2017-09-04,715,good!
    205,KDKE-B,2017-09-01,A,2017 T-shirt Women,pink,L,584,2017-09-05,40,very good!
    300,JODL-X,2017-09-01,A,2017 T-shirt men,red,XL,15,2017-09-03,502,Bad.
    310,QQPX-R,2017-09-02,B,2017 jacket women,red,L,411,2017-09-05,436,It's nice.
    150,ABEF-C,2017-09-03,B,2017 Jeans Women,blue,M,123,2017-09-06,120,good.
  • Data file product_info1.csv

    The file contains the following data:

    200,BCQP-E,2017-09-04,B,2017 casual pants men,black,L,997,2017-09-10,301,good quality.
    250,EABE-D,2017-09-10,A,2017 dress women,black,S,841,2017-09-15,299,This dress fits well.
    108,CDXK-F,2017-09-11,A,2017 dress women,red,M,85,2017-09-14,22,It's really amazing to buy.
    450,MMCE-H,2017-09-11,A,2017 jacket women,white,M,114,2017-09-14,22,very good.
    260,OCDA-G,2017-09-12,B,2017 woolen coat women,red,L,2004,2017-09-15,826,Very comfortable.
  • Data file product_info2.csv

    The file contains the following data:

    980,"ZKDS-J",2017-09-13,"B","2017 Women's Cotton Clothing","red","M",112,,,
    98,"FKQB-I",2017-09-15,"B","2017 new shoes men","red","M",4345,2017-09-18,5473
    50,"DMQY-K",2017-09-21,"A","2017 pants men","red","37",28,2017-09-25,58,"good","good","good"
    80,"GKLW-l",2017-09-22,"A","2017 Jeans Men","red","39",58,2017-09-25,72,"Very comfortable."
    30,"HWEC-L",2017-09-23,"A","2017 shoes women","red","M",403,2017-09-26,607,"good!"
    40,"IQPD-M",2017-09-24,"B","2017 new pants Women","red","M",35,2017-09-27,52,"very good."
    50,"LPEC-N",2017-09-25,"B","2017 dress Women","red","M",29,2017-09-28,47,"not good at all."
    60,"NQAB-O",2017-09-26,"B","2017 jacket women","red","S",69,2017-09-29,70,"It's beautiful."
    70,"HWNB-P",2017-09-27,"B","2017 jacket women","red","L",30,2017-09-30,55,"I like it so much"
    80,"JKHU-Q",2017-09-29,"C","2017 T-shirt","red","M",90,2017-10-02,82,"very good."

CSV is short for Comma-Separated Values. A .csv file is a plain-text file, similar to a .txt file. Each line is a record, and the fields of a record are separated into columns by commas (,) or, in some variants, tabs. The column sequence in each record is the same. In Windows, .csv files can be opened in different applications, such as Notepad, Excel, and Notepad++.
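
As a quick way to inspect the column structure described above, the following sketch counts the comma-separated fields on each line of a file. It copies the first three rows of product_info2.csv into a temporary file named sample.csv (a name chosen only for this illustration). It assumes that no quoted field contains a comma, which holds for the sample data in this tutorial but not for CSV files in general.

```shell
#!/bin/sh
# Sketch: print "<line number>: <field count>" for each line of a CSV file.
# Assumption: no quoted field contains a comma (true for the sample data here).
workdir=$(mktemp -d)

# First three rows of product_info2.csv, copied from the sample data above.
cat > "$workdir/sample.csv" <<'EOF'
980,"ZKDS-J",2017-09-13,"B","2017 Women's Cotton Clothing","red","M",112,,,
98,"FKQB-I",2017-09-15,"B","2017 new shoes men","red","M",4345,2017-09-18,5473
50,"DMQY-K",2017-09-21,"A","2017 pants men","red","37",28,2017-09-25,58,"good","good","good"
EOF

# NF is the number of fields awk finds when splitting on commas.
field_counts=$(awk -F',' '{ print NR ": " NF }' "$workdir/sample.csv")
echo "$field_counts"
```

For these three rows the script reports 11, 10, and 13 fields, so the rows of product_info2.csv do not all have the same number of columns.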

The following describes how to generate a .csv file in Windows:

  1. Create a text file and open it in Notepad++. Copy the sample data into it. Then check that the total number of rows is correct and that the rows are separated correctly.
  2. Choose Format > Encode in UTF-8 without BOM.
  3. Choose File > Save as.
  4. In the displayed dialog box, enter the file name and click Save.

    To identify the file type, use the file name extension .csv when entering the file name.
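
If you prepare the files on a Linux machine instead of Windows, the same checks can be sketched with standard command-line tools. The example below is only an illustration: it writes product_info0.csv into a temporary directory, counts the rows, and tests whether the file starts with the UTF-8 byte order mark (bytes EF BB BF). The variable names and the temporary location are assumptions made for this sketch.

```shell
#!/bin/sh
# Sketch: create product_info0.csv, then verify the row count and the absence of a BOM.
workdir=$(mktemp -d)
csv_file="$workdir/product_info0.csv"

# The quoted 'EOF' keeps the shell from interpreting characters in the data.
cat > "$csv_file" <<'EOF'
100,XHDK-A,2017-09-01,A,2017 Shirt Women,red,M,328,2017-09-04,715,good!
205,KDKE-B,2017-09-01,A,2017 T-shirt Women,pink,L,584,2017-09-05,40,very good!
300,JODL-X,2017-09-01,A,2017 T-shirt men,red,XL,15,2017-09-03,502,Bad.
310,QQPX-R,2017-09-02,B,2017 jacket women,red,L,411,2017-09-05,436,It's nice.
150,ABEF-C,2017-09-03,B,2017 Jeans Women,blue,M,123,2017-09-06,120,good.
EOF

# Count the rows (lines) in the file.
row_count=$(wc -l < "$csv_file" | tr -d ' ')

# Read the first three bytes as hex; a UTF-8 BOM would appear as "efbbbf".
first_bytes=$(head -c 3 "$csv_file" | od -An -tx1 | tr -d ' \n')
if [ "$first_bytes" = "efbbbf" ]; then
    bom_status="BOM present"
else
    bom_status="no BOM"
fi

echo "$row_count rows, $bom_status"
```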

Uploading Source Data Files to a Data Server

  1. Log in as user root to the server storing source data files (also known as the data server or GDS server).
  2. Create the directory /input_data to store data files.

    mkdir -p /input_data

  3. Use WinSCP to upload source data files to the created directory.
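
After the upload, you may want to confirm that all three files arrived. The following is a minimal sketch to run on the data server. The function name count_missing and the DATA_DIR variable are hypothetical helpers introduced only for this example; the directory defaults to /input_data as created in the previous step.

```shell
#!/bin/sh
# Sketch: check that the three sample files exist in the upload directory.
# count_missing and DATA_DIR are illustrative names, not part of DWS or GDS.
count_missing() {
    dir=$1
    missing=0
    for name in product_info0.csv product_info1.csv product_info2.csv; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $dir/$name"
            missing=$((missing + 1))
        fi
    done
    echo "$missing file(s) missing in $dir"
    return 0
}

# Check the directory created earlier unless DATA_DIR overrides it.
count_missing "${DATA_DIR:-/input_data}"
```

If the script reports missing files, upload them again before continuing with the tutorial.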