Exporting Data In Parallel Using GDS

In high-concurrency scenarios, you can use GDS to export data from a database to a common file system.

Overview

Data is exported through foreign tables: a GDS foreign table specifies the format of the exported files and the export mode. Multiple data nodes (DNs) export data from the database to data files in parallel, which improves overall export performance. Data cannot be exported directly to HDFS.
  • The coordinator node (CN) only plans data export tasks and delivers them to the DNs, which frees the CN to process other tasks.
  • The computing capacity and bandwidth of all DNs are fully used to export data.
    Figure 1 Exporting data using foreign tables
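The division of work described above can be pictured with a toy model. This is only an illustrative sketch, not how DWS or GDS is implemented: the DN_DATA dictionary, the file names, and the use of threads are all assumptions made for the example, and in a real deployment the DNs send data to GDS, which writes the files.

```python
import csv
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Toy stand-in for data already distributed across three DNs (hypothetical rows).
DN_DATA = {
    0: [(1, "reason_a"), (4, "reason_d")],
    1: [(2, "reason_b"), (5, "reason_e")],
    2: [(3, "reason_c")],
}


def plan_tasks(dn_ids, out_dir):
    """CN role: decide which DN writes which file, without touching any rows."""
    return [(dn_id, Path(out_dir) / f"export_dn{dn_id}.csv") for dn_id in dn_ids]


def run_task(task):
    """DN role: write the locally held rows to the assigned file."""
    dn_id, path = task
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(DN_DATA[dn_id])
    return path


def parallel_export(out_dir):
    tasks = plan_tasks(sorted(DN_DATA), out_dir)
    # All DN tasks run concurrently; the planner only waits for completion.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(run_task, tasks))


files = parallel_export(tempfile.mkdtemp())
print([p.name for p in files])
```

The planner never reads row data, which mirrors the point that the CN is free for other work while the DNs carry the export load.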

Related Concepts

  • Data file: a TEXT, CSV, or FIXED file that stores data exported from DWS.
  • Foreign table: a table that stores information about a data file, such as its format, location, and encoding.
  • GDS: a data service tool. To export data, deploy GDS on the server where the data files will be stored.
  • Table: a table in the database, either row-store or column-store, from which data is exported.
  • Remote mode: Service data in a cluster is exported to hosts outside the cluster.

Export Modes

Data can be exported from DWS in Remote mode.

  • Remote mode: Service data in a cluster is exported to hosts outside the cluster.
    • In this mode, multiple GDS instances export data concurrently. One GDS instance can export data for only one cluster at a time.
    • The export rate of a GDS instance on the same intranet as the cluster nodes is limited by network bandwidth. A 10 GE network is recommended.
    • Data files in TEXT or CSV format are supported. The size of data in a single row must be less than 1 GB.
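The Remote mode constraints above can be checked before an export is set up. The sketch below is a hedged illustration: the helper names, the IP addresses, and the port are hypothetical, the 1 GB limit is interpreted as 2^30 bytes, and the gsfs:// URL form with " | " between GDS endpoints is an assumption about foreign table LOCATION syntax that this section does not state. Verify it against Creating a GDS Foreign Table.

```python
SUPPORTED_FORMATS = {"TEXT", "CSV"}   # formats this section lists for Remote mode
MAX_ROW_BYTES = 1 << 30               # "less than 1 GB", assumed to mean 2**30 bytes


def build_location(endpoints):
    """Join (host, port) pairs of GDS instances into one LOCATION value.

    The gsfs:// scheme and the ' | ' separator are assumptions, see the lead-in.
    """
    if not endpoints:
        raise ValueError("at least one GDS endpoint is required")
    return " | ".join(f"gsfs://{host}:{port}/" for host, port in endpoints)


def check_export_settings(file_format, largest_row_bytes):
    """Return a list of problems; an empty list means the settings pass."""
    problems = []
    if file_format.upper() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {file_format}")
    if largest_row_bytes >= MAX_ROW_BYTES:
        problems.append("a single row must be smaller than 1 GB")
    return problems


# Hypothetical GDS instances on two data servers.
location = build_location([("192.168.0.90", 5000), ("192.168.0.91", 5000)])
print(location)
print(check_export_settings("CSV", 4096))
```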

Data Export Process

Figure 2 Process of parallel data export

  

Table 1 Process description

| Procedure | Description | Subtask |
| --- | --- | --- |
| Plan data export. | Prepare data to be exported and plan the export path for the mode to be selected. For details, see Planning Data Export. | - |
| Start GDS. | If the Remote mode is selected, install, configure, and start GDS on data servers. For details, see Installing, Configuring, and Starting GDS. | - |
| Create a foreign table. | Create a foreign table to help GDS specify information about a data file. The foreign table stores information, such as the location, format, encoding, and inter-data delimiter of a data file. For details, see Creating a GDS Foreign Table. | - |
| Export data. | After the foreign table is created, run the INSERT statement to efficiently export data to data files. For details, see Exporting Data. | - |
| Stop GDS. | Stop GDS after data is exported. For details, see Stopping GDS. | - |
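The "Create a foreign table" and "Export data" steps come down to two SQL statements. The following sketch renders them as strings so they can be reviewed before being run in a SQL client. The table names, columns, and GDS address are hypothetical, and the statement shape (SERVER gsmpp_server, the OPTIONS keys, WRITE ONLY) is an assumption based on typical DWS foreign table DDL rather than something this section specifies. Confirm it in Creating a GDS Foreign Table and Exporting Data.

```python
def create_foreign_table_sql(name, columns, location,
                             file_format="CSV", encoding="utf8", delimiter=","):
    """Render a write-only GDS foreign table definition (assumed syntax)."""
    cols = ",\n".join(f"    {col} {typ}" for col, typ in columns)
    return (
        f"CREATE FOREIGN TABLE {name}\n(\n{cols}\n)\n"
        "SERVER gsmpp_server\n"
        f"OPTIONS (LOCATION '{location}', FORMAT '{file_format}', "
        f"ENCODING '{encoding}', DELIMITER '{delimiter}')\n"
        "WRITE ONLY;"
    )


def export_sql(foreign_table, source_table):
    """Render the INSERT that pushes rows from the source table to the data files."""
    return f"INSERT INTO {foreign_table} SELECT * FROM {source_table};"


# Hypothetical names and address, for illustration only.
ddl = create_foreign_table_sql(
    "foreign_reasons",
    [("r_reason_sk", "integer"), ("r_reason_desc", "char(100)")],
    "gsfs://192.168.0.90:5000/",
)
dml = export_sql("foreign_reasons", "reasons")
print(ddl)
print(dml)
```

The helpers interpolate values directly into the statement text, so they are suitable only for trusted, hand-written inputs.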