Before importing data from OBS to a cluster, prepare the source data files and upload them to OBS. If the data files are already stored on OBS, skip Step 1 in this section.
Prepare the source data files to be uploaded to OBS. DWS supports only source data files in CSV, TEXT, and ORC formats.
If the data cannot be saved in CSV format, store it as a text file instead.
As described in How Data Is Imported, if individual source data files are large, split them evenly into multiple smaller files before storing them on OBS. Import performance is best when the number of files is an integer multiple of the number of DNs.
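The splitting advice above can be sketched in Python. This is a minimal illustration, not a DWS tool: the function name split_for_dns, the dn_count and multiple parameters, and the numeric-suffix naming are assumptions made for the example. It also assumes a line-oriented file with no header row and no quoted fields that contain line breaks.

```python
from pathlib import Path


def split_for_dns(src: Path, out_dir: Path, dn_count: int, multiple: int = 1) -> list[Path]:
    """Split a line-oriented file into dn_count * multiple parts of near-equal size.

    Lines are dealt out round-robin, so part sizes differ by at most one line
    and the source file is streamed rather than loaded into memory.
    """
    if dn_count < 1 or multiple < 1:
        raise ValueError("dn_count and multiple must be positive integers")

    n_parts = dn_count * multiple  # file count is an integer multiple of the DN count
    out_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: <source name>.<index>
    paths = [out_dir / f"{src.name}.{i}" for i in range(n_parts)]
    handles = [p.open("w", encoding="utf-8", newline="") for p in paths]
    try:
        # newline="" keeps the original line endings untouched on read and write.
        with src.open("r", encoding="utf-8", newline="") as f:
            for i, line in enumerate(f):
                handles[i % n_parts].write(line)
    finally:
        for h in handles:
            h.close()
    return paths
```

For a cluster assumed to have 3 DNs, calling split_for_dns(Path("product_info"), Path("out"), dn_count=3, multiple=2) would write six files. Round-robin distribution does not preserve the original row order within each part, which is acceptable only if the import does not depend on row order.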
Store the source data files to be imported in the OBS bucket in advance.
Click Service List and choose Object Storage Service to open the OBS management console.
For details about how to create a bucket, see OBS Console Operation Guide > Managing Buckets > Creating a Bucket in the Object Storage Service User Guide.
For example, create two buckets named mybucket and mybucket02.
For details about how to create a folder, see OBS Console Operation Guide > Managing Objects > Creating a Folder in the Object Storage Service User Guide.
For details about how to upload a file, see OBS Console Operation Guide > Managing Objects > Uploading a File in the Object Storage Service User Guide.
After the source data files are uploaded to an OBS bucket, a globally unique access path is generated. The OBS path of the source data files is the value of the location parameter used for creating a foreign table.
The OBS path in the location parameter consists of the obs:// prefix, a bucket name, and a file path.
For example, the OBS paths are as follows:
obs://mybucket/input_data/product_info.0
obs://mybucket/input_data/product_info.1
obs://mybucket02/input_data/product_info.2
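The path structure described above (obs://, then the bucket name, then the file path) can be assembled programmatically. The following Python sketch is illustrative only; the helper name obs_location is hypothetical and is not part of any OBS or DWS SDK.

```python
def obs_location(bucket: str, file_path: str) -> str:
    """Build an OBS path of the form obs://<bucket>/<file path>."""
    if not bucket or "/" in bucket:
        raise ValueError("bucket must be a non-empty name without slashes")
    # Drop leading slashes so the result never contains an empty path segment.
    return f"obs://{bucket}/{file_path.lstrip('/')}"


# Reproduce the example paths from the bucket names and file paths used above.
locations = [
    obs_location("mybucket", "input_data/product_info.0"),
    obs_location("mybucket", "input_data/product_info.1"),
    obs_location("mybucket02", "input_data/product_info.2"),
]
```

Each resulting string is a candidate value for the location parameter of the foreign table.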
To import data from OBS to a cluster, the user must have read permission on the OBS buckets where the source data files are located. You can configure the bucket ACL to grant read permission to a specific user.
For details, see OBS Console Operation Guide > Bucket Permissions > Setting ACL Permissions for Buckets in the Object Storage Service User Guide.