Managing Files

On the File Management page of an analysis cluster with Kerberos authentication disabled, you can create and delete directories, and import, export, or delete files.

Background

Data to be processed by MRS is stored in either OBS or HDFS. OBS provides you with massive, highly reliable, and secure data storage capabilities at a low cost. You can view, manage, and use data through OBS Console or OBS Browser.

Importing Data

MRS supports data import from the OBS system to HDFS. This function is recommended for small amounts of data, because the upload speed decreases as the file size increases.

Both files and folders containing files can be imported. The operations are as follows:

  1. Log in to the MRS management console.
  2. In the upper-left corner of the management console, select Region and Project.
  3. Choose Cluster > Active Cluster, select a cluster, and click its name to switch to the cluster information page.
  4. Click File Management and go to the File Management tab page.
  5. Select HDFS File List.
  6. Click the data storage directory, for example, bd_app1.

    bd_app1 is just an example. The storage directory can be any directory on the page. You can create a directory by clicking Create Folder.

  7. Click Import Data to configure the paths for HDFS and OBS.
    NOTE:

    When configuring the OBS or HDFS path, click Browse, select the file path, and click OK.

    • The path for OBS
      • Must start with s3a://, which is used by default.
      • Files and programs encrypted by the KMS cannot be imported.
      • Empty folders cannot be imported.
      • Directories and file names can contain letters, Chinese characters, digits, hyphens (-), or underscores (_), but cannot contain special characters (/:*?"<>|\;&,'`!{}[]$).
      • Directories and file names cannot start or end with a period (.).
      • Directories and file names cannot be empty.
      • The full path of OBS contains a maximum of 1023 characters.
    • The path for HDFS
      • It starts with /user by default.
      • Directories and file names can contain letters, Chinese characters, digits, hyphens (-), or underscores (_), but cannot contain special characters (/:*?"<>|\;&,'`!{}[]$).
      • Directories and file names cannot start or end with a period (.).
      • Directories and file names cannot be empty.
      • The full path of HDFS contains a maximum of 1023 characters.
      • When data is imported, the text box for the HDFS path shows the parent HDFS directory from HDFS File List by default.
  8. Click OK.

    View the import progress in File Operation Record. MRS runs the data import as a DistCp job. You can check whether the DistCp job succeeded in Job Management > Job.
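
The path rules in step 7 can be checked before you open the Import Data dialog box. The following Python sketch is only an illustration and is not part of MRS. The function names are hypothetical. It assumes that an HDFS path is absolute (starts with /), that the listed special characters are the only ones rejected, and that the OBS bucket name follows the same naming rules as directories.

```python
# Characters rejected in directory and file names, taken from the rules above.
# "/" is left out here because it separates the names within a path.
FORBIDDEN_CHARS = set(':*?"<>|\\;&,\'`!{}[]$')
MAX_PATH_LENGTH = 1023


def check_names(remainder):
    """Check every directory or file name in the part of the path after the prefix."""
    problems = []
    if remainder.endswith("/"):
        remainder = remainder[:-1]  # tolerate a single trailing slash
    for name in remainder.split("/"):
        if name == "":
            problems.append("empty directory or file name")
        elif name.startswith(".") or name.endswith("."):
            problems.append(f"{name!r} starts or ends with a period")
        elif FORBIDDEN_CHARS & set(name):
            problems.append(f"{name!r} contains a special character")
    return problems


def validate_path(path, prefix):
    """Return a list of rule violations; an empty list means the path passes."""
    if not path.startswith(prefix):
        return [f"path must start with {prefix}"]
    problems = check_names(path[len(prefix):])
    if len(path) > MAX_PATH_LENGTH:
        problems.append(f"path is longer than {MAX_PATH_LENGTH} characters")
    return problems


def validate_obs_path(path):
    return validate_path(path, "s3a://")


def validate_hdfs_path(path):
    # Assumption: HDFS paths are absolute; /user is only the default start.
    return validate_path(path, "/")


print(validate_obs_path("s3a://bucket/input/data.txt"))  # []
print(validate_hdfs_path("/user/bd_app1"))               # []
print(validate_hdfs_path("/user/.hidden"))               # one violation reported
```

A path that passes these checks can still be rejected by the console, for example because the OBS object is encrypted by KMS or the folder is empty. Those conditions cannot be detected from the path string alone.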

Exporting Data

After data is processed and analyzed, you can either store the data in HDFS or export it to the OBS system.

Both files and folders containing files can be exported. The operations are as follows:

  1. Log in to the MRS management console.
  2. Click in the upper-left corner on the management console and select Region and Project.
  3. Choose Cluster > Active Cluster, select a cluster, and click its name to switch to the cluster information page.
  4. Click File Management and go to the File Management tab page.
  5. Select HDFS File List.
  6. Click the data storage directory, for example, bd_app1.
  7. Click Export Data and configure the paths for HDFS and OBS.
    NOTE:

    When configuring the OBS or HDFS path, click Browse, select the file path, and click OK.

    • The path for OBS
      • Must start with s3a://, which is used by default.
      • Empty folders cannot be imported.
      • Directories and file names can contain letters, Chinese characters, digits, hyphens (-), or underscores (_), but cannot contain special characters (/:*?"<>|\;&,'`!{}[]$).
      • Directories and file names cannot start or end with a period (.).
      • Directories and file names cannot be empty.
      • The full path of OBS contains a maximum of 1023 characters.
    • The path for HDFS
      • It starts with /user by default.
      • Directories and file names can contain letters, Chinese characters, digits, hyphens (-), or underscores (_), but cannot contain special characters (/:*?"<>|\;&,'`!{}[]$).
      • Directories and file names cannot start or end with a period (.).
      • Directories and file names cannot be empty.
      • The full path of HDFS contains a maximum of 1023 characters.
      • When data is exported, the text box for the HDFS path shows the parent HDFS directory from HDFS File List by default.
    NOTE:

    Ensure that the exported folder is not empty. If an empty folder is exported to the OBS system, the folder is exported as a file. After the folder is exported, its name is changed, for example, from test to test-$folder$, and its type is file.

  8. Click OK.

    View the export progress in File Operation Record. MRS runs the data export as a DistCp job. You can check whether the DistCp job succeeded in Job Management > Job.
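
Because the export runs as a DistCp job, a roughly comparable copy can be expressed with the standard hadoop distcp command. The following Python sketch only builds and prints such a command and does not run it. The function name, bucket name, and paths are hypothetical, and the exact parameters that MRS passes to its DistCp job are not described in this section.

```python
import shlex


def build_distcp_command(source, destination):
    """Return the argument list for a `hadoop distcp` copy from source to destination."""
    return ["hadoop", "distcp", source, destination]


# Export direction: HDFS directory to an OBS path (both values are examples only).
export_cmd = build_distcp_command("/user/bd_app1/output", "s3a://example-bucket/output")
print(shlex.join(export_cmd))

# To run the copy on a node that has a configured Hadoop client, you could use:
#   import subprocess
#   subprocess.run(export_cmd, check=True)
```

Swapping the two arguments gives the import direction, from OBS to HDFS. Running the command outside the console assumes that the Hadoop client is already configured with access to the OBS bucket.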