Creating a Version of a Training Job

Function

This API is used to create a version of a training job.

Calling this API is an asynchronous operation. The job status can be obtained by calling the APIs described in Querying a Training Job List and Querying the Details About a Training Job Version.
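
Because the call is asynchronous, a client typically polls for the job status after the version is created. The following is a minimal sketch of such a polling loop, not part of the API itself. The callables fetch_status and is_finished are hypothetical: fetch_status would wrap one of the query APIs mentioned above, and is_finished would decide which status values count as final (see Job Statuses for the real values).

```python
import time

def wait_for_job(fetch_status, is_finished, interval_s=10.0, timeout_s=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until is_finished(status) is true or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if is_finished(status):
            return status
        if clock() >= deadline:
            raise TimeoutError("training job version did not finish in time")
        sleep(interval_s)

# Demo with a fake status source; 10 is a made-up final status value.
_fake_statuses = iter([1, 1, 10])
final = wait_for_job(lambda: next(_fake_statuses), lambda s: s == 10,
                     sleep=lambda _: None)
print(final)
```

The sleep and clock arguments are injectable so that the loop can be exercised without real waiting.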

URI

POST /v1/{project_id}/training-jobs/{job_id}/versions

Table 1 describes the required parameters.

Table 1 Parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain the project ID, see Obtaining a Project ID. |
| job_id | Yes | Long | ID of a training job |
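
As an illustration of how the two URI parameters are substituted, the following sketch assembles the request URL. The function name build_versions_url and the endpoint and project ID values are placeholders chosen for this example, not values defined by the API.

```python
from urllib.parse import quote

def build_versions_url(endpoint, project_id, job_id):
    """Return the URL for POST /v1/{project_id}/training-jobs/{job_id}/versions."""
    if not project_id:
        raise ValueError("project_id is mandatory")
    if isinstance(job_id, bool) or not isinstance(job_id, int):
        raise TypeError("job_id must be an integer (Long)")
    base = endpoint.rstrip("/")
    # Percent-encode the project ID so it is safe as a single path segment.
    return f"{base}/v1/{quote(project_id, safe='')}/training-jobs/{job_id}/versions"

# Placeholder endpoint and project ID.
print(build_versions_url("https://endpoint", "my-project-id", 10))
```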

Request Body

Table 2 describes the request parameters.

Table 2 Request parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| job_desc | No | String | Description of a training job. The value must contain 0 to 256 characters. By default, this parameter is left blank. |
| config | Yes | Object | Parameters for creating a training job. For details, see Table 3. |

Table 3 config parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| worker_server_num | Yes | Integer | Number of workers in a training job. Obtain the maximum value by calling the API described in Querying Job Resource Specifications. |
| app_url | Yes | String | Code directory of a training job, for example, /usr/app/. This parameter must be used together with boot_file_url. If model_id is set, you do not need to set app_url, boot_file_url, or engine_id. |
| boot_file_url | Yes | String | Boot file of a training job, which must be stored in the code directory, for example, /usr/app/boot.py. This parameter must be used together with app_url. If model_id is set, you do not need to set app_url, boot_file_url, or engine_id. |
| parameter | No | Array<Object> | Running parameters of a training job, specified as a collection of label-value pairs. For details, see the sample request. When a training job uses a custom image, this parameter serves as the container environment variables. For details, see Table 5. |
| data_url | No | String | OBS URL of the dataset required by a training job, for example, /usr/data/. By default, this parameter is left blank. This parameter cannot be used together with data_source, or with dataset_id and dataset_version_id. However, one of these data input options must be specified. |
| dataset_id | No | String | Dataset ID of a training job. This parameter must be used together with dataset_version_id, but cannot be used together with data_url or data_source. |
| dataset_version_id | No | String | Dataset version ID of a training job. This parameter must be used together with dataset_id, but cannot be used together with data_url or data_source. |
| data_source | No | JSON Array | Dataset of a training job. This parameter cannot be used together with data_url, dataset_id, or dataset_version_id. For details, see Table 4. |
| spec_id | Yes | Long | ID of the resource specifications selected for a training job. Obtain the ID by calling the API described in Querying Job Resource Specifications. |
| engine_id | Yes | Long | ID of the engine selected for a training job. The default value is 1. Obtain the ID by calling the API described in Querying Job Engine Specifications. If model_id is set, you do not need to set app_url, boot_file_url, or engine_id. |
| model_id | Yes | Long | ID of the built-in model of a training job. If model_id is set, you do not need to set app_url, boot_file_url, or engine_id. |
| train_url | Yes | String | OBS URL of the output file of a training job. By default, this parameter is left blank. Example value: /bucket/trainUrl/ |
| log_url | No | String | OBS URL of the logs of a training job. By default, this parameter is left blank. Example value: /usr/train/ |
| pre_version_id | Yes | Long | ID of the previous version of a training job. You can obtain the value of version_id by calling the API described in Querying a List of Training Job Versions. |
| user_image_url | No | String | SWR URL of a custom image used by a training job. Example value: 100.125.5.235:20202/jobmng/custom-cpu-base:1.0 |
| user_command | No | String | Boot command used to start the container of a custom image of a training job. The format is bash /home/work/run_train.sh python /home/work/user-job-dir/app/train.py {python_file_parameter}. |
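
The dependencies between the config parameters can be checked on the client before the request is sent. The following sketch encodes one reading of Table 3: either model_id is set, or app_url, boot_file_url, and engine_id are all set; and exactly one data input option is used. The function validate_config is a hypothetical helper, and the service may apply additional or different rules (for example, for custom images).

```python
def validate_config(config):
    """Return a list of problems found in a config object; empty means none found."""
    problems = []

    # Parameters marked mandatory regardless of the other choices.
    for key in ("worker_server_num", "spec_id", "train_url", "pre_version_id"):
        if key not in config:
            problems.append(f"missing mandatory parameter: {key}")

    # Either a built-in model, or code directory + boot file + engine.
    if "model_id" not in config:
        for key in ("app_url", "boot_file_url", "engine_id"):
            if key not in config:
                problems.append(f"{key} is required when model_id is not set")

    # dataset_id and dataset_version_id must appear together.
    has_id = "dataset_id" in config
    has_version = "dataset_version_id" in config
    if has_id != has_version:
        problems.append("dataset_id and dataset_version_id must be used together")

    # Exactly one data input option may be used.
    options = {
        "data_url": "data_url" in config,
        "data_source": "data_source" in config,
        "dataset_id + dataset_version_id": has_id or has_version,
    }
    used = [name for name, present in options.items() if present]
    if len(used) != 1:
        problems.append(f"exactly one data input option is required, found: {used or 'none'}")
    return problems

# The config object from the sample request in this section.
sample_config = {
    "worker_server_num": 1,
    "app_url": "/usr/app/",
    "boot_file_url": "/usr/app/boot.py",
    "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
    "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
    "spec_id": 1,
    "engine_id": 1,
    "train_url": "/usr/train/",
    "log_url": "/usr/log/",
    "pre_version_id": 20,
}
print(validate_config(sample_config))
```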

Table 4 data_source parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| dataset_id | No | String | Dataset ID of a training job. This parameter must be used together with dataset_version, but cannot be used together with data_url. |
| dataset_version | No | String | Dataset version ID of a training job. This parameter must be used together with dataset_id, but cannot be used together with data_url. |
| type | No | String | Dataset type. The value can be obs or dataset. obs and dataset cannot be used at the same time. |
| data_url | No | String | OBS bucket path. This parameter cannot be used together with dataset_id or dataset_version. |
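
The two kinds of data_source entry can be sketched as follows. This example assumes that type obs is paired with data_url and that type dataset is paired with dataset_id and dataset_version. The table implies this pairing but does not state it explicitly. The helper names obs_entry and dataset_entry are hypothetical.

```python
import json

def obs_entry(data_url):
    """Build a data_source entry that reads from an OBS bucket path."""
    return {"type": "obs", "data_url": data_url}

def dataset_entry(dataset_id, dataset_version):
    """Build a data_source entry that reads from a dataset version."""
    return {
        "type": "dataset",
        "dataset_id": dataset_id,
        "dataset_version": dataset_version,
    }

# data_source is a JSON array; the IDs are taken from the sample request.
data_source = [
    dataset_entry(
        "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
    )
]
print(json.dumps(data_source, indent=2))
```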

Table 5 parameter parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| label | No | String | Parameter name |
| value | No | String | Parameter value |
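
Because both label and value are strings, numeric running parameters need to be converted before they are placed in the parameter array. The following sketch shows one way to do this from a plain dictionary. The function name to_parameter_list is hypothetical.

```python
def to_parameter_list(running_parameters):
    """Convert a dict of running parameters into a list of label-value pairs."""
    return [
        {"label": str(name), "value": str(value)}
        for name, value in running_parameters.items()
    ]

# Same running parameters as in the sample request.
print(to_parameter_list({"learning_rate": 0.01, "batch_size": 32}))
```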

Response Body

Table 6 describes the response parameters.

Table 6 Response parameters

| Parameter | Type | Description |
|---|---|---|
| is_success | Boolean | Whether the request is successful |
| error_message | String | Error message of a failed API call. This parameter is not included when the API call succeeds. |
| error_code | String | Error code of a failed API call. For details, see Error Codes. This parameter is not included when the API call succeeds. |
| job_id | Long | ID of a training job |
| job_name | String | Name of a training job |
| status | Integer | Status of a training job. For details about the job statuses, see Job Statuses. |
| create_time | Long | Timestamp when a training job is created |
| version_id | Long | Version ID of a training job |
| version_name | String | Version name of a training job |
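
A client can branch on is_success to separate the two response shapes. The following sketch parses a response body and raises an exception on failure. It assumes that create_time is in milliseconds since the Unix epoch, which the magnitude of the sample value suggests but the table does not state. The function name parse_create_version_response is hypothetical.

```python
import json
from datetime import datetime, timezone

def parse_create_version_response(body):
    """Parse the response body; raise RuntimeError if the call failed."""
    data = json.loads(body)
    if not data.get("is_success"):
        # error_code and error_message are present only on failure.
        raise RuntimeError(f"{data.get('error_code')}: {data.get('error_message')}")
    # Assumption: create_time is a millisecond Unix timestamp.
    data["created_at"] = datetime.fromtimestamp(
        data["create_time"] / 1000, tz=timezone.utc
    )
    return data

ok_body = (
    '{"is_success": true, "job_id": 10, "job_name": "TestModelArtsJob", '
    '"status": 1, "create_time": 1524189990635, "version_id": 10, '
    '"version_name": "V0001"}'
)
result = parse_create_version_response(ok_body)
print(result["version_id"], result["version_name"], result["created_at"].isoformat())
```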

Samples

  1. The following shows how to create a version of the training job whose job_id is 10, with pre_version_id set to 20.

    • Sample request

      POST https://endpoint/v1/{project_id}/training-jobs/10/versions
      {
          "job_desc": "This is a ModelArts job",
          "config": {
              "worker_server_num": 1,
              "app_url": "/usr/app/",
              "boot_file_url": "/usr/app/boot.py",
              "parameter": [
                  {
                      "label": "learning_rate",
                      "value": "0.01"
                  },
                  {
                      "label": "batch_size",
                      "value": "32"
                  }
              ],
              "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
              "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
              "spec_id": 1,
              "engine_id": 1,
              "train_url": "/usr/train/",
              "log_url": "/usr/log/",
              "pre_version_id": 20
          }
      }
      
    • Successful sample response

      {
          "is_success": true,
          "job_id": 10,
          "job_name": "TestModelArtsJob",
          "status": 1,
          "create_time": 1524189990635,
          "version_id": 10,
          "version_name": "V0001"
      }

    • Failed sample response

      {
          "is_success": false,
          "error_message": "Error string",
          "error_code": "ModelArts.0105"
      }
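
The sample request can be prepared from Python as sketched below. The sketch only builds the request object and does not send it. The X-Auth-Token header, the token value, and the endpoint and project ID are assumptions for illustration, because authentication is not described in this section. The function name build_create_version_request is hypothetical.

```python
import json
import urllib.request

def build_create_version_request(endpoint, project_id, job_id, body, token):
    """Build (but do not send) the POST request that creates a job version."""
    url = f"{endpoint.rstrip('/')}/v1/{project_id}/training-jobs/{job_id}/versions"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed authentication header
        },
        method="POST",
    )

body = {
    "job_desc": "This is a ModelArts job",
    "config": {
        "worker_server_num": 1,
        "app_url": "/usr/app/",
        "boot_file_url": "/usr/app/boot.py",
        "parameter": [
            {"label": "learning_rate", "value": "0.01"},
            {"label": "batch_size", "value": "32"},
        ],
        "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
        "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
        "spec_id": 1,
        "engine_id": 1,
        "train_url": "/usr/train/",
        "log_url": "/usr/log/",
        "pre_version_id": 20,
    },
}
req = build_create_version_request("https://endpoint", "my-project-id", 10, body, "my-token")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. That step is omitted here so that the sketch runs without network access.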
    

Status Code

For details about status codes, see Status Code.

Error Codes

See Error Codes.