General Text¶
Function¶
This API detects and extracts text from images and returns the text and its coordinates in JSON format. It suits inputs such as scanned documents, electronic documents, books, receipts, and forms.
Constraints and Limitations¶
Only images in PNG, JPG, JPEG, BMP, GIF, or TIFF format can be recognized.
Each side of the image must be between 15 and 8,192 pixels long.
The area to be recognized must occupy more than 80% of the image. When scanning a table, ensure that all text and its surrounding area are included in the image.
An image can be rotated to any angle.
Distorted text and text on complex backgrounds (such as outdoor scenery or anti-counterfeit watermarks) cannot be recognized.
Supported languages: Chinese, English, some traditional Chinese, Malay, Ukrainian, Hindi, Russian, Vietnamese, Indonesian, Thai, Arabic, German, Latin, French, Italian, Spanish, Portuguese, Romanian, Polish, Amharic, Japanese, Korean, Turkish, Norwegian, Danish, and Swedish.
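The format and size limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name check_image_constraints is hypothetical, and it assumes the caller already knows the image format and pixel dimensions. The 80% text-area rule and the background constraints cannot be checked this way.

```python
# Limits copied from the constraints listed above; the helper is hypothetical.
SUPPORTED_FORMATS = {"PNG", "JPG", "JPEG", "BMP", "GIF", "TIFF"}
MIN_SIDE_PX = 15
MAX_SIDE_PX = 8192


def check_image_constraints(fmt: str, width: int, height: int) -> list:
    """Return a list of problems found; an empty list means the image passes."""
    problems = []
    if fmt.upper() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    for name, side in (("width", width), ("height", height)):
        if side < MIN_SIDE_PX or side > MAX_SIDE_PX:
            problems.append(
                f"{name} of {side} px is outside {MIN_SIDE_PX}-{MAX_SIDE_PX} px"
            )
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every violated limit at once.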
URI¶
POST /v2/{project_id}/ocr/general-text
Parameter | Mandatory | Description |
---|---|---|
endpoint | Yes | Endpoint, which is the request address for calling an API. The endpoint varies depending on services in different regions. For more details, see Endpoint. |
project_id | Yes | Project ID, which can be obtained by referring to Obtaining the Project ID. |
Request Parameters¶
Parameter | Mandatory | Type | Description |
---|---|---|---|
X-Auth-Token | Yes | String | User token. Used to obtain the permission to use APIs. The token is the value of X-Subject-Token in the response header in Authentication. |
Content-Type | Yes | String | MIME type of the request body. The value is application/json. |
Parameter | Mandatory | Type | Description |
---|---|---|---|
image | No | String | Set either this parameter or url. Base64-encoded image file. The image file has a size limit of 10 MB. No side of the image can be smaller than 15 or larger than 8,192 pixels. Only images in JPEG, JPG, PNG, BMP, GIF, or TIFF format can be recognized. An example is /9j/4AAQSkZJRgABAg.... If the image data contains an unnecessary prefix, the error "The image format is not supported" is reported. |
url | No | String | Set either this parameter or image. Image URL. Currently, the following URLs are supported:
Note
|
detect_direction | No | Boolean | Whether to align the tilted image. The options are as follows:
An image tilted to any angle can be aligned. If this parameter is not specified, false is used by default. If the image to be recognized is tilted, you are advised to set this parameter to true. |
quick_mode | No | Boolean | Whether to enable the quick mode. For a single-line text image (the image contains only one line of text and the text area occupies more than 50% of the image), the recognition results can be returned more quickly when this quick mode is enabled. The options are as follows:
If this parameter is not specified, false is used by default. |
character_mode | No | Boolean | Whether to enable the single-character mode. The options are as follows:
If this parameter is not specified, false is used by default, and single-character information for text lines is not returned. |
language | No | String | Language. If this parameter is not specified, Chinese and English will be used by default. The options are as follows:
|
single_orientation_mode | No | Boolean | Whether to enable the single direction mode. The options are as follows:
If this parameter is not specified, false is used by default. In this case, text in the image is assumed to run in multiple directions. |
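As a rough illustration of the body parameters above, the sketch below builds the JSON body from either raw image bytes or an image URL, never both. The function names build_request_body and strip_data_uri_prefix are hypothetical. Treating a "data:...;base64," header as the "unnecessary prefix" mentioned in the image description is an assumption, not something the table states.

```python
import base64
import json
from typing import Optional


def strip_data_uri_prefix(encoded: str) -> str:
    """Drop a leading 'data:<mime>;base64,' header if present (assumed prefix form)."""
    if encoded.startswith("data:") and "," in encoded:
        return encoded.split(",", 1)[1]
    return encoded


def build_request_body(image_bytes: Optional[bytes] = None,
                       url: Optional[str] = None,
                       **options) -> str:
    """Serialize the request body; exactly one of image_bytes or url is required."""
    if (image_bytes is None) == (url is None):
        raise ValueError("set exactly one of image_bytes or url")
    # Optional flags such as detect_direction or quick_mode pass through as-is.
    body = dict(options)
    if image_bytes is not None:
        body["image"] = base64.b64encode(image_bytes).decode("ascii")
    else:
        body["url"] = url
    return json.dumps(body)
```

For example, build_request_body(url="https://BucketName.obs.xxxx.com/ObjectName", detect_direction=False) yields a body equivalent to the URL-based example request.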
Response Parameters¶
Note
The status code depends on the outcome of the call. For example, 200 indicates that the API call succeeded, and 400 indicates that it failed. The following describes the status codes and the corresponding response parameters.
Status code: 200
Parameter | Type | Description |
---|---|---|
result | Object | Recognition result. This parameter is not returned when the API call fails. |
Parameter | Type | Description |
---|---|---|
direction | Float | Image direction
|
words_block_count | Integer | Number of detected text blocks |
words_block_list | Array of Table 6 | List of recognized text blocks. The output sequence is from left to right and from top to bottom. |
Parameter | Type | Description |
---|---|---|
words | String | Recognition result of a text block |
location | Array<Array<Integer>> | List of location information about a text block, including the 2D coordinates (x, y) of four vertexes in the text area, where the coordinate origin is the upper-left corner of the image, the X axis is horizontal, and the Y axis is vertical. |
confidence | Float | Confidence of a recognized text block |
char_list | Array of Table 7 | Single-character recognition list corresponding to a text block. The output sequence is from left to right and from top to bottom. |
Parameter | Type | Description |
---|---|---|
char | String | Recognition result of a single character |
char_location | Array<Array<Integer>> | List of location information about a single character, including the 2D coordinates (x, y) of four vertexes in the character area, where the coordinate origin is the upper-left corner of the image, the X axis is horizontal, and the Y axis is vertical. |
char_confidence | Float | Confidence of a recognized character |
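Because location and char_location describe four vertexes of a possibly rotated quadrilateral, callers often need an axis-aligned box for cropping or highlighting. The sketch below shows one way to derive it. The helper name bounding_box is hypothetical, and the sample coordinates are taken from the example response on this page.

```python
from typing import List, Tuple


def bounding_box(location: List[List[int]]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) enclosing the four (x, y) vertexes.

    The origin is the upper-left corner of the image, so the smallest y is the top.
    """
    xs = [x for x, _ in location]
    ys = [y for _, y in location]
    return min(xs), min(ys), max(xs), max(ys)


# Vertexes of the text block in the example response.
box = bounding_box([[517, 447], [540, 504], [505, 518], [482, 461]])
print(box)  # (482, 447, 540, 518)
```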
Status code: 400
Parameter | Type | Description |
---|---|---|
error_code | String | Error code returned when the API call fails. This parameter is not returned when the API call succeeds. |
error_msg | String | Error message returned when the API call fails. This parameter is not returned when the API call succeeds. |
Example Request¶
Pass the Base64-encoded image for recognition. The tilt angle of the image is not checked, and the quick mode is disabled.
POST https://{endpoint}/v2/{project_id}/ocr/general-text

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
  "image":"/9j/4AAQSkZJRgABAgEASABIAAD/4RFZRXhpZgAATU0AKgAAAA...",
  "detect_direction":false,
  "quick_mode":false
}
Pass the URL of the image for recognition. The tilt angle of the image is not checked, and the quick mode is disabled.
POST https://{endpoint}/v2/{project_id}/ocr/general-text

Request Header:
Content-Type: application/json
X-Auth-Token: MIINRwYJKoZIhvcNAQcCoIINODCCDTQCAQExDTALBglghkgBZQMEAgEwgguVBgkqhkiG...

Request Body:
{
  "url":"https://BucketName.obs.xxxx.com/ObjectName",
  "detect_direction":false,
  "quick_mode":false
}
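The example requests can be assembled with the Python standard library. This is a minimal sketch under stated assumptions: the function name make_ocr_request is hypothetical, the endpoint is passed as a bare host name, and the caller supplies a valid token and project ID. The function only builds the request object and does not send it.

```python
import json
import urllib.request


def make_ocr_request(endpoint: str, project_id: str, token: str,
                     body: dict) -> urllib.request.Request:
    """Build a POST request matching the header and body layout shown above."""
    url = f"https://{endpoint}/v2/{project_id}/ocr/general-text"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes the headers and body easy to inspect first.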
Example Response¶
Status code: 200
Example response for a successful request
{
  "result" : {
    "direction" : 67.6506,
    "words_block_count" : 1,
    "words_block_list" : [ {
      "words": "Word",
      "confidence" : 0.9999,
      "location" : [ [ 517, 447 ], [ 540, 504 ], [ 505, 518 ], [ 482, 461 ] ],
      "char_list" : [ {
        "char": "Character",
        "char_location" : [ [ 517, 447 ], [ 530, 479 ], [ 495, 493 ], [ 482, 461 ] ],
        "char_confidence" : 0.9999
      }, {
        "char": "Character",
        "char_location" : [ [ 530, 479 ], [ 540, 504 ], [ 505, 518 ], [ 495, 493 ] ],
        "char_confidence" : 0.9999
      } ]
    } ]
  }
}
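A successful response like the one above can be reduced to the recognized strings. The sketch below is illustrative only: extract_words and the min_confidence threshold are hypothetical, not part of the API. It relies only on the result, words_block_list, words, and confidence fields documented in the response tables.

```python
import json
from typing import List


def extract_words(response_text: str, min_confidence: float = 0.0) -> List[str]:
    """Return the words of each text block whose confidence meets the threshold."""
    result = json.loads(response_text).get("result", {})
    return [
        block["words"]
        for block in result.get("words_block_list", [])
        if block.get("confidence", 0.0) >= min_confidence
    ]
```

Blocks keep the order in which the service returns them, so the output preserves the documented left-to-right, top-to-bottom sequence.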
Status code: 400
Example response for a failed request
{
  "error_code": "AIS.0103",
  "error_msg": "The image size does not meet the requirements."
}
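One way to surface a failed call to the caller is to turn the error body into an exception. This is a sketch, not a prescribed pattern: OcrApiError and parse_ocr_response are hypothetical names, and the fallback values for missing fields are assumptions.

```python
class OcrApiError(Exception):
    """Carries the error_code and error_msg fields of a failed response."""

    def __init__(self, error_code: str, error_msg: str):
        super().__init__(f"{error_code}: {error_msg}")
        self.error_code = error_code
        self.error_msg = error_msg


def parse_ocr_response(status_code: int, payload: dict) -> dict:
    """Return the result object on success; raise OcrApiError otherwise."""
    if status_code == 200 and "result" in payload:
        return payload["result"]
    raise OcrApiError(
        payload.get("error_code", "unknown"),  # fallback value is an assumption
        payload.get("error_msg", ""),
    )
```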
Status Codes¶
Status Code | Description |
---|---|
200 | Response for a successful request |
400 | Response for a failed request |
See Status Codes.
Error Codes¶
See Error Codes.