Overview
Table 1 lists the string functions supported by DLI.
Syntax | Value Type | Description |
---|---|---|
ascii(string <str>) | BIGINT | Returns the numeric value of the first character in a string. |
concat(array<T> <a>, array<T> <b>[,...]), concat(string <str1>, string <str2>[,...]) | ARRAY or STRING | Concatenates multiple arrays or multiple strings and returns the result as an ARRAY or a STRING, respectively. This function can take any number of inputs. |
concat_ws(string <separator>, string <str1>, string <str2>[,...]), concat_ws(string <separator>, array<string> <a>) | ARRAY or STRUCT | Returns a string concatenated from multiple input strings that are separated by specified separators. |
char_matchcount(string <str1>, string <str2>) | BIGINT | Returns the number of characters in str1 that appear in str2. |
encode(string <str>, string <charset>) | BINARY | Returns str encoded in the charset format. |
find_in_set(string <str1>, string <str2>) | BIGINT | Returns the position (starting from 1) of str1 in str2, where str2 is a comma-separated list. |
get_json_object(string <json>, string <path>) | STRING | Parses the JSON object in a specified JSON path. The function will return NULL if the JSON object is invalid. |
instr(string <str>, string <substr>) | INT | Returns the index of the first occurrence of substr in str. Returns NULL if either argument is NULL and returns 0 if substr does not exist in str. Note that the first character in str has index 1. |
instr1(string <str1>, string <str2>[, bigint <start_position>[, bigint <nth_appearance>]]) | BIGINT | Returns the position of str2 in str1. |
initcap(string A) | STRING | Converts the first letter of each word of a string to upper case and all other letters to lower case. |
keyvalue(string <str>,[string <split1>,string <split2>,] string <key>) | STRING | Splits str by split1, converts each group into a key-value pair by split2, and returns the value corresponding to the key. |
length(string <str>) | BIGINT | Returns the length of a string. |
lengthb(string <str>) | STRING | Returns the length of a specified string in bytes. |
levenshtein(string A, string B) | INT | Returns the Levenshtein distance between two strings, for example, levenshtein('kitten','sitting') = 3. |
locate(string <substr>, string <str>[, bigint <start_pos>]) | BIGINT | Returns the position of substr in str. |
lower(string A), lcase(string A) | STRING | Converts all characters of a string to lowercase. |
lpad(string <str1>, int <length>, string <str2>) | STRING | Returns a string of a specified length. If the length of the given string (str1) is shorter than the specified length (length), the given string is left-padded with str2 to the specified length. |
ltrim([<trimChars>,] string <str>) | STRING | Trims spaces from the left-hand side of a string, or the characters in trimChars if it is specified. |
parse_url(string urlString, string partToExtract [, string keyToExtract]) | STRING | Returns the specified part of a given URL. Valid values of partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. For example, parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com'. When the second parameter is set to QUERY, the third parameter can be used to extract the value of a specific parameter. For example, parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') returns 'v1'. |
printf(String format, Obj... args) | STRING | Prints the input in a specific format. |
regexp_count(string <source>, string <pattern>[, bigint <start_position>]) | BIGINT | Returns the number of substrings that match a specified pattern in the source, starting from the start_position position. |
regexp_extract(string <source>, string <pattern>[, bigint <groupid>]) | STRING | Matches the string source based on the pattern grouping rule and returns the string content that matches groupid. |
replace(string <str>, string <old>, string <new>) | STRING | Replaces the substring that matches a specified string in a string with another string. |
regexp_replace1(string <source>, string <pattern>, string <replace_string>[, bigint <occurrence>]) | STRING | Replaces the occurrence-th substring in source that matches pattern with replace_string and returns the resulting string. |
regexp_instr(string <source>, string <pattern>[,bigint <start_position>[, bigint <occurrence>[, bigint <return_option>]]]) | BIGINT | Returns the start or end position of the occurrence-th substring that matches pattern, searching from start_position in source. |
regexp_substr(string <source>, string <pattern>[, bigint <start_position>[, bigint <occurrence>]]) | STRING | Returns the occurrence-th substring that matches pattern, searching from start_position in source. |
repeat(string <str>, bigint <n>) | STRING | Repeats a string n times. |
reverse(string <str>) | STRING | Returns a string in reverse order. |
rpad(string <str1>, int <length>, string <str2>) | STRING | Right-pads str1 with str2 to the specified length. |
rtrim([<trimChars>, ]string <str>), rtrim(trailing [<trimChars>] from <str>) | STRING | Trims spaces from the right-hand side of a string, or the characters in trimChars if it is specified. |
soundex(string <str>) | STRING | Returns the soundex string from str, for example, soundex('Miller') = M460. |
space(bigint <n>) | STRING | Returns a specified number of spaces. |
substr(string <str>, bigint <start_position>[, bigint <length>]), substring(string <str>, bigint <start_position>[, bigint <length>]) | STRING | Returns the substring of str, starting from start_position and with a length of length. |
substring_index(string <str>, string <separator>, int <count>) | STRING | Returns the part of str before or after the count-th occurrence of separator. If count is positive, occurrences are counted from the left and the part to the left of that separator is returned. If count is negative, occurrences are counted from the right and the part to the right of that separator is returned. |
split_part(string <str>, string <separator>, bigint <start>[, bigint <end>]) | STRING | Splits a specified string based on a specified separator and returns a substring from the start to end position. |
translate(string|char|varchar input, string|char|varchar from, string|char|varchar to) | STRING | Translates the input string by replacing the characters or string specified by from with the characters or string specified by to. For example, replaces bcd in abcde with BCD using translate("abcde", "bcd", "BCD"). |
trim([<trimChars>,]string <str>), trim([BOTH] [<trimChars>] from <str>) | STRING | Trims spaces from both ends of a string, or the characters in trimChars if it is specified. |
upper(string A), ucase(string A) | STRING | Converts all characters of a string to uppercase. |
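
The following queries are a minimal sketch of how a few of the functions in the table might be called. They assume a DLI SQL queue that follows standard Spark SQL behavior. The input literals are illustrative and not taken from any DLI sample data. Results shown in comments are either the ones stated in the table (levenshtein, soundex, parse_url) or the results these calls produce in standard Spark SQL.

```sql
-- Illustrative queries only. Each statement can be run on its own.

-- Concatenation with and without a separator
SELECT concat('ab', 'cd');                          -- abcd
SELECT concat_ws(',', 'a', 'b', 'c');               -- a,b,c

-- Position lookups (positions start from 1; 0 means "not found")
SELECT find_in_set('ab', 'abc,b,ab,c,def');         -- 3
SELECT instr('hello world', 'o');                   -- 5

-- Padding to a fixed length
SELECT lpad('hi', 5, '?');                          -- ???hi

-- Positive count keeps the left part, negative count keeps the right part
SELECT substring_index('www.apache.org', '.', 2);   -- www.apache
SELECT substring_index('www.apache.org', '.', -1);  -- org

-- Examples quoted in the table
SELECT levenshtein('kitten', 'sitting');            -- 3
SELECT soundex('Miller');                           -- M460
SELECT translate('abcde', 'bcd', 'BCD');            -- aBCDe
SELECT parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1');  -- v1
```

Functions whose optional arguments change the result, such as instr1 and the regexp_* family, are left out of this sketch because their exact behavior depends on the engine version in use.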