6.9. String Functions and Operators

String Operators

The || operator performs concatenation.

String Functions

Note

These functions assume that the input strings contain valid UTF-8 encoded Unicode code points. There are no explicit checks for valid UTF-8, and the functions may return incorrect results on invalid UTF-8. Invalid UTF-8 data can be corrected with from_utf8().

Additionally, the functions operate on Unicode code points and not user-visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.

The lower() and upper() functions do not perform locale-sensitive, context-sensitive, or one-to-many mappings required for some languages. Specifically, this will return incorrect results for Lithuanian, Turkish and Azeri.
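As a small illustration of the code point behavior described in this note, the query below builds an "e" followed by a combining acute accent. The input string is only an example chosen for this sketch; chr(769) is U+0301 COMBINING ACUTE ACCENT.

```sql
-- 'e' followed by U+0301 renders as a single user-perceived character,
-- but it is made of two code points, and length() counts code points.
SELECT length('e' || chr(769)); -- 2
```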

chr(n) → varchar
Returns the Unicode code point n as a single character string.
codepoint(string) → integer
Returns the Unicode code point of the only character of string.
concat(string1, …, stringN) → varchar
Returns the concatenation of string1, string2, …, stringN. This function provides the same functionality as the SQL-standard concatenation operator (||).
hamming_distance(string1, string2) → bigint
Returns the Hamming distance of string1 and string2, i.e. the number of positions at which the corresponding characters are different. Note that the two strings must have the same length.
length(string) → bigint
Returns the length of string in characters.
levenshtein_distance(string1, string2) → bigint
Returns the Levenshtein edit distance of string1 and string2, i.e. the minimum number of single-character edits (insertions, deletions or substitutions) needed to change string1 into string2.
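The two distance functions can be compared on a pair of textbook inputs. The strings below are illustrative and do not come from this page.

```sql
-- Same-length strings that differ at positions 3, 4 and 5.
SELECT hamming_distance('karolin', 'kathrin'); -- 3

-- Two substitutions (k -> s, e -> i) and one insertion (g).
SELECT levenshtein_distance('kitten', 'sitting'); -- 3
```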
lower(string) → varchar
Converts string to lowercase.
lpad(string, size, padstring) → varchar
Left pads string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. size must not be negative and padstring must be non-empty.
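A short sketch of the padding and truncation cases, using made-up inputs:

```sql
-- Pad on the left up to 5 characters.
SELECT lpad('42', 5, '0'); -- '00042'

-- size is smaller than the input length, so the result is truncated to 3 characters.
SELECT lpad('abcdef', 3, '0'); -- 'abc'
```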
ltrim(string) → varchar
Removes leading whitespace from string.
replace(string, search) → varchar
Removes all instances of search from string.
replace(string, search, replace) → varchar
Replaces all instances of search with replace in string.
reverse(string) → varchar
Returns string with the characters in reverse order.
rpad(string, size, padstring) → varchar
Right pads string to size characters with padstring. If size is less than the length of string, the result is truncated to size characters. size must not be negative and padstring must be non-empty.
rtrim(string) → varchar
Removes trailing whitespace from string.
split(string, delimiter) → array(varchar)
Splits string on delimiter and returns an array.
split(string, delimiter, limit) → array(varchar)
Splits string on delimiter and returns an array of size at most limit. The last element in the array always contains everything left in the string. limit must be a positive number.
split_part(string, delimiter, index) → varchar
Splits string on delimiter and returns the field index. Field indexes start with 1. If the index is larger than the number of fields, then null is returned.
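The following queries sketch how limit and index behave, using an illustrative comma-separated input:

```sql
SELECT split('a,b,c', ',');         -- ["a", "b", "c"]

-- With limit 2, the last element keeps the unsplit remainder.
SELECT split('a,b,c', ',', 2);      -- ["a", "b,c"]

-- Field indexes start with 1.
SELECT split_part('a,b,c', ',', 2); -- 'b'

-- Only three fields exist, so index 4 yields null.
SELECT split_part('a,b,c', ',', 4); -- NULL
```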
split_to_map(string, entryDelimiter, keyValueDelimiter) → map
Splits string by entryDelimiter and keyValueDelimiter and returns a map. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value. Note that entryDelimiter and keyValueDelimiter are interpreted literally, i.e., as full string matches.
split_to_map(string, entryDelimiter, keyValueDelimiter, function(k, v1, v2, res)) → map
Splits string by entryDelimiter and keyValueDelimiter and returns a map. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value. Note that entryDelimiter and keyValueDelimiter are interpreted literally, i.e., as full string matches. function(k, v1, v2, res) is invoked in cases of duplicate keys to resolve the value that should be in the map. It takes the key k and the two conflicting values v1 and v2, and returns the resolved value res.

    SELECT(split_to_map('a:1;b:2;a:3', ';', ':', (k, v1, v2) -> v1)); -- {"a": "1", "b": "2"}
    SELECT(split_to_map('a:1;b:2;a:3', ';', ':', (k, v1, v2) -> CONCAT(v1, v2))); -- {"a": "13", "b": "2"}

split_to_multimap(string, entryDelimiter, keyValueDelimiter) → map(varchar, array(varchar))
Splits string by entryDelimiter and keyValueDelimiter and returns a map containing an array of values for each unique key. entryDelimiter splits string into key-value pairs. keyValueDelimiter splits each pair into key and value. The values for each key will be in the same order as they appeared in string. Note that entryDelimiter and keyValueDelimiter are interpreted literally, i.e., as full string matches.
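For comparison with the split_to_map example, the same input passed to split_to_multimap keeps every value of the duplicate key. The output is written in the notation used by the example on this page.

```sql
-- Key "a" appears twice; both values are kept, in the order they appear in the string.
SELECT split_to_multimap('a:1;b:2;a:3', ';', ':'); -- {"a": ["1", "3"], "b": ["2"]}
```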
strpos(string, substring) → bigint
Returns the starting position of the first instance of substring in string. Positions start with 1. If not found, 0 is returned.
strpos(string, substring, instance) → bigint
Returns the position of the N-th instance of substring in string. instance must be a positive number. Positions start with 1. If not found, 0 is returned.
strrpos(string, substring) → bigint
Returns the starting position of the last instance of substring in string. Positions start with 1. If not found, 0 is returned.
strrpos(string, substring, instance) → bigint
Returns the position of the N-th instance of substring in string starting from the end of the string. instance must be a positive number. Positions start with 1. If not found, 0 is returned.
position(substring IN string) → bigint
Returns the starting position of the first instance of substring in string. Positions start with 1. If not found, 0 is returned.
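The position functions can be contrasted on one illustrative input, 'abcabc', in which 'b' occurs at positions 2 and 5:

```sql
SELECT strpos('abcabc', 'b');       -- 2 (first instance)
SELECT strpos('abcabc', 'b', 2);    -- 5 (second instance, counted from the start)
SELECT strrpos('abcabc', 'b');      -- 5 (last instance)
SELECT strrpos('abcabc', 'b', 2);   -- 2 (second instance, counted from the end)
SELECT position('b' IN 'abcabc');   -- 2 (same result as two-argument strpos)
SELECT strpos('abcabc', 'z');       -- 0 (not found)
```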
substr(string, start) → varchar
Returns the rest of string from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string.
substr(string, start, length) → varchar
Returns a substring from string of length length from the starting position start. Positions start with 1. A negative starting position is interpreted as being relative to the end of the string.
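A brief sketch of positive and negative starting positions, using an example string of 11 characters:

```sql
SELECT substr('hello world', 7);      -- 'world'
SELECT substr('hello world', 1, 5);   -- 'hello'

-- A negative start counts back from the end of the string.
SELECT substr('hello world', -5);     -- 'world'
SELECT substr('hello world', -5, 3);  -- 'wor'
```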
trim(string) → varchar
Removes leading and trailing whitespace from string.
upper(string) → varchar
Converts string to uppercase.
word_stem(word) → varchar
Returns the stem of word in the English language.
word_stem(word, lang) → varchar
Returns the stem of word in the lang language.

Unicode Functions

normalize(string) → varchar
Transforms string with NFC normalization form.
normalize(string, form) → varchar
Transforms string with the specified normalization form. form must be one of the following keywords:

Form    Description
NFD     Canonical Decomposition
NFC     Canonical Decomposition, followed by Canonical Composition
NFKD    Compatibility Decomposition
NFKC    Compatibility Decomposition, followed by Canonical Composition

Note

This SQL-standard function has special syntax and requires specifying form as a keyword, not as a string.
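The keyword syntax and the effect of composition and decomposition can be sketched as follows. The inputs are illustrative: chr(233) is U+00E9 (a precomposed "é") and chr(769) is U+0301 COMBINING ACUTE ACCENT.

```sql
-- Default form is NFC: 'e' + U+0301 composes into the single code point U+00E9.
SELECT length(normalize('e' || chr(769))); -- 1

-- NFD is written as a bare keyword, not as the string 'NFD'.
-- U+00E9 decomposes into 'e' + U+0301.
SELECT length(normalize(chr(233), NFD)); -- 2
```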

to_utf8(string) → varbinary
Encodes string into a UTF-8 varbinary representation.
from_utf8(binary) → varchar
Decodes a UTF-8 encoded string from binary. Invalid UTF-8 sequences are replaced with the Unicode replacement character U+FFFD.
from_utf8(binary, replace) → varchar
Decodes a UTF-8 encoded string from binary. Invalid UTF-8 sequences are replaced with replace. The replacement string replace must either be a single character or empty (in which case invalid characters are removed).
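The handling of invalid input can be sketched with a hand-written binary literal. The bytes are an assumption chosen for illustration: 0x68 0x69 encode "hi", and 0xFF is never valid in UTF-8.

```sql
-- Round trip through varbinary.
SELECT from_utf8(to_utf8('hello'));  -- 'hello'

-- The invalid byte becomes U+FFFD by default.
SELECT from_utf8(X'6869FF');         -- 'hi' followed by U+FFFD

-- A single replacement character, or an empty string to drop invalid bytes.
SELECT from_utf8(X'6869FF', '?');    -- 'hi?'
SELECT from_utf8(X'6869FF', '');     -- 'hi'
```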