Scalar functions

Scalar functions are functions that return scalars.

String functions

concat('first_arg', second_arg, [ parameter , ... ])

Concatenates a variable number of arguments into a single string. It ignores NULL values.

Returns: text

  1. cr> select concat('foo', null, 'bar') AS col;
  2. +--------+
  3. | col |
  4. +--------+
  5. | foobar |
  6. +--------+
  7. SELECT 1 row in set (... sec)

You can also use the || operator:

  1. cr> select 'foo' || 'bar' AS col;
  2. +--------+
  3. | col |
  4. +--------+
  5. | foobar |
  6. +--------+
  7. SELECT 1 row in set (... sec)

format('format_string', parameter, [ parameter , ... ])

Formats a string similar to the C function printf. For details about the format string syntax, see formatter

Returns: text

  1. cr> select format('%s.%s', schema_name, table_name) AS fqtable
  2. ... from sys.shards
  3. ... where table_name = 'locations'
  4. ... limit 1;
  5. +---------------+
  6. | fqtable |
  7. +---------------+
  8. | doc.locations |
  9. +---------------+
  10. SELECT 1 row in set (... sec)
  1. cr> select format('%tY', date) AS year
  2. ... from locations
  3. ... group by format('%tY', date)
  4. ... order by 1;
  5. +------+
  6. | year |
  7. +------+
  8. | 1979 |
  9. | 2013 |
  10. +------+
  11. SELECT 2 rows in set (... sec)

substr('string', from, [ count ])

Extracts a part of a string. from specifies where to start and count the length of the part.

Returns: text

  1. cr> select substr('crate.io', 3, 2) AS substr;
  2. +--------+
  3. | substr |
  4. +--------+
  5. | at |
  6. +--------+
  7. SELECT 1 row in set (... sec)

char_length('string')

Counts the number of characters in a string.

Returns: integer

  1. cr> select char_length('crate.io') AS char_length;
  2. +-------------+
  3. | char_length |
  4. +-------------+
  5. | 8 |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

Each character counts only once, regardless of its byte size.

  1. cr> select char_length('©rate.io') AS char_length;
  2. +-------------+
  3. | char_length |
  4. +-------------+
  5. | 8 |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

length(text)

Returns the number of characters in a string.

The same as char_length.

bit_length('string')

Counts the number of bits in a string.

Returns: integer

Note

CrateDB uses UTF-8 encoding internally, which uses between 1 and 4 bytes per character.

  1. cr> select bit_length('crate.io') AS bit_length;
  2. +------------+
  3. | bit_length |
  4. +------------+
  5. | 64 |
  6. +------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select bit_length('©rate.io') AS bit_length;
  2. +------------+
  3. | bit_length |
  4. +------------+
  5. | 72 |
  6. +------------+
  7. SELECT 1 row in set (... sec)

octet_length('string')

Counts the number of bytes (octets) in a string.

Returns: integer

  1. cr> select octet_length('crate.io') AS octet_length;
  2. +--------------+
  3. | octet_length |
  4. +--------------+
  5. | 8 |
  6. +--------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select octet_length('©rate.io') AS octet_length;
  2. +--------------+
  3. | octet_length |
  4. +--------------+
  5. | 9 |
  6. +--------------+
  7. SELECT 1 row in set (... sec)

ascii(string)

Returns the ASCII code of the first character. For UTF-8, returns the Unicode code point of the characters.

Returns: int

  1. cr> SELECT ascii('a') AS a, ascii('🎈') AS b;
  2. +----+--------+
  3. | a | b |
  4. +----+--------+
  5. | 97 | 127880 |
  6. +----+--------+
  7. SELECT 1 row in set (... sec)

chr(int)

Returns the character with the given code. For UTF-8 the argument is treated as a Unicode code point.

Returns: string

  1. cr> SELECT chr(65) AS a;
  2. +---+
  3. | a |
  4. +---+
  5. | A |
  6. +---+
  7. SELECT 1 row in set (... sec)

lower('string')

Converts all characters to lowercase. lower does not perform locale-sensitive or context-sensitive mappings.

Returns: text

  1. cr> select lower('TransformMe') AS lower;
  2. +-------------+
  3. | lower |
  4. +-------------+
  5. | transformme |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

upper('string')

Converts all characters to uppercase. upper does not perform locale-sensitive or context-sensitive mappings.

Returns: text

  1. cr> select upper('TransformMe') as upper;
  2. +-------------+
  3. | upper |
  4. +-------------+
  5. | TRANSFORMME |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

initcap('string')

Converts the first letter of each word to upper case and the rest to lower case (capitalize letters).

Returns: text

  1. cr> select initcap('heLlo WORLD') AS initcap;
  2. +-------------+
  3. | initcap |
  4. +-------------+
  5. | Hello World |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

sha1('string')

Returns: text

Computes the SHA1 checksum of the given string.

  1. cr> select sha1('foo') AS sha1;
  2. +------------------------------------------+
  3. | sha1 |
  4. +------------------------------------------+
  5. | 0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 |
  6. +------------------------------------------+
  7. SELECT 1 row in set (... sec)

md5('string')

Returns: text

Computes the MD5 checksum of the given string.

See sha1 for an example.

replace(text, from, to)

Replaces all occurrences of from in text with to.

  1. cr> select replace('Hello World', 'World', 'Stranger') AS hello;
  2. +----------------+
  3. | hello |
  4. +----------------+
  5. | Hello Stranger |
  6. +----------------+
  7. SELECT 1 row in set (... sec)

translate(string, from, to)

Performs several single-character, one-to-one translation in one operation. It translates string by replacing the characters in the from set, one-to-one positionally, with their counterparts in the to set. If from is longer than to, the function removes the occurrences of the extra characters in from. If there are repeated characters in from, only the first mapping is considered.

Synopsis:

  1. translate(string, from, to)

Examples:

  1. cr> select translate('Crate', 'Ct', 'Dk') as translation;
  2. +-------------+
  3. | translation |
  4. +-------------+
  5. | Drake |
  6. +-------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select translate('Crate', 'rCe', 'c') as translation;
  2. +-------------+
  3. | translation |
  4. +-------------+
  5. | cat |
  6. +-------------+
  7. SELECT 1 row in set (... sec)

trim({LEADING | TRAILING | BOTH} 'str_arg_1' FROM 'str_arg_2')

Removes the longest string containing characters from str_arg_1 (' ' by default) from the start, end, or both ends (BOTH is the default) of str_arg_2.

Synopsis:

  1. trim([ [ {LEADING | TRAILING | BOTH} ] [ str_arg_1 ] FROM ] str_arg_2)

Examples

  1. cr> select trim(BOTH 'ab' from 'abcba') AS trim;
  2. +------+
  3. | trim |
  4. +------+
  5. | c |
  6. +------+
  7. SELECT 1 row in set (... sec)
  1. cr> select trim('ab' from 'abcba') AS trim;
  2. +------+
  3. | trim |
  4. +------+
  5. | c |
  6. +------+
  7. SELECT 1 row in set (... sec)
  1. cr> select trim(' abcba ') AS trim;
  2. +-------+
  3. | trim |
  4. +-------+
  5. | abcba |
  6. +-------+
  7. SELECT 1 row in set (... sec)

ltrim(text, [ trimmingText ])

Removes set of characters which are matching trimmingText (' ' by default) to the left of text.

  1. cr> select ltrim('xxxzzzabcba', 'xz') AS ltrim;
  2. +-------+
  3. | ltrim |
  4. +-------+
  5. | abcba |
  6. +-------+
  7. SELECT 1 row in set (... sec)

rtrim(text, [ trimmingText ])

Removes set of characters which are matching trimmingText (' ' by default) to the right of text.

  1. cr> select rtrim('abcbaxxxzzz', 'xz') AS rtrim;
  2. +-------+
  3. | rtrim |
  4. +-------+
  5. | abcba |
  6. +-------+
  7. SELECT 1 row in set (... sec)

quote_ident(text)

Returns: text

Quotes a provided string argument. Quotes are added only if necessary. For example, if the string contains non-identifier characters, keywords or would be case-folded. Embedded quotes are properly doubled.

The quoted string can be used as an identifier in an SQL statement.

  1. cr> select quote_ident('Column name') AS quoted;
  2. +---------------+
  3. | quoted |
  4. +---------------+
  5. | "Column name" |
  6. +---------------+
  7. SELECT 1 row in set (... sec)

left('string', len)

Returns the first len characters of string when len > 0, otherwise all but last len characters.

Synopsis:

  1. left(string, len)

Examples:

  1. cr> select left('crate.io', 5) AS col;
  2. +-------+
  3. | col |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)
  1. cr> select left('crate.io', -3) AS col;
  2. +-------+
  3. | col |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

right('string', len)

Returns the last len characters in string when len > 0, otherwise all but first len characters.

Synopsis:

  1. right(string, len)

Examples:

  1. cr> select right('crate.io', 2) AS col;
  2. +-----+
  3. | col |
  4. +-----+
  5. | io |
  6. +-----+
  7. SELECT 1 row in set (... sec)
  1. cr> select right('crate.io', -6) AS col;
  2. +-----+
  3. | col |
  4. +-----+
  5. | io |
  6. +-----+
  7. SELECT 1 row in set (... sec)

lpad('string1', len[, 'string2'])

Fill up string1 to length len by prepending the characters string2 (a space by default). If string1 is already longer than len then it is truncated (on the right).

Synopsis:

  1. lpad(string1, len[, string2])

Example:

  1. cr> select lpad(' I like CrateDB!!', 41, 'yes! ') AS col;
  2. +-------------------------------------------+
  3. | col |
  4. +-------------------------------------------+
  5. | yes! yes! yes! yes! yes! I like CrateDB!! |
  6. +-------------------------------------------+
  7. SELECT 1 row in set (... sec)

rpad('string1', len[, 'string2'])

Fill up string1 to length len by appending the characters string2 (a space by default). If string1 is already longer than len then it is truncated.

Synopsis:

  1. rpad(string1, len[, string2])

Example:

  1. cr> select rpad('Do you like Crate?', 38, ' yes!') AS col;
  2. +----------------------------------------+
  3. | col |
  4. +----------------------------------------+
  5. | Do you like Crate? yes! yes! yes! yes! |
  6. +----------------------------------------+
  7. SELECT 1 row in set (... sec)

Note

In both cases, the scalar functions lpad and rpad, do now accept a len greater than 50000.

encode(bytea, format)

Encode takes a binary string (hex format) and returns a text encoding using the specified format. Supported formats are: base64, hex, and escape. The escape format replaces unprintable characters with octal byte notation like \nnn. For the reverse function, see decode().

Synopsis:

  1. encode(string1, format)

Example:

  1. cr> select encode(E'123\b\t56', 'base64') AS col;
  2. +--------------+
  3. | col |
  4. +--------------+
  5. | MTIzCAk1Ng== |
  6. +--------------+
  7. SELECT 1 row in set (... sec)

decode(text, format)

Decodes a text encoded string using the specified format and returns a binary string (hex format). Supported formats are: base64, hex, and escape. For the reverse function, see encode().

Synopsis:

  1. decode(text1, format)

Example:

  1. cr> select decode('T\214', 'escape') AS col;
  2. +--------+
  3. | col |
  4. +--------+
  5. | \x548c |
  6. +--------+
  7. SELECT 1 row in set (... sec)

repeat(text, integer)

Repeats a string the specified number of times.

If the number of repetitions is equal or less than zero then the function returns an empty string.

Returns: text

  1. cr> select repeat('ab', 3) AS repeat;
  2. +--------+
  3. | repeat |
  4. +--------+
  5. | ababab |
  6. +--------+
  7. SELECT 1 row in set (... sec)

split_part(text, text, integer)

Splits a string into parts using a delimiter and returns the part at the given index. The first part is addressed by index 1.

Special Cases:

  • Returns the empty string if the index is greater than the number of parts.

  • If any of the arguments is NULL, the result is NULL.

  • If the delimiter is the empty string, the input string is considered as consisting of exactly one part.

Returns: text

Synopsis:

  1. split_part(string, delimiter, index)

Example:

  1. cr> select split_part('ab--cdef--gh', '--', 2) AS part;
  2. +------+
  3. | part |
  4. +------+
  5. | cdef |
  6. +------+
  7. SELECT 1 row in set (... sec)

Date and time functions

date_trunc('interval', ['timezone',] timestamp)

Returns: timestamp with time zone

Limits a timestamps precision to a given interval.

Valid intervals are:

  • second

  • minute

  • hour

  • day

  • week

  • month

  • quarter

  • year

Valid values for timezone are either the name of a time zone (for example ‘Europe/Vienna’) or the UTC offset of a time zone (for example ‘+01:00’). To get a complete overview of all possible values take a look at the available time zones supported by Joda-Time.

The following example shows how to use the date_trunc function to generate a day based histogram in the Europe/Moscow timezone:

  1. cr> select
  2. ... date_trunc('day', 'Europe/Moscow', date) as day,
  3. ... count(*) as num_locations
  4. ... from locations
  5. ... group by 1
  6. ... order by 1;
  7. +---------------+---------------+
  8. | day | num_locations |
  9. +---------------+---------------+
  10. | 308523600000 | 4 |
  11. | 1367352000000 | 1 |
  12. | 1373918400000 | 8 |
  13. +---------------+---------------+
  14. SELECT 3 rows in set (... sec)

If you don’t specify a time zone, truncate uses UTC time:

  1. cr> select date_trunc('day', date) as day, count(*) as num_locations
  2. ... from locations
  3. ... group by 1
  4. ... order by 1;
  5. +---------------+---------------+
  6. | day | num_locations |
  7. +---------------+---------------+
  8. | 308534400000 | 4 |
  9. | 1367366400000 | 1 |
  10. | 1373932800000 | 8 |
  11. +---------------+---------------+
  12. SELECT 3 rows in set (... sec)

extract(field from source)

extract is a special expression that translates to a function which retrieves subfields such as day, hour or minute from a timestamp.

The return type depends on the used field.

Synopsis

  1. EXTRACT( field FROM expression )

field

An identifier or string literal which identifies the part of the timestamp that should be extracted.

expression

An expression that resolves to a timestamp data type with or without timezone or is castable to timestamp data types.

  1. cr> select extract(day from '2014-08-23') AS day;
  2. +-----+
  3. | day |
  4. +-----+
  5. | 23 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

source must be an expression that returns a timestamp. In case the expression has a different return type but is known to be castable to timestamp an implicit cast will be attempted.

field is an identifier that selects which part of the timestamp to extract. The following fields are supported:

CENTURY

Return type: integer

century of era

Returns the ISO representation which is a straight split of the date.

Year 2000 century 20 and year 2001 is also century 20. This is different to the GregorianJulian (GJ) calendar system where 2001 would be century 21.

YEAR

Return type: integer

the year field

QUARTER

Return type: integer

the quarter of the year (1 - 4)

MONTH

Return type: integer

the month of the year

WEEK

Return type: integer

the week of the year

DAY

Return type: integer

the day of the month

DAY_OF_MONTH

Return type: integer

same as day

DAY_OF_WEEK

Return type: integer

day of the week. Starting with Monday (1) to Sunday (7)

DOW

Return type: integer

same as day_of_week

DAY_OF_YEAR

Return type: integer

the day of the year (1 - 365 / 366)

DOY

Return type: integer

same as day_of_year

HOUR

Return type: integer

the hour field

MINUTE

Return type: integer

the minute field

SECOND

Return type: integer

the second field

EPOCH

Return type: double precision

The number of seconds since Jan 1, 1970.

Can be negative if earlier than Jan 1, 1970.

CURRENT_TIME

The CURRENT_TIME expression returns the time in microseconds since midnight UTC at the time the SQL statement was handled. Clock time is looked up at most once within the scope of a single query, to ensure that multiple occurrences of CURRENT_TIME evaluate to the same value.

synopsis:

  1. CURRENT_TIME [ ( precision ) ]

precision must be a positive integer between 0 and 6. The default value is 6. It determines the number of fractional seconds to output. A value of 0 means the time will have second precision, no fractional seconds (microseconds) are given.

Note

No guarantee is provided about the accuracy of the underlying clock, results may be limited to millisecond precision, depending on the system.

CURRENT_TIMESTAMP

The CURRENT_TIMESTAMP expression returns the timestamp in milliseconds since epoch at the time the SQL statement was handled. Therefore, the same timestamp value is returned for every invocation of a single statement.

Synopsis:

  1. CURRENT_TIMESTAMP [ ( precision ) ]

Here, precision must be a positive integer between 0 and 3. The default value is 3. This value determines the number of fractional seconds to output. A value of 0 means the timestamp will have second precision, no fractional seconds (milliseconds) are given.

Tip

To get an offset value of CURRENT_TIMESTAMP (e.g., this same time one day ago), you can add or subtract an interval, like so:

  1. CURRENT_TIMESTAMP - '1 day'::interval

Note

If the CURRENT_TIMESTAMP function is used in Generated columns it behaves slightly different in UPDATE operations. In such a case the actual timestamp of each row update is returned.

The return value of expressions CURRENT_TIMESTAMP and CURRENT_TIME depends on the system clock and the JVM implementation.

CURDATE()

The CURDATE() scalar function is an alias of the CURRENT_DATE expression. See CURRENT_DATE.

synopsis:

  1. CURDATE()

CURRENT_DATE

The CURRENT_DATE expression returns the date in UTC timezone at the time the SQL statement was handled. Clock time is looked up at most once within the scope of a single query, to ensure that multiple occurrences of CURRENT_DATE evaluate to the same value.

synopsis:

  1. CURRENT_DATE

now()

Returns the current date and time.

This is the same as current_timestamp

Returns: timestamp with time zone

Synopsis:

  1. now()

date_format([format_string, [timezone,]] timestamp)

The date_format function formats a timestamp as string according to the (optional) format string.

Returns: text

Synopsis

  1. DATE_FORMAT( [ format_string, [ timezone, ] ] timestamp )

The only mandatory argument is the timestamp value to format. It can be any expression that is safely convertible to timestamp data type with or without timezone.

Format

The syntax for the format_string is 100% compatible to the syntax of the MySQL date_format function. For reference, the format is listed in detail below 1:

date_format Format

Format Specifier

Description

%a

Abbreviated weekday name (Sun..Sat)

%b

Abbreviated month name (Jan..Dec)

%c

Month in year, numeric (0..12)

%D

Day of month as ordinal number (1st, 2nd, … 24th)

%d

Day of month, padded to 2 digits (00..31)

%e

Day of month (0..31)

%f

Microseconds, padded to 6 digits (000000..999999)

%H

Hour in 24-hour clock, padded to 2 digits (00..23)

%h

Hour in 12-hour clock, padded to 2 digits (01..12)

%I

Hour in 12-hour clock, padded to 2 digits (01..12)

%i

Minutes, numeric (00..59)

%j

Day of year, padded to 3 digits (001..366)

%k

Hour in 24-hour clock (0..23)

%l

Hour in 12-hour clock (1..12)

%M

Month name (January..December)

%m

Month in year, numeric, padded to 2 digits (00..12)

%p

AM or PM

%r

Time, 12-hour (hh:mm:ss followed by AM or PM)

%S

Seconds, padded to 2 digits (00..59)

%s

Seconds, padded to 2 digits (00..59)

%T

Time, 24-hour (hh:mm:ss)

%U

Week number, sunday as first day of the week, first week of the year (01) is the one starting in this year, week 00 starts in last year (00..53)

%u

Week number, monday as first day of the week, first week of the year (01) is the one with at least 4 days in this year (00..53)

%V

Week number, sunday as first day of the week, first week of the year (01) is the one starting in this year, uses the week number of the last year, if the week started in last year (01..53)

%v

Week number, monday as first day of the week, first week of the year (01) is the one with at least 4 days in this year, uses the week number of the last year, if the week started in last year (01..53)

%W

Weekday name (Sunday..Saturday)

%w

Day of the week (0=Sunday..6=Saturday)

%X

weekyear, sunday as first day of the week, numeric, four digits; used with %V

%x

weekyear, monday as first day of the week, numeric, four digits; used with %v

%Y

Year, numeric, four digits

%y

Year, numeric, two digits

%%

A literal ‘%’ character

%x

x, for any ‘x’ not listed above

If no format_string is given the default format will be used:

  1. %Y-%m-%dT%H:%i:%s.%fZ
  1. cr> select date_format('1970-01-01') as epoque;
  2. +-----------------------------+
  3. | epoque |
  4. +-----------------------------+
  5. | 1970-01-01T00:00:00.000000Z |
  6. +-----------------------------+
  7. SELECT 1 row in set (... sec)

Timezone

Valid values for timezone are either the name of a time zone (for example ‘Europe/Vienna’) or the UTC offset of a time zone (for example ‘+01:00’). To get a complete overview of all possible values take a look at the available time zones supported by Joda-Time.

The timezone will be UTC if not provided:

  1. cr> select date_format('%W the %D of %M %Y %H:%i %p', 0) as epoque;
  2. +-------------------------------------------+
  3. | epoque |
  4. +-------------------------------------------+
  5. | Thursday the 1st of January 1970 00:00 AM |
  6. +-------------------------------------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select date_format('%Y/%m/%d %H:%i', 'EST', 0) as est_epoque;
  2. +------------------+
  3. | est_epoque |
  4. +------------------+
  5. | 1969/12/31 19:00 |
  6. +------------------+
  7. SELECT 1 row in set (... sec)

timezone(timezone, timestamp)

The timezone scalar function converts values of timestamp without time zone to/from timestamp with time zone.

Synopsis

  1. TIMEZONE(timezone, timestamp)

It has two variants depending on the type of timestamp:

Type of timestamp

Return Type

Description

timestamp without time zone OR bigint

timestamp with time zone

Treat given timestamp without time zone as located in the specified timezone

timestamp with time zone

timestamp without time zone

Convert given timestamp with time zone to the new timezone with no time zone designation

  1. cr> select
  2. ... 257504400000 as no_tz,
  3. ... date_format('%Y-%m-%d %h:%i', 257504400000) as no_tz_str,
  4. ... timezone('Europe/Madrid', 257504400000) as in_madrid,
  5. ... date_format('%Y-%m-%d %h:%i', timezone('Europe/Madrid',
  6. ... 257504400000)) as in_madrid_str;
  7. +--------------+------------------+--------------+------------------+
  8. | no_tz | no_tz_str | in_madrid | in_madrid_str |
  9. +--------------+------------------+--------------+------------------+
  10. | 257504400000 | 1978-02-28 09:00 | 257500800000 | 1978-02-28 08:00 |
  11. +--------------+------------------+--------------+------------------+
  12. SELECT 1 row in set (... sec)
  1. cr> select
  2. ... timezone('Europe/Madrid',
  3. ... '1978-02-28T10:00:00.000+01:00'::timestamp with time zone) as epoque,
  4. ... date_format('%Y-%m-%d %h:%i', timezone('Europe/Madrid',
  5. ... '1978-02-28T10:00:00.000+01:00'::timestamp with time zone)) as epoque_str;
  6. +--------------+------------------+
  7. | epoque | epoque_str |
  8. +--------------+------------------+
  9. | 257508000000 | 1978-02-28 10:00 |
  10. +--------------+------------------+
  11. SELECT 1 row in set (... sec)
  1. cr> select
  2. ... timezone('Europe/Madrid',
  3. ... '1978-02-28T10:00:00.000+01:00'::timestamp without time zone) as epoque,
  4. ... date_format('%Y-%m-%d %h:%i', timezone('Europe/Madrid',
  5. ... '1978-02-28T10:00:00.000+01:00'::timestamp without time zone)) as epoque_str;
  6. +--------------+------------------+
  7. | epoque | epoque_str |
  8. +--------------+------------------+
  9. | 257504400000 | 1978-02-28 09:00 |
  10. +--------------+------------------+
  11. SELECT 1 row in set (... sec)

to_char(expression, format_string)

The to_char function converts a timestamp or interval value to a string, based on a given format string.

Returns: text

Synopsis

  1. TO_CHAR( expression, format_string )

expression can be any value with the type of timestamp (with or without a timezone) or interval.

Format

The syntax for the format_string differs based the type of the expression. For timestamp expressions, the format_string is a template string containing any of the following symbols:

Pattern

Description

HH / HH12

Hour of day (01-12)

HH24

Hour of day (00-23)

MI

Minute (00-59)

SS

Second (00-59)

MS

Millisecond (000-999)

US

Microsecond (000000-999999)

FF1

Tenth of second (0-9)

FF2

Hundredth of second (00-99)

FF3

Millisecond (000-999)

FF4

Tenth of millisecond (0000-9999)

FF5

Hundredth of millisecond (00000-99999)

FF6

Microsecond (000000-999999)

SSSS / SSSSS

Seconds past midnight (0-86399)

AM / am / PM / pm

Meridiem indicator

A.M. / a.m. / P.M. / p.m.

Meridiem indicator (with periods)

Y,YYY

4 digit year with comma

YYYY

4 digit year

YYY

Last 3 digits of year

YY

Last 2 digits of year

Y

Last digit of year

IYYY

4 digit ISO-8601 week-numbering year

IYY

Last 3 digits of ISO-8601 week-numbering year

IY

Last 2 digits of ISO-8601 week-numbering year

I

Last digit of ISO-8601 week-numbering year

BC / bc / AD / ad

Era indicator

B.C. / b.c. / A.D. / a.d.

Era indicator with periods

MONTH / Month / month

Full month name (uppercase/capitalized/lowercase) padded to 9 characters

MON / Mon / mon

Short month name (uppercase/capitalized/lowercase) padded to 9 characters

MM

Month number (01-12)

DAY / Day / day

Full day name (uppercase/capitalized/lowercase) padded to 9 characters

DY / Dy / dy

Short, 3 character day name (uppercase/capitalized/lowercase)

DDD

Day of year (001-366)

IDDD

Day of ISO-8601 week-numbering year, where the first Monday of the first ISO week is day 1 (001-371)

DD

Day of month (01-31)

D

Day of the week, from Sunday (1) to Saturday (7)

ID

ISO-8601 day of the week, from Monday (1) to Sunday (7)

W

Week of month (1-5)

WW

Week number of year (1-53)

IW

Week number of ISO-8601 week-numbering year (01-53)

CC

Century

J

Julian Day

Q

Quarter

RM / rm

Month in Roman numerals (uppercase/lowercase)

TZ / tz

Time-zone abbreviation (uppercase/lowercase)

TZH

Time-zone hours

TZM

Time-zone minutes

OF

Time-zone offset from UTC

  1. cr> select to_char(timestamp '1970-01-01T17:31:12', 'Day, Month DD - HH12:MI AM YYYY AD') as ts;
  2. +-----------------------------------------+
  3. | ts |
  4. +-----------------------------------------+
  5. | Thursday, January 01 - 05:31 PM 1970 AD |
  6. +-----------------------------------------+
  7. SELECT 1 row in set (... sec)

For interval expressions, the formatting string accepts the same tokens as timestamp expressions. The function then uses the timestamp of the specified interval added to the timestamp of 0000/01/01 00:00:00:

  1. cr> select to_char(interval '1 year 3 weeks 200 minutes', 'YYYY MM DD HH12:MI:SS') as interval;
  2. +---------------------+
  3. | interval |
  4. +---------------------+
  5. | 0001 01 22 03:20:00 |
  6. +---------------------+
  7. SELECT 1 row in set (... sec)

Geo functions

distance(geo_point1, geo_point2)

Returns: double precision

The distance function can be used to calculate the distance between two points on earth. It uses the Haversine formula which gives great-circle distances between 2 points on a sphere based on their latitude and longitude.

The return value is the distance in meters.

Below is an example of the distance function where both points are specified using WKT. See geo_point for more information on the implicit type casting of geo points:

  1. cr> select distance('POINT (10 20)', 'POINT (11 21)') AS col;
  2. +-------------------+
  3. | col |
  4. +-------------------+
  5. | 152354.3209044634 |
  6. +-------------------+
  7. SELECT 1 row in set (... sec)

This scalar function can always be used in both the WHERE and ORDER BY clauses. With the limitation that one of the arguments must be a literal and the other argument must be a column reference.

Note

The algorithm of the calculation which is used when the distance function is used as part of the result column list has a different precision than what is stored inside the index which is utilized if the distance function is part of a WHERE clause.

For example if select distance(...) returns 0.0 an equality check with where distance(...) = 0 might not yield anything at all due to the precision difference.

within(shape1, shape2)

Returns: boolean

The within function returns true if shape1 is within shape2. If that is not the case false is returned.

shape1 can either be a geo_shape or a geo_point. shape2 must be a geo_shape.

Below is an example of the within function which makes use of the implicit type casting from strings in WKT representation to geo point and geo shapes:

  1. cr> select within(
  2. ... 'POINT (10 10)',
  3. ... 'POLYGON ((5 5, 10 5, 10 10, 5 10, 5 5))'
  4. ... ) AS is_within;
  5. +-----------+
  6. | is_within |
  7. +-----------+
  8. | TRUE |
  9. +-----------+
  10. SELECT 1 row in set (... sec)

This function can always be used within the WHERE clause.

intersects(geo_shape, geo_shape)

Returns: boolean

The intersects function returns true if both argument shapes share some points or area, they overlap. This also includes two shapes where one lies within the other.

If false is returned, both shapes are considered disjoint.

Example:

  1. cr> select
  2. ... intersects(
  3. ... {type='Polygon', coordinates=[
  4. ... [[13.4252, 52.7096],[13.9416, 52.0997],
  5. ... [12.7221, 52.1334],[13.4252, 52.7096]]]},
  6. ... 'LINESTRING(13.9636 52.6763, 13.2275 51.9578,
  7. ... 12.9199 52.5830, 11.9970 52.6830)'
  8. ... ) as intersects,
  9. ... intersects(
  10. ... {type='Polygon', coordinates=[
  11. ... [[13.4252, 52.7096],[13.9416, 52.0997],
  12. ... [12.7221, 52.1334],[13.4252, 52.7096]]]},
  13. ... 'LINESTRING (11.0742 49.4538, 11.5686 48.1367)'
  14. ... ) as disjoint;
  15. +------------+----------+
  16. | intersects | disjoint |
  17. +------------+----------+
  18. | TRUE | FALSE |
  19. +------------+----------+
  20. SELECT 1 row in set (... sec)

Due to a limitation on the geo_shape datatype this function cannot be used in the ORDER BY clause.

latitude(geo_point) and longitude(geo_point)

Returns: double precision

The latitude and longitude function return the coordinates of latitude or longitude of a point, or NULL if not available. The input must be a column of type geo_point, a valid WKT string or a double precision array. See geo_point for more information on the implicit type casting of geo points.

Example:

  1. cr> select mountain, height, longitude(coordinates) as "lon", latitude(coordinates) as "lat"
  2. ... from sys.summits order by height desc limit 1;
  3. +------------+--------+---------+---------+
  4. | mountain | height | lon | lat |
  5. +------------+--------+---------+---------+
  6. | Mont Blanc | 4808 | 6.86444 | 45.8325 |
  7. +------------+--------+---------+---------+
  8. SELECT 1 row in set (... sec)

Below is an example of the latitude/longitude functions which make use of the implicit type casting from strings to geo point:

  1. cr> select latitude('POINT (10 20)') AS lat, longitude([10.0, 20.0]) AS long;
  2. +------+------+
  3. | lat | long |
  4. +------+------+
  5. | 20.0 | 10.0 |
  6. +------+------+
  7. SELECT 1 row in set (... sec)

geohash(geo_point)

Returns: text

Returns a GeoHash representation based on full precision (12 characters) of the input point, or NULL if not available. The input has to be a column of type geo_point, a valid WKT string or a double precision array. See geo_point for more information of the implicit type casting of geo points.

Example:

  1. cr> select mountain, height, geohash(coordinates) as "geohash" from sys.summits
  2. ... order by height desc limit 1;
  3. +------------+--------+--------------+
  4. | mountain | height | geohash |
  5. +------------+--------+--------------+
  6. | Mont Blanc | 4808 | u0huspw99j1r |
  7. +------------+--------+--------------+
  8. SELECT 1 row in set (... sec)

Mathematical functions

All mathematical functions can be used within WHERE and ORDER BY clauses.

abs(number)

Returns the absolute value of the given number in the datatype of the given number:

  1. cr> select abs(214748.0998) AS a, abs(0) AS b, abs(-214748) AS c;
  2. +-------------+---+--------+
  3. | a | b | c |
  4. +-------------+---+--------+
  5. | 214748.0998 | 0 | 214748 |
  6. +-------------+---+--------+
  7. SELECT 1 row in set (... sec)

ceil(number)

Returns the smallest integer or long value that is not less than the argument.

Returns: bigint or integer

Return value will be of type integer if the input value is an integer or float. If the input value is of type bigint or double precision the return value will be of type bigint:

  1. cr> select ceil(29.9) AS col;
  2. +-----+
  3. | col |
  4. +-----+
  5. | 30 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

ceiling(number)

This is an alias for ceil.

degrees(double precision)

Convert the given radians value to degrees.

Returns: double precision

  1. cr> select degrees(0.5) AS degrees;
  2. +-------------------+
  3. | degrees |
  4. +-------------------+
  5. | 28.64788975654116 |
  6. +-------------------+
  7. SELECT 1 row in set (... sec)

exp(number)

Returns Euler’s number e raised to the power of the given numeric value. The output will be cast to the given input type and thus may loose precision.

Returns: Same as input type.

  1. cr> select exp(1.0) AS exp;
  2. +-------------------+
  3. | exp |
  4. +-------------------+
  5. | 2.718281828459045 |
  6. +-------------------+
  7. SELECT 1 row in set (... sec)

floor(number)

Returns the largest integer or long value that is not greater than the argument.

Returns: bigint or integer

Return value will be an integer if the input value is an integer or a float. If the input value is of type bigint or double precision the return value will be of type bigint.

See below for an example:

  1. cr> select floor(29.9) AS floor;
  2. +-------+
  3. | floor |
  4. +-------+
  5. | 29 |
  6. +-------+
  7. SELECT 1 row in set (... sec)

ln(number)

Returns the natural logarithm of given number.

Returns: double precision

See below for an example:

  1. cr> SELECT ln(1) AS ln;
  2. +-----+
  3. | ln |
  4. +-----+
  5. | 0.0 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

Note

An error is returned for arguments which lead to undefined or illegal results. E.g. ln(0) results in minus infinity, and therefore, an error is returned.

log(x : number, b : number)

Returns the logarithm of given x to base b.

Returns: double precision

See below for an example, which essentially is the same as above:

  1. cr> SELECT log(100, 10) AS log;
  2. +-----+
  3. | log |
  4. +-----+
  5. | 2.0 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

The second argument (b) is optional. If not present, base 10 is used:

  1. cr> SELECT log(100) AS log;
  2. +-----+
  3. | log |
  4. +-----+
  5. | 2.0 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

Note

An error is returned for arguments which lead to undefined or illegal results. E.g. log(0) results in minus infinity, and therefore, an error is returned.

The same is true for arguments which lead to a division by zero, as e.g. log(10, 1) does.

modulus(y, x)

Returns the remainder of y/x.

Returns: Same as argument types.

  1. cr> select modulus(5, 4) AS mod;
  2. +-----+
  3. | mod |
  4. +-----+
  5. | 1 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

mod(y, x)

This is an alias for modulus.

power(a: number, b: number)

Returns the given argument a raised to the power of argument b.

Returns: double precision

The return type of the power function is always double precision, even when both the inputs are integral types, in order to be consistent across positive and negative exponents (which will yield decimal types).

See below for an example:

  1. cr> SELECT power(2,3) AS pow;
  2. +-----+
  3. | pow |
  4. +-----+
  5. | 8.0 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

radians(double precision)

Convert the given degrees value to radians.

Returns: double precision

  1. cr> select radians(45.0) AS radians;
  2. +--------------------+
  3. | radians |
  4. +--------------------+
  5. | 0.7853981633974483 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

random()

The random function returns a random value in the range 0.0 <= X < 1.0.

Returns: double precision

Note

Every call to random will yield a new random number.

gen_random_text_uuid()

Returns a random time based UUID as text. The returned ID is similar to flake IDs and well suited for use as primary key value.

Note that the ID is opaque (i.e., not to be considered meaningful in any way) and the implementation is free to change.

round(number)

If the input is of type double precision or bigint the result is the closest bigint to the argument, with ties rounding up.

If the input is of type real or integer the result is the closest integer to the argument, with ties rounding up.

Returns: bigint or integer

See below for an example:

  1. cr> select round(42.2) AS round;
  2. +-------+
  3. | round |
  4. +-------+
  5. | 42 |
  6. +-------+
  7. SELECT 1 row in set (... sec)

trunc(number[, precision])

Returns number truncated to the specified precision (decimal places).

When precision is not specified, the result’s type is an integer, or bigint. When it is specified, the result’s type is double precision. Notice that trunc(number) and trunc(number, 0) return different result types.

See below for examples:

  1. cr> select trunc(29.999999, 3) AS trunc;
  2. +--------+
  3. | trunc |
  4. +--------+
  5. | 29.999 |
  6. +--------+
  7. SELECT 1 row in set (... sec)
  8. cr> select trunc(29.999999) AS trunc;
  9. +-------+
  10. | trunc |
  11. +-------+
  12. | 29 |
  13. +-------+
  14. SELECT 1 row in set (... sec)

sqrt(number)

Returns the square root of the argument.

Returns: double precision

See below for an example:

  1. cr> select sqrt(25.0) AS sqrt;
  2. +------+
  3. | sqrt |
  4. +------+
  5. | 5.0 |
  6. +------+
  7. SELECT 1 row in set (... sec)

sin(number)

Returns the sine of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT sin(1) AS sin;
  2. +--------------------+
  3. | sin |
  4. +--------------------+
  5. | 0.8414709848078965 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

asin(number)

Returns the arcsine of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT asin(1) AS asin;
  2. +--------------------+
  3. | asin |
  4. +--------------------+
  5. | 1.5707963267948966 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

cos(number)

Returns the cosine of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT cos(1) AS cos;
  2. +--------------------+
  3. | cos |
  4. +--------------------+
  5. | 0.5403023058681398 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

acos(number)

Returns the arccosine of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT acos(-1) AS acos;
  2. +-------------------+
  3. | acos |
  4. +-------------------+
  5. | 3.141592653589793 |
  6. +-------------------+
  7. SELECT 1 row in set (... sec)

tan(number)

Returns the tangent of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT tan(1) AS tan;
  2. +--------------------+
  3. | tan |
  4. +--------------------+
  5. | 1.5574077246549023 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

cot(number)

Returns the cotangent of the argument that represents the angle expressed in radians. The range of the argument is all real numbers. The cotangent of zero is undefined and returns Infinity.

Returns: double precision

See below for an example:

  1. cr> select cot(1) AS cot;
  2. +--------------------+
  3. | cot |
  4. +--------------------+
  5. | 0.6420926159343306 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

atan(number)

Returns the arctangent of the argument.

Returns: double precision

See below for an example:

  1. cr> SELECT atan(1) AS atan;
  2. +--------------------+
  3. | atan |
  4. +--------------------+
  5. | 0.7853981633974483 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

atan2(y: number, x: number)

Returns the arctangent of y/x.

Returns: double precision

  1. cr> SELECT atan2(2, 1) AS atan2;
  2. +--------------------+
  3. | atan2 |
  4. +--------------------+
  5. | 1.1071487177940904 |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

pi()

Returns the π constant.

Returns: double precision

  1. cr> SELECT pi() AS pi;
  2. +-------------------+
  3. | pi |
  4. +-------------------+
  5. | 3.141592653589793 |
  6. +-------------------+
  7. SELECT 1 row in set (... sec)

Regular expression functions

The regular expression functions in CrateDB use Java Regular Expressions.

See the API documentation for more details.

Note

Be aware that, in contrast to the functions, the regular expression operator uses Lucene Regular Expressions.

regexp_replace(source, pattern, replacement [, flags])

regexp_replace can be used to replace every (or only the first) occurrence of a subsequence matching pattern in the source string with the replacement string. If no subsequence in source matches the regular expression pattern, source is returned unchanged.

Returns: text

pattern is a Java regular expression. For details on the regexp syntax, see Java Regular Expressions.

The replacement string may contain expressions like $N where N is a digit between 0 and 9. It references the Nth matched group of pattern and the matching subsequence of that group will be inserted in the returned string. The expression $0 will insert the whole matching source.

By default, only the first occurrence of a subsequence matching pattern will be replaced. If all occurrences shall be replaced use the g flag.

Flags

regexp_replace supports a number of flags as optional parameters. These flags are given as a string containing any of the characters listed below. Order does not matter.

Flag

Description

i

enable case insensitive matching

u

enable unicode case folding when used together with i

U

enable unicode support for character classes like \W

s

make . match line terminators, too

m

make ^ and $ match on the beginning or end of a line too.

x

permit whitespace and line comments starting with #

d

only \n is considered a line-terminator when using ^, $ and .

g

replace all occurrences of a subsequence matching pattern, not only the first

Examples

  1. cr> select name, regexp_replace(name, '(\w+)\s(\w+)+', '$1 - $2') as replaced from locations
  2. ... order by name limit 5;
  3. +---------------------+-----------------------+
  4. | name | replaced |
  5. +---------------------+-----------------------+
  6. | | |
  7. | Aldebaran | Aldebaran |
  8. | Algol | Algol |
  9. | Allosimanius Syneca | Allosimanius - Syneca |
  10. | Alpha Centauri | Alpha - Centauri |
  11. +---------------------+-----------------------+
  12. SELECT 5 rows in set (... sec)
  1. cr> select regexp_replace('alcatraz', '(foo)(bar)+', '$1baz') as replaced;
  2. +----------+
  3. | replaced |
  4. +----------+
  5. | alcatraz |
  6. +----------+
  7. SELECT 1 row in set (... sec)
  1. cr> select name, regexp_replace(name, '([A-Z]\w+) .+', '$1', 'ig') as replaced from locations
  2. ... order by name limit 5;
  3. +---------------------+--------------+
  4. | name | replaced |
  5. +---------------------+--------------+
  6. | | |
  7. | Aldebaran | Aldebaran |
  8. | Algol | Algol |
  9. | Allosimanius Syneca | Allosimanius |
  10. | Alpha Centauri | Alpha |
  11. +---------------------+--------------+
  12. SELECT 5 rows in set (... sec)

Array functions

array_cat(first_array, second_array)

The array_cat function concatenates two arrays into one array

Returns: array

  1. cr> select array_cat([1,2,3],[3,4,5,6]) AS array_cat;
  2. +-----------------------+
  3. | array_cat |
  4. +-----------------------+
  5. | [1, 2, 3, 3, 4, 5, 6] |
  6. +-----------------------+
  7. SELECT 1 row in set (... sec)

It can be used to append elements to array fields

  1. cr> create table array_cat_example (list array(integer));
  2. CREATE OK, 1 row affected (... sec)
  1. cr> insert into array_cat_example (list) values ([1,2,3]);
  2. INSERT OK, 1 row affected (... sec)
  1. cr> update array_cat_example set list = array_cat(list, [4, 5, 6]);
  2. UPDATE OK, 1 row affected (... sec)
  1. cr> select * from array_cat_example;
  2. +--------------------+
  3. | list |
  4. +--------------------+
  5. | [1, 2, 3, 4, 5, 6] |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

Note

Appending to arrays with array_cat in updates is handy, but unfortunately not isolated. We use optimistic concurrency control to ensure that your update operation used the latest state of the row. But only 3 retry attempts are made by fetching the newest version again and if they all fail, the query fails.

You can also use the concat operator || with arrays:

  1. cr> select [1,2,3] || [4,5,6] || [7,8,9] AS arr;
  2. +-----------------------------+
  3. | arr |
  4. +-----------------------------+
  5. | [1, 2, 3, 4, 5, 6, 7, 8, 9] |
  6. +-----------------------------+
  7. SELECT 1 row in set (... sec)

array_unique(first_array, [ second_array])

The array_unique function merges two arrays into one array with unique elements

Returns: array

  1. cr> select array_unique([1, 2, 3], [3, 4, 4]) AS arr;
  2. +--------------+
  3. | arr |
  4. +--------------+
  5. | [1, 2, 3, 4] |
  6. +--------------+
  7. SELECT 1 row in set (... sec)

If the arrays have different types all elements will be cast to a common type based on the type precedence.

  1. cr> select array_unique([10, 20], [10.0, 20.3]) AS arr;
  2. +--------------------+
  3. | arr |
  4. +--------------------+
  5. | [10.0, 20.0, 20.3] |
  6. +--------------------+
  7. SELECT 1 row in set (... sec)

array_difference(first_array, second_array)

The array_difference function removes elements from the first array that are contained in the second array.

Returns: array

  1. cr> select array_difference([1,2,3,4,5,6,7,8,9,10],[2,3,6,9,15]) AS arr;
  2. +---------------------+
  3. | arr |
  4. +---------------------+
  5. | [1, 4, 5, 7, 8, 10] |
  6. +---------------------+
  7. SELECT 1 row in set (... sec)

It can be used to remove elements from array fields.

  1. cr> create table array_difference_example (list array(integer));
  2. CREATE OK, 1 row affected (... sec)
  1. cr> insert into array_difference_example (list) values ([1,2,3,4,5,6,7,8,9,10]);
  2. INSERT OK, 1 row affected (... sec)
  1. cr> update array_difference_example set list = array_difference(list, [6]);
  2. UPDATE OK, 1 row affected (... sec)
  1. cr> select * from array_difference_example;
  2. +------------------------------+
  3. | list |
  4. +------------------------------+
  5. | [1, 2, 3, 4, 5, 7, 8, 9, 10] |
  6. +------------------------------+
  7. SELECT 1 row in set (... sec)

array(subquery)

The array(subquery) expression is an array constructor function which operates on the result of the subquery.

Returns: array

See also

Array construction with subquery

array_upper(anyarray, dimension)

The array_upper function returns the number of elements in the requested array dimension (the upper bound of the dimension).

Returns: integer

  1. cr> select array_upper([[1, 4], [3]], 1) AS size;
  2. +------+
  3. | size |
  4. +------+
  5. | 2 |
  6. +------+
  7. SELECT 1 row in set (... sec)

array_length(anyarray, dimension)

The array_length function returns the number of elements in the requested array dimension.

Returns: integer

  1. cr> select array_length([[1, 4], [3]], 1) AS len;
  2. +-----+
  3. | len |
  4. +-----+
  5. | 2 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

array_lower(anyarray, dimension)

The array_lower function returns the lower bound of the requested array dimension (which is 1 if the dimension is valid and has at least one element).

Returns: integer

  1. cr> select array_lower([[1, 4], [3]], 1) AS size;
  2. +------+
  3. | size |
  4. +------+
  5. | 1 |
  6. +------+
  7. SELECT 1 row in set (... sec)

array_to_string(anyarray, separator, [ null_string ])

The array_to_string function concatenates elements of the given array into a single string using the separator.

Returns: text

  1. cr> select array_to_string(['Arthur', 'Ford', 'Trillian'], ',') AS str;
  2. +----------------------+
  3. | str |
  4. +----------------------+
  5. | Arthur,Ford,Trillian |
  6. +----------------------+
  7. SELECT 1 row in set (... sec)

If the separator argument is NULL, the result is NULL:

  1. cr> select array_to_string(['Arthur', 'Ford', 'Trillian'], NULL) AS str;
  2. +------+
  3. | str |
  4. +------+
  5. | NULL |
  6. +------+
  7. SELECT 1 row in set (... sec)

If null_string is provided and is not NULL, then NULL elements of the array are replaced by that string, otherwise they are omitted:

  1. cr> select array_to_string(['Arthur', NULL, 'Trillian'], ',', 'Ford') AS str;
  2. +----------------------+
  3. | str |
  4. +----------------------+
  5. | Arthur,Ford,Trillian |
  6. +----------------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select array_to_string(['Arthur', NULL, 'Trillian'], ',') AS str;
  2. +-----------------+
  3. | str |
  4. +-----------------+
  5. | Arthur,Trillian |
  6. +-----------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select array_to_string(['Arthur', NULL, 'Trillian'], ',', NULL) AS str;
  2. +-----------------+
  3. | str |
  4. +-----------------+
  5. | Arthur,Trillian |
  6. +-----------------+
  7. SELECT 1 row in set (... sec)

string_to_array(string, separator, [ null_string ])

The string_to_array splits a string into an array of text elements using a supplied separator and an optional null-string to set matching substring elements to NULL.

Returns: array(text)

  1. cr> select string_to_array('Arthur,Ford,Trillian', ',') AS arr;
  2. +--------------------------------+
  3. | arr |
  4. +--------------------------------+
  5. | ["Arthur", "Ford", "Trillian"] |
  6. +--------------------------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select string_to_array('Arthur,Ford,Trillian', ',', 'Ford') AS arr;
  2. +------------------------------+
  3. | arr |
  4. +------------------------------+
  5. | ["Arthur", null, "Trillian"] |
  6. +------------------------------+
  7. SELECT 1 row in set (... sec)

separator

If the separator argument is NULL, each character of the input string becomes a separate element in the resulting array.

  1. cr> select string_to_array('Ford', NULL) AS arr;
  2. +----------------------+
  3. | arr |
  4. +----------------------+
  5. | ["F", "o", "r", "d"] |
  6. +----------------------+
  7. SELECT 1 row in set (... sec)

If the separator is an empty string, then the entire input string is returned as a one-element array.

  1. cr> select string_to_array('Arthur,Ford', '') AS arr;
  2. +-----------------+
  3. | arr |
  4. +-----------------+
  5. | ["Arthur,Ford"] |
  6. +-----------------+
  7. SELECT 1 row in set (... sec)

null_string

If the null_string argument is omitted or NULL, none of the substrings of the input will be replaced by NULL.

array_min(array)

The array_min function returns the smallest element in array. If array is NULL or an empty array, the function returns NULL. This function supports arrays of any of the primitive types.

  1. cr> SELECT array_min([3, 2, 1]) AS min;
  2. +-----+
  3. | min |
  4. +-----+
  5. | 1 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

array_max(array)

The array_max function returns the largest element in array. If array is NULL or an empty array, the function returns NULL. This function supports arrays of any of the primitive types.

  1. cr> SELECT array_max([1,2,3]) AS max;
  2. +-----+
  3. | max |
  4. +-----+
  5. | 3 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

array_sum(array)

Returns the sum of array elements that are not NULL. If array is NULL or an empty array, the function returns NULL. This function supports arrays of any numeric types.

For real and double precison arguments, the return type is equal to the argument type. For char, smallint, integer, and bigint arguments, the return type changes to bigint.

If any bigint value exceeds range limits (-2^64 to 2^64-1), an ArithmeticException will be raised.

  1. cr> SELECT array_sum([1,2,3]) AS sum;
  2. +-----+
  3. | sum |
  4. +-----+
  5. | 6 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

The sum on the bigint array will result in an overflow in the following query:

  1. cr> SELECT array_sum([9223372036854775807,9223372036854775807]) as sum;
  2. ArithmeticException[long overflow]

To address the overflow of the sum of the given array elements, we cast the array to the numeric data type:

  1. cr> SELECT array_sum([9223372036854775807,9223372036854775807] :: numeric[])
  2. ... as sum;
  3. +----------------------+
  4. | sum |
  5. +----------------------+
  6. | 18446744073709551614 |
  7. +----------------------+
  8. SELECT 1 row in set (... sec)

array_avg(array)

Returns the average of all values in array that are not NULL If array is NULL or an empty array, the function returns NULL. This function supports arrays of any numeric types.

For real and double precison arguments, the return type is equal to the argument type. For char, smallint, integer, and bigint arguments, the return type is numeric.

  1. cr> SELECT array_avg([1,2,3]) AS avg;
  2. +-----+
  3. | avg |
  4. +-----+
  5. | 2 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

Conditional functions and expressions

CASE WHEN ... THEN ... END

The case expression is a generic conditional expression similar to if/else statements in other programming languages and can be used wherever an expression is valid.

  1. CASE WHEN condition THEN result
  2. [WHEN ...]
  3. [ELSE result]
  4. END

Each condition expression must result in a boolean value. If the condition’s result is true, the value of the result expression that follows the condition will be the final result of the case expression and the subsequent when branches will not be processed. If the condition’s result is not true, any subsequent when clauses are examined in the same manner. If no when condition yields true, the value of the case expression is the result of the else clause. If the else clause is omitted and no condition is true, the result is null.

Example:

  1. cr> select id,
  2. ... case when id = 0 then 'zero'
  3. ... when id % 2 = 0 then 'even'
  4. ... else 'odd'
  5. ... end as parity
  6. ... from case_example order by id;
  7. +----+--------+
  8. | id | parity |
  9. +----+--------+
  10. | 0 | zero |
  11. | 1 | odd |
  12. | 2 | even |
  13. | 3 | odd |
  14. +----+--------+
  15. SELECT 4 rows in set (... sec)

As a variant, a case expression can be written using the simple form:

  1. CASE expression
  2. WHEN value THEN result
  3. [WHEN ...]
  4. [ELSE result]
  5. END

Example:

  1. cr> select id,
  2. ... case id when 0 then 'zero'
  3. ... when 1 then 'one'
  4. ... else 'other'
  5. ... end as description
  6. ... from case_example order by id;
  7. +----+-------------+
  8. | id | description |
  9. +----+-------------+
  10. | 0 | zero |
  11. | 1 | one |
  12. | 2 | other |
  13. | 3 | other |
  14. +----+-------------+
  15. SELECT 4 rows in set (... sec)

Note

All result expressions must be convertible to a single data type.

if(condition, result [, default])

The if function is a conditional function comparing to if statements of most other programming languages. If the given condition expression evaluates to true, the result expression is evaluated and its value is returned. If the condition evaluates to false, the result expression is not evaluated and the optional given default expression is evaluated instead and its value will be returned. If the default argument is omitted, NULL will be returned instead.

  1. cr> select id, if(id = 0, 'zero', 'other') as description from if_example order by id;
  2. +----+-------------+
  3. | id | description |
  4. +----+-------------+
  5. | 0 | zero |
  6. | 1 | other |
  7. | 2 | other |
  8. | 3 | other |
  9. +----+-------------+
  10. SELECT 4 rows in set (... sec)

coalesce('first_arg', second_arg [, ... ])

The coalesce function takes one or more arguments of the same type and returns the first non-null value of these. The result will be NULL only if all the arguments evaluate to NULL.

Returns: same type as arguments

  1. cr> select coalesce(clustered_by, 'nothing') AS clustered_by
  2. ... from information_schema.tables
  3. ... where table_name='nodes';
  4. +--------------+
  5. | clustered_by |
  6. +--------------+
  7. | nothing |
  8. +--------------+
  9. SELECT 1 row in set (... sec)

greatest('first_arg', second_arg[ , ... ])

The greatest function takes one or more arguments of the same type and will return the largest value of these. NULL values in the arguments list are ignored. The result will be NULL only if all the arguments evaluate to NULL.

Returns: same type as arguments

  1. cr> select greatest(1, 2) AS greatest;
  2. +----------+
  3. | greatest |
  4. +----------+
  5. | 2 |
  6. +----------+
  7. SELECT 1 row in set (... sec)

least('first_arg', second_arg[ , ... ])

The least function takes one or more arguments of the same type and will return the smallest value of these. NULL values in the arguments list are ignored. The result will be NULL only if all the arguments evaluate to NULL.

Returns: same type as arguments

  1. cr> select least(1, 2) AS least;
  2. +-------+
  3. | least |
  4. +-------+
  5. | 1 |
  6. +-------+
  7. SELECT 1 row in set (... sec)

nullif('first_arg', second_arg)

The nullif function compares two arguments of the same type and, if they have the same value, returns NULL; otherwise returns the first argument.

Returns: same type as arguments

  1. cr> select nullif(table_schema, 'sys') AS nullif
  2. ... from information_schema.tables
  3. ... where table_name='nodes';
  4. +--------+
  5. | nullif |
  6. +--------+
  7. | NULL |
  8. +--------+
  9. SELECT 1 row in set (... sec)

System information functions

CURRENT_SCHEMA

The CURRENT_SCHEMA system information function returns the name of the current schema of the session. If no current schema is set, this function will return the default schema, which is doc.

Returns: text

The default schema can be set when using the JDBC client and HTTP clients such as CrateDB PDO.

Note

The CURRENT_SCHEMA function has a special SQL syntax, meaning that it must be called without trailing parenthesis (()). However, CrateDB also supports the optional parenthesis.

Synopsis:

  1. CURRENT_SCHEMA [ ( ) ]

Example:

  1. cr> SELECT CURRENT_SCHEMA;
  2. +----------------+
  3. | current_schema |
  4. +----------------+
  5. | doc |
  6. +----------------+
  7. SELECT 1 row in set (... sec)

CURRENT_SCHEMAS(boolean)

The CURRENT_SCHEMAS() system information function returns the current stored schemas inside the search_path session state, optionally including implicit schemas (e.g. pg_catalog). If no custom search_path is set, this function will return the default search_path schemas.

Returns: array(text)

Synopsis:

  1. CURRENT_SCHEMAS ( boolean )

Example:

  1. cr> SELECT CURRENT_SCHEMAS(true) AS schemas;
  2. +-----------------------+
  3. | schemas |
  4. +-----------------------+
  5. | ["pg_catalog", "doc"] |
  6. +-----------------------+
  7. SELECT 1 row in set (... sec)

CURRENT_USER

The CURRENT_USER system information function returns the name of the current connected user or crate if the user management module is disabled.

Returns: text

Synopsis:

  1. CURRENT_USER

Example:

  1. cr> select current_user AS name;
  2. +-------+
  3. | name |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

USER

Equivalent to CURRENT_USER.

Returns: text

Synopsis:

  1. USER

Example:

  1. cr> select user AS name;
  2. +-------+
  3. | name |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

SESSION_USER

The SESSION_USER system information function returns the name of the current connected user or crate if the user management module is disabled.

Returns: text

Synopsis:

  1. SESSION_USER

Example:

  1. cr> select session_user AS name;
  2. +-------+
  3. | name |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

Note

CrateDB doesn’t currently support the switching of execution context. This makes SESSION_USER functionally equivalent to CURRENT_USER. We provide it as it’s part of the SQL standard.

Additionally, the CURRENT_USER, SESSION_USER and USER functions have a a special SQL syntax, meaning that they must be called without trailing parenthesis (()).

pg_backend_pid()

The pg_backend_pid() system information function is implemented for enhanced compatibility with PostgreSQL. CrateDB will always return -1 as there isn’t a single process attached to one query. This is different to PostgreSQL, where this represents the process ID of the server process attached to the current session.

Returns: integer

Synopsis:

  1. pg_backend_pid()

Example:

  1. cr> select pg_backend_pid() AS pid;
  2. +-----+
  3. | pid |
  4. +-----+
  5. | -1 |
  6. +-----+
  7. SELECT 1 row in set (... sec)

pg_postmaster_start_time()

Returns the server start time as timestamp with time zone.

current_database()

The current_database function returns the name of the current database, which in CrateDB will always be crate:

  1. cr> select current_database() AS db;
  2. +-------+
  3. | db |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

current_setting(text [,boolean])

The current_setting function returns the current value of a session setting.

Returns: text

Synopsis:

  1. current_setting(setting_name [, missing_ok])

If no setting exists for setting_name, current_setting throws an error, unless missing_ok argument is provided and is true.

Examples:

  1. cr> select current_setting('search_path') AS search_path;
  2. +-----------------+
  3. | search_path |
  4. +-----------------+
  5. | pg_catalog, doc |
  6. +-----------------+
  7. SELECT 1 row in set (... sec)
  1. cr> select current_setting('foo');
  2. SQLParseException[Unrecognised Setting: foo]
  1. cr> select current_setting('foo', true) AS foo;
  2. +------+
  3. | foo |
  4. +------+
  5. | NULL |
  6. +------+
  7. SELECT 1 row in set (... sec)

pg_get_expr

The function pg_get_expr is implemented to improve compatibility with clients that use the PostgreSQL wire protocol. The function always returns null.

Synopsis:

  1. pg_get_expr(expr text, relation_oid int)

Example:

  1. cr> select pg_get_expr('literal', 1) AS col;
  2. +------+
  3. | col |
  4. +------+
  5. | NULL |
  6. +------+
  7. SELECT 1 row in set (... sec)

pg_get_userbyid

The function pg_get_userbyid is implemented to improve compatibility with clients that use the PostgreSQL wire protocol. The function always returns the default CrateDB user for non-null arguments, otherwise, null is returned.

Returns: text

Synopsis:

  1. pg_get_userbyid(id integer)

Example:

  1. cr> select pg_get_userbyid(1) AS name;
  2. +-------+
  3. | name |
  4. +-------+
  5. | crate |
  6. +-------+
  7. SELECT 1 row in set (... sec)

pg_typeof

The function pg_typeof returns the text representation of the value’s data type passed to it.

Returns: text

Synopsis:

  1. pg_typeof(expression)

Example:

  1. cr> select pg_typeof([1, 2, 3]) as typeof;
  2. +---------------+
  3. | typeof |
  4. +---------------+
  5. | integer_array |
  6. +---------------+
  7. SELECT 1 row in set (... sec)

pg_function_is_visible

The function pg_function_is_visible returns true for OIDs that refer to a system or a user defined function.

Returns: boolean

Synopsis:

  1. pg_function_is_visible(OID)

Example:

  1. cr> select pg_function_is_visible(-919555782) as pg_function_is_visible;
  2. +------------------------+
  3. | pg_function_is_visible |
  4. +------------------------+
  5. | TRUE |
  6. +------------------------+
  7. SELECT 1 row in set (... sec)

pg_get_function_result

The function pg_get_function_result returns the text representation of the return value’s data type of the function referred by the OID.

Returns: text

Synopsis:

  1. pg_get_function_result(OID)

Example:

  1. cr> select pg_get_function_result(-919555782) as _pg_get_function_result;
  2. +-------------------------+
  3. | _pg_get_function_result |
  4. +-------------------------+
  5. | time with time zone |
  6. +-------------------------+
  7. SELECT 1 row in set (... sec)

version

Returns the CrateDB version information.

Returns: text

Synopsis:

  1. version()

Example:

  1. cr> select version() AS version;
  2. +---------...-+
  3. | version |
  4. +---------...-+
  5. | CrateDB ... |
  6. +---------...-+
  7. SELECT 1 row in set (... sec)

obj_description(integer, text)

This function exists mainly for compatibility with PostgreSQL. In PostgreSQL, the function returns the comment for a database object. CrateDB doesn’t support user defined comments for database objects, so it always returns null.

Returns: text

Example:

  1. cr> SELECT pg_catalog.obj_description(1, 'pg_type') AS comment;
  2. +---------+
  3. | comment |
  4. +---------+
  5. | NULL |
  6. +---------+
  7. SELECT 1 row in set (... sec)

format_type(integer, integer)

Returns the type name of a type. The first argument is the OID of the type. The second argument is the type modifier. This function exits for PostgreSQL compatibility and the type modifier is always ignored.

Returns: text

Example:

  1. cr> SELECT pg_catalog.format_type(25, null) AS name;
  2. +------+
  3. | name |
  4. +------+
  5. | text |
  6. +------+
  7. SELECT 1 row in set (... sec)

If the given OID is not know, ??? is returned:

  1. cr> SELECT pg_catalog.format_type(3, null) AS name;
  2. +------+
  3. | name |
  4. +------+
  5. | ??? |
  6. +------+
  7. SELECT 1 row in set (... sec)

Special functions

ignore3vl(boolean)

The ignore3vl function operates on a boolean argument and eliminates the 3-valued logic#Comparisonswith_NULL_and_the_three-valued_logic(3VL)) on the whole tree of operators beneath it. More specifically, FALSE is evaluated to FALSE, TRUE to TRUE and NULL to FALSE.

Returns: boolean

Note

The main usage of the ignore3vl function is in the WHERE clause when a NOT operator is involved. Such filtering, with 3-valued logic#Comparisonswith_NULL_and_the_three-valued_logic(3VL)), cannot be translated to an optimized query in the internal storage engine, and therefore can degrade performance. E.g.:

  1. SELECT * FROM t
  2. WHERE NOT 5 = ANY(t.int_array_col);

If we can ignore the 3-valued logic#Comparisonswith_NULL_and_the_three-valued_logic(3VL)), we can write the query as:

  1. SELECT * FROM t
  2. WHERE NOT IGNORE3VL(5 = ANY(t.int_array_col));

which will yield better performance (in execution time) than before.

Caution

If there are NULL values in the long_array_col, in the case that 5 = ANY(t.long_array_col) evaluates to NULL, without the ignore3vl, it would be evaluated as NOT NULL => NULL, resulting to zero matched rows. With the IGNORE3VL in place it will be evaluated as NOT FALSE => TRUE resulting to all rows matching the filter. E.g:

  1. cr> SELECT * FROM t
  2. ... WHERE NOT 5 = ANY(t.int_array_col);
  3. +---------------+
  4. | int_array_col |
  5. +---------------+
  6. +---------------+
  7. SELECT 0 rows in set (... sec)
  1. cr> SELECT * FROM t
  2. ... WHERE NOT IGNORE3VL(5 = ANY(t.int_array_col));
  3. +-----------------+
  4. | int_array_col |
  5. +-----------------+
  6. | [1, 2, 3, null] |
  7. +-----------------+
  8. SELECT 1 row in set (... sec)

Synopsis:

  1. ignore3vl(boolean)

Example:

  1. cr> SELECT ignore3vl(true) as v1, ignore3vl(false) as v2, ignore3vl(null) as v3;
  2. +------+-------+-------+
  3. | v1 | v2 | v3 |
  4. +------+-------+-------+
  5. | TRUE | FALSE | FALSE |
  6. +------+-------+-------+
  7. SELECT 1 row in set (... sec)

1

https://dev.mysql.com/doc/refman/5.6/en/date-and-time-functions.html#function_date-format