User-defined functions
Table of contents
CREATE OR REPLACE
CrateDB supports user-defined functions. See CREATE FUNCTION for a full syntax description.
CREATE FUNCTION
defines a new function:
cr> CREATE FUNCTION my_subtract_function(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_subtract_function(a, b) { return a - b; }';
CREATE OK, 1 row affected (... sec)
cr> SELECT doc.my_subtract_function(3, 1) AS col;
+-----+
| col |
+-----+
| 2 |
+-----+
SELECT 1 row in set (... sec)
CREATE OR REPLACE FUNCTION
will either create a new function or replace an existing function definition:
cr> CREATE OR REPLACE FUNCTION log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) {return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
cr> SELECT doc.log10(10) AS col;
+-----+
| col |
+-----+
| 1.0 |
+-----+
SELECT 1 row in set (... sec)
It is possible to use named function arguments in the function signature. For example, the calculate_distance
function signature has two geo_point
arguments named start
and end
:
cr> CREATE OR REPLACE FUNCTION calculate_distance("start" geo_point, "end" geo_point)
... RETURNS real
... LANGUAGE JAVASCRIPT
... AS 'function calculate_distance(start, end) {
... return Math.sqrt(
... Math.pow(end[0] - start[0], 2),
... Math.pow(end[1] - start[1], 2));
... }';
CREATE OK, 1 row affected (... sec)
Note
Argument names are used for query documentation purposes only. You cannot reference arguments by name in the function body.
Optionally, a schema-qualified function name can be defined. If you omit the schema, the current session schema is used:
cr> CREATE OR REPLACE FUNCTION my_schema.log10(bigint)
... RETURNS double precision
... LANGUAGE JAVASCRIPT
... AS 'function log10(a) { return Math.log(a)/Math.log(10); }';
CREATE OK, 1 row affected (... sec)
Note
In order to improve the PostgreSQL server compatibility CrateDB allows the creation of user defined functions against the pg_catalog schema. However, the creation of user defined functions against the read-only System information and Information schema schemas is prohibited.
Supported types
Function arguments and return values can be any of the supported data types. The values passed into a function must strictly correspond to the specified argument data types.
Note
The value returned by the function will be casted to the return type provided in the definition if required. An exception will be thrown if the cast is not successful.
Overloading
Within a specific schema, you can overload functions by defining functions with the same name but a different set of arguments:
cr> CREATE FUNCTION my_schema.my_multiply(integer, integer)
... RETURNS integer
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the my_multiply
function with different argument types:
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b) { return a * b; }';
CREATE OK, 1 row affected (... sec)
This would overload the my_multiply
function with more arguments:
cr> CREATE FUNCTION my_schema.my_multiply(bigint, bigint, bigint)
... RETURNS bigint
... LANGUAGE JAVASCRIPT
... AS 'function my_multiply(a, b, c) { return a * b * c; }';
CREATE OK, 1 row affected (... sec)
Caution
It is considered bad practice to create functions that have the same name as the CrateDB built-in functions!
Note
If you call a function without a schema name, CrateDB will look it up in the built-in functions first and only then in the user-defined functions available in the search_path.
Therefore a built-in function with the same name as a user-defined function will hide the latter, even if it contains a different set of arguments! However, such functions can still be called if the schema name is explicitly provided.
Determinism
Caution
User-defined functions need to be deterministic, meaning that they must always return the same result value when called with the same argument values, because CrateDB might cache the returned values and reuse the value if the function is called multiple times with the same arguments.
DROP FUNCTION
Functions can be dropped like this:
cr> DROP FUNCTION doc.log10(bigint);
DROP OK, 1 row affected (... sec)
Adding IF EXISTS
prevents from raising an error if the function doesn’t exist:
cr> DROP FUNCTION IF EXISTS doc.log10(integer);
DROP OK, 1 row affected (... sec)
Optionally, argument names can be specified within the drop statement:
cr> DROP FUNCTION IF EXISTS doc.calculate_distance(start_point geo_point, end_point geo_point);
DROP OK, 1 row affected (... sec)
Optionally, you can provide a schema:
cr> DROP FUNCTION my_schema.log10(bigint);
DROP OK, 1 row affected (... sec)
Supported languages
Currently, CrateDB only supports JavaScript for user-defined functions.
JavaScript
The user defined function JavaScript is compatible with the ECMAScript 2019 specification.
CrateDB uses the GraalVM JavaScript engine as a JavaScript (ECMAScript) language execution runtime. The GraalVM JavaScript engine is a Java application that works on the stock Java Virtual Machines (VMs). The interoperability between Java code (host language) and JavaScript user-defined functions (guest language) is guaranteed by the GraalVM Polyglot API.
Please note: CrateDB does not use the GraalVM JIT compiler as optimizing compiler. However, the stock host Java VM JIT compilers can JIT-compile, optimize, and execute the GraalVM JavaScript codebase to a certain extent.
The execution context for guest JavaScript is created with restricted privileges to allow for the safe execution of less trusted guest language code. The guest language application context for each user-defined function is created with default access modifiers, so any access to managed resources is denied. The only exception is the host language interoperability configuration which explicitly allows access to Java lists and arrays. Please refer to GraalVM Security Guide for more detailed information.
Also, even though user-defined functions implemented with ECMA-compliant JavaScript, objects that are normally accessible with a web browser (e.g. window
, console
, and so on) are not available.
JavaScript supported types
JavaScript functions can handle all CrateDB data types. However, for some return types the function output must correspond to the certain format.
If a function requires geo_point
as a return type, then the JavaScript function must return a double precision
array of size 2, WKT
string or GeoJson
object.
Here is an example of a JavaScript function returning a double array
:
cr> CREATE FUNCTION rotate_point(point geo_point, angle real)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function rotate_point(point, angle) {
... var cos = Math.cos(angle);
... var sin = Math.sin(angle);
... var x = cos * point[0] - sin * point[1];
... var y = sin * point[0] + cos * point[1];
... return [x, y];
... }';
CREATE OK, 1 row affected (... sec)
Below is an example of a JavaScript function returning a WKT
string, which will be cast to geo_point
:
cr> CREATE FUNCTION symmetric_point(point geo_point)
... RETURNS geo_point
... LANGUAGE JAVASCRIPT
... AS 'function symmetric_point (point, angle) {
... var x = - point[0],
... y = - point[1];
... return "POINT (\" + x + \", \" + y +\")";
... }';
CREATE OK, 1 row affected (... sec)
Similarly, if the function specifies the geo_shape
return data type, then the JavaScript function should return a GeoJson
object or WKT
string:
cr> CREATE FUNCTION line("start" array(double precision), "end" array(double precision))
... RETURNS object
... LANGUAGE JAVASCRIPT
... AS 'function line(start, end) {
... return { "type": "LineString", "coordinates" : [start_point, end_point] };
... }';
CREATE OK, 1 row affected (... sec)
Note
If the return value of the JavaScript function is undefined
, it is converted to NULL
.
Working with NUMBERS
The JavaScript engine interprets numbers as java.lang.Double
, java.lang.Long
, or java.lang.Integer
, depending on the computation performed. In most cases, this is not an issue, since the return type of the JavaScript function will be cast to the return type specified in the CREATE FUNCTION
statement, although cast might result in a loss of precision.
However, when you try to cast DOUBLE PRECISION
to TIMESTAMP WITH TIME ZONE
, it will be interpreted as UTC seconds and will result in a wrong value:
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0);
... }';
CREATE OK, 1 row affected (... sec)
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+------------------------------+
| epoque |
+------------------------------+
| 48314-07-22T00:00:00.000000Z |
+------------------------------+
SELECT 1 row in set (... sec)
To avoid this behavior, the numeric value should be divided by 1000 before it is returned:
cr> CREATE FUNCTION utc(bigint, bigint, bigint)
... RETURNS TIMESTAMP WITH TIME ZONE
... LANGUAGE JAVASCRIPT
... AS 'function utc(year, month, day) {
... return Date.UTC(year, month, day, 0, 0, 0)/1000;
... }';
CREATE OK, 1 row affected (... sec)
cr> SELECT date_format(utc(2016,04,6)) as epoque;
+-----------------------------+
| epoque |
+-----------------------------+
| 2016-05-06T00:00:00.000000Z |
+-----------------------------+
SELECT 1 row in set (... sec)
Working with Array
methods
The JavaScript Array
object has a number of prototype methods you can use, such as join, map, sort, slice, reduce, and so on.
Normally, you can call these methods directly from an Array
object, like so:
function array_join(a, b) {
return a.join(b);
}
However, when writing JavaScript for use with CrateDB, you must explicitly use the prototype method:
function array_join(a, b) {
return Array.prototype.join.call(a, b);
}
You must do it like this because arguments are not passed as Array
objects, and so do not have the associated prototype methods available. Arguments are instead passed as array-like objects.