Greenplum PL/pgSQL Procedural Language

Greenplum PL/pgSQL Procedural Language

This section contains an overview of the Greenplum Database PL/pgSQL language.

About Greenplum Database PL/pgSQL
PL/pgSQL Plan Caching
PL/pgSQL Examples
References

Parent topic: Greenplum Database Reference Guide

About Greenplum Database PL/pgSQL

Greenplum Database PL/pgSQL is a loadable procedural language that is installed and registered by default with Greenplum Database. You can create user-defined functions using SQL statements, functions, and operators.

With PL/pgSQL you can group a block of computation and a series of SQL queries inside the database server, thus having the power of a procedural language and the ease of use of SQL. Also, with PL/pgSQL you can use all the data types, operators and functions of Greenplum Database SQL.

The PL/pgSQL language is a subset of Oracle PL/SQL. Greenplum Database PL/pgSQL is based on Postgres PL/pgSQL. The Postgres PL/pgSQL documentation is at https://www.postgresql.org/docs/8.3/static/plpgsql.html

When using PL/pgSQL functions, function attributes affect how Greenplum Database creates query plans. You can specify the attribute IMMUTABLE, STABLE, or VOLATILE as part of the LANGUAGE clause to classify the type of function, For information about the creating functions and function attributes, see the CREATE FUNCTION command in the Greenplum Database Reference Guide.

You can run PL/SQL code blocks as anonymous code blocks. See the DO command in the Greenplum Database Reference Guide.

Greenplum Database SQL Limitations

When using Greenplum Database PL/pgSQL, limitations include

Triggers are not supported
Cursors are forward moving only (not scrollable)
Updatable cursors (UPDATE...WHERE CURRENT OF and DELETE...WHERE CURRENT OF) are not supported.

For information about Greenplum Database SQL conformance, see Summary of Greenplum Features in the Greenplum Database Reference Guide.

The PL/pgSQL Language

PL/pgSQL is a block-structured language. The complete text of a function definition must be a block. A block is defined as:

[ <label> ]
[ DECLARE
   declarations ]
BEGIN
   statements
END [ label ];

Each declaration and each statement within a block is terminated by a semicolon (;). A block that appears within another block must have a semicolon after END, as shown in the previous block. The END that concludes a function body does not require a semicolon.

Important: Do not confuse the use of the BEGIN and END keywords for grouping statements in PL/pgSQL with the database commands for transaction control. The PL/pgSQL BEGIN and END keywords are only for grouping; they do not start or end a transaction. Functions are always executed within a transaction established by an outer query — they cannot start or commit that transaction, since there would be no context for them to execute in. However, a PL/pgSQL block that contains an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction. For more about the EXCEPTION clause, see the post the Postgres documentation on error trapping at https://www.postgresql.org/docs/8.3/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING.

All key words and identifiers can be written in mixed upper and lower case. Identifiers are implicitly converted to lowercase unless enclosed in double-quotes ( “ ).

You can add comments in PL/pgSQL in the following ways:

A double dash (—) starts a comment that extends to the end of the line.
A /* starts a block comment that extends to the next occurrence of */.

Block comments cannot be nested, but double dash comments can be enclosed into a block comment and a double dash can hide the block comment delimiters /* and */.

Any statement in the statement section of a block can be a subblock. Subblocks can be used for logical grouping or to localize variables to a small group of statements.

The variables declared in the declarations section preceding a block are initialized to their default values every time the block is entered, not only once per function call. For example declares the variable quantity several times:

CREATE FUNCTION testfunc() RETURNS integer AS $$
DECLARE
   quantity integer := 30;
BEGIN
   RAISE NOTICE 'Quantity here is %', quantity;  
   -- Quantity here is 30
   quantity := 50;
   --
   -- Create a subblock
   --
   DECLARE
      quantity integer := 80;
   BEGIN
      RAISE NOTICE 'Quantity here is %', quantity;  
      -- Quantity here is 80
   END;
   RAISE NOTICE 'Quantity here is %', quantity; 
   -- Quantity here is 50
   RETURN quantity;
END;
$$ LANGUAGE plpgsql;

Executing SQL Commands

You can execute SQL commands with PL/pgSQL statements such as EXECUTE, PERFORM, and SELECT ... INTO. For information about the PL/pgSQL statements, see https://www.postgresql.org/docs/8.3/static/plpgsql-statements.html.

Note: The PL/pgSQL statement SELECT INTO is not supported in the EXECUTE statement.

PL/pgSQL Plan Caching

A PL/pgSQL function’s volatility classification has implications on how Greenplum Database caches plans that reference the function. Refer to Function Volatility and Plan Caching in the Greenplum Database Administrator Guide for information on plan caching considerations for Greenplum Database function volatility categories.

When a PL/pgSQL function executes for the first time in a database session, the PL/pgSQL interpreter parses the function’s SQL expressions and commands. The interpreter creates a prepared execution plan as each expression and SQL command is first executed in the function. The PL/pgSQL interpreter reuses the execution plan for a specific expression and SQL command for the life of the database connection. While this reuse substantially reduces the total amount of time required to parse and generate plans, errors in a specific expression or command cannot be detected until run time when that part of the function is executed.

Greenplum Database will automatically re-plan a saved query plan if there is any schema change to any relation used in the query, or if any user-defined function used in the query is redefined. This makes the re-use of a prepared plan transparent in most cases.

The SQL commands that you use in a PL/pgSQL function must refer to the same tables and columns on every execution. You cannot use a parameter as the name of a table or a column in an SQL command.

PL/pgSQL caches a separate query plan for each combination of actual argument types in which you invoke a polymorphic function to ensure that data type differences do not cause unexpected failures.

Refer to the PostgreSQL Plan Caching documentation for a detailed discussion of plan caching considerations in the PL/pgSQL language.

PL/pgSQL Examples

The following are examples of PL/pgSQL user-defined functions.

Example: Aliases for Function Parameters

Parameters passed to functions are named with identifiers such as $1, $2. Optionally, aliases can be declared for $n parameter names for increased readability. Either the alias or the numeric identifier can then be used to refer to the parameter value.

There are two ways to create an alias. The preferred way is to give a name to the parameter in the CREATE FUNCTION command, for example:

CREATE FUNCTION sales_tax(subtotal real) RETURNS real AS $$
BEGIN
   RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;

You can also explicitly declare an alias, using the declaration syntax:

name ALIAS FOR $n;

This example, creates the same function with the DECLARE syntax.

CREATE FUNCTION sales_tax(real) RETURNS real AS $$
DECLARE
    subtotal ALIAS FOR $1;
BEGIN
    RETURN subtotal * 0.06;
END;
$$ LANGUAGE plpgsql;

Example: Using the Data Type of a Table Column

When declaring a variable, you can use %TYPE to specify the data type of a variable or table column. This is the syntax for declaring a variable with the data type of a table column:

name table.column_name%TYPE;

You can use this to declare variables that will hold database values. For example, if you have a column named user_id in your users table. To declare the variable my_userid with the same data type as the users.user_id column:

my_userid users.user_id%TYPE;

%TYPE is particularly valuable in polymorphic functions, since the data types needed for internal variables may change from one call to the next. Appropriate variables can be created by applying %TYPE to the function’s arguments or result placeholders.

Example: Composite Type Based on a Table Row

The following syntax declares a composite variable based on table row:

name table_name%ROWTYPE;

Such a row variable can hold a whole row of a SELECT or FOR query result, so long as that query column set matches the declared type of the variable. The individual fields of the row value are accessed using the usual dot notation, for example rowvar.column.

Parameters to a function can be composite types (complete table rows). In that case, the corresponding identifier $n will be a row variable, and fields can be selected from it, for example $1.user_id.

Only the user-defined columns of a table row are accessible in a row-type variable, not the OID or other system columns. The fields of the row type inherit the table’s field size or precision for data types such as char(n).

The next example function uses a row variable composite type. Before creating the function, create the table that is used by the function with this command.

CREATE TABLE table1 (
  f1 text,
  f2 numeric,
  f3 integer
) distributed by (f1);

This INSERT command adds data to the table.

INSERT INTO table1 values 
 ('test1', 14.1, 3),
 ('test2', 52.5, 2),
 ('test3', 32.22, 6),
 ('test4', 12.1, 4) ;

This function uses a variable and ROWTYPE composite variable based on table1.

CREATE OR REPLACE FUNCTION t1_calc( name text) RETURNS integer 
AS $$ 
DECLARE
    t1_row   table1%ROWTYPE;
    calc_int table1.f3%TYPE;
BEGIN
    SELECT * INTO t1_row FROM table1 WHERE table1.f1 = $1 ;
    calc_int = (t1_row.f2 * t1_row.f3)::integer ;
    RETURN calc_int ;
END;
$$ LANGUAGE plpgsql VOLATILE;

Note: The previous function is classified as a VOLATILE function because function values could change within a single table scan.

The following SELECT command uses the function.

select t1_calc( 'test1' );

Note: The example PL/pgSQL function uses SELECT with the INTO clause. It is different from the SQL command SELECT INTO. If you want to create a table from a SELECT result inside a PL/pgSQL function, use the SQL command CREATE TABLE AS.

Example: Using a Variable Number of Arguments

You can declare a PL/pgSQL function to accept variable numbers of arguments, as long as all of the optional arguments are of the same data type. You must mark the last argument of the function as VARIADIC and declare the argument using an array type. You can refer to a function that includes VARIADIC arguments as a variadic function.

For example, this variadic function returns the minimum value of a variable array of numerics:

CREATE FUNCTION mleast (VARIADIC numeric[]) 
    RETURNS numeric AS $$
  DECLARE minval numeric;
  BEGIN
    SELECT min($1[i]) FROM generate_subscripts( $1, 1) g(i) INTO minval;
    RETURN minval;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION
SELECT mleast(10, -1, 5, 4.4);
 mleast
--------
     -1
(1 row)

Effectively, all of the actual arguments at or beyond the VARIADIC position are gathered up into a one-dimensional array.

You can pass an already-constructed array into a variadic function. This is particularly useful when you want to pass arrays between variadic functions. Specify VARIADIC in the function call as follows:

SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);

This prevents PL/pgSQL from expanding the function’s variadic parameter into its element type.

Example: Using Default Argument Values

You can declare PL/pgSQL functions with default values for some or all input arguments. The default values are inserted whenever the function is called with fewer than the declared number of arguments. Because arguments can only be omitted from the end of the actual argument list, you must provide default values for all arguments after an argument defined with a default value.

For example:

CREATE FUNCTION use_default_args(a int, b int DEFAULT 2, c int DEFAULT 3)
    RETURNS int AS $$
DECLARE
    sum int;
BEGIN
    sum := $1 + $2 + $3;
    RETURN sum;
END;
$$ LANGUAGE plpgsql;
SELECT use_default_args(10, 20, 30);
 use_default_args
------------------
               60
(1 row)
SELECT use_default_args(10, 20);
 use_default_args
------------------
               33
(1 row)
SELECT use_default_args(10);
 use_default_args
------------------
               15
(1 row)

You can also use the = sign in place of the keyword DEFAULT.

Example: Using Polymorphic Data Types

PL/pgSQL supports the polymorphic anyelement, anyarray, anyenum, and anynonarray types. Using these types, you can create a single PL/pgSQL function that operates on multiple data types. Refer to Greenplum Database Data Types for additional information on polymorphic type support in Greenplum Database.

A special parameter named $0 is created when the return type of a PL/pgSQL function is declared as a polymorphic type. The data type of $0 identifies the return type of the function as deduced from the actual input types.

In this example, you create a polymorphic function that returns the sum of two values:

CREATE FUNCTION add_two_values(v1 anyelement,v2 anyelement)
    RETURNS anyelement AS $$ 
DECLARE 
    sum ALIAS FOR $0;
BEGIN
    sum := v1 + v2;
    RETURN sum;
END;
$$ LANGUAGE plpgsql;

Execute add_two_values() providing integer input values:

SELECT add_two_values(1, 2);
 add_two_values
----------------
              3
(1 row)

The return type of add_two_values() is integer, the type of the input arguments. Now execute add_two_values() providing float input values:

SELECT add_two_values (1.1, 2.2);
 add_two_values
----------------
            3.3
(1 row)

The return type of add_two_values() in this case is float.

You can also specify VARIADIC arguments in polymorphic functions.

Example: Anonymous Block

This example executes the function in the previous example as an anonymous block with the DO command. In the example, the anonymous block retrieves the input value from a temporary table.

CREATE TEMP TABLE list AS VALUES ('test1') DISTRIBUTED RANDOMLY;
DO $$ 
DECLARE
    t1_row   table1%ROWTYPE;
    calc_int table1.f3%TYPE;
BEGIN
    SELECT * INTO t1_row FROM table1, list WHERE table1.f1 = list.column1 ;
    calc_int = (t1_row.f2 * t1_row.f3)::integer ;
    RAISE NOTICE 'calculated value is %', calc_int ;
END $$ LANGUAGE plpgsql ;

References

The Postgres documentation about PL/pgSQL is at https://www.postgresql.org/docs/8.3/static/plpgsql.html

Also, see the CREATE FUNCTION command in the Greenplum Database Reference Guide.

For a summary of built-in Greenplum Database functions, see Summary of Built-in Functions in the Greenplum Database Reference Guide. For information about using Greenplum Database functions see “Querying Data” in the Greenplum Database Administrator Guide

For information about porting Oracle functions, see https://www.postgresql.org/docs/8.3/static/plpgsql-porting.html. For information about installing and using the Oracle compatibility functions with Greenplum Database, see “Oracle Compatibility Functions” in the Greenplum Database Utility Guide.