PL/pgSQL Language

PL/pgSQL Language

This section contains an overview of the Greenplum Database PL/pgSQL language.

About Greenplum Database PL/pgSQL

Greenplum Database PL/pgSQL is a loadable procedural language that is installed and registered by default with Greenplum Database. You can create user-defined functions using SQL statements, functions, and operators.

With PL/pgSQL you can group a block of computation and a series of SQL queries inside the database server, thus having the power of a procedural language and the ease of use of SQL. Also, with PL/pgSQL you can use all the data types, operators and functions of Greenplum Database SQL.

The PL/pgSQL language is a subset of Oracle PL/SQL. Greenplum Database PL/pgSQL is based on Postgres PL/pgSQL. The Postgres PL/pgSQL documentation is at https://www.postgresql.org/docs/9.4/plpgsql.html

When using PL/pgSQL functions, function attributes affect how Greenplum Database creates query plans. You can specify the attribute IMMUTABLE, STABLE, or VOLATILE as part of the LANGUAGE clause to classify the type of function, For information about the creating functions and function attributes, see the CREATE FUNCTION command in the Greenplum Database Reference Guide.

You can run PL/SQL code blocks as anonymous code blocks. See the DO command in the Greenplum Database Reference Guide.

Greenplum Database SQL Limitations

When using Greenplum Database PL/pgSQL, limitations include

  • Triggers are not supported
  • Cursors are forward moving only (not scrollable)
  • Updatable cursors (UPDATE…WHERE CURRENT OF and DELETE…WHERE CURRENT OF) are not supported.

For information about Greenplum Database SQL conformance, see Summary of Greenplum Features in the Greenplum Database Reference Guide.

The PL/pgSQL Language

PL/pgSQL is a block-structured language. The complete text of a function definition must be a block. A block is defined as:

  1. [ label ]
  2. [ DECLARE
  3. declarations ]
  4. BEGIN
  5. statements
  6. END [ label ];

Each declaration and each statement within a block is terminated by a semicolon (;). A block that appears within another block must have a semicolon after END, as shown in the previous block. The END that concludes a function body does not require a semicolon.

A label is required only if you want to identify the block for use in an EXIT statement, or to qualify the names of variables declared in the block. If you provide a label after END, it must match the label at the block’s beginning.

Important: Do not confuse the use of the BEGIN and END keywords for grouping statements in PL/pgSQL with the database commands for transaction control. The PL/pgSQL BEGIN and END keywords are only for grouping; they do not start or end a transaction. Functions are always executed within a transaction established by an outer query — they cannot start or commit that transaction, since there would be no context for them to execute in. However, a PL/pgSQL block that contains an EXCEPTION clause effectively forms a subtransaction that can be rolled back without affecting the outer transaction. For more about the EXCEPTION clause, see the PostgreSQL documentation on trapping errors at https://www.postgresql.org/docs/9.4/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING.

Keywords are case-insensitive. Identifiers are implicitly converted to lowercase unless double-quoted, just as they are in ordinary SQL commands.

Comments work the same way in PL/pgSQL code as in ordinary SQL:

  • A double dash (—) starts a comment that extends to the end of the line.
  • A /* starts a block comment that extends to the matching occurrence of */.

    Block comments nest.

Any statement in the statement section of a block can be a subblock. Subblocks can be used for logical grouping or to localize variables to a small group of statements.

Variables declared in a subblock mask any similarly-named variables of outer blocks for the duration of the subblock. You can access the outer variables if you qualify their names with their block’s label. For example this function declares a variable named quantity several times:

  1. CREATE FUNCTION testfunc() RETURNS integer AS $$
  2. << outerblock >>
  3. DECLARE
  4. quantity integer := 30;
  5. BEGIN
  6. RAISE NOTICE 'Quantity here is %', quantity; -- Prints 30
  7. quantity := 50;
  8. --
  9. -- Create a subblock
  10. --
  11. DECLARE
  12. quantity integer := 80;
  13. BEGIN
  14. RAISE NOTICE 'Quantity here is %', quantity; -- Prints 80
  15. RAISE NOTICE 'Outer quantity here is %', outerblock.quantity; -- Prints 50
  16. END;
  17. RAISE NOTICE 'Quantity here is %', quantity; -- Prints 50
  18. RETURN quantity;
  19. END;
  20. $$ LANGUAGE plpgsql;

Executing SQL Commands

You can execute SQL commands with PL/pgSQL statements such as EXECUTE, PERFORM, and SELECT … INTO. For information about the PL/pgSQL statements, see https://www.postgresql.org/docs/9.4/plpgsql-statements.html.

Note: The PL/pgSQL statement SELECT INTO is not supported in the EXECUTE statement.

PL/pgSQL Plan Caching

A PL/pgSQL function’s volatility classification has implications on how Greenplum Database caches plans that reference the function. Refer to Function Volatility and Plan Caching in the Greenplum Database Administrator Guide for information on plan caching considerations for Greenplum Database function volatility categories.

When a PL/pgSQL function executes for the first time in a database session, the PL/pgSQL interpreter parses the function’s SQL expressions and commands. The interpreter creates a prepared execution plan as each expression and SQL command is first executed in the function. The PL/pgSQL interpreter reuses the execution plan for a specific expression and SQL command for the life of the database connection. While this reuse substantially reduces the total amount of time required to parse and generate plans, errors in a specific expression or command cannot be detected until run time when that part of the function is executed.

Greenplum Database will automatically re-plan a saved query plan if there is any schema change to any relation used in the query, or if any user-defined function used in the query is redefined. This makes the re-use of a prepared plan transparent in most cases.

The SQL commands that you use in a PL/pgSQL function must refer to the same tables and columns on every execution. You cannot use a parameter as the name of a table or a column in an SQL command.

PL/pgSQL caches a separate query plan for each combination of actual argument types in which you invoke a polymorphic function to ensure that data type differences do not cause unexpected failures.

Refer to the PostgreSQL Plan Caching documentation for a detailed discussion of plan caching considerations in the PL/pgSQL language.

PL/pgSQL Examples

The following are examples of PL/pgSQL user-defined functions.

Example: Aliases for Function Parameters

Parameters passed to functions are named with identifiers such as $1, $2. Optionally, aliases can be declared for $n parameter names for increased readability. Either the alias or the numeric identifier can then be used to refer to the parameter value.

There are two ways to create an alias. The preferred way is to give a name to the parameter in the CREATE FUNCTION command, for example:

  1. CREATE FUNCTION sales_tax(subtotal real) RETURNS real AS $$
  2. BEGIN
  3. RETURN subtotal * 0.06;
  4. END;
  5. $$ LANGUAGE plpgsql;

You can also explicitly declare an alias, using the declaration syntax:

  1. name ALIAS FOR $n;

This example, creates the same function with the DECLARE syntax.

  1. CREATE FUNCTION sales_tax(real) RETURNS real AS $$
  2. DECLARE
  3. subtotal ALIAS FOR $1;
  4. BEGIN
  5. RETURN subtotal * 0.06;
  6. END;
  7. $$ LANGUAGE plpgsql;

Example: Using the Data Type of a Table Column

When declaring a variable, you can use the %TYPE construct to specify the data type of a variable or table column. This is the syntax for declaring a variable whose type is the data type of a table column:

  1. name table.column_name%TYPE;

You can use the %TYPE construct to declare variables that will hold database values. For example, suppose you have a column named user_id in your users table. To declare a variable named my_userid with the same data type as the users.user_id column:

  1. my_userid users.user_id%TYPE;

%TYPE is particularly valuable in polymorphic functions, since the data types needed for internal variables may change from one call to the next. Appropriate variables can be created by applying %TYPE to the function’s arguments or result placeholders.

Example: Composite Type Based on a Table Row

A variable of a composite type is called a row variable. The following syntax declares a composite variable based on table row:

  1. name table_name%ROWTYPE;

Such a row variable can hold a whole row of a SELECT or FOR query result, so long as that query’s column set matches the declared type of the variable. The individual fields of the row value are accessed using the usual dot notation, for example rowvar.column.

Parameters to a function can be composite types (complete table rows). In that case, the corresponding identifier $n will be a row variable, and fields can be selected from it, for example $1.user_id.

Only the user-defined columns of a table row are accessible in a row-type variable, not the OID or other system columns. The fields of the row type inherit the table’s field size or precision for data types such as char(n).

The next example function uses a row variable composite type. Before creating the function, create the table that is used by the function with this command.

  1. CREATE TABLE table1 (
  2. f1 text,
  3. f2 numeric,
  4. f3 integer
  5. ) distributed by (f1);

This INSERT command adds data to the table.

  1. INSERT INTO table1 values
  2. ('test1', 14.1, 3),
  3. ('test2', 52.5, 2),
  4. ('test3', 32.22, 6),
  5. ('test4', 12.1, 4) ;

This function uses a column %TYPE variable and %ROWTYPE composite variable based on table1.

  1. CREATE OR REPLACE FUNCTION t1_calc( name text) RETURNS integer
  2. AS $$
  3. DECLARE
  4. t1_row table1%ROWTYPE;
  5. calc_int table1.f3%TYPE;
  6. BEGIN
  7. SELECT * INTO t1_row FROM table1 WHERE table1.f1 = $1 ;
  8. calc_int = (t1_row.f2 * t1_row.f3)::integer ;
  9. RETURN calc_int ;
  10. END;
  11. $$ LANGUAGE plpgsql VOLATILE;

Note: The previous function is classified as a VOLATILE function because function values could change within a single table scan.

The following SELECT command uses the function.

  1. select t1_calc( 'test1' );

Note: The example PL/pgSQL function uses SELECT with the INTO clause. It is different from the SQL command SELECT INTO. If you want to create a table from a SELECT result inside a PL/pgSQL function, use the SQL command CREATE TABLE AS.

Example: Using a Variable Number of Arguments

You can declare a PL/pgSQL function to accept variable numbers of arguments, as long as all of the optional arguments are of the same data type. You must mark the last argument of the function as VARIADIC and declare the argument using an array type. You can refer to a function that includes VARIADIC arguments as a variadic function.

For example, this variadic function returns the minimum value of a variable array of numerics:

  1. CREATE FUNCTION mleast (VARIADIC numeric[])
  2. RETURNS numeric AS $$
  3. DECLARE minval numeric;
  4. BEGIN
  5. SELECT min($1[i]) FROM generate_subscripts( $1, 1) g(i) INTO minval;
  6. RETURN minval;
  7. END;
  8. $$ LANGUAGE plpgsql;
  9. CREATE FUNCTION
  10. SELECT mleast(10, -1, 5, 4.4);
  11. mleast
  12. --------
  13. -1
  14. (1 row)

Effectively, all of the actual arguments at or beyond the VARIADIC position are gathered up into a one-dimensional array.

You can pass an already-constructed array into a variadic function. This is particularly useful when you want to pass arrays between variadic functions. Specify VARIADIC in the function call as follows:

  1. SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);

This prevents PL/pgSQL from expanding the function’s variadic parameter into its element type.

Example: Using Default Argument Values

You can declare PL/pgSQL functions with default values for some or all input arguments. The default values are inserted whenever the function is called with fewer than the declared number of arguments. Because arguments can only be omitted from the end of the actual argument list, you must provide default values for all arguments after an argument defined with a default value.

For example:

  1. CREATE FUNCTION use_default_args(a int, b int DEFAULT 2, c int DEFAULT 3)
  2. RETURNS int AS $$
  3. DECLARE
  4. sum int;
  5. BEGIN
  6. sum := $1 + $2 + $3;
  7. RETURN sum;
  8. END;
  9. $$ LANGUAGE plpgsql;
  10. SELECT use_default_args(10, 20, 30);
  11. use_default_args
  12. ------------------
  13. 60
  14. (1 row)
  15. SELECT use_default_args(10, 20);
  16. use_default_args
  17. ------------------
  18. 33
  19. (1 row)
  20. SELECT use_default_args(10);
  21. use_default_args
  22. ------------------
  23. 15
  24. (1 row)

You can also use the \= sign in place of the keyword DEFAULT.

Example: Using Polymorphic Data Types

PL/pgSQL supports the polymorphic anyelement, anyarray, anyenum, and anynonarray types. Using these types, you can create a single PL/pgSQL function that operates on multiple data types. Refer to Greenplum Database Data Types for additional information on polymorphic type support in Greenplum Database.

A special parameter named $0 is created when the return type of a PL/pgSQL function is declared as a polymorphic type. The data type of $0 identifies the return type of the function as deduced from the actual input types.

In this example, you create a polymorphic function that returns the sum of two values:

  1. CREATE FUNCTION add_two_values(v1 anyelement,v2 anyelement)
  2. RETURNS anyelement AS $$
  3. DECLARE
  4. sum ALIAS FOR $0;
  5. BEGIN
  6. sum := v1 + v2;
  7. RETURN sum;
  8. END;
  9. $$ LANGUAGE plpgsql;

Execute add_two_values() providing integer input values:

  1. SELECT add_two_values(1, 2);
  2. add_two_values
  3. ----------------
  4. 3
  5. (1 row)

The return type of add_two_values() is integer, the type of the input arguments. Now execute add_two_values() providing float input values:

  1. SELECT add_two_values (1.1, 2.2);
  2. add_two_values
  3. ----------------
  4. 3.3
  5. (1 row)

The return type of add_two_values() in this case is float.

You can also specify VARIADIC arguments in polymorphic functions.

Example: Anonymous Block

This example executes the statements in the t1_calc() function from a previous example as an anonymous block using the DO command. In the example, the anonymous block retrieves the input value from a temporary table.

  1. CREATE TEMP TABLE list AS VALUES ('test1') DISTRIBUTED RANDOMLY;
  2. DO $$
  3. DECLARE
  4. t1_row table1%ROWTYPE;
  5. calc_int table1.f3%TYPE;
  6. BEGIN
  7. SELECT * INTO t1_row FROM table1, list WHERE table1.f1 = list.column1 ;
  8. calc_int = (t1_row.f2 * t1_row.f3)::integer ;
  9. RAISE NOTICE 'calculated value is %', calc_int ;
  10. END $$ LANGUAGE plpgsql ;

References

The PostgreSQL documentation about PL/pgSQL is at https://www.postgresql.org/docs/9.4/plpgsql.html

Also, see the CREATE FUNCTION command in the Greenplum Database Reference Guide.

For a summary of built-in Greenplum Database functions, see Summary of Built-in Functions in the Greenplum Database Reference Guide. For information about using Greenplum Database functions see “Querying Data” in the Greenplum Database Administrator Guide

For information about porting Oracle functions, see https://www.postgresql.org/docs/9.4/plpgsql-porting.html. For information about installing and using the Oracle compatibility functions with Greenplum Database, see “Oracle Compatibility Functions” in the Greenplum Database Utility Guide.