PostgreSQL wire protocol

CrateDB contains support for the PostgreSQL wire protocol v3.

If a node is started with postgres support enabled it will bind to port 5432 by default. To use a custom port, set the corresponding Ports in the Configuration.

However, even though connecting PostgreSQL tools and client libraries is supported, the actual SQL statements have to be supported by CrateDB’s SQL dialect. A notable difference is that CrateDB doesn’t support transactions, which is why clients should generally enable autocommit.

Note

In order to use setFetchSize in JDBC it is possible to set autocommit to false.

The client will utilize the fetchSize on SELECT statements and only load up to fetchSize rows into memory.

See the PostgreSQL JDBC Query docs for more information.

Write operations will still behave as if autocommit was enabled and commit or rollback calls are ignored.

Server compatibility and implementation status

CrateDB emulates PostgreSQL server version 10.5.

Start-Up

SSL Support

SSL can be configured using Secured communications (SSL/TLS).

Authentication

Authentication methods can be configured using Host-Based Authentication (HBA).

ParameterStatus

After the authentication has succeeded the server has the possibility to send multiple ParameterStatus messages to the client. These are used to communicate information like server_version (emulates PostgreSQL 9.5) or server_encoding.

CrateDB also sends a message containing the crate_version parameter. This contains the current CrateDB version number.

This information is useful for clients to detect that they’re connecting to CrateDB instead of a PostgreSQL instance.

Database selection

Since CrateDB uses schemas instead of databases, the database parameter sets the default schema name for future queries. If no schema is specified, the schema doc will be used as default. Additionally, the only supported charset is UTF8.

Query Modes

Simple query

The PostgreSQL simple query protocol mode is fully implemented.

Extended query

The PostgreSQL extended query protocol mode is implemented with the following limitations:

  • The ParameterDescription message works for the most common use cases except for DDL statements.

  • To optimize the execution of bulk operations the execution of statements is delayed until the Sync message is received

Copy operations

CrateDB does not support the COPY sub-protocol.

Function call

The function call sub-protocol is not supported since it’s a legacy feature.

Canceling requests

Operations can be cancelled using the KILL statement, hence the CancelRequest message is unsupported. Consequently, the server won’t send a BackendKeyData message during connection initialization.

pg_catalog

For improved compatibility, the pg_catalog schema is implemented containing following tables:

pg_type

Some clients require the pg_catalog.pg_type in order to be able to stream arrays or other non-primitive types.

For compatibility reasons, there is a trimmed down pg_type table available in CrateDB:

  1. cr> SELECT oid, typname, typarray, typelem, typlen, typtype, typcategory
  2. ... FROM pg_catalog.pg_type
  3. ... ORDER BY oid;
  4. +------+------------------------------+----------+---------+--------+---------+-------------+
  5. | oid | typname | typarray | typelem | typlen | typtype | typcategory |
  6. +------+------------------------------+----------+---------+--------+---------+-------------+
  7. | 16 | bool | 1000 | 0 | 1 | b | N |
  8. | 18 | char | 1002 | 0 | 1 | b | S |
  9. | 19 | name | -1 | 0 | 64 | b | S |
  10. | 20 | int8 | 1016 | 0 | 8 | b | N |
  11. | 21 | int2 | 1005 | 0 | 2 | b | N |
  12. | 23 | int4 | 1007 | 0 | 4 | b | N |
  13. | 24 | regproc | 1008 | 0 | 4 | b | N |
  14. | 25 | text | 1009 | 0 | -1 | b | S |
  15. | 26 | oid | 1028 | 0 | 4 | b | N |
  16. | 30 | oidvector | 1013 | 26 | -1 | b | A |
  17. | 114 | json | 199 | 0 | -1 | b | U |
  18. | 199 | _json | 0 | 114 | -1 | b | A |
  19. | 600 | point | 1017 | 0 | 16 | b | G |
  20. | 700 | float4 | 1021 | 0 | 4 | b | N |
  21. | 701 | float8 | 1022 | 0 | 8 | b | N |
  22. | 1000 | _bool | 0 | 16 | -1 | b | A |
  23. | 1002 | _char | 0 | 18 | -1 | b | A |
  24. | 1005 | _int2 | 0 | 21 | -1 | b | A |
  25. | 1007 | _int4 | 0 | 23 | -1 | b | A |
  26. | 1008 | _regproc | 0 | 24 | -1 | b | A |
  27. | 1009 | _text | 0 | 25 | -1 | b | A |
  28. | 1015 | _varchar | 0 | 1043 | -1 | b | A |
  29. | 1016 | _int8 | 0 | 20 | -1 | b | A |
  30. | 1017 | _point | 0 | 600 | -1 | b | A |
  31. | 1021 | _float4 | 0 | 700 | -1 | b | A |
  32. | 1022 | _float8 | 0 | 701 | -1 | b | A |
  33. | 1043 | varchar | 1015 | 0 | -1 | b | S |
  34. | 1082 | date | 1182 | 0 | 8 | b | D |
  35. | 1114 | timestamp without time zone | 1115 | 0 | 8 | b | D |
  36. | 1115 | _timestamp without time zone | 0 | 1114 | -1 | b | A |
  37. | 1182 | _date | 0 | 1082 | -1 | b | A |
  38. | 1184 | timestamptz | 1185 | 0 | 8 | b | D |
  39. | 1185 | _timestamptz | 0 | 1184 | -1 | b | A |
  40. | 1186 | interval | 1187 | 0 | 16 | b | T |
  41. | 1187 | _interval | 0 | 1186 | -1 | b | A |
  42. | 1231 | _numeric | 0 | 1700 | -1 | b | A |
  43. | 1266 | timetz | 1270 | 0 | 12 | b | D |
  44. | 1270 | _timetz | 0 | 1266 | -1 | b | A |
  45. | 1560 | bit | 1561 | 0 | -1 | b | V |
  46. | 1561 | _bit | 0 | 1560 | -1 | b | A |
  47. | 1700 | numeric | 1231 | 0 | -1 | b | N |
  48. | 2205 | regclass | 2210 | 0 | 4 | b | N |
  49. | 2210 | _regclass | 0 | 2205 | -1 | b | A |
  50. | 2249 | record | 2287 | 0 | -1 | p | P |
  51. | 2276 | any | 0 | 0 | 4 | p | P |
  52. | 2277 | anyarray | 0 | 2276 | -1 | p | P |
  53. | 2287 | _record | 0 | 2249 | -1 | p | A |
  54. +------+------------------------------+----------+---------+--------+---------+-------------+
  55. SELECT 47 rows in set (... sec)

Note

This is just a snapshot of the table. Check table information_schema.columns to get information for all supported columns.

Object Identifier Types

Object identifiers are used internally by PostgreSQL for various system tables. The oid type is currently mapped to the integer data type.

The oid type might have the following type aliases:

Name

Reference

Description

Example

regproc

pg_proc

a function name

sum

Note

Currently, casting a string or integer literal to the regproc type wouldn’t result in a function lookup. Instead, casting the string literal to the regproc type results in an object of the regproc type that has a name that corresponds to the string literal and the oid hash of the literal as oid. Casting an integer literal to the regproc type results in an object of the regproc type that has a name that corresponds to the string representation of the literal and the literal value as oid.

Show transaction isolation

For compatibility with JDBC the SHOW TRANSACTION ISOLATION LEVEL statement is implemented:

  1. cr> show transaction isolation level;
  2. +-----------------------+
  3. | transaction_isolation |
  4. +-----------------------+
  5. | read uncommitted |
  6. +-----------------------+
  7. SHOW 1 row in set (... sec)

BEGIN/START/COMMIT statements

For compatibility with clients that use the PostgresSQL wire protocol (e.g., the Golang lib/pq and pgx drivers), CrateDB will accept the BEGIN, COMMIT, and START TRASNACTION statements. For example:

  1. cr> BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED,
  2. ... READ ONLY,
  3. ... NOT DEFERRABLE;
  4. BEGIN OK, 0 rows affected (... sec)
  5. cr> COMMIT
  6. COMMIT OK, 0 rows affected (... sec)

CrateDB will silently ignore the COMMIT, BEGIN, and START TRANSACTION statements and all respective parameters.

Client compatibility

JDBC

pgjdbc JDBC drivers version 9.4.1209 and above are compatible.

Limitations

  • reflection methods like conn.getMetaData().getTables(...) won’t work since the required tables are unavailable in CrateDB.

    As a workaround it’s possible to use SHOW TABLES or query the information_schema tables manually using SELECT statements.

  • OBJECT and GEO_SHAPE columns can be streamed as JSON but require pgjdbc version 9.4.1210 or newer.

  • Multidimensional arrays will be streamed as JSON encoded string to avoid a protocol limitation where all sub-arrays are required to have the same length.

  • The behavior of PreparedStatement.executeBatch in error cases depends on in which stage an error occurs: A BatchUpdateException is thrown if no processing has been done yet, whereas single operations failing after the processing started are indicated by an EXECUTE_FAILED (-3) return value.

  • Transaction limitations as described above.

  • Having escape processing enabled could prevent the usage of Object Literals in case an object key’s starting character clashes with a JDBC escape keyword (see also JDBC escape syntax). Currently, disabling escape processing will remedy this, but prevent the Extended Query API from working due to a bug at pgjdbc.

Connection failover and load balancing

Connection failover and load balancing is supported as described here: PostgreSQL JDBC connection failover.

Note

It is not recommended to use the targetServerType parameter since CrateDB has no concept of master-replica nodes.

Implementation differences

The PostgreSQL Wire Protocol makes it easy to use many PostgreSQL compatible tools and libraries directly with CrateDB. However, many of these tools assume that they are talking to PostgreSQL specifically, and thus rely on SQL extensions and idioms that are unique to PostgreSQL. Because of this, some tools or libraries may not work with other SQL databases such as CrateDB.

CrateDB’s SQL query engine enables real-time search & aggregations for online analytic processing (OLAP) and business intelligence (BI) with the benefit of the ability to scale horizontally. The use-cases of CrateDB are different than those of PostgreSQL, as CrateDB’s specialized storage schema and query execution engine address different requirements (see Clustering).

The listed features below cover the main differences in implementation and dialect between CrateDB and PostgreSQL. A detailed comparison between CrateDB’s SQL dialect and standard SQL is defined in SQL compatibility.

COPY

CrateDB does not support the distinct sub-protocol that is used to serve COPY operations and provides another implementation for transferring bulk data using the COPY FROM and COPY TO statements.

Objects

The definition of structured values by using JSON types, composite types or HSTORE are not supported. CrateDB alternatively allows the definition of nested documents (of type object) that store fieldscontaining any CrateDB supported data type, including nested object types.

Type casts

CrateDB accepts the Type conversion syntax for conversion of one data type to another.

See also

PostgreSQL value expressions

CrateDB value expressions

Arrays

Declaration of arrays

While multidimensional arrays in PostgreSQL must have matching extends for each dimension, CrateDB allows different length nested arrays as this example shows:

  1. cr> select [[1,2,3],[1,2]] from sys.cluster;
  2. +---------------------+
  3. | [[1, 2, 3], [1, 2]] |
  4. +---------------------+
  5. | [[1, 2, 3], [1, 2]] |
  6. +---------------------+
  7. SELECT 1 row in set (... sec)

Accessing arrays

Fetching arbitrary rectangular slices of an array using lower-bound:upper-bound expression in the array subscript is not supported.

See also

PostgreSQL Arrays

Text search functions and operators

The functions and operators provided by PostgreSQL for full-text search (see PostgreSQL Fulltext Search) are not compatible with those provided by CrateDB.

If you are missing features, functions or dialect improvements and have a great use case for it, let us know on GitHub. We’re always improving and extending CrateDB and we love to hear feedback.

Expression evaluation

Unlike PostgreSQL, expressions are not evaluated if the query results in 0 rows either because of the table is empty or by not matching the WHERE clause.