Materialized Views

Materialized view names are defined by:

  view_name::= re('[a-zA-Z_0-9]+')

CREATE MATERIALIZED VIEW

You can create a materialized view on a table using a CREATE MATERIALIZED VIEW statement:

  create_materialized_view_statement::= CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] view_name
      AS select_statement
      PRIMARY KEY '(' primary_key ')'
      WITH table_options

For instance:

  CREATE MATERIALIZED VIEW monkeySpecies_by_population AS
      SELECT * FROM monkeySpecies
      WHERE population IS NOT NULL AND species IS NOT NULL
      PRIMARY KEY (population, species)
      WITH comment='Allow query by population instead of species';

The CREATE MATERIALIZED VIEW statement creates a new materialized view. Each such view is a set of rows that corresponds to rows present in the underlying, or base, table specified in the SELECT statement. A materialized view cannot be updated directly, but updates to the base table cause corresponding updates in the view.
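As a minimal sketch of that behaviour, the statements below write to the base table and then read through the view from the example above. The column types (a text species and a numeric population) and the sample values are assumptions, since the definition of monkeySpecies is not shown in this section.

```cql
// Write to the base table; the view is maintained from this write.
// Assumption: species is a text column and population is numeric.
INSERT INTO monkeySpecies (species, population)
    VALUES ('Saimiri sciureus', 150000);

// Read through the view, which is keyed by population.
SELECT species FROM monkeySpecies_by_population WHERE population = 150000;

// Expected to be rejected: a view cannot be written to directly.
INSERT INTO monkeySpecies_by_population (population, species)
    VALUES (150000, 'Saimiri sciureus');
```

The row may not be visible in the view immediately after the base write returns, so a read issued right away is not guaranteed to see it.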

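Creating a materialized view has 3 main parts:

  • the select statement that restricts the data included in the view.

  • the primary key definition for the view.

  • the options for the view.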

Attempting to create an already existing materialized view will return an error unless the IF NOT EXISTS option is used. If it is used, the statement will be a no-op if the materialized view already exists.
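For illustration, the earlier example can be made safe to re-run by adding the option. This is a sketch that reuses the monkeySpecies_by_population definition shown above:

```cql
// A no-op if monkeySpecies_by_population already exists.
CREATE MATERIALIZED VIEW IF NOT EXISTS monkeySpecies_by_population AS
    SELECT * FROM monkeySpecies
    WHERE population IS NOT NULL AND species IS NOT NULL
    PRIMARY KEY (population, species);
```

Note that IF NOT EXISTS only checks the name: it does not compare the existing view's definition with the one in the statement.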

By default, materialized views are built in a single thread. The initial build can be parallelized by increasing the number of threads specified by the property concurrent_materialized_view_builders in cassandra.yaml. This property can also be manipulated at runtime through both JMX and the setconcurrentviewbuilders and getconcurrentviewbuilders nodetool commands.

MV select statement

The select statement of a materialized view definition determines which part of the base table is included in the view. That statement is limited in a number of ways:

  • the selection can only name columns of the base table. In other words, you can’t use any function (aggregate or not), casting, term, etc. Aliases are also not supported. You can, however, use * as a shortcut for selecting all columns. Further, static columns cannot be included in a materialized view, so SELECT * isn’t allowed if the base table has static columns.

  • the WHERE clause has the following restrictions:

    • it cannot include any bind_marker.

    • columns that are not part of the base table primary key can only be restricted by an IS NOT NULL restriction; no other restriction is allowed on them.

    • columns that are part of the view primary key cannot be null, so each of them must carry at least an IS NOT NULL restriction (or any other restriction, but they must have one).

  • the statement cannot have an ordering clause, a limit, or ALLOW FILTERING.
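The following sketch contrasts a selection that stays within these rules with one that does not. The view names and the common_name column are hypothetical, chosen only for illustration; monkeySpecies is the base table from the earlier example.

```cql
// Within the rules: plain base-table column names only, and an
// IS NOT NULL restriction on every column of the view primary key.
// Assumption: monkeySpecies has a common_name column.
CREATE MATERIALIZED VIEW monkeySpecies_names_by_population AS
    SELECT species, population, common_name FROM monkeySpecies
    WHERE population IS NOT NULL AND species IS NOT NULL
    PRIMARY KEY (population, species);

// Expected to be rejected under the rules above: the selection
// uses an alias.
CREATE MATERIALIZED VIEW monkeySpecies_aliased AS
    SELECT species, population AS pop FROM monkeySpecies
    WHERE population IS NOT NULL AND species IS NOT NULL
    PRIMARY KEY (population, species);
```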

MV primary key

A view must have a primary key and that primary key must conform to the following restrictions:

  • it must contain all the primary key columns of the base table. This ensures that every row of the view corresponds to exactly one row of the base table.

  • it can only contain a single column that is not a primary key column in the base table.

So, for instance, given the following base table definition:

  CREATE TABLE t (
      k int,
      c1 int,
      c2 int,
      v1 int,
      v2 int,
      PRIMARY KEY (k, c1, c2)
  );

then the following view definitions are allowed:

  CREATE MATERIALIZED VIEW mv1 AS
      SELECT * FROM t
      WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL
      PRIMARY KEY (c1, k, c2);

  CREATE MATERIALIZED VIEW mv2 AS
      SELECT * FROM t
      WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL
      PRIMARY KEY (v1, k, c1, c2);

but the following ones are not allowed:

  // Error: cannot include both v1 and v2 in the primary key, as neither is in the base table primary key
  CREATE MATERIALIZED VIEW mv1 AS
      SELECT * FROM t
      WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL AND v1 IS NOT NULL
      PRIMARY KEY (v1, v2, k, c1, c2);

  // Error: must include k in the primary key, as it's a base table primary key column
  CREATE MATERIALIZED VIEW mv1 AS
      SELECT * FROM t
      WHERE c1 IS NOT NULL AND c2 IS NOT NULL
      PRIMARY KEY (c1, c2);

MV options

A materialized view is internally implemented by a table and, as such, creating a materialized view accepts the same options as creating a table with CREATE TABLE.

ALTER MATERIALIZED VIEW

After creation, you can alter the options of a materialized view using the ALTER MATERIALIZED VIEW statement:

  alter_materialized_view_statement::= ALTER MATERIALIZED VIEW [ IF EXISTS ] view_name WITH table_options

The options that can be updated are the same as at creation time, and thus the same as for tables. If the view does not exist, the statement will return an error, unless IF EXISTS is used, in which case the operation is a no-op.
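As a small illustration, the statement below changes the comment option on the view from the earlier example. The new comment text is arbitrary:

```cql
// Replace the comment set when the view was created.
ALTER MATERIALIZED VIEW monkeySpecies_by_population
    WITH comment = 'Query monkey species by population';
```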

DROP MATERIALIZED VIEW

You can drop a materialized view using the DROP MATERIALIZED VIEW statement:

  drop_materialized_view_statement::= DROP MATERIALIZED VIEW [ IF EXISTS ] view_name;

If the materialized view does not exist, the statement will return an error, unless IF EXISTS is used, in which case the operation is a no-op.
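For example, the view created earlier could be removed as follows. This is a sketch; IF EXISTS is included so the statement can be re-run safely:

```cql
// A no-op if the view has already been dropped.
DROP MATERIALIZED VIEW IF EXISTS monkeySpecies_by_population;
```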

MV Limitations

Removing columns that are not selected in the materialized view (via UPDATE base SET unselected_column = null or DELETE unselected_column FROM base) may shadow missed updates to other columns received through hints or repair. For this reason, we advise against deleting base columns that are not selected in views until this is fixed in CASSANDRA-13826.
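To make the two statement shapes concrete, the sketch below uses the table t from the primary key section together with a hypothetical view, t_by_c1, that leaves v2 unselected. The key values are arbitrary.

```cql
// Hypothetical view over t: v2 is not part of the selection.
CREATE MATERIALIZED VIEW t_by_c1 AS
    SELECT k, c1, c2, v1 FROM t
    WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL
    PRIMARY KEY (c1, k, c2);

// The two statement shapes advised against while CASSANDRA-13826 is
// open, because both delete the unselected column v2:
UPDATE t SET v2 = null WHERE k = 0 AND c1 = 0 AND c2 = 0;
DELETE v2 FROM t WHERE k = 0 AND c1 = 0 AND c2 = 0;
```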