Resource Groups
Resource groups place limits on resource usage, and can enforce queueing policies on queries that run within them or divide their resources among sub-groups. A query belongs to a single resource group, and consumes resources from that group (and its ancestors). Except for the limit on queued queries, when a resource group runs out of a resource it does not cause running queries to fail; instead new queries become queued. A resource group may have sub-groups or may accept queries, but may not do both.
In PrestoDB, Resource Groups are a powerful tool for managing query execution and resource allocation. They allow administrators to control how resources are allocated and utilized on a Presto cluster.
Resource Usage Limits
Resource Groups can set limits on resource usage, such as CPU time, memory usage, or total number of queries. This is particularly useful in multi-tenant environments where you want to ensure that no single user or query monopolizes the system resources.
Resource Consumption
A query belongs to a single resource group, and it consumes resources from that group, as well as its parent groups. If a resource group runs out of a certain resource, it does not cause running queries to fail. Instead, new queries will be queued until resources become available again.
Sub-Groups and Query Acceptance
Sub-groups allow for hierarchical resource allocation, where each sub-group can have its own resource limits and queueing policies. A resource group that accepts queries directly is a leaf group, and it executes queries using its allocated resources.
The resource groups and associated selection rules are configured by a manager, which is pluggable. Presto resource management can be done in two ways:
File-Based Resource Management
In a file-based resource manager, configuration information about the resource groups is stored in a JSON file. This file contains definitions of all resource groups and the rules to select them. The configuration file is loaded and used at the start of the Presto server. Any changes to the file require a restart of the Presto server to take effect.
Database-Based Resource Management
In a database-based resource manager, configuration information about the resource groups is stored in a relational database. The database contains definitions of all resource groups and the rules to select them. Unlike the file-based resource manager, changes to the configuration in the database take effect immediately and do not require a restart of the Presto server.
Both methods have their pros and cons. File-based resource management is simpler to set up but less flexible, while database-based resource management is more complex to set up but offers more flexibility and dynamic changes.
File Resource Group Manager
The file resource group manager in PrestoDB is a way to manage resources using a JSON configuration file. This file contains definitions of all resource groups and rules for selecting the appropriate resource group for a given query.
To set up a file-based resource group manager:
Create the file
etc/resource-groups.properties
.In
etc/resource-groups.properties
, set theresource-groups.configuration-manager
property tofile
using the following code example:resource-groups.configuration-manager=file
Create a JSON file in
etc
namedresource-groups.json
. This file should contain the definitions for the resource groups. Each resource group can specify things like the maximum memory, the maximum queued queries, and the maximum running queries.For information on creating resource group definitions, see Resource Group Properties. For an example of a resource-groups.json file, see File Resource Group Manager.
In
etc/resource-groups.properties
, add a line that specifies the location of the JSON file using the following code example. Set theresource-groups.config-file
property to<file_path>
, where<file_path>
is the path to the JSON file.resource-groups.config-file=etc/resource-groups.json
Restart the Presto server. The new resource groups take effect immediately after the restart.
Database Resource Group Manager
The database resource group manager in PrestoDB is a way to manage resources using a relational database.
To set up a database-based resource group manager:
Create the file
etc/resource-groups.properties
.Set the
resource-groups.configuration-manager
property todb
.Set up a relational database: The database should be accessible by Presto. It will be used to store the configurations of the resource groups.
Create tables for resource groups and selection rules: You need to create tables in the database that will store the definitions of the resource groups and the rules for selecting the appropriate resource group for a given query.
To specify the database in the Presto configuration, add a line in
etc/resource-groups.properties
that specifies the JDBC URL of the database:resource-groups.config-db-url = <jdbc_url>
, where<jdbc_url>
is the JDBC URL of the database.Note : Currently only MySQL is supported.
etc/resource-groups.properties
should be similar to this following example:
resource-groups.configuration-manager=db
resource-groups.config-db-url=jdbc:mysql://localhost:3306/resource_groups
resource-groups.config-db-user=username
resource-groups.config-db-password=password
With the Database Resource Group Manager, changes to the configuration in the database take effect immediately and do not require a restart of the Presto server. This allows for more flexibility and dynamic changes to the resource group configurations.
Database Resource Group Manager Properties
Property name | Description | Default value |
---|---|---|
| Database URL to load configuration from. |
|
| Database user to connect with. |
|
| Password for database user to connect with. |
|
| The maximum time period for which the cluster will continue to accept queries after refresh failures, causing configuration to become stale. |
|
| How often the cluster reloads from the database |
|
| Setting this flag enables usage of an additional |
|
Resource Group Properties
Resource Groups are defined using a set of properties that determine how resources are allocated and used. Here are the key properties that can be set for a Resource Group:
name
(required): The name of the resource group. This is a mandatory property.maxQueued
(required): The maximum number of queries that can be queued in the resource group. If this limit is reached, new queries will be rejected.hardCpuLimit
(optional): maximum amount of CPU time this group may use in a period.softMemoryLimit
(required): The maximum amount of memory that the resource group can use. This can be specified in absolute terms (like “10GB”) or as a percentage of the total available memory (like “50%”).hardConcurrencyLimit
(required): The maximum number of queries that the resource group can run concurrently.softConcurrencyLimit
(optional): The soft limit on the number of concurrent queries. If this limit is exceeded, the scheduler will try to prevent new queries from starting, but it won’t force running queries to stop.softCpuLimit
(optional): maximum amount of CPU time this group may use in a period (seecpuQuotaPeriod
), before a penalty is applied to the maximum number of running queries.hardCpuLimit
must also be specified.schedulingPolicy
(optional): The policy that determines how queries are scheduled within the resource group. This can be set to one of the four valuesfair
,weighted
,weighted_fair
orquery_priority
:fair
(default): queued queries are processed first-in-first-out, and sub-groups must take turns starting new queries, if they have any queued.weighted_fair
: sub-groups are selected based on theirschedulingWeight
and the number of queries they are already running concurrently. The expected share of running queries for a sub-group is computed based on the weights for all currently eligible sub-groups. The sub-group with the least concurrency relative to its share is selected to start the next query.weighted
: queued queries are selected stochastically in proportion to their priority, specified via thequery_priority
{doc}session property </sql/set-session>
. Sub groups are selected to start new queries in proportion to theirschedulingWeight
.query_priority
: all sub-groups must also be configured withquery_priority
. Queued queries are selected strictly according to their priority.
schedulingWeight
(optional): The weight for the resource group when the parent group uses theweighted
scheduling policy. A higher weight means that the group gets a larger share of the parent group’s resources.jmxExport
(optional): If set totrue
, the statistics of the resource group will be exported via JMX. Defaults tofalse
.subGroups
(optional): list of sub-groups. A list of sub-groups within the resource group. Each sub-group can have its own set of properties.
Scheduling Weight Example
Schedule weighting is a method of assigning a priority to a resource. Sub-groups with a higher scheduling weight are given higher priority. For example, to ensure timely execution of scheduled pipeline queries, weight them higher than adhoc queries.
Here’s an example:
If you have a root resource group global
with two subgroups: engineering
and marketing
. The engineering
subgroup has a scheduling weight of 3, and the marketing
subgroup has a scheduling weight of 1. In this setup, the engineering
subgroup will get 75% of the parent group’s resources (because 3 is 75% of the total weight of 4), and the marketing
subgroup will get 25% of the parent group’s resources (because 1 is 25% of the total weight of 4).
Schedule weighting allows you to prioritize certain subgroups over others in terms of resource allocation. In this example, queries from the engineering
subgroup will be prioritized over queries from the marketing
subgroup.
Selector Rules
Here are the key components of selector rules in PrestoDB:
group
(required): The group these queries will run in.user
(optional): This is a regular expression that matches the user who is submitting the query.userGroup
(optional): Regex to match against every user group the user belongs to.source
(optional): This matches the source of the query, which is typically the application submitting the query.queryType
(optional): String to match against the type of the query submitted:SELECT
:SELECT
queries.EXPLAIN
:EXPLAIN
queries (but notEXPLAIN ANALYZE
).DESCRIBE
:DESCRIBE
,DESCRIBE INPUT
,DESCRIBE OUTPUT
, andSHOW
queries.INSERT
:INSERT
,CREATE TABLE AS
, andREFRESH MATERIALIZED VIEW
queries.UPDATE
:UPDATE
queries.DELETE
:DELETE
queries.ANALYZE
:ANALYZE
queries.DATA_DEFINITION
: Queries that alter/create/drop the metadata of schemas/tables/views, and that manage prepared statements, privileges, sessions, and transactions.
clientTags
(optional): List of tags. To match, every tag in this list must be in the list of client-provided tags associated with the query.
Selectors are processed sequentially and the first one that matches will be used.
Global Properties
cpuQuotaPeriod
(optional): the period in which cpu quotas are enforced. ThecpuQuotaPeriod
is a global property often used in container-based environments to control the amount of CPU resources that a container can use in a specified period.
Please note that the exact implementation and naming of these properties can vary between different container runtimes and orchestration systems.
Providing Selector Properties
The source name can be set as follows:
CLI: use the
--source
option.JDBC driver when used in client apps: add the
source
property to the connection configuration and set the value when using a Java application that uses the JDBC Driver.JDBC driver used with Java programs: add a property with the key
source
and the value on theConnection
instance as shown in the example.
Client tags can be set as follows:
CLI: use the
--client-tags
option.JDBC driver when used in client apps: add the
clientTags
property to the connection configuration and set the value when using a Java application that uses the JDBC Driver.JDBC driver used with Java programs: add a property with the key
clientTags
and the value on theConnection
instance as shown in the example.
Example
In the example configuration below, there are several resource groups, some of which are templates. Templates allow administrators to construct resource group trees dynamically. For example, in the pipeline_${USER}
group, ${USER}
is expanded to the name of the user that submitted the query. ${SOURCE}
is also supported, which is expanded to the source that submitted the query. You may also use custom named variables in the source
and user
regular expressions.
There are four selectors, that define which queries run in which resource group:
The first selector matches queries from
bob
and places them in the admin group.The second selector matches queries from
admin
user group and places them in the admin group.The third selector matches all data definition (DDL) queries from a source name that includes
pipeline
and places them in theglobal.data_definition
group. This could help reduce queue times for this class of queries, since they are expected to be fast.The fourth selector matches queries from a source name that includes
pipeline
, and places them in a dynamically-created per-user pipeline group under theglobal.pipeline
group.The fifth selector matches queries that come from BI tools which have a source matching the regular expression
jdbc#(?<toolname>.*)
and have client provided tags that are a superset ofhipri
. These are placed in a dynamically-created sub-group under theglobal.adhoc
group. The dynamic sub-groups are created based on the values of named variablestoolname
anduser
. The values are derived from the source regular expression and the query user respectively. Consider a query with a sourcejdbc#powerfulbi
, userkayla
, and client tagshipri
andfast
. This query is routed to theglobal.adhoc.bi-powerfulbi.kayla
resource group.The last selector is a catch-all, which places all queries that have not yet been matched into a per-user adhoc group.
Together, these selectors implement the following policy:
- The user
bob
and any user belonging to user groupadmin
is an admin and can run up to 50 concurrent queries. Queries will be run based on user-provided priority.
For the remaining users:
No more than 100 total queries may run concurrently.
Up to 5 concurrent DDL queries with a source
pipeline
can run. Queries are run in FIFO order.Non-DDL queries will run under the
global.pipeline
group, with a total concurrency of 45, and a per-user concurrency of 5. Queries are run in FIFO order.For BI tools, each tool can run up to 10 concurrent queries, and each user can run up to 3. If the total demand exceeds the limit of 10, the user with the fewest running queries gets the next concurrency slot. This policy results in fairness when under contention.
All remaining queries are placed into a per-user group under
global.adhoc.other
that behaves similarly.
File Resource Group Manager
{
"rootGroups": [
{
"name": "global",
"softMemoryLimit": "80%",
"hardConcurrencyLimit": 100,
"maxQueued": 1000,
"schedulingPolicy": "weighted",
"jmxExport": true,
"subGroups": [
{
"name": "data_definition",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 5,
"maxQueued": 100,
"schedulingWeight": 1
},
{
"name": "adhoc",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 50,
"maxQueued": 1,
"schedulingWeight": 10,
"subGroups": [
{
"name": "other",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 2,
"maxQueued": 1,
"schedulingWeight": 10,
"schedulingPolicy": "weighted_fair",
"subGroups": [
{
"name": "${USER}",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 1,
"maxQueued": 100
}
]
},
{
"name": "bi-${toolname}",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 10,
"maxQueued": 100,
"schedulingWeight": 10,
"schedulingPolicy": "weighted_fair",
"subGroups": [
{
"name": "${USER}",
"softMemoryLimit": "10%",
"hardConcurrencyLimit": 3,
"maxQueued": 10
}
]
}
]
},
{
"name": "pipeline",
"softMemoryLimit": "80%",
"hardConcurrencyLimit": 45,
"maxQueued": 100,
"schedulingWeight": 1,
"jmxExport": true,
"subGroups": [
{
"name": "pipeline_${USER}",
"softMemoryLimit": "50%",
"hardConcurrencyLimit": 5,
"maxQueued": 100
}
]
}
]
},
{
"name": "admin",
"softMemoryLimit": "100%",
"hardConcurrencyLimit": 50,
"maxQueued": 100,
"schedulingPolicy": "query_priority",
"jmxExport": true
}
],
"selectors": [
{
"user": "bob",
"group": "admin"
},
{
"userGroup": "admin",
"group": "admin"
},
{
"source": ".*pipeline.*",
"queryType": "DATA_DEFINITION",
"group": "global.data_definition"
},
{
"source": ".*pipeline.*",
"group": "global.pipeline.pipeline_${USER}"
},
{
"source": "jdbc#(?<toolname>.*)",
"clientTags": ["hipri"],
"group": "global.adhoc.bi-${toolname}.${USER}"
},
{
"group": "global.adhoc.other.${USER}"
}
],
"cpuQuotaPeriod": "1h"
}
Database Resource Group Manager
This example is for a MySQL database.
--This script first creates a database named presto_resource_groups and then
--creates the resource_groups_global_properties, resource_groups, and selectors tables within that database.
--It then inserts some example data into these tables.
--Please remember to replace 'user' with the actual username in your environment.
-- Also note that this is a simple example and may not cover all your use cases.
-- Always refer to the official PrestoDB documentation for the most accurate and up-to-date information.
CREATE DATABASE IF NOT EXISTS presto_resource_groups;
USE presto_resource_groups;
CREATE TABLE IF NOT EXISTS resource_groups_global_properties (
name VARCHAR(128) NOT NULL PRIMARY KEY,
value VARCHAR(512) NOT NULL,
UNIQUE (name)
);
INSERT INTO resource_groups_global_properties (name, value)
VALUES ('cpu_quota_period', '1h');
CREATE TABLE IF NOT EXISTS resource_groups (
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(128) NOT NULL,
soft_memory_limit VARCHAR(128) NOT NULL,
hard_concurrency_limit INT NOT NULL,
max_queued INT NOT NULL,
jmx_export BOOLEAN NOT NULL,
soft_cpu_limit VARCHAR(128),
hard_cpu_limit VARCHAR(128),
scheduling_policy VARCHAR(128),
scheduling_weight INT,
parent_id BIGINT,
environment VARCHAR(128),
UNIQUE (name, environment)
);
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, jmx_export)
VALUES ('global', '80%', 100, 1000, true);
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, jmx_export, parent_id)
VALUES ('user', '50%', 50, 500, false, LAST_INSERT_ID());
CREATE TABLE IF NOT EXISTS selectors (
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
resource_group_id BIGINT NOT NULL,
user_regex VARCHAR(512),
source_regex VARCHAR(512),
query_type VARCHAR(512),
priority INT NOT NULL,
UNIQUE (resource_group_id, priority)
);
INSERT INTO selectors (resource_group_id, user_regex, priority)
VALUES (LAST_INSERT_ID(), 'user', 1);
-- global properties
-- get ID of 'other' group
SELECT resource_group_id FROM resource_groups WHERE name = 'other'; -- 4
-- create '${USER}' group with 'other' as parent.
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, environment, parent) VALUES ('${USER}', '10%', 1, 100, 'test_environment', 4);
-- create 'bi-${toolname}' group with 'adhoc' as parent
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_weight, scheduling_policy, environment, parent) VALUES ('bi-${toolname}', '10%', 10, 100, 10, 'weighted_fair', 'test_environment', 3);
-- create 'pipeline' group with 'global' as parent
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_weight, jmx_export, environment, parent) VALUES ('pipeline', '80%', 45, 100, 1, true, 'test_environment', 1);
-- create a root group 'global' with NULL parent
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_policy, jmx_export, environment) VALUES ('global', '80%', 100, 1000, 'weighted', true, 'test_environment');
-- get ID of 'global' group
SELECT resource_group_id FROM resource_groups WHERE name = 'global'; -- 1
-- create two new groups with 'global' as parent
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_weight, environment, parent) VALUES ('data_definition', '10%', 5, 100, 1, 'test_environment', 1);
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_weight, environment, parent) VALUES ('adhoc', '10%', 50, 1, 10, 'test_environment', 1);
-- get ID of 'adhoc' group
SELECT resource_group_id FROM resource_groups WHERE name = 'adhoc'; -- 3
-- create 'other' group with 'adhoc' as parent
INSERT INTO resource_groups (name, soft_memory_limit, hard_concurrency_limit, max_queued, scheduling_weight, scheduling_policy, environment, parent) VALUES ('other', '10%', 2, 1, 10, 'weighted_fair', 'test_environment', 3);
-- Selectors
-- use ID of 'admin' resource group for selector
INSERT INTO selectors (resource_group_id, user_regex, priority) VALUES ((SELECT resource_group_id FROM resource_groups WHERE name = 'admin'), 'bob', 6);
-- use ID of 'admin' resource group for selector
INSERT INTO selectors (resource_group_id, user_group_regex, priority) VALUES ((SELECT resource_group_id FROM resource_groups WHERE name = 'admin'), 'admin', 5);
-- use ID of 'global.data_definition' resource group for selector
INSERT INTO selectors (resource_group_id, source_regex, query_type, priority) VALUES ((SELECT resource_group_id FROM resource_groups WHERE name = 'data_definition'), '.*pipeline.*', 'DATA_DEFINITION', 4);
-- use ID of 'global.pipeline.pipeline_${USER}' resource group for selector
INSERT INTO selectors (resource_group_id, source_regex, priority) VALUES ((SELECT resource_group_id FROM resource_groups WHERE name = 'pipeline_${USER}'), '.*pipeline.*', 3);