Labels and Selectors

Labels are key/value pairs that are attached to objects such as Pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined. Each Key must be unique for a given object.

  1. "metadata": {
  2. "labels": {
  3. "key1" : "value1",
  4. "key2" : "value2"
  5. }
  6. }

Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.

Motivation

Labels enable users to map their own organizational structures onto system objects in a loosely coupled fashion, without requiring clients to store these mappings.

Service deployments and batch processing pipelines are often multi-dimensional entities (e.g., multiple partitions or deployments, multiple release tracks, multiple tiers, multiple micro-services per tier). Management often requires cross-cutting operations, which breaks encapsulation of strictly hierarchical representations, especially rigid hierarchies determined by the infrastructure rather than by users.

Example labels:

  • "release" : "stable", "release" : "canary"
  • "environment" : "dev", "environment" : "qa", "environment" : "production"
  • "tier" : "frontend", "tier" : "backend", "tier" : "cache"
  • "partition" : "customerA", "partition" : "customerB"
  • "track" : "daily", "track" : "weekly"

These are examples of commonly used labels; you are free to develop your own conventions. Keep in mind that label Key must be unique for a given object.

Syntax and character set

Labels are key/value pairs. Valid label keys have two segments: an optional prefix and name, separated by a slash (/). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/).

If the prefix is omitted, the label Key is presumed to be private to the user. Automated system components (e.g. kube-scheduler, kube-controller-manager, kube-apiserver, kubectl, or other third-party automation) which add labels to end-user objects must specify a prefix.

The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes core components.

Valid label value:

  • must be 63 characters or less (can be empty),
  • unless empty, must begin and end with an alphanumeric character ([a-z0-9A-Z]),
  • could contain dashes (-), underscores (_), dots (.), and alphanumerics between.

For example, here’s a manifest for a Pod that has two labels environment: production and app: nginx:

  1. apiVersion: v1
  2. kind: Pod
  3. metadata:
  4. name: label-demo
  5. labels:
  6. environment: production
  7. app: nginx
  8. spec:
  9. containers:
  10. - name: nginx
  11. image: nginx:1.14.2
  12. ports:
  13. - containerPort: 80

Label selectors

Unlike names and UIDs, labels do not provide uniqueness. In general, we expect many objects to carry the same label(s).

Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.

The API currently supports two types of selectors: equality-based and set-based. A label selector can be made of multiple requirements which are comma-separated. In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical AND (&&) operator.

The semantics of empty or non-specified selectors are dependent on the context, and API types that use selectors should document the validity and meaning of them.

Note:

For some API types, such as ReplicaSets, the label selectors of two instances must not overlap within a namespace, or the controller can see that as conflicting instructions and fail to determine how many replicas should be present.

Caution:

For both equality-based and set-based conditions there is no logical OR (||) operator. Ensure your filter statements are structured accordingly.

Equality-based requirement

Equality- or inequality-based requirements allow filtering by label keys and values. Matching objects must satisfy all of the specified label constraints, though they may have additional labels as well. Three kinds of operators are admitted =,==,!=. The first two represent equality (and are synonyms), while the latter represents inequality. For example:

  1. environment = production
  2. tier != frontend

The former selects all resources with key equal to environment and value equal to production. The latter selects all resources with key equal to tier and value distinct from frontend, and all resources with no labels with the tier key. One could filter for resources in production excluding frontend using the comma operator: environment=production,tier!=frontend

One usage scenario for equality-based label requirement is for Pods to specify node selection criteria. For example, the sample Pod below selects nodes where the accelerator label exists and is set to nvidia-tesla-p100.

  1. apiVersion: v1
  2. kind: Pod
  3. metadata:
  4. name: cuda-test
  5. spec:
  6. containers:
  7. - name: cuda-test
  8. image: "registry.k8s.io/cuda-vector-add:v0.1"
  9. resources:
  10. limits:
  11. nvidia.com/gpu: 1
  12. nodeSelector:
  13. accelerator: nvidia-tesla-p100

Set-based requirement

Set-based label requirements allow filtering keys according to a set of values. Three kinds of operators are supported: in,notin and exists (only the key identifier). For example:

  1. environment in (production, qa)
  2. tier notin (frontend, backend)
  3. partition
  4. !partition
  • The first example selects all resources with key equal to environment and value equal to production or qa.
  • The second example selects all resources with key equal to tier and values other than frontend and backend, and all resources with no labels with the tier key.
  • The third example selects all resources including a label with key partition; no values are checked.
  • The fourth example selects all resources without a label with key partition; no values are checked.

Similarly the comma separator acts as an AND operator. So filtering resources with a partition key (no matter the value) and with environment different than qa can be achieved using partition,environment notin (qa). The set-based label selector is a general form of equality since environment=production is equivalent to environment in (production); similarly for != and notin.

Set-based requirements can be mixed with equality-based requirements. For example: partition in (customerA, customerB),environment!=qa.

API

LIST and WATCH filtering

For list and watch operations, you can specify label selectors to filter the sets of objects returned; you specify the filter using a query parameter. (To learn in detail about watches in Kubernetes, read efficient detection of changes). Both requirements are permitted (presented here as they would appear in a URL query string):

  • equality-based requirements: ?labelSelector=environment%3Dproduction,tier%3Dfrontend
  • set-based requirements: ?labelSelector=environment+in+%28production%2Cqa%29%2Ctier+in+%28frontend%29

Both label selector styles can be used to list or watch resources via a REST client. For example, targeting apiserver with kubectl and using equality-based one may write:

  1. kubectl get pods -l environment=production,tier=frontend

or using set-based requirements:

  1. kubectl get pods -l 'environment in (production),tier in (frontend)'

As already mentioned set-based requirements are more expressive. For instance, they can implement the OR operator on values:

  1. kubectl get pods -l 'environment in (production, qa)'

or restricting negative matching via notin operator:

  1. kubectl get pods -l 'environment,environment notin (frontend)'

Set references in API objects

Some Kubernetes objects, such as services and replicationcontrollers, also use label selectors to specify sets of other resources, such as pods.

Service and ReplicationController

The set of pods that a service targets is defined with a label selector. Similarly, the population of pods that a replicationcontroller should manage is also defined with a label selector.

Label selectors for both objects are defined in json or yaml files using maps, and only equality-based requirement selectors are supported:

  1. "selector": {
  2. "component" : "redis",
  3. }

or

  1. selector:
  2. component: redis

This selector (respectively in json or yaml format) is equivalent to component=redis or component in (redis).

Resources that support set-based requirements

Newer resources, such as Job, Deployment, ReplicaSet, and DaemonSet, support set-based requirements as well.

  1. selector:
  2. matchLabels:
  3. component: redis
  4. matchExpressions:
  5. - { key: tier, operator: In, values: [cache] }
  6. - { key: environment, operator: NotIn, values: [dev] }

matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. matchExpressions is a list of pod selector requirements. Valid operators include In, NotIn, Exists, and DoesNotExist. The values set must be non-empty in the case of In and NotIn. All of the requirements, from both matchLabels and matchExpressions are ANDed together — they must all be satisfied in order to match.

Selecting sets of nodes

One use case for selecting over labels is to constrain the set of nodes onto which a pod can schedule. See the documentation on node selection for more information.

Using labels effectively

You can apply a single label to any resources, but this is not always the best practice. There are many scenarios where multiple labels should be used to distinguish resource sets from one another.

For instance, different applications would use different values for the app label, but a multi-tier application, such as the guestbook example, would additionally need to distinguish each tier. The frontend could carry the following labels:

  1. labels:
  2. app: guestbook
  3. tier: frontend

while the Redis master and replica would have different tier labels, and perhaps even an additional role label:

  1. labels:
  2. app: guestbook
  3. tier: backend
  4. role: master

and

  1. labels:
  2. app: guestbook
  3. tier: backend
  4. role: replica

The labels allow for slicing and dicing the resources along any dimension specified by a label:

  1. kubectl apply -f examples/guestbook/all-in-one/guestbook-all-in-one.yaml
  2. kubectl get pods -Lapp -Ltier -Lrole
  1. NAME READY STATUS RESTARTS AGE APP TIER ROLE
  2. guestbook-fe-4nlpb 1/1 Running 0 1m guestbook frontend <none>
  3. guestbook-fe-ght6d 1/1 Running 0 1m guestbook frontend <none>
  4. guestbook-fe-jpy62 1/1 Running 0 1m guestbook frontend <none>
  5. guestbook-redis-master-5pg3b 1/1 Running 0 1m guestbook backend master
  6. guestbook-redis-replica-2q2yf 1/1 Running 0 1m guestbook backend replica
  7. guestbook-redis-replica-qgazl 1/1 Running 0 1m guestbook backend replica
  8. my-nginx-divi2 1/1 Running 0 29m nginx <none> <none>
  9. my-nginx-o0ef1 1/1 Running 0 29m nginx <none> <none>
  1. kubectl get pods -lapp=guestbook,role=replica
  1. NAME READY STATUS RESTARTS AGE
  2. guestbook-redis-replica-2q2yf 1/1 Running 0 3m
  3. guestbook-redis-replica-qgazl 1/1 Running 0 3m

Updating labels

Sometimes you may want to relabel existing pods and other resources before creating new resources. This can be done with kubectl label. For example, if you want to label all your NGINX Pods as frontend tier, run:

  1. kubectl label pods -l app=nginx tier=fe
  1. pod/my-nginx-2035384211-j5fhi labeled
  2. pod/my-nginx-2035384211-u2c7e labeled
  3. pod/my-nginx-2035384211-u3t6x labeled

This first filters all pods with the label “app=nginx”, and then labels them with the “tier=fe”. To see the pods you labeled, run:

  1. kubectl get pods -l app=nginx -L tier
  1. NAME READY STATUS RESTARTS AGE TIER
  2. my-nginx-2035384211-j5fhi 1/1 Running 0 23m fe
  3. my-nginx-2035384211-u2c7e 1/1 Running 0 23m fe
  4. my-nginx-2035384211-u3t6x 1/1 Running 0 23m fe

This outputs all “app=nginx” pods, with an additional label column of pods’ tier (specified with -L or --label-columns).

For more information, please see kubectl label.

What’s next