YurtHub Performance Test

Background

On the one hand, YurtHub is an important component of OpenYurt: it provides an additional abstraction over the apiserver, takes over the traffic from the edge to the cloud, and supports key capabilities such as Node Autonomy and Flow Closed Loop. On the other hand, many cloud-native scenarios are constrained by the limited resources of edge nodes, so the performance of YurtHub under different conditions plays an influential role in an OpenYurt cluster. Therefore, we need a deeper understanding of YurtHub's performance.

Test Environment

Kubernetes Version

Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:53:39Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"

OpenYurt Version

GitVersion:"v0.7.0", GitCommit:"d331a42", BuildDate:"2022-08-29T13:33:43Z", GoVersion:"go1.17.12", Compiler:"gc", Platform:"linux/amd64"

Node Configuration

The master and the worker nodes use ECS instances with different configurations; the cluster contains one master and one hundred worker nodes. All worker nodes join the cluster with the yurtadm command in edge mode.
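For reference, an edge node in this version of OpenYurt typically joins with a command of the following form, where the apiserver address and the token are placeholders: yurtadm join 1.2.3.4:6443 --token=<token> --node-type=edge --discovery-token-unsafe-skip-ca-verification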

Operating System

                 Master                                  Node
LSB Version      core-4.1-amd64:core-4.1-noarch          core-4.1-amd64:core-4.1-noarch
Distributor ID   CentOS                                  CentOS
Description      CentOS Linux release 7.9.2009 (Core)    CentOS Linux release 7.9.2009 (Core)
Release          7.9.2009                                7.9.2009
Codename         Core                                    Core

CPU

                      Master                                          Node
Architecture          x86_64                                          x86_64
CPU op-mode(s)        32-bit, 64-bit                                  32-bit, 64-bit
Byte Order            Little Endian                                   Little Endian
CPU(s)                8                                               2
On-line CPU(s) list   0-7                                             0,1
Thread(s) per core    2                                               2
Core(s) per socket    4                                               4
Socket(s)             1                                               1
NUMA node(s)          1                                               1
Vendor ID             GenuineIntel                                    GenuineIntel
CPU family            6                                               6
Model                 106                                             106
Model name            Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz   Intel(R) Xeon(R) Platinum 8369B CPU @ 2.70GHz
Stepping              6                                               6
CPU MHz               2699.998                                        2699.998

Memory

               Master        Node
Total memory   32245896 K    7862304 K

Disk

             Master                 Node
Total Size   40GiB (3800 IOPS)      40GiB (3800 IOPS)
Type         ESSD Cloud Disk PL0    ESSD Cloud Disk PL1

Test Method

We use Prometheus to collect three types of indicators from the edge side of the OpenYurt cluster (a query sketch is given after the figure below):

  • Resource occupation: CPU and memory usage of the YurtHub container
  • Data traffic: the request traffic forwarded by YurtHub
  • Request latency: the delay introduced by YurtHub while forwarding requests

The overall test architecture is shown in the figure below.

[Figure 1]
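As a minimal sketch of the collection step, the following Go program queries a Prometheus server for the three indicator types. The Prometheus address, the container name, and the two node_yurthub_* metric names are assumptions to be adapted to the actual deployment; the container_* metrics are the standard cAdvisor metrics exposed by the kubelet.

    package main

    import (
        "context"
        "fmt"
        "time"

        "github.com/prometheus/client_golang/api"
        v1 "github.com/prometheus/client_golang/api/prometheus/v1"
    )

    func main() {
        // Prometheus server that scrapes the edge nodes (placeholder address).
        client, err := api.NewClient(api.Config{Address: "http://127.0.0.1:9090"})
        if err != nil {
            panic(err)
        }
        promAPI := v1.NewAPI(client)

        // One query per indicator type. The node_yurthub_* names assume
        // YurtHub's own /metrics endpoint is scraped as well.
        queries := map[string]string{
            "cpu":     `rate(container_cpu_usage_seconds_total{container="yurt-hub"}[5m])`,
            "memory":  `container_memory_working_set_bytes{container="yurt-hub"}`,
            "traffic": `sum(rate(node_yurthub_proxy_traffic_collector[5m])) by (instance)`,
            "latency": `node_yurthub_proxy_latency_collector`,
        }

        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()

        for name, q := range queries {
            result, warnings, err := promAPI.Query(ctx, q, time.Now())
            if err != nil {
                fmt.Printf("%s: query failed: %v\n", name, err)
                continue
            }
            if len(warnings) > 0 {
                fmt.Printf("%s: warnings: %v\n", name, warnings)
            }
            fmt.Printf("%s => %v\n", name, result)
        }
    }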

Test Result

Test timeline (a client-go sketch of the workload creation follows the list):

  • 15:00-19:00: worker nodes join successively
  • 19:30: create 2000 Pods and 1000 Services (the Pods are deployed as DaemonSets, 20 Pods per node; each Service contains 50 endpoints)
  • 19:35: all resources are created
  • 21:06: all resources are deleted
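The workload creation at 19:30 can be sketched with client-go as below. The names, image, kubeconfig path, and label scheme are assumptions rather than the exact manifests used in the test; in particular, this simple scheme gives every Service the Pods of one whole DaemonSet as endpoints, while the real test arranged 50 endpoints per Service.

    package main

    import (
        "context"
        "fmt"

        appsv1 "k8s.io/api/apps/v1"
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Build a clientset from a kubeconfig (path is a placeholder).
        cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
        if err != nil {
            panic(err)
        }
        cs, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            panic(err)
        }
        ctx := context.Background()

        // 20 DaemonSets x 100 edge nodes = 2000 Pods, i.e. 20 Pods per node.
        for i := 0; i < 20; i++ {
            name := fmt.Sprintf("stress-ds-%d", i)
            labels := map[string]string{"app": name}
            ds := &appsv1.DaemonSet{
                ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: "default"},
                Spec: appsv1.DaemonSetSpec{
                    Selector: &metav1.LabelSelector{MatchLabels: labels},
                    Template: corev1.PodTemplateSpec{
                        ObjectMeta: metav1.ObjectMeta{Labels: labels},
                        Spec: corev1.PodSpec{Containers: []corev1.Container{{
                            Name:  "pause",
                            Image: "registry.k8s.io/pause:3.5",
                        }}},
                    },
                },
            }
            if _, err := cs.AppsV1().DaemonSets("default").Create(ctx, ds, metav1.CreateOptions{}); err != nil {
                panic(err)
            }
        }

        // 1000 Services; each selects the Pods of one DaemonSet.
        for i := 0; i < 1000; i++ {
            svc := &corev1.Service{
                ObjectMeta: metav1.ObjectMeta{Name: fmt.Sprintf("stress-svc-%d", i), Namespace: "default"},
                Spec: corev1.ServiceSpec{
                    Selector: map[string]string{"app": fmt.Sprintf("stress-ds-%d", i%20)},
                    Ports:    []corev1.ServicePort{{Port: 80}},
                },
            }
            if _, err := cs.CoreV1().Services("default").Create(ctx, svc, metav1.CreateOptions{}); err != nil {
                panic(err)
            }
        }
    }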

Traffic

[Figure 2]

The figure above shows the overall request traffic of YurtHub in the 100-edge-node scenario. The following features can be observed:

  • Under normal conditions, the traffic shows a periodic wave with a 5-minute period, with peaks ranging from 15 to 20 KB/s
  • While the workload is being deployed, the traffic spikes once, with a peak of 350 KB/s
  • While the workload is being deleted, the traffic also spikes once, with a shorter duration and a higher peak of about 780 KB/s

To further explore the source of the traffic, we selected one machine and analyzed its traffic usage.

[Figure 3]

The figure above shows traffic usage during workload deployment; ranked by usage at the moment of the traffic spike:

  • endpointslices, watch, 240 KB/s
  • endpoints, watch, 50 KB/s
  • services, watch, 25 KB/s
  • nodes, watch, 24 KB/s
  • pods, watch, 3 KB/s

The machine's peak traffic is approximately 320 KB/s, and most of that traffic comes from watch requests related to Services. The number of endpoints per Service (50 endpoints each) is a likely cause of this pattern. Under normal conditions, the watch requests on nodes also account for the 5-minute periodic variation in traffic.

[Figure 4]

The figure above shows the machine's traffic during uninstallation; the total traffic peak reaches about 780 KB/s. Sorted by resource and verb, the traffic usage is as follows:

  • endpointslices, watch, 540 KB/s
  • services, watch, 140 KB/s
  • endpoints, watch, 100 KB/s

Latency

When collecting latency, we record two types of latency:

  • Full_latency: records the total time from a request arriving at YurtHub to the response leaving YurtHub
  • Apiserver_latency: records the time a request spends between being forwarded by YurtHub and being answered by the apiserver

    In actual testing, the two types of latency showed no noticeable difference, so we use full_latency as the standard (a small proxy sketch separating the two follows below).
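To make the two measurement points concrete, here is a minimal Go sketch of a forwarding proxy. It is a simplified stand-in for YurtHub's actual implementation; the listen address and the apiserver address are placeholders, and a real proxy to a TLS apiserver would also need transport and authentication setup.

    package main

    import (
        "context"
        "fmt"
        "net/http"
        "net/http/httputil"
        "net/url"
        "time"
    )

    type startKey struct{}

    // latencyProxy records the two latencies around a reverse proxy.
    type latencyProxy struct {
        proxy *httputil.ReverseProxy
    }

    func newLatencyProxy(apiserver *url.URL) *latencyProxy {
        p := httputil.NewSingleHostReverseProxy(apiserver)
        // ModifyResponse fires as soon as the apiserver's response header
        // arrives, before the body is streamed back to the edge client,
        // which corresponds to the apiserver_latency point.
        p.ModifyResponse = func(resp *http.Response) error {
            if start, ok := resp.Request.Context().Value(startKey{}).(time.Time); ok {
                fmt.Printf("apiserver_latency: %v\n", time.Since(start))
            }
            return nil
        }
        return &latencyProxy{proxy: p}
    }

    func (lp *latencyProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        start := time.Now()
        r = r.WithContext(context.WithValue(r.Context(), startKey{}, start))
        lp.proxy.ServeHTTP(w, r) // returns only after the full response is written
        fmt.Printf("full_latency: %v\n", time.Since(start))
    }

    func main() {
        upstream, err := url.Parse("https://127.0.0.1:6443") // placeholder apiserver
        if err != nil {
            panic(err)
        }
        if err := http.ListenAndServe("127.0.0.1:10261", newLatencyProxy(upstream)); err != nil {
            panic(err)
        }
    }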

The following figures show, for each verb, the requests on which the most time is spent:

  • Delete

[Figure 5]

  • Create

[Figure 6]

  • List

[Figure 7]

  • Update

[Figure 8]

  • Patch

[Figure 9]

  • Get

[Figure 10]

The most time-consuming requests are mainly create/get/list requests on nodes and list requests on services.

Memory

[Figure 11]

In the initial state, before the workload is deployed, YurtHub's memory usage mostly ranges from 35 to 40 MB. Two machines use noticeably more memory because the Prometheus monitoring suite is deployed on them. The bottom line shows the YurtHub instance deployed on the master in cloud mode. At 19:30, when 20 Pods were deployed on each node, the memory of each node's YurtHub grew by about 2-5 MB and stayed at that level. After the workload was deleted, memory usage dropped noticeably by about 10 MB and then recovered to the previous level.

CPU

[Figure 12]

Single-core CPU usage follows a pattern similar to the traffic usage: normally a periodic wave at a low level (approximately 0.02%), with two peaks at workload deployment (22%) and workload deletion (25%).

Conclusion and Analysis

  • Without workload pressure, YurtHub occupies approximately 30-40 MB of memory and very little CPU (< 0.02% of a single core).
    • CPU is mainly spent handling the requests received by YurtHub, and peak single-core usage can reach about 25% during resource creation.
    • The memory level is related to the distribution of the workload and varies by about 5 MB as resources are created and deleted. After all workloads are deleted, YurtHub's memory usage drops significantly and then returns to the previous level; the specific cause remains to be analyzed.
  • The traffic usage shows that during resource creation and deletion a large number of requests burst in a short time (peaks of 350 KB/s and 780 KB/s respectively), and most of the traffic comes from watch requests related to Services (endpointslices, endpoints, services).
  • The delay YurtHub adds when forwarding a request is negligible compared to the request itself; request latency is mainly related to the size of the requested resource.