Benchmarking

gRPC is designed to support high-performance open-source RPCs in many languages. This document describes the performance benchmarking tools, the scenarios considered by the tests, and the testing infrastructure.

Overview

gRPC is designed for both high-performance and high-productivity design of distributed applications. Continuous performance benchmarking is a critical part of the gRPC development workflow. Multi-language performance tests run hourly against the master branch, and these numbers are reported to a dashboard for visualization.

Performance testing design

Each language implements a performance testing worker that implements a gRPC WorkerService. This service directs the worker to act as either a client or a server for the actual benchmark test, represented as BenchmarkService. That service has two methods:

  • UnaryCall - a unary RPC of a simple request that specifies the number of bytes to return in the response
  • StreamingCall - a streaming RPC that allows repeated ping-pongs of request and response messages akin to the UnaryCall
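The semantics of the two methods above can be sketched in plain Python (no gRPC dependency; the function names and the zero-filled payloads are illustrative stand-ins for the real protobuf-based service):

```python
# Minimal stand-in for the BenchmarkService semantics. Illustrative only:
# the real service is a gRPC service with protobuf request/response types.

def unary_call(response_size: int) -> bytes:
    """UnaryCall: one request specifying how many bytes to return."""
    return b"\x00" * response_size

def streaming_call(requests):
    """StreamingCall: ping-pong, yielding one response per request."""
    for response_size in requests:
        yield b"\x00" * response_size

# Usage: a unary call, then a three-message ping-pong exchange.
assert len(unary_call(1024)) == 1024
assert [len(r) for r in streaming_call([16, 32, 64])] == [16, 32, 64]
```

The streaming variant keeps a single RPC open and exchanges many messages on it, which is why it is the basis for most of the latency and QPS scenarios below.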

gRPC performance testing worker diagram

These workers are controlled by a driver that takes as input a scenario description (in JSON format) and an environment variable specifying the host:port of each worker process.
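A rough sketch of the driver's inputs, assuming a hypothetical scenario schema and environment variable name (the field names and `QPS_WORKERS` below are illustrative, not the exact schema used by the driver):

```python
import json
import os

# Hypothetical scenario description; the real JSON schema used by the
# gRPC driver has more fields, and these names are illustrative.
scenario_json = """
{
  "name": "cpp_protobuf_async_streaming_qps",
  "num_clients": 2,
  "client_channels": 64,
  "outstanding_rpcs_per_channel": 100,
  "rpc_type": "STREAMING"
}
"""
scenario = json.loads(scenario_json)

# The driver also learns the host:port of each worker process from an
# environment variable (variable name here is an assumption).
os.environ["QPS_WORKERS"] = "10.0.0.1:10000,10.0.0.2:10010"
workers = os.environ["QPS_WORKERS"].split(",")
assert len(workers) == scenario["num_clients"]
```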

Languages under test

The following languages have continuous performance testing as both clients and servers at master:

  • C++
  • Java
  • Go
  • C#
  • node.js
  • Python
  • Ruby

Additionally, all languages derived from the C core have limited performance testing (smoke testing) conducted on every pull request.

In addition to running as both the client side and server side of performance tests, all languages are tested as clients against a C++ server, and as servers against a C++ client. This test aims to provide the current upper bound of performance for a given language’s client or server implementation without testing the other side.

Although PHP and mobile environments do not support a gRPC server (which is needed for our performance tests), their client-side performance can be benchmarked using a proxy WorkerService written in another language. This code is implemented for PHP but is not yet in continuous testing mode.

Scenarios under test

There are several important scenarios under test and displayed in the dashboard above, including the following:

  • Contentionless latency - the median and tail response latencies seen with only 1 client sending a single message at a time using StreamingCall
  • QPS - the messages/second rate when there are 2 clients and a total of 64 channels, each of which has 100 outstanding messages at a time sent using StreamingCall
  • Scalability (for selected languages) - the number of messages/second per server core
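The arithmetic behind these scenarios is straightforward; the sketch below works through the offered load of the QPS scenario using the numbers from the text, and computes median/tail latency and the per-core scalability metric from hypothetical sample data:

```python
import statistics

# Offered load implied by the QPS scenario above (numbers from the text):
# 64 channels x 100 outstanding messages = messages in flight at once.
total_channels = 64
outstanding_per_channel = 100
in_flight = total_channels * outstanding_per_channel  # 6400

# Median and tail response latency, as reported for the contentionless-
# latency scenario, computed here from hypothetical samples (microseconds).
latency_samples_us = [95, 98, 99, 102, 105, 110, 115, 120, 400, 2500]
median_us = statistics.median(latency_samples_us)
p99_us = statistics.quantiles(latency_samples_us, n=100)[98]

# Scalability metric: messages/second per server core. The total QPS
# figure is hypothetical; 8 cores matches the standard test machines.
qps_per_core = 1_200_000 / 8
```

Note how the tail (p99) is dominated by the single 2500 µs outlier even though the median is near 100 µs, which is why the dashboard reports both.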

Most performance testing uses secure communication and protobufs. Some C++ tests additionally use insecure communication and the generic (non-protobuf) API to demonstrate peak performance. Additional scenarios may be added in the future.

Testing infrastructure

All performance benchmarks run as instances in GCE through our Jenkins testing infrastructure. In addition to the gRPC performance scenarios described above, we also run baseline netperf TCP_RR latency tests in order to understand the underlying network characteristics. These numbers appear on our dashboard and sometimes vary depending on where our instances happen to be allocated within GCE.

Most test instances are 8-core systems, and these are used for both latency and QPS measurement. For C++ and Java, we additionally support QPS testing on 32-core systems. All QPS tests use 2 identical client machines for each server, to make sure that QPS measurement is not client-limited.