OpenTelemetry

The concept of distributed tracing was first proposed by Google. The technology has since matured, and protocol standards are now available for reference. Two open-source frameworks have been particularly influential in this area: OpenTracing, a CNCF project, and Google's OpenCensus. Both built sizable developer communities. To establish a unified technical standard, the two projects merged to form OpenTelemetry, abbreviated as otel. For more details, you can refer to:

  1. Introduction to OpenTracing
  2. Introduction to OpenTelemetry

Our tracing solution is therefore built on the OpenTelemetry standard. The following open-source projects implement the standard in Go:

  1. https://github.com/open-telemetry/opentelemetry-go
  2. https://github.com/open-telemetry/opentelemetry-go-contrib

Other third-party frameworks and systems (such as Jaeger/Prometheus/Grafana) also adhere to standardized protocols to integrate with OpenTelemetry, significantly reducing development and maintenance costs.

[Figure 1: Tracing - Intro]

Key Concepts

Let's first look at the architecture diagram of OpenTelemetry. We won't cover it in full, only the commonly used concepts. For an introduction to the internal technical architecture of OpenTelemetry, you can refer to OpenTelemetry Architecture; for the tracing API specification, which defines the terms used below, please refer to: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md

[Figure 2: Tracing - Intro]

TracerProvider

The TracerProvider is mainly responsible for creating Tracers. A concrete implementation is usually supplied by a third-party distributed tracing platform. The default is an empty implementation (NoopTracerProvider): it can still create Tracers, but those Tracers do not actually record or transmit any trace data.
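The no-op behavior can be pictured with a small, stdlib-only sketch. The Tracer, TracerProvider, noopProvider and recordingProvider types below are hypothetical stand-ins, not the OpenTelemetry API; they only illustrate how a default provider can hand out usable Tracers that discard everything until a real implementation is injected.

```go
package main

import "fmt"

// Tracer and TracerProvider are simplified, hypothetical interfaces that
// mirror the roles described above; they are not the OpenTelemetry API.
type Tracer interface {
	StartSpan(name string)
}

type TracerProvider interface {
	Tracer(name string) Tracer
}

// noopProvider creates Tracers that accept calls but record nothing.
type noopProvider struct{}

type noopTracer struct{}

func (noopProvider) Tracer(string) Tracer {
	return noopTracer{}
}

func (noopTracer) StartSpan(string) {}

// recordingProvider stands in for a provider backed by a tracing platform:
// every span started through its Tracers is appended to Spans.
type recordingProvider struct {
	Spans []string
}

type recordingTracer struct {
	scope string
	p     *recordingProvider
}

func (p *recordingProvider) Tracer(name string) Tracer {
	return recordingTracer{scope: name, p: p}
}

func (t recordingTracer) StartSpan(name string) {
	t.p.Spans = append(t.p.Spans, t.scope+"/"+name)
}

// handle is business code: it depends only on the interface, so swapping
// the provider changes what is recorded without touching this function.
func handle(tp TracerProvider) {
	tp.Tracer("demo").StartSpan("GET /users")
}

func main() {
	handle(noopProvider{}) // nothing is recorded

	rec := &recordingProvider{}
	handle(rec)
	fmt.Println(rec.Spans) // [demo/GET /users]
}
```

The design point is that instrumented code never needs to know which provider is active, which is why a framework can enable tracing by default at very low cost.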

Tracer

A Tracer creates the spans that make up a trace. A trace represents one complete traced operation and consists of one or more spans. The example below illustrates a trace consisting of 8 spans:

            [Span A]  ←←←(the root span)
                |
         +------+------+
         |             |
     [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
         |             |
     [Span D]      +---+-------+
                   |           |
               [Span E]    [Span F] >>> [Span G] >>> [Span H]
                             (Span G `FollowsFrom` Span F)

The timeline representation makes it easier to understand:

    ––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time
     [Span A···················································]
       [Span B··············································]
          [Span D··········································]
        [Span C········································]
             [Span E·······]        [Span F··] [Span G··] [Span H··]
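The same eight spans can also be written down as data. The sketch below is a stdlib-only illustration: the Span struct and its fields are hypothetical simplifications, and the parent links reflect one reading of the diagram (E and F as children of C, G following F, H following G).

```go
package main

import (
	"fmt"
	"strings"
)

// Span is a simplified, hypothetical record: a real span also carries
// timestamps, attributes, events and a SpanContext.
type Span struct {
	Name   string
	Parent string // empty for the root span
	Ref    string // "ChildOf" or "FollowsFrom"
}

// exampleTrace encodes the eight spans from the diagram above.
func exampleTrace() []Span {
	return []Span{
		{Name: "A"},
		{Name: "B", Parent: "A", Ref: "ChildOf"},
		{Name: "C", Parent: "A", Ref: "ChildOf"},
		{Name: "D", Parent: "B", Ref: "ChildOf"},
		{Name: "E", Parent: "C", Ref: "ChildOf"},
		{Name: "F", Parent: "C", Ref: "ChildOf"},
		{Name: "G", Parent: "F", Ref: "FollowsFrom"},
		{Name: "H", Parent: "G", Ref: "FollowsFrom"},
	}
}

// depth counts how many references separate a span from the root.
func depth(spans []Span, name string) int {
	byName := make(map[string]Span, len(spans))
	for _, s := range spans {
		byName[s.Name] = s
	}
	d := 0
	for s := byName[name]; s.Parent != ""; s = byName[s.Parent] {
		d++
	}
	return d
}

func main() {
	spans := exampleTrace()
	for _, s := range spans {
		indent := strings.Repeat("  ", depth(spans, s.Name))
		if s.Parent == "" {
			fmt.Printf("%sSpan %s (root)\n", indent, s.Name)
			continue
		}
		fmt.Printf("%sSpan %s (%s %s)\n", indent, s.Name, s.Ref, s.Parent)
	}
}
```

Storing only a parent reference per span is enough to rebuild the whole tree, which is essentially what a tracing backend does when it assembles spans reported by different services.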

We usually create a Tracer in the following way:

    gtrace.NewTracer(tracerName)

Span

A Span is the fundamental building block of a trace. It represents a single unit of work, such as a function call or an HTTP request. A span records the following essential elements:

  • Service name (operation name)
  • Service start and end times
  • Tags in K/V form
  • Logs in K/V form
  • SpanContext

The Span is the most frequently used of these objects, so creating one is straightforward, for example:

    gtrace.NewSpan(ctx, spanName, opts...)

Attributes

Attributes are user-defined tags stored as K/V pairs, mainly used to query and filter tracing results, for example: http.method="GET", http.status_code=200. The key must be a string, and the value must be a string, boolean, or numeric type. Attributes are visible only to the span they are set on and are not passed to subsequent spans through SpanContext. Attributes are set as follows:

    // Note: the label package was renamed to attribute in later opentelemetry-go releases.
    span.SetAttributes(
        label.String("http.remote", conn.RemoteAddr().String()),
        label.String("http.local", conn.LocalAddr().String()),
    )
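The type rule for attribute values can be illustrated with a small, stdlib-only sketch. The Attribute struct and the newAttribute helper are hypothetical and exist only for this example; they are not part of OpenTelemetry or GoFrame.

```go
package main

import "fmt"

// Attribute is a hypothetical key/value pair modelled on the rule above.
type Attribute struct {
	Key   string
	Value any
}

// newAttribute accepts only string, boolean and numeric values and
// reports an error for anything else.
func newAttribute(key string, value any) (Attribute, error) {
	switch value.(type) {
	case string, bool, int, int64, float64:
		return Attribute{Key: key, Value: value}, nil
	default:
		return Attribute{}, fmt.Errorf("attribute %q: unsupported value type %T", key, value)
	}
}

func main() {
	values := []any{"GET", 200, true, map[string]string{"a": "b"}}
	for _, v := range values {
		attr, err := newAttribute("demo.key", v)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %s=%v\n", attr.Key, attr.Value)
	}
}
```

Restricting values to simple scalar types is what lets a tracing backend index attributes and filter on them efficiently.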

Events

Events are similar to Attributes and also take the form of K/V pairs. Unlike Attributes, an Event also records the time at which it is written, so Events are mainly used to record when something happened. The key of an Event must also be a string, but the type of the value is not restricted. For example:

    // Record the incoming request as a timestamped event on the span.
    span.AddEvent("http.request", trace.WithAttributes(
        label.Any("http.request.header", headers),
        label.Any("http.request.baggage", gtrace.GetBaggageMap(ctx)),
        label.String("http.request.body", bodyContent),
    ))
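What sets an Event apart is the timestamp. The following stdlib-only sketch uses hypothetical Event and Span types (not the OpenTelemetry ones) to show how recording the write time makes it possible to measure the gap between two moments inside one span.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical timestamped record attached to a span.
type Event struct {
	Name       string
	Time       time.Time
	Attributes map[string]string
}

// Span is a minimal stand-in that keeps only its events.
type Span struct {
	Events []Event
}

// AddEvent stamps the event with the current time, which is what
// distinguishes it from a plain attribute.
func (s *Span) AddEvent(name string, attrs map[string]string) {
	s.Events = append(s.Events, Event{Name: name, Time: time.Now(), Attributes: attrs})
}

func main() {
	span := &Span{}
	span.AddEvent("http.request", map[string]string{"http.request.body": "{}"})
	time.Sleep(10 * time.Millisecond) // simulated work between the two events
	span.AddEvent("http.response", map[string]string{"http.response.body": "ok"})

	elapsed := span.Events[1].Time.Sub(span.Events[0].Time)
	fmt.Printf("response event recorded %v after request event\n", elapsed)
}
```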

SpanContext

SpanContext carries the data used for cross-service (cross-process) communication, mainly including:

  • Information sufficient to identify the span within the system, e.g., span_id and trace_id.

  • Baggage: user-defined K/V data that is carried across services (processes) along the entire tracing link. Baggage is similar to Attributes and also uses K/V pairs. The differences are:

      • Both key and value can only be strings.

      • Baggage is visible not only to the current span but is also passed to all subsequent child spans through SpanContext. Use Baggage with caution, because transmitting these K/V pairs with every span incurs non-trivial network and CPU overhead.
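The difference in visibility between Attributes and Baggage can be shown with a short, stdlib-only model. The Span type and its StartChild method below are hypothetical and deliberately simplified; they only demonstrate that baggage is copied to child spans while attributes stay local.

```go
package main

import "fmt"

// Span is a hypothetical model holding the two kinds of key/value data.
type Span struct {
	Name       string
	Attributes map[string]string // local to this span
	Baggage    map[string]string // copied to every child
}

// StartChild creates a child span: baggage is inherited, attributes are not.
func (s *Span) StartChild(name string) *Span {
	child := &Span{
		Name:       name,
		Attributes: map[string]string{},
		Baggage:    make(map[string]string, len(s.Baggage)),
	}
	for k, v := range s.Baggage {
		child.Baggage[k] = v
	}
	return child
}

func main() {
	root := &Span{
		Name:       "root",
		Attributes: map[string]string{"http.method": "GET"},
		Baggage:    map[string]string{"user.id": "42"},
	}
	child := root.StartChild("db.query")
	fmt.Println(len(child.Attributes), child.Baggage["user.id"]) // 0 42
}
```

Because every child receives its own copy, each extra baggage entry is paid for on every hop, which is the overhead the warning above refers to.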

Propagator

The Propagator encodes and decodes data for end-to-end transmission, such as from the Client to the Server. TraceId, SpanId, and Baggage are all transmitted through the propagator. Business developers are usually unaware of the Propagator; only middleware and interceptor developers need to understand its role. The standard OpenTelemetry implementation library provides a commonly used TextMapPropagator for transmitting plain-text data end to end. To keep the transmitted data compatible, it should not contain special characters. For details, please refer to: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md

The GoFrame framework uses the following propagator objects through the gtrace module and sets them globally in OpenTelemetry:

    // defaultTextMapPropagator is the default propagator for context propagation between peers.
    defaultTextMapPropagator = propagation.NewCompositeTextMapPropagator(
        propagation.TraceContext{},
        propagation.Baggage{},
    )
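To make the role of these two propagators more concrete, here is a stdlib-only sketch of the kind of text they place on the wire. It follows the W3C traceparent and baggage header layouts in simplified form; the inject and extract helpers are hypothetical and are not the OpenTelemetry implementation, which handles more cases (trace state, header limits, stricter validation).

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// inject writes the trace identifiers and baggage into a text carrier
// (for example HTTP headers), using the W3C header names and layout.
func inject(carrier map[string]string, traceID, spanID string, baggage map[string]string) {
	// Version "00", then trace ID, parent span ID and flags "01" (sampled).
	carrier["traceparent"] = fmt.Sprintf("00-%s-%s-01", traceID, spanID)

	members := make([]string, 0, len(baggage))
	for k, v := range baggage {
		// Values are percent-encoded so special characters survive transport.
		members = append(members, k+"="+url.PathEscape(v))
	}
	sort.Strings(members) // deterministic order for the example
	if len(members) > 0 {
		carrier["baggage"] = strings.Join(members, ",")
	}
}

// extract is the inverse of inject; ok is false when traceparent is malformed.
func extract(carrier map[string]string) (traceID, spanID string, baggage map[string]string, ok bool) {
	parts := strings.Split(carrier["traceparent"], "-")
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 {
		return "", "", nil, false
	}
	baggage = map[string]string{}
	for _, member := range strings.Split(carrier["baggage"], ",") {
		k, v, found := strings.Cut(member, "=")
		if !found {
			continue
		}
		if value, err := url.PathUnescape(v); err == nil {
			baggage[strings.TrimSpace(k)] = value
		}
	}
	return parts[1], parts[2], baggage, true
}

func main() {
	headers := map[string]string{}
	inject(headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7",
		map[string]string{"user.name": "Ann Lee"})
	fmt.Println(headers["traceparent"])
	fmt.Println(headers["baggage"])

	traceID, spanID, baggage, ok := extract(headers)
	fmt.Println(traceID, spanID, baggage["user.name"], ok)
}
```

A client-side interceptor would call something like inject before sending a request, and the server-side middleware would call extract before starting its own span, so business code never touches the headers directly.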

Supported Components

[Figure 3: Tracing - Intro]

Tip: The core components of GoFrame fully support the OpenTelemetry standard and enable tracing automatically; developers do not need to call or configure it explicitly. If no external TracerProvider is injected, the framework uses its default TracerProvider, which only creates a TraceID and SpanID automatically so that request logs can be traced, and performs no further processing.

These include, but are not limited to, the following core components:

| Components with Auto Tracing Support | Component Name | Description |
|---|---|---|
| HTTP Client | gclient | The HTTP client automatically enables the tracing feature. For specific usage examples, please refer to the subsequent example chapter. |
| HTTP Server | ghttp | The HTTP server automatically enables the tracing feature. For specific usage examples, please refer to the subsequent example chapter. |
| gRPC Client | contrib/rpc/grpcx | The gRPC client automatically enables the tracing feature. For specific usage examples, please refer to the subsequent example chapter. |
| gRPC Server | contrib/rpc/grpcx | The gRPC server automatically enables the tracing feature. For specific usage examples, please refer to the subsequent example chapter. |
| Logging | glog | The log content needs to include the current request's TraceId so that issues can be located quickly through logs. This feature is implemented by the glog component. When outputting logs, developers need to call the Ctx chain operation method to pass the context.Context variable to the current logging operation chain. Failing to pass the context.Context will result in losing the TraceId in the log content. |
| ORM | gdb | Database execution is an essential part of the link. The ORM component needs to deliver its execution information into the tracing as part of the execution trace. |
| NoSQL Redis | gredis | Redis execution is also an essential part of the tracing. Redis needs to deliver its execution information into the tracing as part of the execution trace. |
| Utils | gtrace | Managing the tracing feature requires some encapsulation, mainly for extensibility and usability. This encapsulation is implemented by the gtrace module. The documentation can be found at: https://pkg.go.dev/github.com/gogf/gf/v2/net/gtrace |

Reference Materials