
Distributed tracing overview

As a service owner, you can use distributed tracing to instrument your services to gather insights into your service architecture. You can use distributed tracing for monitoring, network profiling, and troubleshooting the interaction between components in modern, cloud-native, microservices-based applications.

With distributed tracing you can perform the following functions:

  • Monitor distributed transactions

  • Optimize performance and latency

  • Perform root cause analysis

Red Hat OpenShift distributed tracing consists of two main components:

  • Red Hat OpenShift distributed tracing platform - This component is based on the open source Jaeger project.

  • Red Hat OpenShift distributed tracing data collection - This component is based on the open source OpenTelemetry project.

Both of these components are based on the vendor-neutral OpenTracing APIs and instrumentation.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see our CTO Chris Wright’s message.

Getting support

If you experience difficulty with a procedure described in this documentation, or with OpenShift Container Platform in general, visit the Red Hat Customer Portal. From the Customer Portal, you can:

  • Search or browse through the Red Hat Knowledgebase of articles and solutions relating to Red Hat products.

  • Submit a support case to Red Hat Support.

  • Access other product documentation.

To identify issues with your cluster, you can use Insights in OpenShift Cluster Manager. Insights provides details about issues and, if available, information on how to solve a problem.

If you have a suggestion for improving this documentation or have found an error, submit a Jira issue for the most relevant documentation component. Please provide specific details, such as the section name and OpenShift Container Platform version.

New features and enhancements

This release adds improvements related to the following components and concepts.

New features and enhancements Red Hat OpenShift distributed tracing 2.7

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.7

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.39
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.63.1

New features and enhancements Red Hat OpenShift distributed tracing 2.6

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.6

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.38
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.60

New features and enhancements Red Hat OpenShift distributed tracing 2.5

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

This release introduces support for ingesting OpenTelemetry protocol (OTLP) to the Red Hat OpenShift distributed tracing platform Operator. The Operator now automatically enables the OTLP ports:

  • Port 4317 is used for OTLP gRPC protocol.

  • Port 4318 is used for OTLP HTTP protocol.
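As an illustration of how a client might target these ports, the following sketch configures an OpenTelemetry Collector to receive OTLP on its default ports and forward traces over OTLP gRPC to port 4317 of a distributed tracing platform instance. The instance name jaeger-production and the tracing-system namespace are borrowed from the certificate examples elsewhere in these notes and are assumptions, as is the metadata name otel; adjust them, and the TLS settings, to match your deployment.

```yaml
# Sketch only: names and namespace are assumptions, not required values.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel                      # hypothetical name
  namespace: tracing-system       # assumed namespace
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:                   # listens on 4317 by default
          http:                   # listens on 4318 by default
    exporters:
      otlp:
        # OTLP gRPC port of the (assumed) jaeger-production collector service
        endpoint: jaeger-production-collector-headless.tracing-system.svc:4317
        tls:
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp]
```

Applications that export OTLP over HTTP would instead target port 4318 of the same collector service.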

This release also adds support for collecting Kubernetes resource attributes to the Red Hat OpenShift distributed tracing data collection Operator.

Component versions supported in Red Hat OpenShift distributed tracing version 2.5

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.36
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.56

New features and enhancements Red Hat OpenShift distributed tracing 2.4

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

This release also adds support for auto-provisioning certificates using the Red Hat Elasticsearch Operator.

  • Self-provisioning, which means using the Red Hat OpenShift distributed tracing platform Operator to call the Red Hat Elasticsearch Operator during installation. Self-provisioning is fully supported with this release.

  • Creating the Elasticsearch instance and certificates first and then configuring the distributed tracing platform to use the certificate is a Technology Preview for this release.
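A minimal sketch of the self-provisioning path follows. It assumes a production-strategy instance with the hypothetical name jaeger-production in an assumed tracing-system namespace. Because the storage type is elasticsearch and no existing Elasticsearch URL is supplied, the distributed tracing platform Operator requests the Elasticsearch instance, and with this release its certificates, from the Elasticsearch Operator. The node count and redundancy policy are illustrative values, not recommendations.

```yaml
# Sketch only: instance name, namespace, and sizing are assumptions.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-production         # hypothetical instance name
  namespace: tracing-system       # assumed namespace
spec:
  strategy: production
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3                        # illustrative sizing
      redundancyPolicy: SingleRedundancy  # illustrative policy
```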

When upgrading to Red Hat OpenShift distributed tracing 2.4, the Operator recreates the Elasticsearch instance, which might take five to ten minutes. Distributed tracing is unavailable during that period.

Component versions supported in Red Hat OpenShift distributed tracing version 2.4

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.34.1
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.49

New features and enhancements Red Hat OpenShift distributed tracing 2.3.1

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.3.1

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.30.2
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.44.1-1

New features and enhancements Red Hat OpenShift distributed tracing 2.3.0

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

With this release, the Red Hat OpenShift distributed tracing platform Operator is now installed to the openshift-distributed-tracing namespace by default. Before this update, the default installation had been in the openshift-operators namespace.
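For readers who install Operators from manifests rather than from the web console, the new default might correspond to a Subscription along the following lines. This is a sketch under assumptions: the package name jaeger-product, the redhat-operators catalog source, and the openshift-marketplace source namespace are not stated in these notes and should be verified against your cluster's catalog. The target namespace and a matching OperatorGroup must already exist.

```yaml
# Sketch only: package name and catalog source are assumptions.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: jaeger-product                      # assumed package name
  namespace: openshift-distributed-tracing  # default namespace as of 2.3.0
spec:
  channel: stable
  name: jaeger-product                      # assumed package name
  source: redhat-operators                  # assumed catalog source
  sourceNamespace: openshift-marketplace    # assumed catalog namespace
```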

Component versions supported in Red Hat OpenShift distributed tracing version 2.3.0

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.30.1
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.44.0

New features and enhancements Red Hat OpenShift distributed tracing 2.2.0

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.2.0

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.30.0
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.42.0

New features and enhancements Red Hat OpenShift distributed tracing 2.1.0

This release of Red Hat OpenShift distributed tracing addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.1.0

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.29.1
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.41.1

New features and enhancements Red Hat OpenShift distributed tracing 2.0.0

This release marks the rebranding of Red Hat OpenShift Jaeger to Red Hat OpenShift distributed tracing. This release consists of the following changes, additions, and improvements:

  • Red Hat OpenShift distributed tracing now consists of the following two main components:

    • Red Hat OpenShift distributed tracing platform - This component is based on the open source Jaeger project.

    • Red Hat OpenShift distributed tracing data collection - This component is based on the open source OpenTelemetry project.

  • Updates Red Hat OpenShift distributed tracing platform Operator to Jaeger 1.28. Going forward, Red Hat OpenShift distributed tracing will only support the stable Operator channel. Channels for individual releases are no longer supported.

  • Introduces a new Red Hat OpenShift distributed tracing data collection Operator based on OpenTelemetry 0.33. Note that this Operator is a Technology Preview feature.

  • Adds support for OpenTelemetry protocol (OTLP) to the Query service.

  • Introduces a new distributed tracing icon that appears in the OpenShift OperatorHub.

  • Includes rolling updates to the documentation to support the name change and new features.

This release also addresses Common Vulnerabilities and Exposures (CVEs) and includes bug fixes.

Component versions supported in Red Hat OpenShift distributed tracing version 2.0.0

Operator                                               Component      Version
Red Hat OpenShift distributed tracing platform         Jaeger         1.28.0
Red Hat OpenShift distributed tracing data collection  OpenTelemetry  0.33.0

Red Hat OpenShift distributed tracing Technology Preview

Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Red Hat OpenShift distributed tracing 2.4.0 Technology Preview

This release adds support for auto-provisioning certificates using the Red Hat Elasticsearch Operator.

  • Self-provisioning, which means using the Red Hat OpenShift distributed tracing platform Operator to call the Red Hat Elasticsearch Operator during installation. Self-provisioning is fully supported with this release.

  • Creating the Elasticsearch instance and certificates first and then configuring the distributed tracing platform to use the certificate is a Technology Preview for this release.

Red Hat OpenShift distributed tracing 2.2.0 Technology Preview

Unsupported OpenTelemetry Collector components included in the 2.1 release have been removed.

Red Hat OpenShift distributed tracing 2.1.0 Technology Preview

This release introduces a breaking change to how to configure certificates in the OpenTelemetry custom resource file. In the new version, the ca_file moves under tls in the custom resource, as shown in the following examples.

CA file configuration for OpenTelemetry version 0.33

spec:
  mode: deployment
  config: |
    exporters:
      jaeger:
        endpoint: jaeger-production-collector-headless.tracing-system.svc:14250
        ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"

CA file configuration for OpenTelemetry version 0.41.1

spec:
  mode: deployment
  config: |
    exporters:
      jaeger:
        endpoint: jaeger-production-collector-headless.tracing-system.svc:14250
        tls:
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"

Red Hat OpenShift distributed tracing 2.0.0 Technology Preview

This release includes the addition of the Red Hat OpenShift distributed tracing data collection, which you install using the Red Hat OpenShift distributed tracing data collection Operator. Red Hat OpenShift distributed tracing data collection is based on the OpenTelemetry APIs and instrumentation.

Red Hat OpenShift distributed tracing data collection includes the OpenTelemetry Operator and Collector. The Collector can be used to receive traces in either the OpenTelemetry or Jaeger protocol and send the trace data to Red Hat OpenShift distributed tracing. Other capabilities of the Collector are not supported at this time.
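The supported flow described above, receiving traces in the OpenTelemetry or Jaeger protocol and forwarding them to the distributed tracing platform, could be sketched as the following Collector configuration. The endpoint reuses the jaeger-production instance and tracing-system namespace from the certificate examples in these notes, which are assumptions about your environment. The tls block follows the layout used from OpenTelemetry 0.41.1 onward.

```yaml
# Sketch only: endpoint names are assumptions taken from the examples above.
spec:
  mode: deployment
  config: |
    receivers:
      otlp:                       # OpenTelemetry protocol
        protocols:
          grpc:
          http:
      jaeger:                     # Jaeger protocol
        protocols:
          grpc:
          thrift_http:
    exporters:
      jaeger:
        endpoint: jaeger-production-collector-headless.tracing-system.svc:14250
        tls:
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
    service:
      pipelines:
        traces:
          receivers: [otlp, jaeger]
          exporters: [jaeger]
```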

The OpenTelemetry Collector allows developers to instrument their code with vendor agnostic APIs, avoiding vendor lock-in and enabling a growing ecosystem of observability tooling.

Red Hat OpenShift distributed tracing known issues

These limitations exist in Red Hat OpenShift distributed tracing:

  • Apache Spark is not supported.

  • The streaming deployment via AMQ/Kafka is unsupported on IBM Z and IBM Power Systems.

These are the known issues for Red Hat OpenShift distributed tracing:

  • OBSDA-220 In some cases, if you try to pull an image using distributed tracing data collection, the image pull fails and a Failed to pull image error message appears. There is no workaround for this issue.

  • TRACING-2057 The Kafka API has been updated to v1beta2 to support the Strimzi Kafka Operator 0.23.0. However, this API version is not supported by AMQ Streams 1.6.3. If you have the following environment, your Jaeger services will not be upgraded, and you cannot create new Jaeger services or modify existing Jaeger services:

    • Jaeger Operator channel: 1.17.x stable or 1.20.x stable

    • AMQ Streams Operator channel: amq-streams-1.6.x

      To resolve this issue, switch the subscription channel for your AMQ Streams Operator to either amq-streams-1.7.x or stable.
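The channel switch for TRACING-2057 amounts to changing one field in the AMQ Streams Subscription. The sketch below assumes the Operator was installed with the package name amq-streams into the openshift-operators namespace from the redhat-operators catalog; these values are assumptions and might differ on your cluster.

```yaml
# Sketch only: metadata and catalog values are assumptions.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: amq-streams                       # assumed subscription name
  namespace: openshift-operators          # assumed install namespace
spec:
  channel: amq-streams-1.7.x              # or: stable
  name: amq-streams                       # assumed package name
  source: redhat-operators                # assumed catalog source
  sourceNamespace: openshift-marketplace  # assumed catalog namespace
```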

Red Hat OpenShift distributed tracing fixed issues

  • OSSM-1910 Because of an issue introduced in version 2.6, TLS connections could not be established with Red Hat OpenShift Service Mesh. This update resolves the issue by changing the service port names to match conventions used by Red Hat OpenShift Service Mesh and Istio.

  • OBSDA-208 Before this update, the default 200m CPU and 256Mi memory resource limits could cause distributed tracing data collection to restart continuously on large clusters. This update resolves the issue by removing these resource limits.

  • OBSDA-222 Before this update, spans could be dropped in the Red Hat OpenShift distributed tracing platform. To help prevent this issue from occurring, this release updates version dependencies.

  • TRACING-2337 Before this update, Jaeger logged a repetitive warning message similar to the following:

    {"level":"warn","ts":1642438880.918793,"caller":"channelz/logging.go:62","msg":"[core]grpc: Server.Serve failed to create ServerTransport: connection error: desc = \"transport: http2Server.HandleStreams received bogus greeting from client: \\\"\\\\x16\\\\x03\\\\x01\\\\x02\\\\x00\\\\x01\\\\x00\\\\x01\\\\xfc\\\\x03\\\\x03vw\\\\x1a\\\\xc9T\\\\xe7\\\\xdaCj\\\\xb7\\\\x8dK\\\\xa6\\\"\"","system":"grpc","grpc_log":true}

    This issue was resolved by exposing only the HTTP(S) port of the query service, and not the gRPC port.

  • TRACING-2009 The Jaeger Operator has been updated to include support for the Strimzi Kafka Operator 0.23.0.

  • TRACING-1907 The Jaeger agent sidecar injection was failing due to missing config maps in the application namespace. The config maps were getting automatically deleted due to an incorrect OwnerReference field setting and as a result, the application pods were not moving past the "ContainerCreating" stage. The incorrect settings have been removed.

  • TRACING-1725 Follow-up to TRACING-1631. Additional fix to ensure that Elasticsearch certificates are properly reconciled when multiple Jaeger production instances use the same name in different namespaces. See also BZ-1918920.

  • TRACING-1631 Multiple Jaeger production instances that used the same name in different namespaces caused an Elasticsearch certificate issue. When multiple service meshes were installed, all of the Jaeger Elasticsearch instances had the same Elasticsearch secret instead of individual secrets, which prevented the OpenShift Elasticsearch Operator from communicating with all of the Elasticsearch clusters.

  • TRACING-1300 Failed connection between Agent and Collector when using Istio sidecar. An update of the Jaeger Operator enabled TLS communication by default between a Jaeger sidecar agent and the Jaeger Collector.

  • TRACING-1208 Authentication "500 Internal Error" when accessing the Jaeger UI. When users tried to authenticate to the UI using OAuth, they received a 500 error because the oauth-proxy sidecar did not trust the custom CA bundle defined at installation time with the additionalTrustBundle.

  • TRACING-1166 It is not currently possible to use the Jaeger streaming strategy within a disconnected environment. When a Kafka cluster is being provisioned, it results in an error: Failed to pull image registry.redhat.io/amq7/amq-streams-kafka-24-rhel7@sha256:f9ceca004f1b7dccb3b82d9a8027961f9fe4104e0ed69752c0bdd8078b4a1076.

  • TRACING-809 Jaeger Ingester is incompatible with Kafka 2.3. When there are two or more instances of the Jaeger Ingester and enough traffic, the Ingester continuously generates rebalancing messages in the logs. This is due to a regression in Kafka 2.3 that was fixed in Kafka 2.3.1. For more information, see Jaegertracing-1819.

  • BZ-1918920/LOG-1619 The Elasticsearch pods do not get restarted automatically after an update.

    Workaround: Restart the pods manually.