
Distributed tracing records the path of a request through the various services that make up an application. It ties together information about different units of work so that you can understand the whole chain of events in a distributed transaction. The units of work might be executed in different processes or on different hosts.

Distributed tracing overview

As a service owner, you can use distributed tracing to instrument your services to gather insights into your service architecture. You can use distributed tracing for monitoring, network profiling, and troubleshooting the interaction between components in modern, cloud-native, microservices-based applications.

With distributed tracing you can perform the following functions:

  • Monitor distributed transactions

  • Optimize performance and latency

  • Perform root cause analysis

Red Hat OpenShift distributed tracing consists of two main components:

  • Red Hat OpenShift distributed tracing platform - This component is based on the open source Jaeger project.

  • Red Hat OpenShift distributed tracing data collection - This component is based on the open source OpenTelemetry project.

Both of these components are based on the vendor-neutral OpenTracing APIs and instrumentation.

Using Red Hat OpenShift distributed tracing to enable distributed tracing

Red Hat OpenShift distributed tracing is made up of several components that work together to collect, store, and display tracing data. You can use Red Hat OpenShift distributed tracing with OpenShift Serverless to monitor and troubleshoot serverless applications.

Prerequisites
  • You have access to an OpenShift Container Platform account with cluster administrator access.

  • You have not yet installed the OpenShift Serverless Operator, Knative Serving, and Knative Eventing. These must be installed after the Red Hat OpenShift distributed tracing installation.

  • You have installed Red Hat OpenShift distributed tracing by following the OpenShift Container Platform "Installing distributed tracing" documentation.

  • You have installed the OpenShift CLI (oc).

  • You have created a project or have access to a project with the appropriate roles and permissions to create applications and other workloads in OpenShift Container Platform.

Procedure
  1. Create an OpenTelemetryCollector custom resource (CR):

    Example OpenTelemetryCollector CR
    apiVersion: opentelemetry.io/v1alpha1
    kind: OpenTelemetryCollector
    metadata:
      name: cluster-collector
      namespace: <namespace>
    spec:
      mode: deployment
      config: |
        receivers:
          zipkin:
        processors:
        exporters:
          jaeger:
            endpoint: jaeger-all-in-one-inmemory-collector-headless.tracing-system.svc:14250
            tls:
              ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
          logging:
        service:
          pipelines:
            traces:
              receivers: [zipkin]
              processors: []
              exporters: [jaeger, logging]
  2. Verify that you have two pods running in the namespace where Red Hat OpenShift distributed tracing is installed:

    $ oc get pods -n <namespace>
    Example output
    NAME                                          READY   STATUS    RESTARTS   AGE
    cluster-collector-collector-85c766b5c-b5g99   1/1     Running   0          5m56s
    jaeger-all-in-one-inmemory-ccbc9df4b-ndkl5    2/2     Running   0          15m
  3. Verify that the following headless services have been created:

    $ oc get svc -n <namespace> | grep headless
    Example output
    cluster-collector-collector-headless            ClusterIP   None             <none>        9411/TCP                                 7m28s
    jaeger-all-in-one-inmemory-collector-headless   ClusterIP   None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   16m

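    These services are used to configure Jaeger, Knative Serving, and Knative Eventing. The name of the Jaeger service may vary. The endpoint URLs in the example CRs in this procedure assume that these services run in the tracing-system namespace; adjust the URLs if your namespace differs.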

  4. Install the OpenShift Serverless Operator by following the "Installing the OpenShift Serverless Operator" documentation.

  5. Install Knative Serving by creating the following KnativeServing CR:

    Example KnativeServing CR
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
      namespace: knative-serving
    spec:
      config:
        tracing:
          backend: "zipkin"
          zipkin-endpoint: "http://cluster-collector-collector-headless.tracing-system.svc:9411/api/v2/spans"
          debug: "false"
          sample-rate: "0.1" (1)
    1 The sample-rate defines sampling probability. Using sample-rate: "0.1" means that 1 in 10 traces are sampled.
  6. Install Knative Eventing by creating the following KnativeEventing CR:

    Example KnativeEventing CR
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeEventing
    metadata:
      name: knative-eventing
      namespace: knative-eventing
    spec:
      config:
        tracing:
          backend: "zipkin"
          zipkin-endpoint: "http://cluster-collector-collector-headless.tracing-system.svc:9411/api/v2/spans"
          debug: "false"
          sample-rate: "0.1" (1)
    1 The sample-rate defines sampling probability. Using sample-rate: "0.1" means that 1 in 10 traces are sampled.
  7. Create a Knative service:

    Example service
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
    spec:
      template:
        metadata:
          labels:
            app: helloworld-go
          annotations:
            autoscaling.knative.dev/minScale: "1"
            autoscaling.knative.dev/target: "1"
        spec:
          containers:
          - image: quay.io/openshift-knative/helloworld:v1.2
            imagePullPolicy: Always
            resources:
              requests:
                cpu: "200m"
            env:
            - name: TARGET
              value: "Go Sample v1"
  8. Make some requests to the service:

    Example HTTPS request
    $ curl https://helloworld-go.example.com
  9. Get the URL for the Jaeger web console:

    Example command
    $ oc get route jaeger-all-in-one-inmemory -o jsonpath='{.spec.host}' -n <namespace>

    You can now examine traces by using the Jaeger console.
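
The KnativeServing and KnativeEventing CRs in steps 5 and 6 share the same tracing stanza and differ only in kind, name, and namespace. The following is a minimal sketch, not part of the official procedure, that derives the Zipkin endpoint from the collector service name and renders either CR. The function names zipkin_endpoint and render_knative_cr are hypothetical helpers, and the sketch assumes that the CR name and namespace are identical, as they are in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: build the Zipkin endpoint URL for a collector service.
# Port 9411 and the /api/v2/spans path are taken from the example CRs above.
zipkin_endpoint() {
  svc="$1"; ns="$2"
  printf 'http://%s.%s.svc:9411/api/v2/spans\n' "$svc" "$ns"
}

# Hypothetical helper: render a Knative operator CR with tracing enabled.
# Assumes the CR name and its namespace are the same string, as in the
# examples above (knative-serving, knative-eventing).
render_knative_cr() {
  kind="$1"; name="$2"; endpoint="$3"; rate="$4"
  cat <<EOF
apiVersion: operator.knative.dev/v1beta1
kind: ${kind}
metadata:
  name: ${name}
  namespace: ${name}
spec:
  config:
    tracing:
      backend: "zipkin"
      zipkin-endpoint: "${endpoint}"
      debug: "false"
      sample-rate: "${rate}"
EOF
}

# tracing-system is the namespace used in the example endpoints above.
endpoint=$(zipkin_endpoint cluster-collector-collector-headless tracing-system)
render_knative_cr KnativeServing knative-serving "$endpoint" 0.1
```

On a cluster, the rendered YAML could be piped to `oc apply -f -` instead of being printed. Calling render_knative_cr with KnativeEventing and knative-eventing produces the Eventing variant.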

Using Jaeger to enable distributed tracing

If you do not want to install all of the components of Red Hat OpenShift distributed tracing, you can still use distributed tracing on OpenShift Container Platform with OpenShift Serverless. To do this, you must install and configure Jaeger as a standalone integration.

Prerequisites
  • You have access to an OpenShift Container Platform account with cluster administrator access.

  • You have installed the OpenShift Serverless Operator, Knative Serving, and Knative Eventing.

  • You have installed the Red Hat OpenShift distributed tracing platform Operator.

  • You have installed the OpenShift CLI (oc).

  • You have created a project or have access to a project with the appropriate roles and permissions to create applications and other workloads in OpenShift Container Platform.

Procedure
  1. Create and apply a Jaeger custom resource (CR) that contains the following:

    Jaeger CR
    apiVersion: jaegertracing.io/v1
    kind: Jaeger
    metadata:
      name: jaeger
      namespace: default
  2. Enable tracing for Knative Serving by editing the KnativeServing CR and adding a YAML configuration for tracing:

    Tracing YAML example for Serving
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
      namespace: knative-serving
    spec:
      config:
        tracing:
          sample-rate: "0.1" (1)
          backend: zipkin (2)
          zipkin-endpoint: "http://jaeger-collector.default.svc.cluster.local:9411/api/v2/spans" (3)
          debug: "false" (4)
    1 The sample-rate defines sampling probability. Using sample-rate: "0.1" means that 1 in 10 traces are sampled.
    2 backend must be set to zipkin.
    3 The zipkin-endpoint must point to your jaeger-collector service endpoint. To get this endpoint, replace default in the example URL with the namespace where the Jaeger CR is applied.
    4 Set debug to "false". Enabling debug mode by setting debug: "true" sends all spans to the server, bypassing sampling.
  3. Enable tracing for Knative Eventing by editing the KnativeEventing CR:

    Tracing YAML example for Eventing
    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeEventing
    metadata:
      name: knative-eventing
      namespace: knative-eventing
    spec:
      config:
        tracing:
          sample-rate: "0.1" (1)
          backend: zipkin (2)
          zipkin-endpoint: "http://jaeger-collector.default.svc.cluster.local:9411/api/v2/spans" (3)
          debug: "false" (4)
    1 The sample-rate defines sampling probability. Using sample-rate: "0.1" means that 1 in 10 traces are sampled.
    2 Set backend to zipkin.
    3 Point the zipkin-endpoint to your jaeger-collector service endpoint. To get this endpoint, replace default in the example URL with the namespace where the Jaeger CR is applied.
    4 Set debug to "false". Enabling debug mode by setting debug: "true" sends all spans to the server, bypassing sampling.
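
To make the interaction between sample-rate and debug concrete, the following sketch estimates how many traces reach the server for a given number of requests. The function name expected_sampled is hypothetical, and the result is only an expected value: sampling is probabilistic, so actual counts on a cluster will vary.

```shell
#!/bin/sh
# Hypothetical helper: expected_sampled REQUESTS SAMPLE_RATE DEBUG
# Estimates the number of traces sent to the server.
expected_sampled() {
  requests="$1"; rate="$2"; debug="$3"
  if [ "$debug" = "true" ]; then
    # Debug mode bypasses sampling, so every trace is sent.
    echo "$requests"
  else
    # Otherwise the expected count is requests multiplied by the sample rate.
    awk -v n="$requests" -v r="$rate" 'BEGIN { printf "%.0f\n", n * r }'
  fi
}

expected_sampled 1000 0.1 false   # prints 100
expected_sampled 1000 0.1 true    # prints 1000
```

With sample-rate: "0.1", roughly 1 in 10 requests produces a trace, so generating only a handful of requests may leave the Jaeger console empty.
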
Verification

You can access the Jaeger web console to see tracing data by using the jaeger route.

  1. Get the jaeger route’s hostname by entering the following command:

    $ oc get route jaeger -n default
    Example output
    NAME     HOST/PORT                         PATH   SERVICES       PORT    TERMINATION   WILDCARD
    jaeger   jaeger-default.apps.example.com          jaeger-query   <all>   reencrypt     None
  2. Open the endpoint address in your browser to view the console.
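
If you want to script this verification, the hostname can be extracted from the route output and turned into a URL. The following sketch parses the example output shown above; the variable names are illustrative, and the https scheme is an assumption based on the reencrypt TLS termination listed for the route.

```shell
#!/bin/sh
# Sample output copied from the example above. On a real cluster, replace the
# here-document with the output of: oc get route jaeger -n default
route_output=$(cat <<'EOF'
NAME     HOST/PORT                         PATH   SERVICES       PORT    TERMINATION   WILDCARD
jaeger   jaeger-default.apps.example.com          jaeger-query   <all>   reencrypt     None
EOF
)

# The hostname is the second column of the first data row (line 2).
host=$(printf '%s\n' "$route_output" | awk 'NR == 2 { print $2 }')

# Assumption: the route terminates TLS (reencrypt), so the console uses https.
console_url="https://${host}"
echo "$console_url"
```

As an alternative to parsing the table, the jsonpath form used earlier in this document, `-o jsonpath='{.spec.host}'`, returns only the hostname.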