This guide describes the built-in monitoring support provided by the Operator SDK using the Prometheus Operator and details usage for Operator authors.

Prometheus Operator support

Prometheus is an open-source systems monitoring and alerting toolkit. The Prometheus Operator creates, configures, and manages Prometheus clusters running on Kubernetes-based clusters, such as OpenShift Container Platform.

Helper functions exist in the Operator SDK by default to automatically set up metrics in any generated Go-based Operator for use on clusters where the Prometheus Operator is deployed.

Metrics helper

In Go-based Operators generated using the Operator SDK, the following function exposes general metrics about the running program:

func ExposeMetricsPort(ctx context.Context, port int32) (*v1.Service, error)

These metrics are inherited from the controller-runtime library API. By default, the metrics are served on 0.0.0.0:8383/metrics.

A Service object is created with the metrics port exposed, which can be then accessed by Prometheus. The Service object is garbage collected when the leader Pod’s root owner is deleted.

The following example is present in the cmd/manager/main.go file in all Operators generated using the Operator SDK:

import(
    "github.com/operator-framework/operator-sdk/pkg/metrics"
    "machine.openshift.io/controller-runtime/pkg/manager"
)

var (
    // Change the below variables to serve metrics on a different host or port.
    metricsHost       = "0.0.0.0" (1)
    metricsPort int32 = 8383 (2)
)
...
func main() {
    ...
    // Pass metrics address to controller-runtime manager
    mgr, err := manager.New(cfg, manager.Options{
        Namespace:          namespace,
        MetricsBindAddress: fmt.Sprintf("%s:%d", metricsHost, metricsPort),
    })

    ...
    // Create Service object to expose the metrics port.
    _, err = metrics.ExposeMetricsPort(ctx, metricsPort)
    if err != nil {
        // handle error
        log.Info(err.Error())
    }
    ...
}
1 The host that the metrics are exposed on.
2 The port that the metrics are exposed on.

Modifying the metrics port

Operator authors can modify the port that metrics are exposed on.

Prerequisites
  • Go-based Operator generated using the Operator SDK

  • Kubernetes-based cluster with the Prometheus Operator deployed

Procedure
  • In the generated Operator’s cmd/manager/main.go file, change the value of metricsPort in the line var metricsPort int32 = 8383.

ServiceMonitor resources

A ServiceMonitor is a Custom Resource Definition (CRD) provided by the Prometheus Operator that discovers the Endpoints in Service objects and configures Prometheus to monitor those Pods.

In Go-based Operators generated using the Operator SDK, the GenerateServiceMonitor() helper function can take a Service object and generate a ServiceMonitor Custom Resource (CR) based on it.

Additional resources

Creating ServiceMonitor resources

Operator authors can add Service target discovery of created monitoring Services using the metrics.CreateServiceMonitor() helper function, which accepts the newly created Service.

Prerequisites
  • Go-based Operator generated using the Operator SDK

  • Kubernetes-based cluster with the Prometheus Operator deployed

Procedure
  • Add the metrics.CreateServiceMonitor() helper function to your Operator code:

    import(
        "k8s.io/api/core/v1"
        "github.com/operator-framework/operator-sdk/pkg/metrics"
        "machine.openshift.io/controller-runtime/pkg/client/config"
    )
    func main() {
    
        ...
        // Populate below with the Service(s) for which you want to create ServiceMonitors.
        services := []*v1.Service{}
        // Create one ServiceMonitor per application per namespace.
        // Change the below value to name of the Namespace you want the ServiceMonitor to be created in.
        ns := "default"
        // restConfig is used for talking to the Kubernetes apiserver
        restConfig := config.GetConfig()
    
        // Pass the Service(s) to the helper function, which in turn returns the array of ServiceMonitor objects.
        serviceMonitors, err := metrics.CreateServiceMonitors(restConfig, ns, services)
        if err != nil {
            // Handle errors here.
        }
        ...
    }