×

Overview

Within OpenShift Enterprise, Kubernetes manages containerized applications across a set of containers or hosts and provides mechanisms for deployment, maintenance, and application-scaling. Docker packages, instantiates, and runs containerized applications.

A Kubernetes cluster consists of one or more masters and a set of nodes. You can optionally configure your masters for high availability (HA) to ensure that the cluster has no single point of failure.

Masters

The master is the host or hosts that contain the master components, including the API server, controller manager server, and etcd. The master manages nodes in its Kubernetes cluster and schedules pods to run on nodes.

Table 1. Master Components
Component Description

API Server

The Kubernetes API server validates and configures the data for pods, services, and replication controllers. It also assigns pods to nodes and synchronizes pod information with service configuration. Can be run as a standalone process.

etcd

etcd stores the persistent master state while other components watch etcd for changes to bring themselves into the desired state. etcd can be optionally configured for high availability, typically deployed with 2n+1 peer services.

Controller Manager Server

The controller manager server watches etcd for changes to replication controller objects and then uses the API to enforce the desired state. Can be run as a standalone process. Several such processes create a cluster with one active leader at a time.

Virtual IP

Optional, used when configuring highly-available masters with the pacemaker method. There is one virtual IP (VIP) and it is managed by Pacemaker.

The VIP is the single point of contact, but not a single point of failure, for all OpenShift clients that:

  • cannot be configured with all master service endpoints, or

  • do not know how to load balance across multiple masters nor retry failed master service connections.

HAProxy

Optional, used when configuring highly-available masters with the native method to balance load between API master endpoints.

The advanced installation method can configure HAProxy for you with the native method. Alternatively, you can use the native method but pre-configure your own load balancer of choice, or use the pacemaker HA method instead.

Pacemaker

Optional, used when configuring highly-available masters with the pacemaker method. Requires a High Availability Add-on for Red Hat Enterprise Linux subscription.

Pacemaker is the core technology of the High Availability Add-on for Red Hat Enterprise Linux, providing consensus, fencing, and service management. It can be run on all master hosts to ensure that all active-passive components have one instance running.

Another option is to use HAProxy load balancer to switch between API endpoints.

High Availability Masters

You can optionally configure your masters for high availability (HA) to ensure that the cluster has no single point of failure.

To mitigate concerns about availability of the master, two activities are recommended:

  1. A runbook entry should be created for reconstructing the master. A runbook entry is a necessary backstop for any highly-available service. Additional solutions merely control the frequency that the runbook must be consulted. For example, a cold standby of the master host can adequately fulfill SLAs that require no more than minutes of downtime for creation of new applications or recovery of failed application components.

  2. Use a high availability solution to configure your masters and ensure that the cluster has no single point of failure. The advanced installation method

provides specific examples using either the native or pacemaker HA method, configuring HAProxy or Pacemaker, respectively. You can also take the concepts and apply them towards your existing HA solutions using the native method instead of HAProxy.

Moving from a single master cluster to multiple masters after installation is not supported.

When using the native HA method with HAProxy, master components have the following availability:

Table 2. Availability Matrix with HAProxy
Role Style Notes

etcd

Active-active

Fully redundant deployment with load balancing

API Server

Active-active

Managed by HAProxy

Controller Manager Server

Active-passive

One instance is elected as a cluster leader at a time

HAProxy

Active-passive

Balances load between API master endpoints

When using the pacemaker method, the availability table is slightly different:

Table 3. Availability Matrix with Pacemaker
Role Style Notes

etcd

Active-active

Fully redundant deployment with load balancing

Master service

Active-passive

One active at a time, managed by Pacemaker

Pacemaker

Active-active

Fully redundant deployment

Virtual IP

Active-passive

One active at a time, managed by Pacemaker

Highly-available Masters Using Pacemaker
Figure 1. Highly-available Masters Using Pacemaker

Nodes

A node provides the runtime environments for containers. Each node in a Kubernetes cluster has the required services to be managed by the master. Nodes also have the required services to run pods, including Docker, a kubelet, and a service proxy.

OpenShift Enterprise creates nodes from a cloud provider, physical systems, or virtual systems. Kubernetes interacts with node objects that are a representation of those nodes. The master uses the information from node objects to validate nodes with health checks. A node is ignored until it passes the health checks, and the master continues checking nodes until they are valid. The Kubernetes documentation has more information on node management.

Administrators can manage nodes in an OpenShift Enterprise instance using the CLI. To define full configuration and security options when launching node servers, use dedicated node configuration files.

Kubelet

Each node has a kubelet that updates the node as specified by a container manifest, which is a YAML file that describes a pod. The kubelet uses a set of manifests to ensure that its containers are started and that they continue to run. A sample manifest can be found in the Kubernetes documentation.

A container manifest can be provided to a kubelet by:

  • A file path on the command line that is checked every 20 seconds.

  • An HTTP endpoint passed on the command line that is checked every 20 seconds.

  • The kubelet watching an etcd server, such as /registry/hosts/$(hostname -f), and acting on any changes.

  • The kubelet listening for HTTP and responding to a simple API to submit a new manifest.

Service Proxy

Each node also runs a simple network proxy that reflects the services defined in the API on that node. This allows the node to do simple TCP and UDP stream forwarding across a set of back ends.

Node Object Definition

The following is an example node object definition in Kubernetes:

apiVersion: v1 (1)
kind: Node (2)
metadata:
  creationTimestamp: null
  labels: (3)
    kubernetes.io/hostname: node1.example.com
  name: node1.example.com (4)
spec:
  externalID: node1.example.com (5)
status:
  nodeInfo:
    bootID: ""
    containerRuntimeVersion: ""
    kernelVersion: ""
    kubeProxyVersion: ""
    kubeletVersion: ""
    machineID: ""
    osImage: ""
    systemUUID: ""
1 apiVersion defines the API version to use.
2 kind set to Node identifies this as a definition for a node object.
3 metadata.labels lists any labels that have been added to the node.
4 metadata.name is a required value that defines the name of the node object. This value is shown in the NAME column when running the oc get nodes command.
5 spec.externalID defines the fully-qualified domain name where the node can be reached. Defaults to the metadata.name value when empty.

The REST API Reference has more details on these definitions.