Kubernetes Infrastructure - Infrastructure Components | Architecture

Overview

Within OpenShift, Kubernetes manages containerized applications across a set of containers or hosts and provides mechanisms for deployment, maintenance, and application scaling. Docker packages, instantiates, and runs containerized applications.

A Kubernetes cluster consists of one or more masters and a set of nodes.

Masters

The master is the host or hosts that contain the master components, including the API server, controller manager server, and etcd. The master manages nodes in its Kubernetes cluster and schedules pods to run on nodes.

Table 1. Master Components
Component	Description
API Server	The Kubernetes API server validates and configures the data for pods, services, and replication controllers. It also assigns pods to nodes and synchronizes pod information with service configuration.
etcd	etcd stores the persistent master state while other components watch etcd for changes to bring themselves into the desired state. etcd can be optionally configured for high availability, typically deployed with 2n+1 peer services.
Controller Manager Server	The controller manager server watches etcd for changes to replication controller objects and then uses the API to enforce the desired state.
Pacemaker	Optional, used when configuring highly available masters. Pacemaker is the core technology of the High Availability Add-on for Red Hat Enterprise Linux, providing consensus, fencing, and service management. It can be run on all master hosts to ensure that every active-passive component has exactly one instance running.
Virtual IP	Optional, used when configuring highly available masters. The virtual IP (VIP) is the single point of contact, but not a single point of failure, for all OpenShift clients that either cannot be configured with all master service endpoints, or cannot load balance across multiple masters or retry failed master service connections. There is one VIP, and it is managed by Pacemaker.
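
The 2n+1 sizing mentioned for etcd in Table 1 follows from majority quorum: a cluster of 2n+1 peers still has a majority as long as no more than n peers fail. The following Python sketch illustrates only that arithmetic. The function names are hypothetical and are not part of etcd or OpenShift.

```python
def quorum_size(peers: int) -> int:
    """Smallest majority of a cluster with `peers` members."""
    if peers < 1:
        raise ValueError("a cluster needs at least one peer")
    return peers // 2 + 1


def tolerated_failures(peers: int) -> int:
    """Number of peers that can fail while a majority remains."""
    return peers - quorum_size(peers)


if __name__ == "__main__":
    for n in range(3):
        peers = 2 * n + 1
        print(f"{peers} peers: quorum {quorum_size(peers)}, "
              f"tolerates {tolerated_failures(peers)} failure(s)")
```

Under this arithmetic, an even-sized cluster tolerates no more failures than the odd-sized cluster one peer smaller, which is one reason odd peer counts are typical.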

High Availability Masters

You can optionally configure your masters for high availability (HA) to ensure that the cluster has no single point of failure.

To mitigate concerns about availability of the master, two activities are recommended:

Create a runbook entry for reconstructing the master. A runbook entry is a necessary backstop for any highly available service; additional solutions merely control how often the runbook must be consulted. For example, a cold standby of the master host can adequately fulfill SLAs that require no more than minutes of downtime for creation of new applications or recovery of failed application components.
Use a high availability solution to configure your masters and ensure that the cluster has no single point of failure. The advanced installation method provides specific examples using Pacemaker as the management technology, which Red Hat recommends. However, you can apply the same concepts to your existing high availability solutions.

In production OpenShift Enterprise clusters, you must maintain high availability of the API Server load balancer. If the API Server load balancer is not available, nodes cannot report their status, all their pods are marked dead, and the pods' endpoints are removed from the service.

In addition to configuring HA for OpenShift Enterprise, you must separately configure HA for the API Server load balancer. The preferred approach is to integrate an enterprise load balancer (LB) such as an F5 Big-IP™ or a Citrix Netscaler™ appliance. If such a solution is not available, you can run multiple HAProxy load balancers and use Keepalived to provide a floating virtual IP address for HA. However, this approach is not recommended for production instances.

Moving from a single master cluster to multiple masters after installation is not supported.

When using Pacemaker, master components have the following availability:

Table 2. Availability Matrix
Role	Style	Notes
etcd	Active-active	Fully redundant deployment with load balancing
Master service	Active-passive	One active at a time, managed by Pacemaker
Pacemaker	Active-active	Fully redundant deployment
Virtual IP	Active-passive	One active at a time, managed by Pacemaker

Figure 1. Highly-available Masters Using Pacemaker

Nodes

A node provides the runtime environments for containers. Each node in a Kubernetes cluster has the required services to be managed by the master. Nodes also have the required services to run pods, including Docker, a kubelet, and a service proxy.

OpenShift creates nodes from a cloud provider, physical systems, or virtual systems. Kubernetes interacts with node objects that are a representation of those nodes. The master uses the information from node objects to validate nodes with health checks. A node is ignored until it passes the health checks, and the master continues checking nodes until they are valid. The Kubernetes documentation has more information on node management.
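
The health-check behavior described above can be pictured as a filter that is re-evaluated on every pass. This is a minimal sketch, assuming a caller-supplied check function; `schedulable_nodes` and the sample health data are hypothetical and do not reflect the master's actual implementation.

```python
from typing import Callable, Dict, List


def schedulable_nodes(nodes: List[str],
                      is_healthy: Callable[[str], bool]) -> List[str]:
    """Return only the nodes that currently pass the health check."""
    return [name for name in nodes if is_healthy(name)]


if __name__ == "__main__":
    # Hypothetical health state: node2 fails the first pass, then recovers.
    health: Dict[str, bool] = {"node1.example.com": True,
                               "node2.example.com": False}

    # First pass: node2 is ignored because it fails the check.
    print(schedulable_nodes(list(health), lambda name: health[name]))

    # The check is repeated, so a node that becomes valid is picked up later.
    health["node2.example.com"] = True
    print(schedulable_nodes(list(health), lambda name: health[name]))
```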

Administrators can manage nodes in an OpenShift instance using the CLI. To define full configuration and security options when launching node servers, use dedicated node configuration files.

Kubelet

Each node has a kubelet that updates the node as specified by a container manifest, which is a YAML file that describes a pod. The kubelet uses a set of manifests to ensure that its containers are started and that they continue to run. A sample manifest can be found in the Kubernetes documentation.

A container manifest can be provided to a kubelet by:

A file path on the command line that is checked every 20 seconds.
An HTTP endpoint passed on the command line that is checked every 20 seconds.
An etcd server that the kubelet watches, such as /registry/hosts/$(hostname -f), acting on any changes.
An HTTP listener in the kubelet that responds to a simple API for submitting a new manifest.
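
As an illustration of the first option, the sketch below polls a manifest file at a fixed interval and reacts only when its content changes. It is written under stated assumptions and is not the kubelet's own code: the function names, the use of a SHA-256 digest to detect changes, and the `on_change` callback are all hypothetical. Only the 20-second interval comes from the text above.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional

POLL_INTERVAL_SECONDS = 20  # interval stated in the text above


def manifest_digest(path: Path) -> Optional[str]:
    """Return a SHA-256 digest of the manifest, or None if it is missing."""
    try:
        return hashlib.sha256(path.read_bytes()).hexdigest()
    except FileNotFoundError:
        return None


def poll_once(path: Path, last_digest: Optional[str],
              on_change: Callable[[Path], None]) -> Optional[str]:
    """Check the manifest once; call on_change only if its content changed."""
    digest = manifest_digest(path)
    if digest is not None and digest != last_digest:
        on_change(path)
    return digest


def watch(path: Path, on_change: Callable[[Path], None]) -> None:
    """Poll the manifest forever at the fixed interval."""
    last: Optional[str] = None
    while True:
        last = poll_once(path, last, on_change)
        time.sleep(POLL_INTERVAL_SECONDS)
```

Separating `poll_once` from the loop keeps the change-detection step easy to exercise without waiting for the interval to elapse.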

Service Proxy

Each node also runs a simple network proxy that reflects the services defined in the API on that node. This allows the node to do simple TCP and UDP stream forwarding across a set of back ends.
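
To make "forwarding across a set of back ends" concrete, the sketch below picks a back end for each new connection in rotation. The round-robin policy is an assumption made for illustration; the text above does not state which policy the proxy uses. The `round_robin` helper and the sample addresses are hypothetical, and no actual socket forwarding is shown.

```python
import itertools
from typing import Iterator, List, Tuple

Backend = Tuple[str, int]  # (host, port)


def round_robin(backends: List[Backend]) -> Iterator[Backend]:
    """Return an endless rotation over back ends, one per new connection."""
    if not backends:
        raise ValueError("service has no back ends")
    return itertools.cycle(backends)


if __name__ == "__main__":
    # Hypothetical pod endpoints behind a single service.
    picker = round_robin([("10.1.0.5", 8080), ("10.1.1.7", 8080)])
    for _ in range(3):
        print(next(picker))
```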

Node Object Definition

The following is an example node object definition in Kubernetes:

apiVersion: v1 # (1)
kind: Node # (2)
metadata:
  creationTimestamp: null
  labels: # (3)
    kubernetes.io/hostname: node1.example.com
  name: node1.example.com # (4)
spec:
  externalID: node1.example.com # (5)
status:
  nodeInfo:
    bootID: ""
    containerRuntimeVersion: ""
    kernelVersion: ""
    kubeProxyVersion: ""
    kubeletVersion: ""
    machineID: ""
    osImage: ""
    systemUUID: ""

1	`apiVersion` defines the API version to use.
2	`kind` set to `Node` identifies this as a definition for a node object.
3	`metadata.labels` lists any labels that have been added to the node.
4	`metadata.name` is a required value that defines the name of the node object. This value is shown in the `NAME` column when running the `oc get nodes` command.
5	`spec.externalID` defines the fully-qualified domain name where the node can be reached. Defaults to the `metadata.name` value when empty.

The REST API Reference has more details on these definitions.
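
The fields described in the callouts can also be assembled programmatically. The following Python sketch builds a minimal node definition as a dictionary and mirrors the documented `spec.externalID` default on the client side, purely for illustration. The `node_object` helper is hypothetical and is not part of any OpenShift or Kubernetes client library.

```python
import json
from typing import Any, Dict


def node_object(name: str, external_id: str = "") -> Dict[str, Any]:
    """Build a minimal node definition as a plain dictionary."""
    if not name:
        raise ValueError("metadata.name is required")
    return {
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            "name": name,
            "labels": {"kubernetes.io/hostname": name},
        },
        # Mirror the documented default: an empty externalID
        # falls back to the metadata.name value.
        "spec": {"externalID": external_id or name},
    }


if __name__ == "__main__":
    print(json.dumps(node_object("node1.example.com"), indent=2))
```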