This topic describes configuring IP failover for pods and services on your OpenShift Container Platform cluster.
IP failover manages a pool of Virtual IP (VIP) addresses on a set of nodes. Every VIP in the set is serviced by a node selected from the set. As long as a single node is available, the VIPs are served. There is no way to explicitly distribute the VIPs over the nodes, so there can be nodes with no VIPs and other nodes with many VIPs. If there is only one node, all VIPs are on it.
The VIPs must be routable from outside the cluster.
IP failover monitors a port on each VIP to determine whether the port is reachable on the node. If the port is not reachable, the VIP is not assigned to the node. If the port is set to 0, this check is suppressed. The check script does the needed testing.
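For example, a minimal sketch of the monitoring port setting in the IP failover container environment; the port value is illustrative and matches the deployment example later in this topic:

env:
- name: OPENSHIFT_HA_MONITOR_PORT
  value: "30060"   # TCP check against this port on each VIP
  # Setting value: "0" suppresses the TCP check; only the check script is used.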
IP failover uses Keepalived to host a set of externally accessible VIP addresses on a set of hosts. Each VIP is only serviced by a single host at a time. Keepalived uses the Virtual Router Redundancy Protocol (VRRP) to determine which host, from the set of hosts, services which VIP. If a host becomes unavailable, or if the service that Keepalived is watching does not respond, the VIP is switched to another host from the set. This means a VIP is always serviced as long as a host is available.
When a node running Keepalived passes the check script, the VIP on that node can enter the master state, depending on its priority, the priority of the current master, and the preemption strategy.
A cluster administrator can provide a script through the OPENSHIFT_HA_NOTIFY_SCRIPT variable, and this script is called whenever the state of the VIP on the node changes. Keepalived uses the master state when it is servicing the VIP, the backup state when another node is servicing the VIP, or the fault state when the check script fails. The notify script is called with the new state whenever the state changes.
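For example, a minimal notify script sketch. It assumes the standard Keepalived convention of passing the group or instance type, its name, and the new state as the last argument; adapt it to your environment:

#!/bin/bash
# Hypothetical notify script: log every VRRP state transition.
# Keepalived conventionally passes: $1 = "GROUP" or "INSTANCE", $2 = name, $3 = new state.
NAME="$2"
STATE="$3"
case "$STATE" in
  "MASTER") logger "ipfailover: $NAME is now servicing the VIP (master)" ;;
  "BACKUP") logger "ipfailover: $NAME moved to backup" ;;
  "FAULT")  logger "ipfailover: $NAME entered the fault state" ;;
  *)        logger "ipfailover: $NAME changed to unexpected state $STATE" ;;
esac
exit 0

The script is mounted into the IP failover pod, for example from a config map, and its path is set in the OPENSHIFT_HA_NOTIFY_SCRIPT variable.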
You can create an IP failover deployment configuration on OpenShift Container Platform. The IP failover deployment configuration specifies the set of VIP addresses, and the set of nodes on which to service them. A cluster can have multiple IP failover deployment configurations, with each managing its own set of unique VIP addresses. Each node in the IP failover configuration runs an IP failover pod, and this pod runs Keepalived.
When using VIPs to access a pod with host networking, the application pod must run on all nodes that are running the IP failover pods. This enables any of the IP failover nodes to become the master and service the VIPs when needed. If application pods are not running on all nodes with IP failover, either some IP failover nodes never service the VIPs or some application pods never receive any traffic. Use the same selector and replication count for both the IP failover and the application pods to avoid this mismatch.
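For example, a sketch of an application deployment that uses host networking and mirrors the node selector and replica count of the IP failover deployment shown later in this topic; the name, labels, image, and port are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-openshift            # hypothetical application deployment
spec:
  replicas: 2                      # same replica count as the IP failover deployment
  selector:
    matchLabels:
      app: hello-openshift
  template:
    metadata:
      labels:
        app: hello-openshift
    spec:
      hostNetwork: true            # pod is reached directly on the node that holds the VIP
      nodeSelector:
        node-role.kubernetes.io/worker: ""   # same node selector as the IP failover pods
      containers:
      - name: hello-openshift
        image: quay.io/openshift/origin-hello-openshift   # illustrative image
        ports:
        - containerPort: 8080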
When using VIPs to access a service, any of the nodes can be in the IP failover set of nodes, because the service is reachable on all nodes no matter where the application pod is running. Any of the IP failover nodes can become master at any time. The service can either use external IPs and a service port, or it can use a NodePort.
When using external IPs in the service definition, the VIPs are set to the external IPs, and the IP failover monitoring port is set to the service port. When using a node port, the port is open on every node in the cluster, and the service load-balances traffic from whatever node currently services the VIP. In this case, the IP failover monitoring port is set to the NodePort in the service definition.
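For example, a sketch of a NodePort service; with this service, the VIPs are the addresses that clients use, and the IP failover monitoring port is set to the nodePort value shown here. The name, selector, and ports are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: hello-openshift            # hypothetical service
spec:
  type: NodePort
  selector:
    app: hello-openshift
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 30060                # set OPENSHIFT_HA_MONITOR_PORT to this value

If the service instead lists the VIPs in spec.externalIPs, OPENSHIFT_HA_MONITOR_PORT is set to the service port.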
Even though a service VIP is highly available, performance can still be affected. Keepalived makes sure that each of the VIPs is serviced by some node in the configuration, and several VIPs can end up on the same node even when other nodes have none. Strategies that externally load-balance across a set of VIPs can be thwarted when IP failover puts multiple VIPs on the same node.
When you use ingressIP, you can set up IP failover to have the same VIP range as the ingressIP range. You can also disable the monitoring port. In this case, all the VIPs appear on the same node in the cluster. Any user can set up a service with an ingressIP and have it highly available.
There is a maximum of 254 VIPs in the cluster.
The following table contains the variables used to configure IP failover.
Variable Name | Default | Description |
---|---|---|
OPENSHIFT_HA_MONITOR_PORT | 80 | The IP failover pod tries to open a TCP connection to this port on each Virtual IP (VIP). If a connection is established, the service is considered to be running. If this port is set to 0, the test always passes. |
OPENSHIFT_HA_NETWORK_INTERFACE | | The interface name that IP failover uses to send Virtual Router Redundancy Protocol (VRRP) traffic. The default value is eth0. |
OPENSHIFT_HA_REPLICA_COUNT | 2 | The number of replicas to create. This must match the spec.replicas value in the IP failover deployment configuration. |
OPENSHIFT_HA_VIRTUAL_IPS | | The list of IP address ranges to replicate. This must be provided. For example, 1.2.3.4-6,1.2.3.9. |
OPENSHIFT_HA_VRRP_ID_OFFSET | 0 | The offset value used to set the virtual router IDs. Using different offset values allows multiple IP failover configurations to exist within the same cluster. The default offset is 0, and the allowed range is 0 through 255. |
OPENSHIFT_HA_VIP_GROUPS | | The number of groups to create for VRRP. If not set, a group is created for each virtual IP range specified with the OPENSHIFT_HA_VIRTUAL_IPS variable. |
OPENSHIFT_HA_IPTABLES_CHAIN | INPUT | The name of the iptables chain to automatically add an iptables rule to allow the VRRP traffic on. If the value is not set, an iptables rule is not added. If the chain does not exist, it is not created. |
OPENSHIFT_HA_CHECK_SCRIPT | | The full path name in the pod file system of a script that is periodically run to verify the application is operating. |
OPENSHIFT_HA_CHECK_INTERVAL | 2 | The period, in seconds, that the check script is run. |
OPENSHIFT_HA_NOTIFY_SCRIPT | | The full path name in the pod file system of a script that is run whenever the state changes. |
OPENSHIFT_HA_PREEMPTION | preempt_delay 300 | The strategy for handling a new higher priority host. The nopreempt strategy does not move the master from the lower priority host to the higher priority host. |
As a cluster administrator, you can configure IP failover on an entire cluster, or on a subset of nodes, as defined by the label selector. You can also configure multiple IP failover deployment configurations in your cluster, where each one is independent of the others.
The IP failover deployment configuration ensures that a failover pod runs on each of the nodes matching the constraints or the label used.
This pod runs Keepalived, which can monitor an endpoint and use Virtual Router Redundancy Protocol (VRRP) to fail over the virtual IP (VIP) from one node to another if the first node cannot reach the service or endpoint.
For production use, set a selector that selects at least two nodes, and set replicas equal to the number of selected nodes.
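For example, you might label two specific worker nodes and target them with that label; the node names and the label are hypothetical:

$ oc label nodes worker-0 worker-1 "ipfailover=hello-openshift"

The nodeSelector in the deployment that follows would then use ipfailover: hello-openshift instead of the worker role label, with replicas set to 2 to match the two labeled nodes.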
You are logged in to the cluster as a user with cluster-admin privileges.
You created a pull secret.
Create an IP failover service account:
$ oc create sa ipfailover
Add the privileged and hostnetwork security context constraints (SCC) to the ipfailover service account:
$ oc adm policy add-scc-to-user privileged -z ipfailover
$ oc adm policy add-scc-to-user hostnetwork -z ipfailover
Create a deployment YAML file to configure IP failover:
apiVersion: apps/v1
kind: Deployment
metadata:
name: ipfailover-keepalived (1)
labels:
ipfailover: hello-openshift
spec:
strategy:
type: Recreate
replicas: 2
selector:
matchLabels:
ipfailover: hello-openshift
template:
metadata:
labels:
ipfailover: hello-openshift
spec:
serviceAccountName: ipfailover
hostNetwork: true
nodeSelector:
node-role.kubernetes.io/worker: ""
containers:
- name: openshift-ipfailover
image: quay.io/openshift/origin-keepalived-ipfailover
ports:
- containerPort: 63000
hostPort: 63000
imagePullPolicy: IfNotPresent
securityContext:
privileged: true
volumeMounts:
- name: lib-modules
mountPath: /lib/modules
readOnly: true
- name: host-slash
mountPath: /host
readOnly: true
mountPropagation: HostToContainer
- name: etc-sysconfig
mountPath: /etc/sysconfig
readOnly: true
- name: config-volume
mountPath: /etc/keepalive
env:
- name: OPENSHIFT_HA_CONFIG_NAME
value: "ipfailover"
- name: OPENSHIFT_HA_VIRTUAL_IPS (2)
value: "1.1.1.1-2"
- name: OPENSHIFT_HA_VIP_GROUPS (3)
value: "10"
- name: OPENSHIFT_HA_NETWORK_INTERFACE (4)
value: "ens3" #The host interface to assign the VIPs
- name: OPENSHIFT_HA_MONITOR_PORT (5)
value: "30060"
- name: OPENSHIFT_HA_VRRP_ID_OFFSET (6)
value: "0"
- name: OPENSHIFT_HA_REPLICA_COUNT (7)
value: "2" #Must match the number of replicas in the deployment
- name: OPENSHIFT_HA_USE_UNICAST
value: "false"
#- name: OPENSHIFT_HA_UNICAST_PEERS
#value: "10.0.148.40,10.0.160.234,10.0.199.110"
- name: OPENSHIFT_HA_IPTABLES_CHAIN (8)
value: "INPUT"
#- name: OPENSHIFT_HA_NOTIFY_SCRIPT (9)
# value: /etc/keepalive/mynotifyscript.sh
- name: OPENSHIFT_HA_CHECK_SCRIPT (10)
value: "/etc/keepalive/mycheckscript.sh"
- name: OPENSHIFT_HA_PREEMPTION (11)
value: "preempt_delay 300"
- name: OPENSHIFT_HA_CHECK_INTERVAL (12)
value: "2"
livenessProbe:
initialDelaySeconds: 10
exec:
command:
- pgrep
- keepalived
volumes:
- name: lib-modules
hostPath:
path: /lib/modules
- name: host-slash
hostPath:
path: /
- name: etc-sysconfig
hostPath:
path: /etc/sysconfig
# config-volume contains the check script
# created with `oc create configmap keepalived-checkscript --from-file=mycheckscript.sh`
- configMap:
defaultMode: 0755
name: keepalived-checkscript
name: config-volume
imagePullSecrets:
- name: openshift-pull-secret (13)
1. The name of the IP failover deployment.
2. The list of IP address ranges to replicate. This must be provided. For example, 1.2.3.4-6,1.2.3.9.
3. The number of groups to create for VRRP. If not set, a group is created for each virtual IP range specified with the OPENSHIFT_HA_VIRTUAL_IPS variable.
4. The interface name that IP failover uses to send VRRP traffic. By default, eth0 is used.
5. The IP failover pod tries to open a TCP connection to this port on each VIP. If a connection is established, the service is considered to be running. If this port is set to 0, the test always passes. The default value is 80.
6. The offset value used to set the virtual router IDs. Using different offset values allows multiple IP failover configurations to exist within the same cluster. The default offset is 0, and the allowed range is 0 through 255.
7. The number of replicas to create. This must match the spec.replicas value in the IP failover deployment configuration. The default value is 2.
8. The name of the iptables chain to automatically add an iptables rule to allow the VRRP traffic on. If the value is not set, an iptables rule is not added. If the chain does not exist, it is not created, and Keepalived operates in unicast mode. The default is INPUT.
9. The full path name in the pod file system of a script that is run whenever the state changes.
10. The full path name in the pod file system of a script that is periodically run to verify the application is operating.
11. The strategy for handling a new higher priority host. The default value is preempt_delay 300, which causes a Keepalived instance to take over a VIP after 5 minutes if a lower-priority master is holding the VIP.
12. The period, in seconds, that the check script is run. The default value is 2.
13. Create the pull secret before creating the deployment; otherwise, creating the deployment fails.
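For example, a minimal check script, the config map that carries it, and the commands to create and verify the deployment. The script is only a sketch: it tests a hypothetical application port, assumes bash is available in the image, and must exit 0 for the check to pass. The deployment file name is illustrative:

#!/bin/bash
# mycheckscript.sh - exit 0 when the application answers; any other exit code marks the check as failed.
timeout 2 bash -c "</dev/tcp/localhost/30060" && exit 0
exit 1

$ oc create configmap keepalived-checkscript --from-file=mycheckscript.sh
$ oc create -f ipfailover-keepalived.yaml
$ oc get pods -l ipfailover=hello-openshift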