Initial planning considerations

Consider the following tested object maximums when you plan your Red Hat OpenShift Service on AWS cluster.

These guidelines are based on a cluster of 102 workers in a multi-availability zone configuration. For smaller clusters, the maximums are lower.

The sizing of the control plane and infrastructure nodes is dynamically calculated during the installation process, based on the number of worker nodes. If you change the number of worker nodes after installation, the control plane and infrastructure nodes must be resized manually. Infrastructure nodes are resized by the Red Hat SRE team; open a ticket in the Customer Portal to request infrastructure node resizing.

The following table lists the sizes of the control plane and infrastructure nodes that are assigned during installation; a minimal lookup sketch follows the table.

Number of worker nodes   Control plane size   Infrastructure node size
1 to 25                  m5.2xlarge           r5.xlarge
26 to 100                m5.4xlarge           r5.2xlarge
101 to 180 [1]           m5.8xlarge           r5.4xlarge

  1. The maximum number of worker nodes on ROSA is 180.
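
The sizing thresholds above can be read as a simple lookup. The following Python sketch is illustrative only: the ranges and instance types are copied from the table, and the actual sizing is performed automatically by the installer, not by user code.

```python
def recommended_node_sizes(worker_count: int) -> dict:
    """Illustrative lookup of the control plane and infrastructure node
    sizes from the table above, keyed by the worker node count."""
    if not 1 <= worker_count <= 180:
        raise ValueError("ROSA supports 1 to 180 worker nodes")
    if worker_count <= 25:
        return {"control_plane": "m5.2xlarge", "infrastructure": "r5.xlarge"}
    if worker_count <= 100:
        return {"control_plane": "m5.4xlarge", "infrastructure": "r5.2xlarge"}
    return {"control_plane": "m5.8xlarge", "infrastructure": "r5.4xlarge"}


print(recommended_node_sizes(50))
# {'control_plane': 'm5.4xlarge', 'infrastructure': 'r5.2xlarge'}
```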

For larger clusters, infrastructure node sizing can become a significant factor limiting scalability. Many factors influence the stated thresholds, including the etcd version and the storage data format.

Exceeding these limits does not necessarily mean that the cluster will fail. In most cases, exceeding these numbers results in lower overall performance.

The OpenShift Container Platform version used in all of the tests is OCP 4.8.0.

ROSA tested cluster maximums

The following table specifies the maximum limits for each tested type in a Red Hat OpenShift Service on AWS cluster.

Table 1. Tested cluster maximums
Maximum type                               4.8 tested maximum
Number of nodes                            102
Number of pods [1]                         20,400
Number of pods per node                    250
Number of pods per core                    There is no default value
Number of namespaces [2]                   3,400
Number of pods per namespace [3]           20,400
Number of services [4]                     10,000
Number of services per namespace           10,000
Number of back ends per service            10,000
Number of deployments per namespace [3]    1,000

  1. The pod count displayed here is the number of test pods. The actual number of pods depends on the application’s memory, CPU, and storage requirements.

  2. When there are a large number of active projects, etcd can suffer from poor performance if the keyspace grows excessively large and exceeds the space quota. Periodic maintenance of etcd, including defragmentation, is highly recommended to free etcd storage.

  3. A number of control loops in the system must iterate over all objects in a given namespace as a reaction to changes in state. Having a large number of objects of a given type in a single namespace can make those loops expensive and slow down the processing of state changes. The limit assumes that the system has enough CPU, memory, and disk to satisfy the application requirements.

  4. Each service port and each service back end has a corresponding entry in iptables. The number of back ends of a given service impacts the size of the endpoints objects, which then impacts the size of data that is sent throughout the system.

In OpenShift Container Platform 4.8, half of a CPU core (500 millicores) is reserved by the system, compared to previous versions of OpenShift Container Platform.
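
For a rough sense of what these numbers imply, the following Python sketch works through the arithmetic using the tested values from Table 1 and the 500 millicore reservation noted above. It is an estimate only; actual allocatable capacity also depends on other system reservations and on application resource requirements.

```python
# Rough capacity arithmetic using the tested values from Table 1 (illustrative only).
nodes = 102
pods_per_node_max = 250
tested_pod_max = 20_400

per_node_capacity = nodes * pods_per_node_max        # 25,500
cluster_pod_limit = min(per_node_capacity, tested_pod_max)
print(cluster_pod_limit)                             # 20400: the tested maximum binds first

# CPU left for workloads on an 8 vCPU (m5.2xlarge) worker, counting only the
# 500 millicore system reservation noted above; real allocatable capacity
# also depends on other system and kubelet reservations.
worker_vcpu = 8
system_reserved_cores = 0.5
allocatable_cores = worker_vcpu - system_reserved_cores
print(allocatable_cores)                             # 7.5
```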

OpenShift Container Platform testing environment and configuration

The following table lists the OpenShift Container Platform environment and configuration on which the cluster maximums are tested for the AWS cloud platform.

Node                       Type         vCPU   RAM (GiB)   Disk type   Disk size (GiB)/IOPS   Count   Region
Control plane/etcd [1]     m5.4xlarge   16     64          io1         350 / 1,000            3       us-west-2
Infrastructure nodes [2]   r5.2xlarge   8      64          gp2         300 / 900              3       us-west-2
Workload [3]               m5.2xlarge   8      32          gp2         350 / 900              3       us-west-2
Worker nodes               m5.2xlarge   8      32          gp2         350 / 900              102     us-west-2

  1. io1 disks are used for control plane/etcd nodes because etcd is I/O intensive and latency sensitive. A greater number of IOPS can be required, depending on usage.

  2. Infrastructure nodes are used to host monitoring components because Prometheus can claim a large amount of memory, depending on usage patterns.

  3. Workload nodes are dedicated to running performance and scalability workload generators.

Larger cluster sizes and higher object counts might be reachable. However, the sizing of the infrastructure nodes limits the amount of memory that is available to Prometheus. When objects are created, modified, or deleted, Prometheus stores the metrics in memory for roughly three hours before persisting them to disk. If the rate of creation, modification, or deletion of objects is too high, Prometheus can become overwhelmed and fail due to a lack of memory resources.
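
One way to watch for this pressure is to track the memory usage of the Prometheus pods themselves. The following Python sketch queries the Prometheus HTTP API for the process_resident_memory_bytes metric; the URL, bearer token, and the job="prometheus-k8s" label selector are placeholders or assumptions and must be adapted to the cluster's monitoring route and credentials.

```python
import requests

# Placeholders: substitute the cluster's Prometheus route and a valid bearer token.
PROM_URL = "https://prometheus.example.com"
TOKEN = "<bearer-token>"

# The job="prometheus-k8s" selector is an assumption about the monitoring stack's
# labels; adjust the query to match the labels in your environment.
resp = requests.get(
    f"{PROM_URL}/api/v1/query",
    params={"query": 'process_resident_memory_bytes{job="prometheus-k8s"}'},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()

# Print the resident memory of each Prometheus instance in GiB.
for result in resp.json()["data"]["result"]:
    instance = result["metric"].get("instance", "unknown")
    mem_gib = float(result["value"][1]) / 2**30
    print(f"{instance}: {mem_gib:.1f} GiB resident")
```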