# oc scale -n openshift-infra --replicas=2 rc hawkular-metrics
OpenShift Container Platform exposes metrics that can be collected and stored in back-ends by Heapster. As an OpenShift Container Platform administrator, you can view container and component metrics in one user interface. Horizontal pod autoscalers also use these metrics to determine when and how to scale.
This topic provides information for scaling the metrics components.
Autoscaling the metrics components, such as Hawkular and Heapster, is not supported by OpenShift Container Platform.
Run metrics pods on dedicated OpenShift Container Platform infrastructure nodes.
Use persistent storage when configuring metrics. Set USE_PERSISTENT_STORAGE=true.
Keep the METRICS_RESOLUTION=30 parameter in OpenShift Container Platform metrics deployments. Using a value lower than the default value of 30 for METRICS_RESOLUTION is not recommended. When using the Ansible metrics installation procedure, this is the openshift_metrics_resolution parameter.
Closely monitor OpenShift Container Platform nodes with host metrics pods to detect early capacity shortages (CPU and memory) on the host system. These capacity shortages can cause problems for metrics pods.
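To decide which nodes to watch, it can help to list the nodes that currently host the metrics pods. The following is a minimal sketch, assuming the metrics components run in the openshift-infra project (as in the commands elsewhere in this topic); the function name metrics_nodes is illustrative only.

```shell
# List the distinct nodes that host pods in a project.
# Assumption: the metrics pods were deployed to openshift-infra (the default here).
metrics_nodes() {
  project="${1:-openshift-infra}"
  oc get pods -n "$project" \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | sort -u
}

# Usage (requires a logged-in oc session):
#   for node in $(metrics_nodes); do oc describe node "$node"; done
```

The output of oc describe node includes the allocated CPU and memory for each node, which is one place to look for early signs of a capacity shortage.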
In OpenShift Container Platform version 3.7 testing, test cases of up to 25,000 pods were monitored in an OpenShift Container Platform cluster.
In tests performed with 210 and 990 OpenShift Container Platform nodes, where 10500 pods and 11000 pods were monitored respectively, the Cassandra database grew at the speed shown in the table below:
Number of Nodes | Number of Pods | Cassandra storage growth speed | Cassandra storage growth per day | Cassandra storage growth per week |
---|---|---|---|---|
210 | 10500 | 500 MB per hour | 15 GB | 75 GB |
990 | 11000 | 1 GB per hour | 30 GB | 210 GB |
In the above calculation, approximately 20 percent of the expected size was added as overhead to ensure that the storage requirements do not exceed the calculated value.
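The overhead arithmetic can be sketched as follows. This is an illustration only: the function name and the 1 GB = 1024 MB conversion are assumptions, and the published table values appear to be rounded, so the results will not match them exactly.

```shell
# Estimate Cassandra storage in MB for an hourly growth rate over a period,
# adding a percentage of overhead (20 by default, per the text above).
# Integer arithmetic only: results are truncated, not rounded.
estimate_storage_mb() {
  hourly_mb="$1"; hours="$2"; overhead_pct="${3:-20}"
  echo $(( hourly_mb * hours * (100 + overhead_pct) / 100 ))
}

estimate_storage_mb 500 24    # one day at 500 MB per hour, prints 14400
estimate_storage_mb 1024 24   # one day at 1 GB per hour, prints 29491
```

For example, 14400 MB is roughly 14.4 GB, which is in line with the 15 GB per day listed for the 210-node test.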
If the METRICS_DURATION and METRICS_RESOLUTION values are kept at the default (7 days and 15 seconds respectively), it is safe to plan Cassandra storage size requirements for a week, as in the values above.
Because OpenShift Container Platform metrics uses the Cassandra database as a datastore for metrics data, see the Cassandra documentation for its recommendations.
One set of metrics pods (Cassandra/Hawkular/Heapster) is able to monitor at least 25,000 pods.
Pay attention to system load on nodes where OpenShift Container Platform metrics pods run. Use that information to determine whether it is necessary to scale out the number of OpenShift Container Platform metrics pods and spread the load across multiple OpenShift Container Platform nodes. Scaling OpenShift Container Platform metrics Heapster pods is not recommended.
If persistent storage was used to deploy OpenShift Container Platform metrics, then you must create a persistent volume (PV) for the new Cassandra pod to use before you can scale out the number of OpenShift Container Platform metrics Cassandra pods. However, if Cassandra was deployed with dynamically provisioned PVs, then this step is not necessary.
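Before scaling out, you may want to confirm that an unbound PV exists for the new Cassandra pod. The following is a minimal sketch that lists PVs in the Available phase; the function name is illustrative, and it does not check that a PV's size or access mode suits Cassandra.

```shell
# Print the names of persistent volumes that are not yet bound to a claim.
# Assumption: the current user is allowed to list cluster-wide PVs.
available_pvs() {
  oc get pv \
    -o jsonpath='{range .items[?(@.status.phase=="Available")]}{.metadata.name}{"\n"}{end}'
}

# Usage (requires a logged-in oc session):
#   available_pvs
```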
Cassandra nodes use persistent storage. Therefore, scaling up or down is not possible with replication controllers.
Scaling a Cassandra cluster requires modifying the openshift_metrics_cassandra_replicas variable and re-running the deployment.
By default, the Cassandra cluster is a single-node cluster.
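Re-running the deployment with a different replica count might look like the sketch below. INVENTORY and PLAYBOOK are hypothetical placeholders for your inventory file and for the metrics playbook shipped with your openshift-ansible version; neither value comes from this topic. The variable can also be set in the inventory file instead of being passed with -e.

```shell
# Re-run the metrics deployment with a new Cassandra replica count.
# Assumptions: INVENTORY and PLAYBOOK are set by the caller (hypothetical
# placeholders), and metrics were originally installed with Ansible.
scale_cassandra() {
  replicas="$1"
  ansible-playbook -i "$INVENTORY" "$PLAYBOOK" \
    -e openshift_metrics_install_metrics=True \
    -e openshift_metrics_cassandra_replicas="$replicas"
}

# Usage:
#   INVENTORY=<inventory-file> PLAYBOOK=<metrics-playbook> scale_cassandra 2
```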
To scale up the number of OpenShift Container Platform metrics Hawkular pods to two replicas, run:
# oc scale -n openshift-infra --replicas=2 rc hawkular-metrics
Alternatively, update your inventory file and re-run the deployment.
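To confirm that the scale operation took effect, the scale command can be paired with a status check. A minimal sketch follows; the wrapper function name is illustrative only.

```shell
# Scale the Hawkular Metrics replication controller, then show its status.
scale_hawkular() {
  replicas="$1"
  oc scale -n openshift-infra --replicas="$replicas" rc hawkular-metrics
  # The rc listing reports desired and current replica counts.
  oc get rc hawkular-metrics -n openshift-infra
}

# Usage (requires a logged-in oc session):
#   scale_hawkular 2
```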
If you add a new node to or remove an existing node from a Cassandra cluster, the data stored in the cluster rebalances across the cluster.
To scale down:
If remotely accessing the container, run the following for the Cassandra node you want to remove:
$ oc exec -it <hawkular-cassandra-pod> nodetool decommission
If locally accessing the container, run the following instead:
$ oc rsh <hawkular-cassandra-pod> nodetool decommission
This command can take a while to run since it copies data across the cluster.
You can monitor the decommission progress with nodetool netstats -H.
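If you prefer to poll rather than watch the output by hand, the node mode can be extracted from the netstats output. This sketch assumes that the output contains a line beginning with "Mode:" (for example LEAVING while the node is decommissioning) and that the pod runs in the openshift-infra project; the function name is illustrative.

```shell
# Print the Cassandra node mode reported by nodetool netstats.
# Assumptions: the pod is in openshift-infra and the output has a "Mode:" line.
cassandra_mode() {
  pod="$1"
  oc exec "$pod" -n openshift-infra -- nodetool netstats -H \
    | awk '/^Mode:/ {print $2}'
}

# Usage: wait while the node is still leaving the cluster.
#   while [ "$(cassandra_mode <hawkular-cassandra-pod>)" = "LEAVING" ]; do sleep 30; done
```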
Once the previous command succeeds, scale down the rc for the Cassandra instance to 0.
# oc scale -n openshift-infra --replicas=0 rc <hawkular-cassandra-rc>
This will remove the Cassandra pod.
If the scale down process completed and the existing Cassandra nodes are
functioning as expected, you can also delete the |