Most access by Red Hat site reliability engineering (SRE) teams is done by using cluster Operators through automated configuration management.
OpenShift Dedicated on Google Cloud Platform (GCP) clusters that are created with the Workload Identity Federation (WIF) authentication type do not use Operators for SRE access. Instead, the roles required for SRE account access are assigned to the `sd-sre-platform-gcp-access` group as part of the WIF configuration creation and are validated by OpenShift Cluster Manager prior to the deployment of the cluster. For more information about WIF configurations, see Additional resources.
For a list of the available subprocessors, see the Red Hat Subprocessor List on the Red Hat Customer Portal.
SREs access OpenShift Dedicated clusters through a proxy. The proxy mints a service account in an OpenShift Dedicated cluster for the SREs when they log in. As no identity provider is configured for OpenShift Dedicated clusters, SREs access the proxy by running a local web console container. SREs do not access the cluster web console directly. SREs must authenticate as individual users to ensure auditability. All authentication attempts are logged to a Security Information and Event Management (SIEM) system.
Red Hat SRE adheres to the principle of least privilege when accessing OpenShift Dedicated and public cloud provider components. There are four basic categories of manual SRE access:
SRE admin access through the Red Hat Customer Portal with normal two-factor authentication and no privileged elevation.
SRE admin access through the Red Hat corporate SSO with normal two-factor authentication and no privileged elevation.
OpenShift elevation, which is a manual elevation using Red Hat SSO. It is fully audited and management approval is required for every operation SREs make.
Cloud provider access or elevation, which is a manual elevation for cloud provider console or CLI access. Access is limited to 60 minutes and is fully audited.
Each of these access types has different levels of access to components:
Component | Typical SRE admin access (Red Hat Customer Portal) | Typical SRE admin access (Red Hat SSO) | OpenShift elevation | Cloud provider access
---|---|---|---|---
OpenShift Cluster Manager | R/W | No access | No access | No access
OpenShift web console | No access | R/W | R/W | No access
Node operating system | No access | A specific list of elevated OS and network permissions. | A specific list of elevated OS and network permissions. | No access
AWS Console | No access | No access, but this is the account used to request cloud provider access. | No access | All cloud provider permissions using the SRE identity.
Red Hat personnel do not access cloud infrastructure accounts in the course of routine OpenShift Dedicated operations. For emergency troubleshooting purposes, Red Hat SRE have well-defined and auditable procedures to access cloud infrastructure accounts.
In AWS, SREs generate a short-lived AWS access token for the `BYOCAdminAccess` user using the AWS Security Token Service (STS). Access to the STS token is audit logged and traceable back to individual users. The `BYOCAdminAccess` user has the `AdministratorAccess` IAM policy attached.
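As a rough illustration only, the following sketch shows one way short-lived STS credentials can be requested with the AWS SDK for Python (boto3). The session duration, MFA device, and account values are hypothetical placeholders and do not describe Red Hat's internal tooling or the exact STS call that SRE uses.

```python
import boto3

# Illustrative sketch: request short-lived credentials from AWS STS.
# All values below are hypothetical placeholders, not Red Hat SRE parameters.
sts = boto3.client("sts")

response = sts.get_session_token(
    DurationSeconds=3600,  # hypothetical one-hour lifetime
    SerialNumber="arn:aws:iam::123456789012:mfa/example-user",  # placeholder MFA device
    TokenCode="123456",  # placeholder MFA code
)

credentials = response["Credentials"]
# Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken, Expiration
print(credentials["AccessKeyId"], credentials["Expiration"])
```

Because the returned credentials expire automatically at the end of the session, access obtained this way is inherently short-lived.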
In Google Cloud, SREs access resources after being authenticated against a Red Hat SAML identity provider (IDP). The IDP authorizes tokens that have time-to-live expirations. The issuance of the token is auditable by corporate Red Hat IT and linked back to an individual user.
Members of the Red Hat Customer Experience and Engagement (CEE) team typically have read-only access to parts of the cluster. Specifically, CEE has limited access to the core and product namespaces and does not have access to the customer namespaces.
Role | Core namespace | Layered product namespace | Customer namespace | Cloud infrastructure account*
---|---|---|---|---
OpenShift SRE | Read: All / Write: Very Limited [1] | Read: All / Write: None | Read: None [2] / Write: None | Read: All [3] / Write: All [3]
CEE | Read: All / Write: None | Read: All / Write: None | Read: None [2] / Write: None | Read: None / Write: None
Customer administrator | Read: None / Write: None | Read: None / Write: None | Read: All / Write: All | Read: Limited [4] / Write: Limited [4]
Customer user | Read: None / Write: None | Read: None / Write: None | Read: Limited [5] / Write: Limited [5] | Read: None / Write: None
Everybody else | Read: None / Write: None | Read: None / Write: None | Read: None / Write: None | Read: None / Write: None
* Cloud Infrastructure Account refers to the underlying AWS or Google Cloud account.
[1] Limited to addressing common use cases such as failing deployments, upgrading a cluster, and replacing bad worker nodes.
[2] Red Hat associates have no access to customer data by default.
[3] SRE access to the cloud infrastructure account is a "break-glass" procedure for exceptional troubleshooting during a documented incident.
[4] Customer administrator has limited access to the cloud infrastructure account console through Cloud Infrastructure Access.
[5] Limited to what is granted through RBAC by the customer administrator, as well as namespaces created by the user.
Customer access is limited to namespaces created by the customer and permissions that are granted using RBAC by the customer administrator role. Access to the underlying infrastructure or product namespaces is generally not permitted without `cluster-admin` access. More information on customer access and authentication can be found in the Understanding Authentication section of the documentation.
SRE access to OpenShift Dedicated clusters is controlled through several layers of required authentication, all of which are managed by strict company policy. All authentication attempts to access a cluster and changes made within a cluster are recorded within audit logs, along with the specific account identity of the SRE responsible for those actions. These audit logs help ensure that all changes made by SREs to a customer’s cluster adhere to the strict policies and procedures that make up Red Hat’s managed services guidelines.
The information presented below is an overview of the process an SRE must perform to access a customer’s cluster.
SRE requests a refreshed ID token from Red Hat SSO (Cloud Services). This request is authenticated. The token is valid for 15 minutes. After the token expires, the SRE can refresh it and receive a new token. Tokens can be refreshed indefinitely; however, the ability to refresh is revoked after 30 days of inactivity.
SRE connects to the Red Hat VPN. Authentication to the VPN is completed by the Red Hat Corporate Identity and Access Management system (RH IAM). With RH IAM, SREs use multifactor authentication and can be managed internally per organization through groups and existing onboarding and offboarding processes. After an SRE is authenticated and connected, the SRE can access the cloud services fleet management plane. Changes to the cloud services fleet management plane require many layers of approval and are maintained by strict company policy.
After authorization is complete, the SRE logs into the fleet management plane and receives a service account token that the fleet management plane created. The token is valid for 15 minutes. After the token is no longer valid, it is deleted.
With access granted to the fleet management plane, SREs use various methods to access clusters, depending on the network configuration.
Accessing a private or public cluster: The request is sent through a specific Network Load Balancer (NLB) by using an encrypted HTTP connection on port 6443.
Accessing a PrivateLink cluster: The request is sent to the Red Hat Transit Gateway, which then connects to a Red Hat VPC per region. The VPC that receives the request depends on the target private cluster's region. Within the VPC, there is a private subnet that contains the PrivateLink endpoint to the customer's PrivateLink cluster.
When you install an OpenShift Dedicated cluster that uses the AWS Security Token Service (STS), cluster-specific Operator AWS Identity and Access Management (IAM) roles are created. These IAM roles permit the OpenShift Dedicated cluster Operators to run core OpenShift functionality.
Cluster Operators use service accounts to assume IAM roles. When a service account assumes an IAM role, temporary STS credentials are provided for the service account to use in the cluster Operator’s pod. If the assumed role has the necessary AWS privileges, the service account can run AWS SDK operations in the pod.
The following diagram illustrates the workflow for assuming AWS IAM roles in SRE-owned projects:
The workflow has the following stages:
Within each project in which a cluster Operator runs, the Operator's deployment spec has a volume mount for the projected service account token, and a secret containing AWS credential configuration for the pod. The token is audience-bound and time-bound. Every hour, OpenShift Dedicated generates a new token, and the AWS SDK reads the mounted secret containing the AWS credential configuration. This configuration has a path to the mounted token and the AWS IAM Role ARN. The secret's credential configuration includes the following:
An `$AWS_ARN_ROLE` variable that has the ARN for the IAM role that has the permissions required to run AWS SDK operations.
An `$AWS_WEB_IDENTITY_TOKEN_FILE` variable that has the full path in the pod to the OpenID Connect (OIDC) token for the service account. The full path is `/var/run/secrets/openshift/serviceaccount/token`.
When a cluster Operator needs to assume an AWS IAM role to access an AWS service (such as EC2), the AWS SDK client code running on the Operator invokes the `AssumeRoleWithWebIdentity` API call (a minimal sketch of this exchange follows the workflow).
The OIDC token is passed from the pod to the OIDC provider. The provider authenticates the service account identity if the following requirements are met:
The identity signature is valid and signed by the private key.
The `sts.amazonaws.com` audience is listed in the OIDC token and matches the audience configured in the OIDC provider.
In OpenShift Dedicated with STS clusters, the OIDC provider is created during install and set as the service account issuer by default.
The OIDC token has not expired.
The issuer value in the token has the URL for the OIDC provider.
If the project and service account are in the scope of the trust policy for the IAM role that is being assumed, then authorization succeeds.
After successful authentication and authorization, temporary AWS STS credentials in the form of an access key ID, secret access key, and session token are passed to the pod for use by the service account. By using the credentials, the service account is temporarily granted the AWS permissions enabled in the IAM role.
When the cluster Operator runs, the Operator uses the AWS SDK in the pod to consume the secret, which has the path to the projected service account token and the AWS IAM Role ARN, to authenticate against the OIDC provider. The OIDC provider returns temporary STS credentials for authentication against the AWS API.
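To make the token exchange in this workflow concrete, the following is a minimal sketch, using the AWS SDK for Python (boto3), of how a pod could exchange its projected OIDC token for temporary STS credentials. The role ARN and session name are hypothetical placeholders; in a cluster Operator's pod this exchange is typically handled by the AWS SDK's credential provider using the mounted configuration rather than by hand-written code.

```python
import boto3

# Illustrative sketch: exchange the projected OIDC token for temporary AWS credentials.
# The role ARN and session name are hypothetical placeholders.
token_path = "/var/run/secrets/openshift/serviceaccount/token"
with open(token_path) as token_file:
    web_identity_token = token_file.read()

sts = boto3.client("sts")
response = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/example-operator-role",  # placeholder
    RoleSessionName="example-operator-session",  # placeholder
    WebIdentityToken=web_identity_token,
)

# Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken, Expiration.
credentials = response["Credentials"]

# The temporary credentials can then back an AWS client, for example EC2,
# with only the permissions enabled in the assumed IAM role.
ec2 = boto3.client(
    "ec2",
    aws_access_key_id=credentials["AccessKeyId"],
    aws_secret_access_key=credentials["SecretAccessKey"],
    aws_session_token=credentials["SessionToken"],
)
```

Because the credentials returned by STS carry an expiration, the permissions granted to the service account lapse automatically when the session ends.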
For more information about WIF configuration and SRE access roles, see Creating a WIF configuration.