Overview

This topic describes the management of the overall cluster network, including project isolation and outbound traffic control.

Managing Pod Networks

When your cluster is configured to use the ovs-multitenant SDN plug-in you can manage the separate pod overlay networks for projects using the administrator CLI.

Joining Project Networks

To join projects to an existing project network:

$ oc adm pod-network join-projects --to=<project1> <project2> <project3>

In the above example, all the pods and services in <project2> and <project3> can now access any pods and services in <project1> and vice versa. Services can be accessed either by IP or fully-qualified DNS name (<service>.<pod_namespace>.svc.cluster.local). For example, to access a service named db in a project myproject, use db.myproject.svc.cluster.local.

Alternatively, instead of specifying specific project names, you can use the --selector=<project_selector> option.

To verify the networks you have joined together:

$ oc get netnamespaces

Then look at the NETID column. Projects in the same pod-network will have the same NetID.

Isolating Project Networks

To isolate the project network in the cluster and vice versa, run:

$ oc adm pod-network isolate-projects <project1> <project2>

In the above example, all of the pods and services in <project1> and <project2> can not access any pods and services from other non-global projects in the cluster and vice versa.

Alternatively, instead of specifying specific project names, you can use the --selector=<project_selector> option.

Making Project Networks Global

To allow projects to access all pods and services in the cluster and vice versa:

$ oc adm pod-network make-projects-global <project1> <project2>

In the above example, all the pods and services in <project1> and <project2> can now access any pods and services in the cluster and vice versa.

Alternatively, instead of specifying specific project names, you can use the --selector=<project_selector> option.

Using an Egress Firewall to Limit Access to External Resources

As an OpenShift Dedicated cluster administrator, you can use egress firewall policy to limit the external addresses that some or all pods can access from within the cluster, so that:

  • A pod can only talk to internal hosts, and cannot initiate connections to the public Internet.

    Or,

  • A pod can only talk to the public Internet, and cannot initiate connections to internal hosts (outside the cluster).

    Or,

  • A pod cannot reach specified internal subnets/hosts that it should have no reason to contact.

Egress policies can be set at the pod selector-level and project-level. For example, you can allow <project A> access to a specified IP range but deny the same access to <project B>. Or, you can restrict application developers from updating from (Python) pip mirrors, and force updates to only come from approved sources.

Project administrators can neither create EgressNetworkPolicy objects, nor edit the ones you create in their project. There are also several other restrictions on where EgressNetworkPolicy can be created:

  • The default project (and any other project that has been made global via oc adm pod-network make-projects-global) cannot have egress policy.

  • If you merge two projects together (via oc adm pod-network join-projects), then you cannot use egress policy in any of the joined projects.

  • No project may have more than one egress policy object.

Violating any of these restrictions results in broken egress policy for the project, and may cause all external network traffic to be dropped.

Use the oc command or the REST API to configure egress policy. You can use oc [create|replace|delete] to manipulate EgressNetworkPolicy objects. The api/swagger-spec/oapi-v1.json file has API-level details on how the objects actually work.

To configure egress policy:

  1. Navigate to the project you want to affect.

  2. Create a JSON file with the desired policy details. For example:

    {
        "kind": "EgressNetworkPolicy",
        "apiVersion": "v1",
        "metadata": {
            "name": "default"
        },
        "spec": {
            "egress": [
                {
                    "type": "Allow",
                    "to": {
                        "cidrSelector": "1.2.3.0/24"
                    }
                },
                {
                    "type": "Allow",
                    "to": {
                        "dnsName": "www.foo.com"
                    }
                },
                {
                    "type": "Deny",
                    "to": {
                        "cidrSelector": "0.0.0.0/0"
                    }
                }
            ]
        }
    }

    When the example above is added to a project, it allows traffic to IP range 1.2.3.0/24 and domain name www.foo.com, but denies access to all other external IP addresses. Traffic to other pods is not affected because the policy only applies to external traffic.

    The rules in an EgressNetworkPolicy are checked in order, and the first one that matches takes effect. If the three rules in the above example were reversed, then traffic would not be allowed to 1.2.3.0/24 and www.foo.com because the 0.0.0.0/0 rule would be checked first, and it would match and deny all traffic.

    Domain name updates are polled based on the TTL (time to live) value of the domain returned by the local non-authoritative servers. The pod should also resolve the domain from the same local nameservers when necessary, otherwise the IP addresses for the domain perceived by the egress network policy controller and the pod will be different, and the egress network policy may not be enforced as expected. Since egress network policy controller and pod are asynchronously polling the same local nameserver, there could be a race condition where pod may get the updated IP before the egress controller. Due to this current limitation, domain name usage in EgressNetworkPolicy is only recommended for domains with infrequent IP address changes.

  3. Use the JSON file to create an EgressNetworkPolicy object:

    $ oc create -f <policy>.json

Exposing services by creating routes will ignore EgressNetworkPolicy. Egress network policy service endpoint filtering is done at the node kubeproxy. When the router is involved, kubeproxy is bypassed and egress network policy enforcement is not applied. Administrators can prevent this bypass by limiting access to create routes.

Enabling Multicast

At this time, multicast is best used for low bandwidth coordination or service discovery and not a high-bandwidth solution.

Multicast traffic between OpenShift Dedicated pods is disabled by default. If you are using the ovs-multitenant or ovs-networkpolicy plugin, you can enable multicast on a per-project basis by setting an annotation on the project’s corresponding netnamespace object:

$ oc annotate netnamespace <namespace> \
    netnamespace.network.openshift.io/multicast-enabled=true

Disable multicast by removing the annotation:

$ oc annotate netnamespace <namespace> \
    netnamespace.network.openshift.io/multicast-enabled-

When using the ovs-multitenant plugin:

  1. In an isolated project, multicast packets sent by a pod will be delivered to all other pods in the project.

  2. If you have joined networks together, you will need to enable multicast in each project’s netnamespace in order for it to take effect in any of the projects. Multicast packets sent by a pod in a joined network will be delivered to all pods in all of the joined-together networks.

  3. To enable multicast in the default project, you must also enable it in the kube-service-catalog project and all other projects that have been made global. Global projects are not "global" for purposes of multicast; multicast packets sent by a pod in a global project will only be delivered to pods in other global projects, not to all pods in all projects. Likewise, pods in global projects will only receive multicast packets sent from pods in other global projects, not from all pods in all projects.

When using the ovs-networkpolicy plugin:

  1. Multicast packets sent by a pod will be delivered to all other pods in the project, regardless of NetworkPolicy objects. (Pods may be able to communicate over multicast even when they can’t communicate over unicast.)

  2. Multicast packets sent by a pod in one project will never be delivered to pods in any other project, even if there are NetworkPolicy objects allowing communication between the to projects.

Enabling NetworkPolicy

The ovs-subnet and ovs-multitenant plug-ins have their own legacy models of network isolation and do not support Kubernetes NetworkPolicy. However, NetworkPolicy support is available by using the ovs-networkpolicy plug-in.

Only the v1 NetworkPolicy features are available in OpenShift Dedicated. This means that egress policy types, IPBlock, and combining podSelector and namespaceSelector are not available in OpenShift Dedicated.

In a cluster configured to use the ovs-networkpolicy plug-in, network isolation is controlled entirely by NetworkPolicy objects. By default, all pods in a project are accessible from other pods and network endpoints. To isolate one or more pods in a project, you can create NetworkPolicy objects in that project to indicate the allowed incoming connections. Project administrators can create and delete NetworkPolicy objects within their own project.

Pods that do not have NetworkPolicy objects pointing to them are fully accessible, whereas, pods that have one or more NetworkPolicy objects pointing to them are isolated. These isolated pods only accept connections that are accepted by at least one of their NetworkPolicy objects.

Following are a few sample NetworkPolicy object definitions supporting different scenrios:

  • Deny All Traffic

    To make a project "deny by default" add a NetworkPolicy object that matches all pods but accepts no traffic.

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: deny-by-default
    spec:
      podSelector:
      ingress: []
  • Only Accept connections from pods within project

    To make pods accept connections from other pods in the same project, but reject all other connections from pods in other projects:

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: allow-same-namespace
    spec:
      podSelector:
      ingress:
      - from:
        - podSelector: {}
  • Only allow HTTP and HTTPS traffic based on pod labels

    To enable only HTTP and HTTPS access to the pods with a specific label (role=frontend in following example), add a NetworkPolicy object similar to:

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: allow-http-and-https
    spec:
      podSelector:
        matchLabels:
          role: frontend
      ingress:
      - ports:
        - protocol: TCP
          port: 80
        - protocol: TCP
          port: 443

NetworkPolicy objects are additive, which means you can combine multiple NetworkPolicy objects together to satisfy complex network requirements.

For example, for the NetworkPolicy objects defined in previous samples, you can define both allow-same-namespace and allow-http-and-https policies within the same project. Thus allowing the pods with the label role=frontend, to accept any connection allowed by each policy. That is, connections on any port from pods in the same namespace, and connections on ports 80 and 443 from pods in any namespace.

NetworkPolicy and Routers

When using the ovs-multitenant plug-in, traffic from the routers is automatically allowed into all namespaces. This is because the routers are usually in the default namespace, and all namespaces allow connections from pods in that namespace. With the ovs-networkpolicy plug-in, this does not happen automatically. Therefore, if you have a policy that isolates a namespace by default, you need to take additional steps to allow routers to access it.

One option is to create a policy for each service, allowing access from all sources. for example,

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-to-database-service
spec:
  podSelector:
    matchLabels:
      role: database
  ingress:
  - ports:
    - protocol: TCP
      port: 5432

This allows routers to access the service, but will also allow pods in other users' namespaces to access it as well. This should not cause any issues, as those pods can normally access the service by using the public router.

Alternatively, you can create a policy allowing full access from the default namespace, as in the ovs-multitenant plug-in:

  1. Add a label to the default namespace.

    If you labeled the default project with the default label in a previous procedure, then skip this step. The cluster administrator role is required to add labels to namespaces.

    $ oc label namespace default name=default
  2. Create policies allowing connections from that namespace.

    Perform this step for each namespace you want to allow connections into. Users with the Project Administrator role can create policies.

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: allow-from-default-namespace
    spec:
      podSelector:
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              name: default

Setting a Default NetworkPolicy for New Projects

The cluster administrators can modify the default project template to enable automatic creation of default NetworkPolicy objects (one or more), whenever a new project is created. To do this:

  1. Create a custom project template and configure the master to use it, as described in Modifying the Template for New Projects.

  2. Edit the default project template with the following command:

    $ oc edit template project-request -n dedicated-admin

    Include the desired NetworkPolicy objects.

    To include NetworkPolicy objects into existing template, use the oc edit command. Currently, it is not possible to use oc patch to add objects to a Template resource.

    1. Add each default policy as an element in the objects array:

      objects:
      ...
      - apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        metadata:
          name: allow-from-same-namespace
        spec:
          podSelector:
          ingress:
          - from:
            - podSelector: {}
      - apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        metadata:
          name: allow-from-default-namespace
        spec:
          podSelector:
          ingress:
          - from:
            - namespaceSelector:
                matchLabels:
                  name: default
      ...

Enabling HTTP Strict Transport Security

HTTP Strict Transport Security (HSTS) policy is a security enhancement, which ensures that only HTTPS traffic is allowed on the host. Any HTTP requests are dropped by default. This is useful for ensuring secure interactions with websites, or to offer a secure application for the user’s benefit.

When HSTS is enabled, HSTS adds a Strict Transport Security header to HTTPS responses from the site. You can use the insecureEdgeTerminationPolicy value in a route to redirect to send HTTP to HTTPS. However, when HSTS is enabled, the client changes all requests from the HTTP URL to HTTPS before the request is sent, eliminating the need for a redirect. This is not required to be supported by the client, and can be disabled by setting max-age=0.

HSTS works only with secure routes (either edge terminated or re-encrypt). The configuration is ineffective on HTTP or passthrough routes.

To enable HSTS to a route, add the haproxy.router.openshift.io/hsts_header value to the edge terminated or re-encrypt route:

apiVersion: v1
kind: Route
metadata:
  annotations:
    haproxy.router.openshift.io/hsts_header: max-age=31536000;includeSubDomains;preload

Ensure there are no spaces and no other values in the parameters in the haproxy.router.openshift.io/hsts_header value. Only max-age is required.

The required max-age parameter indicates the length of time, in seconds, the HSTS policy is in effect for. The client updates max-age whenever a response with a HSTS header is received from the host. When max-age times out, the client discards the policy.

The optional includeSubDomains parameter tells the client that all subdomains of the host are to be treated the same as the host.

If max-age is greater than 0, the optional preload parameter allows external services to include this site in their HSTS preload lists. For example, sites such as Google can construct a list of sites that have preload set. Browsers can then use these lists to determine which sites to only talk to over HTTPS, even before they have interacted with the site. Without preload set, they need to have talked to the site over HTTPS to get the header.

Troubleshooting Throughput Issues

Sometimes applications deployed through OpenShift Dedicated can cause network throughput issues such as unusually high latency between specific services.

Use the following methods to analyze performance issues if pod logs do not reveal any cause of the problem:

  • Use a packet analyzer, such as ping or tcpdump to analyze traffic between a pod and its node.

    For example, run the tcpdump tool on each pod while reproducing the behavior that led to the issue. Review the captures on both sides to compare send and receive timestamps to analyze the latency of traffic to/from a pod. Latency can occur in OpenShift Dedicated if a node interface is overloaded with traffic from other pods, storage devices, or the data plane.

    $ tcpdump -s 0 -i any -w /tmp/dump.pcap host <podip 1> && host <podip 2> (1)
    1 podip is the IP address for the pod. Run the following command to get the IP address of the pods:
    # oc get pod <podname> -o wide

    tcpdump generates a file at /tmp/dump.pcap containing all traffic between these two pods. Ideally, run the analyzer shortly before the issue is reproduced and stop the analyzer shortly after the issue is finished reproducing to minimize the size of the file. You can also run a packet analyzer between the nodes (eliminating the SDN from the equation) with:

    # tcpdump -s 0 -i any -w /tmp/dump.pcap port 4789
  • Use a bandwidth measuring tool, such as iperf, to measure streaming throughput and UDP throughput. Run the tool from the pods first, then from the nodes to attempt to locate any bottlenecks. The iperf3 tool is included as part of RHEL 7.