With the recent release of Red Hat OpenShift 4.9, we are proud to introduce OpenShift Service Mesh 2.1. OpenShift Service Mesh is included with the OpenShift platform, installed via an operator, and is based on the Istio project, with Kiali as the included management console. Service Mesh 2.1 is based on Istio 1.9 and brings significant updates to OpenShift, most notably support for Service Mesh Federation across OpenShift Container Platform (OCP) clusters. This release also includes a substantial Kiali update with new monitoring, troubleshooting, and mesh federation features to ensure a smooth service mesh experience.

Federated Service Meshes

Service Mesh 2.1 introduces new resources for federating meshes across OCP clusters. This enables meshes located in different clusters to securely share services and manage traffic between them while maintaining strong administrative boundaries in a multi-tenant environment.

Unlike many multicluster Istio topologies, mesh federation does not require connectivity between Istio control planes (Istiod) and Kubernetes API servers across clusters. Such multicluster topologies involve sharing API server credentials between clusters, which creates a potential security risk unless all of the involved clusters are part of the same trusted administrative boundary.

In comparison, OpenShift Service Mesh federation takes a “zero trust” approach that maintains administrative boundaries and only shares information after explicit configuration on each of the involved meshes. Each mesh continues to run its own control plane (Istiod) and may have its own “mesh” administrator who specifies exactly which services are allowed to be shared with other meshes, and which remote services may be imported for local usage. With this setup, services are shared on a strictly “as needed” or “opt-in” basis, providing additional security.


This extends OpenShift Service Mesh’s multitenant deployment model and ensures that communication across meshes remains secure and tightly controlled. It also maintains a separation between cluster administrators, who may be responsible for a large number of clusters, and mesh administrators, who likely have more domain knowledge of the services under management and can therefore create fine-grained traffic policies. With OpenShift Service Mesh, mesh administrators do not require elevated cluster-admin privileges.

Service meshes can be federated using the new ServiceMeshPeer custom resource definition (CRD), which configures mesh-to-mesh connectivity, including the gateways and trust domains used between the two meshes. Once two meshes are connected, each must declare which of its services may be exported and made available to the other mesh. This is done with the new ExportServiceSet resource. In the example below, the mesh on the left is exporting Service A to the mesh on the right.

[Image: Service A exported from the mesh on the left and imported by the mesh on the right]
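
For illustration, a minimal sketch of what the peering and export configuration might look like is shown below. This is based on the Maistra federation documentation, where the export resource appears as ExportedServiceSet; all names (meshes, namespaces, gateways, services, and certificates) are placeholders, and exact fields may vary by release:

```yaml
# Sketch only: declares the remote "east" mesh as a federation peer of the local "west" mesh.
apiVersion: federation.maistra.io/v1
kind: ServiceMeshPeer
metadata:
  name: east-mesh
  namespace: west-mesh-system
spec:
  remote:
    addresses:
    - ingress.east-mesh.example.com    # externally reachable address of the peer's ingress gateway
  gateways:
    ingress:
      name: ingress-east-mesh          # local gateway that receives traffic from the peer
    egress:
      name: egress-east-mesh           # local gateway that sends traffic to the peer
  security:
    trustDomain: east-mesh.local
    clientID: east-mesh.local/ns/east-mesh-system/sa/egress-west-mesh-service-account
    certificateChain:
      kind: ConfigMap
      name: east-mesh-ca-root-cert     # root certificate used to validate the peer's workloads
---
# Sketch only: exports "service-a" so that the peer mesh is allowed to import it.
apiVersion: federation.maistra.io/v1
kind: ExportedServiceSet
metadata:
  name: east-mesh                      # matches the ServiceMeshPeer the services are exported to
  namespace: west-mesh-system
spec:
  exportRules:
  - type: NameSelector
    nameSelector:
      namespace: app-namespace
      name: service-a
```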

Once services have been exported, the peer mesh must declare which services are to be imported and how. This is done with the new ImportServiceSet resource. In the example above, the mesh on the right is importing Service A from the mesh on the left. Once imported, a service can be referenced, allowing traffic to be routed to it from the other cluster and managed by the mesh using Istio resources, such as AuthorizationPolicy and VirtualService, as if it were a local service. All traffic between the two meshes travels via the gateways configured in the ServiceMeshPeer resource.

By default, the reference to a service in a different mesh will contain the identity of the remote mesh and the remote namespace. The ImportServiceSet resource also includes the importAsLocal setting, which allows a service to be imported and represented as if it were a local service. If a local service of the same name already exists, the endpoints from both the local and remote services are aggregated together. This can be used to facilitate cross-cluster load balancing and failover scenarios.
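
The import side might look like the following sketch, again with placeholder names and based on the Maistra federation documentation (where the resource appears as ImportedServiceSet); setting importAsLocal merges the imported endpoints with any local service of the same name:

```yaml
# Sketch only: imports "service-a" from the "west" mesh into the local "east" mesh.
apiVersion: federation.maistra.io/v1
kind: ImportedServiceSet
metadata:
  name: west-mesh                # matches the ServiceMeshPeer the services are imported from
  namespace: east-mesh-system
spec:
  importRules:
  - type: NameSelector
    importAsLocal: true          # represent the remote service as if it were local, aggregating
                                 # its endpoints with any existing local service of the same name
    nameSelector:
      namespace: app-namespace   # namespace and name of the service as exported by the peer
      name: service-a
```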

Jumping over to Kiali, OpenShift Service Mesh’s management and visualization dashboard, services in different clusters and namespaces are shown in separate groupings. Kiali is able to render traffic to and from imported services but does not have visibility of services in other meshes that have not been imported. Kiali is only able to pull metrics and interact with the service mesh API of the local mesh.

For a working example of OpenShift Service Mesh Federation, see this example in the Maistra GitHub repo.

Service Mesh Extensions with WebAssembly

Previously, Istio used a component called Mixer as an intermediary for telemetry reporting, precondition checking, and rate limiting, and it allowed custom policies to be applied outside of the proxy/service layer. This approach had drawbacks, though: most notably, Mixer could become a bottleneck and consume excessive resources under heavy load, and Mixer extensions had to be written in C++. Istio has therefore moved away from Mixer in favor of WebAssembly extensions as the means of extending the Envoy proxy. Mixer was deprecated (and disabled by default) in OpenShift Service Mesh 2.0 and has been removed in 2.1.

WebAssembly (Wasm) is a portable bytecode format for executing code written in multiple languages at near-native speed. Using Envoy’s WebAssembly Extensions provides several benefits, most notably:

  • Unlike Mixer, WebAssembly Extensions live at the Envoy proxy level. Thus, there is no single point of contention.
  • Extensions are deployed within a sandbox with resource constraints and a clearly defined API, providing better security and isolation.

This change does have drawbacks though, most notably:

  • Mixer adapters will no longer work in OpenShift Service Mesh 2.1. Any Mixer extensions that have been enabled will need to be disabled and removed before upgrading to OpenShift Service Mesh 2.1.
  • Because there is no longer a central point of consolidation, some adapters (such as rate limiting) will require an external service in order to be converted to a WebAssembly extension. This may create some functionality gaps that will take time to fill.

To support the use of WebAssembly Extensions, the ServiceMeshExtension API is now generally available. This custom resource definition (CRD) is used to specify how and where a Wasm plug-in is to be deployed. This API is also being contributed to upstream Istio as the very similar WasmExtension API and will be included in a future Istio release. In a future release of OpenShift Service Mesh, we will converge the ServiceMeshExtension and WasmExtension APIs.
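
As a rough illustration, a ServiceMeshExtension resource might look like the sketch below; the image, namespace, workload labels, phase, and configuration are placeholders, and the exact API version and fields should be checked against the OpenShift Service Mesh 2.1 documentation:

```yaml
# Sketch only: deploys a Wasm module as an Envoy filter for a selected workload.
apiVersion: maistra.io/v1
kind: ServiceMeshExtension
metadata:
  name: custom-filter
  namespace: my-app                                  # namespace containing the workloads to extend
spec:
  image: quay.io/example/custom-wasm-filter:latest   # OCI image containing the Wasm module
  phase: PostAuthZ                                   # where in Envoy's filter chain the extension runs
  priority: 100                                      # ordering relative to other extensions in the same phase
  workloadSelector:
    labels:
      app: my-app                                    # apply only to pods carrying this label
  config: example-configuration                      # plug-in-specific configuration passed to the module
```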

3scale WebAssembly Integration

Red Hat 3scale API Management makes it easy for businesses to share, secure, distribute, control, and monetize their APIs. While OpenShift Service Mesh focuses on service-to-service communication, API management focuses on how end users access your applications, including access control, billing, rate limiting, analytics, and more.

Previously, Red Hat provided a Mixer-based adapter for using 3scale with OpenShift Service Mesh. With the removal of Mixer, this plug-in can no longer be used with OpenShift Service Mesh 2.1. A new Wasm-based 3scale adapter, which takes advantage of the above-mentioned ServiceMeshExtension API, is provided for integrating with 3scale 2.11+, continuing to provide tight integration between the two capabilities.

Troubleshooting Enhancements with Kiali

As the backbone of a distributed system, a service mesh can be challenging to debug, both at the data plane (proxy + application) and control plane levels. This is why OpenShift Service Mesh includes Kiali, a web console that is indispensable when it comes to monitoring and troubleshooting the condition and performance of your service mesh and its applications. OpenShift Service Mesh 2.1 brings many new monitoring and troubleshooting features to Kiali.

To start with, a healthy control plane is critical to the functioning of your mesh. Kiali monitors the health of service mesh components, including Istiod, Ingress Gateways, and Egress Gateways, as well as Prometheus, Grafana, and Jaeger.

The health of your service mesh depends on the ability of the Istio control plane (Istiod) to provide up-to-date configuration to your application’s sidecar proxies. Kiali now monitors this as well and will warn if the configuration of any sidecar proxy is out of sync.

To drill deeper, the sidecar’s Envoy configuration can be explored using the “Show Envoy Details” option.

This will pop up the Envoy configuration and allow you to compare the configuration of different pods from the same workload. This can be useful for determining whether new Istio configuration has been rolled out successfully to all of the pods that make up the workload. The resources menu allows you to focus on the area of the Envoy configuration that is of interest.

This dialog is the equivalent of the istioctl proxy-config command line tool.
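
For example, a similar view of a sidecar’s configuration can be obtained from the command line with istioctl (the pod and namespace names below are placeholders):

```sh
# Inspect the listeners, routes, and clusters Envoy has been configured with
istioctl proxy-config listeners productpage-v1-6b746f74dc-9stvs -n bookinfo
istioctl proxy-config routes    productpage-v1-6b746f74dc-9stvs -n bookinfo
istioctl proxy-config clusters  productpage-v1-6b746f74dc-9stvs -n bookinfo
```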

Finally, to monitor the interactions between your application and its sidecar proxy, Kiali’s log view has been redesigned to provide a unified view that interleaves application logs with the proxy’s access logs. This makes it easier to perform request-level debugging to ensure that your application and its Envoy proxy are working together as expected.

In the example above, the orange log lines represent proxy access logs. As these have a known structure, they can be further expanded by clicking the small “i” icon on the left, which opens a dialog providing additional information on the different fields that make up the log entry.

For a detailed walkthrough of Kiali’s Envoy debugging features, see this blog post.

The above represents only a sampling of the exciting new features that have been introduced into Kiali with Service Mesh 2.1. For a full overview of Kiali, visit the Kiali project site.

Istio 1.9

This release of OpenShift Service Mesh (2.1) is based on Istio 1.9 which introduces a wide range of performance and functionality improvements from Istio 1.7, 1.8, and 1.9. While the majority of these new features are fully supported by Red Hat, there are a small number of features that remain in tech preview or are currently not supported. For full details, please see our release notes.

OVN-Kubernetes CNI

We are also pleased to announce general availability of support for the OVN-Kubernetes Container Network Interface (CNI) with OpenShift Service Mesh 2.0.x, 2.1, and beyond (this applies to select OpenShift versions; see the release notes for details). OVN-Kubernetes is based on Open Virtual Network (OVN), a community-developed, vendor-agnostic network virtualization solution, and provides an overlay-based networking implementation. It uses the Geneve (Generic Network Virtualization Encapsulation) protocol rather than VXLAN to create an overlay network between nodes.

Upgrading to Service Mesh 2.1

Upgrading an OpenShift Service Mesh 2.0.x instance to 2.1 requires updating the version field in the ServiceMeshControlPlane resource. The Service Mesh operator will then oversee an in-place upgrade of the Service Mesh control plane. Once that has been completed, all pods that are part of the service mesh’s data plane must be restarted to update the Envoy sidecar proxies.
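
For example, the upgrade is triggered by changing the version field of the existing ServiceMeshControlPlane (the resource name and namespace below are placeholders and depend on your installation):

```yaml
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  version: v2.1      # changing this from v2.0 prompts the operator to perform an in-place upgrade
```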

Note that in-place upgrades from a 1.1.x version of Service Mesh are not supported and require the creation of a new Service Mesh control plane, following the previous 1.1 to 2.0 migration guide. Service Mesh 1.1.x support will end upon the release of Service Mesh 2.2, while Service Mesh 2.0 will move into a maintenance support phase in early December 2021. For full details of the Service Mesh and OCP support matrix, see the Service Mesh section of the OpenShift Container Platform Life Cycle Policy page.

Learn More & Next Steps

I think you’ll agree that the OpenShift Service Mesh 2.1 release offers a number of exciting new features and capabilities. Here on the OpenShift Service Mesh team, we continue to plan and develop new capabilities and enhancements that ensure that your microservices can be easily managed and kept secure, both now and in the future.

Try out OpenShift Service Mesh for yourself today by visiting learn.openshift.com to find out how OpenShift Service Mesh can help tame your microservices.


About the author

Jamie Longmuir is a product manager at Red Hat focused on OpenShift Service Mesh. Prior to his journey as a product manager, Jamie spent much of his career as a software developer, often focusing on distributed systems and cloud infrastructure. Along the way, he has had stints as a field engineer and training developer working for both small startups and large enterprises.
