This is part two of the series discussing multicluster service discovery in OpenShift using Submariner and Lighthouse. In this part, we explore how to deploy and connect two OpenShift clusters running in AWS using Submariner and Lighthouse. Part one is here.

Lighthouse in Action

Prerequisites

First, we deploy the two clusters. Although Submariner can connect clusters with overlapping CIDRs using the Globalnet feature, in this case we will deploy clusters with non-overlapping CIDRs:

Cluster A

Create cluster A with the default IP CIDRs:

openshift-install create install-config --dir cluster-a
openshift-install create cluster --dir cluster-a

Cluster B

Before creating cluster B, we will modify the default CIDRs to create non-overlapping clusters:

openshift-install create install-config --dir cluster-b
sed -i 's/10.128.0.0/10.132.0.0/g' cluster-b/install-config.yaml
sed -i 's/172.30.0.0/172.31.0.0/g' cluster-b/install-config.yaml
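For reference, after these substitutions the networking section of cluster-b/install-config.yaml should look roughly like the following. This is a sketch: hostPrefix, machineNetwork, and networkType are shown with typical IPI-on-AWS defaults and may differ in your environment.

networking:
  clusterNetwork:
  - cidr: 10.132.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.31.0.0/16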

Next, deploy the cluster:

openshift-install create cluster --dir cluster-b

Next, run the following script against each cluster to open the ports that Submariner uses for its tunnels and to take care of the other prerequisites for deploying Submariner:

curl https://raw.githubusercontent.com/submariner-io/submariner/master/tools/openshift/ocp-ipi-aws/prep_for_subm.sh -L -O
chmod a+x ./prep_for_subm.sh

./prep_for_subm.sh cluster-a # respond yes when terraform asks
./prep_for_subm.sh cluster-b # respond yes when terraform asks

Install Subctl

Download the subctl binary and make it available on your PATH:

curl -Ls https://get.submariner.io | bash
export PATH=$PATH:~/.local/bin
echo export PATH=\$PATH:~/.local/bin >> ~/.profile
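You can verify that the subctl binary is available on your PATH, for example by printing its version:

subctl version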

Install Submariner

We’ll deploy the Submariner Broker on Cluster A and then join both clusters to it.

Deploy Broker

subctl deploy-broker --kubeconfig cluster-a/auth/kubeconfig --service-discovery
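The deploy-broker step writes a broker-info.subm file to the current directory (used when joining the clusters below) and installs the Lighthouse CRDs on the Broker cluster. A quick sanity check, assuming the CRD names contain "lighthouse":

ls -l broker-info.subm
kubectl --kubeconfig cluster-a/auth/kubeconfig get crds | grep lighthouse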

Join Cluster A and Cluster B

subctl join --kubeconfig cluster-a/auth/kubeconfig broker-info.subm --clusterid cluster-a
subctl join --kubeconfig cluster-b/auth/kubeconfig broker-info.subm --clusterid cluster-b
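Once the joins complete, the Submariner components run in the submariner-operator namespace on each cluster. A quick check that the pods are up (the exact pod names vary by Submariner version):

kubectl --kubeconfig cluster-a/auth/kubeconfig -n submariner-operator get pods
kubectl --kubeconfig cluster-b/auth/kubeconfig -n submariner-operator get pods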

Next, deploy a test NGINX service in Cluster B and try to access it from Cluster A:

export KUBECONFIG=cluster-b/auth/kubeconfig
kubectl -n default create deployment nginx --image=nginxinc/nginx-unprivileged:stable-alpine
kubectl -n default expose deployment nginx --port=8080
export KUBECONFIG=cluster-a/auth/kubeconfig
kubectl -n default run --generator=run-pod/v1 tmp-shell --rm -i --tty --image quay.io/submariner/nettest -- /bin/bash
curl nginx.default.svc.supercluster.local:8080

As you can see, the curl command fails. As mentioned earlier, Lighthouse only provides discovery for services that have been explicitly exported, which is done via the subctl export command:

export KUBECONFIG=cluster-b/auth/kubeconfig
subctl export service --namespace default nginx

The curl command should now succeed, although it may take a few seconds for information to propagate across the clusters.
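For example, running the same query again from the tmp-shell pod on Cluster A should now succeed and return the NGINX welcome page:

curl nginx.default.svc.supercluster.local:8080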

Under the Hood

The --service-discovery flag enables service discovery during Submariner deployment. It installs the serviceimport.lighthouse.submariner.io CRD in the Broker cluster as well as in the clusters that join the Broker. When a cluster joins the Broker, the Submariner Operator also deploys the Lighthouse DNS server and creates a ClusterIP Service for it.
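You can list both on a joined cluster with commands along these lines (only the Lighthouse entries are shown in the output below):

kubectl --kubeconfig cluster-b/auth/kubeconfig -n submariner-operator get pods
kubectl --kubeconfig cluster-b/auth/kubeconfig -n submariner-operator get service submariner-lighthouse-coredns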

NAME                                             READY   STATUS    RESTARTS   AGE
submariner-lighthouse-coredns-69b5cc5746-2crjv   1/1     Running   0          94m
submariner-lighthouse-coredns-69b5cc5746-h9w4p   1/1     Running   0          94m

NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
submariner-lighthouse-coredns   ClusterIP   172.31.18.217   <none>        53/UDP    86m

The Lighthouse DNS server owns the supercluster.local:53 zone and uses the lighthouse plugin to respond to DNS requests, as configured in its ConfigMap:
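The ConfigMap lives in the submariner-operator namespace and can be inspected with, for example:

kubectl -n submariner-operator get configmap submariner-lighthouse-coredns -o yaml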

apiVersion: v1
data:
  Corefile: |
    supercluster.local:53 {
        lighthouse
        errors
        health
        ready
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-24T09:47:17Z"
  labels:
    app: submariner-lighthouse-coredns
    component: submariner-lighthouse
  name: submariner-lighthouse-coredns
  namespace: submariner-operator
  resourceVersion: "66404"
  selfLink: /api/v1/namespaces/submariner-operator/configmaps/submariner-lighthouse-coredns
  uid: 023d1803-93b4-4ef4-8a6c-85349e56fce6

The Lighthouse agent is also deployed in each cluster that joins the Broker:

NAME                                             READY   STATUS    RESTARTS   AGE
submariner-lighthouse-agent-6fb4bbf4f-cj7gd      1/1     Running   0          94m

The agent creates a ServiceImport CR and syncs it to the Broker namespace. For the NGINX service created in this example, a ServiceImport resource is created in the submariner-operator namespace on Cluster B and synced to Cluster A. The resource name combines the service name, its namespace, and the exporting cluster's ID (cluster2 in the sample output below), and its status contains the service IP and the cluster it belongs to.
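The ServiceImport shown below can be viewed on the exporting cluster with a command along these lines (the resource name embeds your actual cluster ID, so it will differ if you joined with cluster-b):

export KUBECONFIG=cluster-b/auth/kubeconfig
kubectl -n submariner-operator describe serviceimports.lighthouse.submariner.io nginx-default-cluster2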

Name:         nginx-default-cluster2
Namespace:    submariner-operator
Labels:       <none>
Annotations:  origin-name: nginx
              origin-namespace: default
API Version:  lighthouse.submariner.io/v2alpha1
Kind:         ServiceImport
Metadata:
  Creation Timestamp:  2020-07-23T05:50:15Z
  Generation:          1
  Resource Version:    69828
  Self Link:           /apis/lighthouse.submariner.io/v2alpha1/namespaces/submariner-operator/serviceimports/nginx-default-cluster2
  UID:                 61f67a5f-cca8-11ea-b08c-0242ac110006
Spec:
  Ports:                    <nil>
  Session Affinity:
  Session Affinity Config:  <nil>
  Type:                     SuperclusterIP
Status:
  Clusters:
    Cluster:  cluster2
    Ips:
      100.92.83.229
Events:  <none>

The Lighthouse agent running in Cluster A imports this resource from the Broker.
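You can confirm the import on Cluster A by listing the ServiceImport resources there as well:

kubectl --kubeconfig cluster-a/auth/kubeconfig -n submariner-operator get serviceimports.lighthouse.submariner.io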

The Submariner Operator also adds a forwardPlugin entry for the supercluster.local zone to the default dns.operator.openshift.io resource, pointing at the Lighthouse DNS server's ClusterIP:
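You can see this entry by dumping the default DNS operator resource:

kubectl get dns.operator.openshift.io default -o yaml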

apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  creationTimestamp: "2020-06-24T06:58:28Z"
  finalizers:
  - dns.operator.openshift.io/dns-controller
  generation: 2
  name: default
  resourceVersion: "66429"
  selfLink: /apis/operator.openshift.io/v1/dnses/default
  uid: 996f61a7-a34d-498b-9b45-676685fde49c
spec:
  servers:
  - forwardPlugin:
      upstreams:
      - 172.31.18.217
    name: lighthouse
    zones:
    - supercluster.local
status:
  clusterDomain: cluster.local
  clusterIP: 172.31.0.10
  conditions:
  - lastTransitionTime: "2020-06-24T07:03:43Z"
    message: ClusterIP assigned to DNS Service and minimum DaemonSet pods running
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2020-06-24T07:03:43Z"
    message: All expected Nodes running DaemonSet pod
    reason: AsExpected
    status: "False"
    type: Progressing
  - lastTransitionTime: "2020-06-24T06:58:54Z"
    message: Minimum number of Nodes running DaemonSet pod
    reason: AsExpected
    status: "True"
    type: Available

The OpenShift DNS operator picks up this configuration and updates the CoreDNS Corefile. As a result, CoreDNS forwards all DNS requests for the supercluster.local zone to the Lighthouse DNS server:
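The resulting Corefile can be seen in the dns-default ConfigMap in the openshift-dns namespace:

kubectl -n openshift-dns get configmap dns-default -o yaml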

apiVersion: v1
data:
  Corefile: |
    # lighthouse
    supercluster.local:5353 {
        forward . 172.31.18.217
    }
    .:5353 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            upstream
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            policy sequential
        }
        cache 30
        reload
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-24T06:58:28Z"
  labels:
    dns.operator.openshift.io/owning-dns: default
  name: dns-default
  namespace: openshift-dns
  ownerReferences:
  - apiVersion: operator.openshift.io/v1
    controller: true
    kind: DNS
    name: default
    uid: 996f61a7-a34d-498b-9b45-676685fde49c
  resourceVersion: "66430"
  selfLink: /api/v1/namespaces/openshift-dns/configmaps/dns-default
  uid: 2bbcc099-c90e-42dc-9cd9-318065c3649e

Now let’s follow what happens when the curl command is executed in the tmp-shell pod in Cluster A. The pod tries to resolve nginx.default.svc.supercluster.local. The request reaches OpenShift’s CoreDNS, which forwards it to the Lighthouse DNS server. The Lighthouse server consults its cache of imported service information and returns the appropriate service IP, 172.31.173.226 in this case. Submariner’s data plane then ensures that the client traffic reaches the backend service in the target cluster.
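You can observe the DNS portion of this flow directly from the tmp-shell pod. For instance, if dig is available in the nettest image, a query for the supercluster name should return the remote service IP:

dig +short nginx.default.svc.supercluster.local
# should return the remote service IP, e.g. 172.31.173.226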

We encourage you to try out Submariner with Lighthouse if you are looking for a solution for multi-cluster service discovery. For further information, please visit the Submariner documentation site at https://submariner.io.