Until Kubernetes Federation is ready for prime time, a number of stopgap solutions have sprung up for geographically dispersing multiple cluster endpoints: stretch clusters and multiple clusters across multiple datacenters. This article discusses how to configure Keepalived for maximum uptime of HAproxy in front of multiple cluster endpoints, and walks through an HAproxy and Keepalived configuration that load balances to those endpoints.
In a production environment, global server load balancing (GSLB) or a Global Traffic Manager (GTM) would be used to return a different IP address based on the originating location of the request. This helps ensure that traffic from Virginia or New York is sent to the location closest to the originating request.
In an effort to simulate geographically dispersed DNS, two A records were created for the same name, one for each datacenter. Each HAproxy node owns one of these virtual IP addresses, so the setup resembles an active/active cluster configuration. Lastly, each HAproxy server prefers to serve from the pool in its home datacenter.
dig +noall +answer haproxy.example.com
haproxy.example.com. 1800 IN A 10.19.114.20
haproxy.example.com. 1800 IN A 10.19.114.21
This article assumes HAproxy and Keepalived have already been installed and partially configured. For more information on preparing nodes for HAproxy and Keepalived, please see the following article. At the end of this document, fully functioning configuration files are included as templates.
Two HAproxy instances will be used, one for each datacenter and each OpenShift Container Platform cluster.
This image is a reflection of the final HAproxy configuration:
HAproxy Load Balancer Configuration
The HAproxy load balancers distribute traffic across port groups. A sample config for Datacenter A's HAproxy is shown below:
frontend main80 *:80
    default_backend router80
backend router80
    balance source
    option allbackups
    mode tcp
    server clus1-infra-0.example.com clus1-infra-0.example.com:80 check
    server clus1-infra-1.example.com clus1-infra-1.example.com:80 check
    server clus1-infra-2.example.com clus1-infra-2.example.com:80 check
    server clus2-infra-0.example.com clus2-infra-0.example.com:80 check backup
    server clus2-infra-1.example.com clus2-infra-1.example.com:80 check backup
    server clus2-infra-2.example.com clus2-infra-2.example.com:80 check backup
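Notice that the load balancer prefers the local datacenter nodes in clus1 and uses the clus2 nodes, which are marked backup, only when the health checks fail for every clus1 server. Because option allbackups is set, traffic is then balanced across all of the backup servers rather than sent only to the first one.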
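The preference for local nodes can be sketched in a few lines of Python. This is only an illustrative model, not HAproxy's implementation: the Server class and both function names are hypothetical, and the CRC32 hash is merely a stand-in for whatever hashing balance source performs internally.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    backup: bool = False
    healthy: bool = True


def eligible_servers(servers, allbackups=True):
    """Return the servers that may receive traffic right now."""
    primaries = [s for s in servers if not s.backup and s.healthy]
    if primaries:
        # Backups stay idle while at least one primary passes its check.
        return primaries
    backups = [s for s in servers if s.backup and s.healthy]
    # Without "option allbackups" only the first healthy backup is used.
    return backups if allbackups else backups[:1]


def pick_server(servers, client_ip, allbackups=True):
    """Map a client IP to one eligible server (stand-in for "balance source")."""
    pool = eligible_servers(servers, allbackups)
    if not pool:
        return None
    return pool[zlib.crc32(client_ip.encode()) % len(pool)]


# Mirrors the Datacenter A backend: clus1 is primary, clus2 is backup.
backend = [
    Server("clus1-infra-0"),
    Server("clus1-infra-1"),
    Server("clus1-infra-2"),
    Server("clus2-infra-0", backup=True),
    Server("clus2-infra-1", backup=True),
    Server("clus2-infra-2", backup=True),
]
```

Marking every clus1 entry as unhealthy and calling eligible_servers(backend) again returns the three clus2 servers, which is the failover path the backup keyword provides.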
The opposite configuration in Datacenter B may look like this:
..omitted..
frontend main80 *:80
    default_backend router80
backend router80
    balance source
    option allbackups
    mode tcp
    server clus2-infra-0.example.com clus2-infra-0.example.com:80 check
    server clus2-infra-1.example.com clus2-infra-1.example.com:80 check
    server clus2-infra-2.example.com clus2-infra-2.example.com:80 check
    server clus1-infra-0.example.com clus1-infra-0.example.com:80 check backup
    server clus1-infra-1.example.com clus1-infra-1.example.com:80 check backup
    server clus1-infra-2.example.com clus1-infra-2.example.com:80 check backup
..omitted..
The inverse is applied above: HAproxy B prefers nodes in clus2. Keepalived manages the virtual IPs shared between the two HAproxy nodes and, in the event of a failure, moves the failed node's virtual IP to the other HAproxy.
Keepalived Configuration
Much like the HAproxy configuration above, the Keepalived configuration differs for each datacenter.
In Datacenter A, we have the following Keepalived config for the VIPs:
..omitted..
vrrp_instance OCP_vi1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    priority 100
    advert_int 10
    unicast_src_ip 10.19.114.18
    unicast_peer {
        10.19.114.19
    }
    virtual_ipaddress {
        10.19.114.20
    }
..omitted..
vrrp_instance OCP_vi2 {
    state BACKUP
    interface ens192
    virtual_router_id 61
    priority 98
    advert_int 10
    unicast_src_ip 10.19.114.18
    unicast_peer {
        10.19.114.19
    }
    virtual_ipaddress {
        10.19.114.21
    }
..omitted..
The mirrored configuration for Datacenter B resembles the following:
..omitted..
vrrp_instance OCP_vi1 {
    state BACKUP
    interface ens192
    virtual_router_id 51
    priority 98
    advert_int 10
    unicast_src_ip 10.19.114.19
    unicast_peer {
        10.19.114.18
    }
    virtual_ipaddress {
        10.19.114.20
        # dev ens192
    }
..omitted..
vrrp_instance OCP_vi2 {
    state MASTER
    interface ens192
    virtual_router_id 61
    priority 100
    advert_int 10
    unicast_src_ip 10.19.114.19
    unicast_peer {
        10.19.114.18
    }
    virtual_ipaddress {
        10.19.114.21
    }
Notice that for instance OCP_vi1 the load balancer in datacenter A is the preferred owner, with datacenter B being the backup. The roles are reversed for OCP_vi2.
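The ownership rule expressed by the priority values can be modelled as "the live node with the highest priority holds the VIP". The following is a minimal sketch under assumptions: the node labels dc-a and dc-b and the function names are hypothetical, and the model ignores timing (advert_int), preemption settings and tie-breaking, all of which Keepalived handles itself.

```python
def vip_owner(priorities, alive):
    """Return the live node with the highest priority, or None if none is alive."""
    candidates = {node: prio for node, prio in priorities.items() if node in alive}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# VIPs and priorities copied from the vrrp_instance blocks above.
# "dc-a" and "dc-b" are illustrative labels for the two HAproxy nodes.
instances = {
    "OCP_vi1": {"vip": "10.19.114.20", "priorities": {"dc-a": 100, "dc-b": 98}},
    "OCP_vi2": {"vip": "10.19.114.21", "priorities": {"dc-a": 98, "dc-b": 100}},
}


def vip_layout(alive):
    """Map each VIP to the node that would hold it for a given set of live nodes."""
    return {
        inst["vip"]: vip_owner(inst["priorities"], alive)
        for inst in instances.values()
    }


print(vip_layout({"dc-a", "dc-b"}))  # both nodes up: each holds its own VIP
print(vip_layout({"dc-b"}))          # dc-a down: dc-b holds both VIPs
```

With both nodes alive each datacenter answers on its own address; with one node down the survivor answers on both, which is why the DNS records can stay unchanged during a failure.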
Testing failover
Additionally, an HAproxy group has been set up to test failover via round robin load balancing.
# Both VIPs are online and load balancing will bounce to either datacenter
[root@stretch-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus1-infra-2
clus2-infra-0
clus1-infra-1
clus2-infra-2
clus1-infra-0
clus2-infra-1
# Fail datacenter A
[root@haproxy-0 ~]# systemctl stop keepalived
[root@clus1-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus2-infra-0
clus2-infra-1
clus2-infra-2
clus2-infra-0
# Restore datacenter A then fail datacenter B
[root@haproxy-0 ~]# systemctl start keepalived
[root@haproxy-1 ~]# systemctl stop keepalived
[root@clus1-master-0 ~]# while [ 1 ];do curl haproxy:81 && sleep 5;done
clus1-infra-2
clus1-infra-1
clus1-infra-0
clus1-infra-1
clus1-infra-0
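
When the loop runs for a while, it can be easier to tally the responses per cluster than to read them line by line. A minimal sketch, assuming each response is an infra node name such as clus2-infra-0; the function name tally_by_cluster is hypothetical and the sample text is the output captured above after datacenter A was failed.

```python
from collections import Counter


def tally_by_cluster(lines):
    """Count responses per cluster, using the text before the first '-' as the cluster."""
    names = (line.strip() for line in lines)
    # Skip blank lines and the "# ..." comment lines in the captured output.
    return Counter(
        name.split("-", 1)[0]
        for name in names
        if name and not name.startswith("#")
    )


after_dc_a_failure = """
clus2-infra-0
clus2-infra-1
clus2-infra-2
clus2-infra-0
"""

print(tally_by_cluster(after_dc_a_failure.splitlines()))  # prints Counter({'clus2': 4})
```

A tally that contains only one cluster while both VIPs are expected to be online would suggest that a failover has happened.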
Conclusion
This post has described the configuration of HAproxy and Keepalived to keep the services of multiple OpenShift Container Platform clusters online and highly available in the event of a failure. This configuration, coupled with OCP's HA features, provides maximum uptime for containers and microservices in your production environment.
Complete Configuration Files:
Datacenter A Configuration Files:
* haproxy-dc-a.cfg
* keepalived-dc-a.conf
Datacenter B Configuration Files:
* haproxy-dc-b.cfg
* keepalived-dc-b.conf