
Ingress to ROSA Virt VMs with Certificate-Based Site-to-Site (S2S) IPsec VPN and Libreswan

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

Introduction

This solution uses a Site-to-Site (S2S) VPN as a mechanism in OpenShift Virtualization on ROSA to establish an IP route between the virtual overlay network that VMs are attached to and the VPC outside your cluster, without the need for NAT or load balancers. OpenShift Virtualization provides several built-in features to plug VMs directly into outside networks when deployed on-premises, but these depend upon mechanisms that are not exposed in cloud provider networks. This solution should be considered a stop-gap until there is a native OpenShift Virtualization feature to plug VMs into cloud provider networks.

When to use this

Customers migrating from other virtualization solutions often have architected their networking such that they expect direct communication between outside corporate networks and the networks their VMs are attached to, without Network Address Translation. This solution makes that possible in OpenShift Virtualization on ROSA.

OpenShift also provides built-in mechanisms for traffic to and from VMs, and you can choose on a case-by-case basis what works best.

Advantages

  • Direct, routable access to VMs: UDN/CUDN addresses are reachable from the VPC without per-VM LBs or port maps, so existing tools (SSH/RDP/agents) work unmodified.
  • Cert-based, NAT-friendly: The cluster peer authenticates with a device certificate, so it can sit behind NAT; no brittle dependence on a static egress IP, and no PSKs to manage.
  • AWS-native and minimally invasive: Uses TGW, CGW (certificate), and standard route tables—no changes to managed ROSA networking, and no inbound exposure (no NLB/NodePorts) because the VM initiates.
  • High availability: In the event that the node or availability zone hosting the IPSec VM goes down, a second node can take over both the VPN tunnel and the next-hop IP address that other VMs in the cluster use. Experiments have found a failover recovery time of about 5 seconds.
  • Scales and hardens cleanly: Advertise additional prefixes, or introduce dynamic routing later. As BGP-based UDN routing matures, you can evolve without re-architecting.

In short: this is a practical and maintainable way to reach ROSA-hosted VMs without PSKs, without a static public IP, and without a fleet of load balancers.

Limitations

You should consider these limitations of this VPN solution when deciding whether to apply it.

  • Complexity: This solution requires manual configuration of many different components that work together.
  • Throughput: Bandwidth is limited to 1.25 Gbps by the AWS VPN tunnel for the entire cluster, and further limited by the performance of the VM acting as an IPsec gateway.
  • Availability:
    • VM Outage: In the event that the worker node hosting the active IPsec gateway VM fails (for example, if an Availability Zone has an outage), the VPN will not pass traffic for about 5 seconds until the secondary VM takes over.
    • Tunnel Outage: Each AWS Site-to-Site VPN has two available tunnels, and by default, may take down either one for maintenance. Follow the steps for VTI configuration of Libreswan if you wish to have both tunnels active simultaneously to handle this case without interruption. Or, consider tunnel endpoint lifecycle control to control when this maintenance occurs.
  • IPAM disabled on VM Network: In order to allow VMs to act as gateways, this solution requires disabling IPAM on the VM network, which has the following effects:
    • Port security disabled: Any VM on the VM network can listen or send from any IP address for traffic within that network. Consider the security implications of whether this is acceptable for your workloads.
    • Manual configuration of each VM: You will need to individually set an IP address and route tables on each VM in the VM network. You could theoretically run a DHCP server on the network or use other automation to streamline this.

Example VPN use-cases

  • You have outside services, such as security scanners or backup tools, that expect to be able to ingress non-HTTP/HTTPS connections to each VM on particular ports or a large number of ports per VM across a large number of VMs
  • Your VMs host clustered software that expects the IP address on the VM’s network interface to match the IP address that external services use to ingress traffic
  • Your VMs egress to outside services that rely on the source IP of the connection (for example, databases that use the source IP for access control)

Use-cases that have better alternatives than VPN

  • HTTP/HTTPS traffic: Use OpenShift Routes, which provide much more flexibility.
  • High bandwidth or many connections: The VPN connection does not provide the full bandwidth that is available when traffic is NATted through worker nodes’ ENIs. Use NodePort or LoadBalancer Services instead.
  • Greenfield architectures: If you are architecting from scratch, treat your VMs like containers, and expose them using Services and Routes.
  • Interactive management traffic: OpenShift has built-in features for this using the virtctl CLI that use the cluster’s existing RBAC features.

Architecture

We build a Site-to-Site (S2S) VPN between an AWS VPC and a User-Defined Network (UDN/CUDN) that OpenShift Virtualization VMs are attached to. We deploy a pair of small CentOS VMs inside the cluster running Libreswan that establish an IPsec/IKEv2 tunnel through an AWS Transit Gateway (TGW).

We use certificate-based authentication: the AWS Customer Gateway (CGW) references a certificate issued by ACM Private CA, and the cluster VM uses the matching device certificate. Because identities are verified by certificates—not a fixed public IP—the VM can initiate the VPN from behind NAT (worker → NAT Gateway) and still form stable tunnels.

On AWS, the TGW terminates two redundant tunnels (two “outside” IPs). We associate the VPC attachment(s) and the VPN attachment with a TGW route table and enable propagation as needed. In the VPC, route tables send traffic for the CUDN prefix (e.g., 192.168.1.0/24) to the TGW. On the cluster side, the CUDN has IPAM disabled; you can optionally add a return route on other CUDN workloads to use the IPsec VM as next hop when those workloads need to reach the VPC.

NAT specifics: when the VM egresses, it traverses the NAT Gateway. If that NAT uses multiple EIPs, AWS may select different EIPs per connection; this is fine because the VPN authenticates via certificates, not source IP.

(Architecture diagram)

We ensure this approach is highly available by provisioning a second VM that can take over the IPSec connection from the first one in the event of failure. We use Keepalived to handle leader election and to ensure that the active VM is also assigned a virtual IP address, which other VMs in the cluster use as a next-hop for routes to the VPC.

0. Prerequisites

  • A Classic or HCP ROSA cluster, v4.18 or above.
  • A bare metal instance machine pool (we are using m5.metal; feel free to change as needed), and the OpenShift Virtualization operator installed. You can follow Steps 2-5 from this guide to do so.
  • The oc CLI, logged in.
  • A CIDR block for the VM network inside the cluster that does not overlap with CIDRs used in your VPC or in networks connected to your VPC. We use 192.168.1.0/24 in this guide.

1. Create Private CA and Certificates (ACM PCA)

Go to AWS Console and select Certificate Manager. Then on the left navigation tab, click AWS Private CAs, and then click Create a private CA.

On the Create private certificate authority (CA) page, keep the CA type option as Root. You could leave the default options for simplicity’s sake; we would recommend, however, giving it a name. For example, here we give it the Common name (CN) ca test v0. Acknowledge the pricing section, and click Create CA.

Then, on the root CA page, go to the Actions menu on the upper right side, and select Install CA certificate. On the Install root CA certificate page, you can leave the default configurations as-is and click Confirm and Install. The CA should now be Active.

Next, create a subordinate CA by repeating the same steps, but for the CA type option, choose Subordinate, and give it a Common name (CN) such as ca sub test v0. Acknowledge the pricing and create it.

Similarly, on the subordinate CA page, go to the Actions menu on the top right side, and select Install CA certificate. On the Install subordinate CA certificate page, under Select parent CA, choose the root CA you just created as the Parent private CA. Under Specify the subordinate CA certificate parameters, for the Validity section, pick a date at least 13 months from now. You can leave the rest at their defaults and click Confirm and Install.

Once done, you will have two private CAs, as in the screenshot below:

(Screenshot: private CA list)

Next, go to the AWS Certificate Manager (ACM) page, and click the Request a certificate button. On the Certificate type page, select Request a private certificate, and click Next.

Under Certificate authority details, pick the subordinate CA as the Certificate authority. Then under Domain names, pick a Fully qualified domain name (FQDN) of your choice. Note that this does not have to be resolvable; we just use it as an identity string for IKE. For example, here we use something like s2s.vpn.test.mobb.cloud. You can leave the rest at their defaults, acknowledge the Certificate renewal permissions, and click Request.

Wait until the status changes to Issued. Then, click the Export button on the top right side. Under Encryption details, enter a passphrase of your choice. You will be prompted for this passphrase in the next steps, so please keep it handy. Acknowledge the billing and click Generate PEM encoding. On the next page, click Download all, and finally click Done.

Once downloaded, you will see three files on your local machine:

  • certificate.pem
  • certificate_chain.pem
  • private_key.pem

Note: if the downloaded files have a .txt extension, rename them to .pem files (you can simply mv certificate.txt certificate.pem, and so forth for the rest of the files).

Next, create the PKCS#12 bundle for Libreswan. Feel free to change the name of the cert, but be sure you’re in the same directory as the downloaded certificate files:
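
A minimal sketch using openssl; the left-cert nickname is an arbitrary choice that the Libreswan configuration references later:

    openssl pkcs12 -export \
      -in certificate.pem \
      -inkey private_key.pem \
      -certfile certificate_chain.pem \
      -name "left-cert" \
      -out left-cert.p12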

This will prompt you for the passphrase you created before, and then for an export password; keep the export password handy, since you will need it when importing the bundle on the VMs.

You now have the following files, which you should save for a future step.

  • left-cert.p12 — the PKCS#12 you just created (leaf + key + chain)
  • certificate_chain.pem — the full CA chain (subordinate then root)

2. Create a Customer Gateway (CGW)

Go to the AWS console and open the VPC service. Then, on the left navigation tab, find Customer gateways → Create customer gateway.

In the Certificate ARN section, choose your ACM-PCA–issued cert. You can give it a name like cgw test v0, leave the default options, and click Create customer gateway.

With certificate-based authentication, AWS doesn’t require a fixed public IP on the CGW; that’s why this pattern works behind NAT.

3. Create (or use) a Transit Gateway (TGW)

Note that this setup also works (and has been tested) with a Virtual Private Gateway (VGW). When to choose VGW or TGW:

  • Use VGW when you only need VPN to one VPC, don’t require IPv6 over the VPN, and want the lowest ongoing cost (no TGW hourly/attachment fees; you will just pay standard Site-to-Site VPN hourly + data transfer).

  • Use TGW when you need a hub-and-spoke to many VPCs/accounts, inter-VPC routing, or IPv6 VPN. Expect extra charges such as TGW hourly, per-attachment hourly, and per-GB data processing, on top of VPN charges.

Continue with this step if you choose TGW.

On the left navigation tab, find Transit Gateways → Create transit gateway. Give it a name like tgw test v0, leave the default options, and click Create transit gateway.

Next, let’s attach the VPC(s) to the TGW. On the navigation tab, find Transit Gateway attachments → Create transit gateway attachment.

Give it a name like tgw attach v0, pick the transit gateway you just created as the Transit gateway ID, and select VPC as the Attachment type. Then, in the VPC attachment section, select your VPC ID and select the private subnets you want reachable from the cluster. Once done, click Create transit gateway attachment.

(Screenshot: transit gateway attachment)

4. Create the Site-to-Site VPN (Target = TGW)

Still in the VPC console, find Site-to-Site VPN connections → Create VPN connection.

Give it a name like vpn test v0. Choose Transit gateway as the Target gateway type and choose your TGW from the Transit gateway dropdown. Then choose Existing for Customer gateway, and select the certificate-based CGW from the previous step in the Customer gateway ID options.

(Screenshot: VPN connection details)

Choose Static for Routing options. For Local IPv4 network CIDR, put in the CUDN CIDR, e.g. 192.168.1.0/24. And for Remote IPv4 network CIDR, put in the cluster’s VPC CIDR, e.g. 10.10.0.0/16.

(Screenshot: VPN routing options)

Leave default options as-is and click Create VPN connection.

At the moment, the status of both tunnels is Down, and that is completely fine. For now, take note of the tunnels’ outside IPs, as we will use them for the Libreswan config in a future step.

(Screenshot: tunnel outside IPs)

5. Security groups and NACLs

On the navigation tab, find Security groups. Filter it based on your VPC ID.

Select one of the worker nodes’ security groups. Under Inbound rules, go to Edit inbound rules. Click Add rule. For Type, pick All ICMP - IPv4, and as Source, put in the CUDN subnet (e.g., 192.168.1.0/24), and click Save rules.

Optionally, you can also add rules for TCP 22/80 from the CUDN for SSH/curl tests.

Be sure to also check NACLs on the VPC subnets to allow ICMP/TCP from 192.168.1.0/24 both ways.

6. Create the project and secondary network (CUDN)

On your ROSA cluster, create the vpn-infra project and the ClusterUserDefinedNetwork (CUDN) object, as in the sketch below.

Disabling IPAM also disables the network’s enforcement of source/destination IP security. This is needed to allow each ipsec VM below to act as a gateway that passes traffic for other IP addresses. (Note that this helps with the VMs being able to pass a virtual IP address back and forth, but it would be needed even without the VIP for gateway routing to work.)
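
A minimal sketch of the two objects, assuming the CUDN name vm-network that later steps reference (adjust the fields to match your cluster version):

    oc new-project vpn-infra

    cat <<'EOF' | oc apply -f -
    apiVersion: k8s.ovn.org/v1
    kind: ClusterUserDefinedNetwork
    metadata:
      name: vm-network
    spec:
      namespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: In
            values:
              - vpn-infra
      network:
        topology: Layer2
        layer2:
          role: Secondary
          ipam:
            mode: Disabled   # no IPAM, and no source/destination IP enforcement
    EOF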

7. Create a set of IPSec VMs

We will use a pair of VMs within your ROSA cluster to establish the other side of the IPSec tunnel and act as gateways between the CUDN and the tunnel. One VM at a time will have the active IPSec connection and have the gateway IP address. The other, scheduled on a worker node in a different Availability Zone, will be able to take over if the primary VM fails (for example, if the AZ has an outage.)

First, create a Secret that will be used to configure password and SSH auth for the VMs. Change changethis to the password you want to use to log into the centos user on the VMs’ consoles. Change the list of ssh_authorized_keys to the SSH public keys you wish to be able to use to log in as the centos user.
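
A sketch of such a Secret; the name ipsec-cloud-init is an assumption, and the VM sketch below references it:

    cat <<'EOF' | oc apply -f -
    apiVersion: v1
    kind: Secret
    metadata:
      name: ipsec-cloud-init
      namespace: vpn-infra
    stringData:
      userData: |
        #cloud-config
        user: centos
        password: changethis
        chpasswd:
          expire: false
        ssh_authorized_keys:
          - ssh-ed25519 AAAA... user@example.com
    EOF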

Now, create the virtual machines themselves.

These examples use CentOS Stream 10 as the distribution, but will work with trivial changes on RHEL 9/10 or CentOS Stream 9 as well. You can change this using the spec.dataVolumeTemplates[0].spec.sourceRef object.

The VMs are configured with two network interfaces. The first connects to the pod network to allow for Services, virtctl ssh, and egress to whatever networks pods in your cluster can normally egress to (for example, the Internet). The second connects to the cudn so that these VMs can act as gateways for the other VMs on that cudn.
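
A trimmed sketch of one such VirtualMachine, assuming the ipsec-cloud-init Secret above, a centos-stream10 DataSource in the cluster’s OS image namespace, and a zone label value for your region; repeat with the name ipsec-b and a different zone:

    cat <<'EOF' | oc apply -f -
    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      name: ipsec-a              # use ipsec-b for the second VM
      namespace: vpn-infra
    spec:
      runStrategy: Always
      dataVolumeTemplates:
        - metadata:
            name: ipsec-a-rootdisk
          spec:
            sourceRef:
              kind: DataSource
              name: centos-stream10
              namespace: openshift-virtualization-os-images
            storage:
              resources:
                requests:
                  storage: 30Gi
      template:
        spec:
          nodeSelector:
            topology.kubernetes.io/zone: us-east-1a   # a different zone for ipsec-b
          domain:
            cpu:
              cores: 2
            memory:
              guest: 4Gi
            devices:
              disks:
                - name: rootdisk
                  disk:
                    bus: virtio
                - name: cloudinitdisk
                  disk:
                    bus: virtio
              interfaces:
                - name: default
                  masquerade: {}   # pod network
                - name: cudn
                  bridge: {}       # CUDN secondary network
          networks:
            - name: default
              pod: {}
            - name: cudn
              multus:
                networkName: vpn-infra/vm-network
          volumes:
            - name: rootdisk
              dataVolume:
                name: ipsec-a-rootdisk
            - name: cloudinitdisk
              cloudInitNoCloud:
                userDataSecretRef:
                  name: ipsec-cloud-init
    EOF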

Wait for a couple of minutes until the VMs are running.

8. Configure the ipsec VMs

Follow the instructions in this section for the ipsec-a VM, and then follow them again for the ipsec-b VM. We want both VMs to be nearly identical to allow for failover scenarios. The instructions will call out anything that needs to be different on the two VMs.

8.1 Log into the VM

In the OpenShift console, navigate to Virtualization → VirtualMachines and select the VM. Then click Open web console and log into the VM using the credentials you specified when creating the VMs.

Alternatively, you can run this in your CLI terminal: virtctl console -n vpn-infra ipsec-a (or ipsec-b), and use the same credentials to log into the VM.

8.2 Install software

We will use libreswan as our IPSec client, and keepalived to manage failover between the two VMs.
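
A sketch of the package install:

    sudo dnf install -y libreswan keepalived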

Let’s first identify the ifname of the non-primary NIC. Depending on OS, this may either be disabled, or enabled with no IP address assigned.
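
For example, with iproute2 (the CUDN NIC is the one without the pod network address, often enp2s0):

    ip -br addr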

8.3 Configure cudn network interface

We will next need to set up the network interface connected to the CUDN.

We will give the interface an IP address (VM_CUDN_IP). This address should be different on ipsec-a and ipsec-b. For this example, we will use 192.168.1.10 for ipsec-a and 192.168.1.11 for ipsec-b.

We will also configure a route to the VPC’s CIDR via a separate virtual IP (GATEWAY_VIRTUAL_IP). Keepalived will automatically assign this virtual IP to whichever ipsec VM is currently the active one.

Run the following, replacing the variables as necessary.
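
A sketch using NetworkManager; the variable values are the examples from this guide:

    VM_CUDN_IP=192.168.1.10          # use 192.168.1.11 on ipsec-b
    GATEWAY_VIRTUAL_IP=192.168.1.1
    VPC_CIDR=10.10.0.0/16
    CUDN_IF=enp2s0                   # the interface identified above

    sudo nmcli con add type ethernet ifname "$CUDN_IF" con-name cudn \
      ipv4.method manual ipv4.addresses "${VM_CUDN_IP}/24"
    sudo nmcli con mod cudn +ipv4.routes "${VPC_CIDR} ${GATEWAY_VIRTUAL_IP}"
    sudo nmcli con up cudn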

Kernel networking (forwarding & rp_filter):
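
Enable forwarding and relax reverse-path filtering so the VM can route traffic for other addresses; a sketch:

    sudo tee /etc/sysctl.d/99-ipsec-gw.conf <<'EOF'
    net.ipv4.ip_forward = 1
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    EOF
    sudo sysctl --system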

Firewalld note: CentOS often has firewalld enabled. You don’t need inbound allow rules (the VM initiates the connection), but outbound UDP/500 and UDP/4500 must be allowed; firewalld permits outbound traffic by default.

8.4 Importing certificates into the VM

You will now use the certificate files you generated earlier. Both VMs will use the exact same certificates.

Change ipsec-a below to ipsec-b when configuring the second VM.

Option A — using virtctl (easiest):
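
Assuming SSH keys were configured through the cloud-init Secret, something like:

    virtctl scp left-cert.p12 centos@vmi/ipsec-a.vpn-infra:left-cert.p12
    virtctl scp certificate_chain.pem centos@vmi/ipsec-a.vpn-infra:certificate_chain.pem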

Option B — if you only have PEMs on the VM (build P12 on the VM):
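
If so, run the same openssl command from step 1, this time on the VM itself:

    openssl pkcs12 -export -in certificate.pem -inkey private_key.pem \
      -certfile certificate_chain.pem -name "left-cert" -out left-cert.p12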

Now run the import:
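
A sketch of the import on the VM (the NSS database path varies by Libreswan release; /etc/ipsec.d is common on older ones):

    sudo ipsec initnss                 # only if the NSS database doesn't exist yet
    sudo ipsec import left-cert.p12    # prompts for the P12 export password
    sudo certutil -d sql:/var/lib/ipsec/nss -A -n "s2s-ca" -t "CT,," \
      -a -i certificate_chain.pem      # trust the CA chain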

Tip: ACM’s certificate_chain.pem already contains subordinate + root in that order. If yours doesn’t, cat subCA.pem rootCA.pem > certificate_chain.pem before copying.

8.5 Creating Libreswan config

Let’s go back to the VM now, and as root, create the Libreswan configuration (be sure to replace the placeholder values, e.g. the cert nickname and the tunnels’ outside IPs):
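
A hedged sketch of a policy-based config; the conn names are arbitrary, exact option spellings can vary across Libreswan releases, and the right= values are the tunnel outside IPs from step 4:

    sudo tee /etc/ipsec.d/aws-s2s.conf <<'EOF'
    conn aws-tunnel1
        type=tunnel
        authby=rsasig
        left=%defaultroute
        leftid=%fromcert
        leftcert="left-cert"          # nickname used at import time
        leftsendcert=always
        leftsubnet=192.168.1.0/24     # CUDN CIDR
        right=TUNNEL_1_OUTSIDE_IP
        rightid=%fromcert
        rightsubnet=10.10.0.0/16      # VPC CIDR
        ikev2=insist
        auto=start

    conn aws-tunnel2
        also=aws-tunnel1
        right=TUNNEL_2_OUTSIDE_IP
        auto=add                      # standby; policy-based mode uses one tunnel at a time
    EOF
    sudo systemctl restart ipsec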

Next, run ipsec status, and you should now see something like Total IPsec connections: loaded 4, routed 1, active 1, which means that your tunnel is up.

And so now, if you go back to the VPN console, you will see that one of the tunnels is up, as follows:

(Screenshot: one tunnel up)

After you have confirmed the tunnel is up, stop ipsec so that you can test it on each VM, and so that keepalived can control when it runs.
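
For example:

    sudo systemctl stop ipsec
    sudo systemctl disable ipsec   # keepalived decides where ipsec runs (next section)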

If you’ve only gone through these instructions on ipsec-a, go through this section again on ipsec-b.

9. Configure failover

Keepalived on both ipsec VMs will communicate using the Virtual Router Redundancy Protocol (VRRP) to elect a leader. The leader will run ipsec and will assign its CUDN interface the gateway virtual IP address. The secondary will ensure IPsec is not running.

If at any point the current leader stops being available, the secondary will start ipsec, which will initiate new IKE sessions with the AWS S2S VPN tunnels. It will also move the gateway virtual IP address to its own CUDN interface so that VMs in ROSA can continue sending traffic to that IP address.

On ipsec-a:
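
A sketch of /etc/keepalived/keepalived.conf, assuming the example addresses used above and unicast VRRP between the two VMs:

    sudo tee /etc/keepalived/keepalived.conf <<'EOF'
    vrrp_instance cudn_gateway {
        state MASTER
        interface enp2s0              # CUDN interface
        virtual_router_id 51
        priority 150
        advert_int 1
        unicast_src_ip 192.168.1.10   # this VM's CUDN IP
        unicast_peer {
            192.168.1.11              # the other VM's CUDN IP
        }
        virtual_ipaddress {
            192.168.1.1/24            # GATEWAY_VIRTUAL_IP
        }
        notify_master "/usr/bin/systemctl start ipsec"
        notify_backup "/usr/bin/systemctl stop ipsec"
        notify_fault "/usr/bin/systemctl stop ipsec"
    }
    EOF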

On ipsec-b:
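
Use the same file, with the peer roles and addresses swapped:

    # same as ipsec-a except:
    #   state BACKUP
    #   priority 100
    #   unicast_src_ip 192.168.1.11
    #   unicast_peer { 192.168.1.10 }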

Then, on both machines, run the following:
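
For example:

    sudo systemctl enable --now keepalived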

Note that we have not directly configured systemd to start ipsec at boot on either VM. This is deliberate. Instead, keepalived will run at startup on both VMs, and ipsec will only run on the leader.

10. Associate VPC to TGW route tables

Now that the VPN is up, let’s configure routing within the VPC to send traffic destined for the CUDN CIDR to the VPN’s Transit Gateway.

Back in the AWS console, on the VPC navigation tab, find Transit gateway route tables, go to the Propagations tab, and ensure that both the VPC and VPN attachments are Enabled.

(Screenshot: TGW route table propagations)

Then click the Routes tab and, under Routes, click Create static route. For CIDR, put in the CUDN CIDR 192.168.1.0/24, and under Choose attachment, pick the VPN attachment and click Create static route.

(Screenshot: creating the static route)

Wait for a minute and it should now look like this:

(Screenshot: TGW route table routes)

11. Modify VPC route tables

Next, we will add a route to the CUDN targeting the TGW in each VPC route table that should reach the cluster overlay. On the navigation tab, find Route tables. Filter it based on your VPC ID.

Select the route table of one of the private subnets. Under the Routes tab, go to Edit routes. Click Add route. For Destination, put in the CUDN subnet (e.g., 192.168.1.0/24), as the Target pick Transit Gateway and select the TGW you created, and click Save changes.

Repeat this with other private/public subnets you want to route CUDN to as needed.

12. Configure networking on other VMs

Other OpenShift Virtualization VMs will each need some configuration to make networking work for them.

12.1 Add a secondary network interface for the CUDN

Like the ipsec VM, other VMs will also need a secondary network interface connected to the vm-network ClusterUserDefinedNetwork.

When creating a new VM, select Customize VirtualMachine, then select Network in the navigation bar. For existing VMs, go to the VM’s Configuration tab and then select Network from the side navigation. Under Network interfaces, click Add network interface. Name it cudn. Select the vm-network you created earlier.

(Screenshot: adding the cudn interface)

Depending on the specifics of the VM, you may need to reboot the VM before it can see the new network interface, or it may be available immediately.

12.2 Set an address for the network interface

Since IPAM is turned off on the cudn, each VM has to be given an IP address manually.

Log into the VM using Open web console, virtctl console, or virtctl ssh, if configured.

Then as root (run sudo -i), let’s first identify the ifname of the non-primary NIC. Depending on OS, this may either be disabled, or enabled with no IP address assigned.
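
For example:

    ip -br addr    # the CUDN NIC is the one without an address, often enp2s0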

Run the following inside the VM to give the second NIC (cudn) an IP. Replace enp2s0 with the name of the interface from the previous command. Replace the 192.168.1.20/24 with a unique address per VM within the CUDN CIDR (which in our examples has been 192.168.1.0/24) and ensure the number after the slash matches the subnet mask of the CIDR.
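
A sketch with NetworkManager (run as root):

    nmcli con add type ethernet ifname enp2s0 con-name cudn \
      ipv4.method manual ipv4.addresses 192.168.1.20/24
    nmcli con up cudn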

12.3 Set the gateway virtual IP as the next hop for VPC traffic

Each VM needs to know that it should send traffic destined for the VPC through the gateway virtual IP, which belongs to whichever ipsec VM is currently the leader.

As root, run the following. Replace 10.10.0.0/16 with your VPC’s CIDR. Replace 192.168.1.1 with the gateway virtual IP address.
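
For example, persisting the route on the same NetworkManager connection:

    nmcli con mod cudn +ipv4.routes "10.10.0.0/16 192.168.1.1"
    nmcli con up cudn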

13. Ping test

Now that everything is set, let’s try to ping from a VM to an EC2 instance in the VPC. Pick an EC2 instance to do so, e.g. a bastion host, a bare metal instance, etc.

Note that if you launch a throwaway EC2 instance for testing, a private subnet is safer (no public IP). Make sure its subnet’s route table sends the CUDN CIDR to the TGW as in step 11, and use Session Manager instead of a keypair if you want to skip inbound SSH. The security group needs ICMP (and optionally TCP/22) from 192.168.1.0/24.

Take note of the private IPv4 address. Then, on the VM console, run:
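
For example, with an assumed instance address:

    ping -c 4 10.10.1.25    # replace with the EC2 instance's private IPv4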

And then from the EC2 instance:
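
For example:

    ping -c 4 192.168.1.20  # replace with the VM's CUDN address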

14. Optional: Route-based (VTI) IPsec

Why do this?

  • Scale & simplicity: with VTIs you route like normal Linux—no per-subnet policy rules. Adding more VPC CIDRs later is just adding routes.
  • Better availability: you can ECMP two tunnels (one per TGW endpoint). That gives fast failover on the tunnel path. (Note: this is not AZ-resilient if you still have only one VM.)

14.1. Replace policy-based with route-based config

First, we need to find the Inside IP Address for the Customer Gateway for each tunnel. We will use this in the leftvti parameter when configuring Libreswan.

In the AWS VPC console, find Site-to-Site VPN connections and click on the Site-to-Site VPN you made earlier. Click the Download Configuration button. Choose Vendor Generic, set IKE version to IKEv2, and click Download.

The file has sections for each tunnel. Under IPSec Tunnel #1 look for #3 Tunnel Interface Configuration and find the Inside IP Addresses line for Customer Gateway. Note the CIDR that appears there for tunnel 1. Repeat the process in the IPSec Tunnel #2 section.

Here’s a sample of the configuration file with … where lines were elided for clarity:
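
The relevant portion looks roughly like this (the addresses are illustrative; AWS assigns link-local /30s):

    IPSec Tunnel #1
    ...
    #3: Tunnel Interface Configuration
    ...
    Inside IP Addresses
      - Customer Gateway        : 169.254.58.206/30
      - Virtual Private Gateway : 169.254.58.205/30
    ...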

If you haven’t already, import your certs to the VM per step 8.4 above.
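
A hedged sketch of the route-based replacement (remove the policy-based conns first; the leftvti values come from the Inside IP Addresses found above, and the mark values are arbitrary but must differ per tunnel):

    sudo rm -f /etc/ipsec.d/aws-s2s.conf   # drop the policy-based config from step 8.5

    sudo tee /etc/ipsec.d/aws-s2s-vti.conf <<'EOF'
    conn aws-tunnel1-vti
        type=tunnel
        authby=rsasig
        left=%defaultroute
        leftid=%fromcert
        leftcert="left-cert"
        leftsendcert=always
        leftsubnet=0.0.0.0/0
        right=TUNNEL_1_OUTSIDE_IP
        rightid=%fromcert
        rightsubnet=0.0.0.0/0
        ikev2=insist
        mark=101/0xffffffff
        vti-interface=vti1
        vti-routing=no
        leftvti=169.254.58.206/30     # Inside IP for Customer Gateway, tunnel 1
        auto=start

    conn aws-tunnel2-vti
        also=aws-tunnel1-vti
        right=TUNNEL_2_OUTSIDE_IP
        mark=102/0xffffffff
        vti-interface=vti2
        leftvti=169.254.59.14/30      # Inside IP for Customer Gateway, tunnel 2
    EOF
    sudo systemctl restart ipsec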

14.2. System tunables for route-based IPsec

Two key rules:

  1. VTIs: disable_policy=1, disable_xfrm=0 (encrypt on the VTI, but don’t do policy lookups).
  2. Physical NICs: disable_policy=1, disable_xfrm=1 (never apply XFRM on the underlay).
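
Applied at runtime, assuming the vti1/vti2 names from the sketch above and enp1s0 as the underlay NIC:

    sudo sysctl -w net.ipv4.conf.vti1.disable_policy=1 net.ipv4.conf.vti1.disable_xfrm=0
    sudo sysctl -w net.ipv4.conf.vti2.disable_policy=1 net.ipv4.conf.vti2.disable_xfrm=0
    sudo sysctl -w net.ipv4.conf.enp1s0.disable_policy=1 net.ipv4.conf.enp1s0.disable_xfrm=1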

Optionally, you can also persist VTI sysctls to survive reboots:
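
For example:

    sudo tee /etc/sysctl.d/98-ipsec-vti.conf <<'EOF'
    net.ipv4.conf.vti1.disable_policy = 1
    net.ipv4.conf.vti1.disable_xfrm = 0
    net.ipv4.conf.vti2.disable_policy = 1
    net.ipv4.conf.vti2.disable_xfrm = 0
    net.ipv4.conf.enp1s0.disable_policy = 1
    net.ipv4.conf.enp1s0.disable_xfrm = 1
    EOF
    sudo sysctl --system   # note: the VTI keys only apply once those interfaces exist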

14.3. Route VPC CIDRs over the VTIs (ECMP)
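
A sketch of an equal-cost multipath route, using the example VPC CIDR:

    sudo ip route replace 10.10.0.0/16 \
        nexthop dev vti1 weight 1 \
        nexthop dev vti2 weight 1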

14.4. Quick verification
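
A few checks along these lines:

    sudo ipsec trafficstatus     # both conns should report SAs with traffic counters
    ip -br link show type vti    # vti1 and vti2 should be UP
    ping -c 3 10.10.1.25         # an instance in the VPC (example address)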

And if you go to the VPN console:

(Screenshot: both tunnels up)

Availability footnote

ECMP across two tunnels on one VM protects against a single TGW endpoint flap and gives smooth failover, but it’s not AZ-resilient. For true HA, run two IPsec VMs in different AZs (each with both tunnels) with keepalived as described above, but configured with VTIs.
