Red Hat is excited about bare metal and the potential benefits it can bring to both public cloud and on-premises computing. Once upon a time, bare metal was declared a dirty word in the datacenter (whisper). It was something that needed to be virtualized with a hypervisor (ESXi, KVM, Hyper-V, Xen, and others) to offer its true potential. Something dramatic and evolutionary would need to happen to change this operational and economic position, one that forced many people to pay for hypervisors on their hardware. Luckily for the world, not one but two innovations hit within a 5-year span to change history.

The first thing that opened the door to bare-metal computing coming back into popularity in the modern cloud was Linux containerization and the ability to have portable workloads packaged in the Open Container Initiative (OCI) format. Linux containers transformed modern workloads and led to standardization in orchestration. Those innovations allowed developers to attach their application services to an infrastructure at a higher and more powerful API level.

The second thing that really made bare metal move came from intelligent application services: big data analytics, high-performance computing, machine learning (ML), deep learning, and artificial intelligence (AI). These high-bandwidth, low-latency workloads, and the distributed toolsets or development models that leverage them, typically require the large compute resources commonly found in bare metal. Connecting the results and inference back to traditional applications, in ways that let data proximity shine through to a powerful, analytics-enhanced end-user experience, worked best on bare metal. These use cases exploit the underlying hardware far more efficiently and performantly than passing instruction sets through a virtualization layer. This combination of Linux containerization and analytics at scale changed how people wanted to interact with bare metal and opened the floodgates for people seeking this competitive advantage for their businesses.

At the same time, specific commercial verticals had innovation spikes that benefited from bare metal. A good example is 5G in the telecommunications space. As providers and suppliers revamped their solution stacks for 5G, they modernized and incorporated the agility of containerization. This drove the need for IPv6, SR-IOV, Container Network Functions (CNFs), NUMA topologies, and other innovations for containerized applications on bare metal. Another good example? The media and entertainment vertical driving new human interactions with data at the edge. Cheap, simple, and lightweight devices need to run containers to process large amounts of data in place, because hauling that data back into the datacenter is cost prohibitive. All of that requires containerization and its frameworks to leverage unique bare-metal solutions that cross CPU architectures (for example, ARM).

Red Hat, being a leader in Linux operating systems, OpenStack, containerization, and Kubernetes, saw an opportunity to do something special by combining these technologies, which would benefit customers looking to create a competitive advantage with bare metal. We've been thinking about this for a while. By laying Red Hat Enterprise Linux (RHEL) down on bare metal, as people have been doing for decades, we have everything we need to form a great user experience.

The first thing we created on this journey was an open source community around automating server management for Kubernetes clusters. On this front, we were fortunate enough to have already been doing it for years in another community. Leveraging the OpenStack Ironic open source project, we created Metal3. Metal3 allows us to automate the orchestration of servers in racks through such popular interfaces as IPMI and Redfish, alleviating the operational burdens of owning them. Once the servers are up, there is networking: Ansible offers an exhaustive list of network device automations to line up your racks. And we have zero touch provisioning (ZTP), which places all these steps in a global deployment pattern ready for thousands of endpoints.
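
To make that concrete, here is a minimal sketch, in Python with the kubernetes client, of what registering a server with Metal3 can look like: you describe the machine as a BareMetalHost resource and let the Ironic-backed controllers manage it over IPMI or Redfish. The BMC address, credentials Secret, MAC address, and namespace below are placeholders, not a prescription.

```python
# Minimal sketch: register a rack server with Metal3 by creating a
# BareMetalHost resource. The Ironic-backed controllers then handle power,
# inspection, and provisioning over IPMI/Redfish. All names and addresses
# here are placeholders.
from kubernetes import client, config

config.load_kube_config()  # assumes kubeconfig access to the hub cluster

bare_metal_host = {
    "apiVersion": "metal3.io/v1alpha1",
    "kind": "BareMetalHost",
    "metadata": {"name": "worker-0", "namespace": "openshift-machine-api"},
    "spec": {
        "online": True,  # power the machine on once it is registered
        "bootMACAddress": "00:11:22:33:44:55",
        "bmc": {
            # Redfish endpoint of the server's baseboard management controller
            "address": "redfish://10.0.0.10/redfish/v1/Systems/1",
            # Secret holding the BMC username/password
            "credentialsName": "worker-0-bmc-secret",
        },
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="metal3.io",
    version="v1alpha1",
    namespace="openshift-machine-api",
    plural="baremetalhosts",
    body=bare_metal_host,
)
```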

We believe there will always be a need to have virtual machines (VMs) in a complete solution, so we created the KubeVirt open source community from our knowledge of KVM and RHV. It's a fresh look at modernizing legacy VMs by treating them as if they were pods on Kubernetes: same network, same scale up or scale down, same YAML or resource objects. The project ports it all over to native Kubernetes APIs, right next to your containers (a minimal sketch of defining a VM this way follows the list below). From storage and GPUs to sophisticated application deployment patterns for databases through operators, Red Hat has invested in numerous open source projects to make sure people can get the most out of running Kubernetes on bare metal where appropriate, such as:

RHEL

Metal3

Zero touch provisioning

KubeVirt

Ansible networking

Assisted installer

Rook

Ceph

GPU

Operators

Multi-cluster management

Container network functions

Red Hat Edge

Kata Containers

Resource Control
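
As promised above, here is a minimal sketch, again in Python with the kubernetes client, of what treating a VM like any other Kubernetes resource can look like: a KubeVirt VirtualMachine defined as a plain resource object and created through the same API machinery as a pod. It assumes a cluster with KubeVirt (OpenShift Virtualization) installed; the image, memory size, and names are placeholders.

```python
# Minimal sketch: a KubeVirt VirtualMachine expressed as an ordinary
# Kubernetes resource object and created through the same API machinery
# as a pod. Image, memory size, and names are placeholders.
from kubernetes import client, config

config.load_kube_config()

virtual_machine = {
    "apiVersion": "kubevirt.io/v1",
    "kind": "VirtualMachine",
    "metadata": {"name": "legacy-app-vm", "namespace": "demo"},
    "spec": {
        "running": True,  # start the VM as soon as it is created
        "template": {
            "spec": {
                "domain": {
                    "devices": {
                        "disks": [{"name": "rootdisk", "disk": {"bus": "virtio"}}]
                    },
                    "resources": {"requests": {"memory": "2Gi"}},
                },
                "volumes": [
                    {
                        "name": "rootdisk",
                        # a container image that carries the VM's boot disk
                        "containerDisk": {"image": "quay.io/containerdisks/fedora:latest"},
                    }
                ],
            }
        },
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="kubevirt.io",
    version="v1",
    namespace="demo",
    plural="virtualmachines",
    body=virtual_machine,
)
```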

Now that you understand the backdrop against which bare metal is important in modern computing, let's turn our attention to some common misconceptions in the industry. I'll use an example I saw recently on the internet. In a nutshell, some people conflate the ability to add more kubelets with linearly increasing the work capacity of a cluster. What people point out during such a conversation is that the open source Kubernetes community limits how many pods a kubelet should run, which results in a practical ceiling of around 500 pods per kubelet. They then take that number and compare it to carving the same node up into VMs, arguing that you can get more pods from the same machine by running several kubelets, each with its own 500-pod limit, than by running a single kubelet with one 500-pod limit.

I'll point out that this pod limit is not a hard one, and it's not really a best practice to think of it this way. You shouldn't treat the number of pods per node as an independent number, uninfluenced by everything else in the same cluster. Most people who are leveraging bare metal are doing so for larger workloads that would consume most boxes at pod densities of around 250, or to get better access to a raw hardware resource on a PCI bus or elsewhere. To date, there's been little need to redesign the kubelet around the issue. Maybe someday in the future it will be needed.
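
To illustrate that the ceiling is a tunable setting rather than a hard wall, here is a minimal sketch, assuming an OpenShift cluster with the Machine Config Operator, of raising the per-node pod limit by creating a KubeletConfig resource from Python. The resource name and pool label are placeholders.

```python
# Minimal sketch: raise the per-node pod ceiling on OpenShift by creating a
# KubeletConfig resource that the Machine Config Operator rolls out to the
# selected machine config pools. Name and label are placeholders.
from kubernetes import client, config

config.load_kube_config()

kubelet_config = {
    "apiVersion": "machineconfiguration.openshift.io/v1",
    "kind": "KubeletConfig",
    "metadata": {"name": "set-max-pods"},
    "spec": {
        # Applies to nodes in machine config pools carrying this label.
        "machineConfigPoolSelector": {
            "matchLabels": {"custom-kubelet": "large-pods"}
        },
        # The kubelet options to merge in; maxPods is the knob discussed above.
        "kubeletConfig": {"maxPods": 500},
    },
}

client.CustomObjectsApi().create_cluster_custom_object(
    group="machineconfiguration.openshift.io",
    version="v1",
    plural="kubeletconfigs",
    body=kubelet_config,
)
```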

The other thing people should consider when thinking about resource consumption is the workload itself. With public clouds, datacenters, VMs, bare metal, and new specialized server devices, we have a lot of options for pairing workloads with the correct target. We are no longer forced into a simplistic world of only VMs. Running as many lightweight API endpoints as you can on bare metal is likely not the best use of bare metal. A workload with those characteristics belongs on a public cloud, or on a VM with containers running on top; it does not need to run on bare metal. This is why OpenShift offers the ability to leverage OpenShift Virtualization (KubeVirt), so that users can make the right choices for their workloads.

When you look at getting the most out of your Kubernetes clusters, there's more to consider than the pods. Kubernetes has a variety of API resource objects, and they all have a relationship or consequence to each other. This is especially important as you add more container images, more pods, more Kubernetes Services, more ports, more LoadBalancers, more Secrets, more namespaces, more PVCs, and so on. They can all trigger the use of one another as you deploy more real-world, even lightweight, applications. Eventually, a cluster will normally exhaust one of these other resources before it reaches the pod limit. With the business-critical application loads we see, very large clusters normally hit limits on etcd and the Service APIs before pod limits are reached. When thinking about resource object exhaustion, there is more than pods to consider.
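
As a small illustration of watching more dimensions than pod count, here is a minimal sketch, assuming kubeconfig access to the cluster, that tallies a few of the object types that tend to grow together. On very large clusters you would want paginated list calls rather than the simple ones shown here.

```python
# Minimal sketch: tally a few API object types that tend to grow together,
# so pod count is not the only dimension being watched. Assumes kubeconfig
# access; large clusters would want paginated list calls.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

counts = {
    "pods": len(core.list_pod_for_all_namespaces().items),
    "services": len(core.list_service_for_all_namespaces().items),
    "secrets": len(core.list_secret_for_all_namespaces().items),
    "persistent volume claims": len(
        core.list_persistent_volume_claim_for_all_namespaces().items
    ),
    "namespaces": len(core.list_namespace().items),
}

for kind, count in counts.items():
    print(f"{kind}: {count}")
```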

The Future

With OpenShift, you can run on multiple public clouds, bare metal, and VMs. You can let the requirements of the application service drive the infrastructure choice. Think about the power of running a mixed KubeVirt, container, and Knative Serverless composite workload in the same cluster. Now close your eyes and see it across multiple clusters. Some clusters you run yourself, and some are run for you, all of them the same OpenShift. We at Red Hat cannot wait to see what you turn your bare metal into. Let us know.