If you’re an old-school computer user, you could be forgiven for thinking it crazy that Microsoft SQL Server runs on Linux. Since the 2017 release of that enterprise database, Red Hat Enterprise Linux has been a supported platform. And this strategic coming together of two former competitors isn’t over yet.

Microsoft SQL Server 2019 offers a host of quality-of-life improvements for users. These include faster query processing, in-memory database improvements and a new feature Microsoft is calling Big Data Clusters. 

Big Data Clusters is a fairly direct name for some truly impressive capabilities being embedded into MS SQL Server. Rather than pushing yet another database into your ecosystem, most likely alongside dozens of others, MS SQL Server 2019 will be able to act as a sort of data gateway for all of those other datastores. The goal is to allow a query written in Microsoft’s T-SQL to be spread out to other systems that may use other variants of SQL, such as Oracle’s PL/SQL. Thus, developers will be able to write all of their database queries in T-SQL, yet have them run against MongoDB, Teradata or Oracle, among other planned target databases.

This enables developers to focus on a single query language, while still being able to take advantage of some of the benefits of other database types. That means all the powerful machine learning features inside SQL Server 2019 can be brought to bear on outside data sets. And if you’ve already got Spark and Hadoop, SQL Server 2019 can talk to those systems as well.

Vin Yu, program manager at Microsoft, said, “We want customers building apps with SQL Server. If they are familiar with using SQL Server, they can continue to build their apps. What’s challenging is [having] the right database components and multiple endpoints. You can connect to SQL Server and connect to other datastores. You don’t have to know the different semantics of SQL.” This particular feature is known as PolyBase.
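To make the idea a little more concrete, here is a minimal sketch of the kind of T-SQL a developer might write to expose a remote Oracle table through PolyBase and then query it like a local one. The Python only assembles the statement text and executes nothing. Every name in it (`OracleSales`, `oracle_cred`, the host, the columns, the remote object path) is a hypothetical placeholder, the credential is assumed to have been created beforehand, and the exact `LOCATION` formats vary by target database, so treat this as an outline under those assumptions rather than a documented procedure.

```python
def polybase_ddl(source, location, credential, table, columns, remote_object):
    """Return T-SQL statements that register an external source and table.

    All argument values are caller-supplied placeholders; nothing is executed.
    The credential is assumed to exist already in the database.
    """
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return [
        f"CREATE EXTERNAL DATA SOURCE {source} "
        f"WITH (LOCATION = '{location}', CREDENTIAL = {credential});",
        f"CREATE EXTERNAL TABLE {table} ({column_sql}) "
        f"WITH (LOCATION = '{remote_object}', DATA_SOURCE = {source});",
    ]


# Hypothetical names, host, and columns, for illustration only.
statements = polybase_ddl(
    source="OracleSales",
    location="oracle://oracle.example.com:1521",
    credential="oracle_cred",
    table="dbo.customers",
    columns=[("id", "INT"), ("name", "NVARCHAR(100)")],
    remote_object="XE.SALES.CUSTOMERS",
)

# Once the external table exists, it is queried with ordinary T-SQL.
statements.append("SELECT TOP 10 id, name FROM dbo.customers;")

for statement in statements:
    print(statement)
```

The point of the sketch is the last statement: after the one-time setup, the application sees `dbo.customers` as just another table, regardless of where the rows actually live.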

Yu said this type of interaction is enabled by the transition to container-based operations. While traditional thinking has held that databases don’t run well in containers, primarily due to storage and high availability concerns, Yu said that SQL Server made the jump to containers easily, thanks to the existing work that had been done to port the database to Linux.

“When we support RHEL 8, we will also have SQL Server 2019 on RHEL 8 containers coming out. From our perspective that was the biggest enabler. From then on, it’s like packaging any app in a container,” said Yu, highlighting the timeline for the planned addition of RHEL 8 support sometime next year.

The work at Microsoft to enable SQL Server 2019 on Red Hat OpenShift is also focused on building out a High Availability Kubernetes Operator for the database. Kubernetes Operators codify human knowledge by automating the deployment and lifecycle management of applications (including databases) running on OpenShift. “This operator we’re working on for Big Data Clusters is really going to be focused on setting up HA. This is one of the biggest pain points for any enterprise customer, because it involves manually setting up multiple SQL Servers. We’re taking a huge pain point away,” said Yu.
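For readers unfamiliar with the operator pattern, the core idea is a reconcile step: compare the state you want with the state you observe, and work out what to do about the difference. The toy Python below sketches that step for a highly available database. It does not call Kubernetes and is not Microsoft's operator; `ClusterState`, `reconcile`, and the action strings are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Observed state of a hypothetical highly available database group."""
    replicas: int
    has_primary: bool


def reconcile(desired_replicas, observed):
    """Return the actions needed to move observed state toward desired state."""
    actions = []
    if observed.replicas < desired_replicas:
        actions.append(f"add {desired_replicas - observed.replicas} replica(s)")
    elif observed.replicas > desired_replicas:
        actions.append(f"remove {observed.replicas - desired_replicas} replica(s)")
    if not observed.has_primary:
        actions.append("promote a replica to primary")
    return actions


# A degraded cluster needs work; a healthy one yields an empty action list.
print(reconcile(3, ClusterState(replicas=1, has_primary=False)))
print(reconcile(3, ClusterState(replicas=3, has_primary=True)))
```

A real operator runs a step like this continuously against the cluster, which is how the manual work of setting up and repairing multiple servers gets automated.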

The operator pattern of deploying services inside a Kubernetes cluster was appealing to Microsoft because it enables developers to be their own administrators, rather than monopolizing database administrators with the day-to-day task of building out infrastructure for projects.

“If you look at how databases were deployed even 10 years ago, you had to have a specialist install each individual machine. A DBA’s job was not just to support the databases, but to help developers. Every time a developer messed up, a DBA had to come fix it. Containers allow DBAs to package a database, and anyone in the company can deploy this with the single click of a button. That’s one area containers solve a problem,” said Yu.

And this is why Microsoft has begun to implement features in MS SQL Server that cannot be found outside of a containerized environment. “You can only use Big Data Clusters in a containerized deployment,” said Yu. “There is no other way. With SQL Server, Microsoft is making bets on the container ecosystem as a whole. In order to use specific features, you have to be using containers. In terms of ease of use over time, we’re going to see this evolve in a more mature fashion. One thing we always talk about is that containers are the new VMs. It took many years to adopt VMs, and people argued if they were the right choice. Now they’re the standard. Instead of questioning [containers], we’re making a bet and jumping onto it early.”

SQL Server 2019 is now generally available to run natively on RHEL 8 or as a RHEL-based container image. Support for Microsoft SQL Server 2019 Big Data Clusters should arrive on OpenShift platforms in early 2020.

