Brent Ozar is a SQL Server expert and Microsoft Certified Master who’s helped companies like Google and Stack Overflow get the most out of their SQL Servers.
We sat down (virtually) with Brent to talk about the future of Microsoft SQL Server, and the DBAs who’ve built their careers on it, in the new world of the cloud.
Forrest Brazeal: Brent, you’re a performance specialist who’s worked with massive SQL Server deployments; how does it make you feel when you hear cloud architects saying things like “relational databases don’t scale”? What do you wish they knew?
Brent Ozar: They’re right. Nothing scales worth a damn. Even stuff that looks easy, like Amazon S3 file storage, can run into scaling problems if you design it without an analysis of how AWS partitions reads, and how you have to watch out for your object prefixes.
But the thing is, nobody reads the documentation.
Oh sure, they read page 1, they build the Hello World version, and then they start building an app without any regard whatsoever for how the design is gonna perform. (Sometimes, there’s not even a design – folks just start coding willy-nilly.)
So I kinda chuckle when I see folks say that something won’t scale – nothing will. It’s a poor musician that blames his instrument. Pick the persistence layer that you know the best, because that’s a hedge against you making dumb design decisions that will come back to haunt you later. The more mistakes you’ve made in the past with a tool, the more likely it is that you won’t make those mistakes again.
If you choose a persistence layer you’ve never used in production before, and your app becomes popular, you’re gonna be paying for the mistakes you made early on – or the mistakes the persistence layer made for you.
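Editor's aside on the S3 prefix point above: AWS documents S3 request-rate limits per key prefix, so a naming scheme that funnels every write into one prefix (a date folder, say) can become a hot spot. The following is a minimal sketch of one common workaround, assuming a hash-derived shard prefix is acceptable for your access pattern. The function name `sharded_key`, the shard count of 16, and the sample key names are all hypothetical and are not taken from the interview.

```python
import hashlib
from collections import Counter


def sharded_key(object_name: str, shards: int = 16) -> str:
    """Prepend a stable, hash-derived shard prefix to an object name.

    Hypothetical helper: the same name always maps to the same shard,
    so a reader can recompute the full key from the name alone.
    """
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return f"{shard:02x}/{object_name}"


# Sample names that would otherwise all share the prefix "logs/2021-03-01/".
names = [f"logs/2021-03-01/event-{i:06d}.json" for i in range(10_000)]

# Count how many objects land under each shard prefix.
counts = Counter(sharded_key(name).split("/", 1)[0] for name in names)

for prefix in sorted(counts):
    print(prefix, counts[prefix])
```

The trade-off is that objects are no longer listable in date order under a single prefix, which is exactly the kind of design decision the answer above says to make deliberately rather than discover in production.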
So with that said, are there other reasons to choose SQL Server today, beyond “it’s what I already know”?
When your application gets implemented at a 10,000-employee company, you start to get a lot more enterprise-sounding requests from different departments. For example, Sam from security needs to audit which users have changed data over the last 30 days, and Annabel from accounting needs your app to sync tax data with SAP.
In environments like this, it’s much more common to use Grandpa’s relational database because it’s already got a lot of built-in functionality to help accomplish these goals, and it’s easier to hire people off the street who know how to do it.
If you’re building a new web site or SaaS tool, and you’re going to be the only company that ever hosts it, then you have a lot of flexibility to choose the data persistence layer you want. You can use Database du Jour without problems. But if you wanna sell an app into 10,000-employee companies, you’re gonna get faster sales if you use a database they support.
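Editor's aside: to make the auditing request above concrete, here is a minimal sketch of the question "which users changed data in the last 30 days?" expressed as a query. It uses Python's built-in SQLite module purely as a stand-in so the example runs anywhere; it is not SQL Server's own auditing feature, and the `audit_log` table, its columns, and the user names are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE audit_log (
        changed_by TEXT NOT NULL,   -- who made the change
        table_name TEXT NOT NULL,   -- what they changed
        changed_at TEXT NOT NULL    -- ISO 8601 UTC timestamp
    )
    """
)

now = datetime.now(timezone.utc)
sample_changes = [
    ("alice", "invoices", now - timedelta(days=2)),
    ("bob", "tax_rates", now - timedelta(days=10)),
    ("alice", "tax_rates", now - timedelta(days=45)),
    ("carol", "invoices", now - timedelta(days=90)),
]
conn.executemany(
    "INSERT INTO audit_log VALUES (?, ?, ?)",
    [(user, table, ts.isoformat(timespec="seconds")) for user, table, ts in sample_changes],
)

# Timestamps share one format and one UTC offset, so string comparison
# orders them chronologically.
cutoff = (now - timedelta(days=30)).isoformat(timespec="seconds")
recent_editors = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT changed_by FROM audit_log "
        "WHERE changed_at >= ? ORDER BY changed_by",
        (cutoff,),
    )
]
print(recent_editors)
```

The query itself is trivial. The enterprise-grade part is everything around it: capturing every change reliably, protecting the log from tampering, and retaining it for as long as the auditors require.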
You’ve achieved such renown for your eponymous SQL Server helper tools that your name is sometimes simply listed as a required skill on DBA job postings. This is probably not a realistic goal for most of us. That said, what are a few career steps you recommend to become a trusted DBA?
[Laughs] Yeah, that was surreal.
To become a trusted DBA – or a trusted anything – freely share what you know on a web site with your name on it.
Sharing knowledge accomplishes several different things. First, if a prospective boss or coworker Googles you, they’ll find your name, read what you wrote, and say, “Hey, this person really does seem to know stuff that I don’t know.” The dates on your blog posts also establish a track record of what you’ve known over time, and how it’s grown.
In the beginning, your goal is to be found if people Google your name. That’s fairly easy. Later on, as you start to accumulate more writings, people are also going to run across them when they Google for the subject matter, even without including your name. That’s when you’ll start building an audience. People tend to worry about the audience first, but hold off. Just start by building a public track record under your name, and the rest will start to snowball.
AWS has recently announced an open-source translation tool called Babelfish, which is an aggressive attempt to get people off SQL Server and onto open-source database engines such as Postgres. What are your thoughts on this tactic?
I adore Babelfish because it keeps the pressure on Microsoft to continue innovating. The database market is kinda like this:
- Oracle – massively expensive
- Microsoft SQL Server – pretty doggone expensive
- AWS RDS Postgres and Aurora – inexpensive to mildly expensive
- Postgres – free to inexpensive, depending on support
If you’re Microsoft, and you wanna make more money, you look at Oracle’s pile of cash, and you build features that will help Oracle users migrate down to SQL Server and save money. If you’re AWS, you look at Microsoft’s pile of cash, and you build features like Babelfish that will help SQL Server users migrate down and save money. Everything slowly becomes a less expensive commodity. (If this kind of thing interests you, search for Wardley value mapping – it’s really neat stuff.)
I’m really excited about AWS Babelfish because it encourages Microsoft to start defending their value proposition. Microsoft has to start shipping new stuff that makes AWS RDS & Aurora customers aspire to using the Real Deal SQL Server (and/or Azure SQL DB).
SQL Server 2017 & 2019 started bringing that with query optimizer improvements such that your code will just run faster – that’s a good down payment, but they gotta keep making those payments. If they don’t, projects like Babelfish will start to win over customers if/when they ship.
SQL Server on Linux: smart play by MSFT, or failed experiment?
If you go down the list of popular databases, the only ones that are Windows-only are SQL Server and Access. I think it made sense for Microsoft to hedge their bets and start making the investments necessary to survive in datacenters 10-20-30 years from now. It’s a hedge, though: it shouldn’t be your first default choice for running SQL Server. The full feature capability isn’t there, and the high availability & disaster recovery story is still pretty sketchy.
I think of SQL Server 2019 on Linux as an alpha release. It’s there if you have to have it, but otherwise, you should wait for a few years until you really need it. Right now, it’s just kinda there for experiments.
The one place where you might have to have it is throwaway environments for continuous integration. If you need to quickly and inexpensively stand up and shut down a lot of temporary SQL Servers to test your code, then sure, it makes sense there.
What’s the future of SQL Server in the cloud, anyway? Is it the next Oracle (as AWS is trying to position it), widely disliked and mostly legacy, or can it compete with managed OSS engines like Postgres and MySQL on its own terms?
Don’t think about the dislike & legacy in terms of the name brand of the database engine.
Think about dislike & legacy in terms of the features that have to be implemented: auditing, fine-grained single sign-on security, integration with Active Directory, compatibility with reporting systems, GDPR compliance, encryption, etc. Here’s a great place to start: https://www.enterpriseready.io
Developers HATE those features. That’s why Amazon and others say that Oracle and SQL Server are disliked, legacy applications: they support all the features that enterprise users demand, but developers hate. I hate those features too, but…enterprises demand ’em.
If you’re not in an enterprise, and you don’t have plans to sell your app to an enterprise, then don’t use an expensive relational database. Use something cheaper and cloud-native like Amazon’s touting. I agree that it’s a better choice.
Finally, what does a DBA role look like in, say, 2025?
2025 is just four short years from now, and most orgs still aren’t even planning for a cloud-native future. They’re just talking about it. The DBA job description in 2025 will still look the same as it looks today: managing data stored in a bunch of diverse platforms, all of which have their own management, troubleshooting, and performance tuning.
If you stretch it out to 2031 or 2036, 10-15 years from now, it’ll be easier. We’ll have fewer worries around backups, corruption repair, and restores. But you know what we’ll still be dealing with? Troubleshooting (because everybody assumes it’s a database problem first), security, encryption, performance tuning, and architecture.
The cloud doesn’t architect itself.
This interview is part of ACG’s Cloud Conversations series.