This cloud conversation with Troy Hunt, Microsoft Regional Director and founder of Have I Been Pwned, has been edited and condensed for clarity.
Forrest Brazeal: Troy, what’s your origin story? How did you become a legend in the world of application security?
Troy Hunt: Kinda accidentally, to be honest. My background is in software development; I’ve been developing web apps since I was a teenager. I ended up in the corporate world for 15 years, making myself thoroughly miserable. I went through a cycle, like so many in big companies: after a few years, they tell you that you should stop writing code and become a manager. There’s no career path to stay technical.
So … I became an architect. (I’m still not sure what that means!) I drew a lot of diagrams on a lot of whiteboards, but I missed being hands-on. I started writing a blog in late 2009, which gave me an opportunity to articulate technical thoughts to the technical audience I missed. Blogging was a good outlet for me.
That’s not a typical infosec background … where in all this did you pick up your security expertise?
I’d written a bit about security, because I found there wasn’t much content out there written for developers. There’s lots written for infosec people, but they live in a different world. I got traction writing in a dev-centric fashion about security, eventually got the Microsoft MVP award, and everything sort of snowballed from there.
I would love to say I had a master plan. I’d have been so happy 11 years ago if I could have looked ahead and seen where I am now. But my career path has been literally accidental all the way.
Several years ago, you started a project called HaveIBeenPwned that helps people see if their personal data has been exposed in any known breaches. Hundreds of verified breaches later, the constant flood must start to feel like deja vu. What are some of the most common ways you see developers leave their data unprotected?
Certain patterns do repeat over and over. Usually, it’s a database of some kind that is facing the web. For a while it seemed like all the culprits were unsecured S3 buckets, then we went through a MongoDB phase. Right now public Elasticsearch instances seem to be a common target. So picking up on those patterns is certainly interesting.
But most interesting, to me, is the social perspective – the way organizations respond to breaches. Like, I’ll drop pretty convincing evidence of a breach, and some companies’ first move is still to deny it. And I’m like, “come on.” You can never be absolutely sure, but I don’t go public with a breach unless I’m 99% confident it’s legit.
I’ll give you a recent example: there’s a ride-sharing company in India called ZoomCar. Their data leaked and was being passed around through hacking forums. It was easy to establish it was legitimate data, because you could go plug in the credentials on their site and reset people’s passwords. I went to a forum and had one of the breach victims tell me the last digit of their phone number, and I provided the preceding two. The breach had already been reported in the press. Zero percent chance this isn’t real. And yet ZoomCar was still vehemently denying a breach.
I call it “the five stages of data breach grief”. At first it’s deny, deny, deny, and then you get angry. But one way or the other, I am going to get you to acceptance. We can do that the easy way or the hard way. But acknowledging that you have a problem is the first step to fixing it.
You help developers protect themselves from breaches like this by showing them a hacker’s perspective. Let’s say I’m building a cloud application. Can you walk me through some things I, as a dev, might do to protect myself?
Well, there’s a heap of things, so let’s pick some easy wins.
First off, put passwords on your databases!
Then, reduce your attack surface by avoiding public-facing data. People leave databases publicly accessible because it’s easy, but it’s not that hard to create a DMZ and only accept whitelisted connections.
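As a rough illustration of the "only accept whitelisted connections" idea, here is a minimal Python sketch. The network ranges and the `is_allowed` helper are hypothetical, made up for this example; in practice this rule would more likely live in a firewall or cloud security group than in application code.

```python
import ipaddress

# Hypothetical allowlist: the only networks permitted to reach the database.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.1.0/24"),     # assumed app-tier subnet
    ipaddress.ip_network("203.0.113.7/32"),  # assumed admin jump host
]


def is_allowed(client_ip: str) -> bool:
    """Return True only if client_ip falls inside an allowlisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip in ("10.0.1.25", "198.51.100.4"):
        print(ip, "allowed" if is_allowed(ip) else "rejected")
```

The default here is to reject: anything not explicitly listed never gets to talk to the database.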
But there’s a movement in cloud security to avoid private networks and require services to take care of auth. Do you feel that zero-trust is a bad idea?
I mean, the principle of it is good. The whole idea of an impenetrable perimeter around the corporate network started disappearing with USB sticks and bring-your-own devices. Now, with remote work and cloud services, it just doesn’t fly anymore.
But there’s a common-sense balance here. I’m not saying rely on perimeters or throw them out. I’m saying let’s apply defense in depth. Each device should be resilient in and of itself, sure. But some of them aren’t.
Like, I have no idea if I can trust my washing machine. It’s on my home network, not entirely sure why, it seemed like a good idea at the time. I try to keep it patched. But if it gets compromised, at least it’s isolated.
Maybe we should stop thinking about that and go back to application security tips.
I always try and get people to think about data minimization and data retention. That isn’t even really a technical conversation, it’s just common sense. If you don’t store the data, you don’t have to worry about losing it. So ask yourself: “Do I really need this data, and for how long?”
I’ll give you an example of how to screw this up. There’s a website called CatForum.com. It’s exactly what it sounds like: people go there to talk about their cats. A typical discussion would be: “Is it safe for my cat and me to eat out of the same bowl?”
But in order to chime in on this conversation, you must provide personal information: CatForum.com makes you give them your date of birth to sign up. Why?? It makes no sense! If they ever do experience a breach, they’ve just exposed a bunch of static, knowledge-based identification that can be used to guess access to more sensitive services.
Now, you might say “I need to ask this question, because my terms of service require the user to be over 13.” So why not just ask them if they’re over 13? “Well, they could lie to me.” I mean, they could lie to you about the date of birth too, so that makes no sense. Or if you really want them to enter the date, just do the math on their age and don’t store the DOB long-term. My point is, there are so many ways around this problem that don’t involve unnecessarily storing personal data.
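The "do the math and don’t store the DOB" approach could look something like the Python sketch below. The function names, the record layout, and the reading of the threshold as "at least 13" are all assumptions for illustration, not anything CatForum.com or any real site is known to use.

```python
from datetime import date
from typing import Optional

MINIMUM_AGE = 13  # assumed threshold, read here as "at least 13"


def is_old_enough(date_of_birth: date, today: Optional[date] = None) -> bool:
    """Compute age in whole years and compare it to the minimum."""
    today = today or date.today()
    had_birthday = (today.month, today.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    age = today.year - date_of_birth.year - (0 if had_birthday else 1)
    return age >= MINIMUM_AGE


def build_user_record(
    username: str, date_of_birth: date, today: Optional[date] = None
) -> dict:
    """Keep only the yes/no answer; the date of birth is never stored."""
    return {
        "username": username,
        "meets_minimum_age": is_old_enough(date_of_birth, today),
    }
```

The date of birth exists only for the duration of the sign-up call, so a later breach of the user table exposes a boolean rather than a piece of knowledge-based identification.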
You do some interesting things running haveibeenpwned.com really cheaply on serverless, with Cloudflare and Azure Functions. Would you advocate that others try this? Are the cost savings still meaningful for you as you scale?
Well, it’s mostly serverless. Everything that gets hit in anger — the search box, the pwned passwords page which is hit 25 million times a day — all serverless. The front page of the website runs on an app service, but that’s still a PaaS, and it’s massively cached on the edge with Cloudflare.
As for the bill: a full third of my costs is logging for AppInsights. So I had to downsample that. And as I add more services, that bill gets more complex. Like, I have support costs through Cloudflare — where do you add that in? It’s more amorphous.
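To make the downsampling idea concrete, here is a generic Python sketch of hash-based log sampling. It is not the Application Insights API and not a description of how haveibeenpwned.com is configured; `should_log` and its parameters are hypothetical names for illustration.

```python
import hashlib


def should_log(operation_id: str, sample_percent: float) -> bool:
    """Deterministically keep roughly sample_percent of operations.

    Hashing the ID (rather than rolling a random number per log line)
    means every entry for a given operation is kept or dropped together.
    """
    digest = hashlib.sha256(operation_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < sample_percent * 100


if __name__ == "__main__":
    kept = sum(should_log(f"request-{i}", 10.0) for i in range(10_000))
    print(f"kept {kept} of 10000 operations")
```

Lowering `sample_percent` trades diagnostic detail for a smaller logging bill, which is the tuning decision described above.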
But to put it in perspective, last January haveibeenpwned.com got a big traffic spike with 10 million visitors in one day. I took a snapshot of 72 hours of my Azure Functions costs. 72 million requests to the service worked out to, like, $34.
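For a sense of scale, the arithmetic behind those figures can be sketched in a few lines of Python. The inputs are the approximate numbers quoted above (72 million requests, about $34), so the result is a ballpark, not a price quote.

```python
def cost_per_million(total_requests: int, total_cost_usd: float) -> float:
    """Average cost in US dollars per one million requests."""
    return total_cost_usd / (total_requests / 1_000_000)


if __name__ == "__main__":
    rate = cost_per_million(72_000_000, 34.0)
    print(f"about ${rate:.2f} per million requests")
```

That works out to roughly 47 cents per million requests.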
For me at this point, building with serverless is kind of like gamification. I’m always tweaking to see how low I can get the costs.
Back to the security theme, I think a lot of people still have an idea — certainly underscored by this steady background noise of breaches — that the cloud poses heightened security risks. Do you agree?
The charitable way to put it: cloud security is a shared responsibility. Individuals need to secure their stuff. I have a lack of patience with corporations who say “We’ll just dip a toe in cloud. But we won’t invest in doing it properly.” And now they have cloud systems operated by people who aren’t aware of security best practices.
So we need way more accountability around education of professionals in cloud environments.
But when these breaches happen over and over, you have to ask – is cloud security too easy to screw up? In the case of, say, Amazon S3, for a long time the answer was yes! And thankfully AWS has made changes in how they highlight potentially dangerous S3 settings. Because they have a responsibility to do that.
Let’s say I want to work in cloud security, but like Troy Hunt circa 2009 I have no specific experience in that field. What is the best thing I can do for my career right now?
And second, the cost is so low. The cloud services you can get access to — it’s amazing how cheaply you can do it.
Spin up a VM, play with it, shut it down. The hands-on piece is what makes learning stick. I dropped out of university, I don’t have a formal education. What I’ve learned has been through getting hands-on and seeing how things work.
And because the cloud allows me to spin things up and make mistakes without consequences, that can help me educate myself to do the right thing when the stakes are so much higher.