Everyone gets critical. It’s part of our vernacular: critically injured, critical condition, critical strike. It makes sense to us. So when a critical vulnerability is discovered, you know it’s bad. High is a bit less clear, but if we say “it’s just a little below critical”, it inherits impact by association. We prioritize critical and high-severity findings in a report because those are the most severe issues uncovered during an assessment.
I know I am not breaking new ground here but bear with me.
On the opposite end of the spectrum are info and low-severity findings. These end up on the bottom of everyone’s to-do list and may stay there for years. Oftentimes, this makes sense. If you’ve got criticals to deal with, who’s got time for lows?!
Early in my career, I got the fear put in me. The fear that a machine would take my job. The fear that I would be replaced by a piece of software. It’s been a serious source of motivation for me and one of the big reasons I was attracted to penetration testing: done well, it’s hard for a machine to replicate. One of the best examples of this is the chained vulnerability.
Chained vulnerabilities are vulnerabilities that have been linked together to become more than the sum of their parts. Alone they’re not that interesting, but together they can punch far above their weight. I love chained vulnerabilities because they’re fun to put together and they require a human; software can help you find them, but a human is required to give each one context and tie it to the next.
3 Lows and an Info Walk Into a Tester
I was recently testing the login functionality of a web application and not finding much interesting to work with. Lockouts were in place and I hadn’t found any way to enumerate valid usernames for the application. Until I started to explore the password reset process. The reset process worked like this:
1. Input username, get security reset question
2. Answer security question
3. Pick new password
Steps 2 and 3 were really interesting on their own, not least because they were submitted together, but I found something fun in step 1 as well. The devs were clever and presented a security question regardless of whether the username was valid…twice. On the third try for an invalid account, the question would change. Valid accounts would always return the same question.
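The leak here is that the decoy question was not stable. As a rough sketch of one way to close that gap (not what the client's devs built), the decoy could be derived from the username with a keyed hash, so an unknown username gets the same question on every request. The question pool, the key handling, and the function name are all my own assumptions. It also only helps if decoys look like the questions real users write, which is harder when questions are user-chosen.

```python
import hashlib
import hmac

# Hypothetical decoy pool; not taken from the application under test.
DECOY_QUESTIONS = [
    "What was the name of your first pet?",
    "What city were you born in?",
    "What was your first car?",
    "What is your mother's maiden name?",
]


def decoy_question(username: str, secret_key: bytes) -> str:
    """Pick a decoy question that never changes for a given unknown username."""
    normalized = username.strip().lower().encode("utf-8")
    # Keyed hash: without the server-side key, the mapping cannot be predicted.
    digest = hmac.new(secret_key, normalized, hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(DECOY_QUESTIONS)
    return DECOY_QUESTIONS[index]
```

With this in place, the first, third, and hundredth request for a made-up username all return the same question, so repeating the request no longer tells valid and invalid accounts apart.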
Weak Security Questions
This was interesting, but info-level severity on its own. I wrote up my finding and turned my attention to the security question. Singular. Each user can choose exactly one security question to be used as part of the password reset process. Since this was an authenticated whitebox test, I was able to confirm that this is the case. One security question per user. All eggs, one user-chosen basket.
Having a user-selected security question is great because it lets users add their own dose of variability to the password reset process. Having that as your only security question is risky because you’re pinning the security of your users on their ability to pick robust questions. But still just a low-severity finding. Into the report it went.
Weak Security Answer
With no control over the quality or content of a user’s question, it’s difficult to control the quality of the answer. “A duck” may be a terrific answer to “What is the square root of 49?”, but an awful answer to “If it walks like a duck and quacks like a duck, what is it?”. It’s hard for us to technically enforce the quality of this answer. If the users are choosing questions with high-entropy answers, we’re still OK, though.
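To put a rough number on "high-entropy", here is a back-of-the-envelope sketch. It assumes the attacker guesses uniformly from a known candidate list, which real answers only approximate, and the list size of 20 colours is an illustrative figure rather than one from the test.

```python
import math


def guess_space_bits(candidate_count: int) -> float:
    """Bits of entropy if every candidate answer is equally likely."""
    return math.log2(candidate_count)


# Assumed sizes, for illustration only.
colour_bits = guess_space_bits(20)       # a short list of common colours
random_bits = guess_space_bits(26 ** 8)  # 8 random lowercase letters

print(f"colour answer: ~{colour_bits:.1f} bits")
print(f"random answer: ~{random_bits:.1f} bits")
```

A favourite colour comes out at a little over 4 bits, against more than 37 for the random string. That gap is the difference between a handful of guesses and a few hundred billion.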
My test account was preconfigured with a question of “What is your favorite color?”. I pretended I didn’t know the answer and used Burp to find out.
Bingo! A weak or easily-guessable security answer isn’t good since it is essentially a stand-in for a password. Still, on its own it’s just a low-severity finding. As we’ll find later, this was systemic, so I added it to the report and moved on.
Insecure Password Reset Process
A great way to protect weak security questions and answers is to make it hard to guess them in an automated fashion. There are a few ways to do this:
- Rate limiting guess attempts, either with a CAPTCHA or lockout after a certain number of bad attempts.
- Blacklisting of IPs associated with bad guesses.
- Having multiple security questions presented out of a larger pool, forcing the attacker to find the right answer to many questions.
- Mixing user-chosen questions with application-chosen ones, with strong filtering on question and answer quality (no “What year were you born?”, for example).
- Sending a time-limited password reset link to the email address associated with the user requesting the reset.
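As a minimal sketch of the first option, here is an in-memory lockout that counts failed answers per username inside a sliding window. The class name, the thresholds, and the injectable clock are my own assumptions for illustration. A production version would need shared storage and some thought about attackers locking out real users on purpose.

```python
import time
from collections import defaultdict, deque


class ResetAttemptLimiter:
    """Lock a username after too many failed answers within a time window."""

    def __init__(self, max_attempts=5, window_seconds=900.0, clock=time.monotonic):
        self.max_attempts = max_attempts      # assumed threshold
        self.window_seconds = window_seconds  # assumed window (15 minutes)
        self._clock = clock
        self._failures = defaultdict(deque)   # username -> failure timestamps

    def record_failure(self, username):
        self._failures[username].append(self._clock())

    def is_locked(self, username):
        self._expire(username)
        return len(self._failures[username]) >= self.max_attempts

    def _expire(self, username):
        # Drop failures that have aged out of the window.
        cutoff = self._clock() - self.window_seconds
        attempts = self._failures[username]
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
```

The reset handler would call `is_locked` before checking an answer and `record_failure` after every wrong one. Keying on username rather than IP is a design choice: it still bites when guesses arrive from many addresses.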
During the previous attack, I established that none of these were in place. And since the new password is submitted along with the answer to the reset question, anyone who can guess the answer to the question can specify the new password.
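For contrast, the emailed-link option splits "prove who you are" from "pick a new password". The sketch below shows the token half only, under my own assumptions: an in-memory store, a 15-minute lifetime, and made-up function names. Sending the email and storing the new password are left out.

```python
import hashlib
import secrets
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # assumed lifetime

# token hash -> (username, expiry timestamp); a real app would use a database.
_pending_resets = {}


def _hash(token):
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_reset_token(username, now=None):
    """Create an unguessable token to embed in the emailed reset link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    # Store only the hash so a leaked table does not leak usable tokens.
    _pending_resets[_hash(token)] = (username, now + TOKEN_LIFETIME_SECONDS)
    return token


def redeem_reset_token(token, now=None):
    """Return the username if the token is valid and unexpired, else None."""
    now = time.time() if now is None else now
    entry = _pending_resets.pop(_hash(token), None)  # pop makes it single-use
    if entry is None:
        return None
    username, expires_at = entry
    return username if now <= expires_at else None
```

Only after `redeem_reset_token` returns a username would the application accept a new password, so guessing a security answer alone is no longer enough.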
One of my favourite things about penetration testing is delivering reports that say “I did” rather than “an attacker could”. This quickly moves us beyond hypotheticals and towards actually fixing the problem.
By now I had enough pieces of the puzzle to build a PoC of my chained vulnerability. Up until now, I’d been using credentials our client had provided to me at the start of the test. To demonstrate the impact I needed to prove that my attack would work on its own.
I started with a list of statistically likely usernames and then used Burp Intruder to initiate a password reset request for each username three times. With a bit of Excel magic, I extracted all the users who had static password reset questions.
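The Excel step is easy to reproduce offline. Below is a small sketch that assumes the Intruder results have already been exported as (username, question) pairs in request order. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def stable_question_users(observations, required_requests=3):
    """Return {username: question} for users whose question never changed."""
    seen = defaultdict(list)
    for username, question in observations:
        seen[username].append(question)
    return {
        user: questions[0]
        for user, questions in seen.items()
        # Invalid accounts change question on the third try, so three
        # identical responses suggest a real account.
        if len(questions) >= required_requests and len(set(questions)) == 1
    }


# Made-up sample data, not results from the engagement.
sample = [
    ("jsmith", "What is your favorite color?"),
    ("jsmith", "What is your favorite color?"),
    ("jsmith", "What is your favorite color?"),
    ("zzzz", "What was your first car?"),
    ("zzzz", "What was your first car?"),
    ("zzzz", "What city were you born in?"),
]
print(stable_question_users(sample))
```

In this sample only `jsmith` survives the filter, because the question for `zzzz` changed on the third request.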
I then reviewed my list of valid users and their associated security questions and grouped them by question. My goal here was to find users who had chosen easily guessable questions and then brute force the answers. A surprising number of people used “What is your favorite color?” and, since I already had a list of colours made from my initial testing, I stuck with it and used Intruder again to iterate through the valid usernames and the list of colours.
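The grouping itself is a few lines of bookkeeping. This sketch assumes a dict of username to question and does only light normalization (case, whitespace, trailing question mark). The function name and sample data are hypothetical.

```python
from collections import defaultdict


def group_by_question(user_questions):
    """Group usernames by normalized question, most common question first."""
    groups = defaultdict(list)
    for user, question in user_questions.items():
        # Collapse case, spacing and the trailing "?" so near-duplicates merge.
        key = " ".join(question.lower().split()).rstrip("? ")
        groups[key].append(user)
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


# Made-up sample data, not results from the engagement.
sample_users = {
    "jsmith": "What is your favorite color?",
    "mjones": "what is your favorite  color",
    "akhan": "Name of my first teacher?",
}
for question, users in group_by_question(sample_users):
    print(len(users), question, users)
```

Sorting by group size puts the questions shared by the most users at the top, which is where a short guess list pays off fastest.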
The attack worked immediately, netting me password resets on 20 accounts. Due to the nature of the application, this also granted me access to the payment card information, PII, and banking information associated with each account: a solid win for me, but a major headache if an attacker did the same.
One of the major factors in assessing the severity of a finding is how easy it is for someone to pull off. Looking at the above attack, it’s obvious that it did not require much in the way of skill on my part. The entire attack could have been carried out using just a browser, albeit a little more slowly. That it was so simple, combined with the impact of the sensitive data exposed in the 20 compromised accounts, was enough to rate this chained vulnerability as critical.
There’s a bit of cognitive dissonance involved in offensive security. We get really excited about really awful bugs and there’s a source of pride in thoroughly owning something, even though this means that our client’s security controls could use a bit of work. With that firmly in mind, I always try to tone down my call to the client when delivering news of a critical or high-severity finding. These are often day- or weekend-wrecking and should be treated as such.
In this case, though, our client was over the moon and said: “I’ve been trying to get them to fix these bugs for years but they just ignored me!” He’d been having a hard time getting the devs to take him seriously about security issues. By linking and exploiting these four low-priority findings, I’d managed to clearly demonstrate the impact he’d been struggling to convey for years. Totally made my day, and a few days later they’d brainstormed up a robust solution to the problem.
It’s hard to argue against prioritizing critical findings. They’re critical for a reason. I hope, though, that I’ve made it easier to argue for human-driven testing. Humans are (so far) the only way to provide context for vulnerabilities and should be an integral part of both your offensive and defensive strategy.