One of the most common vulnerabilities of any significance I see during web application security assessments is a logic flaw that allows username harvesting. I typically find this flaw in username/password recovery forms, user registration forms, and every now and then login forms. You would be surprised at the success I've had pairing this logic flaw with one other vulnerability: weak password policies. The attack is simple, and it isn't rocket science, but it can be very successful under a reasonably wide variety of circumstances.
You know the deal. A password recovery form takes in a username and responds with either "Check your email for a link to reset your password" or "Invalid username attempted." Maybe it's the registration page, and when you attempt to register an existing username you get, "Sorry, the username you requested already exists. Please choose another." By inference from the application's responses, one can ascertain which usernames exist. Here are some examples from purposely vulnerable training applications:
[Screenshot: ACUNETIX ART Registration Form Username Disclosure]
[Screenshot: CrackMeBank Registration Form Username Disclosure]
[Screenshot: Google Gruyere Registration Form Username Disclosure]
Occasionally, I'll come across a login form that tells me "Invalid Username" when I try a nonexistent username or "Incorrect Password" when the username exists but the password is wrong.
[Screenshot: Altoro Mutual Login Form Username Disclosure (Valid Username)]
[Screenshot: Altoro Mutual Login Form Username Disclosure (Invalid Username)]
Even better is the well-known WordPress user harvesting flaw that allows one to pull down WordPress usernames by requesting /?author=1, /?author=2, etc. That's just simple iteration and doesn't even require a dictionary attack.
[Screenshot: WordPress User Harvesting Google Search]
[Screenshot: Example WordPress User Harvesting (author=1)]
[Screenshot: Example WordPress User Harvesting (author=2)]
Whatever the case, the attack is usually the same. You send the request to Burp Intruder, load a dictionary of common usernames, and use Grep - Match to flag the response string that indicates a valid username was entered. Just from sheer odds, assuming you are using a sizable, decent username dictionary, you will likely guess at least one valid username. On a site with hundreds of thousands of users, your list is likely to be pretty large. On a site with hundreds of users, your list is likely to be pretty small. The screenshot below shows an attack against the well-known Cenzic CrackMeBank. Of the 8861 potential usernames in the default Burp Suite list, 69, or about 0.78%, are actually valid. So big deal; now I have a list of valid usernames.
[Screenshot: CrackMeBank User Harvesting Attack Using Burp Intruder]
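To make the Grep - Match step concrete, here's a minimal Python sketch of the same idea applied to responses that have already been captured. The `responses` dictionary, the usernames in it, and the marker string are hypothetical stand-ins, not output from the CrackMeBank run; only the 69-of-8861 figures come from the attack described above.

```python
def grep_match(responses, marker):
    """Return the usernames whose response body contains the marker string.

    `responses` maps each attempted username to the response body the
    application returned for it (hypothetical data in this sketch).
    """
    return sorted(user for user, body in responses.items() if marker in body)


def hit_rate(valid_count, attempted_count):
    """Percentage of attempted usernames that turned out to be valid."""
    return 100.0 * valid_count / attempted_count


# Hypothetical captured responses from a registration form.
responses = {
    "admin": "Sorry, the username you requested already exists. Please choose another.",
    "jsmith": "Sorry, the username you requested already exists. Please choose another.",
    "zz_nobody": "Thanks for registering!",
}

valid = grep_match(responses, "already exists")
print(valid)

# Figures from the CrackMeBank run: 69 valid out of 8861 attempted.
print(f"{hit_rate(69, 8861):.2f}%")
```

The point is only that a single distinguishing string in the response is enough to sort a dictionary into "exists" and "doesn't exist."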
Here's the horizontal part. My typical approach is to first look at the application's password and account lockout policies and take some notes. If there are none, good for me, bad for them. If there are policies in place, I factor them into my attack. Such policies are typically disclosed on registration and password change pages. I take my list of valid usernames and perform a horizontal password attack. I'm not looking to crack the password of one user by brute-forcing or using a dictionary of thousands of passwords. I want to use just a few passwords that are likely to be common within the target application and maybe compromise one or more of the known-valid usernames I have. The attack is only an inch deep, but it's a mile wide. Here's the general approach to choosing passwords.
- First, I want to know if the application has an account lockout policy. If so, I try to determine how many invalid login attempts it takes to trigger a lockout. I don't want to anger my client and lock out hundreds of their users, so I limit the attempted passwords to one or two fewer than it takes to lock accounts out.
- Second, I follow the application's password length and complexity policies so I'm not wasting my attempts. If the minimum password length is 8, I use passwords of eight or more characters. If the password complexity policy requires mixed-case, alphanumeric, and special characters, I use password values that meet those requirements.
- Third, I use widely available lists of passwords that are ordered by prevalence and have been observed in password dumps from various successful real-world hacks. I just pick the top few passwords that match my target application's password composition policies. Mark Burnett has a really cool tag cloud image of the top 500 passwords according to his research. Please excuse some of the more tasteless values, as it appears many in our species are still operating on a fairly base level, obsessed with genitalia and reproductive activities.
[Image: Mark Burnett's Top 500 Password Tag Cloud]
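The three steps above boil down to a filter and a cap. Here's a minimal Python sketch of that selection logic. The function names, the policy flags, the `safety_margin` and `limit` parameters, and the sample ranked list are all illustrative assumptions, not part of Burp or any other tool.

```python
import string


def meets_policy(pw, min_length=8, mixed_case=True, digit=True, special=True):
    """Return True if pw satisfies the (assumed) password composition policy."""
    if len(pw) < min_length:
        return False
    if mixed_case and not (any(c.islower() for c in pw) and any(c.isupper() for c in pw)):
        return False
    if digit and not any(c.isdigit() for c in pw):
        return False
    if special and not any(c in string.punctuation for c in pw):
        return False
    return True


def pick_candidates(ranked, lockout_threshold=None, safety_margin=2, limit=5, **policy):
    """Pick the most prevalent passwords that fit the policy.

    `ranked` is ordered from most to least common. The result is capped at
    `limit`, and at `lockout_threshold - safety_margin` when a lockout
    policy exists, so no account gets locked out by the test.
    """
    budget = limit
    if lockout_threshold is not None:
        budget = min(limit, max(lockout_threshold - safety_margin, 0))
    eligible = [pw for pw in ranked if meets_policy(pw, **policy)]
    return eligible[:budget]


# Hypothetical ranked list, most common first.
ranked = ["123456", "password", "Password1!", "abc123", "Welcome1!", "P@55w0rd", "Qwerty123!"]

# Strict policy, lockout after 5 failures: stay 2 under the threshold.
strict = pick_candidates(ranked, lockout_threshold=5)
print(strict)

# No password or lockout policy at all: just take the top few.
lax = pick_candidates(ranked, min_length=1, mixed_case=False, digit=False, special=False)
print(lax)
```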
In the case of CrackMeBank, there are no password or lockout policies to speak of, so I'm going to use the stupidest (most common) passwords. Sure, it's not the most real-world scenario, but it suffices as an example, and we actually do still run across apps without password/lockout policies every now and then. Just to prove my point, I used only five password attempts for each username. That's only 5 x 69 = 345 total requests. I used the following passwords:
- 123456
- password
- password1
- abc123
- ninja
Out of the 69 valid usernames I harvested, five, or 7.25%, were compromised on my first attempt using these five passwords. One account used "123456" and four accounts used "password," with "password1," "abc123," and "ninja" going unused.
[Screenshot: CrackMeBank Horizontal Dictionary Attack Using Burp Intruder]
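For anyone who wants to check the numbers, here's the arithmetic behind the CrackMeBank run as a few lines of Python. The variable names are mine; the figures are the ones reported above.

```python
valid_usernames = 69
passwords_tried = ["123456", "password", "password1", "abc123", "ninja"]

# Accounts compromised per password, as reported above.
hits_by_password = {"123456": 1, "password": 4, "password1": 0, "abc123": 0, "ninja": 0}

total_requests = len(passwords_tried) * valid_usernames
compromised = sum(hits_by_password.values())
compromise_rate = 100 * compromised / valid_usernames

print(total_requests)             # 345
print(compromised)                # 5
print(f"{compromise_rate:.2f}%")  # 7.25%
```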
The name of the game is chaining attacks together against different vulnerabilities. To take the concept a bit further, imagine an application with some loose file upload functionality that requires authentication. Now those five compromised users are a really big deal, because I only need one of them to upload a PHP or ASPX backdoor. From there I launch a Meterpreter session, pivot from that machine, and start working over your internal network, because your DMZ->Internal firewall rules are as loose as the file upload functionality was. It's a whole chain of relatively minor flaws (username disclosure, weak user passwords, weak file upload functionality, and weak internal firewall rules) that are exploited together to cause real damage.
Imagine another scenario, which I ran across earlier this year in an assessment: I compromise dozens of accounts using this horizontal dictionary attack on a heavily used site that allows users to store their payment information. In violation of PCI requirements, the application displays the full PAN with name, address, expiration date, and security code on the "edit my payment information" page. Now suppose that instead of being mister-good-guy-pen-tester, I'm some kid who holds a nasty grudge against this company. Rather than using the payment information to buy a plasma TV and some Air Jordans, I dump the user creds and cardholder data on pastebin.com. I don't compromise the web server at all, but the company is subject to PCI fines and gets mentioned in the news, causing some damage to the brand.
Enough with the fearmongering, then. What are some simple things we can do to stop this type of thing? As always, I like to see controls at multiple levels, starting at the root cause, so that if one fails, another steps in to save the day. From secure coding (avoiding logic flaws) to CAPTCHAs and even WAFs, let's break it down:
- Use CAPTCHAs: At the very least, use decent CAPTCHAs on registration and password/username recovery forms. This makes it pretty difficult to get that initial dictionary attack running in order to harvest usernames. I know there are business and usability reasons to avoid CAPTCHAs, but use them on your login forms too if you can get away with it. CAPTCHAs don't solve underlying vulnerabilities, but they make automated attacks much harder.
- Be Discerning: Require more than one piece of unique information to initiate password/username recovery. Ask for a username, email address, zip code, and last four digits of a phone number. I remember one utility payment application I assessed that asked only for a username and zip code. I looked up all the zip codes that the utility company serviced, picked a few of the most populated ones, and harvested away.
- Don't Be Too Friendly: Avoid detailed errors about what the user did wrong if possible. On a login form, it definitely is not necessary to provide two different error messages based on whether the username or the password was incorrect. For password/username recovery, be generic. "Something went wrong" will suffice most of the time.
- Enforce Password Requirements: Make sure users are required to enter secure passwords. Eight characters or more, with alphanumeric, mixed-case, and special characters required makes my job a lot harder than allowing "123456." You might even go ahead and blacklist really bad passwords. Most really bad passwords won't meet the requirements above but it's worth noting that "P@55w0rd" is not much better than "password."
- Use Two-Factor Auth: If you're serious about protecting your users, use two-factor authentication. It's not all about a physical token anymore, as there are plenty of other options. They each have their flaws, but they do stop this type of attack.
- Lock Them Out: Enforce account lockout after 3-5 invalid login attempts.
- Throttle or Block Offenders: You shouldn't see 1000 or 100 or even 25 HTTP requests/second from the same session. Use your WAF, IPS, or even a web server module to limit request rates and concurrent connections if possible.
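For the "Don't Be Too Friendly" item, here's a minimal Python sketch of login and recovery handlers that return the same message whether or not the username exists. The in-memory `USERS` store, the function names, the message text, and the PBKDF2 iteration count are all illustrative assumptions. The dummy hash for unknown users is my own addition, so that both paths do roughly the same amount of work.

```python
import hashlib
import hmac
import os

GENERIC_LOGIN_ERROR = "Invalid username or password."
GENERIC_RESET_MESSAGE = "If that account exists, a reset link has been emailed."


def _hash(password, salt):
    # Iteration count is illustrative only; tune it for a real deployment.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical in-memory user store: username -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"alice": (_salt, _hash("correct horse battery staple", _salt))}

# Dummy credentials so unknown usernames still go through a hash comparison.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("not a real password", _DUMMY_SALT)


def login(username, password):
    """Return (success, message) without revealing which field was wrong."""
    salt, stored = USERS.get(username, (_DUMMY_SALT, _DUMMY_HASH))
    matches = hmac.compare_digest(_hash(password, salt), stored)
    if username in USERS and matches:
        return True, "Welcome."
    return False, GENERIC_LOGIN_ERROR


def request_password_reset(username):
    """Always answer with the same message, valid username or not."""
    if username in USERS:
        pass  # Queue the reset email here (omitted in this sketch).
    return GENERIC_RESET_MESSAGE


print(login("alice", "wrong password"))
print(login("no_such_user", "wrong password"))
```

With handlers like these, the Grep - Match trick has nothing to match on: a bad password for a real account and a nonexistent account produce identical responses.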
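For the "Enforce Password Requirements" item, here's a rough Python sketch of a composition check combined with a blacklist. The character substitution table, the blacklist contents, and the function names are assumptions for illustration. The substitution step exists so that a value like "P@55w0rd" is caught even though it meets the composition rules.

```python
import string

# Assumed mapping of common character substitutions back to letters.
SUBSTITUTIONS = str.maketrans({
    "@": "a", "4": "a", "3": "e", "1": "i", "!": "i",
    "0": "o", "5": "s", "$": "s", "7": "t",
})

# Tiny illustrative blacklist; a real one would come from a much larger list.
BLACKLIST = {"123456", "password", "password1", "abc123", "ninja"}


def normalize(pw):
    """Lowercase and undo common substitutions so disguised values compare equal."""
    return pw.lower().translate(SUBSTITUTIONS)


NORMALIZED_BLACKLIST = {normalize(pw) for pw in BLACKLIST}


def is_acceptable(pw, min_length=8):
    """Require length, mixed case, a digit, a special character, and no blacklist hit."""
    if len(pw) < min_length:
        return False
    if not (any(c.islower() for c in pw) and any(c.isupper() for c in pw)):
        return False
    if not any(c.isdigit() for c in pw):
        return False
    if not any(c in string.punctuation for c in pw):
        return False
    return normalize(pw) not in NORMALIZED_BLACKLIST


for candidate in ["123456", "P@55w0rd", "Tr1cky-Horse-Battery"]:
    print(candidate, is_acceptable(candidate))
```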
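The last two items, lockout and throttling, can be sketched together. This is a minimal in-memory Python example, assuming a hypothetical `LoginGuard` class with made-up thresholds; in practice this job usually belongs to a WAF, IPS, or web server module, as noted in the list above.

```python
import time
from collections import defaultdict, deque


class LoginGuard:
    """Tracks failed logins per account and request rate per session."""

    def __init__(self, max_failures=5, max_requests=10, window_seconds=1.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._failures = defaultdict(int)   # username -> failed attempts
        self._recent = defaultdict(deque)   # session id -> request timestamps

    def allow_request(self, session_id):
        """Sliding-window throttle: False once the session exceeds the rate."""
        now = self._clock()
        recent = self._recent[session_id]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True

    def record_failure(self, username):
        self._failures[username] += 1

    def record_success(self, username):
        self._failures.pop(username, None)

    def is_locked(self, username):
        return self._failures.get(username, 0) >= self.max_failures


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
guard = LoginGuard(max_failures=3, max_requests=2, clock=lambda: fake_now[0])

print(guard.allow_request("session-1"))  # first request in the window
print(guard.allow_request("session-1"))  # second request in the window
print(guard.allow_request("session-1"))  # third request is throttled

for _ in range(3):
    guard.record_failure("bob")
print(guard.is_locked("bob"))
```

Per-account lockout alone won't catch an attacker who deliberately stays one or two attempts under the threshold, which is why the per-session throttle sits next to it.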