Secure Password Handling
Just Add Salt
I've always been annoyed by websites and systems that restrict what kind of password you can use. GoDaddy, for instance, will not accept an FTP password unless it has an uppercase letter and a number, and UC Davis requires your password be at least 7 characters long (and, strangely, no more than 8). At first glance, these seem like good security practices that prevent users from picking weak passwords. But the fact of the matter is, a properly built system should be just as secure if a user picks a terrible password like 'apple' as if they picked something ridiculous like '@pPl3S4uCe'. Furthermore, these requirements actually weaken security: they limit the selections the user may choose from and force users to create passwords they are unlikely to remember, meaning they will fall back on more easily guessable passwords or write them down somewhere anyone can see. If you already know all about password storage, hashes, and salts, go ahead and skip this next section.
A Brief History Of Password Security
Let's quickly go through the methods of password storage an average developer would consider using when building a password management system.
When a developer first starts trying to store passwords, he (or she) will normally go through several steps in their head. The first is of course plaintext storage: just throw the passwords in a field of the database, and check against it when the user tries to log in. The database is supposed to be secure, so at first glance the passwords should be too, but plaintext storage invites abuse by people with database access, and if someone without authorization manages to get into the database, everything is out in the open.
The next step, as nearly any article on user management will cover, is to hash the passwords. A hash function, for those of you who don't know, is an operation which takes a string of text and outputs a hash - a short, fixed-length string which cannot be converted back to the original string. This is much more secure, since (theoretically) even if the contents of a database are stolen, the thief cannot access the passwords.
This is where the concept of insecure passwords comes into play, since the hashes of some passwords are easier to break than others. While hashes are designed to be one-way operations, it is still possible to determine the original string from the hash by pre-generating a list of hashes and looking up the stolen hashes against that. Such a list is called a Rainbow Table (strictly speaking, a rainbow table is a space-saving form of such a lookup list, but the idea is the same) - you can see a very simplistic one to the right. Many rainbow tables exist; many comprise just dictionary words, or alphanumeric strings up to a certain length. Perhaps the most comprehensive set for MD5 (the hash function most commonly used to store passwords) is at FreeRainbowTables.com, which can look up poor passwords almost instantly, and has more than 500 gigabytes of data to search through. A search for the hash '1F3870BE274F6C49B3E31A0C6728957F' returned 'apple' in mere moments, and even the more complex string '!@#ab' showed up in only a few hours.
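To make the lookup idea concrete, here is a minimal sketch of a precomputed hash list in PHP. It is a plain lookup table rather than a true rainbow table, and the three-word list and both function names are made up for illustration.

```php
<?php
// Build a tiny precomputed lookup table: MD5 hash => original word.
// $wordList is a hypothetical stand-in for a real dictionary file.
function buildLookupTable(array $wordList): array
{
    $table = [];
    foreach ($wordList as $word) {
        $table[md5($word)] = $word;
    }
    return $table;
}

// Return the original word for a stolen hash, or null if it is not in the table.
function crackHash(string $stolenHash, array $table): ?string
{
    // md5() returns lowercase hex, so normalise the input before the lookup.
    return $table[strtolower($stolenHash)] ?? null;
}

$table = buildLookupTable(['apple', 'banana', 'cherry']);

// The hash quoted in the article; the lookup is a single array access.
$found = crackHash('1F3870BE274F6C49B3E31A0C6728957F', $table);
echo $found ?? 'not found', PHP_EOL;
```

The expensive part is building the table, which the attacker only has to do once; every lookup after that is nearly free, and that is why unsalted hashes of common passwords fall so quickly.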
It is at this point that many administrators conclude the solution is to instruct, and in some cases require, users to generate stronger passwords. This too is flawed, however. While it does help prevent poor passwords, it doesn't resolve the underlying problem of poorly stored passwords. Even though a tougher password may be less easily cracked, to believe that a password with a larger character set (alphanumeric plus symbols vs. alpha-only) is anything more than slightly safer is foolish. This can be easily demonstrated by searching through Free Rainbow Tables. Try an alphabetic password, an alphanumeric password, and a 'hard' password - we'll define that as "at least eight characters, at least one upper and one lower case letter, at least one number, and at least one symbol". More than likely, Free Rainbow Tables will spit back answers for all of these within a few hours. And even if it doesn't find the answer, it's a pretty safe bet that soon enough all 8, 10 or even 16 character strings will be hashed and stored in their database. Now you may claim that even if they will eventually be cracked, the fact that they haven't been yet is proof enough that requiring hard passwords is beneficial. Read on, however, and you'll see that the next option is both more secure than the 'safe for now' argument and eliminates the need for hard password requirements altogether.
Having eliminated plaintext storage and plain hashes, and since we believe hard passwords aren't all that hard, we come to the first reasonably secure solution to password security: hashing with salts. A salt is a string added to an input before it is hashed in order to make the resulting hash harder to look up. The simplest type of salt is a static string appended (or prepended, it doesn't really matter) to all passwords. Imagine we append the fairly short salt '$3cr3t^2' to all passwords before hashing them - even easy to crack passwords like 'apple' become nearly impossible to look up without very large rainbow tables, or at least foreknowledge of the salt. Even with such foreknowledge, the cracker would need to take the time to generate custom rainbow tables, a far more difficult endeavor than just looking the hash up in an already existing resource.
We can go even further than that - the salt can be made unique to each user, making the task that much more difficult. Imagine appending not just a static piece of text, but also a dynamic piece of data unique to each user. Unique salts have the added bonus of hiding password collisions (not to be confused with hash collisions) - even if two users have the same password, the hashes stored in the database for each of them will be different, so it cannot be determined just by looking at the hashes that two users use the same password. Here is a fairly simplistic, but highly effective, method of generating salted hashes (in PHP):
<?php $hash = md5($_POST['password'].'$3cr3t^2'.md5($_POST['user'])); ?>
I use the username to ensure the salt is unique, but far better unique data can and should be used - see the next paragraph for more. To break these hashes, the rainbow table would have to handle every possible string 40+ characters long (8 from the static salt, and 32 from the hash of the username). That is far, far more possibilities than could likely ever be computed in all the time in the universe, and all but impossible for a rainbow table - even just storing such a table would likely take more disk space than has ever been created.
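As one example of "better unique data", here is a rough sketch that swaps the username for a random per-user salt. The function names are hypothetical, and the salt is assumed to be stored in the user's row next to the hash. MD5 is kept only to mirror the construction above.

```php
<?php
// Site-wide static salt, taken from the article's example.
const STATIC_SALT = '$3cr3t^2';

// Generate a random per-user salt; store it next to the hash in the user row.
function generateUserSalt(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

// Same construction as the article: password . static salt . per-user salt.
function saltedHash(string $password, string $userSalt): string
{
    return md5($password . STATIC_SALT . $userSalt);
}

$aliceSalt = generateUserSalt();
$bobSalt   = generateUserSalt();

$aliceHash = saltedHash('apple', $aliceSalt);
$bobHash   = saltedHash('apple', $bobSalt);

// Same password, different salts: compare the two stored hashes.
var_dump($aliceHash === $bobHash);

// Re-hashing a login attempt with the stored salt reproduces the stored hash.
var_dump(hash_equals($aliceHash, saltedHash('apple', $aliceSalt)));
```

Because the salt is random rather than derived from the username, it also stays unique if a user is renamed or the same username exists on another site.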
Even better, PHP has a built-in function, crypt(), which does all that work for you. Pass it a string, and it will return a uniquely salted hash (on PHP 8 and later you must supply the salt yourself as the second parameter). To check a login attempt, pass the stored hash as the second parameter; crypt() reuses the salt embedded in that hash, so the result matches the stored hash only if the password is correct. That's it: no hard work on your part, with (all but) completely secure passwords in your system. No one, not the database administrator, a hacker, another user, nor anyone else, can obtain or compute the original password in any reasonable amount of time. Even better, crypt() is not limited to just MD5 hashes, which, though widely used for password storage, are fast to compute and can be improved upon. It's important to note that if your server is not configured correctly, crypt() will fall back to the insecure and outdated DES-based hash. If you're not sure whether your server is set up correctly, check the full PHP documentation on crypt() before implementing it.
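Here is a minimal sketch of the crypt() round trip described above. It assumes the bcrypt variant (the '$2y$' prefix) and a work factor of 10, neither of which comes from the original text, and the two wrapper function names are hypothetical.

```php
<?php
// Hash a new password with crypt() using bcrypt ("$2y$") and a random salt.
// The cost of 10 is an assumed work factor, not a value from the article.
function hashPassword(string $password): string
{
    // bcrypt expects 22 salt characters from the alphabet ./0-9A-Za-z.
    $salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);
    $hash = crypt($password, '$2y$10$' . $salt);
    if (strlen($hash) < 13) {
        // crypt() signals failure with a short marker string such as "*0".
        throw new RuntimeException('crypt() could not hash the password');
    }
    return $hash;
}

// Check a login attempt: crypt() reads the algorithm and salt back out of
// the stored hash, so the same call reproduces it for the right password.
function verifyPassword(string $password, string $storedHash): bool
{
    return hash_equals($storedHash, crypt($password, $storedHash));
}

$stored = hashPassword('apple');
var_dump(verifyPassword('apple', $stored));
var_dump(verifyPassword('pear', $stored));
```

Naming the algorithm in the salt prefix avoids the DES fallback, and hash_equals() compares the two strings in constant time. PHP 5.5 and later also provide password_hash() and password_verify(), which wrap this same mechanism and pick the salt for you.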
For completeness' sake, there is a further option: getting rid of passwords altogether and using public/private key pairs for an additional level of security. However, this is not an option the general public would probably even understand, nor is it generally considered in most places where a password will suffice, so for the sake of brevity we will ignore it.
Requiring 'Hard' Passwords vs. Salting
So what are the advantages of requiring hard passwords? There are two primary advantages, the first for the developer, and the second for the user. The developer has the advantage of knowing every user on his system is using a fairly secure password. This ensures a minimum level of security for the system (assuming passwords were the weakest link). On the user's side, even though it's normally perceived as an irritation, there is theoretically an implication, especially on sites that really need to be secure, like banks, that a system which requires a strong password is secure overall.
In contrast, the salted hash has many, many more advantages, for both the developer and the user. The developer can be confident that his users' passwords are secure from exploit, even if an intruder had full access to both the database and the source code. Even better, in the highly unlikely case that rainbow tables become large enough, or the hash function is found to be breakable enough, to compromise the salted hashes, the developer can migrate to a new hash the next time each user logs in, and the user never needs to know - whereas in the far more likely event that hashes for 'good' passwords become easily breakable, the only option available to the developer is to prompt the user for a new, longer password. Soon enough 'hard' passwords will need to be fifteen to twenty characters long, which quickly eliminates the point of passwords as short strings you can remember.
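The "migrate on next login" idea can be sketched roughly as follows. The old scheme is assumed to be the salted MD5 construction from earlier in the article and the new one bcrypt via crypt(). The function names and the shape of the user record are hypothetical.

```php
<?php
// Legacy scheme (assumed): the article's salted MD5 construction.
function legacyHash(string $password, string $username): string
{
    return md5($password . '$3cr3t^2' . md5($username));
}

// New scheme (assumed): bcrypt via crypt(), with a fresh random salt.
function newHash(string $password): string
{
    $salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);
    return crypt($password, '$2y$10$' . $salt);
}

// Check a login against the user record and upgrade a legacy hash in place.
// $user is a hypothetical row: ['username' => ..., 'hash' => ...].
function loginAndMigrate(string $password, array &$user): bool
{
    $isNewScheme = strncmp($user['hash'], '$2y$', 4) === 0;

    if ($isNewScheme) {
        return hash_equals($user['hash'], crypt($password, $user['hash']));
    }

    if (!hash_equals($user['hash'], legacyHash($password, $user['username']))) {
        return false;
    }

    // The plaintext is only available during login, so re-hash it now.
    $user['hash'] = newHash($password); // a real system would also save the row
    return true;
}

$user = ['username' => 'alice', 'hash' => legacyHash('apple', 'alice')];
loginAndMigrate('apple', $user); // first login after the change upgrades the hash
```

The user types the same password as always and never sees the change. Accounts that never log in again keep their old hash, so a real migration would also need a policy for those.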
The user, for their part, is able to use any password they would like, even passwords that might be described as weak, since weak or strong, they'll all be stored as unbreakable hashes. And while it's true that requiring hard passwords does sometimes imply security, most users will not think that far, trusting (as they should be able to!) that the site is storing their data safely. The users who do stop and take the time to consider the security of their passwords are people like you, reading this article, who know that the truth is, any site which requires hard passwords very likely does not store its users' passwords safely at all.
There are two primary objections to the claim that salted hashes are secure, or more secure than requiring hard passwords. The first is that users can still create guessable passwords, which is hard to prevent; the second is that an exploitable front end leaves a security hole on the developer's end.
Not to be confused with easy or weak passwords, guessable passwords include single-character passwords and passwords that match or are very similar to the username or other easily discoverable personal information. These sorts of passwords are incredibly dangerous because a malicious user can break into an account in a reasonable amount of time by simply guessing, and they are often hard for systems to prevent - though of course developers should try; minimum lengths of four or six characters are perfectly acceptable, since a password shorter than that is a truly poor one. Beyond that, the only way to prevent guessable passwords is user education, as is true for nearly all modern security exploits - machines that are not updated regularly, people who download files or load webpages that should not be trusted; the only prevention is education. While salted hashes do not protect against this possibility, users are becoming better educated each day, and this issue is becoming less and less prevalent.
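The two checks a system can reasonably make here, a minimum length and similarity to the username, might look something like this. The function name, the default minimum of four, and the "contains" test for similarity are all assumptions made for the sketch.

```php
<?php
// Return true if the password is trivially guessable under assumed rules:
// shorter than $minLength bytes, or overlapping with the username.
function isGuessablePassword(string $password, string $username, int $minLength = 4): bool
{
    if (strlen($password) < $minLength) {
        return true;
    }

    $p = strtolower($password);
    $u = strtolower($username);
    if ($p === '' || $u === '') {
        return false; // nothing to compare against
    }

    // "Very similar" is approximated as one string containing the other.
    return strpos($p, $u) !== false || strpos($u, $p) !== false;
}

var_dump(isGuessablePassword('a', 'alice'));         // too short
var_dump(isGuessablePassword('Alice2024', 'alice')); // contains the username
var_dump(isGuessablePassword('apple', 'alice'));     // weak, but not guessable by these rules
```

Note that 'apple' passes: the check is meant to catch passwords an attacker could guess from the account itself, not to enforce a 'hard' password policy.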
One other way a malicious user could exploit weak passwords, which must be addressed before we can really declare salted hashes superior to hard passwords, is a brute force attack on the front end login. It's pretty simple to imagine, and almost as simple to set up: a short little script submits username/password combinations, trying easy passwords against known usernames. It would probably take a fair while, but unless it's a particularly sturdy application, no one would be alerted to the attack. Several methods can be employed to prevent this, including failed login limits or captchas (which I hate), but my personal favorite is a little bit more passive-aggressive - we track the number of failed logins and throw in a sleep() command (or the equivalent if you're not using PHP) for, say, 2^(x-3) seconds, where x is the number of failed logins. This makes initial login attempts load all but instantly, but very quickly after that point the time between each successive page load doubles, becoming all but unmanageable for anyone trying to log in too many times at once. This isn't an article on building a secure and usable front end, but any developer implementing this method (or any of the others, for that matter) will want to make sure they aren't tracking failed logins by session (easily defeated on the client end) and also not solely as a number attached to the user - for that could harm your real user if they try to log in during or after an attack.
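The 2^(x-3) delay is easy to sketch. Skipping the delay when there are no failed logins and capping it at five minutes are my own assumptions, as are the function names. Where the failure count x comes from is left out entirely.

```php
<?php
// Delay in seconds before answering a login attempt, following the article's
// 2^(x-3) rule, where x is the number of prior failed logins.
// Skipping the delay at x = 0 and capping it at $maxSeconds are assumptions.
function loginDelaySeconds(int $failedLogins, float $maxSeconds = 300.0): float
{
    if ($failedLogins <= 0) {
        return 0.0;
    }
    return min($maxSeconds, pow(2.0, $failedLogins - 3));
}

// Pause the current request for the computed delay.
function throttleLogin(int $failedLogins): void
{
    $delay   = loginDelaySeconds($failedLogins);
    $seconds = (int) floor($delay);
    $micros  = (int) round(($delay - $seconds) * 1000000);

    // usleep() is not reliable above one second on every platform,
    // so whole seconds go through sleep() and only the remainder to usleep().
    if ($seconds > 0) {
        sleep($seconds);
    }
    if ($micros > 0) {
        usleep($micros);
    }
}

foreach ([1, 3, 6, 10] as $x) {
    printf("%2d failed logins -> %.2f s\n", $x, loginDelaySeconds($x));
}
```

The first couple of mistakes cost a fraction of a second, but by the tenth the wait is over two minutes. That is barely noticeable to a forgetful user and crippling to a script trying thousands of passwords.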
Well, this ended up being a lot longer than I'd expected, but I hope it's been informative and useful. I encourage anyone developing a security system to look into the user manager I built a while back, which uses the crypt() function and should be secure for your use - though as it is intended to be a fairly low-level tool, it does not protect against brute force login attempts, something your application will have to handle.
While I do believe everything I have said in this article is valid and strongly advisable, I am not an expert in the security field, and you should do your own research and consult such experts before making decisions which could affect your users. I invite your thoughts and corrections to improve this article.