Episode: 1598
Title: HPR1598: Hashing and Password Security
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1598/hpr1598.mp3
Transcribed: 2025-10-18 05:37:49
---
It's Wednesday the 17th of September 2014.
This is HPR episode 1,598 entitled,
Hashing, and Password Security,
and is part of the series,
Privacy, and Security.
It is hosted by Ahuka,
and is about 26 minutes long.
Feedback can be sent to Wilnick at Wilnick.com
or by leaving a comment on this episode.
The summary is,
Understanding Password Security begins
with Understanding Hashing.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting
with the offer code HPR15.
That's HPR15.
Better web hosting that's Honest and Fair
at AnHonestHost.com.
Hello, this is Ahuka.
Welcome to Hacker Public Radio
and another episode in our series
on Security and Privacy.
And what I want to do now is go off
in a somewhat different direction.
We've taken a fairly in-depth look
at encrypting email in various ways.
And we just put in an episode about a sensible security model
based on some work of Bruce Schneier,
who I greatly admire as an expert on the subject.
So now what I want to do is get into some issues
about passwords and start unpacking
some of the technology involved with this.
Now, the most common way of providing secure access
to any kind of system these days
is through the use of passwords.
If you're like me, then you probably
are asked to create a password for just about every website
you go to these days if you want to do anything at all.
Certainly e-commerce, but also for posting comments
or posting to social media, any of these things.
And it just becomes a requirement that you have a password.
Now, the implication is that that's
going to provide a level of security.
Now, following on our last tutorial,
we should ask a few questions about just how effective
this measure is.
Since someone posting in your name to Twitter
is significantly different from someone
accessing your bank account, although there
are certainly examples of people causing a lot of trouble
for other people by posting in their name.
Now, since the assets being protected are very different,
it would be reasonable to approach the problem of security
somewhat differently in these cases.
But given the ubiquity of passwords
as the authentication for online accounts,
we need to look at the security involved.
Now, note that I'm approaching this
from the standpoint of the owner of the site in question
for this tutorial.
And then we'll follow up with one that
takes a look at it from the side of the user.
Now, the asset you are trying to protect
is access to your account on some website or IT system.
Given that you are not the owner,
you don't get to choose how this is done.
The owner chooses and does so for their own reasons.
If they do a bad job, there can be potential consequences,
of course.
But you will always have to agree to follow their rules
if you use their system.
Passwords are not the ideal way to protect this access.
And I would not be at all surprised.
In fact, I read about things almost every day
that talk about major changes coming.
And certainly, in 10 years I'd be very surprised if passwords
are still the primary way that we authenticate people.
But for now, it's what we have.
So, following on our last episode,
the first question is, what are the threats?
Now, one threat is you.
If you can be tricked into revealing your password,
then you have given away the access.
And what are called social engineering attacks
are frequently successful.
In fact, they're probably the most successful way
for bad guys, let us say, to get into systems.
Now, if you are asked by someone you believe to be legitimate
to tell them your password, will you?
One approach that often works in large organizations
is to call someone on the phone and say something like,
I'm so and so from the IT department
and I need to check your password.
Maybe they use words like verification as part of this.
And instead of making a phone call,
it could be an email asking you to click a link
and it might look very official
and have the right graphics from the company.
If you can be tricked like this, they're in.
Now, generally, these kinds of attacks
are examples of a targeted attack
against a specific company or individual.
If the attacker has a good reason to want to get in,
it is worth putting in the effort.
But you should recognize that this kind of attack
does require a lot of effort to get a single password.
And it is only worth doing if the payoff is great.
If you're trying to steal the intellectual property
of a competitor, for instance, it is worth doing.
But the vast majority of cases are not like this.
Most attackers are looking for a financial reward
and that usually comes when you can get large numbers
of passwords which can then be leveraged
to get into bank accounts and the like.
One of the big weaknesses of passwords
as we saw last time is that really secure passwords
are hard to remember.
And since every site you go to now requires an account
to do pretty much anything, we all end up
with hundreds of passwords.
And people being what they are,
the odds are very good that most people
are using the same password on a whole bunch of sites.
Now, if that password is used to post a comment
on someone's blog, you probably don't feel
like it is a security risk.
And in that context, it isn't.
But if you also used that same password
for your bank account, anyone who could hack the blog site
and get your password would then try to use it at the bank
and bingo, they now have access to all of your money.
Of course, most hackers are not targeting blog sites
because there are not large numbers of passwords
to be harvested there.
They go after the large sites
that have millions of accounts and passwords.
And when they get them, they can then try to use them
at other sites.
And if the other sites have a standard way
of creating account names,
or if you always use the same account name each time,
combining an account name and password is a piece of cake.
And with computers to automate the process,
you can try large numbers of them every second.
Even if only 10% of them work,
that can be a huge payoff for a cyber criminal.
So this is what you need to worry about.
Now, we'll get into the steps you can take
to protect yourself in the next episode,
but for this one, I wanna look at how the site owner
should protect your password.
The worst case scenario is that they simply store
your password in the clear in a database,
which means they don't encrypt it in any way.
They just take the text and stick it
in some database field somewhere.
All right, in that case,
anyone who can get into the system
has all of the passwords. Game over.
The owner may try to interpose a password
to access the database,
but this is not sufficient security.
A good clue that the owner is doing something like this
is that they limit the length of the password.
You see, they may be using a database
that sets the field length on what they store,
and you cannot exceed that field length.
If you ever encounter a system that says your password
cannot have more than eight or 12,
or whatever number of characters, be extremely suspicious.
And for anything really important, test it.
They may let you enter a 20 character password,
but try logging in using only the first 19 characters.
If that works, you have pretty compelling evidence
that they not only store passwords in the clear,
but that they throw away any characters over some limit.
You may not care if it's someone's blog,
but I would never do online banking that way.
I would either change banks or just decide
to never use the online function.
And if you are not using the online function,
you should see if there's any way to instruct the bank
to ignore any attempt at online activity.
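As a rough illustration of the truncation test described above, here is a minimal Python sketch of a careless site that silently drops extra characters. The names MAX_LEN, store_truncated and login_truncated are hypothetical, and the 8 character limit is just an assumed value for the example.

```python
MAX_LEN = 8  # assumed field limit; a real site's limit would differ


def store_truncated(password: str) -> str:
    """Simulate a site that keeps only the first MAX_LEN characters."""
    return password[:MAX_LEN]


def login_truncated(stored: str, attempt: str) -> bool:
    """The site truncates the attempt the same way before comparing."""
    return stored == attempt[:MAX_LEN]


# A 20 character password, and the same password minus its last character.
full_password = "correcthorsebattery!"
first_19 = full_password[:19]

stored_value = store_truncated(full_password)
accepted_with_19 = login_truncated(stored_value, first_19)
```

On a site that behaves like this sketch, the shortened password is accepted, which is exactly the warning sign the test is looking for.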
Now in the US, two recent court cases
are pointing towards an interesting standard here.
They involve attackers who are able to fraudulently
access bank accounts to send wire transfers
to accounts outside the United States.
In one case, the bank had urged the customer
to adopt better security practices,
essentially requiring multiple authorizations
for any transfers.
And the customer declined in writing to do so.
The fraud occurred, the customer sued the bank
for negligence and the court sided with the bank
since it had urged better practice on the customer.
In a very similar case, the customer prevailed in the lawsuit
because there was no evidence
that the bank had any good security around the process.
The moral of the story is that you need to take advantage
of any good security practices offered
and that you should turn off anything you don't need to use.
I personally would never allow any wire transfers
to be initiated through online banking.
I would require the bank to only accept such transfers
if I show up in person with good identification.
Or maybe even turn off the ability
to do wire transfers if you don't need it.
Most people don't.
Now, back to passwords.
At a minimum, passwords should be stored
in an encrypted form to make it difficult
for an attacker to get them,
even if they have access to the database
where they are stored.
Note that I said, difficult, not impossible.
A sufficiently motivated hacker has any number
of tools available to get encrypted passwords.
But anytime you can put a speed bump in their path,
that is good.
And that brings us to the concept of a hash.
Hashing uses cryptography to generate an encrypted form
of the password and employs a one-way function to do so.
A one-way function is something we looked at before
with public key cryptography and means a function
that is easy to compute in one direction,
but extremely difficult to compute in reverse.
So here are the characteristics of a good hashing function.
And if you check the show notes,
I've given you a link to a Wikipedia article
that addresses this.
So what a good hash function should do,
it is easy to compute the hash value
for any given message.
We need to be computing hash values all the time,
so if it takes up a lot of computer processing power,
that becomes a problem.
You need to do it quickly and easily.
It should be infeasible to generate a message
that has a given hash.
All right?
One of the things we do with hashing
is we use it to attest to the authenticity of something.
So if I could start with a hash
and then work backwards to generate a message
that had that hash, that would be a problem.
So you want something that if you start with the hash value,
you cannot create a message that is going to generate that
or at least not at all easily.
Third, it should be infeasible to modify a message
without changing the hash.
So this is very important again
because the hash is used to attest to the authenticity
of the message.
And finally, it is infeasible to find two different messages
with the same hash.
Now, like all other forms of cryptography,
advances in computer technology can make older forms
vulnerable to brute force attack.
As you can see from the Wikipedia article that I've quoted,
the term we use is infeasible, not impossible.
For instance, the MD5 algorithm was once used for this purpose.
It was developed in 1991 by Ron Rivest,
who is the R of RSA, but a flaw was found in 1996
and it is no longer used for security.
It is still used for verifying file integrity though
since it meets condition three above.
If even one bit gets flipped in a file,
the MD5 hash will be completely different.
So if you ever download ISOs on the internet,
chances are they will come with an MD5 hash
that can be used to validate that the file has not been corrupted.
And for that purpose, it is perfectly good.
But for security purposes, it should be avoided.
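Checking a download against a published hash can be sketched in a few lines of Python using the standard hashlib module. The function names file_digest and matches_published are made up for this example, and the sketch defaults to SHA-256 while letting you pass "md5" when that is what the download site publishes.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so a large ISO does not have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the computed digest with the value listed on the download page."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

A call such as matches_published("some.iso", "<hash from the site>", "md5") would return False if even one bit of the file had been corrupted in transit.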
I recently read an alarmist article
on how every password was crackable.
And noted that in this article,
all of the passwords were hashed with MD5.
Needless to say, I was skeptical of the conclusions.
Now, when MD5 was replaced, the next replacement was SHA1,
which stands for Secure Hash Algorithm number one.
This was designed by the NSA
and was required in many government applications.
In 2005, weaknesses were found however,
which led to SHA2.
And recently, a competition led to a new replacement,
which will be known as SHA3.
SHA2 has never been found to have a weakness,
but because it shares some features with SHA1,
the government decided it wanted an alternative
that used a very different approach.
So for password security, you would want to use either SHA2
or SHA3.
Now, since SHA3 is very recent, it's not much in use yet.
So SHA2 is what you would hope to encounter.
Now, note that SHA1 is still in use,
but Microsoft, as an example,
announced that it will stop accepting SSL certificates
encrypted with SHA1 by the year 2017.
So its days are numbered, as indeed they should be.
So, looking at what a hash should do, what does it all mean?
First, creating a hash is supposed to be easy.
That is what one-way functions are all about.
This is similar to what we saw with PGP,
which is a different technology,
where you could generate a key pair on a home computer.
Generating a hash should be similarly easy
and require very little computing power.
And entropy is not a factor here,
so you can generate thousands of hashes without any problem.
The second criterion says that you cannot reverse the process easily.
If you have a given hash,
there is no way to generate the message that produced it,
at least with the current technology.
That is the other half of one-way functions.
Now, technically, this is not impossible in the strict sense,
given enough computing power,
it is theoretically possible
to try every conceivable message,
compute the resulting hash,
and compare it to the hash in question.
This is the principle used in dictionary attacks,
which we will discuss below.
But this also means that hashing is not a good way
to encrypt a message to someone,
since there would be no way for them to decrypt the message.
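To make the "try every conceivable message" idea concrete, here is a small Python sketch that searches a deliberately tiny space of short numeric strings. The helper names sha256_hex and brute_force are hypothetical, and the limits (digits only, at most four characters) are assumptions chosen so the search finishes instantly; real passwords live in a vastly larger space.

```python
import hashlib
import string
from itertools import product
from typing import Optional


def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, alphabet: str = string.digits,
                max_len: int = 4) -> Optional[str]:
    """Hash every candidate up to max_len characters and compare to the target."""
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            guess = "".join(candidate)
            if sha256_hex(guess) == target_hash:
                return guess
    return None  # the message was outside the space we searched
```

Nothing here reverses the hash. It only computes forward, over and over, which is why the size of the search space is what protects you.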
The third criterion says that any change at all to the message,
even a single bit changed,
would cause the hash to be completely different.
And we do mean completely.
The resulting hash would look totally different from the original.
This makes it excellent for ensuring that the contents have not been altered,
which is why hashes are used to validate downloads.
Note that this is essentially the same function
that digitally signing your email performs.
It assures that the message you sent has not been altered in any way.
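The "completely different" behavior can be checked with a short Python sketch that flips a single bit in a message and counts how many of the 256 output bits of SHA-256 change. The function name differing_bits and the sample message are just illustrations.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 hashes of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


original = b"Hacker Public Radio"
# Flip the lowest bit of the first byte, leaving everything else alone.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

changed = differing_bits(original, flipped)
```

With a well-behaved hash you would expect roughly half of the 256 bits to change, even though only one input bit was touched.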
The last criterion says it should be highly unlikely
that two different messages would have the same hash.
This is called a collision in cryptography,
and would allow an attack.
Again, it is not impossible, just highly unlikely.
Also, one characteristic of hashing functions
that is not on the list but is worth knowing
is that they generate hashes of the exact same length
regardless of the original message length.
This turns out to be useful in understanding password hashes.
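The fixed-length property is easy to see in Python. This sketch hashes a one character message and a very long one with SHA-256 and compares the lengths of the results. The helper name sha256_hex is hypothetical.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


short_hash = sha256_hex("a")
long_hash = sha256_hex("a" * 100_000)

# SHA-256 always yields 256 bits, which is 64 hexadecimal characters.
same_length = len(short_hash) == len(long_hash) == 64
```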
So, password hashes in use.
In most cases, you enter a password on a login page
for some kind of online site,
and your password is transmitted in the clear to the server.
That means you could be vulnerable to what is called a man in the middle attack,
which is to say that if an attacker can get between you and the online site,
they can see your password.
For that reason, it is important to make sure you have a secure connection,
generally one that uses SSL to establish an encrypted connection to the server.
This is a whole topic in itself,
and we will be getting to the discussion of SSL certificates.
So, I won't go into it any further here.
In any case, your password goes to the server,
and the server employs a hashing function,
hopefully SHA2 or better,
and stores the hash in its database.
Since all hashes are the same length,
the database administrators can set aside a fixed field length to store the resulting hash,
which tends to make DBAs happy.
When you later try to log in and type in your password,
the server repeats the hashing function on the password,
and compares the hash it gets with what it stored in the database.
And if they match, you are allowed in.
And given the way that hashes work,
any difference at all results in a totally different hash.
So, there can never be a concept of close enough.
You and I might accept something that is 95% the same,
but for hashed passwords, that is not acceptable at all.
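The store-then-compare flow described above can be sketched in Python as follows. This mirrors the plain, unsalted scheme as it has been described so far and is not meant as a recommended design. The users dictionary is a hypothetical stand-in for the site's database, and register and login are made-up names.

```python
import hashlib
import hmac

users = {}  # hypothetical stand-in for the database: username -> stored hash


def register(username: str, password: str) -> None:
    """Store only the hash of the password, never the password itself."""
    users[username] = hashlib.sha256(password.encode("utf-8")).hexdigest()


def login(username: str, password: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    attempt = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # compare_digest takes the same time whether or not the values match
    return hmac.compare_digest(stored, attempt)
```

Note that the comparison is all or nothing: an attempt that is off by one character produces an unrelated hash and is rejected.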
Hashes stored in this way are not susceptible
to a frontal brute force attack.
There is no way you can take the hash and run a computation
that gives you the original password.
But because the hashing function is generally well known and deterministic,
there is an alternative attack that often does succeed.
You can compute a so-called dictionary
that contains the hashes for all known dictionary words,
and all popular passwords,
and then do a lookup of any given hash against this table.
An attacker can get a lot of passwords this way,
because so many people exercise poor judgment.
If you use the word password as your password,
or 123, or letmein,
all of these, by the way, are known to be frequently used.
They will be found in this kind of an attack,
and trying to use what is called leet speak
to disguise your words,
and that's where you use a number in place of a letter,
such as a three in place of the letter E,
or a one in place of the letter L,
you will get caught.
Attackers know all about that,
and the dictionaries have all of those entries as well.
So in essence, if an attacker can get a database of a million passwords,
they can run all of the hashes against a dictionary,
and in short order, they can get as many as 50% of the passwords,
just from a direct comparison.
And one thing that helps them is that a lot of people use the same password,
and all of their hashes will be identical.
You can see this in the periodically issued lists
of the most common passwords.
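A dictionary attack of the kind described above can be sketched in Python as a precomputed lookup table. The word list, the two letter substitutions, and the names leet_variant, build_lookup and crack are all assumptions made for illustration; real attack dictionaries are far larger.

```python
import hashlib

# Substitutions mentioned above: 3 for the letter e, 1 for the letter l.
LEET = str.maketrans({"e": "3", "l": "1"})


def sha256_hex(word: str) -> str:
    """Return the SHA-256 hash of a word as a hex string."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def leet_variant(word: str) -> str:
    """Return the word with the simple number-for-letter swaps applied."""
    return word.translate(LEET)


def build_lookup(words):
    """Precompute hash -> password for every word and its disguised variant."""
    lookup = {}
    for word in words:
        for candidate in (word, leet_variant(word)):
            lookup[sha256_hex(candidate)] = candidate
    return lookup


def crack(stolen_hashes: dict, lookup: dict) -> dict:
    """Return the users whose unsalted hash appears in the lookup table."""
    return {user: lookup[h] for user, h in stolen_hashes.items() if h in lookup}
```

Because the hashes are unsalted, two users with the same password have the same hash, so one table lookup exposes both of them.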
Now there is a countermeasure called a salted hash.
The idea here is to add a random element to each password,
so that the hash is harder to look up.
And if any two people use the same password,
the hashes will be different,
because their salt is different.
Of course, that random number,
or random element, has to be recorded,
so that you can log in each time,
and that can mean an added field or perhaps table in the database.
Now, if an attacker gets the database,
they get the salt as well,
but the computation gets exponentially more difficult.
Suppose you have a password hash X,
and a known salt of Y that was used to calculate it.
The only way you can recover the password
is to create a new dictionary
that combines every possible password
in your original dictionary
with that known random number
and compute the resulting hash.
And if you succeed in this,
all you have is one password.
You would need to repeat this process
for every password you have,
which is what makes this computationally infeasible
for most attackers.
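Here is a minimal Python sketch of a salted hash, assuming a 16 byte random salt that is simply prepended to the password before a single SHA-256 pass. The function names and the salt size are assumptions for illustration, and production systems usually rely on a dedicated password hashing function rather than one bare hash.

```python
import hashlib
import hmac
import os


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash the salt followed by the password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def new_record(password: str):
    """Create the (salt, hash) pair that would be stored for one user."""
    salt = os.urandom(16)  # a fresh random salt for every user
    return salt, hash_with_salt(password, salt)


def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Recompute the hash with the stored salt and compare."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)
```

Two users who both pick the same password now get different stored hashes, so a single precomputed dictionary no longer matches both, and an attacker has to redo the dictionary work for each salt.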
A good description of hashing
with a discussion of how to do it
can be found at codeproject.com
and I have a link in the show notes for this.
This is an excellent and detailed discussion,
which I recommend if you are interested.
They also bring up another countermeasure
that is worth combining with the salting,
and that is a technique known as key stretching.
Essentially, this means using a hashing algorithm
that is notably slow to execute.
For hashing a single password at the server level
when a customer comes calling,
the added time is not significant,
but when you are trying to compute
an entire dictionary of millions and millions of hashes,
slowing down the attacker can make for a big difference.
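One common way to get this deliberate slowness is PBKDF2, which Python exposes as hashlib.pbkdf2_hmac. The sketch below uses it as an example of key stretching; the episode does not name a specific algorithm, so PBKDF2 is a choice made for this example, and the iteration count shown is an assumed value rather than a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune to your own hardware


def stretch(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a hash by repeating HMAC-SHA256 many times over the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def new_record(password: str, iterations: int = ITERATIONS):
    """Create the (salt, iterations, hash) triple to store for one user."""
    salt = os.urandom(16)
    return salt, iterations, stretch(password, salt, iterations)


def verify(password: str, salt: bytes, iterations: int, stored: bytes) -> bool:
    """Repeat the same slow derivation and compare the results."""
    return hmac.compare_digest(stretch(password, salt, iterations), stored)
```

The server pays the cost once per login, while an attacker building a dictionary pays it for every candidate word and every salt.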
So, now we've seen how the site owner
can make things more secure.
Next time we need to look at your own responsibility.
So, this is Ahuka,
signing off for Hacker Public Radio
and reminding you as always
to please support free software.
You've been listening to Hacker Public Radio
at HackerPublicRadio.org.
We are a community podcast network
that releases shows every weekday,
Monday through Friday.
Today's show, like all our shows,
was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast,
then click on our contribute link
to find out how easy it really is.
Hacker Public Radio was founded
by the Digital Dog Pound
and the Infonomicon Computer Club,
and is part of the binary revolution at binrev.com.
If you have comments on today's show,
please email the host directly,
leave a comment on the website
or record a follow-up episode yourself.
Unless otherwise stated,
today's show is released
under a Creative Commons,
Attribution,
ShareAlike,
3.0 license.