Hash Functions

Properties of Cryptographically Secure Hash Functions

Any input length
Fixed output length
Fast
Preimage resistance (one way)
Second preimage resistance
Collision resistance

Preimage Resistance

Hash functions must be one-way
Given a hash of a message H(x) it must be infeasible to calculate x
Less applicable to digital signatures, crucial for e.g. password storage and key derivation

Second Preimage Resistance

Weak collision resistance
Given a message x_0 and a has of that message it should be infeasible to find a second message ...

Collision Resistance

Strong collision resistance
It is not possible to find any message pair such that..
In practice, this is much easier than finding a weak collision

Preventing Collisions

If you attempt to fit more messages into fewer hash spaces, collisions are guaranteed

Second Preimage Attacks - For a 256 bit hash with good random properies might expect 2^256 bit brute force resistance before we find a collision with x_0

Collision Attacks - There are many other possible collisions beyond those simply with x_0

Birthday Paradox

It is easier to first calculate the probability P(n) that n people do not share any birthdays
The probability of at least one collision is 1-P(no collision)
The probability of a collision with only 23 people is 50% For 40 its 90%
Same principle applies to hash functions, the more hashes computed, the more likely a collision becomes

Birthday attack

.....

Merkle-Damgard

Many families of hash functions use a Merkle-Damgard construction
The message is added in one block at a time
Message blocks are mixed with the previous hash state using a compression function
...

Padding

...

Compression Function

The compression function mixes 512 bit blocks of message into the current state to produce a new state
The message schedule expands 512 bits to 64 x 32 bit words
Each round performs permutation and mixing
Conceptually similar to a block cipher, and some hash functions are implemented this way

Avalanche Effect

Whenever the input is changed at all, each output bit should change with 50% probability

Password Hashing

Preimage Protection means we cannot derive x from H(x) any more easily than brute force
This is ideal for storing passwords, we store H(x) and then compare to H(y) where y is whatever the user submits
When, not if, your database is exposed you dont expose user passwords
Property 3 of Hash Functions: Fast
Fast functions, are fast to brute force

Password Salting

Attacking hash functions often uses Rainbow Tables, precomputed hashes
We can resist this by not storing H(X), but instead H(x+s)
If s is unique per user, then any Rainbow Table must be unique per user too
Salting means an attack has to brute force per user

Just Break Property 3

A password hash function can be x times slower and not meaningfully impact user experience
Take a known fast hash function and repeat it
PBKDF creates a hash by repeatedly hashing the password, up to 100,000 times and xoring the results together
Balloon Hashing will create a large array of hashes and work back and forth over that increasing not just the time but the memory cost

Hash Functions

Properties of Cryptographically Secure Hash Functions​

Birthday Paradox​

Birthday attack​

Merkle-Damgard​

Padding​

Compression Function​

Avalanche Effect​

Password Hashing​

Password Salting​

Just Break Property 3​