Hash Functions
Properties of Cryptographically Secure Hash Functions
- Any input length
- Fixed output length
- Fast
- Preimage resistance (one way)
- Second preimage resistance
- Collision resistance
Preimage Resistance
- Hash functions must be one-way
- Given a hash of a message H(x) it must be infeasible to calculate x
- Less applicable to digital signatures, crucial for e.g. password storage and key derivation
Second Preimage Resistance
- Weak collision resistance
- Given a message x_0 and a has of that message it should be infeasible to find a second message ...
Collision Resistance
- Strong collision resistance
- It is not possible to find any message pair such that..
- In practice, this is much easier than finding a weak collision
Preventing Collisions
- If you attempt to fit more messages into fewer hash spaces, collisions are guaranteed
Second Preimage Attacks - For a 256 bit hash with good random properies might expect 2^256 bit brute force resistance before we find a collision with x_0
Collision Attacks - There are many other possible collisions beyond those simply with x_0
Birthday Paradox
- It is easier to first calculate the probability P(n) that n people do not share any birthdays
- The probability of at least one collision is 1-P(no collision)
- The probability of a collision with only 23 people is 50% For 40 its 90%
- Same principle applies to hash functions, the more hashes computed, the more likely a collision becomes
Birthday attack
.....
Merkle-Damgard
- Many families of hash functions use a Merkle-Damgard construction
- The message is added in one block at a time
- Message blocks are mixed with the previous hash state using a compression function
- ...
Padding
...
Compression Function
- The compression function mixes 512 bit blocks of message into the current state to produce a new state
- The message schedule expands 512 bits to 64 x 32 bit words
- Each round performs permutation and mixing
- Conceptually similar to a block cipher, and some hash functions are implemented this way
Avalanche Effect
- Whenever the input is changed at all, each output bit should change with 50% probability
Password Hashing
- Preimage Protection means we cannot derive x from H(x) any more easily than brute force
- This is ideal for storing passwords, we store H(x) and then compare to H(y) where y is whatever the user submits
- When, not if, your database is exposed you dont expose user passwords
- Property 3 of Hash Functions: Fast
- Fast functions, are fast to brute force
Password Salting
- Attacking hash functions often uses Rainbow Tables, precomputed hashes
- We can resist this by not storing H(X), but instead H(x+s)
- If s is unique per user, then any Rainbow Table must be unique per user too
- Salting means an attack has to brute force per user
Just Break Property 3
- A password hash function can be x times slower and not meaningfully impact user experience
- Take a known fast hash function and repeat it
- PBKDF creates a hash by repeatedly hashing the password, up to 100,000 times and xoring the results together
- Balloon Hashing will create a large array of hashes and work back and forth over that increasing not just the time but the memory cost