Skip to main content

Hash Functions

Properties of Cryptographically Secure Hash Functions

  1. Any input length
  2. Fixed output length
  3. Fast
  4. Preimage resistance (one way)
  5. Second preimage resistance
  6. Collision resistance

Preimage Resistance

  • Hash functions must be one-way
  • Given a hash of a message H(x) it must be infeasible to calculate x
  • Less applicable to digital signatures, crucial for e.g. password storage and key derivation

Second Preimage Resistance

  • Weak collision resistance
  • Given a message x_0 and a has of that message it should be infeasible to find a second message ...

Collision Resistance

  • Strong collision resistance
  • It is not possible to find any message pair such that..
  • In practice, this is much easier than finding a weak collision

Preventing Collisions

  • If you attempt to fit more messages into fewer hash spaces, collisions are guaranteed

Second Preimage Attacks - For a 256 bit hash with good random properies might expect 2^256 bit brute force resistance before we find a collision with x_0

Collision Attacks - There are many other possible collisions beyond those simply with x_0

Birthday Paradox

  • It is easier to first calculate the probability P(n) that n people do not share any birthdays
  • The probability of at least one collision is 1-P(no collision)
  • The probability of a collision with only 23 people is 50% For 40 its 90%
  • Same principle applies to hash functions, the more hashes computed, the more likely a collision becomes

Birthday attack

.....

Merkle-Damgard

  • Many families of hash functions use a Merkle-Damgard construction
  • The message is added in one block at a time
  • Message blocks are mixed with the previous hash state using a compression function
  • ...

Padding

...

Compression Function

  • The compression function mixes 512 bit blocks of message into the current state to produce a new state
  • The message schedule expands 512 bits to 64 x 32 bit words
  • Each round performs permutation and mixing
  • Conceptually similar to a block cipher, and some hash functions are implemented this way

Avalanche Effect

  • Whenever the input is changed at all, each output bit should change with 50% probability

Password Hashing

  • Preimage Protection means we cannot derive x from H(x) any more easily than brute force
  • This is ideal for storing passwords, we store H(x) and then compare to H(y) where y is whatever the user submits
  • When, not if, your database is exposed you dont expose user passwords
  • Property 3 of Hash Functions: Fast
  • Fast functions, are fast to brute force

Password Salting

  • Attacking hash functions often uses Rainbow Tables, precomputed hashes
  • We can resist this by not storing H(X), but instead H(x+s)
  • If s is unique per user, then any Rainbow Table must be unique per user too
  • Salting means an attack has to brute force per user

Just Break Property 3

  • A password hash function can be x times slower and not meaningfully impact user experience
  • Take a known fast hash function and repeat it
  • PBKDF creates a hash by repeatedly hashing the password, up to 100,000 times and xoring the results together
  • Balloon Hashing will create a large array of hashes and work back and forth over that increasing not just the time but the memory cost