Hash Functions

Wikipedia · Cryptographic hash function · CC BY-SA 4.0

A cryptographic hash function maps arbitrary-length input to a fixed-length output. It must be one-way (hard to invert), collision-resistant (hard to find two inputs with the same hash), and avalanche-prone (a small input change flips about half the output bits).

One-way property

Given a hash output h, it should be computationally infeasible to find any input m such that H(m) = h. This is called preimage resistance. For a hash with n-bit output, a brute-force preimage attack takes about 2ⁿ tries.

Scheme

; A toy hash function to illustrate one-wayness
; Real hash functions use much more complex mixing

(define (toy-hash msg)
  ;; Sum character codes, multiply, take modulo
  (let loop ((chars (string->list msg)) (acc 1))
    (if (null? chars)
      (modulo acc 65536)
      (loop (cdr chars)
            (modulo (+ (* acc 31) (char->integer (car chars))) 65536)))))

(display "H('hello') = ") (display (toy-hash "hello")) (newline)
(display "H('world') = ") (display (toy-hash "world")) (newline)
(display "H('hellp') = ") (display (toy-hash "hellp")) (newline)

;; Given output 27485, can you find the input?
;; With 16-bit output, brute force needs ~65536 tries.
;; SHA-256 has 256-bit output: ~2^256 tries.
(display "One bit change, very different output.")

Collision resistance and the birthday attack

A collision is two different inputs that produce the same hash. For an n-bit hash, the birthday attack finds a collision in about 2^n/2 tries (not 2ⁿ). This is why SHA-256 provides only 128 bits of collision resistance. The birthday paradox: in a room of 23 people, there is a 50% chance two share a birthday.

Scheme

; Birthday paradox simulation
; With a small hash (8-bit = 256 buckets),
; how many inputs until we get a collision?

(define (toy-hash-8 msg-num)
  ;; Hash a number to 0-255
  (modulo (+ (* msg-num 31) 7) 256))

;; Find first collision
(define (find-collision)
  (let loop ((i 0) (seen '()))
    (let ((h (toy-hash-8 i)))
      (if (memv h seen)
        (begin
          (display "Collision at input ") (display i)
          (display ", hash = ") (display h) (newline)
          (display "Tries: ") (display i) (newline)
          (display "sqrt(256) = 16 (expected ~16 tries)"))
        (loop (+ i 1) (cons h seen))))))

(find-collision)

Merkle-Damgard construction

SHA-256 processes the message in 512-bit blocks. Each block is fed into a compression function along with the previous hash state. The initial state is a fixed constant (IV). After all blocks, the final state is the hash. This iterative structure is the Merkle-Damgard construction.

Scheme

; Merkle-Damgard construction (simplified)
; Process message in blocks, chaining the state

(define IV 42)  ;; initial value

(define (compress state block)
  ;; Toy compression: mix state with block
  (modulo (+ (* state 31) block) 65536))

(define (merkle-damgard blocks)
  (let loop ((state IV) (remaining blocks))
    (if (null? remaining)
      state
      (loop (compress state (car remaining))
            (cdr remaining)))))

;; Message split into "blocks"
(define blocks1 (list 72 101 108 108 111))   ; "Hello"
(define blocks2 (list 72 101 108 108 112))   ; "Hellp"

(display "H('Hello') = ") (display (merkle-damgard blocks1)) (newline)
(display "H('Hellp') = ") (display (merkle-damgard blocks2)) (newline)

;; Adding a block changes the final hash
(define blocks3 (list 72 101 108 108 111 33))  ; "Hello!"
(display "H('Hello!') = ") (display (merkle-damgard blocks3))

Neighbors

Cross-references

📡 Shannon Ch. 1 — surprise and information: hash functions compress information

Foundations (Wikipedia)

← Block Ciphers by june.kim MACs and Authentication →