Info & Theory
A cryptographic hash maps any input to a fixed-length output that behaves like a random fingerprint. Two properties make it useful: the avalanche effect and resistance to collisions.
The avalanche effect
A small change to the input — even a single bit — should flip
about half of the output bits. Outputs for similar
inputs look completely unrelated, so you cannot reverse-engineer
the input or nudge it toward a target hash.
The birthday bound
With a b-bit hash there are
B = 2ᵇ possible outputs. A collision (two inputs
with the same hash) becomes likely much sooner than B — around
√B samples. The probability is approximately
1 − e^(−n²⁄2B) for n samples.
Why √B, not B
Each new sample can collide with every earlier one, so the
number of pairs grows like n². That quadratic
growth is why a room of just 23 people likely shares a birthday,
and why a 256-bit hash needs ~2¹²⁸ work to break by brute
collision — still astronomically hard.
About this simulation
For speed the hash here is a small deterministic mixing function (FNV-1a plus xorshift), not SHA-256. It still exhibits the avalanche effect and the birthday bound, which are properties of any good hash.
Frequently asked questions
What is a cryptographic hash function?
It is a function that maps any input to a fixed-length output that behaves like a random fingerprint. The same input always gives the same output, but the output reveals nothing useful about the input.
What is the avalanche effect?
The avalanche effect means a tiny change to the input — even one bit — flips about half of the output bits. It makes similar inputs produce completely unrelated hashes.
What is a hash collision?
A collision is when two different inputs produce the same hash output. A good hash makes finding one computationally infeasible.
What is the birthday paradox in hashing?
It is the surprising result that collisions become likely after only about the square root of the number of possible outputs, far sooner than intuition suggests.
Why are collisions likelier than expected?
Because each new sample can collide with every previous one, the number of pairs grows quadratically. So the chance of any collision rises much faster than the chance of hitting one specific value.
How many samples cause a likely collision?
For a b-bit hash with B = 2 to the power b outputs, a collision becomes likely around the square root of B samples. The probability is roughly 1 minus e to the power of minus n squared over 2B.
Why should about half the bits change?
If each output bit were an independent fair coin, flipping any input would change each output bit with probability one half, so on average half of them flip. That is the ideal a strong hash imitates.
Is the hash in this simulation real SHA-256?
No. For speed it uses a small deterministic FNV-1a plus xorshift mixing function. It still shows the avalanche effect and birthday bound, which are properties of any good hash.
Why use a fixed-length output?
A fixed-length digest lets you compare, store and index data of any size in constant space, and it is what makes signatures, integrity checks and proofs efficient.
How big does a hash need to be to resist collisions?
Because of the birthday bound, a hash needs about twice as many bits as the security level you want. A 256-bit hash gives roughly 128 bits of collision resistance.