Hashing Explained: MD5 vs SHA-256 and Why a Hash Only Goes One Way

You've seen hashes even if you've never called them that. The row of hex characters next to a software download labelled SHA-256. The reason a website can check your password without storing it. The commit IDs in git. They're all the same idea: a hash function, which takes any input at all and produces a short, fixed-size fingerprint of it.

The defining feature — and the thing that makes hashing different from the Base64 encoding we looked at recently — is that it only runs one way. You can turn an input into its hash in an instant. You cannot turn the hash back into the input. Ever. That one-way property is the whole point, and most of what hashing is good for follows from it.

What a hash function actually does

Feed a hash function anything — a single character, a sentence, a 4 GB video — and it returns a fixed-length string called a digest. SHA-256, the most common one today, always returns 256 bits, written as 64 hexadecimal characters, no matter how big or small the input was. You can run any text through one with our hash generator.

Two properties matter immediately. First, it's deterministic: the same input always produces exactly the same digest, every time, on every machine. That's what makes it useful for comparison — if two files have the same SHA-256, they're almost certainly identical. Second, the output is fixed-size regardless of input, which is why a hash works as a compact fingerprint for data of any length.
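
Both properties are easy to check for yourself. The following is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex and the sample inputs are illustrative choices, not something this article's tool relies on.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different sizes (arbitrary examples).
inputs = [b"a", b"The quick brown fox", b"x" * 1_000_000]

for data in inputs:
    digest = sha256_hex(data)
    # Deterministic: hashing the same bytes again gives the same digest.
    assert digest == sha256_hex(data)
    # Fixed-size: the hex digest has the same length for every input.
    print(len(data), "bytes ->", len(digest), "hex chars")
```

Each line of output reports 64 hex characters, whether the input was one byte or a million.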

The properties that make it useful

A cryptographic hash function is built to guarantee a few more things:

One-way (preimage resistance). Given a digest, there's no practical way to work out what input produced it, short of guessing inputs until one matches.
The avalanche effect. Change the input by a single character and the output changes completely — not slightly, completely. The hash of hello and the hash of Hello have nothing visible in common:

SHA-256("hello")  →  2cf24dba5fb0a30e…   (64 hex characters)
SHA-256("Hello")  →  185f8db32271fe25…   (one capital letter — a totally different result)

Collision resistance. It should be infeasible to find two different inputs that produce the same digest. When this property fails, the hash is considered broken — which brings us to the comparison everyone wants.
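
Of these properties, the avalanche effect is the easiest to measure directly. As a rough sketch (the function names here are made up for illustration), the snippet below counts how many of the 256 output bits differ between the digests of hello and Hello. For a well-behaved hash you would expect roughly half of them to flip.

```python
import hashlib

def sha256_int(text: str) -> int:
    """SHA-256 of the UTF-8 encoded text, as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two digests disagree."""
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")

print(hashlib.sha256(b"hello").hexdigest())
print(hashlib.sha256(b"Hello").hexdigest())
print(differing_bits("hello", "Hello"), "of 256 bits differ")
```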

It's not encoding, and it's not encryption

It's worth nailing down where hashing sits, because the three get muddled constantly:

Encoding (like Base64) — reversible by anyone, no key. For transport, not secrecy.
Encryption — reversible, but only with the right key. For confidentiality.
Hashing — not reversible at all, by anyone, with or without a key. For fingerprinting and verification.

If you need the original data back, you want encryption. If you just need to move binary through a text channel, you want encoding. Hashing is the one you reach for when you never need the input back — you only need to check whether something matches.
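
The contrast between encoding and hashing can be shown in a few lines. This sketch uses only Python's standard library, which has no general-purpose cipher, so encryption is left out; the sample message and the matches helper are illustrative.

```python
import base64
import hashlib

message = b"attack at dawn"  # arbitrary sample input

# Encoding: anyone can reverse it, no key involved.
encoded = base64.b64encode(message)
decoded = base64.b64decode(encoded)
assert decoded == message

# Hashing: hashlib offers no inverse operation. All you can do is
# hash a candidate input and check whether the digests match.
digest = hashlib.sha256(message).hexdigest()

def matches(candidate: bytes) -> bool:
    """True if candidate hashes to the stored digest."""
    return hashlib.sha256(candidate).hexdigest() == digest

print(matches(b"attack at dawn"))  # True
print(matches(b"attack at dusk"))  # False
```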

MD5 vs SHA-256

This is the comparison people actually search for, and the short version is: use SHA-256, retire MD5.

MD5 dates to 1991, produces a 128-bit digest, and is very fast. It is also thoroughly broken. Researchers can now generate collisions (two different inputs with the same MD5) in seconds on ordinary hardware, which means an MD5 digest can no longer prove that a file hasn't been swapped for a malicious one. It survives only as a non-security checksum for catching accidental corruption, and even there, there's little reason not to use something better.

SHA-256 is part of the SHA-2 family, produces a 256-bit digest, and has no known practical collisions. It's the current default for integrity checking, digital signatures, and almost everything else. Its older sibling SHA-1 (160-bit) is also broken — a real collision was demonstrated in 2017 — and has been retired across browsers, certificate authorities, and increasingly git itself. If you're choosing a hash today, SHA-256 (or SHA-3, or BLAKE3) is the floor.
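
To see the digest sizes side by side, here is a small sketch using hashlib.new, which looks up an algorithm by name. The digest_bits helper is hypothetical, BLAKE3 is omitted because it is not in Python's standard library, and on some locked-down (FIPS-mode) builds the md5 lookup may be refused.

```python
import hashlib

def digest_bits(name: str) -> int:
    """Digest length, in bits, of a hashlib algorithm given by name."""
    return hashlib.new(name).digest_size * 8

data = b"example"  # arbitrary sample input
for name in ("md5", "sha1", "sha256", "sha3_256"):
    hex_digest = hashlib.new(name, data).hexdigest()
    print(f"{name:<9}{digest_bits(name):>4} bits  {len(hex_digest)} hex chars")
```

Digest length alone does not make a hash secure, but it does set the ceiling on how hard collisions can be to find.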

What "broken" means: collisions

A collision is two different inputs that hash to the same value. For a secure hash, finding even one should be computationally infeasible: against a 256-bit digest, the best generic attack needs on the order of 2^128 hash computations. The danger when it's easy is concrete: if an attacker can produce a harmless document and a malicious one that share a digest, then any system trusting that digest to verify the file can be fooled into accepting the malicious version. That's why a broken hash can't be trusted for security, even though it still computes a perfectly consistent fingerprint.
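
You can get a feel for collisions by deliberately weakening a hash. The sketch below keeps only the first 24 bits of SHA-256 and hashes the strings "0", "1", "2", and so on until two of them share that shortened digest. A generic search like this needs roughly 2^(n/2) attempts for an n-bit digest, so 24 bits falls after a few thousand hashes, and the same search is hopeless at the full width. This is an illustration of why digest size matters, not of how MD5 was actually broken: the real attacks exploit structural weaknesses and are far cheaper than a generic search. The function names are made up for the example.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, bits: int) -> int:
    """First `bits` bits of the SHA-256 digest, as an integer."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int):
    """Hash b"0", b"1", ... until two inputs share a truncated digest.

    Returns (first_input, second_input, number_of_hashes_computed).
    """
    seen = {}  # truncated digest -> input that produced it
    for i in count():
        data = str(i).encode()
        tag = truncated_sha256(data, bits)
        if tag in seen:
            return seen[tag], data, i + 1
        seen[tag] = data

a, b, tries = find_collision(24)
print(a, b, "collide on 24 bits after", tries, "hashes")
```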

Where hashing is used

File integrity. The SHA-256 string next to a download lets you confirm the file you received matches the one the publisher posted — recompute it locally and compare.
Digital signatures. Signing a large file directly would be slow, so systems hash the file and sign the small digest instead. This is the machinery underneath TLS certificates and signed software.
Content addressing. git names every commit by the hash of its contents; deduplication systems spot identical files by matching digests.
Password storage. Sites store a hash of your password rather than the password itself — but doing this safely needs more than a plain hash, which is the catch worth dwelling on.
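
The file integrity check from the list above can be sketched in a few lines of Python. The function names and the chunk size are illustrative assumptions; the published digest would come from wherever the publisher posts it.

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path: str, published_hex: str) -> bool:
    """Compare the local digest with the published one.

    Surrounding whitespace and letter case in the published value are ignored.
    """
    return sha256_file(path) == published_hex.strip().lower()
```

A call such as verify_download("installer.iso", digest_from_website), with both arguments being placeholders, returns True only when every byte of the local file matches what the publisher hashed.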

Passwords: where plain hashing goes wrong

It's tempting to think "I'll just store users' passwords as SHA-256." That's a mistake, for two reasons.

First, SHA-256 is too fast. The same speed that makes it great for checksums makes it terrible for passwords: an attacker who steals the database can try billions of guesses per second against a fast hash. Second, without a salt — a unique random value mixed into each password before hashing — identical passwords produce identical hashes, so an attacker can crack them in bulk using precomputed tables.
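
The salt problem is easy to demonstrate. In this sketch, two hypothetical users pick the same password. Note that the salted version here still uses plain SHA-256 and is therefore still far too fast for real password storage; it only illustrates what a salt changes.

```python
import hashlib
import os

def unsalted(password: str) -> str:
    """Plain SHA-256 of the password (do not use for real storage)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def salted(password: str, salt: bytes) -> str:
    """SHA-256 of salt + password (illustration only, still too fast)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

# Two hypothetical users who chose the same password.
alice = unsalted("hunter2")
bob = unsalted("hunter2")
print(alice == bob)  # True: one cracked hash reveals both passwords

# With a unique random salt per user, the stored values differ.
alice_salt, bob_salt = os.urandom(16), os.urandom(16)
print(salted("hunter2", alice_salt) == salted("hunter2", bob_salt))  # False
```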

The right approach is a purpose-built password hashing function: bcrypt, scrypt, or Argon2 (the modern recommendation is Argon2id). These salt each password and are deliberately slow, and scrypt and Argon2 are memory-hungry as well, so that a single guess takes meaningful time and brute-forcing a stolen database becomes impractical. When security people say passwords should be "hashed," this is what they mean, not a bare SHA-256.
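
As a rough sketch of what this looks like in practice: Argon2id is not in Python's standard library, so the example below uses scrypt through hashlib.scrypt, which is available when Python is built against OpenSSL 1.1 or later. The cost parameters and function names are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters, not a tuned recommendation.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def hash_password(password: str):
    """Return (salt, derived_key); store both alongside the user record."""
    salt = os.urandom(16)  # unique random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

In a real application you would more likely reach for a maintained password hashing library that picks parameters and a storage format for you.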

A quick word on HMAC

Hashing also underpins message authentication. An HMAC combines a hash function with a secret key, producing a digest that proves two things at once: the data hasn't changed, and whoever produced it knew the key. If you've read about JSON Web Tokens, the HS256 signing algorithm is exactly this — an HMAC built on SHA-256. It's a neat illustration that a plain hash gives you integrity, while a hash plus a secret gives you authenticity too.
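
A minimal HMAC-SHA256 sketch using Python's standard hmac module follows. The key, message, and function names are made up for the example, and this is not a JWT implementation, only the keyed-hash step underneath one.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """HMAC-SHA256 tag of message under key, as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"shared-secret"          # hypothetical key
message = b'{"user": "alice"}'  # hypothetical payload
tag = sign(key, message)

print(verify(key, message, tag))               # True
print(verify(key, b'{"user": "admin"}', tag))  # False: data was changed
print(verify(b"wrong-key", message, tag))      # False: signer lacked the key
```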

The takeaway

A hash is a one-way fingerprint: fast to compute, impossible to reverse, and completely different for any change to the input. It's the right tool for verifying integrity, building signatures, and — handled properly, with a salt and a slow algorithm — storing passwords. It is not a way to hide data you need back, and a broken hash like MD5 is no protection against a determined attacker. Reach for SHA-256 or better, and you can experiment with any of them in our hash generator.
