Hashing & MACs

In one line: A cryptographic hash is a one-way fingerprint of data — same input always gives the same short output, but you can't run it backward — and the two things beginners most often get wrong are using a fast hash for passwords (you need a deliberately slow one, salted) and confusing a hash with a MAC (which also proves who sent it).

In plain English

A hash is like a blender that turns any amount of data into a fixed-size smoothie. The same fruit always makes the same smoothie, so you can check "is this the same fruit?" by comparing smoothies — but you can never un-blend the smoothie back into fruit. That one-way property is the whole point. It's how a website can check your password without storing the password, how you can verify a 4 GB download wasn't corrupted by comparing a tiny fingerprint, and how a signature proves a file is unaltered. The catch: hashing is not encryption (there's no key, and you can't reverse it), and the way you hash a password is deliberately different — and slower — than the way you hash a file.

What a cryptographic hash is

A hash function takes input of any size and produces a fixed-size output (the hash, or digest), e.g. 256 bits for SHA-256. A cryptographic hash has three key properties. Ordinary hashes (like the ones in a hash table) share the first but don't guarantee the other two:

Deterministic: the same input always yields the same digest.
One-way (preimage resistant): given a digest, you can't feasibly find an input that produces it. No "decryption."
Collision resistant: you can't feasibly find two different inputs with the same digest.

A related property is the "avalanche effect": a tiny change in input, even a single bit, produces a completely different, unpredictable digest.

"hello"          ── SHA-256 ──▶  2cf24dba5fb0a30e26e83b2ac5b9e29e...
"Hello"          ── SHA-256 ──▶  185f8db32271fe25f561a6fc938b2e26...   ← one bit flipped (h → H), totally different
"hello" (again)  ── SHA-256 ──▶  2cf24dba5fb0a30e26e83b2ac5b9e29e...   ← identical, always

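To see these properties for yourself, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample inputs are illustrative choices, not part of any library.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("hello")
b = sha256_hex("Hello")          # 'h' (0x68) vs 'H' (0x48): exactly one input bit differs
c = sha256_hex("hello" * 1000)   # a much larger input

print(a)
print(b)
print(len(a), len(b), len(c))    # digest length does not depend on input size

# Count how many of the 256 output bits differ between the first two digests.
differing_bits = bin(int(a, 16) ^ int(b, 16)).count("1")
print(differing_bits)
```

Running it repeatedly prints the same digests every time (deterministic), and the bit count typically lands near half of the 256 output bits, which is the avalanche effect in action.
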
Terms, defined once

Hash / digest / fingerprint — the fixed-size output. Three words for the same thing.
Collision — two different inputs producing the same digest. A cryptographically broken hash is one where collisions can be found (this killed MD5 and SHA-1).
Salt — a unique random value added to each password before hashing, so identical passwords hash differently. Stored alongside the hash (it's not secret).
Pepper — a secret value (kept separately from the database) added to all passwords; an optional extra layer.
MAC (Message Authentication Code) — a hash that also takes a secret key, proving both integrity and authenticity. HMAC is the standard construction.
KDF (Key Derivation Function) — a deliberately slow, salted function for turning passwords into hashes (or keys): bcrypt, scrypt, Argon2.

The hashes you should use (and avoid)

Use for general hashing: the SHA-2 family (SHA-256, SHA-512) and SHA-3, or BLAKE2/BLAKE3 (fast and modern). These are for file integrity, signatures, deduplication, etc.
Never use for security: MD5 and SHA-1. Both are broken — attackers can manufacture collisions, so they can't be trusted for integrity or signatures. (You may still see MD5 used as a non-security checksum against accidental corruption; never against a deliberate attacker.)

The password trap: why fast hashes are wrong for passwords

Here's the single most important practical point in this lesson. You never store passwords. You store hashes of passwords, so that a stolen database doesn't immediately hand over everyone's password. But how you hash matters enormously.

Worked example: why SHA-256 is the WRONG way to hash a password

Say you store passwords as plain SHA-256(password). Two problems:

Problem 1 — it's too fast. SHA-256 is designed to be fast — a modern GPU computes billions per second. An attacker who steals your hash database just hashes every common password and every dictionary word at billions/sec and matches them against your stored hashes. Speed is a feature for file integrity but a disaster for passwords.

Problem 2 — identical passwords look identical. Without a salt, two users with password 123456 have the same hash. An attacker cracks it once and owns both — and can precompute giant lookup tables ("rainbow tables") of common-password→hash pairs in advance.
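Both problems can be reproduced in a few lines. The sketch below is illustrative only: the usernames, the "stolen" table, and the four-word wordlist are made up, and a real attacker's wordlist would hold billions of entries.

```python
import hashlib

def fast_unsalted(password: str) -> str:
    """The WRONG scheme from the text: plain SHA-256, no salt."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical stolen table: username -> stored hash.
stolen = {
    "alice": fast_unsalted("123456"),
    "bob": fast_unsalted("123456"),
    "carol": fast_unsalted("correct horse battery staple"),
}

# Problem 2: identical passwords are visibly identical in the database.
same_hash = stolen["alice"] == stolen["bob"]

# Precompute hash -> password once for a (tiny, illustrative) wordlist...
wordlist = ["password", "123456", "qwerty", "letmein"]
lookup = {fast_unsalted(word): word for word in wordlist}

# ...then cracking each stolen hash is a single dictionary lookup.
cracked = {user: lookup[h] for user, h in stolen.items() if h in lookup}
print(same_hash, cracked)
```

Alice and Bob fall to the same precomputed entry, while Carol survives only because her password isn't in the wordlist, not because the scheme protected her.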

The fix — a slow, salted password hash (a KDF):

Salt: add a unique random value per user before hashing, so identical passwords produce different hashes and precomputed tables are useless.
Slow on purpose: use a function deliberately engineered to be expensive — bcrypt, scrypt, or (preferred today) Argon2. They take a tunable amount of CPU/memory per hash (say, 100 ms). Imperceptible for your one login; ruinous for an attacker trying billions.

WRONG:  store  SHA-256(password)              ← fast, unsalted → common passwords fall in hours or less
RIGHT:  store  Argon2(password, unique_salt)  ← slow, salted   → every single guess costs the attacker real time and memory

Modern KDFs handle the salt for you and bake it into the stored output. Use Argon2id (or bcrypt if that's what your platform offers); never a bare SHA/MD5 for passwords.
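As a rough sketch of the store-and-verify flow, the example below uses scrypt, because it ships in Python's standard library (hashlib.scrypt, which needs an OpenSSL build with scrypt support). Argon2id, the preferred choice above, requires a third-party package. The function names and the cost parameters are assumptions for illustration, not tuned recommendations, and unlike the libraries described above this low-level call does not manage the salt for you, so the sketch stores it explicitly.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumed, not a recommendation); tune for your hardware.
N, R, P, DKLEN = 2**14, 8, 1, 32

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; the salt is not secret."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=N, r=R, p=P, dklen=DKLEN
    )
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=N, r=R, p=P, dklen=DKLEN
    )
    return hmac.compare_digest(candidate, expected)
```

Hashing the same password twice gives two different digests because each call draws a fresh salt, and verification only succeeds when the stored salt is reused with the right password.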

Highlight: encryption vs hashing for passwords

Passwords are hashed, not encrypted. Encryption is reversible — if you encrypt passwords, anyone with the key (an attacker who breaches your server) gets every plaintext password back. Hashing is one-way: even you can't recover the password, which is the point. At login you hash the submitted password and compare digests. If a site can email you your original password, they're storing it reversibly — a serious red flag.

MACs: hashing that also proves who

A plain hash proves a message wasn't accidentally changed — but an attacker who alters the message can just recompute the hash, so a bare hash alone doesn't stop deliberate tampering over a channel. A MAC (Message Authentication Code) fixes this by mixing in a shared secret key.

plain hash:   H(message)            → anyone can recompute it after tampering
MAC:          HMAC(key, message)    → only someone with the secret key can produce/verify it

Because only holders of the secret key can compute a valid MAC, a correct MAC proves two things:

Integrity — the message wasn't altered.
Authenticity — it came from someone holding the shared key.

The standard construction is HMAC (Hash-based MAC, e.g. HMAC-SHA256). MACs are what authenticate API requests (signed webhooks, request signing) and session tokens. And, as covered in the earlier symmetric-encryption lesson, the "authentication tag" inside AEAD modes is doing exactly this MAC job.
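Here is a minimal sketch of the difference between a plain hash and a MAC, using Python's standard hmac module. The key, the messages, and the helper names make_tag and verify_tag are hypothetical. A real key would come from a secure random generator and be shared out of band.

```python
import hashlib
import hmac

key = b"shared-secret-key"         # hypothetical; never hard-code real keys
message = b"amount=100&to=alice"

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

tag = make_tag(key, message)

# An attacker without the key alters the message. They can recompute a plain
# hash of the new message, but that is not a valid HMAC tag.
tampered = b"amount=9999&to=mallory"
forged_tag = hashlib.sha256(tampered).digest()

print(verify_tag(key, message, tag))          # genuine message, genuine tag
print(verify_tag(key, tampered, tag))         # altered message, old tag
print(verify_tag(key, tampered, forged_tag))  # altered message, keyless "tag"
```

Only the first check succeeds: without the key, the attacker has no way to produce a tag that matches the altered message.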

Use constant-time comparison

When checking a MAC or any secret, compare with a constant-time equality function, not a normal ==. A normal comparison bails out at the first mismatching byte, and the tiny timing difference can leak the expected value one byte at a time (a timing attack). Vetted libraries provide a function for this, e.g. hmac.compare_digest in Python's standard library or constant_time_compare in Django, which is another reason not to roll your own.
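To make the difference concrete, the sketch below contrasts a hand-written early-exit comparison with the standard library's constant-time one. The function naive_equal is written here purely to show the leak and is not something to use. The sample byte strings are made up.

```python
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch occurs
    return True

def safe_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed so timing does not reveal where the inputs differ."""
    return hmac.compare_digest(a, b)

expected = bytes.fromhex("a1b2c3d4e5f60718")
guess_bad_first_byte = bytes.fromhex("00b2c3d4e5f60718")
guess_bad_last_byte = bytes.fromhex("a1b2c3d4e5f60700")

for guess in (expected, guess_bad_first_byte, guess_bad_last_byte):
    print(naive_equal(expected, guess), safe_equal(expected, guess))
```

Both functions return the same answers. The only difference is how long they take: naive_equal does less work for the first wrong guess than for the second, and that difference is the signal a timing attack measures.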

How hashing ties the chapter together

A digital signature (last lesson) is "hash the document, then sign the hash with a private key" — the hash gives integrity, the signing gives authenticity. Hashing is the first half of every signature.
File integrity / downloads: publish a SHA-256 of a file so downloaders can verify they got the real, unaltered bytes.
AEAD's auth tag is a MAC over the ciphertext (and any associated data).
Password storage uses slow salted KDFs.

One primitive, four jobs — which is why hashing sits in the middle of the cryptography chapter.
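The file-integrity job is the easiest of the four to try yourself. The sketch below hashes a file in chunks, so even a multi-gigabyte download never has to fit in memory, and checks it against a published digest. The function names and the 1 MiB chunk size are illustrative assumptions.

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file incrementally and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's digest with a published one (whitespace and case ignored)."""
    return sha256_file(path) == published_hex.strip().lower()
```

A plain == is fine here because the published digest is public, not a secret. Keep in mind this only helps if the digest itself comes from a trusted source: an attacker who can swap the file and the digest defeats a bare hash, which is exactly the gap that MACs and signatures close.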

Why it matters

It's how integrity is enforced everywhere. Signatures, certificates, package managers, blockchains, Git commits (which are content-addressed by hash) — all lean on collision-resistant hashing.
Password handling is a rite of passage. Getting it wrong (fast hash, no salt, or reversible encryption) is one of the most common and damaging real-world mistakes — it turns one database breach into millions of cracked accounts, often reused across other sites.
MACs guard the trust boundary. The boundary lens said "verify data crossing in." A MAC is the cryptographic way to verify a message arriving from elsewhere is authentic and intact.

Common pitfalls

Hashing passwords with a fast hash (SHA-256, MD5). Fast is the enemy here. Use a slow, salted KDF (Argon2id, bcrypt, scrypt).
No salt (or a shared/static salt). Unsalted hashes let attackers use precomputed rainbow tables and crack identical passwords once. Salt must be unique per password (modern KDFs do this for you).
Encrypting passwords instead of hashing. Reversible = recoverable by whoever steals the key. Hash, don't encrypt. A site that can email your password back is doing it wrong.
Using MD5/SHA-1 for anything security-relevant. Both are collision-broken. Use SHA-256, SHA-3, or BLAKE2/BLAKE3 for integrity and signatures.
Confusing a hash with a MAC. A bare hash doesn't stop a deliberate attacker (they recompute it). If you need to prove a message came from a key-holder, use HMAC, not a plain hash.
Comparing secrets with normal equality. Use constant-time comparison to avoid timing attacks.

What's next

→ Continue to TLS 1.3 — where symmetric encryption, key exchange, signatures, and hashing all come together into the handshake that secures every HTTPS connection.

→ Going deeper: hashing is the first step of every digital signature; the slow-KDF idea reappears in authentication, and constant-time comparison is part of secure coding in Secure SDLC.
