Hashing & MACs

In one line: A cryptographic hash is a one-way fingerprint of data — same input always gives the same short output, but you can't run it backward — and the two things beginners most often get wrong are using a fast hash for passwords (you need a deliberately slow one, salted) and confusing a hash with a MAC (which also proves who sent it).

In plain English

A hash is like a blender that turns any amount of data into a fixed-size smoothie. The same fruit always makes the same smoothie, so you can check "is this the same fruit?" by comparing smoothies — but you can never un-blend the smoothie back into fruit. That one-way property is the whole point. It's how a website can check your password without storing the password, how you can verify a 4 GB download wasn't corrupted by comparing a tiny fingerprint, and how a signature proves a file is unaltered. The catch: hashing is not encryption (there's no key, and you can't reverse it), and the way you hash a password is deliberately different — and slower — than the way you hash a file.

What a cryptographic hash is

A hash function takes input of any size and produces a fixed-size output (the hash, or digest), e.g. 256 bits for SHA-256. A cryptographic hash has three key properties. Ordinary hashes (like the ones in a hash table) share the first but don't guarantee the other two:

  1. Deterministic — the same input always yields the same digest.
  2. One-way (preimage resistant) — given a digest, you can't feasibly find an input that produces it. No "decryption."
  3. Collision resistant — you can't feasibly find two different inputs with the same digest. (And a tiny change in input — one bit — produces a completely different, unpredictable digest: the "avalanche effect.")
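The properties above can be observed directly. Here is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex is made up for this illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("hello")
again = sha256_hex("hello")
changed = sha256_hex("hellp")

print(first)             # starts with 2cf24dba5fb0a30e26e83b2ac5b9e29e
print(first == again)    # True: deterministic
print(first == changed)  # False: one letter changes the whole digest

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(first, 16) ^ int(changed, 16)).count("1")
print(differing_bits)    # typically close to 128, i.e. about half the bits
```

Roughly half the bits flipping for a one-letter change is the avalanche effect in action.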
"hello" ── SHA-256 ──▶ 2cf24dba5fb0a30e26e83b2ac5b9e29e...
"hellp" ── SHA-256 ──▶ 9c1185a5c5e9fc54612808977ee8f548... ← one letter, totally different
"hello" (again) ── SHA-256 ──▶ 2cf24dba5fb0a30e26e83b2ac5b9e29e... ← identical, always
Terms, defined once
  • Hash / digest / fingerprint — the fixed-size output. Three words for the same thing.
  • Collision — two different inputs producing the same digest. A cryptographically broken hash is one where collisions can be found (this killed MD5 and SHA-1).
  • Salt — a unique random value added to each password before hashing, so identical passwords hash differently. Stored alongside the hash (it's not secret).
  • Pepper — a secret value (kept separately from the database) added to all passwords; an optional extra layer.
  • MAC (Message Authentication Code) — a hash that also takes a secret key, proving both integrity and authenticity. HMAC is the standard construction.
  • KDF (Key Derivation Function) — as used in this lesson, a deliberately slow, salted function for turning passwords into hashes (or keys): bcrypt, scrypt, Argon2.

The hashes you should use (and avoid)

  • Use for general hashing: the SHA-2 family (SHA-256, SHA-512) and SHA-3, or BLAKE2/BLAKE3 (fast and modern). These are for file integrity, signatures, deduplication, etc.
  • Never use for security: MD5 and SHA-1. Both are broken — attackers can manufacture collisions, so they can't be trusted for integrity or signatures. (You may still see MD5 used as a non-security checksum against accidental corruption; never against a deliberate attacker.)
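
As a rough sketch of what the recommended general-purpose hashes look like in practice, the snippet below runs the same bytes through several of them with Python's standard hashlib. The input bytes are a placeholder. BLAKE3 is not in the standard library, so it is left out here.

```python
import hashlib

data = b"release-1.0.tar.gz contents"  # placeholder bytes for illustration

digests = {
    "sha256": hashlib.sha256(data).hexdigest(),
    "sha512": hashlib.sha512(data).hexdigest(),
    "sha3_256": hashlib.sha3_256(data).hexdigest(),
    "blake2b": hashlib.blake2b(data).hexdigest(),
}

for name, hex_digest in digests.items():
    # Each hex character encodes 4 bits of the digest.
    print(f"{name:9s} {len(hex_digest) * 4:4d} bits  {hex_digest[:16]}...")
```

Whatever the input size, each function always returns a digest of its own fixed length.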

The password trap: why fast hashes are wrong for passwords

Here's the single most important practical point in this lesson. You never store passwords. You store hashes of passwords, so that a stolen database doesn't immediately hand over everyone's password. But how you hash matters enormously.

Worked example: why SHA-256 is the WRONG way to hash a password

Say you store passwords as plain SHA-256(password). Two problems:

Problem 1 — it's too fast. SHA-256 is designed to be fast — a modern GPU computes billions per second. An attacker who steals your hash database just hashes every common password and every dictionary word at billions/sec and matches them against your stored hashes. Speed is a feature for file integrity but a disaster for passwords.

Problem 2 — identical passwords look identical. Without a salt, two users with password 123456 have the same hash. An attacker cracks it once and owns both — and can precompute giant lookup tables ("rainbow tables") of common-password→hash pairs in advance.
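
Both problems can be seen in a toy version of the attack. This is only a sketch: the usernames, the stolen table, and the four-word password list are invented for illustration, and a real attacker's list would hold millions of entries.

```python
import hashlib

def fast_unsalted_hash(password: str) -> str:
    """The WRONG scheme from the worked example: plain SHA-256, no salt."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical stolen table of username -> stored hash.
stolen = {
    "alice": fast_unsalted_hash("123456"),
    "bob": fast_unsalted_hash("123456"),
    "carol": fast_unsalted_hash("correct horse battery staple"),
}

# The attacker hashes a list of common passwords once (a tiny stand-in for a
# precomputed table) and looks every stolen hash up in it.
common_passwords = ["password", "123456", "qwerty", "letmein"]
lookup = {fast_unsalted_hash(p): p for p in common_passwords}

cracked = {user: lookup[h] for user, h in stolen.items() if h in lookup}
print(cracked)  # {'alice': '123456', 'bob': '123456'}
```

One table lookup cracks both alice and bob, because their unsalted hashes are identical.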

The fix — a slow, salted password hash (a KDF):

  • Salt: add a unique random value per user before hashing, so identical passwords produce different hashes and precomputed tables are useless.
  • Slow on purpose: use a function deliberately engineered to be expensive, such as bcrypt, scrypt, or (preferred today) Argon2. Each takes a tunable amount of CPU/memory per hash (say, 100 ms). Imperceptible for your one login; ruinous for an attacker trying billions.

WRONG: store SHA-256(password) ← fast, unsalted → cracked in hours
RIGHT: store Argon2(password, unique_salt) ← slow, salted → large-scale cracking becomes impractical
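
To put rough numbers on "imperceptible for you, ruinous for an attacker", here is a back-of-the-envelope calculation. The one-billion-guess list size is an assumption for illustration; the two speeds are the figures used in this lesson.

```python
GUESSES = 1_000_000_000        # assumed size of the attacker's candidate list
FAST_HASHES_PER_SECOND = 1e9   # "billions per second", taken as one billion
SLOW_SECONDS_PER_HASH = 0.1    # the 100 ms per hash mentioned above

fast_seconds = GUESSES / FAST_HASHES_PER_SECOND
slow_seconds = GUESSES * SLOW_SECONDS_PER_HASH
slow_years = slow_seconds / (365 * 24 * 3600)

print(f"fast hash: {fast_seconds:.0f} second(s) for the whole list")
print(f"slow KDF:  {slow_years:.1f} years for one account on one core")
```

Because each password has its own salt, the slow-KDF cost has to be paid again for every account; none of the work carries over.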

Modern KDFs handle the salt for you and bake it into the stored output. Use Argon2id (or bcrypt if that's what your platform offers); never a bare SHA/MD5 for passwords.
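
Here is a minimal sketch of slow, salted password storage. It uses scrypt rather than Argon2id only because scrypt ships in Python's standard hashlib (Argon2 needs a third-party package). The function names, the cost parameters, and the "salt followed by key" storage layout are assumptions for illustration, not a recommended production format.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}
SALT_BYTES = 16

def hash_password(password: str) -> bytes:
    """Return salt || derived key, ready to store in one column."""
    salt = os.urandom(SALT_BYTES)  # unique random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt + key

def verify_password(password: str, stored: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    salt, expected = stored[:SALT_BYTES], stored[SALT_BYTES:]
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected)

record_a = hash_password("123456")
record_b = hash_password("123456")
print(record_a != record_b)                 # True: same password, different salts
print(verify_password("123456", record_a))  # True
print(verify_password("12345", record_a))   # False
```

Dedicated password-hashing libraries usually go one step further and encode the algorithm and cost parameters into the stored string, so the settings can be raised later without breaking old records.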

Highlight: encryption vs hashing for passwords

Passwords are hashed, not encrypted. Encryption is reversible — if you encrypt passwords, anyone with the key (an attacker who breaches your server) gets every plaintext password back. Hashing is one-way: even you can't recover the password, which is the point. At login you hash the submitted password and compare digests. If a site can email you your original password, they're storing it reversibly — a serious red flag.

MACs: hashing that also proves who

A plain hash proves a message wasn't accidentally changed — but an attacker who alters the message can just recompute the hash, so a bare hash alone doesn't stop deliberate tampering over a channel. A MAC (Message Authentication Code) fixes this by mixing in a shared secret key.

plain hash: H(message) → anyone can recompute it after tampering
MAC: HMAC(key, message) → only someone with the secret key can produce/verify it

Because only holders of the secret key can compute a valid MAC, a correct MAC proves two things:

  • Integrity — the message wasn't altered.
  • Authenticity — it came from someone holding the shared key.

The standard construction is HMAC (Hash-based MAC, e.g. HMAC-SHA256). MACs are what authenticate API requests (signed webhooks, request signing), session tokens, and — recall the last symmetric lesson — the "authentication tag" inside AEAD modes is doing exactly this MAC job.
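
A short sketch of HMAC-SHA256 with Python's standard hmac module shows both guarantees. The key, the messages, and the helper names sign and verify are placeholders for illustration.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of message as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"shared-secret-key"   # placeholder; a real key should be random bytes
message = b"amount=10&to=alice"
tag = sign(key, message)

print(verify(key, message, tag))                  # True: intact and authentic
print(verify(key, b"amount=9999&to=alice", tag))  # False: message was altered
print(verify(b"wrong-key", message, tag))         # False: sender lacks the key
```

An attacker who changes the message cannot produce a matching tag without the key, which is exactly what a plain hash fails to provide.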

Use constant-time comparison

When checking a MAC or any secret, compare with a constant-time equality function, not a normal ==. A normal comparison bails out at the first mismatching byte, and the tiny timing difference can leak the expected value one byte at a time (a timing attack). Vetted libraries provide a function for this, for example Python's hmac.compare_digest or Django's constant_time_compare. That is another reason not to roll your own.
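
To make the leak concrete, the sketch below writes out an early-exit comparison and reports how many bytes it looked at before giving up. The function naive_equal and the 8-byte tag are invented for illustration. In a real attack the "bytes examined" figure is not returned but inferred from response time.

```python
import hmac
from typing import Tuple

def naive_equal(a: bytes, b: bytes) -> Tuple[bool, int]:
    """Early-exit comparison; also reports how many bytes it examined."""
    if len(a) != len(b):
        return False, 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return False, i + 1   # stops at the first mismatch
    return True, len(a)

expected = bytes.fromhex("a1b2c3d4e5f60718")   # stand-in for a real MAC tag

# A guess with more correct leading bytes takes longer to reject.
print(naive_equal(expected, bytes.fromhex("0000000000000000")))  # (False, 1)
print(naive_equal(expected, bytes.fromhex("a1b2c30000000000")))  # (False, 4)

# The vetted alternative gives only a yes/no answer, with timing designed
# not to depend on where the first mismatch occurs.
print(hmac.compare_digest(expected, bytes.fromhex("a1b2c30000000000")))  # False
print(hmac.compare_digest(expected, expected))                           # True
```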

How hashing ties the chapter together

  • A digital signature (last lesson) is "hash the document, then sign the hash with a private key" — the hash gives integrity, the signing gives authenticity. Hashing is the first half of every signature.
  • File integrity / downloads: publish a SHA-256 of a file so downloaders can verify they got the real, unaltered bytes.
  • AEAD's auth tag is a MAC over the ciphertext (and any associated data).
  • Password storage uses slow salted KDFs.

One primitive, four jobs — which is why hashing sits in the middle of the cryptography chapter.
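
For the download-verification job, a minimal sketch might look like this. The function names are hypothetical, and the published digest is assumed to be a hex string copied from the publisher's site.

```python
import hashlib
import hmac

def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large downloads never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local digest with the one published by the file's author."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())
```

Reading in chunks is what makes a 4 GB file practical to check: the hash object is updated piece by piece, and the final digest is the same as hashing the whole file at once. Keep in mind that a published hash only helps if it is fetched from a source the attacker cannot also alter.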

Why it matters

  • It's how integrity is enforced everywhere. Signatures, certificates, package managers, blockchains, Git commits (which are content-addressed by hash) — all lean on collision-resistant hashing.
  • Password handling is a rite of passage. Getting it wrong (fast hash, no salt, or reversible encryption) is one of the most common and damaging real-world mistakes — it turns one database breach into millions of cracked accounts, often reused across other sites.
  • MACs guard the trust boundary. The boundary lens said "verify data crossing in." A MAC is the cryptographic way to verify a message arriving from elsewhere is authentic and intact.

Common pitfalls

Where people commonly trip up
  • Hashing passwords with a fast hash (SHA-256, MD5). Fast is the enemy here. Use a slow, salted KDF (Argon2id, bcrypt, scrypt).
  • No salt (or a shared/static salt). Unsalted hashes let attackers use precomputed rainbow tables and crack identical passwords once. Salt must be unique per password (modern KDFs do this for you).
  • Encrypting passwords instead of hashing. Reversible = recoverable by whoever steals the key. Hash, don't encrypt. A site that can email your password back is doing it wrong.
  • Using MD5/SHA-1 for anything security-relevant. Both are collision-broken. Use SHA-256, SHA-3, or BLAKE2/BLAKE3 for integrity and signatures.
  • Confusing a hash with a MAC. A bare hash doesn't stop a deliberate attacker (they recompute it). If you need to prove a message came from a key-holder, use HMAC, not a plain hash.
  • Comparing secrets with normal equality. Use constant-time comparison to avoid timing attacks.

What's next

→ Continue to TLS 1.3 — where symmetric encryption, key exchange, signatures, and hashing all come together into the handshake that secures every HTTPS connection.

Going deeper: hashing is the first step of every digital signature; the slow-KDF idea reappears in authentication, and constant-time comparison is part of secure coding in Secure SDLC.