Detection Engineering
In one line: Detection engineering is the craft of writing the rules that turn SIEM telemetry into alerts — and the entire skill is the tradeoff between catching real attacks and not drowning analysts in false alarms, which is won by detecting behavior (hard for attackers to change) rather than brittle indicators (trivial to change), and by treating detections as tested, tuned, version-controlled code.
A detection is a rule that says "if you see this in the logs, raise an alarm." Easy to write; hard to write well. The whole problem is a tension: make a rule too broad and it screams constantly at innocent activity, so analysts start ignoring it (the alert fatigue covered in the next lesson); make it too narrow and a slightly different attack slips right past. Worse, the easy things to detect are the things attackers change in seconds: a specific malware file's fingerprint, an IP address. The valuable things to detect are the attacker's underlying behaviors, such as "a user account just did credential-dumping then logged into ten servers," because the attacker can't easily stop behaving like an attacker even when they swap tools. Detection engineering is the discipline of writing rules that target durable behaviors, tested and tuned so they fire on real badness and stay quiet otherwise. It's where a blue team's quality really lives.
The core tension: signal vs. noise
Every detection lives on a spectrum between two failure modes:
- False positives — the rule fires on benign activity. Too many and analysts suffer alert fatigue: they stop trusting alerts, and the real one gets dismissed as "probably noise again." False positives don't just waste time — they actively hide true positives.
- False negatives — the rule misses a real attack. The attacker sails through undetected; dwell time grows.
You can't max out both; tightening one tends to loosen the other. Good detection engineering finds the useful point on that curve — and, crucially, measures and tunes it over time rather than writing a rule once and forgetting it.
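To make the tradeoff concrete, here is a minimal sketch that sweeps an alert threshold over a handful of scored events and reports precision (how many alerts were real) and recall (how many attacks were caught). The scores, labels, and the idea of a single numeric risk score are illustrative assumptions, not data from a real SIEM.

```python
# Hypothetical scored events: (risk_score, is_actually_malicious).
# Scores and labels are invented for illustration only.
EVENTS = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True),
    (0.60, False), (0.40, False), (0.30, True), (0.20, False),
    (0.10, False), (0.05, False),
]


def confusion_counts(events, threshold):
    """Count outcomes when the rule alerts on score >= threshold."""
    tp = sum(1 for score, bad in events if score >= threshold and bad)
    fp = sum(1 for score, bad in events if score >= threshold and not bad)
    fn = sum(1 for score, bad in events if score < threshold and bad)
    tn = sum(1 for score, bad in events if score < threshold and not bad)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def precision_recall(counts):
    """Precision = share of alerts that were real; recall = share of attacks caught."""
    alerts = counts["tp"] + counts["fp"]
    attacks = counts["tp"] + counts["fn"]
    precision = counts["tp"] / alerts if alerts else 0.0
    recall = counts["tp"] / attacks if attacks else 0.0
    return precision, recall


if __name__ == "__main__":
    for threshold in (0.25, 0.50, 0.85):
        counts = confusion_counts(EVENTS, threshold)
        precision, recall = precision_recall(counts)
        print(f"threshold={threshold:.2f} {counts} "
              f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, the loosest threshold catches every attack but buries the alerts in false positives, while the strictest threshold produces only true alerts and misses half the attacks. Measuring both numbers per rule is what makes tuning a decision rather than a guess.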
- Detection (rule / analytic) — logic that identifies suspicious activity in telemetry and raises an alert.
- True/false positive, true/false negative — alert fired and was real / fired but benign / didn't fire and there was nothing / didn't fire but there was an attack.
- Indicator of Compromise (IOC) — a specific artifact of a known attack: a file hash, a malicious IP/domain. Easy to match, easy for attackers to change.
- TTP (Tactics, Techniques, and Procedures) — how an adversary operates — their behaviors. Durable; the high-value detection target (formalized by MITRE ATT&CK).
- Pyramid of Pain — a model ranking indicators by how much pain it causes an attacker to change them — low (hashes, IPs) to high (their TTPs/behaviors).
- Detection-as-code — managing detections like software: version-controlled, peer-reviewed, and tested.
- Threat hunting — proactively searching telemetry for attackers without a triggering alert, often to discover gaps that become new detections.
- Baseline — a model of "normal" for a user/host/system, so anomalies can be flagged.
Detect behavior, not just indicators: the Pyramid of Pain
The single most important principle in detection engineering. The Pyramid of Pain ranks what you can detect by how badly it hurts the attacker to evade it:
▲ more PAIN to the attacker (more durable detection)
TTPs / behaviors ── attacker must change HOW they operate (very hard)
Tools ── attacker must swap toolkits (hard)
Network/host artifacts ── change file names, registry keys (annoying)
Domain names ── register a new domain (easy)
IP addresses ── change IP (trivial)
File hashes ── recompile, one byte changes the hash (trivial)
▼ less pain (brittle detection)
- Bottom (hashes, IPs): trivial to detect and trivial to evade. A detection keyed to a specific malware hash breaks the instant the attacker recompiles. Necessary as a cheap layer, but low-value alone.
- Top (TTPs / behaviors): hard to detect, but if you nail it, the attacker can't easily escape — because changing their behavior means changing how they attack at all. A detection for "credential-dumping behavior followed by lateral authentication" catches the technique regardless of which tool, IP, or file the attacker uses.
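The "credential-dumping behavior followed by lateral authentication" detection can be sketched as a small correlation over events. This is an illustration under stated assumptions: the event keys (`time`, `account`, `type`, `dest_host`), the 30-minute window, and the ten-host threshold are all hypothetical choices, not a standard schema or recommended values.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # assumed correlation window
MIN_DISTINCT_HOSTS = 10         # assumed fan-out threshold


def correlate(events):
    """Return accounts that logged on to many hosts soon after a credential-access event.

    Each event is a dict with hypothetical keys: "time" (datetime), "account",
    "type" ("credential_access" or "logon") and, for logons, "dest_host".
    """
    by_account = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        by_account[event["account"]].append(event)

    flagged = {}
    for account, account_events in by_account.items():
        for i, event in enumerate(account_events):
            if event["type"] != "credential_access":
                continue
            deadline = event["time"] + WINDOW
            # Distinct hosts this account logged on to inside the window.
            hosts = {
                later["dest_host"]
                for later in account_events[i + 1:]
                if later["type"] == "logon" and later["time"] <= deadline
            }
            if len(hosts) >= MIN_DISTINCT_HOSTS:
                flagged[account] = sorted(hosts)
                break
    return flagged


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 3, 0)
    events = [{"time": t0, "account": "svc-backup", "type": "credential_access"}]
    events += [
        {"time": t0 + timedelta(minutes=n), "account": "svc-backup",
         "type": "logon", "dest_host": f"srv{n:02d}"}
        for n in range(1, 11)
    ]
    print(correlate(events))
```

Nothing in the logic refers to a tool name, hash, or IP address. Only the sequence of behaviors matters, which is what places this kind of rule near the top of the pyramid.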
Two detections for the same threat — an attacker stealing credentials from memory:
- Brittle (low pyramid): "Alert if a process with hash abc123… runs." → The attacker recompiles their tool (new hash) or uses a different one, and your detection is blind. It also misses the living-off-the-land version entirely.
- Durable (high pyramid): "Alert when any process accesses the credential store / reads another process's memory in a way associated with credential theft." → This catches the behavior (dumping credentials) no matter what tool, name, or hash performs it. The attacker would have to stop credential-dumping as a technique to evade it, which defeats their purpose.
Same threat, wildly different resilience. Investing detection effort up the pyramid is what makes a blue team durable against adaptive attackers — and it's exactly why MITRE ATT&CK (a catalog of TTPs) is the field's organizing framework.
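The contrast between the two rules can be shown in a few lines. This is a sketch, not a production rule: the event fields, the placeholder hash strings, and the allowlist are hypothetical, and `lsass.exe` is used only as a familiar Windows example of a process that holds credentials in memory.

```python
# Placeholder values for illustration; these are not real file hashes.
KNOWN_BAD_HASHES = {"hash-of-known-dumper-v1"}

CREDENTIAL_STORE_PROCESSES = {"lsass.exe"}  # example target on Windows
EXPECTED_ACCESSORS = {"antivirus.exe"}      # assumed, environment-specific allowlist


def brittle_hash_rule(event):
    """Low pyramid: fires only on an exact, previously seen file hash."""
    return event.get("source_hash") in KNOWN_BAD_HASHES


def durable_behavior_rule(event):
    """High pyramid: fires on the behavior, whatever tool performs it."""
    return (
        event.get("action") == "process_memory_read"
        and event.get("target_process", "").lower() in CREDENTIAL_STORE_PROCESSES
        and event.get("source_process", "").lower() not in EXPECTED_ACCESSORS
    )


# Hypothetical normalized events.
original_tool = {
    "action": "process_memory_read",
    "source_process": "dumper.exe",
    "source_hash": "hash-of-known-dumper-v1",
    "target_process": "lsass.exe",
}
recompiled_tool = {**original_tool, "source_hash": "hash-of-recompiled-dumper"}
benign_scanner = {
    "action": "process_memory_read",
    "source_process": "antivirus.exe",
    "source_hash": "hash-of-av-engine",
    "target_process": "lsass.exe",
}

if __name__ == "__main__":
    for name, event in [("original", original_tool),
                        ("recompiled", recompiled_tool),
                        ("benign", benign_scanner)]:
        print(name, brittle_hash_rule(event), durable_behavior_rule(event))
```

The hash rule goes blind as soon as the hash changes, while the behavior rule still fires on the recompiled tool. The allowlist is where tuning happens: security products legitimately touch the same process, so the behavior rule needs context to stay quiet on them.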
Behavior detection needs a baseline
Detecting "abnormal" behavior requires knowing what normal is. A baseline models typical activity — which servers a workstation talks to, when a user logs in, how much data a service moves — so the SIEM can flag deviations:
- This account normally logs in from one city, 9–5; now it's 3 a.m. from another continent.
- This server never initiates outbound connections; now it's beaconing to an unknown host every 60 seconds. (C2.)
- This user reads ~10 records a day; now they pulled 50,000. (Exfiltration.)
Anomaly-based detection is powerful precisely because it doesn't depend on knowing the attacker's specific tools — it flags behavioral deviation. Its challenge is false positives (legitimate-but-unusual activity), which is why baselines must be tuned and combined with context.
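As a rough illustration of a baseline, the records-per-day example can be modeled with a simple z-score against a user's own history. The history values, the threshold of 3 standard deviations, and the minimum-spread floor are assumptions chosen for the sketch; real baselines usually need more context (role, time of year, peer group).

```python
from statistics import mean, stdev


def is_anomalous(history, today, z_threshold=3.0, min_std=1.0):
    """Flag today's value if it sits far above the user's historical baseline.

    history needs at least two values. min_std is a floor so that a very
    steady user does not produce huge z-scores from tiny variations.
    """
    baseline = mean(history)
    spread = max(stdev(history), min_std)
    z_score = (today - baseline) / spread
    return z_score > z_threshold, z_score


# Hypothetical daily record-read counts for one user.
history = [8, 12, 10, 9, 11, 10, 13, 7, 10, 10]

if __name__ == "__main__":
    for today in (14, 16, 50_000):
        flagged, z_score = is_anomalous(history, today)
        print(f"today={today} z={z_score:.1f} flagged={flagged}")
```

With this history, 50,000 records is flagged by an enormous margin, but so is a merely busy day of 16 records. That second alert is the legitimate-but-unusual case: the statistics alone cannot tell a deadline from a data theft, which is why the threshold has to be tuned and paired with context.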
Treat detections as code
Mature teams practice detection-as-code: detections are written, reviewed, version-controlled, and tested like software — not clicked together once in a console and forgotten.
- Version control & review — every detection is in source control, peer-reviewed, with a documented rationale (what it catches, why, expected false-positive rate).
- Test it — validate that the rule fires on the attack (often by safely emulating the technique) and doesn't fire on normal activity, before and after changes.
- Tune continuously — measure each rule's true/false-positive rate in production and refine. A noisy rule is a bug to fix, not background static to endure.
- Document and map — tie each detection to the ATT&CK technique it covers, so you can see your coverage and gaps as a map.
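One way the practices above might look in a repository is sketched below: a detection carries its rationale and ATT&CK mapping as data, ships with tests for both the attack case and the benign case, and feeds a simple coverage report. The `Detection` class, the event fields, and the in-scope technique list are hypothetical; the technique IDs are ATT&CK's identifiers for LSASS memory credential dumping (T1003.001) and Remote Services (T1021).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Detection:
    name: str
    attack_technique: str  # ATT&CK technique ID this rule is mapped to
    rationale: str         # what it catches and why, for reviewers
    logic: Callable[[dict], bool]


def reads_lsass_memory(event: dict) -> bool:
    """Fire when any process reads the memory of lsass.exe (hypothetical schema)."""
    return (
        event.get("action") == "process_memory_read"
        and event.get("target_process", "").lower() == "lsass.exe"
    )


DETECTIONS = [
    Detection(
        name="credential_store_memory_read",
        attack_technique="T1003.001",
        rationale="Catches credential dumping from memory regardless of tool.",
        logic=reads_lsass_memory,
    ),
]


def test_fires_on_emulated_attack():
    # Event shaped like what a safe emulation of the technique would produce.
    event = {"action": "process_memory_read", "target_process": "LSASS.EXE"}
    assert DETECTIONS[0].logic(event)


def test_quiet_on_normal_activity():
    event = {"action": "file_read", "target_process": "notepad.exe"}
    assert not DETECTIONS[0].logic(event)


def coverage(detections, techniques_in_scope):
    """Map each in-scope technique ID to whether any detection claims it."""
    covered = {d.attack_technique for d in detections}
    return {technique: technique in covered for technique in techniques_in_scope}


if __name__ == "__main__":
    test_fires_on_emulated_attack()
    test_quiet_on_normal_activity()
    print(coverage(DETECTIONS, ["T1003.001", "T1021"]))
```

Because the rule, its rationale, and its tests live in one reviewable file, a change to the logic shows up as a diff and a failing test rather than as a silent gap in production. The coverage report makes the untested technique visible as an explicit `False`.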
And complement rules with threat hunting: analysts proactively searching telemetry for attackers without an alert, on a hypothesis ("if an attacker were doing X, what would I see?"). Hunting finds what your current detections miss — and each discovery becomes a new, tuned detection.
The counterintuitive core of the discipline is that more alerts can mean less security. A rule that fires 100 times a day with 99 false positives doesn't make you safer; it makes you less safe, because it trains analysts to reflexively dismiss that alert, and the one real hit dies in the noise. The goal is never "more alerts"; it's high-signal alerts a human can trust. Every detection you ship is a promise that when it fires, it's worth someone's attention. Breaking that promise repeatedly is how the next lesson's alert fatigue, and the breaches it causes, happen.
Why it matters
- It's where detection quality is decided. Logs and a SIEM are potential; detections are what actually catch things. Good ones make a small team effective; bad ones make a big team blind through fatigue.
- It's the durable counter to adaptive attackers. Behavior/TTP detection (high pyramid) holds up when attackers change tools, IPs, and files — which they do constantly. This is how you stay ahead instead of always one indicator behind.
- It directly connects offense to defense. Every detection targets something an attacker actually does, and much of that is post-exploitation behavior. Knowing how attackers operate is precisely what lets you write detections that catch them: the offensive chapter, repurposed.
Common pitfalls
- Detecting only indicators (hashes, IPs). They're trivial for attackers to change, so these detections break instantly. Use them as a cheap layer, but invest up the Pyramid of Pain toward behaviors/TTPs.
- Writing noisy rules and tolerating them. High false positives cause alert fatigue, which hides true positives. A noisy detection is a bug to tune or kill, not static to endure.
- No baseline for anomaly detection. "Flag abnormal" is meaningless without a model of normal. Build and tune baselines so deviations are real signals.
- Write-once detections. Threats and environments change; an untested, untuned rule rots. Treat detections as code — versioned, reviewed, tested, and measured.
- Coverage blind spots. Without mapping detections to a framework (ATT&CK), you can't see which techniques you'd miss. Map coverage and hunt for the gaps.
- Relying only on rules, never hunting. Rules catch known patterns; threat hunting finds the unknowns that become tomorrow's detections. Do both.
What's next
→ Continue to Alerting & the SOC — the people and process behind the detections: how a security operations center triages alerts, escalates incidents, and fights the alert fatigue this lesson warned about.
→ Going deeper: the framework that catalogs the TTPs you detect is MITRE ATT&CK; the attacker behaviors you're writing rules for are post-exploitation.