
Reconnaissance

In one line: Reconnaissance is the attacker's mapping phase, in which you discover everything that makes up the target's attack surface through passive recon (gathering public information without touching the target) and active recon (probing the target directly); the completeness of your map decides whether you find the one exposed thing that matters.

Authorized scope only

Active reconnaissance — scanning, enumeration, probing — touches the target and is only legal within an authorized, scoped engagement. Passive recon uses public sources, but even there, stay within scope and the law. Recon is where over-eager testers first drift out of bounds; let your scope define your map.
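One way to let scope define your map is to check every candidate target against the written scope before anything touches it. The sketch below is a minimal illustration, not a complete policy: the domain and network in it are hypothetical placeholders, and it assumes the engagement covers every subdomain of a listed domain, which a real scope document may not.

```python
import ipaddress

# Hypothetical scope for illustration; real values come from the signed scope document.
SCOPE_DOMAINS = {"example.com"}
SCOPE_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def in_scope(target: str) -> bool:
    """Return True only if target is an in-scope IP or an in-scope (sub)domain."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        host = target.lower().rstrip(".")
        # Require an exact match or a dot boundary, so "notexample.com" does not pass.
        return any(host == d or host.endswith("." + d) for d in SCOPE_DOMAINS)
    return any(addr in net for net in SCOPE_NETWORKS)
```

Calling `in_scope("legacy-portal.example.com")` passes under these assumptions, while a lookalike such as `example.com.evil.net` does not, because the match is anchored on the domain suffix rather than on a substring.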

In plain English

Before a heist, the crew studies the building for weeks: entrances, guard schedules, camera placement, who works there. Reconnaissance is that study phase for a system — and it's where most real attacks actually succeed or fail, long before any "hacking." The goal is to build the most complete possible picture of the target: every domain, server, open port, running service, employee email, exposed document, and forgotten test environment. Why does it matter so much? Because an organization only has to forget one thing — an old subdomain, a leaked password, a debug endpoint — and a thorough recon finds it. Attackers are patient mappers; the flashy exploit is usually the short part. Recon comes in two flavors: passive (learn about the target from public information without ever touching it — invisible) and active (poke the target directly to see what it reveals — noisier but precise). You generally do passive first.

Passive recon: learn without touching

Passive reconnaissance gathers information from public, third-party sources — never sending traffic to the target's own systems, so it's essentially undetectable. This is OSINT (Open-Source Intelligence): assembling a picture from what's already out there.

Typical passive sources and what they reveal:

  • DNS & domain records: subdomains, mail servers, IP ranges, hosting providers. Public certificate transparency logs (certificates from publicly trusted CAs are logged there, though certificates from private or internal CAs are not) are a goldmine for discovering subdomains an org didn't mean to expose.
  • Search engines & the web: exposed documents, error pages, login portals, and robots.txt hints, read from search results and cached or archived copies (fetching a page from the target yourself counts as touching it). "Google dorking" means using advanced search operators to find exposed files and pages.
  • Public code & data: company repositories, and the leaked secrets that hide in them; paste sites; misconfigured cloud storage.
  • Breach data & credential dumps: emails and passwords from other sites' breaches, fueling credential stuffing.
  • Employee & org footprint: staff names, roles, and emails (from professional networks), useful for guessing username formats and for social engineering.
  • Infrastructure aggregators: services that continuously scan the whole internet and let you look up a target's exposed hosts and services without scanning them yourself (passive, because someone else did the scanning).
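To make the certificate-transparency source concrete, here is a small sketch that turns exported log records into a deduplicated subdomain list. It assumes you already hold the records as a list of dictionaries with a `name_value` field containing newline-separated hostnames; that field name and shape are assumptions about the export format, so adjust them to whatever your CT search service actually returns.

```python
def subdomains_from_ct(records: list[dict], root: str) -> list[str]:
    """Collect unique hostnames under `root` from CT-style records."""
    root = root.lower().rstrip(".")
    found = set()
    for record in records:
        # One record may list several names, separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # a wildcard entry still reveals its parent name
            # Keep only names that really sit under the root domain.
            if name == root or name.endswith("." + root):
                found.add(name)
    return sorted(found)


# Made-up sample records, shaped like the assumed export format.
sample = [
    {"name_value": "app.example.com\nvpn.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "Legacy-Portal.example.com"},
    {"name_value": "example.com.evil.net"},
]
print(subdomains_from_ct(sample, "example.com"))
```

Nothing here sends traffic to the target: the function only processes data a third party has already published, which is what keeps this step passive.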

Terms, defined once
  • OSINT (Open-Source Intelligence) — intelligence assembled from publicly available sources.
  • Attack surface — the total set of points an attacker could target (from Foundations). Recon's job is to enumerate it fully.
  • Subdomain enumeration — finding all of an organization's subdomains (e.g., dev., vpn., old.), since forgotten ones are common weak points.
  • Port scanning — probing which network ports are open on a host, revealing what services it runs.
  • Service/version enumeration — identifying the specific software and version behind an open port (so you can check it for known CVEs).
  • Fingerprinting — identifying technologies in use (web server, framework, CMS) from their tell-tale responses.
  • Footprint — the total discoverable presence of an organization online.
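Fingerprinting often starts with nothing more than response headers. The sketch below pulls product and version hints out of the `Server` and `X-Powered-By` headers of a response you have already captured. The function name is hypothetical, and the result is a hint rather than proof, because headers can be stripped or deliberately falsified.

```python
import re
from typing import Optional


def fingerprint(headers: dict[str, str]) -> list[tuple[str, Optional[str]]]:
    """Extract (product, version) hints from common identifying headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    hints = []
    for name in ("server", "x-powered-by"):
        # Drop parenthesized comments such as "(Ubuntu)" before splitting.
        value = re.sub(r"\([^)]*\)", " ", lowered.get(name, ""))
        for token in value.split():
            product, _, version = token.partition("/")
            hints.append((product, version or None))
    return hints


print(fingerprint({"Server": "Apache/2.4.41 (Ubuntu)", "X-Powered-By": "PHP/7.4.3"}))
```

A product with no version, such as a bare `nginx`, comes back with `None` in the version slot, which is itself useful: it tells you version enumeration needs another technique for that host.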

Active recon: probe the target directly

Active reconnaissance sends traffic to the target to learn what passive sources can't — what's actually running, right now. It's more precise but detectable (it shows up in the target's logs), so it only happens inside authorized scope. The progression narrows from "what exists" to "what's exploitable":

  1. Host discovery — which IPs in scope are alive.
  2. Port scanning — which ports are open on each host (a port is a door; an open one means a service is listening). Tools like Nmap are the classic here.
  3. Service & version enumeration — what software and version sits behind each open port. This is the pivotal step: a version number lets you look up known vulnerabilities (CVEs) for that exact software.
  4. Application enumeration — for web targets, discovering directories, endpoints, parameters, and technologies (the trust boundaries and inputs you'll test in exploitation).
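Step 2 above can be sketched as a bare TCP connect check using only the standard library. This is the simplest possible technique and far less capable than a dedicated scanner such as Nmap. The allowlist is a hypothetical stand-in for the engagement's written scope, and the function refuses any host not on it.

```python
import socket

# Hypothetical allowlist; in practice it mirrors the engagement's written scope.
AUTHORIZED_HOSTS = {"127.0.0.1"}


def open_ports(host: str, ports: list[int], timeout: float = 0.5) -> list[int]:
    """Return the ports on `host` that accept a TCP connection."""
    if host not in AUTHORIZED_HOSTS:
        raise PermissionError(f"{host} is not in the authorized scope")
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

The check is sequential and uses a short timeout on purpose: it is slow, but it keeps the traffic volume low, which matters when the rules of engagement limit scan intensity.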

Worked example: from a domain to a candidate weakness

Authorized to test example.com, you build the map:

  1. Passive: Certificate transparency logs reveal app.example.com, vpn.example.com, and — interestingly — legacy-portal.example.com (a name suggesting an old, possibly unmaintained system). An infrastructure aggregator shows legacy-portal exposes a web service.
  2. Active (in scope): A port scan of legacy-portal.example.com shows port 443 open. Version enumeration fingerprints it as a content-management system running a years-old version.
  3. Cross-reference: That version has a publicly known CVE for authentication bypass.

You haven't exploited anything yet — but recon has turned "a company" into "a specific, likely-vulnerable forgotten system and a candidate weakness to validate." Notice the winning thread was the forgotten asset: thorough enumeration found the one thing the org wasn't watching. That's the entire value of recon — you only have to find the one door they forgot to lock.
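The cross-reference step in the example comes down to a version comparison. The sketch below checks an enumerated product and version against an advisory table. Everything in the table is a hypothetical placeholder (the product name, the fixed version, and the advisory label), and real data would come from a vulnerability database rather than a hard-coded dictionary.

```python
# Hypothetical advisory table: product -> (first fixed version, advisory label).
ADVISORIES = {"examplecms": ((4, 2, 0), "auth bypass (placeholder ID)")}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '3.9.1' into (3, 9, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def candidate_weakness(product: str, version: str):
    """Return the advisory label if `version` predates the fix, else None."""
    entry = ADVISORIES.get(product.lower())
    if entry is None:
        return None
    fixed_in, label = entry
    return label if parse_version(version) < fixed_in else None


print(candidate_weakness("ExampleCMS", "3.9.1"))
```

Comparing tuples of integers rather than raw strings is the design choice that matters here: as strings, "4.10.0" sorts before "4.2.0", which would wrongly flag a patched system. A match is still only a candidate to validate, since vendors sometimes backport fixes without changing the visible version.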

Why "the one forgotten thing" wins

Recon's power comes from an asymmetry you met in Foundations: defenders must secure everything; an attacker needs one gap. Organizations accumulate forgotten subdomains, abandoned test servers, shadow IT, expired-but-live services, and leaked credentials. A defender who isn't doing their own recon doesn't even know these exist. Thorough enumeration systematically surfaces them — which is why mature defenders run continuous attack-surface discovery on themselves, recon-ing their own footprint before an attacker does.
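Turned inward, that discovery is a comparison between what recon finds and what the organization believes it owns. A minimal sketch, with made-up hostnames and a hypothetical function name:

```python
def unknown_assets(discovered: set[str], inventory: set[str]) -> list[str]:
    """Hosts that recon found but the asset inventory does not list."""

    def normalize(hosts: set[str]) -> set[str]:
        # Hostnames are case-insensitive and may carry a trailing dot.
        return {host.strip().lower().rstrip(".") for host in hosts}

    return sorted(normalize(discovered) - normalize(inventory))


discovered = {"app.example.com", "VPN.example.com", "legacy-portal.example.com"}
inventory = {"app.example.com", "vpn.example.com"}
print(unknown_assets(discovered, inventory))
```

Anything in the result is an asset nobody is tracking, which makes it the first place a defender should look.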

A note on the human attack surface

Recon isn't only technical. The people in an organization are part of the attack surface, and social engineering (manipulating humans into revealing information or access through phishing, pretexting, or baiting) is often the path of least resistance in real breaches. Passive recon feeds it: employee names and email formats enable convincing phishing. Whether social engineering is permitted is an explicit Rules-of-Engagement decision, so never assume it's in scope. The defense is largely awareness training, plus technical controls such as MFA that limit what a phished credential can do.

Why it matters

  • It's where engagements are won. Coverage in recon determines everything downstream — you can't exploit what you never found. The flashy exploit is usually the short, final step on top of patient mapping.
  • It mirrors what defenders must do. Attack-surface management is recon turned inward. Understanding offensive recon is exactly how a defender learns to find their own forgotten exposures first.
  • Passive-first is a real skill. Knowing how much you can learn without touching the target — and staying invisible while you do — is core tradecraft, and the safest place to start.

Common pitfalls

Where people commonly trip up
  • Rushing to exploitation with a thin map. Skipping thorough enumeration means missing the forgotten asset that was the real way in. Breadth of recon beats speed to exploit.
  • Doing active recon outside scope. Scanning touches the target, so it is detectable, and without authorization it is illegal. Let scope bound your map; passive-first keeps you safe and quiet.
  • Ignoring subdomains and forgotten assets. The main site is usually the hardened one. The weak points are dev., old., staging., and shadow IT — enumerate exhaustively.
  • Overlooking version enumeration. "Port 443 is open" is far less useful than "it's running X version Y," which maps directly to known CVEs. Always fingerprint versions.
  • Forgetting the human surface. Technical recon misses social-engineering paths that are often the easiest, but test them only if the Rules of Engagement (RoE) explicitly allow it.
  • Being needlessly noisy. Aggressive scanning can disrupt services and trip defenses; respect intensity/timing limits in the RoE.

What's next

→ Continue to Exploitation — turning the candidate weaknesses recon surfaced into safely-demonstrated proof, using the bug classes from AppSec and beyond.

Going deeper: recon turned inward is attack-surface management, a defensive practice in Detection & Response; the leaked-credential angle ties back to secret scanning.