> test attack | tee /dev/soc && diff expected.alerts actual.alerts_
A mid-sized insurance company invests heavily in defensive security. It deploys CrowdStrike Falcon across every endpoint. It stands up a 24/7 managed SOC with a Tier 1/2/3 analyst structure. It implements Microsoft Sentinel as its SIEM, ingesting logs from Active Directory, M365, firewalls, VPN, and DNS. It writes 140 custom detection rules. It passes its Cyber Essentials Plus assessment. The CISO tells the board: "We can see everything."
Six months later, a penetration tester achieved Domain Admin in three hours and forty minutes. They captured credentials via LLMNR poisoning, Kerberoasted a service account, escalated through a misconfigured backup operator role, extracted the NTDS.dit from the domain controller, and accessed the finance share containing policyholder data for 340,000 customers. Along the way they moved laterally across four servers and escalated privileges three times.
The SOC detected none of it. Not the LLMNR poisoning. Not the Kerberoasting. Not the lateral movement. Not the NTDS.dit extraction. Not the file share access. CrowdStrike was running on every endpoint — and didn't alert because the tester used legitimate Windows tools and protocols that EDR is not designed to flag by default. The SIEM had 140 rules — none of which matched the specific attack patterns the tester employed.
The SOC wasn't incompetent. It was untested. The detection rules were written against a threat model that had never been validated by a real adversary. The SIEM was collecting the right logs but asking the wrong questions. The EDR was functioning perfectly — within the boundaries of what EDR is designed to detect. Nobody had ever tested whether the defensive stack, as a whole, could detect and respond to the attacks that actually work against this specific environment.
Penetration testing asks: "Can an attacker compromise this environment?" Defensive security asks: "Can we detect and respond when they try?" These are different questions. The answer to one does not imply the answer to the other. An organisation needs both — and each makes the other more effective.
The misconception that penetration testing and defensive security are interchangeable — that you can invest in one instead of the other — stems from a failure to understand what each is measuring. They operate on different planes of the same problem.
| | Penetration Testing | Defensive Security (SOC / EDR / SIEM) |
|---|---|---|
| Primary question | Can an attacker breach this environment, escalate privileges, and access sensitive data? | Can we detect, investigate, and respond to an attack in progress before it causes damage? |
| What it finds | Exploitable vulnerabilities, misconfigurations, weak credentials, missing controls, viable attack paths, chainable findings. | Alert gaps, log blind spots, detection rule failures, response time deficiencies, analyst knowledge gaps, playbook weaknesses. |
| When it operates | Point-in-time. A defined engagement with a start date, end date, and scope. A snapshot of the environment's vulnerability posture at a specific moment. | Continuous. 24/7/365 monitoring. Ongoing detection and response. The persistent defensive capability that operates between pen tests. |
| What it assumes | The attacker is skilled, motivated, and patient. They will find the weakest point and exploit it. The test assumes defences may fail. | Attacks will occur. The defensive stack must detect them quickly enough and respond effectively enough to limit damage. The SOC assumes attacks will get through. |
| What it doesn't measure | Whether the defensive stack detected the attack. Whether the SOC responded appropriately. Whether the organisation's incident response process works. (Unless explicitly scoped as a detection assessment.) | Whether the attack would have succeeded in the first place. Whether the vulnerability exists. Whether the misconfiguration is exploitable. Whether the attack path is viable. |
| Output | A report of confirmed vulnerabilities, demonstrated attack paths, and remediation recommendations ranked by risk. | Ongoing alerts, investigation reports, incident response actions, threat intelligence integration, and continuous posture improvement. |
| Analogy | A burglar testing whether your locks, windows, and alarm system can be bypassed. | The alarm company monitoring your house 24/7 and dispatching a response team when the sensors trigger. |
The burglar test is useless if you never install an alarm. The alarm is useless if you never test whether the burglar can get past it. Security requires both — and the gap between them is where real-world breaches live.
When a penetration test is conducted alongside — not instead of — a functioning SOC, it produces a metric that neither can generate alone: the detection gap. This is the delta between what the attacker did and what the SOC saw. It's the most actionable finding in any engagement that includes detection assessment, and it's the metric that turns a pen test from a vulnerability report into a detection improvement programme.
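As a minimal sketch of how that delta can be computed, assuming the tester hands over a timestamped action log and the SOC exports its alerts for the same window (the field names, the ATT&CK IDs used as a join key, and the 30-minute correlation window are all illustrative assumptions, not any vendor's schema):

```python
from datetime import datetime, timedelta

# Illustrative inputs: the tester's timestamped action log and the SOC's alert export.
# Field names and the 30-minute correlation window are assumptions, not a vendor schema.
attacker_actions = [
    {"time": "2024-03-01T09:12:00", "technique": "T1557.001", "action": "LLMNR poisoning"},
    {"time": "2024-03-01T09:48:00", "technique": "T1558.003", "action": "Kerberoasting"},
    {"time": "2024-03-01T11:05:00", "technique": "T1003.003", "action": "NTDS.dit extraction"},
]
soc_alerts = [
    {"time": "2024-03-01T09:50:00", "technique": "T1558.003", "rule": "Excessive TGS requests"},
]

def detection_gap(actions, alerts, window=timedelta(minutes=30)):
    """Pair each attacker action with the first alert on the same technique inside
    the correlation window; anything unpaired is the detection gap."""
    gaps, detections = [], []
    for act in actions:
        acted = datetime.fromisoformat(act["time"])
        match = next(
            (a for a in alerts
             if a["technique"] == act["technique"]
             and timedelta(0) <= datetime.fromisoformat(a["time"]) - acted <= window),
            None,
        )
        (detections if match else gaps).append({**act, "alert": match})
    return gaps, detections

gaps, detections = detection_gap(attacker_actions, soc_alerts)
print(f"Detection rate: {len(detections) / len(attacker_actions):.0%}")
for g in gaps:
    print(f"MISSED  {g['time']}  {g['technique']}  {g['action']}")
```

In practice the join key would be richer than a bare technique ID (host, account, source IP), but the principle holds: every attacker action either pairs with an alert or lands in the gap list that drives the next round of detection engineering.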
Return to the insurance company engagement: ten attacker actions, zero detections. A 0% detection rate against the exact attack techniques that produce Domain Admin in the majority of internal penetration tests. The SOC had 140 rules, yet none of them covered LLMNR poisoning, Kerberoasting, PsExec lateral movement, NTDS.dit extraction, or anomalous file share access. The rules detected the threats the organisation had imagined. The pen test revealed the threats that actually materialised.
After the engagement, the SOC team used the detection gap analysis to write seven new detection rules — each mapped to a specific attacker action that had succeeded undetected. The next pen test, six months later, achieved the same initial compromise — but the SOC detected the Kerberoasting at minute 40, the lateral movement at minute 55, and triggered the incident response playbook at minute 62. Domain Admin was still achievable, but containment began before the tester reached the finance share. That improvement was only possible because the pen test measured the detection gap and gave the SOC team the specific intelligence they needed to close it.
A SOC that has never faced a realistic adversary is untested by definition. It may have passed tabletop exercises, responded to commodity malware alerts, and tuned rules against known indicators of compromise. But none of that proves it can detect a skilled attacker using legitimate tools, valid credentials, and native protocols to move through the environment.
| What Pen Testing Reveals | Why the SOC Can't Find This Alone |
|---|---|
| Detection rules that don't fire — rules that look correct on paper but fail to trigger against real attack patterns because of parsing errors, threshold misconfiguration, or log source gaps. | A rule that's never been triggered can't be validated by normal operations. Only a real attack — or a simulated one — produces the telemetry that proves whether the rule works. The SOC has no way to test its own rules without adversarial input. |
| Log blind spots — categories of activity that generate no telemetry because the relevant log source isn't ingested, isn't parsed, or isn't enabled. Common gaps: PowerShell script block logging, Kerberos service ticket requests (Event ID 4769), DNS query logging, SMB access auditing. | The SOC can't detect what it can't see. But it often doesn't know what it can't see until an attacker operates in the blind spot. The pen test maps the attacker's path and cross-references it against the SIEM's data sources — revealing every point where telemetry was absent. |
| Alert fatigue and prioritisation failure — the attack generated some alerts, but they were buried in noise, deprioritised by triage logic, or dismissed as false positives by Tier 1 analysts. | Alert fatigue is invisible from the inside. Analysts don't report the alerts they dismissed — they report the alerts they escalated. The pen test reveals which real-attack indicators were generated, triaged, and discarded. |
| Response playbook gaps — the SOC detected something but didn't know what to do with it. The playbook didn't cover this scenario. The escalation path was unclear. The containment action wasn't defined. | Playbooks are written against anticipated scenarios. Pen tests create unanticipated scenarios — or rather, scenarios the SOC should have anticipated but didn't. The gap between detection and effective response is only visible under real pressure. |
| EDR bypass techniques — the attacker used living-off-the-land binaries (LOLBins), legitimate remote administration tools, or in-memory execution that the EDR didn't flag because the activity used signed, trusted binaries. | EDR is designed to detect malware, exploit attempts, and suspicious process behaviour. Attackers who use certutil, mshta, wmic, PsExec, and PowerShell — all signed Microsoft binaries — operate within the EDR's trust model. Only adversarial testing reveals whether custom EDR rules have been created for LOLBin abuse. |
| Mean time to detect (MTTD) and mean time to respond (MTTR) — the actual elapsed time between attacker action and SOC detection, and between detection and effective containment. The real numbers, not the SLA targets. | The SOC tracks MTTD and MTTR for incidents it detects. It cannot track MTTD for incidents it misses entirely. The pen test provides the ground truth: every attacker action is timestamped, and every SOC response (or lack thereof) is measurable against it. |
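The last row of that table is the easiest to operationalise once the pairing from the detection gap analysis exists. A rough sketch, assuming each attacker action has already been matched (or not) to a first detection and a containment action, with all timestamps and field names illustrative:

```python
from datetime import datetime
from statistics import mean

# Illustrative paired records from a single engagement: when the tester acted, when the
# SOC first detected it (None = never), and when containment started (None = never).
paired = [
    {"action": "Kerberoasting",       "acted": "09:48", "detected": "10:28", "contained": "10:50"},
    {"action": "Lateral movement",    "acted": "10:15", "detected": "11:10", "contained": None},
    {"action": "NTDS.dit extraction", "acted": "11:05", "detected": None,    "contained": None},
]

def minutes_between(start, end):
    if start is None or end is None:
        return None
    fmt = "%H:%M"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 60

mttd_samples = [minutes_between(p["acted"], p["detected"]) for p in paired]
mttr_samples = [minutes_between(p["detected"], p["contained"]) for p in paired]

detected = [m for m in mttd_samples if m is not None]
contained = [m for m in mttr_samples if m is not None]
missed = mttd_samples.count(None)

print(f"MTTD across detected actions: {mean(detected):.0f} min")
print(f"MTTR across contained actions: {mean(contained):.0f} min")
print(f"Actions the SOC never saw (invisible to its own MTTD metric): {missed}")
```

The point of the last line is the one the table makes: the SOC's own dashboards can only average over what it saw; the engagement log supplies the denominator it is missing.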
The complement works in both directions. A pen test without SOC context reports every finding at face value — as if the attacker operates undetected. But if the SOC detects and contains the Kerberoasting within 15 minutes, the finding's real-world risk is different from an environment where it goes undetected for days. The defensive capability changes the risk calculus.
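One way to make that calculus explicit is to feed containment time from the engagement back into finding severity. The multipliers and thresholds below are purely illustrative assumptions, not a standard scoring model:

```python
from typing import Optional

def residual_risk(base_severity: float, contained_after_minutes: Optional[float]) -> float:
    """Scale a finding's 0-10 severity by how quickly the SOC contained the technique
    during the engagement. Multipliers are illustrative, not a standard."""
    if contained_after_minutes is None:          # never detected or never contained
        return base_severity
    if contained_after_minutes <= 15:
        return round(base_severity * 0.4, 1)     # rapid containment sharply limits impact
    if contained_after_minutes <= 60:
        return round(base_severity * 0.7, 1)
    return round(base_severity * 0.9, 1)         # slow containment barely changes the picture

print(residual_risk(8.0, 12))    # Kerberoasting contained in 12 minutes -> 3.2
print(residual_risk(8.0, None))  # same finding in an environment that never saw it -> 8.0
```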
Purple teaming is the natural evolution of the pen test / SOC relationship. Instead of the pen tester operating covertly and the SOC discovering the results after the engagement, both teams work collaboratively — the tester executes techniques one at a time, the SOC attempts to detect each one, and gaps are addressed immediately.
| | Traditional Pen Test | Purple Team Exercise |
|---|---|---|
| Attacker visibility | Covert. The tester operates without the SOC's knowledge (or with minimal notification). Realism is maximised. | Collaborative. The tester and SOC analysts sit in the same room (or virtual session). Each technique is announced, executed, and evaluated together. |
| Detection feedback | After the engagement. The SOC learns what it missed when the report is delivered — days or weeks later. | Immediate. After each technique, the SOC checks: did we detect it? If not, why not? The gap is diagnosed, and a rule is drafted or tuned in the same session. |
| Output | A vulnerability report with a detection gap appendix. | A detection improvement log: each technique tested, the detection result, the root cause of any gap, and the rule or configuration change that closes it. |
| Best for | Assessing the overall security posture. Testing whether defences hold under realistic adversarial pressure without the SOC having advance notice. | Rapidly improving detection capability. Training SOC analysts against real attack techniques. Building and tuning detection rules with immediate feedback. |
| MITRE ATT&CK alignment | The report maps findings to ATT&CK techniques. The SOC reviews them post-engagement. | Each technique is executed by ATT&CK ID. Detection coverage is measured technique by technique against the ATT&CK matrix. Gaps are visualised in real time. |
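A minimal sketch of that technique-by-technique bookkeeping, assuming each purple-team session records one outcome per ATT&CK technique executed (the outcome categories and records below are illustrative):

```python
from collections import Counter

# Illustrative purple-team session results, one record per technique executed.
# "detected" = an alert fired; "logged" = telemetry existed but no rule matched;
# "missed" = no usable telemetry at all.
results = [
    {"technique": "T1557.001", "name": "LLMNR/NBT-NS Poisoning",   "outcome": "missed"},
    {"technique": "T1558.003", "name": "Kerberoasting",            "outcome": "logged"},
    {"technique": "T1021.002", "name": "SMB/Windows Admin Shares", "outcome": "detected"},
    {"technique": "T1003.003", "name": "NTDS.dit extraction",      "outcome": "missed"},
]

tally = Counter(r["outcome"] for r in results)
print(f"Techniques tested: {len(results)}")
print(f"Detected: {tally['detected']}  Logged only: {tally['logged']}  Missed: {tally['missed']}")
print(f"Detection coverage this session: {tally['detected'] / len(results):.0%}")

# "Logged only" is the cheapest win: the telemetry already exists,
# so only a rule or analytic needs to be written.
for r in results:
    if r["outcome"] == "logged":
        print(f"Rule candidate: {r['technique']} ({r['name']})")
```

Splitting "logged but no rule fired" from "no telemetry at all" matters: the first is a rule-writing task, the second is a log-source project.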
Purple teaming isn't a replacement for traditional pen testing — it serves a different purpose. A covert pen test answers: "Can the SOC detect a realistic attack?" A purple team exercise answers: "For each specific technique, does the SOC have visibility, and if not, how do we create it?" Mature organisations run both: periodic covert pen tests to validate, and regular purple team sessions to improve.
The relationship between pen testing and defensive security evolves as the organisation's security programme matures. At each stage, the value each delivers — and the value each derives from the other — increases.
| Maturity Stage | Pen Test Focus | Defensive Security Focus | How They Interact |
|---|---|---|---|
| 1. Foundational | Identify the vulnerabilities. What misconfigurations exist? What can be exploited? What are the critical findings? | Deploy the tools. Stand up EDR, SIEM, and basic logging. Build the initial rule set. Establish a monitoring capability. | Minimal interaction. The pen test finds vulnerabilities. The SOC is too new to be tested. The findings are remediated independently. The detection gap isn't yet measured. |
| 2. Developing | Test the attack paths. Can the vulnerabilities be chained? How quickly does the attacker reach critical assets? | Tune the rules. Reduce false positives. Expand log sources. Begin tracking MTTD and MTTR for detected incidents. | The pen test report includes a detection gap appendix. The SOC uses it to write new rules. Detection improves between engagements. The feedback loop begins. |
| 3. Established | Test detection explicitly. Did the SOC detect the attack? How quickly? Was the response effective? Include detection assessment in the pen test scope. | Proactive hunting. Threat intelligence integration. Custom EDR rules for LOLBin abuse. Analyst training programme. | The pen test is designed to test the SOC as much as the infrastructure. Detection rate becomes a primary metric. Purple team exercises supplement covert pen tests. |
| 4. Advanced | Red team operations. Assumed breach. Objective-based testing ("can you reach the SWIFT terminal?"). Multi-week, multi-vector campaigns. | Mature SOC with threat hunting, behavioural analytics, deception technology (honeypots, honey tokens). Detection engineering as a discipline. | Full adversary simulation against a battle-tested SOC. The pen test is the training ground for the SOC. Each engagement makes the detection capability stronger. The detection gap shrinks with each cycle. |
| 5. Optimised | Continuous red teaming. Attack simulation platforms. Automated technique replay against the detection stack. | Detection-as-code. Automated rule testing. Continuous validation of detection coverage against the ATT&CK matrix. | Offence and defence operate as a continuous loop. New attack techniques are tested against the detection stack within days of publication. The organisation's security posture is validated continuously, not annually. |
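"Detection-as-code" in that last row means, at its simplest, that the rule lives in version control next to a test that replays recorded attack telemetry through it. A toy sketch of the idea, using a plain Python function over normalised events rather than any SIEM's query language (the event shape and the threshold are assumptions):

```python
# Toy "detection-as-code" example: the rule is a plain function over normalised events,
# and tests replay recorded telemetry to prove the rule fires (and stays quiet) as expected.
# The event shape and the threshold of 10 tickets are illustrative assumptions.

def kerberoasting_rule(events, threshold=10):
    """Flag accounts requesting many RC4-encrypted service tickets (Windows Event ID 4769
    with ticket encryption type 0x17) within one batch of events."""
    counts = {}
    for e in events:
        if e.get("event_id") == 4769 and e.get("ticket_encryption") == "0x17":
            counts[e["account"]] = counts.get(e["account"], 0) + 1
    return [account for account, n in counts.items() if n >= threshold]

def test_rule_fires_on_recorded_kerberoast():
    # Telemetry recorded during a previous purple-team session (shape is illustrative).
    replayed = [{"event_id": 4769, "ticket_encryption": "0x17", "account": "svc_backup"}] * 12
    assert kerberoasting_rule(replayed) == ["svc_backup"]

def test_rule_stays_quiet_on_normal_traffic():
    normal = [{"event_id": 4769, "ticket_encryption": "0x12", "account": "alice"}] * 50
    assert kerberoasting_rule(normal) == []

if __name__ == "__main__":
    test_rule_fires_on_recorded_kerberoast()
    test_rule_stays_quiet_on_normal_traffic()
    print("detection rule tests passed")
```

Run in CI, the second test is what catches the silent regressions (a parser change, a renamed field) that otherwise only the next pen test would reveal.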
Understanding that pen testing and defensive security are complementary is the first step. Implementing that understanding without falling into common traps is the second.
| Misstep | The Problem | The Fix |
|---|---|---|
| "We have EDR, so the pen test is less important now" | EDR detects a category of threats — malware, exploit attempts, suspicious process behaviour. It does not detect misconfiguration exploitation, credential abuse with legitimate tools, or living-off-the-land attacks by default. The pen test reveals the threats that EDR was never designed to catch. | Commission the pen test explicitly to test whether EDR detects the techniques used. Include EDR bypass assessment in the scope. Use the results to create custom EDR rules. |
| "The pen test didn't trigger any alerts, so the SOC failed" | A skilled pen tester deliberately avoids detection — that's part of the test. If the test was scoped as a covert assessment, a low detection rate measures SOC capability against a motivated adversary. It doesn't mean the SOC "failed" — it means the SOC now has specific intelligence about what to detect. | Frame the detection gap as a learning opportunity, not a performance failure. Use the gap analysis to build new detection rules. Measure improvement across engagements. |
| "We'll do the pen test in stealth mode and not tell the SOC" | If nobody in the SOC knows the pen test is happening, and the tester triggers a real incident response — analysts working through the night, management escalation, potential data breach notification — the organisation has wasted significant resources and damaged trust. | Always brief a designated SOC liaison. They don't share details with analysts (preserving the test's realism) but they can distinguish pen test activity from a genuine breach if escalation is needed. |
| "We do pen tests annually and purple teams quarterly — that's enough" | Annual pen tests assess point-in-time posture. Quarterly purple teams improve detection. But neither tests the organisation's ability to respond to a novel, multi-week intrusion that evolves over time — which is how real advanced threats operate. | At maturity, supplement pen tests and purple teams with a red team engagement: a multi-week, objective-based exercise where the red team operates covertly with an evolving strategy, testing the full kill chain from initial access to objective completion. |
| "The pen test and the SOC are managed by different vendors — they don't talk" | The pen test report is delivered to the security manager. The SOC operates independently. Nobody cross-references the attacker's actions with the SOC's telemetry. The detection gap is never measured. The most valuable finding from the engagement — what the SOC missed — is lost. | Require the pen test provider to produce a detection gap analysis. Share the tester's timestamped action log with the SOC. Hold a joint debrief where the tester walks the SOC through the attack path and the SOC identifies where telemetry existed but rules didn't fire. |
Penetration testing and defensive security are not alternatives. They're not even adjacent disciplines that happen to coexist. They are complementary halves of the same capability: the ability to understand, detect, and respond to the attacks that threaten the organisation.
A pen test without a SOC produces a list of vulnerabilities but no understanding of whether the organisation would detect the exploitation. A SOC without a pen test produces alerts and metrics but no evidence that the detection rules work against the attacks that actually succeed. The detection gap — the delta between what the attacker did and what the SOC saw — is the metric that ties them together, and it's only measurable when both are operating.
The organisations with the strongest security posture aren't the ones that spend the most on EDR or commission the most pen tests. They're the ones that use each to make the other better: the pen test reveals what the SOC misses, the SOC provides the context that changes how pen test findings are prioritised, and together they drive a continuous cycle of testing, detection, improvement, and retesting that makes the organisation measurably harder to compromise with every iteration.
Our penetration tests include detection gap analysis as standard — timestamped attacker actions cross-referenced against your SOC's telemetry, delivering the specific intelligence your defensive team needs to close the gaps that matter.