How Penetration Testing Differs from Red Teaming, and When Each Is Appropriate

> diff pen_test.conf red_team.conf | head -20 && echo 'same tools, different objectives'

Peter Bassill | 17 January 2025 | 15 min read
Tags: red teaming, purple teaming, penetration testing, comparison, maturity, detection, security assessment

Same tools, same techniques — fundamentally different objectives.

A penetration test answers the question: "What vulnerabilities exist in this environment, and how can they be exploited?" The tester's objective is to find as many exploitable weaknesses as possible within the scope, demonstrate the risk they represent through exploitation and chain analysis, and produce a comprehensive set of findings that drives remediation.

A red team exercise answers a different question: "Can our organisation detect and respond to a realistic adversary?" The red team's objective is not to find every vulnerability — it's to simulate a realistic attack from initial access through to a defined objective, testing whether the defensive team (the blue team) detects the activity, how quickly they respond, and whether the organisation's incident response processes function under pressure.

The pen tester finds weaknesses. The red team tests defences. Both use the same tools and techniques — Responder, BloodHound, Mimikatz, Cobalt Strike or equivalent — but the purpose is different. The pen tester uses them to identify and document vulnerabilities. The red team uses them to simulate an adversary and measure the organisation's ability to detect and respond.


The differences that matter in practice.

| Dimension | Penetration Test | Red Team Exercise |
| --- | --- | --- |
| Primary objective | Identify and demonstrate exploitable vulnerabilities. Find as many weaknesses as possible within the scope. Produce a comprehensive finding set. | Test the organisation's detection and response capability against a realistic adversary simulation. Achieve a specific objective while evading detection. |
| Scope | Defined and bounded: specific network ranges, applications, or environments. The tester operates within agreed boundaries. | Typically organisation-wide. The red team may target any system, any person, any process that a real attacker would. Scope restrictions exist but are minimal. |
| Blue team awareness | The IT team and SOC are typically informed that testing is occurring. The objective is vulnerability identification, not detection testing. | The blue team (SOC, IT security) is not informed. A small control team (typically the CISO and one or two others) manages the exercise. The blue team's unawareness is essential because detection must be genuine. |
| Stealth | Not a priority. The tester may use noisy techniques (vulnerability scanning, password spraying, broad reconnaissance) because detection evasion isn't the objective. | Essential. The red team operates covertly, using the same evasion techniques a real adversary would: encrypted C2 channels, living-off-the-land binaries, custom payloads, and low-and-slow approaches designed to avoid triggering alerts. |
| Duration | Typically 5–15 days for a standard engagement. Focused on depth within a defined scope. | Typically 2–6 weeks. The longer duration allows for realistic reconnaissance, careful initial access, patient lateral movement, and the time for the blue team to detect (or fail to detect) each phase. |
| Attack vectors | Usually technical only: network exploitation, application testing, configuration assessment. Social engineering and physical access may be included but are often separate exercises. | All vectors that a real adversary would use: phishing, social engineering, physical access, technical exploitation, and supply chain targeting. The red team selects the vector most likely to succeed, just as a real attacker would. |
| Findings volume | Comprehensive. The report typically contains 20–60+ findings across the scope, covering everything from critical chains to informational configuration notes. | Selective. The report focuses on the attack path taken, the detection gaps exploited, and the defensive failures observed, not a comprehensive catalogue of every vulnerability. |
| Primary deliverable | A findings report with individual vulnerabilities, chain analysis, severity ratings, and remediation guidance. The output drives remediation. | An attack narrative showing the full adversary simulation: techniques used, detection gaps, blue team response assessment, and recommendations for improving detection and response. The output drives defensive improvement. |
| Typical cost | £8,000–£30,000 for a standard engagement depending on scope and duration. | £25,000–£80,000+ depending on scope, duration, and the inclusion of social engineering and physical access. |
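The "blue team response assessment" in a red team deliverable largely comes down to timing: for each phase of the simulated attack, was it detected, and how long did detection take? The following is a minimal Python sketch of that measurement. The `PhaseRecord` structure, the phase names, and the timestamps are all illustrative assumptions, not data from a real engagement or a standard reporting format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PhaseRecord:
    """One phase of the simulated attack (hypothetical structure)."""
    phase: str
    executed_at: datetime
    detected_at: Optional[datetime] = None  # None means the blue team never saw it


def time_to_detect(record: PhaseRecord) -> Optional[timedelta]:
    """Return the delay between execution and detection, or None if undetected."""
    if record.detected_at is None:
        return None
    return record.detected_at - record.executed_at


def summarise(records: list[PhaseRecord]) -> dict:
    """Summarise detection count, missed phases, and mean time to detect."""
    delays = [d for r in records if (d := time_to_detect(r)) is not None]
    mean = sum(delays, timedelta()) / len(delays) if delays else None
    return {
        "phases": len(records),
        "detected": len(delays),
        "undetected": [r.phase for r in records if r.detected_at is None],
        "mean_time_to_detect": mean,
    }


# Illustrative timeline: initial access was missed, later phases were caught.
timeline = [
    PhaseRecord("initial access", datetime(2025, 1, 6, 9, 0)),
    PhaseRecord("lateral movement", datetime(2025, 1, 8, 14, 0), datetime(2025, 1, 8, 18, 0)),
    PhaseRecord("objective reached", datetime(2025, 1, 10, 11, 0), datetime(2025, 1, 10, 13, 0)),
]
report = summarise(timeline)
print(report["undetected"])           # ['initial access']
print(report["mean_time_to_detect"])  # 3:00:00
```

A pen test report has no equivalent of this table, because the defenders were told the test was happening and detection was never being measured.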

Matching the assessment to the organisation's maturity.

The most common mistake organisations make is commissioning a red team exercise before they're ready for one. A red team exercise tests detection and response. If the organisation doesn't yet have effective detection — if the SOC can't detect LLMNR poisoning, if the SIEM doesn't have rules for lateral movement, if the EDR hasn't been tuned — the red team will achieve its objective without detection, and the findings will be: "Your detection doesn't work." That's an expensive way to learn something a pen test would have revealed at a fraction of the cost.

| Scenario | Appropriate Assessment | Why |
| --- | --- | --- |
| First-time testing. No previous assessments. Unknown vulnerability landscape. | Penetration test | The organisation needs a comprehensive baseline of vulnerabilities before investing in detection testing. A pen test identifies what's exploitable and produces the remediation roadmap. |
| Multiple pen tests completed. Recurring findings addressed. Core vulnerabilities remediated. Detection capability deployed but untested. | Red team exercise | The fundamentals are in place. The organisation needs to know whether its detection and response capabilities actually work against a realistic adversary. The red team provides that answer. |
| New environment (cloud migration, acquired infrastructure, new application) that hasn't been assessed. | Penetration test | New environments need vulnerability identification first. A red team exercise against an untested environment will find vulnerabilities incidentally but won't produce the comprehensive finding set needed for systematic remediation. |
| Compliance requires "penetration testing" (PCI DSS, ISO 27001, Cyber Essentials Plus, DORA general testing). | Penetration test | Regulatory frameworks typically require vulnerability identification and exploitation testing, not adversary simulation. A pen test satisfies the requirement and produces actionable findings. |
| The organisation wants to test its incident response plan against a realistic scenario. | Red team exercise | The red team provides a real (controlled) incident for the blue team to detect and respond to. The exercise validates the IR plan, the communication chain, and the team's ability to contain a live adversary. |
| Mature security programme. Strong detection. Experienced SOC. The organisation wants to improve collaboratively. | Purple team exercise | At this maturity level, the adversarial dynamic of red teaming provides less value than collaborative improvement. Purple teaming, where the red and blue teams work together, produces immediate detection rule improvements and shared understanding. |
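The scenarios above reduce to a small decision procedure. The sketch below expresses it in Python. The `Programme` record and its boolean fields are shorthand invented for this illustration, not a formal maturity model, and a real decision would weigh more context than six flags.

```python
from dataclasses import dataclass


@dataclass
class Programme:
    """Hypothetical snapshot of an organisation's testing history and goals."""
    environment_previously_tested: bool
    core_findings_remediated: bool
    detection_deployed: bool
    detection_validated: bool  # e.g. by an earlier red team exercise
    compliance_driven: bool = False
    wants_collaboration: bool = False


def recommend_assessment(p: Programme) -> str:
    """Map a programme snapshot to an assessment type, following the scenarios."""
    # Untested environments and compliance requirements call for a pen test.
    if p.compliance_driven or not p.environment_previously_tested:
        return "penetration test"
    # Without remediated fundamentals and deployed detection, a red team
    # would only report that detection does not work.
    if not (p.core_findings_remediated and p.detection_deployed):
        return "penetration test"
    # Validated detection plus a wish to improve together: purple team.
    if p.detection_validated and p.wants_collaboration:
        return "purple team exercise"
    # Fundamentals in place, detection deployed: test it adversarially.
    return "red team exercise"


first_time = Programme(False, False, False, False)
untested_detection = Programme(True, True, True, False)
mature = Programme(True, True, True, True, wants_collaboration=True)

print(recommend_assessment(first_time))          # penetration test
print(recommend_assessment(untested_detection))  # red team exercise
print(recommend_assessment(mature))              # purple team exercise
```

Note that budget and provider preference do not appear as inputs: only history, readiness, and objective drive the result.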

The collaborative evolution — where red and blue work together.

Purple teaming is not a separate team. It's a methodology where the red team (offensive) and the blue team (defensive) work collaboratively rather than adversarially. The red team executes a technique — Kerberoasting, for example. The blue team observes whether their detection fires. If it does, they test the alert quality, the response procedure, and the escalation path. If it doesn't, they build the detection rule immediately, with the red team confirming it works by re-executing the technique.
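The execute, observe, fix, re-execute cycle described above can be sketched as a loop. In this Python sketch the three callbacks (`execute`, `detection_fired`, `build_rule`) are hypothetical stand-ins for whatever the red team's tooling and the blue team's SIEM actually provide; nothing here calls a real product API.

```python
from typing import Callable


def purple_team_round(
    technique: str,
    execute: Callable[[str], None],
    detection_fired: Callable[[str], bool],
    build_rule: Callable[[str], None],
    max_attempts: int = 3,
) -> dict:
    """Repeat one technique until detection fires or attempts run out."""
    rules_built = 0
    for attempt in range(1, max_attempts + 1):
        execute(technique)              # red team runs the technique
        if detection_fired(technique):  # blue team checks for an alert
            return {"technique": technique, "detected": True,
                    "attempts": attempt, "rules_built": rules_built}
        build_rule(technique)           # gap found: write the rule now
        rules_built += 1
    return {"technique": technique, "detected": False,
            "attempts": max_attempts, "rules_built": rules_built}


# Simulated session: a set of technique names stands in for the rule base.
rules: set[str] = set()
result = purple_team_round(
    "Kerberoasting",
    execute=lambda t: None,                # stand-in for running the technique
    detection_fired=lambda t: t in rules,  # stand-in for querying the SIEM
    build_rule=rules.add,
)
print(result["detected"], result["attempts"], result["rules_built"])  # True 2 1
```

The first execution goes undetected, a rule is built, and the second execution confirms it fires. The fix and its validation happen in the same session, which is what separates this from a red team exercise.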

| Dimension | Red Team | Purple Team |
| --- | --- | --- |
| Dynamic | Adversarial. Red team operates covertly. Blue team tries to detect. Results assessed after the exercise. | Collaborative. Red and blue teams work together in real time. Techniques are executed, detection is tested, and improvements are made during the session. |
| Blue team awareness | Unaware. Detection must be genuine. | Fully aware and participating. Detection is tested and improved collaboratively. |
| Outcome | Assessment of current detection and response capability. Identifies gaps but doesn't fix them during the exercise. | Immediate improvement. Detection rules created and validated during the session. The blue team leaves with new capabilities. |
| Duration | 2–6 weeks. | 1–5 days per session, typically quarterly. |
| Best for | Measuring current detection capability. Testing IR processes. Providing a realistic adversary simulation. | Improving detection capability. Building MITRE ATT&CK coverage. Training the blue team against specific techniques. Producing immediate, measurable improvement. |
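"Building MITRE ATT&CK coverage" is measurable if each session records which techniques were exercised and whether a detection fired. Below is a minimal Python sketch of that bookkeeping. The session results are invented for illustration, and the simple ratio is an assumption of this example rather than an official ATT&CK metric.

```python
def coverage(results: dict[str, bool]) -> float:
    """Fraction of exercised techniques that produced a working detection."""
    if not results:
        return 0.0
    return sum(results.values()) / len(results)


def gaps(results: dict[str, bool]) -> list[str]:
    """Technique IDs that were exercised but not detected, sorted."""
    return sorted(t for t, detected in results.items() if not detected)


# Hypothetical session results: ATT&CK technique ID -> did a detection fire?
session = {
    "T1558.003": False,  # Kerberoasting
    "T1557.001": True,   # LLMNR/NBT-NS poisoning
    "T1110.003": False,  # password spraying
    "T1003.001": True,   # LSASS memory credential dumping
}
print(coverage(session))  # 0.5
print(gaps(session))      # ['T1110.003', 'T1558.003']
```

Tracking the same dictionary across quarterly sessions gives the "measurable improvement" a purple team programme is meant to produce.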

How the three approaches build on each other over time.

The three assessment types (pen testing, red teaming, and purple teaming) form a maturity progression. Each builds on the one before it, and attempting to skip stages yields diminishing returns.

Years 1–2: Penetration Testing
Establish the baseline. Identify vulnerabilities. Remediate the quick wins. Build the remediation process. Start tracking metrics. The objective is to find and fix the most exploitable weaknesses — reducing the attack surface before testing detection. Without this foundation, red teaming produces only the finding that "everything is exploitable" — which isn't actionable.
Years 2–3: Continued Pen Testing + First Red Team
Continue pen testing against new and expanding scope. Once the core vulnerabilities are remediated and detection capability is deployed, commission the first red team exercise. The exercise answers: does the detection work? How quickly does the SOC respond? Does the IR plan function under pressure? The pen test provided the baseline. The red team tests whether the defences built on that baseline actually work.
Years 3–4: Red Teaming + Purple Teaming
Red team exercises every 12–18 months provide ongoing assessment of detection and response capability. Purple team sessions quarterly provide continuous improvement — building new detection rules, expanding MITRE ATT&CK coverage, and training the blue team against emerging techniques. Pen testing continues for new environments and scope areas.
Year 5+: Mature Programme
All three assessment types operating at appropriate frequencies: pen testing annually for vulnerability identification across rotating scope, red teaming every 12–18 months for adversary simulation, and purple teaming quarterly for continuous detection improvement. Each type serves a different purpose. Each produces different value. The mature programme uses all three.
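The cadence of a mature programme can be turned into concrete due dates. The Python sketch below assumes the intervals stated above (annual pen testing, red teaming at the upper end of the 12–18 month range, quarterly purple teaming). The dates of the last runs are invented for illustration.

```python
import calendar
from datetime import date

# Intervals in months, taken from the mature-programme cadence.
# The red team value uses the upper bound of the 12-18 month range.
INTERVAL_MONTHS = {"pen test": 12, "red team": 18, "purple team": 3}


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def next_due(last_run: dict[str, date]) -> dict[str, date]:
    """Return the latest acceptable date for the next run of each assessment."""
    return {kind: add_months(when, INTERVAL_MONTHS[kind])
            for kind, when in last_run.items()}


# Hypothetical dates of the most recent run of each assessment type.
last_run = {
    "pen test": date(2025, 1, 17),
    "red team": date(2024, 6, 30),
    "purple team": date(2024, 11, 30),
}
due = next_due(last_run)
print(due["pen test"])     # 2026-01-17
print(due["red team"])     # 2025-12-30
print(due["purple team"])  # 2025-02-28
```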

Choosing the right assessment for where you are now.

Assess Your Maturity Honestly
If the organisation hasn't completed multiple pen tests, remediated the critical findings, and deployed detection capability, it's not ready for a red team exercise. A red team against an immature environment produces an expensive report that says "your detection doesn't work", something a pen test would have revealed for a fraction of the cost.
Match the Assessment to the Objective
Want to know what's exploitable? Pen test. Want to know if your defences work? Red team. Want to improve your defences collaboratively? Purple team. The objective determines the assessment type — not the budget, not the provider's recommendation, and not the desire to appear more mature than the programme currently is.
Don't Skip Stages
The maturity progression exists for a reason. Red teaming without prior pen testing is testing defences that haven't been hardened. Purple teaming without prior red teaming is improving detection without first understanding where it fails. Each stage builds the foundation for the next.
Use All Three at Maturity
A mature programme doesn't replace pen testing with red teaming — it uses both. Pen testing continues to identify vulnerabilities in new and expanding scope. Red teaming periodically validates detection and response. Purple teaming continuously improves both. Each type answers a different question, and the organisation needs all the answers.
Beware Mislabelling
A pen test with a "red team" label costs more but doesn't deliver adversary simulation. Ask the provider: will the blue team be unaware? Will the engagement include social engineering and physical access? Will the report assess detection and response, not just vulnerabilities? If the answers are no, it's a pen test — regardless of what it's called on the proposal.
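The three questions for a provider work as a simple checklist. The Python sketch below uses a hypothetical `Proposal` record. It treats any single "no" as a warning sign, which is a stricter reading than "if the answers are no" and is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """A provider's answers to the three questions (hypothetical structure)."""
    blue_team_unaware: bool
    includes_social_and_physical: bool
    report_assesses_detection_and_response: bool


def missing_red_team_traits(p: Proposal) -> list[str]:
    """List the red team characteristics a proposal lacks."""
    checks = {
        "blue team is unaware": p.blue_team_unaware,
        "social engineering and physical access included": p.includes_social_and_physical,
        "report assesses detection and response": p.report_assesses_detection_and_response,
    }
    return [name for name, present in checks.items() if not present]


def label_for(p: Proposal) -> str:
    """Name the engagement by what it delivers, not by what the proposal calls it."""
    return "red team exercise" if not missing_red_team_traits(p) else "penetration test"


relabelled = Proposal(blue_team_unaware=False,
                      includes_social_and_physical=False,
                      report_assesses_detection_and_response=False)
print(label_for(relabelled))  # penetration test
```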

The bottom line.

Penetration testing finds weaknesses. Red teaming tests defences. Purple teaming improves defences collaboratively. The tools overlap. The techniques overlap. The objectives don't. A pen test produces a comprehensive vulnerability report that drives remediation. A red team exercise produces an adversary simulation that measures detection and response. A purple team session produces immediate detection improvements through collaborative testing.

The right assessment depends on the organisation's maturity: pen testing first to establish the baseline and remediate the fundamentals, red teaming once detection capability is deployed and needs validation, and purple teaming once the programme is mature enough to benefit from collaborative improvement. Attempting to skip stages — commissioning a red team before the fundamentals are addressed — produces expensive findings that could have been discovered more cheaply.

The question isn't "which is better?" — it's "which is right for where we are now?" The answer should be based on maturity, objectives, and readiness — not on which label sounds most impressive in the board report.

