> diff pen_test.conf red_team.conf | head -20 && echo 'same tools, different objectives'_
A penetration test answers the question: "What vulnerabilities exist in this environment, and how can they be exploited?" The tester's objective is to find as many exploitable weaknesses as possible within the scope, demonstrate the risk they represent through exploitation and chain analysis, and produce a comprehensive set of findings that drives remediation.
A red team exercise answers a different question: "Can our organisation detect and respond to a realistic adversary?" The red team's objective is not to find every vulnerability — it's to simulate a realistic attack from initial access through to a defined objective, testing whether the defensive team (the blue team) detects the activity, how quickly they respond, and whether the organisation's incident response processes function under pressure.
The pen tester finds weaknesses. The red team tests defences. Both use the same tools and techniques — Responder, BloodHound, Mimikatz, Cobalt Strike or equivalent — but the purpose is different. The pen tester uses them to identify and document vulnerabilities. The red team uses them to simulate an adversary and measure the organisation's ability to detect and respond.
| Dimension | Penetration Test | Red Team Exercise |
|---|---|---|
| Primary objective | Identify and demonstrate exploitable vulnerabilities. Find as many weaknesses as possible within the scope. Produce a comprehensive finding set. | Test the organisation's detection and response capability against a realistic adversary simulation. Achieve a specific objective while evading detection. |
| Scope | Defined and bounded — specific network ranges, applications, or environments. The tester operates within agreed boundaries. | Typically organisation-wide. The red team may target any system, any person, any process that a real attacker would. Scope restrictions exist but are minimal. |
| Blue team awareness | The IT team and SOC are typically informed that testing is occurring. The objective is vulnerability identification, not detection testing. | The blue team (SOC, IT security) is not informed. A small control team (typically the CISO and one or two others) manages the exercise. The blue team's unawareness is essential — detection must be genuine. |
| Stealth | Not a priority. The tester may use noisy techniques — vulnerability scanning, password spraying, broad reconnaissance — because detection evasion isn't the objective. | Essential. The red team operates covertly, using the same evasion techniques a real adversary would: encrypted C2 channels, living-off-the-land binaries, custom payloads, and slow-and-low approaches designed to avoid triggering alerts. |
| Duration | Typically 5–15 days for a standard engagement. Focused on depth within a defined scope. | Typically 2–6 weeks. The longer duration allows for realistic reconnaissance, careful initial access, and patient lateral movement, and gives the blue team time to detect (or fail to detect) each phase. |
| Attack vectors | Usually technical only — network exploitation, application testing, configuration assessment. Social engineering and physical access may be included but are often separate exercises. | All vectors that a real adversary would use: phishing, social engineering, physical access, technical exploitation, and supply chain targeting. The red team selects the vector most likely to succeed, just as a real attacker would. |
| Findings volume | Comprehensive. The report typically contains 20–60+ findings across the scope, covering everything from critical chains to informational configuration notes. | Selective. The report focuses on the attack path taken, the detection gaps exploited, and the defensive failures observed — not a comprehensive catalogue of every vulnerability. |
| Primary deliverable | A findings report with individual vulnerabilities, chain analysis, severity ratings, and remediation guidance. The output drives remediation. | An attack narrative showing the full adversary simulation — techniques used, detection gaps, blue team response assessment, and recommendations for improving detection and response. The output drives defensive improvement. |
| Typical cost | £8,000–£30,000 for a standard engagement depending on scope and duration. | £25,000–£80,000+ depending on scope, duration, and the inclusion of social engineering and physical access. |
The most common mistake organisations make is commissioning a red team exercise before they're ready for one. A red team exercise tests detection and response. If the organisation doesn't yet have effective detection — if the SOC can't detect LLMNR poisoning, if the SIEM doesn't have rules for lateral movement, if the EDR hasn't been tuned — the red team will achieve its objective without detection, and the findings will be: "Your detection doesn't work." That's an expensive way to learn something a pen test would have revealed at a fraction of the cost.
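If "can the SOC detect LLMNR poisoning" feels abstract, it is a testable question. The sketch below shows one common check a defender can run: send an LLMNR query for a name that should not exist and treat any reply as a likely poisoner, because no legitimate host has a reason to answer. This is a minimal Python sketch under simplified assumptions (a single probe, the default interface, no alerting or SIEM integration), not a production detection.

```python
# Minimal sketch, not a production detector: checks whether something on the
# local segment answers LLMNR queries for names that should not exist --
# classic behaviour of a poisoner such as Responder.
import random
import socket
import string
import struct

LLMNR_GROUP = "224.0.0.252"   # LLMNR link-local multicast address
LLMNR_PORT = 5355

def build_llmnr_query(name: str) -> bytes:
    """Build a minimal LLMNR A-record query (same wire format as DNS)."""
    header = struct.pack(">HHHHHH", random.randint(0, 0xFFFF), 0, 1, 0, 0, 0)
    qname = b"".join(bytes([len(label)]) + label.encode() for label in name.split("."))
    return header + qname + b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN

def probe_for_poisoner(timeout: float = 3.0) -> None:
    # Ask for a random, nonexistent hostname. No legitimate host should answer,
    # so any reply strongly suggests a poisoner is active on the segment.
    bogus = "".join(random.choices(string.ascii_lowercase, k=12))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_llmnr_query(bogus), (LLMNR_GROUP, LLMNR_PORT))
        _, addr = sock.recvfrom(4096)
        print(f"Reply for bogus name '{bogus}' from {addr[0]} -- possible LLMNR poisoner")
    except socket.timeout:
        print("No reply to the probe -- no active poisoner observed this time")
    finally:
        sock.close()

if __name__ == "__main__":
    probe_for_poisoner()
```

Dedicated detection tooling applies the same idea continuously and more robustly; the point here is simply that detection readiness can be measured before a red team is commissioned to prove its absence.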
| Scenario | Appropriate Assessment | Why |
|---|---|---|
| First-time testing. No previous assessments. Unknown vulnerability landscape. | Penetration test | The organisation needs a comprehensive baseline of vulnerabilities before investing in detection testing. A pen test identifies what's exploitable and produces the remediation roadmap. |
| Multiple pen tests completed. Recurring findings addressed. Core vulnerabilities remediated. Detection capability deployed but untested. | Red team exercise | The fundamentals are in place. The organisation needs to know whether its detection and response capabilities actually work against a realistic adversary. The red team provides that answer. |
| New environment — cloud migration, acquired infrastructure, new application — that hasn't been assessed. | Penetration test | New environments need vulnerability identification first. A red team exercise against an untested environment will find vulnerabilities incidentally but won't produce the comprehensive finding set needed for systematic remediation. |
| Compliance requires "penetration testing" — PCI DSS, ISO 27001, Cyber Essentials Plus, DORA general testing. | Penetration test | Regulatory frameworks typically require vulnerability identification and exploitation testing — not adversary simulation. A pen test satisfies the requirement and produces actionable findings. |
| The organisation wants to test its incident response plan against a realistic scenario. | Red team exercise | The red team provides a real (controlled) incident for the blue team to detect and respond to. The exercise validates the IR plan, the communication chain, and the team's ability to contain a live adversary. |
| Mature security programme. Strong detection. Experienced SOC. The organisation wants to improve collaboratively. | Purple team exercise | At this maturity level, the adversarial dynamic of red teaming provides less value than collaborative improvement. Purple teaming — where the red and blue teams work together — produces immediate detection rule improvements and shared understanding. |
Purple teaming is not a separate team. It's a methodology where the red team (offensive) and the blue team (defensive) work collaboratively rather than adversarially. The red team executes a technique — Kerberoasting, for example. The blue team observes whether their detection fires. If it does, they test the alert quality, the response procedure, and the escalation path. If it doesn't, they build the detection rule immediately, with the red team confirming it works by re-executing the technique.
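To make that loop concrete, the sketch below shows the shape of a detection the blue team might build for Kerberoasting: flag Windows Security event 4769 (a Kerberos service ticket was requested) where the ticket uses RC4 encryption (type 0x17) and the requested service is not a machine account. It is a minimal Python sketch over a hypothetical JSON-lines export of security events rather than a real SIEM rule; the field names follow the Windows event schema, but the file path and export format are assumptions.

```python
# Minimal sketch of a Kerberoasting detection built during a purple team
# session. It reads a hypothetical JSON-lines export of Windows Security
# events and flags event 4769 where the ticket uses RC4 (encryption type
# 0x17) and the target service is not a machine account -- the classic
# Kerberoasting signal.
import json

def looks_like_kerberoasting(event: dict) -> bool:
    if event.get("EventID") != 4769:
        return False
    service = event.get("ServiceName", "")
    enc_type = str(event.get("TicketEncryptionType", "")).lower()
    # RC4-HMAC service tickets requested for ordinary (non-machine) accounts.
    return enc_type == "0x17" and not service.endswith("$") and service.lower() != "krbtgt"

def scan(path: str) -> None:
    with open(path) as fh:
        for line in fh:
            event = json.loads(line)
            if looks_like_kerberoasting(event):
                requester = event.get("TargetUserName", "unknown")
                print(f"Possible Kerberoasting: {requester} requested an RC4 "
                      f"ticket for {event.get('ServiceName')}")

if __name__ == "__main__":
    scan("security_events.jsonl")  # hypothetical export path; adjust to your SIEM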
| Dimension | Red Team | Purple Team |
|---|---|---|
| Dynamic | Adversarial. Red team operates covertly. Blue team tries to detect. Results assessed after the exercise. | Collaborative. Red and blue teams work together in real time. Techniques are executed, detection is tested, and improvements are made during the session. |
| Blue team awareness | Unaware. Detection must be genuine. | Fully aware and participating. Detection is tested and improved collaboratively. |
| Outcome | Assessment of current detection and response capability. Identifies gaps but doesn't fix them during the exercise. | Immediate improvement. Detection rules created and validated during the session. The blue team leaves with new capabilities. |
| Duration | 2–6 weeks. | 1–5 days per session, typically quarterly. |
| Best for | Measuring current detection capability. Testing IR processes. Providing a realistic adversary simulation. | Improving detection capability. Building MITRE ATT&CK coverage. Training the blue team against specific techniques. Producing immediate, measurable improvement. |
The three assessment types — pen testing, red teaming, and purple teaming — form a maturity progression, with each building on the foundations the previous one establishes.
Penetration testing finds weaknesses. Red teaming tests defences. Purple teaming improves defences collaboratively. The tools overlap. The techniques overlap. The objectives don't. A pen test produces a comprehensive vulnerability report that drives remediation. A red team exercise produces an adversary simulation that measures detection and response. A purple team session produces immediate detection improvements through collaborative testing.
The right assessment depends on the organisation's maturity: pen testing first to establish the baseline and remediate the fundamentals, red teaming once detection capability is deployed and needs validation, and purple teaming once the programme is mature enough to benefit from collaborative improvement. Attempting to skip stages — commissioning a red team before the fundamentals are addressed — produces expensive findings that could have been discovered more cheaply.
The question isn't "which is better?" — it's "which is right for where we are now?" The answer should be based on maturity, objectives, and readiness — not on which label sounds most impressive in the board report.
We help organisations select and commission the right assessment type for their current maturity: pen testing for vulnerability identification, red teaming for adversary simulation, and purple teaming for collaborative detection improvement — because the most valuable assessment is the one that's appropriate for where you are now.