> diff -u year1.json year5.json | grep '^+' | wc -l && echo 'compounding improvement'_
Most organisations commission their first penetration test because they have to — a compliance requirement, an insurance condition, a client contract, or a board directive after a competitor's breach. The test happens. The report arrives. Some findings are fixed. The report is filed. Twelve months later, the cycle repeats.
This is penetration testing as an event. It produces a snapshot — a point-in-time assessment that captures the organisation's security posture during the testing window. The snapshot has value: it identifies vulnerabilities, demonstrates compliance, and provides the evidence that auditors and insurers require. But each snapshot is independent. The tester starts from scratch. Findings from the previous year may recur. Progress is difficult to measure. The board sees a new set of numbers each year and has no way to determine whether the security programme is improving.
A penetration testing programme is different. It's a multi-year strategy where each engagement builds on the previous one. The scope evolves. The metrics compound. The provider understands the environment's history. Remediation is tracked across engagements, not just within them. The board sees a trajectory — not a snapshot — and can evaluate whether the security investment is producing returns.
The difference isn't cost. A programme costs roughly the same as annual tests. The difference is intent: the organisation treats penetration testing as an input to a continuous improvement cycle, not as a compliance obligation that gets discharged once a year.
| Year | Focus | What Changes | What the Board Sees |
|---|---|---|---|
| Year 1: Baseline | Internal infrastructure pen test. Establish the current state. Identify the most critical attack paths. Build the remediation tracker. | First-time engagement. Full discovery. The tester maps the environment, identifies the chains, and produces the initial remediation roadmap. The organisation learns what its actual risk looks like — often for the first time. | "Here's where we are. The tester reached Domain Admin in 2 hours. Here are the three quick wins that break the chain, and here's the 6-month programme to address the systemic issues." |
| Year 2: Validate and Expand | Retest previous findings. Expand scope to include web applications or cloud. Measure improvement against baseline. | The tester receives last year's report and tracker. They validate remediations, identify recurring findings, and test new scope areas. The first longitudinal comparison is produced — the beginning of the trend. | "Recurring findings dropped from 34 to 6. Time to DA increased from 2 hours to 2 days. Detection rate improved from 0% to 44%. Here's the updated roadmap for Year 3." |
| Year 3: Deepen and Diversify | Add social engineering or physical access testing. Introduce red team exercises for the first time. Test detection and response, not just prevention. | The programme moves beyond vulnerability finding into detection testing. The SOC is assessed alongside the infrastructure. Red team exercises simulate realistic adversary behaviour over extended periods — testing the organisation's ability to detect and respond, not just resist. | "The SOC detected 7 of 8 tester actions. Mean detection time was 47 minutes. DA was not achieved within the 10-day window. Two areas need investment: cloud security and phishing resilience." |
| Year 4: Optimise | Purple team sessions where testers and defenders collaborate. Targeted testing of new infrastructure (cloud migration, M&A integration). Scenario-based testing (ransomware simulation, insider threat). | The programme is mature enough to move from adversarial testing to collaborative improvement. Purple team sessions let the SOC watch the tester's techniques in real time and tune detection rules live. Scenario-based testing validates the organisation's resilience against specific threats that matter to the business. | "Purple team session resulted in 14 new detection rules. Ransomware simulation showed containment within 23 minutes. Cloud assessment identified 3 critical misconfigurations in the new AWS environment." |
| Year 5: Continuous Assurance | Rotating scope across the full estate. Continuous vulnerability management integrated with periodic deep testing. Bug bounty programme for external perspective. Testing embedded in the SDLC. | The programme is self-sustaining. Testing is continuous, not annual. Every major change triggers a targeted assessment. The bug bounty provides ongoing external validation. Pen testing is embedded in the development lifecycle — new applications are tested before release. | "No tester has achieved DA in 18 months. Detection rate is 91%. Mean remediation time is 11 days. The programme is producing compounding returns on a stable budget." |
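The longitudinal comparison that Year 2 onwards depends on does not require specialist tooling. Below is a minimal sketch in Python, assuming each engagement's findings are exported as a JSON list with `title` and `severity` fields (an assumed format, not a standard), that surfaces recurring, resolved, and new findings between two engagements.

```python
"""Minimal sketch: compare two engagements' findings exports to quantify
year-on-year change. Assumes each file is a JSON list of findings with
'title' and 'severity' fields -- a hypothetical format, not a standard."""
import json
from collections import Counter


def load_findings(path):
    with open(path) as f:
        return json.load(f)


def compare(previous_path, current_path):
    previous = load_findings(previous_path)
    current = load_findings(current_path)

    prev_titles = {f["title"] for f in previous}
    curr_titles = {f["title"] for f in current}

    recurring = prev_titles & curr_titles   # reported again this year
    resolved = prev_titles - curr_titles    # no longer present
    new = curr_titles - prev_titles         # new findings or newly tested scope

    print(f"Recurring findings: {len(recurring)}")
    print(f"Resolved since last engagement: {len(resolved)}")
    print(f"New findings: {len(new)}")
    print(f"Current findings by severity: {dict(Counter(f['severity'] for f in current))}")


if __name__ == "__main__":
    compare("year1.json", "year2.json")
```

Matching findings by title is naive, since different providers word the same issue differently; a shared remediation tracker with stable finding identifiers, maintained across engagements, is the more robust basis for these numbers.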
This isn't a rigid framework — every organisation's journey is different. But the pattern is consistent: baseline, validate, expand, deepen, optimise. Each year builds on the previous one. The scope expands. The techniques become more sophisticated. The organisation's defensive capability matures in response.
Testing the same scope with the same methodology every year produces diminishing returns. The first test finds the easy wins. The second validates the fixes. By the third, the tester is re-covering familiar ground. A mature programme deliberately rotates and expands scope to maintain the value of each engagement.
| Scope Area | When to Introduce | What It Tests |
|---|---|---|
| Internal infrastructure | Year 1 onwards — the foundation of every programme. | Active Directory, network services, workstation configuration, credential hygiene, lateral movement paths, privilege escalation, and detection capability. |
| External infrastructure | Year 1 or 2 — often required by compliance frameworks. | Internet-facing services, VPN endpoints, web servers, email gateways, cloud-hosted resources. The attack surface visible to any adversary. |
| Web applications | Year 2 — once infrastructure fundamentals are addressed. | OWASP Top 10, authentication and session management, authorisation, input validation, API security. Application-layer vulnerabilities that infrastructure testing doesn't cover. |
| Cloud environments | Year 2–3, or whenever cloud adoption reaches meaningful scale. | IAM configuration, storage permissions, network controls, serverless function security, container security, multi-tenancy isolation. |
| Social engineering | Year 3 — once technical controls are maturing. | Phishing campaigns, vishing (telephone), physical access, USB drops, pretexting. Tests the human layer that technical controls can't fully address. |
| Red team exercises | Year 3–4 — once the SOC and detection capability are established. | Realistic adversary simulation over days or weeks. Tests the full kill chain: initial access, persistence, lateral movement, data exfiltration, and — critically — whether the SOC detects and responds. |
| Purple team sessions | Year 4 — once the programme is mature enough for collaborative improvement. | Testers and defenders work together. The tester runs techniques while the SOC watches, detects, and tunes. Produces immediate detection improvements rather than waiting for the report. |
| Scenario-based testing | Year 4–5 — tailored to the organisation's specific threat model. | Ransomware simulation, insider threat, supply chain compromise, targeted APT simulation. Tests resilience against the threats that the board identifies as most significant to the business. |
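A rotation like this is easier to hold to when it is written down as data rather than renegotiated each year. The sketch below encodes the example arc from the table; the year-to-scope mapping is illustrative, not prescriptive.

```python
"""Illustrative only: the multi-year scope rotation written down as data,
so the plan outlives staff changes and can be reviewed alongside the
remediation tracker. The mapping mirrors the example arc in the table
above; it is not a prescription."""

SCOPE_PLAN = {
    1: ["internal infrastructure", "external infrastructure"],
    2: ["retest of previous findings", "web applications"],
    3: ["cloud environments", "social engineering", "first red team exercise"],
    4: ["purple team sessions", "scenario-based testing", "targeted cloud / M&A testing"],
    5: ["rotating deep dives", "bug bounty", "SDLC-embedded testing"],
}


def scope_for(programme_year: int) -> list[str]:
    """Return the planned scope, falling back to the mature-state rotation."""
    return SCOPE_PLAN.get(programme_year, SCOPE_PLAN[max(SCOPE_PLAN)])


for year in range(1, 7):
    print(f"Year {year}: {', '.join(scope_for(year))}")
```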
Organisations debate whether to use the same pen test provider year after year or rotate providers for fresh perspectives. Both approaches have genuine advantages — and a mature programme can capture the benefits of each.
| Approach | Advantages | Disadvantages |
|---|---|---|
| Same provider, multi-year | Deep environmental knowledge. Consistent methodology and reporting format. Longitudinal comparison built in. The tester knows the history — they can push deeper each year instead of re-learning the environment. Retesting is more efficient. | Potential for familiarity bias — the tester may follow the same paths and miss areas they've unconsciously deprioritised. The organisation may become comfortable with a specific approach. |
| Rotating providers | Fresh perspective. Different methodologies, tools, and techniques. Findings that the previous provider missed may be discovered. Prevents complacency. | The new provider starts from scratch — the first engagement produces less depth. Reporting formats differ, making longitudinal comparison harder. Environmental context must be rebuilt. |
| Hybrid approach | Primary provider for 2–3 years, then rotate. Or: primary provider for infrastructure, different provider for web applications or red teaming. Each provider brings depth in their speciality. Fresh perspective arrives periodically without losing continuity. | More complex to manage. Requires clear communication between providers when sharing previous reports and remediation trackers. |
The hybrid approach is typically the most effective for a mature programme. A primary provider delivers continuity, longitudinal metrics, and deepening environmental knowledge over a two- to three-year cycle. A secondary provider — or periodic rotation — introduces fresh perspective, different methodologies, and the constructive challenge that comes from a new set of eyes. The key is ensuring that every provider receives the full history: previous reports, remediation trackers, and retest results.
Penetration testing, red teaming, and purple teaming are not interchangeable — they serve different purposes and are appropriate at different stages of the programme's maturity. A penetration test aims for breadth: find and prioritise as many exploitable weaknesses as possible within a defined scope. A red team exercise aims for realism: a handful of objectives pursued stealthily over days or weeks, testing whether the SOC detects and responds. A purple team session aims for improvement: testers and defenders work side by side, tuning detection as each technique is run.
A mature programme uses all three, at different frequencies: annual pen testing for ongoing vulnerability identification, periodic red teaming (every 12–18 months) for full-spectrum resilience testing, and quarterly purple team sessions for continuous detection improvement. Each type feeds into the others: pen test findings inform red team scenarios, red team results identify detection gaps, and purple team sessions close those gaps.
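That cadence can also live as data. The sketch below uses illustrative intervals (12 months for pen testing, 18 for red teaming, 3 for purple team sessions) and a placeholder last-run date to derive when each engagement type is next due.

```python
"""A sketch of the cadence above expressed as data: annual pen testing,
red teaming every 12-18 months, quarterly purple team sessions. The
intervals and the last-run date are illustrative placeholders."""
from datetime import date

CADENCE_MONTHS = {
    "penetration test": 12,
    "red team exercise": 18,   # anywhere in the 12-18 month window
    "purple team session": 3,
}


def next_due(last_run: date, interval_months: int) -> date:
    """Add a whole number of months, clamping the day to avoid month-end issues."""
    month_index = last_run.month - 1 + interval_months
    return date(last_run.year + month_index // 12, month_index % 12 + 1, min(last_run.day, 28))


last_run = date(2024, 1, 15)  # placeholder: the last time each engagement type ran
for engagement, interval in CADENCE_MONTHS.items():
    print(f"{engagement}: next due {next_due(last_run, interval)}")
```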
The metrics that matter evolve as the programme matures. Early-stage metrics focus on finding and fixing. Mid-stage metrics focus on detection and response. Late-stage metrics focus on resilience and efficiency.
| Maturity Stage | Key Metrics | What They Demonstrate |
|---|---|---|
| Early (Year 1–2) | Finding count by severity. Recurring findings from previous engagement. Mean time to remediate critical and high findings. Remediation success rate (self-verified and retest-confirmed). | The organisation is finding and fixing vulnerabilities. Remediation is tracked and verified. The baseline is established. |
| Mid (Year 3–4) | Time to tester objective (increasing). SOC detection rate (percentage of tester actions detected). Mean time to detect (decreasing). Chain break effectiveness (are previous chains still viable?). | The organisation is detecting and responding to attacks. The environment is getting harder to compromise. Detection gaps are being closed systematically. |
| Mature (Year 5+) | Red team objective achievement rate (should decrease). Time to containment during red team exercises. Purple team detection coverage (percentage of ATT&CK techniques detected). Cost per finding remediated (efficiency metric). Resilience score (composite of detection, prevention, and response). | The organisation is resilient. The security programme is efficient. Investment is producing compounding returns. The board can quantify the return on security investment. |
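Most of these metrics fall out of a simple engagement log rather than a dedicated platform. The sketch below computes detection rate, mean time to detect, mean time to remediate, and ATT&CK technique coverage from hypothetical records; the field names and sample values are placeholders for whatever the tracker and the SOC actually capture.

```python
"""Minimal sketch of mid- and late-stage metrics computed from a hypothetical
engagement log. Field names and sample values are placeholders for whatever
the tracker and the SOC actually record."""
from statistics import mean

tester_actions = [
    {"technique": "T1557.001", "detected": True,  "minutes_to_detect": 31},
    {"technique": "T1021.002", "detected": True,  "minutes_to_detect": 55},
    {"technique": "T1003.001", "detected": False, "minutes_to_detect": None},
]

findings = [
    {"severity": "critical", "days_to_remediate": 9},
    {"severity": "high",     "days_to_remediate": 14},
    {"severity": "high",     "days_to_remediate": None},  # still open
]

detected = [a for a in tester_actions if a["detected"]]
detection_rate = len(detected) / len(tester_actions)
mean_time_to_detect = mean(a["minutes_to_detect"] for a in detected)

closed = [f["days_to_remediate"] for f in findings if f["days_to_remediate"] is not None]
technique_coverage = len({a["technique"] for a in detected}) / len({a["technique"] for a in tester_actions})

print(f"Detection rate: {detection_rate:.0%}")
print(f"Mean time to detect: {mean_time_to_detect:.0f} minutes")
print(f"Mean time to remediate (closed findings only): {mean(closed):.0f} days")
print(f"ATT&CK technique coverage: {technique_coverage:.0%}")
```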
Not every organisation that commissions annual pen tests has a programme. Some have been testing for years without progressing. The signs are recognisable — and addressing them is the first step toward programme maturity.
| Immature Pattern | What It Looks Like | The Programme Response |
|---|---|---|
| Same findings every year | LLMNR poisoning, unsigned SMB, weak passwords, and missing patches appear in every report for three consecutive years. The tracker shows them remediated. The next test finds them again. | Investigate why remediations aren't persisting. Implement change controls that prevent recurrence. Commission retests between annual engagements to catch regression early. |
| No tester continuity | Different provider every year. Each starts from scratch. No longitudinal comparison. No trend data. The board sees new numbers each year with no context. | Adopt the hybrid model: primary provider for 2–3 years for continuity, periodic rotation for fresh perspective. Ensure every provider receives previous reports and trackers. |
| Scope never changes | Internal infrastructure pen test every year, same scope, same methodology. Web applications, cloud, social engineering, and detection capability are never tested. | Develop a multi-year scope plan that rotates across the full estate. Year 1: internal. Year 2: web apps + retest. Year 3: cloud + social engineering. Year 4: red team. |
| No remediation tracking | The report arrives, some findings are fixed, the report is filed. There's no tracker. Nobody knows which findings from two years ago were resolved. | Implement a remediation tracker from the first engagement. Maintain it across all engagements. Provide it to every new tester. Track metrics quarterly. |
| Testing is compliance-driven only | The pen test is commissioned because PCI DSS requires it or the insurer demands it. The test scope is the minimum to satisfy the requirement. The report is submitted and filed. | Reframe pen testing as risk management, not compliance. The compliance requirement is the floor. The programme should exceed it — testing beyond the mandated scope to address the organisation's actual risk profile. |
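The remediation tracker the table keeps returning to does not need to be sophisticated to be effective. Below is a minimal sketch with illustrative columns and sample rows: one row per finding per engagement, kept across years, with anything marked remediated and then reported again flagged as a recurrence.

```python
"""A minimal remediation tracker sketch: one row per finding per engagement,
kept across years so recurrences are visible. Columns and sample rows are
illustrative; adapt them to the provider's report format."""
import csv

# engagement_id, finding, severity, status  (status: open / remediated / accepted)
ROWS = """\
2023-internal,LLMNR poisoning,high,remediated
2023-internal,Unsigned SMB,high,remediated
2024-internal,LLMNR poisoning,high,open
2024-internal,Missing patches (MS17-010),critical,open
"""


def recurring_findings(rows):
    """Findings marked remediated in an earlier engagement but reported again later."""
    previously_remediated = set()
    recurrences = []
    for row in rows:  # rows are assumed to be in chronological order
        if row["finding"] in previously_remediated and row["status"] == "open":
            recurrences.append(row)
        if row["status"] == "remediated":
            previously_remediated.add(row["finding"])
    return recurrences


reader = csv.DictReader(
    ROWS.splitlines(),
    fieldnames=["engagement_id", "finding", "severity", "status"],
)
for row in recurring_findings(list(reader)):
    print(f"Recurred in {row['engagement_id']}: {row['finding']} ({row['severity']})")
```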
A penetration test is a snapshot. A penetration testing programme is a time-lapse — each frame building on the last, each year's engagement advancing the organisation's security posture further than the previous one. The snapshot tells you where you are. The time-lapse tells you where you've been, where you're going, and whether the investment is working.
The maturity journey follows a predictable arc: baseline the current state, validate remediation and expand scope, introduce detection testing and adversary simulation, move to collaborative improvement and scenario-based exercises, and embed testing into continuous operations. Each stage produces compounding returns: recurring findings decrease, detection rates improve, time to compromise increases, and the organisation's resilience against real-world threats becomes measurable and demonstrable.
The difference between an organisation that tests annually and an organisation that runs a programme isn't the budget — it's the intent. The programme treats every engagement as a chapter in a continuing story. The reports are read together, not in isolation. The metrics are tracked as trends, not single points. And the result, after three to five years, is an organisation that can demonstrate — with evidence, to the board, the auditor, the insurer, and the regulator — that its security programme is working.
We work with organisations to build testing programmes that evolve over years — from initial baseline through red teaming and purple team sessions — delivering the longitudinal metrics and compounding improvement that demonstrate security investment is producing returns.