> diff findings_v1.json findings_v2.json | grep 'remediated' | wc -l && echo 'verified, not assumed'_
The pen test report identified 34 findings. The IT team spent eight weeks working through the remediation roadmap. The GPO changes were applied. The service account passwords were rotated. The detection rules were deployed. The remediation tracker shows 28 of 34 findings marked as remediated. The CISO reports to the board: "82% of findings remediated. Programme on track."
Six months later, the next pen test begins. Within the first day, the tester captures credentials via LLMNR poisoning — the same technique from the previous engagement. Finding F-003: LLMNR enabled. Status in the tracker: remediated. Status in reality: the GPO was linked to the wrong Organisational Unit (OU). Half the domain is still broadcasting LLMNR. The fix was implemented. It just doesn't work.
This isn't unusual. In our experience, approximately 15–25% of remediations that are marked as complete contain implementation errors that leave the vulnerability partially or fully unaddressed. The finding is closed in the tracker. The risk remains in the environment. The gap between "we fixed it" and "it's actually fixed" is the verification gap — and it's the most common reason organisations see recurring findings across successive pen test engagements.
| Error Pattern | Example | How Often We See It |
|---|---|---|
| Wrong scope | The GPO to disable LLMNR was linked to the "Servers" OU instead of the domain root. All workstations — the actual target — are still broadcasting. | Common. GPO scoping errors account for a significant proportion of failed infrastructure remediations. |
| Partial implementation | SMB signing was enforced on servers but not on workstations. The tester relays from a workstation — the fix only covered half the attack surface. | Common. Particularly with changes that require both client and server configuration. |
| Exception undermines the rule | The firewall rule blocking lateral movement has an exception for the IT team's subnet. The tester compromises an IT workstation first and bypasses the rule entirely. | Frequent. Exceptions created for operational convenience often recreate the vulnerability for the exact users who have the most access. |
| Old credentials cached | The service account password was rotated, but the old password is cached in LSASS on three servers where the account was previously used interactively. The tester extracts the cached credential. | Occasional. Credential caching persists until the cached entries are cleared or the system is restarted. |
| Fix applied, then reverted | Network segmentation was implemented in March. A connectivity issue in April led the network team to add a temporary "allow all" rule between VLANs. The temporary rule is still in place eight months later. | Frequent. Temporary changes made under operational pressure are rarely removed. The remediation tracker still shows "remediated." |
| Detection rule too narrow | The SOC deployed a Kerberoasting detection rule for RC4 ticket requests. The tester requests AES tickets — still Kerberoastable, different encryption type. The rule doesn't fire. | Common. Detection rules written for the exact technique used in the previous engagement often miss variants. |
Every one of these errors is invisible in the remediation tracker. The tracker shows the finding as closed. The engineer believes the fix is in place. The CISO reports progress to the board. And the vulnerability persists — until someone tests whether the fix actually works.
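One way to make the verification gap visible in the tracker itself is to record "remediated" and "verified" as separate fields, and flag any finding that has the first without the second. A minimal sketch, assuming a hypothetical tracker schema — the field names and entries below are illustrative, not any real tool's format:

```python
# Hypothetical tracker entries -- the schema and findings are illustrative.
tracker = [
    {"id": "F-003", "title": "LLMNR enabled",            "status": "remediated", "verified": False},
    {"id": "F-007", "title": "SMB signing not enforced", "status": "remediated", "verified": True},
    {"id": "F-012", "title": "Kerberoastable account",   "status": "open",       "verified": False},
]

# A finding is only a fact once a verification result is recorded;
# "remediated" without evidence is still a claim.
claims = [f for f in tracker if f["status"] == "remediated" and not f["verified"]]
facts  = [f for f in tracker if f["status"] == "remediated" and f["verified"]]

for f in claims:
    print(f"{f['id']}: marked remediated but never verified -- treat as open")
```

The point of the split is that board reporting can then distinguish the two numbers — findings closed versus findings confirmed — rather than collapsing them into one optimistic figure.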
There are two ways to validate remediation: the engineering team verifies their own fixes using the steps provided in the report, or an independent tester validates the fixes by attempting to re-exploit the vulnerabilities. Both have value. They serve different purposes.
| Approach | How It Works | Assurance Level | Best For |
|---|---|---|---|
| Self-verification | The engineer implements the fix and then runs the verification step from the pen test report. "After disabling LLMNR, run Responder on the VLAN for 15 minutes — no hashes should be captured." The engineer performs the test and records the result. | Moderate. Confirms the specific fix was applied correctly. Does not test for variants, workarounds, or unintended consequences. Subject to the engineer's interpretation of the verification step. | Quick wins and standard remediations where the verification step is straightforward. Appropriate for low and medium findings. First-line validation for all findings. |
| Independent retesting | An independent tester — ideally from the original pen test provider — attempts to re-exploit the remediated findings using the same techniques and any variants. They validate that the fix works, that it hasn't introduced new issues, and that the attack chain is genuinely broken. | High. Independent validation that the fix works against the actual technique and its variants. Tests the fix in the context of the full attack chain, not just the individual finding. Identifies implementation errors that self-verification misses. | Critical and high findings. Chain findings where the break point must be validated. Findings where the remediation is complex or the risk of implementation error is high. Before reporting remediation progress to the board or regulator. |
The recommended approach is both: self-verification for every finding as the first-line check, followed by independent retesting for critical and high findings and all chain break points. Self-verification catches the obvious implementation errors immediately. Independent retesting catches the subtle ones — the wrong OU, the cached credential, the detection rule variant — that only surface when an adversary's perspective is applied.
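The LLMNR self-verification step quoted above ("run Responder on the VLAN for 15 minutes — no hashes should be captured") can be approximated with an even simpler first check: send a single LLMNR query and see whether anything on the segment answers. This is a minimal Python sketch, not a replacement for the report's verification step — the packet layout follows RFC 4795, the queried name and timeout are arbitrary choices, and it should only be run on networks you are authorised to test:

```python
import socket
import struct

LLMNR_GROUP, LLMNR_PORT = "224.0.0.252", 5355  # fixed by RFC 4795

def build_llmnr_query(name: str) -> bytes:
    """Build a DNS-format LLMNR query for an A record."""
    # Header: ID, flags=0, one question, no answer/authority/additional records.
    header = struct.pack(">HHHHHH", 0x1234, 0, 1, 0, 0, 0)
    # Name in DNS label format: length-prefixed labels, null-terminated.
    qname = b"".join(bytes([len(lbl)]) + lbl.encode() for lbl in name.split("."))
    return header + qname + b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN

def llmnr_still_answers(name: str = "wpad", timeout: float = 3.0) -> bool:
    """Send one LLMNR query to the local segment; True if any host responds.

    Any answer means at least one machine still has LLMNR enabled -- i.e.
    the GPO fix has not reached the whole environment. Authorised use only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_llmnr_query(name), (LLMNR_GROUP, LLMNR_PORT))
        sock.recvfrom(512)   # any reply at all means a responder is live
        return True
    except socket.timeout:
        return False         # silence for the full window: no responder seen
    finally:
        sock.close()
```

Running a check like this from a workstation VLAN — not just from the server VLAN where the change was made — tests the effect of the fix rather than its configuration, which is exactly what catches the wrong-OU error.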
A retest is not a full pen test. It's a focused engagement — typically one to three days — where the tester validates that specific remediations have been implemented correctly and that the attack chains from the previous engagement are broken.
| Element | Full Pen Test | Targeted Retest |
|---|---|---|
| Objective | Discover new vulnerabilities. Map attack chains. Test the full scope of the environment. | Validate that specific remediations from the previous engagement are effective. Confirm attack chains are broken. |
| Scope | Broad — the full environment or a defined segment. | Narrow — limited to the findings from the previous report, particularly critical and high findings and chain break points. |
| Duration | 5–15 days depending on environment size. | 1–3 days. The tester already knows the environment and the findings — they're validating specific fixes, not discovering new issues. |
| Deliverable | Full report with executive summary, attack narrative, findings, and remediation roadmap. | Retest report: a table showing each remediated finding, its validation status (fixed, partially fixed, not fixed), evidence, and notes on any implementation issues discovered. |
| Cost | Full engagement fee. | Typically 15–25% of the original engagement cost. Significantly less because the scope is narrower, the tester is familiar with the environment, and discovery isn't required. |
The LLMNR finding from the opening example is exactly this kind of invisible failure. The tracker shows "remediated." The GPO exists. The engineer implemented it in good faith. But the scoping error means the fix protects only half the environment. The retest identifies the gap, provides the specific correction, and a 15-minute fix completes the remediation. Without the retest, the finding recurs in the next annual engagement — six months and a significant amount of risk later.
The board doesn't want to hear about GPO scoping errors. They want to know whether the security programme is working — whether the money spent on testing, remediation, and tooling is producing measurable improvement. The retest provides the data to answer that question.
| Metric | What It Measures | What Good Looks Like |
|---|---|---|
| Remediation success rate | The percentage of remediated findings confirmed as fully fixed during retesting. | Increasing over time. Year 1: 72% of remediations verified as effective. Year 2: 89%. The gap between "marked remediated" and "confirmed fixed" is narrowing — the engineering team's implementation quality is improving. |
| Recurring finding rate | The percentage of findings from the previous engagement that appear again in the current engagement. | Decreasing over time. Year 1: 14 of 34 findings recur (41%). Year 2: 3 of 28 recur (11%). Findings are being fixed permanently, not temporarily. |
| Time to objective | How long the tester takes to achieve their primary objective (e.g. Domain Admin, access to sensitive data). | Increasing over time. Year 1: DA in 2 hours 15 minutes. Year 2: DA in 2 days 4 hours. Year 3: DA not achieved within the 10-day window. The environment is getting harder to compromise. |
| Detection rate | The percentage of tester actions detected by the SOC, EDR, and monitoring systems. | Increasing over time. Year 1: 0 of 7 actions detected (0%). Year 2: 4 of 9 detected (44%). Year 3: 7 of 8 detected (88%). The detection capability is maturing. |
| Mean time to remediate | The average number of days between report delivery and confirmed remediation of critical and high findings. | Decreasing over time. Year 1: 94 days. Year 2: 37 days. Year 3: 14 days. The organisation is responding faster to identified risk. |
| Chain viability | Whether the attack chains from the previous engagement are still viable after remediation. | Chains broken. If the retest confirms that the three chain break points identified in the previous report are all effective, the specific path to the crown jewels no longer exists. This is the most meaningful single metric. |
These metrics, presented as trends across two or three years of engagements, tell a story the board can understand: security is improving, the investment is producing results, and the risk trajectory is downward. No single engagement can tell this story. The longitudinal view — built from consistent testing, remediation, retesting, and tracking — demonstrates the return on the organisation's security investment in terms that justify continued funding.
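Two of these trends — remediation success rate and recurring finding rate — can be computed directly from the trackers of successive engagements. A small sketch using the Year 1 to Year 2 figures from the table; the finding IDs are invented for illustration:

```python
def remediation_success_rate(marked: int, confirmed: int) -> float:
    """Share of findings marked remediated that the retest confirmed as fixed."""
    return confirmed / marked

def recurring_rate(previous: set, current: set) -> float:
    """Share of the previous engagement's findings that reappear in the current one."""
    return len(previous & current) / len(previous)

# Year 1 -> Year 2 figures from the table above; the IDs themselves are made up.
prev = {f"F-{i:03d}" for i in range(1, 29)}                  # 28 findings last time
curr = {"F-003", "F-011", "F-019"} | {f"G-{i:03d}" for i in range(1, 26)}  # 3 recur

print(f"recurring rate: {recurring_rate(prev, curr):.0%}")                  # 3/28 -> 11%
print(f"success rate:   {remediation_success_rate(28, 25):.0%}")            # 25/28 -> 89%
```

Automating these two figures from the tracker removes any temptation to hand-adjust them before a board meeting, and makes the year-on-year trend reproducible.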
The testing-remediation-validation cycle transforms penetration testing from a compliance checkbox into a continuous improvement programme. Each iteration builds on the previous one, and the cumulative effect compounds over time.
| Phase | When | What Happens | Deliverable |
|---|---|---|---|
| 1. Test | Annual (or more frequent) | Full penetration test against the agreed scope. Findings identified, attack chains mapped, remediation roadmap produced. | Full pen test report with executive summary, attack narrative, findings, and roadmap. |
| 2. Remediate | Weeks 1–8 after report delivery | Engineering team works through the roadmap. Quick wins implemented immediately. Standard remediations scheduled. Projects scoped and funded. Each finding self-verified using the report's verification steps. | Updated remediation tracker with status, date, evidence, and sign-off for each finding. |
| 3. Validate | Weeks 8–12 after report delivery | Independent retest of critical and high findings and all chain break points. Implementation errors identified and corrected. Chains confirmed as broken or still viable. | Retest report with validation status for each finding. Remediation tracker updated with confirmed status. |
| 4. Report | Quarterly | Remediation progress reported to the board. Metrics presented as trends. Improvement demonstrated. Remaining risk communicated honestly. | Board report showing remediation success rate, recurring findings, detection rate, and time to objective — as trends. |
| 5. Repeat | Annual cycle restarts | Next engagement builds on the previous: previous report and tracker provided to the new tester. Remediated findings validated. New scope areas explored. The cycle compounds. | New report with comparison section showing improvement since the previous engagement. |
A finding marked "remediated" in a tracker is a claim. A finding confirmed as fixed by an independent retest is a fact. The gap between the two — the 15–25% of remediations that contain implementation errors — is the verification gap that retesting closes. Without validation, the organisation reports progress to the board based on tracker status. With validation, the organisation reports progress based on confirmed results.
Retesting transforms the pen test from a one-time assessment into a continuous improvement cycle. Each iteration — test, remediate, validate, report, repeat — builds on the previous one. The metrics compound: recurring findings decrease, time to objective increases, detection rates improve, and remediation speed accelerates. After two or three cycles, the organisation has a longitudinal record that demonstrates measurable security improvement — the strongest evidence of programme effectiveness for the board, the auditor, the insurer, and the regulator.
The pen test finds the problems. Remediation addresses them. Retesting proves they're fixed. Without all three, the loop is open — and an open loop is a loop that doesn't improve anything.
Our retesting engagements validate every critical and high remediation, confirm attack chains are broken, and provide the longitudinal metrics that demonstrate your security programme is working — because a finding that's verified as fixed is worth more than ten that are assumed to be.