> poc --demonstrate=impact --extract=minimum --harm=zero_
A penetration test report lands on the IT Director's desk. Finding #7: SQL injection in the customer search endpoint. Severity: High. Remediation: parameterise all database queries. The IT Director forwards it to the development lead. The development lead reads the description, looks at the evidence — a screenshot of Burp Suite showing an error-based response — and says: "That's the test environment. The WAF would catch this in production. I'll add it to the backlog."
Six months later, the same vulnerability is exploited by an attacker who exfiltrates 90,000 customer records. The development lead's WAF theory was wrong. But the pen test report never proved it was wrong — it showed that the injection existed without demonstrating what it could achieve.
Now imagine the same finding, different evidence. The report includes a screenshot of 10 rows extracted from the production customer table (with PII redacted), the exact SQL payload used, the time taken (under 30 seconds), and a note confirming that the WAF did not block the request. The development lead doesn't add this to the backlog. They fix it that afternoon.
The vulnerability was identical. The remediation was identical. The difference was the proof of concept — and that difference determined whether the finding was fixed or forgotten.
A finding without convincing proof of concept is a suggestion. A finding with strong PoC is a fact. The entire value of a penetration test — the remediation it drives, the risk it reduces, the investment it justifies — rests on whether the people reading the report believe the findings are real, exploitable, and consequential.
Proof of concept isn't an optional flourish added to impress the reader. It serves four distinct functions, each of which directly affects whether the pen test delivers real value or gathers dust.
| Function | What It Does | What Happens Without It |
|---|---|---|
| Eliminates doubt | Confirms the vulnerability is real and exploitable in the specific environment — not a false positive, not a theoretical risk, not something the WAF would catch. | Findings are disputed: "Are you sure this works in production?" "Our security controls should block this." "The scanner flags this every time — it's a known false positive." Disputed findings don't get fixed. |
| Quantifies impact | Demonstrates exactly what an attacker would achieve: the data they'd access, the privileges they'd gain, the systems they'd reach. Turns abstract risk into concrete consequence. | Impact is vague: "could lead to data breach." Every medium-severity finding says this. Without specific, demonstrated impact, the finding competes equally with 29 others for remediation resources. |
| Enables reproduction | Provides the development team with exact steps, payloads, and parameters to reproduce the finding in their own environment — essential for building, testing, and validating a fix. | Developers can't reproduce the finding: "We tried what the report described and it didn't work." Without reproducible PoC, the development cycle stalls. The fix is guessed at rather than verified. |
| Drives urgency | Creates an emotional and rational response: "Someone demonstrated they can access our customer database." That visceral realisation — this actually works — is what turns a finding from a line in a spreadsheet into an emergency. | Findings are prioritised by CVSS score and triaged into the normal development cycle. A high-severity finding with weak PoC sits in the backlog for months. The same finding with strong PoC is fixed in days. |
Proof of concept exists on a spectrum. At one end, insufficient PoC fails to convince anyone the finding is real. At the other end, excessive PoC goes beyond what's needed to prove impact and enters territory that creates unnecessary risk, damages trust, or raises legal concerns.
The skill — and it is a skill, not a formula — lies in finding the point that's sufficient to prove the finding is real and impactful, without going a single step further than necessary.
| Finding | Insufficient PoC | Appropriate PoC | Excessive PoC |
|---|---|---|---|
| SQL injection | "The parameter appears to be injectable based on error message differences." No extraction demonstrated. | 10 rows extracted from the customer table using a UNION SELECT with LIMIT 10. PII redacted in the report. Exact payload, request, and response documented. | Full database dump of 90,000 records extracted and included in the report appendix. Real customer PII visible in the evidence. |
| IDOR | "Changing the ID parameter returns a different response." No demonstration that it's another user's data. | 3 different user IDs tested, each returning a different customer's profile data. Screenshots show the parameter change and the corresponding data. Pattern confirmed as systemic. | All 43,000 user IDs enumerated. A CSV export of every customer's profile data included as evidence. |
| Domain Admin | "A Kerberoastable account was identified." No cracking attempted, no DA logon demonstrated. | Hash cracked offline in 4 minutes. DA logon demonstrated via whoami /groups. Screenshot shows the privileged context. No AD modifications made. | Tester creates a new Domain Admin account, resets the KRBTGT password, or accesses every mailbox to "prove" the extent of access. |
| Remote code execution | "The file upload function accepts PHP files." No execution confirmed. | Minimal PHP file uploaded, executes whoami and hostname. Screenshot of output. File immediately removed. Clean-up confirmed. | Full web shell deployed with file browser, command execution, and database access. Left in place "for the client to see." |
| Credential exposure | "Credentials were found in a public repository." No validation that they work. | Credentials tested against the target system. Successful authentication demonstrated via screenshot. Access level documented. Password partially redacted in the report. | Credentials used to log into every system they grant access to. Internal emails read. Documents downloaded. Screenshots of sensitive internal communications included in the report. |
The middle column is the target. Every example follows the same principle: demonstrate the vulnerability is real, demonstrate the impact is significant, then stop. The left column doesn't prove enough to drive action. The right column proves more than necessary and creates new risks in the process.
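To make the middle column more tangible, here is a minimal sketch of how that limited SQL injection evidence might be captured and documented. The endpoint, parameter name, and payload are hypothetical placeholders, and the small redaction helper is illustrative rather than part of any standard tooling.

```python
# Minimal evidence-capture sketch for the "appropriate" SQL injection column above.
# The endpoint, parameter name, and payload are hypothetical; adapt to the finding.
import json
import re

import requests

TARGET = "https://app.example.com/customers/search"   # hypothetical endpoint
PARAM = "q"                                            # hypothetical injectable parameter
# UNION-based payload capped at 10 rows, mirroring the LIMIT 10 approach in the table
PAYLOAD = "x' UNION SELECT id, email, created_at FROM customers LIMIT 10-- -"


def redact_emails(value: str) -> str:
    """Mask the local part of any email addresses found in captured output."""
    return re.sub(r"([\w.]{2})[\w.+-]*@", r"\1****@", value)


response = requests.get(TARGET, params={PARAM: PAYLOAD}, timeout=10)

# Capture the full request/response pair for the report appendix,
# redacting PII before the evidence leaves the testing machine.
evidence = {
    "request_url": response.request.url,
    "status_code": response.status_code,
    "elapsed_seconds": round(response.elapsed.total_seconds(), 2),
    "response_excerpt": redact_emails(response.text[:2000]),
}
print(json.dumps(evidence, indent=2))
```

The saved request/response pair goes into the report verbatim, which also serves the reproduction function described earlier: a developer can replay the exact request in their own environment and then use the same request to verify the fix.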
Excessive proof of concept isn't just unnecessary — it actively harms the engagement, the client relationship, and potentially the tester's own legal position. The consequences are real and avoidable.
Responsible proof of concept isn't subjective — it follows a clear set of principles that can be applied consistently across every finding, every engagement, and every tester.
| Principle | What It Means | How to Apply It |
|---|---|---|
| 1. Minimum sufficient evidence | Capture the least amount of data and access needed to prove the finding is real and its impact is clear. No more. | SQL injection: extract 5–10 rows, not the full table. IDOR: test 3–4 IDs, not all of them. DA: demonstrate the logon, don't create accounts. RCE: run whoami, don't install a persistent shell. |
| 2. Impact over access | The goal is to demonstrate what an attacker could achieve, not to achieve it in full. The PoC proves the potential; it doesn't realise it. | Instead of exfiltrating all customer data, extract a sample and state: "All 43,000 records are accessible via this method. We retrieved 10 as evidence." The sample proves the vulnerability; the number proves the scale. |
| 3. Redact by default | Any personal data, credentials, or sensitive information captured as evidence should be redacted in the report unless the unredacted content is essential to understanding the finding. | Show extracted data with PII replaced: john.s****@acme.co.uk. Show cracked passwords partially: W1nt***25!. The finding is equally credible with or without the full values. A masking sketch follows this table. |
| 4. Clean up immediately | Every artefact created during PoC — uploaded files, test accounts, registry entries, web shells, scheduled tasks — is removed immediately after the evidence is captured. | Maintain a running clean-up log. Check every item off before the engagement closes. Include a clean-up confirmation section in the report. Leave nothing behind. |
| 5. Communicate before you escalate | If the PoC requires access to particularly sensitive data or systems — the CEO's mailbox, the HR database, the finance system — check with the client before proceeding. | A quick call: "We've achieved DA and can access the mailbox server. Do you want us to demonstrate access to a specific mailbox, or is the DA logon sufficient evidence?" This takes 30 seconds and prevents 30 days of relationship repair. |
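As a concrete illustration of principle 3, here is a minimal redaction sketch that mirrors the masking style shown above. The function names and masking rules are illustrative assumptions, not a prescribed format; the point is that evidence is masked before it ever enters the report.

```python
# A small "redact by default" helper, mirroring the masking examples in the table.
# Function names and masking rules are illustrative only.
import re


def redact_email(email: str) -> str:
    """john.smith@acme.co.uk -> john.s****@acme.co.uk (keep first 6 chars of the local part)."""
    local, _, domain = email.partition("@")
    return f"{local[:6]}****@{domain}"


def redact_password(password: str) -> str:
    """W1nter2025! -> W1nt***25! (keep first 4 and last 3 characters)."""
    if len(password) <= 7:
        return "*" * len(password)
    return password[:4] + "***" + password[-3:]


def redact_text(text: str) -> str:
    """Redact every email address found in a block of captured evidence."""
    return re.sub(r"[\w.+-]+@[\w.-]+", lambda m: redact_email(m.group()), text)


if __name__ == "__main__":
    print(redact_email("john.smith@acme.co.uk"))   # john.s****@acme.co.uk
    print(redact_password("W1nter2025!"))          # W1nt***25!
```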
To make the framework concrete, here's how responsible PoC applies to the findings we encounter most frequently. Each example shows what to capture, what to redact, and where to stop.
As the client receiving a pen test report, you can assess the quality of proof of concept by looking for specific indicators. Strong PoC doesn't just prove the tester is technically capable — it demonstrates professional judgement, ethical awareness, and respect for your data.
| Indicator | Strong PoC | Weak PoC |
|---|---|---|
| Evidence type | Full request/response pairs, timestamped screenshots, attack narrative explaining each step, reproduction instructions | A single screenshot with no context, or a paragraph describing what the tester claims happened without supporting evidence |
| Data handling | Extracted data is minimal (5–10 rows), PII is redacted, and the report states how much additional data was accessible without extracting it | Large volumes of unredacted data included in the report, or no data extracted at all (just a claim that access was possible) |
| Proportionality | The level of exploitation matches the finding. A confirmed SQLi with 10 sample rows. A DA logon with whoami. An RCE with a single command execution. | Either under-proven ("this may be exploitable") or over-proven (full database dump, persistent backdoor left in place, sensitive documents included in the report) |
| Reproducibility | Step-by-step reproduction instructions that a developer can follow independently to verify the finding and test their fix | Vague descriptions: "inject a payload into the search field" without specifying which payload, which parameter, or what response to expect |
| Impact clarity | Specific impact statement: "43,291 customer records accessible. Mandatory ICO notification under UK GDPR. Estimated regulatory fine: up to 4% of annual turnover." | Generic impact: "could lead to data breach" or "may allow unauthorised access" — language that applies to any finding at any severity |
| Clean-up confirmation | Report includes a section listing every artefact created during testing and confirming its removal (see the clean-up log sketch after this table) | No mention of clean-up, or an assumption that the client knows what was left behind |
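The clean-up confirmation indicator, like principle 4 earlier, is easiest to satisfy when artefacts are logged the moment they are created. The sketch below shows one possible shape for that running log; the structure and the artefact name are illustrative assumptions.

```python
# A running clean-up log that feeds the report's clean-up confirmation section.
# The data model and artefact names are illustrative, not a prescribed format.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Artefact:
    description: str                 # what was created and where
    created_at: str
    removed_at: str | None = None


@dataclass
class CleanupLog:
    artefacts: list[Artefact] = field(default_factory=list)

    def record(self, description: str) -> Artefact:
        artefact = Artefact(description, datetime.now(timezone.utc).isoformat())
        self.artefacts.append(artefact)
        return artefact

    def confirm_removed(self, artefact: Artefact) -> None:
        artefact.removed_at = datetime.now(timezone.utc).isoformat()

    def outstanding(self) -> list[Artefact]:
        """Anything listed here must be resolved before the engagement closes."""
        return [a for a in self.artefacts if a.removed_at is None]


log = CleanupLog()
shell = log.record("poc.php uploaded to /var/www/uploads on web01")
log.confirm_removed(shell)
assert not log.outstanding(), "Artefacts remain; do not close the engagement."
```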
Some targets require extra care — not because the methodology changes, but because the consequences of getting PoC wrong are amplified. When the data is more sensitive, the margin for excess is smaller, and the communication with the client needs to be more deliberate.
| Target | Additional PoC Considerations |
|---|---|
| Healthcare data | Patient records carry legal protections beyond standard GDPR (Caldicott Principles, NHS Data Security and Protection Toolkit). Extract zero real patient data if possible — demonstrate access to the table structure and row count without retrieving identifiable records. If extraction is essential, use synthetic or obviously-test records where available. |
| Financial systems | Never demonstrate a business logic flaw by completing a real financial transaction, even a small one. Demonstrate the ability to reach the transaction endpoint, document the parameters, and explain the theoretical outcome. If a test environment supports it, execute the transaction there. |
| Email and communications | Demonstrating mailbox access doesn't require reading real emails. Show that the mailbox is accessible (screenshot of the inbox list without opening messages), document the access path, and stop. Reading internal communications creates confidentiality obligations that outlast the engagement. |
| HR and personnel data | Salary data, disciplinary records, and performance reviews are among the most sensitive categories of internal data. If your attack path reaches the HR system, demonstrate that you can access it — not what's in it. A screenshot of the HR application's dashboard is proof enough. |
| Legal and M&A documents | Access to legal hold data, active litigation files, or M&A documentation could have regulatory implications (insider trading, legal privilege). If the path reaches these systems, inform the client immediately and agree on evidence capture before proceeding. |
| Operational technology | OT environments control physical processes — manufacturing, energy, water treatment. PoC in OT should demonstrate network access to the OT segment and identification of control systems, never direct interaction with PLCs, SCADA interfaces, or safety systems. |
Responsible PoC isn't solely the tester's responsibility. The client shapes the engagement through the rules of engagement, the scoping conversation, and ongoing communication during testing. Here's how to ensure the PoC in your engagement is both credible and responsible.
Beyond credibility and impact, proof of concept serves a quieter but equally important function: it eliminates false positives. Vulnerability scanners generate false positives at a rate that ranges from irritating to overwhelming — findings that look real in the scanner output but aren't exploitable in the specific environment.
| Scanner Says... | PoC Reveals... |
|---|---|
| "SQL injection detected in the search parameter" (based on a time delay in the response) | The time delay is caused by a slow database query, not by injected SQL. The parameter is correctly parameterised. Finding is a false positive. |
| "Cross-site scripting in the name field" (based on reflected output) | The output is reflected but HTML-encoded. The XSS payload renders as text, not as executable code. Finding is a false positive. |
| "Remote code execution: Apache Struts CVE-2017-5638" (based on version banner) | The server returns an Apache Struts banner but is actually running a different framework behind a reverse proxy. The CVE doesn't apply. Finding is a false positive. |
| "Open redirect in the login return URL" (based on parameter manipulation) | The redirect is restricted to a whitelist of internal domains. External URLs are rejected. The scanner didn't test the whitelist. Finding is a false positive. |
| "SSL certificate hostname mismatch" (flagged during scan) | The mismatch only occurs on an internal hostname that isn't accessible from the internet. The public-facing certificate is correct. Finding is accurate but irrelevant to external risk. |
Every false positive that reaches the remediation backlog wastes developer time, erodes trust in the testing process, and dilutes attention from the findings that actually matter. Manual PoC verification eliminates false positives before they enter the report — ensuring that every finding your team works on is confirmed, exploitable, and worth fixing.
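As an example of what that manual verification can look like, here is a minimal sketch for the reflected XSS row above: it checks whether a distinctive probe string comes back verbatim or HTML-encoded. The URL and parameter name are hypothetical placeholders.

```python
# Minimal manual check for the reflected XSS scanner finding: is the reflected
# value returned verbatim or HTML-encoded? URL and parameter are hypothetical.
import requests

URL = "https://app.example.com/profile"     # hypothetical endpoint
PARAM = "name"
MARKER = '"><svg onload=alert(1)>'          # distinctive probe string

response = requests.get(URL, params={PARAM: MARKER}, timeout=10)
body = response.text

if MARKER in body:
    print("Reflected verbatim: likely exploitable, capture the request/response as PoC.")
elif "&quot;&gt;&lt;svg" in body or "&lt;svg onload" in body:
    print("Reflected but HTML-encoded: the scanner result is a false positive.")
else:
    print("Probe not reflected: investigate where, if anywhere, the value is echoed.")
```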
Proof of concept is what makes a pen test finding credible, actionable, and urgent. Without it, findings are suggestions that compete for remediation resources based on abstract severity scores. With it, findings are facts — demonstrated, evidenced, and impossible to deprioritise.
But PoC that goes too far creates its own risks: data protection liability, trust erosion, legal exposure, and a remediation conversation that focuses on the tester's conduct rather than the organisation's vulnerabilities. The line between sufficient and excessive isn't arbitrary — it's defined by a clear principle: demonstrate the impact with the minimum evidence necessary, then stop.
The best proof of concept leaves no doubt that the vulnerability is real, no question about what an attacker could achieve, and no unnecessary data in the report. That's the standard every finding should meet — and it's the standard that turns a pen test from a document into a decision.
Every finding in our reports is backed by proportionate, responsible proof of concept — sufficient to drive remediation, careful enough to protect your data, and documented to a standard that satisfies boards, auditors, and regulators.