> ./run-engagement --methodology=walkthrough --lang=plain-english_
A professional penetration test is not someone sitting in a dark room randomly attacking your systems. It's a structured, methodical process with defined phases, clear deliverables, and constant communication. Every step has a purpose, every decision is documented, and the entire engagement follows a methodology refined through thousands of real-world assessments.
But unless you've been through the process before, it can feel opaque. You sign a contract, someone tests your systems for a week, and a report appears. What happens in between? What does the tester actually do all day? How do they decide what to focus on? And how does the process ensure that nothing important gets missed?
This article walks through every phase of a typical engagement in plain English — no jargon, no assumptions, no skipped steps. Whether you're commissioning your first pen test or your fiftieth, this is what the process looks like from the inside.
Our methodology is aligned with the Penetration Testing Execution Standard (PTES) and incorporates elements of OWASP, OSSTMM, and MITRE ATT&CK. The specific phases may vary slightly between providers, but the structure described here is representative of any professional, CREST-accredited engagement.
Every engagement begins with a conversation — not a scan. The pre-engagement phase is where we understand what you need, why you need it, and how to structure the test so it delivers maximum value. Nothing is tested until this phase is complete and both parties have signed off.
| Activity | What Happens | Why It Matters |
|---|---|---|
| Scoping call | We discuss your objectives (compliance, risk assessment, pre-launch validation), your environment (technology stack, cloud providers, office locations), and your concerns (specific threats, previous incidents, known weak areas). | The scoping call determines the entire shape of the engagement. A test designed around your actual risk delivers far more than one sized by IP count. |
| Scope definition | We agree the precise list of targets (IP addresses, URLs, applications, networks), the approach (black/grey/white box), the environment (production/staging), and all exclusions. | Ambiguity in scope leads to gaps in coverage. We document everything explicitly so there's no question about what was and wasn't tested. |
| Rules of engagement | We define what the tester is permitted to do (exploitation, brute-forcing, social engineering) and what they must not do (denial of service, data destruction, testing outside hours). Communication protocols and emergency contacts are agreed. | The RoE protect both parties. They ensure testing stays within acceptable boundaries and provide legal authorisation under the Computer Misuse Act 1990. |
| Statement of Work | All of the above is formalised in a signed document: scope, approach, timeline, deliverables, pricing, retest policy, and contact details. | The SoW is the contract. If it isn't in the SoW, it isn't agreed. This is the document you'll reference if there's ever a question about what was covered. |
| Prerequisites | You prepare your side: test accounts provisioned, SOC notified, VPN access verified, stakeholders briefed, backups confirmed. | Client-side preparation failures are the single most common cause of lost testing time. We provide a checklist and verify readiness before day one. |
Pre-engagement typically takes one to two weeks. It feels like admin. It isn't — it's the phase that determines whether the test produces actionable intelligence or just generates noise.
Before the tester sends a single probe to your systems, they spend time gathering intelligence from public sources — exactly as a real attacker would. This passive reconnaissance is invisible to you: no logs are generated, no alerts fire, no packets touch your network.
Reconnaissance typically occupies the first day of testing — sometimes more for large or complex estates. Every hour invested here saves multiple hours later by ensuring the tester focuses on the right targets rather than probing blind.
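To make the idea concrete, here is a minimal sketch of one small recon task: collating hostnames from passive sources (certificate transparency dumps, DNS datasets) into a deduplicated, in-scope target list. The domain names and input lines are invented for illustration; real engagements use dedicated OSINT tooling.

```python
# Sketch: collating passive-recon output into a deduplicated target list.
# The input lines mimic certificate-transparency dump entries; all names
# here are illustrative, not from any real engagement.

def collate_hostnames(raw_lines, scope_domain):
    """Extract in-scope hostnames from raw passive-recon output."""
    targets = set()
    for line in raw_lines:
        host = line.strip().lower().lstrip("*.")  # normalise wildcard entries
        if host == scope_domain or host.endswith("." + scope_domain):
            targets.add(host)
    return sorted(targets)

raw = [
    "www.example.co.uk",
    "*.example.co.uk",
    "VPN.example.co.uk",          # mixed case from a second data source
    "staging.example.co.uk",
    "www.example.co.uk",          # duplicate entry
    "example-lookalike.com",      # out of scope — discarded
]
print(collate_hostnames(raw, "example.co.uk"))
```

The same principle scales up: every passive source feeds one normalised target list, which the tester then prioritises before a single packet is sent.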
With the attack surface mapped and prioritised, the tester begins systematically identifying vulnerabilities across the in-scope targets. This phase combines automated scanning with manual investigation — the scanner catches known issues at scale, and the human tester finds the context-dependent, logic-based, and chained vulnerabilities that automation misses.
The nature of testing differs by target type, but the principle is consistent: understand what the system does, then systematically test whether it can be made to do something it shouldn't.
| Target Type | What the Tester Investigates | What They're Looking For |
|---|---|---|
| Infrastructure | Service versions and patch levels, default credentials, misconfigurations, unnecessary services, encrypted protocol enforcement, management interface exposure | Known CVEs with public exploits, weak or default passwords, services that shouldn't be internet-facing, protocols that leak information (SNMP, LLMNR, NTLMv1) |
| Web applications | Every input field, URL parameter, HTTP header, cookie, API endpoint, file upload, and authentication mechanism — tested for injection, access control failures, logic flaws, and session management weaknesses | SQL injection, XSS, IDOR, privilege escalation, CSRF, business logic abuse, API authorisation failures, insecure file uploads, authentication bypass |
| Active Directory | Domain architecture, trust relationships, Group Policy, service account configurations, delegation settings, password policies, privileged group membership | Kerberoastable accounts, AS-REP roastable accounts, unconstrained delegation, GPP passwords, overly permissive ACLs, paths to Domain Admin via BloodHound |
| Cloud environments | IAM policies, storage permissions, network security groups, logging configuration, cross-account trust, serverless function roles, container security | Over-permissive IAM roles, public storage buckets, missing logging, privilege escalation through role chaining, exposed management APIs |
Every potential vulnerability is noted but not yet confirmed. The tester builds a working list of hypotheses: "this parameter might be injectable," "this access control might be bypassable," "this service account might be crackable." The next phase confirms or disproves each one.
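That working list can be pictured as a simple data structure — each hypothesis carries a target, a suspicion, and a status that the exploitation phase will resolve. The field names and statuses below are illustrative, not a standard schema:

```python
# Sketch: a tester's working list of hypotheses from the discovery phase.
# Targets, suspicions, and statuses are invented for illustration.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    target: str                 # where the suspected weakness lives
    suspicion: str              # what the tester thinks might be wrong
    status: str = "untested"    # untested -> confirmed / disproved

hypotheses = [
    Hypothesis("search.php?q=", "parameter might be injectable"),
    Hypothesis("/admin/export", "access control might be bypassable"),
    Hypothesis("svc_backup", "service account password might be crackable"),
]

# Exploitation (the next phase) resolves each hypothesis one way or the other.
hypotheses[0].status = "confirmed"
hypotheses[1].status = "disproved"

outstanding = [h for h in hypotheses if h.status == "untested"]
print(len(outstanding))  # -> 1
```

Nothing reaches the report as a finding until its hypothesis has been confirmed — which is exactly what the exploitation phase is for.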
Exploitation is the phase that separates a penetration test from a vulnerability assessment. Where a vulnerability assessment says "this might be exploitable," exploitation proves that it is — by actually doing it, under controlled conditions, with the client's explicit authorisation.
This is not reckless. Exploitation in a professional pen test is careful, documented, and proportionate. The tester's goal is to demonstrate impact with the minimum disruption necessary. They're not trying to break things — they're trying to prove that an attacker could break things, so that you can fix the vulnerability before someone with less restraint finds it.
| What the Tester Does | What It Demonstrates | How It's Controlled |
|---|---|---|
| Exploits a SQL injection to extract a sample of database records | The vulnerability allows full read access to the database — including customer PII, credentials, and financial data. The sample is sufficient evidence; the full database is not extracted. | The tester extracts the minimum data needed to prove the vulnerability. Typically 5–10 rows. Real customer data is handled as sensitive evidence, encrypted, and destroyed after the engagement. |
| Uses a cracked service account password to authenticate to the domain controller | A Kerberoastable service account with a weak password provides a direct path to Domain Admin. The entire Active Directory is compromised. | The tester demonstrates the logon and captures evidence (screenshot, timestamp, account details). They do not modify AD objects, reset passwords, or create new accounts. |
| Chains an SSRF vulnerability with the cloud metadata service to retrieve IAM credentials | A web application vulnerability allows the attacker to access the underlying cloud infrastructure — potentially the entire cloud estate. | The tester retrieves the temporary credentials and documents their permissions. They do not use them to access production data or modify cloud resources. |
| Uploads a web shell through an unrestricted file upload to achieve remote code execution | The attacker has full control of the web server — they can read files, access databases, pivot to internal systems, and deploy malware. | The tester demonstrates code execution (e.g. whoami, hostname), captures evidence, and removes the web shell immediately. No persistent backdoor is left. |
Exploitation converts abstract vulnerability descriptions into concrete business risk. "SQL injection in the search field" becomes "an attacker can extract your entire customer database." "Kerberoastable service account" becomes "an attacker can take over your Active Directory in under two hours." The finding is the same; the impact statement changes everything.
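The "minimum data needed" principle from the table above can be demonstrated in miniature. In this sketch an in-memory SQLite database stands in for the target, and a classic UNION-based injection pulls back only a five-row sample as evidence — the table, data, and query are entirely invented:

```python
# Sketch: why a small sample is sufficient evidence. An in-memory SQLite
# database stands in for the target; the schema and data are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"user{i}@example.com") for i in range(10_000)])

# A vulnerable endpoint concatenates user input straight into its query:
user_input = "x' UNION SELECT id, email FROM customers LIMIT 5 --"
query = f"SELECT id, email FROM customers WHERE email = '{user_input}'"

evidence = conn.execute(query).fetchall()
print(len(evidence))  # 5 rows captured as evidence; the other 9,995 stay put
```

Five rows prove the vulnerability grants full read access; extracting the remaining thousands would add risk without adding proof.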
If exploitation reveals a critical vulnerability — one that poses an immediate risk to the organisation — we don't wait for the final report. We contact the designated client representative by phone within one hour, followed by a written advisory within four hours. You can begin remediation the same day, rather than waiting a week for the report.
Once an initial foothold is established — a compromised account, a web shell, access to the internal network — the question becomes: what can an attacker do from here? Post-exploitation simulates the actions a real attacker would take after the initial breach, testing whether your internal defences contain the damage or allow it to spread.
This phase is where penetration testing most diverges from vulnerability scanning. A scanner stops at "this service has a known vulnerability." Post-exploitation asks: "if this service is compromised, can the attacker reach the domain controller? Can they access the finance system? Can they exfiltrate client data? Can they deploy ransomware?" The answers to these questions are what the board actually needs to hear.
| Post-Exploitation Activity | What It Answers |
|---|---|
| Privilege escalation — attempting to gain higher privileges from the initial foothold (standard user → admin → Domain Admin) | If an attacker compromises a standard user account, how quickly can they become an administrator? Are there misconfigurations in AD, local systems, or cloud IAM that enable escalation? |
| Lateral movement — moving from the compromised system to other systems on the network | Is the network segmented? Can an attacker on a workstation reach the database server, the file server, the domain controller? Do shared credentials or trust relationships allow movement without additional exploitation? |
| Credential harvesting — extracting passwords, hashes, tokens, and keys from compromised systems and network traffic | Are credentials stored securely? Are Domain Admin passwords cached on workstations? Can the tester capture credentials from network broadcast protocols? Do harvested credentials grant access to additional systems? |
| Data access — identifying and accessing sensitive data (customer records, financial information, intellectual property, board documents) | What can the attacker actually reach? File shares with sensitive documents? Database servers with customer PII? Email inboxes containing confidential communications? The answer defines the real-world impact of the breach. |
| Persistence — testing whether the attacker could maintain access through reboots, password changes, and remediation attempts | If you discover the breach and reset the compromised password, does the attacker lose access — or have they established backup mechanisms (scheduled tasks, registry keys, web shells) that survive remediation? |
| Detection assessment — noting whether any activity during the engagement was detected by the SOC, EDR, SIEM, or other monitoring | Are your defensive tools and teams working? Which phases of the attack were detected? Which were missed? How long did it take from compromise to detection? This is some of the most valuable data in the entire engagement. |
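Lateral movement, in particular, is fundamentally a graph problem — which is the idea behind tools like BloodHound. The sketch below models "from this foothold, these systems are reachable" as edges and finds the shortest hop chain to the domain controller; the hosts and edges are entirely invented:

```python
# Sketch: lateral-movement paths as graph search — the idea behind tools
# like BloodHound. The hosts and edges below are entirely invented.
from collections import deque

# edges: "from this foothold, these systems are reachable"
reachable = {
    "workstation-17": ["file-server", "print-server"],
    "file-server":    ["sql-server"],
    "sql-server":     ["domain-controller"],   # cached DA credentials
    "print-server":   [],
}

def attack_path(start, goal):
    """Breadth-first search for the shortest hop chain from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in reachable.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal unreachable: segmentation is doing its job

print(attack_path("workstation-17", "domain-controller"))
```

A `None` result is a finding too — it means segmentation actually contained the foothold, which is precisely what post-exploitation sets out to verify.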
Post-exploitation is typically where the engagement's most impactful findings emerge. The initial vulnerability might be a medium-severity issue on its own — but combined with weak segmentation, cached credentials, and absent monitoring, it becomes a pathway to complete organisational compromise.
The report is the deliverable. It's the tangible output of the entire engagement — the document that drives remediation, informs risk decisions, satisfies compliance requirements, and gets presented to the board. A brilliant test with a poor report is a wasted test.
Our reports are structured to serve multiple audiences. The executive summary is written for the board and senior leadership: plain English, business impact, strategic recommendations. The technical findings are written for the security and IT teams who will carry out the remediation: precise, reproducible, and prioritised by real-world risk rather than generic CVSS scores.
| Report Section | Who It's For | What It Contains |
|---|---|---|
| Executive summary | Board, senior leadership, non-technical stakeholders | A plain-English narrative of the engagement: what was tested, what was found, what the real-world risk is, and what needs to change — with no jargon, no CVSS scores, and no assumption of technical knowledge. |
| Scope and methodology | Auditors, compliance teams, anyone assessing the test's validity | Precisely what was tested, what was excluded, which methodology was followed, the testing window, the approach (black/grey/white box), and the tester's credentials. |
| Attack narrative | CISO, security team, technically-interested leadership | The story of each attack chain: how the tester moved from the initial entry point to the final objective, step by step. This is the section that most effectively communicates the real risk. |
| Technical findings | IT team, developers, system administrators | Each vulnerability documented with: description, affected system, evidence (screenshots, request/response pairs), risk rating, CVSS score, remediation guidance, and references (CVE, CWE, OWASP). |
| Risk heat map | Everyone — a single visual snapshot | Findings plotted by likelihood and impact, showing where risk concentrates. Findings in the high-likelihood, high-impact corner are the immediate priority; the map makes that priority intuitive. |
| Remediation tracker | IT team, project managers | A structured table listing every finding, its priority, the recommended fix, the estimated effort, and a status column for tracking remediation progress. Designed to be imported directly into your ticketing system. |
| Strategic recommendations | CISO, board, security programme owners | Beyond individual fixes: systemic recommendations for improving the security posture. "You need network segmentation" is a finding-level fix. "Your testing programme should include annual internal assessments and biannual social engineering" is a strategic recommendation. |
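Because the remediation tracker is designed to be imported into a ticketing system, it helps to picture it as machine-readable data. This sketch parses a tracker-style CSV and orders findings by priority; the column names and findings are illustrative, not our actual report template:

```python
# Sketch: the remediation tracker as machine-readable data. Column names
# and findings are illustrative, not a real report template.
import csv, io

tracker_csv = """\
finding,priority,fix,effort_days,status
SQL injection in search,critical,Parameterise queries,2,open
Kerberoastable service account,high,Rotate to 25+ char password,1,open
Missing security headers,low,Add CSP and HSTS,1,open
"""

rows = list(csv.DictReader(io.StringIO(tracker_csv)))

# Remediate highest priority first; the status column tracks progress.
order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
rows.sort(key=lambda r: order[r["priority"]])
print(rows[0]["finding"])
```

From here, each row maps naturally onto one ticket, with the status column driving the retest request once everything is closed.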
The report is delivered within five working days of testing completion. It's not a scanner export with a logo — it's a hand-written document, quality-reviewed by a second consultant, designed to be read by humans and acted upon.
A report alone isn't enough. Findings need context, questions need answering, and the people responsible for remediation need to understand not just what to fix but why it matters and how the attacker exploited it.
The debrief is a structured walkthrough of the report with the tester who conducted the engagement. It's interactive — not a presentation but a conversation.
The pen test identified the vulnerabilities. The debrief ensures everyone understands them. Now the organisation fixes them — and the tester comes back to verify that the fixes work.
| Step | What Happens | Timeline |
|---|---|---|
| Remediation | Your IT and development teams address the findings: patching, reconfiguring, rewriting code, implementing new controls. The remediation tracker from the report provides the roadmap. | At your pace — typically 4–8 weeks for the highest-priority findings. Some strategic recommendations may take longer. |
| Retest request | When you're confident the critical and high findings have been addressed, you notify us and we schedule the retest. | Retesting is typically included in the original engagement (within an agreed window, e.g. 30 or 60 days of report delivery). |
| Retest execution | The tester re-examines every finding that was remediated. They verify that the fix works, that it hasn't introduced new issues, and that the original attack chain is broken. | Typically 1–2 days, depending on the volume of remediated findings. |
| Updated report | The original report is reissued with each finding's status updated: resolved, partially resolved, or unresolved. This provides clean evidence for auditors, clients, and insurers. | Delivered within 3 working days of retest completion. |
| Attestation letter | If required, we provide a formal letter confirming the scope, findings, and remediation status — suitable for sharing with clients, regulators, or insurers without disclosing the full technical report. | Issued alongside the updated report. |
The retest closes the loop. Without it, you've identified problems but you haven't confirmed they've been fixed. A pen test without a retest is a diagnosis without a follow-up appointment.
Here's the complete engagement lifecycle, from first conversation to final retest, with realistic timeframes:
From your side, the engagement should feel structured, communicative, and transparent. Here's what a well-run engagement looks like from the client's seat:
| When | What You See |
|---|---|
| Before testing | A thorough scoping conversation. A clear, readable Statement of Work. A pre-engagement call that validates every assumption. A checklist of things to prepare. No ambiguity about what's happening or when. |
| Day 1 | A confirmation email that testing has begun. The tester verifies connectivity and access. Any issues (blocked IP, credentials not working, scope questions) are raised immediately. |
| During testing | Daily status updates (if agreed) — a brief summary of progress, any access issues, and a heads-up on emerging findings. If anything critical is discovered, you're contacted by phone within one hour. |
| End of testing | A verbal summary of key findings. No surprises in the report — you already know the headlines. Report delivery date confirmed. |
| Report delivery | A professional, hand-written report with executive summary, attack narratives, technical findings, and remediation tracker. Delivered encrypted, within the agreed timeframe. |
| Debrief | A scheduled call (or on-site session) where the tester walks through every finding, answers every question, and helps plan the remediation sequence. |
| After remediation | A retest confirming that fixes work. An updated report with clean status. An attestation letter if needed. |
If any of these touchpoints are missing from your engagement — if you don't hear from the tester during testing, if the report arrives without a debrief, if there's no retest option — that's a signal that the process may not be as thorough as it should be.
A professional penetration test is an eight-phase process: pre-engagement, reconnaissance, vulnerability discovery, exploitation, post-exploitation, reporting, debrief, and retest. Each phase builds on the last, and skipping any one of them reduces the value of all the others.
The methodology isn't complicated, but it is rigorous. It ensures that nothing important is missed, that every finding is verified and contextualised, that the report serves every audience from the board to the developer, and that remediation is confirmed rather than assumed.
The best pen test is one where you understand every step of the process — because when you understand the process, you can hold your provider accountable, prepare your teams effectively, and extract maximum value from the engagement.
Every engagement follows this methodology — because rigour is what separates a genuine pen test from a checkbox exercise. The first step is a conversation about what you need.