
What Is a Threat-Led Penetration Test (TLPT)? — Intelligence-Driven Red Teaming Explained

> assessment: Threat-Led Penetration Test | framework: TIBER-EU / DORA | scope: entire organisation | method: intelligence-driven red teaming

Hedgehog Security, 6 November 2024
tlpt threat-led-penetration-testing red-teaming tiber-eu dora cbest threat-intelligence financial-services

A penetration test that is not a penetration test.

A Threat-Led Penetration Test (TLPT) is a controlled, intelligence-driven simulation of a real-world cyberattack conducted against a live production environment. Despite the name, a TLPT is not a penetration test in the conventional sense — it is a red team exercise in which testers replicate the tactics, techniques, and procedures (TTPs) of specific threat actors known to target the organisation's sector. The 'threat-led' component is what distinguishes it: the attack scenarios are derived from genuine threat intelligence about real adversaries, not from a generic testing methodology or a checklist of common vulnerabilities.

The concept originated in the UK financial sector with the Bank of England's CBEST framework in 2013 and was subsequently adopted and expanded by the European Central Bank's TIBER-EU framework in 2018. Since January 2025, TLPT has been a legally binding requirement for significant financial entities under the EU's Digital Operational Resilience Act (DORA). The test is designed to assess not just whether individual systems can be compromised (any determined attacker will find a way in) but whether the entire organisation, including its people, processes, and technology, can detect the intrusion, respond effectively, contain the damage, and recover its critical functions.

As one offensive security expert put it: 'A threat-led penetration test is not a pen test at all; it is a red team exercise.' The terminology matters because the expectations, scope, duration, and outputs are fundamentally different from what most organisations associate with the words 'penetration test'.


Understanding what makes TLPT different.

Organisations that have commissioned penetration tests or red team exercises before sometimes assume that a TLPT is simply a more expensive version of what they have already done. It is not. The differences are structural, not just in scale.

Attribute: Scope
Conventional Penetration Test: Specific system, application, or network segment. Defined and bounded by the client.
Red Team Exercise: Broader organisational scope, but still typically focused on specific objectives, such as accessing the CEO's email, exfiltrating customer data, or reaching the payment processing environment.
Threat-Led Penetration Test: Entire organisation. Must cover several or all Critical or Important Functions (CIFs) and the underlying systems, people, and processes that support them. The scope is validated by the regulatory authority.

Attribute: Threat Intelligence
Conventional Penetration Test: Not intelligence-driven. Testing based on methodology (OWASP, PTES, OSSTMM) and tester expertise.
Red Team Exercise: May be informed by threat intelligence, but not required. Testers choose their own attack paths based on experience.
Threat-Led Penetration Test: Mandatory and formal. A dedicated threat intelligence phase produces a Targeted Threat Intelligence (TTI) report identifying specific threat actors that target the entity. Attack scenarios are derived directly from this intelligence and mapped to MITRE ATT&CK.

Attribute: Defender Awareness
Conventional Penetration Test: Stakeholders and security teams are typically aware the test is occurring. Some may even assist with access.
Red Team Exercise: Security teams may or may not be informed, depending on the engagement rules.
Threat-Led Penetration Test: The Blue Team (defensive security: SOC, CSIRT) must not know the test is taking place. Only a small Control Team is aware. This ensures that detection, escalation, and response are assessed under realistic conditions.

Attribute: Duration
Conventional Penetration Test: Days to weeks. Typically 1–4 weeks of active testing.
Red Team Exercise: Weeks to months. Typically 4–12 weeks of active testing.
Threat-Led Penetration Test: Months. The active red team phase alone typically spans 12–16 weeks. The entire TLPT lifecycle, from preparation through closure and remediation, takes 6–12 months.

Attribute: Regulatory Oversight
Conventional Penetration Test: None. Client-driven commercial engagement.
Red Team Exercise: None typically, unless conducted under a specific framework.
Threat-Led Penetration Test: Supervised by a designated regulatory authority (national central bank, financial regulator, or delegated authority). The regulator validates the scope, reviews the TI report, monitors progress, and issues a formal attestation.

Attribute: Purple Teaming
Conventional Penetration Test: Not standard. May be offered as an optional add-on.
Red Team Exercise: Optional. Depends on the engagement contract.
Threat-Led Penetration Test: Mandatory under DORA. After active testing, a collaborative purple team exercise walks the Blue Team through every attack step: what happened, what was detected, what was missed, and how to improve. This is where the real value is delivered.

Attribute: Output
Conventional Penetration Test: Technical vulnerability report with findings, risk ratings, and remediation recommendations.
Red Team Exercise: Narrative report describing attack paths, objectives achieved, detection gaps, and strategic recommendations.
Threat-Led Penetration Test: Test Summary Report, formal remediation plan, and regulatory attestation confirming compliance with the framework. Attestation enables mutual recognition across EU member states, so a multinational entity does not need to repeat the full test in each jurisdiction.

CBEST, TIBER-EU, and DORA — a decade of evolution.

Threat-led testing did not emerge in a vacuum. It was developed by financial regulators who recognised that conventional penetration testing — valuable as it is for identifying technical vulnerabilities — was not answering the question that mattered most: could a sophisticated, motivated adversary compromise our critical financial infrastructure and what would happen if they did?

CBEST (2013 — Bank of England)
The original intelligence-led testing framework. CBEST was developed by the Bank of England, together with its Prudential Regulation Authority (PRA) and the separate Financial Conduct Authority (FCA), specifically for UK financial market infrastructure. It introduced the concept of using bespoke threat intelligence, rather than generic vulnerability scanning, to design red team scenarios targeting the critical functions of banks, payment systems, and market infrastructure. CBEST was voluntary but strongly encouraged by regulators, and it established the template that every subsequent framework followed: threat intelligence drives the test, the defenders do not know it is happening, and the entire lifecycle is overseen by the regulator.
TIBER-EU (2018 — European Central Bank)
The ECB's Threat Intelligence-Based Ethical Red Teaming framework standardised intelligence-led testing across Europe. TIBER-EU defines the complete methodology — from threat intelligence provider selection and due diligence, through red team execution on live production systems, to closure with replay workshops and remediation planning. Sixteen EU member states have published national implementations (including France, Germany, the Netherlands, Ireland, Italy, and Spain). TIBER-EU remained voluntary until DORA made intelligence-led testing a legal requirement — but the framework was always designed as the technical backbone for regulatory testing.
DORA (2025 — EU Regulation)
The Digital Operational Resilience Act transforms TLPT from voluntary best practice into a legally binding obligation. DORA Article 26 requires significant financial entities to conduct TLPT at least every three years. The Regulatory Technical Standards (RTS), published July 2024 and directly applicable from July 2025, are based on TIBER-EU but introduce important changes: internal testers are permitted with safeguards, purple teaming is mandatory (not just recommended), ICT third-party service providers must participate when in scope, and mutual recognition enables cross-border acceptance of test results.
Other Frameworks
Several national and sectoral frameworks incorporate equivalent intelligence-led testing, including TBEST (UK telecoms sector), STAR (UK payment systems), iCAST (Hong Kong Monetary Authority), AASE (Singapore, issued by the Association of Banks in Singapore with support from the Monetary Authority of Singapore), and Advanced Red Teaming / ART (De Nederlandsche Bank, updated in 2025 to align with DORA). While terminology and governance structures differ, the core methodology is consistent: threat intelligence produces attack scenarios, red teamers execute those scenarios against live systems, the defenders are not informed, and the results assess organisational resilience rather than individual system security.

How a threat-led penetration test actually works.

A TLPT follows a structured lifecycle with four distinct phases. Each phase has specific deliverables, governance checkpoints, and interactions with the regulatory authority. The entire process typically spans six to twelve months — a timeframe that surprises organisations accustomed to two-week penetration tests.

TLPT — The Four Phases
── Phase 1: Preparation (4–8 weeks) ───────────────────────
• Regulatory authority notified and engaged
• Control Team established: small, trusted group who know
  about the test (typically CISO, senior risk officer, PM)
• Critical or Important Functions (CIFs) identified: which
  business functions, systems, and processes are in scope
• Threat Intelligence provider and Red Team provider selected
  (may be same firm, but staff must be separated)
• Rules of engagement agreed: legal framework, risk
  management, 'flag' conditions that halt the test,
  communication protocols, data handling requirements
• ICT third-party providers notified if in scope (DORA)

── Phase 2: Threat Intelligence (6–8 weeks) ────────────────
• External TI provider conducts research specific to the entity
• Produces Targeted Threat Intelligence (TTI) report:
  - Threat landscape for the entity's sector and geography
  - Specific threat actors most likely to target the entity
  - Their documented TTPs (mapped to MITRE ATT&CK)
  - The entity's external attack surface and exposure
  - Actionable attack scenarios for the Red Team
• TTI report validated by Control Team and regulator
• Scenarios refined to be testable and safe on live systems

── Phase 3: Red Team Testing (12–16 weeks) ─────────────────
• Red Team executes attack scenarios against LIVE PRODUCTION
• Blue Team (SOC, CSIRT) does NOT know the test is happening
• Testers replicate the specific TTPs from the TI report:
  - Initial access (phishing, exploitation, supply chain)
  - Persistence and C2 establishment
  - Privilege escalation and credential theft
  - Lateral movement toward CIFs
  - Access to / compromise of critical functions
  - Data staging and exfiltration
• Control Team monitors for safety and can halt if needed
• Every Blue Team detection (or failure to detect) recorded
• Red Team produces detailed report of all attack paths

── Phase 4: Closure (4–6 weeks) ────────────────────────────
• Blue Team informed: the test is revealed
• MANDATORY purple team exercise (DORA requirement):
  - Red Team walks Blue Team through every attack step
  - What happened at each stage
  - What was detected and how quickly
  - What was missed and why
  - Specific improvements to close detection gaps
• 360° feedback session with all stakeholders
• Test Summary Report and remediation plan produced
• Regulator reviews and issues attestation
• Attestation enables mutual recognition across EU states
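
As a rough cross-check of the timeline, the phase durations in the outline above can be added up. The sketch below assumes the four phases run strictly back to back, which the outline does not state; the dictionary and function names are illustrative only.

```python
# Phase durations in weeks (min, max), taken from the phase outline above.
PHASES = {
    "Preparation": (4, 8),
    "Threat Intelligence": (6, 8),
    "Red Team Testing": (12, 16),
    "Closure": (4, 6),
}

# Average month length in weeks; an approximation used only for conversion.
WEEKS_PER_MONTH = 52 / 12


def total_weeks(phases):
    """Return (shortest, longest) total if the phases run back to back."""
    shortest = sum(low for low, _ in phases.values())
    longest = sum(high for _, high in phases.values())
    return shortest, longest


shortest, longest = total_weeks(PHASES)
print(f"{shortest}-{longest} weeks")
print(f"about {shortest / WEEKS_PER_MONTH:.1f}-{longest / WEEKS_PER_MONTH:.1f} months")
```

The phases alone add up to 26–38 weeks, or roughly six to nine months. The remainder of the six-to-twelve-month range presumably reflects remediation, regulatory review, and gaps between phases, none of which the outline quantifies.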

Five teams, one test — who does what.

Team: Control Team
Role: Manages the entire TLPT lifecycle. Coordinates between the regulator, TI provider, Red Team, and (eventually) the Blue Team. The only people inside the organisation who know the test is happening during the active phase. Acts as safety net: can halt the test if it risks impacting live services or customer data.
Key requirements: Small, trusted group, typically the CISO or equivalent, a senior risk officer, and a dedicated project manager. Must have authority to halt the test immediately. Must maintain absolute confidentiality from the Blue Team throughout the active phase.

Team: Threat Intelligence Provider
Role: Produces the Targeted Threat Intelligence (TTI) report that drives the entire test. Researches the entity's threat landscape, identifies relevant threat actors and their TTPs, analyses the entity's external attack surface, and produces actionable attack scenarios for the Red Team to execute.
Key requirements: Must always be external to the tested entity, even when internal red teaming is permitted under DORA. Must demonstrate specific capability in cyber threat intelligence for the entity's sector and geography. Their report must be validated by both the Control Team and the regulatory authority before testing begins.

Team: Red Team
Role: Executes the attack scenarios against the live production environment. Replicates the specific TTPs identified in the threat intelligence. Documents every action, finding, detection event, and piece of evidence. Produces the detailed technical report that forms the basis of the Test Summary Report.
Key requirements: May be external or internal under DORA (with safeguards). Must use external testers at least every third test. Must demonstrate red teaming capability, maintain operational security throughout, adhere to the rules of engagement, and carry appropriate professional indemnity insurance. Internal testers must be functionally independent from the Blue Team.

Team: Blue Team
Role: The organisation's defensive security capability: SOC analysts, CSIRT responders, security engineers, threat hunters. They respond to the Red Team's activity exactly as they would to a real attack, because as far as they know, it is a real attack. Their detection and response performance is a core assessment output of the TLPT.
Key requirements: Must not be informed that a test is occurring until the closure phase. Their genuine, unscripted response is essential to the validity of the assessment. After the test is revealed, they participate in the mandatory purple team exercise to develop improvements.

Team: Regulatory Authority (TLPT Cyber Team)
Role: Oversees the TLPT from initial scoping through to final attestation. Validates that the scope covers the right CIFs, reviews and approves the TTI report and attack scenarios, may observe or receive updates during the active testing phase, and issues the attestation confirming framework compliance.
Key requirements: Designated by each EU member state; may be the national central bank, financial regulator, or a delegated authority. The TLPT Cyber Team within the authority requires at least two test managers. The regulator's involvement ensures consistency, quality, and that results are meaningful for supervisory purposes.
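
The confidentiality rule running through these roles can be written down as a simple need-to-know lookup. This is a minimal sketch, not part of any framework: the `Phase` enum, the `AWARE_FROM` table, and the assumption that both providers are aware from the preparation phase (when they are selected) are illustrative choices.

```python
from enum import IntEnum


class Phase(IntEnum):
    PREPARATION = 1
    THREAT_INTELLIGENCE = 2
    RED_TEAM_TESTING = 3
    CLOSURE = 4


# Earliest phase in which each team knows that a test exists.
# Assumption: providers count as aware from the phase in which they are
# selected. Only the Blue Team's late disclosure is stated explicitly above.
AWARE_FROM = {
    "Control Team": Phase.PREPARATION,
    "Regulatory Authority": Phase.PREPARATION,
    "Threat Intelligence Provider": Phase.PREPARATION,
    "Red Team": Phase.PREPARATION,
    "Blue Team": Phase.CLOSURE,
}


def is_aware(team: str, phase: Phase) -> bool:
    """Return True if `team` may know about the test during `phase`."""
    return phase >= AWARE_FROM[team]


for phase in Phase:
    informed = [team for team in AWARE_FROM if is_aware(team, phase)]
    print(f"{phase.name}: {', '.join(informed)}")
```

A Control Team could use a table like this to sanity-check distribution lists before sending any test-related communication during the active phase.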

What a TLPT scenario actually looks like.

To understand why TLPT differs from conventional testing, consider what a real scenario might look like for a European payment institution. This is illustrative — actual TLPT scenarios are confidential — but it demonstrates the intelligence-driven approach and how the threat actor profiles we publish on this blog directly inform the test design.

Illustrative TLPT Scenario — European Payment Institution
── Threat Intelligence Assessment ──────────────────────────
Entity: European payment institution processing cross-border
  transactions for 200+ financial clients

Identified Threat Actor: APT35 (Charming Kitten / IRGC)
Rationale: APT35 has a documented history of targeting financial
  services across Europe and the Middle East. IRGC-sponsored. Known
  to exploit ProxyShell/Exchange vulnerabilities AND conduct
  patient social engineering campaigns impersonating academics
  and journalists. Entity's Exchange infrastructure and LinkedIn
  presence make both vectors viable.

── Attack Scenarios (derived from APT35 TTPs) ──────────────

Scenario 1: Social Engineering Path
  Initial Access: Spear-phishing targeting finance staff via
    LinkedIn, impersonating industry conference organiser
  Credential Harvest: Cloned SSO login page capturing MFA token
  Persistence: OAuth app registration for mailbox access
  Lateral Movement: Credential reuse across internal systems
  Objective: Access payment processing CIF and transaction data

Scenario 2: Technical Exploitation Path
  Initial Access: Exploitation of internet-facing web application
    using known CVE (consistent with APT35 rapid exploitation)
  Persistence: Web shell deployment + custom C2 backdoor
  Privilege Escalation: Kerberoasting / AD exploitation
  Lateral Movement: RDP via Fast Reverse Proxy (FRP)
  Objective: Access client transaction database and exfiltrate

── Assessment Criteria ──────────────────────────────────────
• Time from initial compromise to first detection
• Time from detection to escalation to incident response
• Effectiveness of containment actions taken by Blue Team
• Whether CIFs were compromised before detection occurred
• Whether Blue Team correctly identified the TTP chain
• Recovery capability for affected critical functions
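
To show what "mapped to MITRE ATT&CK" can look like in practice, the sketch below attaches ATT&CK technique IDs to the steps of Scenario 2 and builds a blank detection-tracking sheet. The technique IDs are real ATT&CK identifiers, but the choice of technique for each step is an assumption made for illustration; an actual TTI report would justify every mapping from the intelligence. Scenario 1 is left out because several of its steps map less cleanly.

```python
import re

# (scenario step, ATT&CK technique ID, ATT&CK technique name)
# Step labels follow the scenario text. ATT&CK itself files Kerberoasting
# under the Credential Access tactic rather than Privilege Escalation.
SCENARIO_2 = [
    ("Initial Access", "T1190", "Exploit Public-Facing Application"),
    ("Persistence", "T1505.003", "Server Software Component: Web Shell"),
    ("Privilege Escalation", "T1558.003",
     "Steal or Forge Kerberos Tickets: Kerberoasting"),
    ("Lateral Movement", "T1021.001", "Remote Services: Remote Desktop Protocol"),
    ("Objective", "T1041", "Exfiltration Over C2 Channel"),
]

# ATT&CK technique IDs look like T1234 or T1234.001 (sub-technique).
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")


def coverage_sheet(steps):
    """Return one tracking row per step, with the detection field left blank."""
    rows = []
    for step, technique_id, name in steps:
        if not TECHNIQUE_ID.match(technique_id):
            raise ValueError(f"malformed technique ID: {technique_id}")
        rows.append(
            {"step": step, "technique": technique_id, "name": name, "detected": None}
        )
    return rows


for row in coverage_sheet(SCENARIO_2):
    print(f"{row['step']:<22}{row['technique']:<11}{row['name']}")
```

During the purple team exercise, the `detected` field of each row would be filled in with whether, and how, the Blue Team saw that step.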

The key difference is visible immediately: a conventional penetration test would scan for vulnerabilities and attempt exploitation. This TLPT scenario replicates a specific, documented threat actor's behaviour — from the social engineering techniques APT35 actually uses, through the tools they actually deploy, to the objectives they would actually pursue against a financial institution. The assessment measures the organisation's ability to detect and respond to that specific adversary, not just whether a vulnerability exists.
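
The timing-based assessment criteria in the scenario can be derived from a simple event timeline. The sketch below is illustrative: the event names and all timestamps are hypothetical, and a real engagement would draw them from Red Team logs and Blue Team ticketing records.

```python
from datetime import datetime, timedelta
from typing import Optional


def summarise(timeline: dict) -> dict:
    """Derive timing metrics from a dict of event name -> datetime (or None)."""
    compromise = timeline["initial_compromise"]
    detection = timeline.get("first_detection")
    escalation = timeline.get("escalated_to_ir")
    cif = timeline.get("cif_compromised")

    time_to_detect: Optional[timedelta] = None
    time_to_escalate: Optional[timedelta] = None
    if detection is not None:
        time_to_detect = detection - compromise
        if escalation is not None:
            time_to_escalate = escalation - detection

    # A CIF counts as compromised before detection if it was reached and
    # either nothing was ever detected or detection came later.
    cif_before_detection = cif is not None and (detection is None or cif < detection)

    return {
        "time_to_detect": time_to_detect,
        "time_to_escalate": time_to_escalate,
        "cif_compromised_before_detection": cif_before_detection,
    }


# Hypothetical timeline for illustration only.
example = {
    "initial_compromise": datetime(2025, 3, 3, 9, 15),
    "cif_compromised": datetime(2025, 3, 4, 11, 0),
    "first_detection": datetime(2025, 3, 5, 14, 40),
    "escalated_to_ir": datetime(2025, 3, 5, 16, 10),
}
for metric, value in summarise(example).items():
    print(f"{metric}: {value}")
```

In this made-up timeline the critical function is reached more than a day before the first detection, which is exactly the kind of gap the purple team exercise is meant to examine.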


DORA's TLPT requirements — who is in scope.

Not every financial entity is required to conduct TLPT. DORA Article 26 applies to entities identified as 'significant' by their competent authority — those whose failure could have systemic impact on financial stability. The criteria for identification are defined in the Regulatory Technical Standards and include the entity's size, systemic importance, ICT risk profile, and the criticality of its functions to the broader financial system.

Entity type: Credit Institutions (Global Systemically Important)
TLPT requirement: Mandatory TLPT at least every three years. Approximately 120 significant banks under ECB direct supervision (Single Supervisory Mechanism) are expected to be in scope. These institutions underpin the stability of the European financial system.

Entity type: Payment and Electronic Money Institutions
TLPT requirement: Required if identified as significant by their competent authority based on size, systemic importance, and ICT risk profile. Major payment processors and e-money issuers handling significant transaction volumes are likely candidates.

Entity type: Investment Firms, CSDs, and Trading Venues
TLPT requirement: Required if identified as significant. Central Securities Depositories and trading venues that underpin critical market infrastructure are primary candidates given the systemic impact of their disruption.

Entity type: Insurance and Reinsurance Undertakings
TLPT requirement: Required if identified as significant by EIOPA or national supervisory authorities. Major insurers whose failure could impact policyholder protection or financial stability.

Entity type: Crypto-Asset Service Providers
TLPT requirement: Newly in scope under DORA, the first time crypto-asset service providers face mandatory intelligence-led testing requirements. Required if identified as significant based on the criteria in the RTS.

Entity type: Microenterprises
TLPT requirement: Exempt. Entities with fewer than 10 employees and annual turnover or balance sheet below €2 million are excluded from TLPT requirements. They remain subject to DORA's general testing obligations under Articles 24–25.
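
The scoping logic above reduces to two checks: the microenterprise exemption and the competent authority's designation. The sketch below encodes the thresholds exactly as worded in this article; the `Entity` class and function names are hypothetical, and the actual identification criteria in the RTS are more detailed than a single boolean.

```python
from dataclasses import dataclass

MICRO_MAX_EMPLOYEES = 10          # "fewer than 10 employees"
MICRO_MAX_EUR = 2_000_000         # "below €2 million"


@dataclass
class Entity:
    name: str
    employees: int
    annual_turnover_eur: float
    balance_sheet_eur: float
    identified_as_significant: bool  # decided by the competent authority


def is_microenterprise(entity: Entity) -> bool:
    """Apply the exemption thresholds as described in the article."""
    small_financials = (
        entity.annual_turnover_eur < MICRO_MAX_EUR
        or entity.balance_sheet_eur < MICRO_MAX_EUR
    )
    return entity.employees < MICRO_MAX_EMPLOYEES and small_financials


def tlpt_required(entity: Entity) -> bool:
    """TLPT applies only to designated entities that are not microenterprises."""
    return entity.identified_as_significant and not is_microenterprise(entity)


processor = Entity("Example Payments", 850, 120_000_000, 400_000_000, True)
boutique = Entity("Example Advisory", 6, 900_000, 1_200_000, False)
print(processor.name, tlpt_required(processor))
print(boutique.name, tlpt_required(boutique))
```

Both example entities are invented. The sketch is only a way to read the table, not a substitute for the authority's formal identification.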

The first TLPT deadline for significant entities under DORA falls before 17 January 2028, three years after the regulation began to apply on 17 January 2025 (it entered into force in January 2023). However, entities already operating under TIBER-EU national implementations may have earlier or existing obligations. Competent authorities may also increase the testing frequency based on the entity's risk profile and operational circumstances. Planning should begin at least 18 months before the deadline to allow for provider selection, scope definition, regulatory engagement, and the test itself.
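
The 18-month planning recommendation translates into a concrete latest start date. The sketch below works backwards from the 17 January 2028 deadline given above; the `months_before` helper is written for this example, since the standard library has no month-subtraction function.

```python
import calendar
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    The day is clamped to the last day of the target month, so that
    31 March minus one month gives the end of February.
    """
    index = d.year * 12 + (d.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


DEADLINE = date(2028, 1, 17)      # first TLPT deadline cited in the article
PLANNING_LEAD_MONTHS = 18         # recommended minimum lead time

planning_start = months_before(DEADLINE, PLANNING_LEAD_MONTHS)
print(planning_start)             # 2026-07-17
```

On these figures, planning should be under way by mid-July 2026 at the latest, and earlier for entities whose competent authority sets a shorter cycle.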


What DORA changes compared to TIBER-EU.

DORA's TLPT requirements are built on the TIBER-EU framework, but the move from voluntary guidance to binding regulation introduced several important changes. In February 2025, the TIBER-EU framework was itself updated to align with DORA's RTS — including adopting DORA's 'Control Team' terminology in place of TIBER-EU's original 'White Team' — beginning a convergence process that will eventually fully align the two.

Internal Testers Now Permitted
TIBER-EU required all red team testers to be external to the tested entity. DORA permits internal testers, recognising the value of in-house red team capability, but with safeguards. Internal testers must demonstrate equivalent capability to external providers. An external red team must be used at least every third test. And the threat intelligence provider must always be external, regardless of whether the Red Team is internal or external. This means an external TI engagement for every test and at least one external red team engagement per three-test cycle.
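
The two sourcing rules can be checked mechanically against an entity's test history. The sketch below reads "at least every third test" as "no three consecutive tests with an internal Red Team", which is an interpretation rather than regulatory text; the `TlptTest` record and the years used are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TlptTest:
    year: int
    ti_provider_external: bool
    red_team_external: bool


def check_cycle(tests):
    """Return a list of rule violations for a chronological list of tests."""
    problems = []
    consecutive_internal = 0
    for test in tests:
        # Rule 1: the threat intelligence provider is external for every test.
        if not test.ti_provider_external:
            problems.append(f"{test.year}: threat intelligence provider must be external")
        # Rule 2: an external red team at least every third test.
        if test.red_team_external:
            consecutive_internal = 0
        else:
            consecutive_internal += 1
            if consecutive_internal >= 3:
                problems.append(f"{test.year}: external red team required by this test")
    return problems


history = [
    TlptTest(2027, ti_provider_external=True, red_team_external=False),
    TlptTest(2030, ti_provider_external=True, red_team_external=False),
    TlptTest(2033, ti_provider_external=True, red_team_external=False),
]
print(check_cycle(history))
```

With three internal red team tests in a row, the check flags the third one; switching any of the three to an external Red Team clears it.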
Purple Teaming Is Mandatory
TIBER-EU strongly recommended purple teaming and replay workshops, but did not mandate them. DORA makes the purple team exercise compulsory. This is arguably the most valuable change: the mandatory collaboration between Red Team and Blue Team after the active phase transforms the test from an assessment into a learning exercise. The Red Team walks the Blue Team through every attack step, explains what detection opportunities existed, and helps develop specific improvements. This is where detection capability actually improves — not during the test, but in the closure phase.
ICT Third-Party Providers Must Participate
When ICT service providers are in scope of the TLPT, the financial entity must take the necessary measures to ensure their participation. This reflects the reality that modern financial infrastructure depends heavily on managed service providers, cloud vendors, and critical software suppliers — and that supply chain compromise is a primary attack vector for the sophisticated threat actors that target the financial sector. The entity remains fully responsible for the impact of tests involving third parties.
Mutual Recognition via Attestation
Upon successful completion, the competent authority issues a formal attestation confirming the test met framework requirements. This attestation enables mutual recognition across EU member states — a multinational entity operating in multiple jurisdictions does not need to repeat the full TLPT in each country. This addresses a significant cost and complexity concern and was a key request from the financial industry during the DORA consultation process.

What TLPT reveals — and what it does not.

TLPT provides an assessment that no other testing methodology can replicate: a realistic, intelligence-driven evaluation of how the entire organisation — people, processes, and technology — would perform against a genuine, sophisticated adversary targeting its most critical functions. When executed well, a TLPT reveals gaps in detection capability, incident escalation procedures, response coordination, containment effectiveness, and recovery processes that conventional penetration testing and vulnerability assessments simply cannot identify.

However, TLPT also has inherent limitations that organisations must understand. Because the test replicates known threat actor TTPs, it may not capture novel, zero-day, or signatureless attacks that a real adversary might employ. A test based on APT35's documented techniques tells you how your organisation would perform against APT35's documented techniques — it does not tell you how you would perform against an attack method that APT35 has not yet used publicly, or against a completely different adversary. Some frameworks address this through a 'Scenario X' provision, allowing the Red Team to include creative, non-intelligence-driven attack techniques alongside the threat-led scenarios, but this is not universally applied.

Organisations should treat TLPT as one component of a comprehensive security testing programme, alongside regular penetration testing (DORA Articles 24–25 require in-scope entities, other than microenterprises, to test the ICT systems and applications supporting critical or important functions at least yearly), vulnerability management, and continuous security monitoring, rather than as a single test that validates the entire security posture. TLPT tests resilience against specific, documented threats. That is its strength and its limitation.


The bottom line.

A Threat-Led Penetration Test is an intelligence-driven red team exercise conducted against a live production environment, in which testers replicate the tactics, techniques, and procedures of real threat actors known to target the entity's sector. The Blue Team does not know the test is happening. The scope covers the organisation's critical functions. The scenarios are derived from genuine threat intelligence about real adversaries. And under DORA, the entire process is overseen by the regulatory authority and results in a formal attestation that carries legal weight across the EU.

TLPT represents the most rigorous form of security testing available to financial institutions. It moves beyond the question of 'can this system be compromised' — to which the answer is always yes — to the question that actually matters: 'when this system is compromised, will our organisation detect it, respond effectively, and protect its critical functions?' The mandatory purple teaming phase ensures that the test produces actionable capability improvements, not just a report that sits in a drawer.

For financial entities in scope of DORA's TLPT requirements, the clock is running. The first deadline falls before January 2028, but provider selection, regulatory engagement, scope definition, threat intelligence production, and ICT third-party coordination all require significant lead time. The organisations that extract the most value from TLPT are those that treat it not as a compliance exercise to be endured but as a genuine, high-fidelity test of their resilience against the adversaries that are, right now, actively targeting their sector.

