Anatomy of a Breach: Collection #1 — 773 Million Email Addresses and the Credential Dump That Dwarfed Everything Before It

> series: anatomy_of_a_breach | part: 121 | dataset: collection_1 | emails: 773,000,000 | passwords: 21,000,000

Hedgehog Security 31 January 2019 13 min read

773 million emails. 21 million passwords. Aggregated from thousands of breaches. Available to everyone.

In January 2019, Troy Hunt, the security researcher behind Have I Been Pwned, reported the discovery of Collection #1: an 87 GB dataset containing 773 million unique email addresses and over 21 million unique plaintext passwords, compiled from thousands of separate data breaches spanning years. The dataset had been assembled by aggregating credentials from breaches old and new, including LinkedIn, Adobe, Myspace, and hundreds of smaller breaches, into a single, searchable, weaponisable collection.

Within weeks, Collections #2 through #5 were discovered, bringing the total to approximately 2.2 billion unique username-password pairs. The scale was staggering: even allowing for the many people who appear more than once, 2.2 billion credentials is a figure comparable to a significant fraction of the world's internet-connected population. For credential-stuffing attackers, the Collection datasets provided an industrial-scale ammunition supply, making password-only authentication fundamentally untenable for any service connected to the internet.


Recommended

Not sure where to start?

We'll scope your test for free and tell you exactly what you need. No obligation, no hard sell.

Free Scoping Call

2.2 billion credentials. Password-only authentication is over.

Aggregated From Thousands of Breaches
Collection #1 was not a single breach. It was an aggregation of credentials from thousands of breaches, compiled into one dataset that is convenient for credential stuffing. Breach data is therefore cumulative: every breach documented in this series adds to the growing pool of compromised credentials. <a href="https://www.socinabox.co.uk/blog/what-is-the-dark-web-business-guide">Dark web monitoring</a> through <a href="https://www.socinabox.co.uk">SOC in a Box</a> detects when your credentials appear in these aggregated datasets.
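
To make the idea of checking for exposure concrete, here is a minimal Python sketch of the k-anonymity lookup used by the Pwned Passwords range search from Have I Been Pwned (`https://api.pwnedpasswords.com/range/<first five hash characters>`). It is an illustration only and says nothing about how any commercial monitoring service works internally. The function names are hypothetical, and the response body in the usage note is made-up sample data rather than real API output. Only the first five characters of the SHA-1 hash would ever leave your machine; the comparison against the returned suffixes happens locally.

```python
import hashlib
from typing import Tuple


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest.

    The 5-character prefix is what a client would send to the range
    endpoint; the 35-character suffix never leaves the local machine.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, range_body: str) -> int:
    """Look up a hash suffix in a range response body.

    Each response line is assumed to have the form SUFFIX:COUNT.
    Returns the reported count, or 0 if the suffix is not listed.
    """
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0
```

In use, a client would call `split_sha1("password")`, fetch the range for the returned prefix, and pass the response text to `count_in_range`. With a hypothetical two-line body such as `"0000000000000000000000000000000000A:3\n" + suffix + ":42"`, the function returns 42 for the matching suffix and 0 for any password whose suffix is absent.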
MFA Is Now Non-Negotiable
With 2.2 billion credentials in circulation, it is safest to assume that any given user's password has been compromised at least once. MFA is the control that keeps an account protected when its password is already known to an attacker. <a href="/cyber-essentials">Cyber Essentials Danzell</a> makes MFA an auto-fail criterion because the evidence from a decade of this series, culminating in Collection #1, is irrefutable.
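
As one illustration of why a second factor still holds when the password is already public, the sketch below implements time-based one-time passwords (TOTP, RFC 6238, built on HOTP from RFC 4226) using only the Python standard library. The function names and the `window` parameter are assumptions made for this example; a production deployment would use a maintained library and would also need secure secret storage and replay protection, which are not shown.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time step."""
    return hotp(secret, unix_time // step, digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int,
                step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    counter = unix_time // step
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue  # no time steps exist before the epoch
        expected = hotp(secret, counter + drift, digits)
        if hmac.compare_digest(expected, submitted):
            return True
    return False
```

An attacker holding a password from a credential dump still lacks the shared secret, so a replayed password alone fails verification. TOTP codes can still be phished in real time, which is one reason the passkey and FIDO2 methods discussed later in this article are attractive.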
Credential Stuffing at Scale
The Collection datasets fuelled a surge in credential-stuffing attacks throughout 2019 and beyond. Automated tools that replay leaked username-password pairs against internet-facing login pages became one of the most common attack methods. Our <a href="/penetration-testing/web-application">application testing</a> includes credential-stuffing simulation and rate-limiting assessment.
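
On the defensive side, the kind of rate limiting that such an assessment looks for can be sketched as a sliding-window counter of failed logins per key. This is a minimal in-memory illustration, not a description of any particular product or test methodology: the class name, the thresholds, and the choice of key (source IP address, username, or both) are all assumptions. A real deployment would need shared state across servers and would pair this with MFA and breached-password checks.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginLimiter:
    """Block a key after too many failed logins inside a sliding window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # key -> timestamps of recent failures, oldest first
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def _prune(self, key: str, now: float) -> Deque[float]:
        """Drop failures that have aged out of the window."""
        recent = self._failures[key]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        return recent

    def is_blocked(self, key: str, now: float) -> bool:
        """True if the key has reached the failure limit within the window."""
        return len(self._prune(key, now)) >= self.max_failures

    def record_failure(self, key: str, now: float) -> None:
        """Register one failed login attempt for the key at time `now`."""
        self._prune(key, now).append(now)
```

A login handler would call `is_blocked` before checking the password and `record_failure` after each failed attempt, passing a timestamp such as `time.monotonic()`. Keying on source address alone is easy to evade, because credential-stuffing tools commonly spread attempts across many addresses, so per-account limits are a useful complement.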
Passkeys and FIDO2 — The Future
Collection #1 accelerated the industry's move toward passwordless authentication — FIDO2/WebAuthn and passkeys that eliminate passwords entirely. <a href="/cyber-essentials">Cyber Essentials Danzell</a> now accepts passkeys and FIDO2 as MFA methods, reflecting the industry's direction.

Every breach in this series contributed to Collection #1.

Collection #1 is the aggregated consequence of every credential breach documented throughout this series — from LinkedIn (117M) through Adobe (153M) to Yahoo (3B). Each breach added credentials to the pool; Collection #1 made the pool searchable. The lesson is stark: any password that has ever been used on any service that has ever been breached is now effectively public knowledge. Password-only authentication is over.

Cyber Essentials mandates MFA. Dark web monitoring through SOC in a Box detects credential exposure. Our application testing validates authentication controls and credential-stuffing defences. And UK Cyber Defence provides incident response when credential compromise leads to account takeover.


2.2 billion credentials are public. MFA is the only answer. Is yours deployed universally?

<a href="/cyber-essentials">Cyber Essentials Danzell</a> mandates MFA. <a href="https://www.socinabox.co.uk">SOC in a Box</a> monitors dark web credential databases. Our <a href="/penetration-testing">penetration testing</a> validates authentication controls.
