Methodology

Measured at the source: the BitTorrent network itself

Most piracy reporting is built on screenshots and site-scraping. piracy.watch is built on continuous, protocol-level observation of the BitTorrent Distributed Hash Table — the same decentralised system the swarms themselves rely on. Here is the pipeline, end to end.

DHT observation

Our collectors participate in the BitTorrent Distributed Hash Table, the decentralised index to which peers in public swarms announce themselves. By crawling the DHT continuously, we observe which infohashes are active, how large their swarms are, and how activity changes hour by hour. This is direct measurement of the network itself, not scraping of piracy websites.
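As a rough illustration of the hour-by-hour view, the sketch below counts distinct peers per infohash per hour. The record shape (infohash, opaque peer key, Unix timestamp) and the function name are assumptions made for this example, not a description of the production collectors.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_swarm_sizes(observations):
    """Count distinct peers seen per (infohash, UTC hour).

    `observations` is an iterable of (infohash, peer_key, unix_timestamp)
    tuples. This shape is hypothetical and chosen only for illustration.
    """
    peers = defaultdict(set)
    for infohash, peer_key, ts in observations:
        # Truncate the timestamp to the start of its UTC hour.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).replace(
            minute=0, second=0, microsecond=0
        )
        peers[(infohash, hour)].add(peer_key)
    # Swarm size is the number of distinct peers in each bucket.
    return {key: len(seen) for key, seen in peers.items()}
```

Repeated sightings of the same peer within an hour are counted once, so the figure approximates swarm size rather than announce volume.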

Metadata resolution

Active infohashes are resolved to their underlying torrent metadata (the release name and file list) using the standard BitTorrent metadata exchange. The release name is then parsed for source tags (CAM, HDTS, SCREENER, WEB-DL, HDTV, BDRip), resolution, season and episode markers, and release-group conventions.
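Release-name parsing of this kind can be sketched with a few regular expressions. The example below is a minimal illustration only: the patterns, the `parse_release` name, and the trailing "-GROUP" convention are assumptions, and real release names are far messier than this handles.

```python
import re

# Source tags taken from the list above; everything else is illustrative.
SOURCE_TAGS = ("CAM", "HDTS", "SCREENER", "WEB-DL", "HDTV", "BDRip")
_CANONICAL = {t.upper(): t for t in SOURCE_TAGS}

_EDGE_L, _EDGE_R = r"(?<![A-Za-z0-9])", r"(?![A-Za-z0-9])"
_SOURCE_RE = re.compile(
    _EDGE_L + "(" + "|".join(re.escape(t) for t in SOURCE_TAGS) + ")" + _EDGE_R,
    re.IGNORECASE,
)
_RES_RE = re.compile(_EDGE_L + r"(\d{3,4}p)" + _EDGE_R, re.IGNORECASE)
_EP_RE = re.compile(_EDGE_L + r"S(\d{1,2})E(\d{1,3})" + _EDGE_R, re.IGNORECASE)
_GROUP_RE = re.compile(r"-([A-Za-z0-9]+)$")  # assumed "-GROUP" suffix


def parse_release(name):
    """Extract source tag, resolution, season/episode and group from a name."""
    source = _SOURCE_RE.search(name)
    res = _RES_RE.search(name)
    ep = _EP_RE.search(name)
    group = _GROUP_RE.search(name)
    # Do not mistake the "-DL" of a trailing "WEB-DL" for a group suffix.
    if group and source and source.start() <= group.start() < source.end():
        group = None
    return {
        "source": _CANONICAL[source.group(1).upper()] if source else None,
        "resolution": res.group(1).lower() if res else None,
        "season": int(ep.group(1)) if ep else None,
        "episode": int(ep.group(2)) if ep else None,
        "group": group.group(1) if group else None,
    }
```

Fields that cannot be found are returned as `None` rather than guessed, which keeps unparseable names visible downstream.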

Catalogue matching

Parsed releases are matched against a film and television catalogue — title, year, season, studio, production companies and broadcast network — so swarm activity rolls up to the level your business thinks in: titles, franchises, publishers and platforms.
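A minimal sketch of the matching and roll-up step, assuming an exact match on normalised title and year. The catalogue record fields, the function names, and the exact-match rule are all assumptions for illustration; a real matcher would also need to handle alternate titles and ambiguous years.

```python
import re
from collections import Counter


def normalise_title(text):
    """Lower-case and collapse separators so 'Example.Film' equals 'Example Film'."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def build_index(catalogue):
    """Index catalogue entries (dicts with 'title' and 'year') for lookup."""
    return {(normalise_title(e["title"]), e["year"]): e for e in catalogue}


def match_release(title, year, index):
    """Return the catalogue entry for a parsed release, or None if unmatched."""
    return index.get((normalise_title(title), year))


def rollup(matches, field):
    """Sum observed peers per catalogue field, e.g. 'studio' or 'network'.

    `matches` is an iterable of (catalogue_entry, peer_count) pairs.
    """
    totals = Counter()
    for entry, peers in matches:
        totals[entry[field]] += peers
    return dict(totals)
```

Rolling up by a different field (studio, network, franchise) only changes the `field` argument, which is why the match is resolved to a full catalogue entry first.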

Geographic aggregation

Swarm participation is aggregated by territory using IP geolocation, producing per-territory demand pictures. Aggregation happens at country and city level: the platform reports distributions, not individuals.
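The aggregation step can be sketched as a reduction from peer addresses to a share-per-country distribution. The `lookup_country` callable stands in for whatever geolocation database is used and is a hypothetical parameter here; only the resulting distribution leaves the function.

```python
from collections import Counter


def country_distribution(peer_ips, lookup_country):
    """Return each country's share of distinct peers in a swarm.

    `lookup_country` is an assumed callable mapping an IP string to a
    country code, or None when the address cannot be located.
    """
    counts = Counter()
    for ip in set(peer_ips):  # de-duplicate repeated sightings
        counts[lookup_country(ip) or "unknown"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Only shares are returned; individual addresses are discarded here.
    return {country: n / total for country, n in counts.items()}
```

Addresses that cannot be located are kept in an explicit "unknown" bucket so the shares still sum to one.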

Infrastructure intelligence

Alongside swarm telemetry we continuously monitor the infrastructure piracy runs on: tracker domains (WHOIS registration status, expiry, uptime, latency) and streaming-site brands across their rotating domains — including takedown-delisting history.

Intelligence products

The result is fifteen live modules, alerting, and packaged monthly reporting: leak velocity and source attribution, licensing-gap and demand mapping, impact-ranked takedown queues, network benchmarks and historical archives.

Principles

Direct observation, not estimates

Headline figures derive from observed swarm participation, not survey extrapolation. When we model or sample, the methodology note on the module says so.

Aggregates, not individuals

Reporting is at the level of titles, territories and infrastructure. We do not publish personally identifiable information about individual file-sharers.

Time-stamped and reproducible

Every observation carries first-seen and last-seen timestamps and is retained in a historical archive, so a number quoted in a report can be traced to when and how it was observed.
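One way to picture the first-seen and last-seen bookkeeping is a small record updated on every sighting. The `Sighting` class, the `record` function, and the dictionary archive are hypothetical names for this sketch and say nothing about the actual storage layer.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    first_seen: float  # Unix timestamp of the earliest observation
    last_seen: float   # Unix timestamp of the latest observation
    count: int = 1     # number of observations recorded


def record(archive, key, ts):
    """Record an observation of `key` at time `ts` in the `archive` dict."""
    sighting = archive.get(key)
    if sighting is None:
        archive[key] = Sighting(first_seen=ts, last_seen=ts)
    else:
        # min/max keep the window correct even if observations arrive late.
        sighting.first_seen = min(sighting.first_seen, ts)
        sighting.last_seen = max(sighting.last_seen, ts)
        sighting.count += 1
    return archive[key]
```

Using `min` and `max` rather than overwriting means out-of-order ingestion does not shrink the observed window.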

Lawful collection

Collection is limited to information the BitTorrent protocol makes public by design. We do not intrude into private systems, bypass technical protection measures, or seed content.

A note on the numbers you'll see on this site

The demo portal and the public title, studio and network pages on piracy.watch use illustrative sample data so we can show the product's analytical surface without publishing client-grade intelligence openly. Catalogue metadata — titles, years, artwork, studios, networks — is real and refreshed weekly. The piracy figures attached to it on the public site are representative samples, clearly labelled. Customer reporting runs on live data.

Questions about the methodology?

We're happy to walk your legal or data team through collection, retention and reporting in detail.