If you’ve ever downloaded a large file (a game patch or, let’s be honest, a TV show), chances are you’ve used BitTorrent, even if indirectly. Born in 2001 from the mind of Bram Cohen, BitTorrent didn’t just become another file-sharing protocol; it became one of the most efficient ways to distribute large files across the internet. At its peak, it was estimated to account for roughly 35% of all internet traffic.
But what made BitTorrent so special? And how does it actually work? Let’s break it down.
P2P networking : A refresher
A simple definition :
💡A communications model in which each party has the same capabilities and either party can initiate a communication session
The formal definition :
💡A distributed network architecture may be called a Peer-to-Peer (P-to-P, P2P,...) network if the participants share a part of their own (hardware) resources (processing power, storage capacity, network link capacity, printers,...). These shared resources are necessary to provide the Service and content offered by the network (e.g. file sharing or shared workspaces for collaboration). They are accessible by other peers directly, without passing intermediary entities. The participants of such a network are thus resource (Service and content) providers as well as resource (Service and content) requestors (Servent-concept).
Pure p2p :
💡A distributed network architecture has to be classified as a “Pure” Peer-to-Peer network, if it is firstly a Peer-to-Peer network according to Definition 1 and secondly if any single, arbitrary chosen Terminal Entity can be removed from the network without having the network suffering any loss of network service.
Hybrid p2p :
💡A distributed network architecture has to be classified as a “Hybrid” Peer-to-Peer network, if it is firstly a Peer-to-Peer network according to Definition 1 and secondly a central entity is necessary to provide parts of the offered network services.
Why BitTorrent was just different
Most early file-sharing systems (like Napster or Kazaa) relied on a simple model: you download a file directly from one person. The problem? If that person had a slow upload speed, your download slowed to a crawl.
BitTorrent flipped this on its head by introducing swarming—a way to download different pieces of a file from multiple people at once. This meant:
Faster downloads (you’re not bottlenecked by one slow peer).
Fairer sharing (you upload to others while downloading).
Less strain on the original host (the more people downloading, the more sources there are).
The key innovation? Tit-for-tat incentives: if you don’t upload to others, they won’t upload to you. This kept freeloaders in check and made the system self-sustaining.
Architecture
💡Read the entire protocol specification at https://www.bittorrent.org/beps/bep_0003.html
The base components of the system are :
static metainfo file (a “torrent” file) - A metadata file that describes the content (filename, size, hashes) and points to the tracker.
a tracker - A server (or decentralized network) that helps peers find each other in the swarm.
Seed (an original downloader) - A peer that has the complete file and shares it with others.
Leecher (the end user downloader) - A peer downloading the file (may also upload pieces it already has).
Torrent File
This is a bencoded file (see the bencoding specification under Resources for a simple explanation). Each torrent file decodes to something like this:
{
"announce": "http://tracker.ubuntu.com:6969/announce", // Tracker URL
"info": {
"name": "ubuntu-22.04-desktop-amd64.iso", // File name (directory name for multi-file torrents)
"piece length": 262144, // 256 KB per piece
"pieces": "a3b456c7890def... (SHA-1 hash list)", // Hashes for each piece
"length": 3650725888, // File size (~3.6 GB); single-file torrents only
"files": [ // Multi-file torrents only; used instead of "length"
{ "path": ["ubuntu-22.04", "README.txt"], "length": 1024 },
{ "path": ["ubuntu-22.04", "iso", "image.iso"], "length": 3650724864 }
]
},
"created by": "mktorrent 1.1", // Tool used
"creation date": 1640995200 // Unix timestamp
}
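To make "bencoded" a bit more concrete, here is a minimal decoder sketch in Python. It is not taken from my Rust client; the function name `bdecode` and the sample torrent bytes are made up for illustration, and it does only light error handling.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at index i.

    Returns (value, index_after_value). Strings come back as bytes.
    """
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        out = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            out[key], i = bdecode(data, i)
        return out, i + 1
    if c.isdigit():                        # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {i}")


# Hypothetical, tiny single-file torrent (no "pieces" field, to keep it short).
sample = (b"d8:announce18:http://t.example/a"
          b"4:infod6:lengthi1024e4:name8:demo.bin12:piece lengthi262144eee")
torrent, end = bdecode(sample)
print(torrent[b"info"][b"name"], end == len(sample))
```

Dictionary keys and strings stay as raw bytes on purpose: the "pieces" field is binary data, not text.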
announce
The tracker URL that coordinates peers
In trackerless torrents (DHT), this field may be omitted.
info
name: Name of the file/directory.
piece length: Size of each "piece" (typically 256 KB–2 MB).
pieces: Concatenated SHA-1 hashes of every piece (the real file would have 1000s of hashes).
length: Total file size (for single-file torrents).
files: List of files (for multi-file/directory torrents).
Optional Fields
created by: Software that generated the torrent.
comment: Human-readable notes (e.g., "Official Ubuntu ISO").
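Here is a small sketch of how a client might use the piece length and pieces fields: split the concatenated hash string into 20-byte SHA-1 digests, work out how many pieces to expect, and verify a downloaded piece. The function names are my own for this example.

```python
import hashlib

HASH_LEN = 20  # a SHA-1 digest is always 20 bytes


def split_piece_hashes(pieces: bytes) -> list:
    """Split the concatenated "pieces" value into one digest per piece."""
    if len(pieces) % HASH_LEN != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + HASH_LEN] for i in range(0, len(pieces), HASH_LEN)]


def piece_count(total_length: int, piece_length: int) -> int:
    """Number of pieces, rounding up because the last piece may be shorter."""
    return (total_length + piece_length - 1) // piece_length


def verify_piece(data: bytes, expected_hash: bytes) -> bool:
    """A piece is only accepted if its SHA-1 matches the torrent file."""
    return hashlib.sha1(data).digest() == expected_hash


# With the example numbers above: 3650725888 bytes in 256 KB pieces.
print(piece_count(3650725888, 262144))
```

For the example torrent above this gives 13927 pieces, so the pieces field would be 13927 × 20 bytes long.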
Tracker
The tracker maintains a real-time log of peers downloading the same file and helps them connect (swarm formation).
It uses a simple HTTP-based protocol to exchange peer lists—no file data is handled.
Peers announce their presence (IP/port) to the tracker and receive a list of other active peers.
The tracker is not involved in actual data transfer between peers.
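As a rough sketch of the announce exchange, the snippet below builds the HTTP GET URL a client sends and parses a compact peer list (6 bytes per peer: 4 for the IPv4 address, 2 for the port). The tracker URL and peer id are placeholders, and the `compact=1` parameter assumes the tracker supports the compact peer list extension (BEP 23). No network call is made here.

```python
import ipaddress
import struct
from urllib.parse import quote, urlencode


def build_announce_url(announce: str, info_hash: bytes, peer_id: bytes,
                       port: int, left: int) -> str:
    """Build the tracker GET request for a client that has downloaded nothing yet."""
    params = {
        "info_hash": info_hash,   # 20 raw bytes, percent-encoded below
        "peer_id": peer_id,       # 20 bytes chosen by the client
        "port": port,             # port this client listens on
        "uploaded": 0,
        "downloaded": 0,
        "left": left,             # bytes still needed
        "compact": 1,             # ask for the 6-bytes-per-peer format
    }
    return announce + "?" + urlencode(params, quote_via=quote)


def parse_compact_peers(blob: bytes) -> list:
    """Turn a compact peer list into (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for i in range(0, len(blob), 6):
        ip = str(ipaddress.IPv4Address(blob[i:i + 4]))
        (port,) = struct.unpack(">H", blob[i + 4:i + 6])  # big-endian port
        peers.append((ip, port))
    return peers


url = build_announce_url("http://tracker.example/announce", bytes(range(20)),
                         b"-PY0001-000000000000", 6881, 1024)
print(url)
print(parse_compact_peers(bytes([192, 168, 1, 10, 0x1A, 0xE1])))
```

The tracker's actual reply is itself a bencoded dictionary; the compact list is the value of its "peers" key.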
Seeds
A seed is a peer that has the complete file and shares it with others in the swarm. Unless the remaining peers hold every piece between them, downloads cannot complete without at least one seed. Seeds ensure availability by:
Uploading full copies of the file to new peers.
Keeping the swarm alive even after distributing the initial copy (if needed).
Reducing bandwidth strain on the original publisher.
So downloading a file starts with getting the torrent file and opening it in a BitTorrent client. If you want to share a file, you create a torrent file for it, seed the file from your client, and give the torrent file to whoever wants to download it.
Piece Selection Algorithms
BitTorrent doesn’t just randomly grab file chunks—it uses a smart, self-optimizing system to ensure fast, resilient downloads. Here’s how its Piece Selection Algorithm works, broken down into its clever policies:
1. Sub-Pieces: Keeping the Pipes Full
Files are split into pieces (e.g., 256 KB), then further into sub-pieces (~16 KB).
Why? To maintain steady TCP throughput by always having multiple sub-pieces in transit.
Think of it like a conveyor belt: As soon as one sub-piece finishes, another is requested.
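The conveyor belt idea can be sketched like this: split a piece into 16 KB block requests, then keep a fixed number of them outstanding. The depth of 5 is the figure the BitTorrent paper describes as typical; the function names are mine.

```python
BLOCK_SIZE = 16 * 1024  # 16 KB sub-pieces (blocks)


def block_requests(piece_index: int, piece_length: int) -> list:
    """Return (piece_index, begin, length) for every block of one piece."""
    return [(piece_index, begin, min(BLOCK_SIZE, piece_length - begin))
            for begin in range(0, piece_length, BLOCK_SIZE)]


def fill_pipeline(pending: list, in_flight: set, depth: int = 5) -> list:
    """Move requests from pending to in_flight until `depth` are outstanding.

    Returns the requests that should be sent to the peer right now.
    """
    sent = []
    while pending and len(in_flight) < depth:
        req = pending.pop(0)
        in_flight.add(req)
        sent.append(req)
    return sent


pending = block_requests(0, 262144)      # a full 256 KB piece: 16 blocks
in_flight = set()
first_batch = fill_pipeline(pending, in_flight)
print(len(first_batch), len(pending))
```

Whenever a block arrives, the client removes it from `in_flight` and calls `fill_pipeline` again, so the connection never sits idle waiting for the next request.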
2. Strict Policy: Finish What You Start
Once a sub-piece is requested, the client prioritizes grabbing the rest of that piece first.
Goal: Finish pieces quickly, since only complete, hash-verified pieces can be shared with the swarm.
3. Rarest First: Save the Endangered Pieces
The core of BitTorrent’s efficiency. Clients prioritize downloading the rarest pieces first.
Why?
Prevents "last-piece syndrome" (where one missing chunk stalls everyone).
Spreads diversity in the swarm, so no single peer becomes a bottleneck.
Rewards peers who upload rare pieces (tit-for-tat incentives kick in).
4. Random First Piece: Jumpstarting Downloads
At the start, you have nothing to trade. So, the client grabs a random piece first (not rarest).
Logic: Rare pieces download slower (fewer sources). Better to get any piece fast to start uploading.
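Rarest first and the random-first-piece exception fit in a few lines. In this sketch each peer is represented by the set of piece indices it has (what you would learn from bitfield and have messages); that representation and the name `pick_piece` are assumptions for the example.

```python
import random
from collections import Counter


def pick_piece(have: set, peer_pieces: list, rng=random):
    """Choose the next piece index to download, or None if nothing is useful."""
    counts = Counter()
    for pieces in peer_pieces:
        counts.update(pieces)          # how many peers hold each piece
    wanted = sorted(p for p in counts if p not in have)
    if not wanted:
        return None
    if not have:
        # Random first piece: we have nothing to trade yet, so grab anything.
        return rng.choice(wanted)
    # Rarest first: among pieces we still need, take one with the fewest copies.
    fewest = min(counts[p] for p in wanted)
    return rng.choice([p for p in wanted if counts[p] == fewest])


peers = [{0, 1, 2}, {1, 2}, {2, 3}]
print(pick_piece({0}, peers))
```

Here piece 3 is held by only one peer, so it is picked ahead of pieces 1 and 2. Ties are broken at random so that peers don't all pile onto the same piece.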
5. Endgame Mode: The Final Sprint
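Once every missing sub-piece has already been requested (the last stretch of the download), the client sends requests for them to all peers and cancels the duplicates as sub-pieces arrive.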
Tradeoff: Wastes a little bandwidth but prevents one slow peer from delaying completion.
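A minimal sketch of endgame mode, assuming blocks are `(piece_index, begin, length)` tuples and peers are identified by simple names. It only computes which requests and cancels to send; the actual networking is left out.

```python
def endgame_requests(missing_blocks: list, peer_pieces: dict) -> list:
    """Pair every missing block with every peer that has its piece."""
    return [(peer, block)
            for block in missing_blocks
            for peer, pieces in sorted(peer_pieces.items())
            if block[0] in pieces]


def cancels_for(block, sender, outstanding: list) -> list:
    """Once `block` arrives from `sender`, cancel it at every other peer."""
    return [(peer, b) for peer, b in outstanding
            if b == block and peer != sender]


missing = [(7, 0, 16384), (7, 16384, 16384)]
peer_pieces = {"p1": {7}, "p2": {7, 8}, "p3": {8}}
requests = endgame_requests(missing, peer_pieces)
print(requests)
print(cancels_for(missing[0], "p1", requests))
```

The cancel step is what keeps the wasted bandwidth small: duplicates are only in flight until the first copy lands.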
Why This Matters
BitTorrent’s algorithm isn’t just fast, it’s self-healing. It works by:
Prioritizing rarity, so pieces are unlikely to go extinct.
Balancing fairness (tit-for-tat) with urgency (endgame mode).
Optimizing for both seeders (who want to offload data) and leechers (who want speed).
The result? A swarm where every peer’s selfishness (wanting fast downloads) aligns with the collective good.
Fun fact: This is why dead torrents often stall at 99%: nobody left in the swarm has the last missing piece!
Resource Allocation Algorithms
BitTorrent doesn’t just move data—it enforces a fair-play economy where peers trade uploads for downloads. Here’s how its Resource Allocation system works, broken into its key mechanisms:
1. Choking Algorithm: Tit-for-Tat in Action
Peers unchoke (allow uploads to) a fixed number of peers at any time (4 by default): the ones giving them the best download rates.
Why? It mirrors game theory’s "tit-for-tat": You scratch my back, I’ll scratch yours.
Updated every 10 seconds using a 20-second average to avoid TCP thrashing.
Result: Free riders get choked out; generous peers get faster downloads.
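Here is a sketch of the regular unchoke decision: measure each peer's rate over a rolling 20-second window, then every 10 seconds keep the top four interested peers unchoked. Class and function names are made up for this example, and timestamps are passed in explicitly to keep it easy to follow.

```python
from collections import deque


class RollingRate:
    """Average bytes per second received from one peer over a time window."""

    def __init__(self, window: float = 20.0):
        self.window = window
        self.samples = deque()            # (timestamp, nbytes)

    def add(self, now: float, nbytes: int) -> None:
        self.samples.append((now, nbytes))

    def rate(self, now: float) -> float:
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        return sum(n for _, n in self.samples) / self.window


def choose_unchoked(download_rates: dict, interested: set, slots: int = 4) -> set:
    """Pick the interested peers we currently download from fastest."""
    ranked = sorted(interested, key=lambda p: download_rates.get(p, 0.0),
                    reverse=True)
    return set(ranked[:slots])


rates = {"a": 50.0, "b": 10.0, "c": 70.0, "d": 5.0, "e": 30.0, "f": 99.0}
print(choose_unchoked(rates, {"a", "b", "c", "d", "e"}))
```

Peer "f" is the fastest but isn't interested in our data, so it doesn't take a slot; "d" is interested but too slow and stays choked.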
2. Optimistic Unchoking: Giving New Peers a Chance
One slot is reserved for a random peer, even if they’re slow.
Rotated every 30 seconds to test if new connections outperform existing ones.
Purpose: Discovers hidden fast peers and helps newcomers join the swarm.
3. Anti-Snubbing: Retaliation Mode
If a peer gets nothing from another peer for 60 seconds, it’s "snubbed."
Response: Stop uploading to that peer (except during optimistic unchoking).
Logic: Punishes uncooperative peers while searching for better partners.
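Optimistic unchoking and anti-snubbing can be layered on top of the regular unchoke set. The sketch below assumes we track, per peer, the time we last received a piece from it; the names and the way state is passed around are illustrative only.

```python
import random

SNUB_TIMEOUT = 60.0          # seconds without a piece before a peer counts as snubbing us
OPTIMISTIC_INTERVAL = 30.0   # how often the optimistic slot is rotated


def is_snubbed(last_piece_time: float, now: float) -> bool:
    """True if this peer has sent us nothing for over a minute."""
    return now - last_piece_time > SNUB_TIMEOUT


def pick_optimistic(choked_interested: set, rng=random):
    """Choose a random choked-but-interested peer for the optimistic slot."""
    if not choked_interested:
        return None
    return rng.choice(sorted(choked_interested))


def upload_targets(regular: set, last_piece_times: dict, optimistic, now: float) -> set:
    """Regular unchokes minus peers that snubbed us, plus the optimistic slot."""
    kept = {p for p in regular
            if not is_snubbed(last_piece_times.get(p, now), now)}
    if optimistic is not None:
        kept.add(optimistic)   # the optimistic unchoke is exempt from snubbing
    return kept


times = {"a": 100.0, "b": 30.0}
print(upload_targets({"a", "b"}, times, "z", now=100.0))
```

In a real client `pick_optimistic` would run once every `OPTIMISTIC_INTERVAL` seconds, while the regular set is recomputed every 10 seconds.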
4. Upload-Only Mode: Seeds Play Nice Too
Once a peer finishes downloading, it switches to seeding mode.
Now, it prefers the peers it can upload to fastest (it no longer has download rates to go by).
Why? Ensures fast replication of pieces to keep the swarm healthy.
Let’s talk implementation!
Thankfully the protocol spec is very clear, which makes the implementation much easier to navigate. Here I have focused on downloading a file.
The client establishes a TCP connection with a peer who has the “piece” that is going to be downloaded.
💡The file is divided into multiple pieces and each piece is divided into multiple blocks. Typically a peer with a piece has all the blocks of that piece
So now we can think about the steps to download a particular piece as
Read the torrent file
Get the info hash (a unique identifier for the torrent, used when talking to trackers and peers) and ask the tracker for the list of peers in the swarm
Open a TCP connection to a peer
Send the handshake
Receive the bitfield message (it tells us which pieces the peer has)
Send an interested message
Receive an unchoke message
For all blocks in the piece
send a "request" message for the block and receive the block in a "piece" message
store the block
Combine all blocks into the piece and check its SHA-1 hash against the torrent file
And to download the file we just need to download all the pieces!
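To give a feel for what goes over the wire in those steps, here is a Python sketch of the info hash, the 68-byte handshake, and the length-prefixed peer messages from the spec. This is not the code from my Rust client; the helper names and the peer id are invented for the example, and socket handling is left out.

```python
import hashlib
import struct

PSTR = b"BitTorrent protocol"
UNCHOKE, INTERESTED, BITFIELD, REQUEST, PIECE = 1, 2, 5, 6, 7  # message ids


def bencode(value) -> bytes:
    """Tiny bencode encoder, enough to re-encode an info dictionary."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys must appear in sorted order.
        return b"d" + b"".join(bencode(k) + bencode(v)
                               for k, v in sorted(value.items())) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def compute_info_hash(info: dict) -> bytes:
    """SHA-1 of the bencoded info dictionary (20 bytes)."""
    return hashlib.sha1(bencode(info)).digest()


def handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """<19><"BitTorrent protocol"><8 reserved bytes><info_hash><peer_id>."""
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def message(msg_id: int, payload: bytes = b"") -> bytes:
    """<4-byte big-endian length><1-byte id><payload>."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def request(index: int, begin: int, length: int) -> bytes:
    """Ask for one block: piece index, byte offset in the piece, block length."""
    return message(REQUEST, struct.pack(">III", index, begin, length))


info = {b"name": b"demo.bin", b"length": 1024,
        b"piece length": 262144, b"pieces": bytes(20)}
hs = handshake(compute_info_hash(info), b"-PY0001-000000000000")
print(len(hs), message(INTERESTED), request(0, 0, 16384))
```

One caveat: re-encoding a decoded dictionary only gives the right info hash if the torrent file was encoded canonically, so hashing the original bytes of the info value is the safer route.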
Check out the implementation in Rust: https://github.com/AkhileshManda/BitClient
Key Upgrades to the Protocol: How BitTorrent Evolved for Survival
BitTorrent’s early design had critical weaknesses—centralized trackers were vulnerable to shutdowns, and ISPs aggressively throttled its traffic. Two major upgrades transformed it into the unstoppable, ISP-friendly protocol we know today. Here’s a deep dive into these improvements, covering bulk traffic marking and the decentralized tracker (DHT), including its underlying mechanics like keyspace partitioning and Kademlia’s peer discovery.
Bulk Traffic Marking: Making Peace with ISPs
The Problem
Early BitTorrent traffic was a network administrator’s nightmare. Unlike web browsing or video streaming, which have predictable bandwidth patterns, BitTorrent would saturate connections by aggressively using all available bandwidth. This clashed with real-time applications like VoIP and online gaming, leading ISPs to throttle or block BitTorrent traffic entirely.
The Solution
BitTorrent’s version 4 update introduced bulk traffic tagging, marking its packets as low-priority ("scavenger class") using DiffServ or similar QoS flags. This allowed:
ISPs to deprioritize (but not block) BitTorrent during congestion.
Fair coexistence with latency-sensitive traffic.
Business adoption, as enterprises could now use BitTorrent without triggering network alarms.
The Impact
This small change had massive repercussions. Suddenly, BitTorrent was no longer public enemy number one for ISPs. Companies like Blizzard Entertainment began using it to distribute game patches, and Linux distributions relied on it to handle massive download spikes.
Decentralized Tracker (DHT): Eliminating the Weakest Link
The Problem
Originally, BitTorrent relied on centralized trackers—servers that maintained lists of peers for each torrent. This created a single point of failure:
Legal vulnerability: Trackers could be sued or shut down (e.g., Pirate Bay’s legal battles).
DDoS risk: Overloaded trackers could cripple entire swarms.
Cost: Hosting trackers required infrastructure and bandwidth.
The Solution: Distributed Hash Tables (DHT)
BitTorrent’s version 4.1 added a peer-to-peer discovery system based on the Kademlia DHT, so a swarm no longer has to depend on a tracker. Here’s how it works:
Keyspace Partitioning: Who Stores What?
Every peer and torrent is identified by a 160-bit ID (a torrent’s ID is the SHA-1 hash of its info dictionary).
These IDs share a single 160-bit keyspace (values from 0 to 2¹⁶⁰ − 1).
A torrent’s peer list is stored on the peer(s) whose IDs are closest to the torrent’s ID.
As with consistent hashing, only the keys near a joining or leaving peer need to move.
The Overlay Network: Structured Peer Connections
Peers maintain connections to others in a structured way using k-buckets:
Each bucket holds up to k peers (typically 8) at exponentially increasing distances.
Distance metric: XOR between peer IDs (ensures symmetry and efficient lookups).
Peers periodically refresh buckets to keep the network alive.
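The XOR metric and the bucket layout are easy to sketch if IDs are treated as plain integers. Deriving an ID by hashing a label, as `to_id` does here, is only a convenience for the example.

```python
import hashlib


def to_id(label: bytes) -> int:
    """Map a label to a 160-bit integer ID (example helper)."""
    return int.from_bytes(hashlib.sha1(label).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR of the two IDs, read as an integer."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Bucket i holds peers whose distance d satisfies 2**i <= d < 2**(i+1)."""
    d = own_id ^ other_id
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length() - 1


def closest(target: int, ids: list, k: int = 8) -> list:
    """The k known IDs nearest to target under the XOR metric."""
    return sorted(ids, key=lambda n: n ^ target)[:k]


a, b = to_id(b"node-a"), to_id(b"node-b")
print(xor_distance(a, b) == xor_distance(b, a), bucket_index(a, b))
print(closest(0b0100, [0b0101, 0b1100, 0b0000], k=2))
```

Because bucket i covers twice the ID range of bucket i-1 but holds the same k entries, a peer knows its close neighbourhood in detail and the far side of the keyspace only sparsely.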
Kademlia’s Lookup Protocol: Finding Peers Efficiently
When a peer wants to find others for a torrent:
It queries the closest peers it knows for the target torrent ID.
Each hop halves the remaining distance, converging in O(log n) steps.
Parallel queries (α = 3) prevent delays from slow peers.
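Key RPCs (Remote Procedure Calls) in Kademlia (BitTorrent’s DHT, BEP 5, names its versions ping, find_node, get_peers and announce_peer):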
PING: Check if a peer is alive.
STORE: Save a peer list for a torrent.
FIND_NODE: Discover peers near a given ID.
FIND_VALUE: Retrieve a peer list if available.
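To see the iterative lookup converge, here is a toy simulation. The "network" is just a dictionary mapping each node ID to the IDs it knows, and `find_node` is an in-memory stand-in for the real RPC (which travels over UDP). The 6-bit IDs and the hypercube-shaped contact lists are assumptions chosen to keep the example small.

```python
def find_node(network: dict, node: int, target: int, k: int = 8) -> list:
    """Simulated FIND_NODE: the k contacts `node` knows closest to `target`."""
    return sorted(network[node], key=lambda n: n ^ target)[:k]


def lookup(network: dict, start: int, target: int, k: int = 8, alpha: int = 3) -> list:
    """Iteratively query the closest known nodes until no closer ones turn up."""
    shortlist = set(find_node(network, start, target, k))
    queried = {start}
    while True:
        nearest = sorted(shortlist, key=lambda n: n ^ target)[:k]
        batch = [n for n in nearest if n not in queried][:alpha]
        if not batch:
            return nearest            # every one of the k closest has answered
        for node in batch:            # a real client sends these in parallel
            queried.add(node)
            shortlist.update(find_node(network, node, target, k))
        shortlist.discard(start)


# Toy network: 64 nodes with 6-bit IDs; each knows the 6 nodes one bit away.
network = {n: {n ^ (1 << i) for i in range(6)} for n in range(64)}
result = lookup(network, start=0, target=45)
print(result[0], len(result))
```

Starting from node 0, the lookup ends with node 45 itself at the front of the list, and each round only ever talks to at most α = 3 new nodes.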
The Impact
Trackers became optional: Torrents could survive legal attacks and server failures.
Self-healing: Peers joining/leaving didn’t disrupt the network.
Scalability: Supported millions of peers with minimal overhead.
Conclusion
Overall, what psyched me the most was understanding what goes on behind the scenes of a piece of software that helped me get SOO many games and movies. If you have used a torrent client, you know that finding the right torrent was a skill in itself (I admit it was not a skill I ever had). There were a ton of resources online that helped with the implementation, which I’ll tag below. Hope you enjoyed the read and picked up something new!
Resources
The paper : https://web.cs.ucla.edu/classes/cs217/05BitTorrent.pdf
Protocol specification : https://www.bittorrent.org/beps/bep_0003.html
Arpit Bhayani’s playlist :
Bencoding specification : https://code.google.com/archive/p/bencode-net/wikis/BEncode.wiki
Implementation Resource #1 : https://app.codecrafters.io/courses/bittorrent/introductio
My implementation in Rust (only has download from single peer) : https://github.com/AkhileshManda/BitClient