In a traditional hash-based sharding system using shard = hash(key) % N, adding or removing a single server changes N, which causes a catastrophic reshuffling of almost all keys. If you go from 10 to 11 servers, approximately 90% of your cached data will map to a different server, causing a massive "cache stampede" to the database.
Consistent Hashing solves this elegantly. When a server is added or removed, only a tiny fraction of keys (~K/N keys, where K is total keys and N is total servers) need to be remapped.
position = hash("Server-A").position = hash("user:12345").When a new Server D is added, it is placed on the ring at hash("Server-D"). Only the keys located between Server D and the previous server (counterclockwise) need to be reassigned to Server D. All other keys remain untouched.
When Server B is removed, its keys are transferred to the next server clockwise on the ring. Again, only a small fraction of keys are affected.
A naive consistent hashing ring can suffer from uneven load distribution. If servers happen to be hashed close together on the ring, one server might be responsible for a disproportionately large arc of the ring.
Virtual Nodes solve this by creating multiple "replicas" of each physical server on the ring.
Server-A at one position, place Server-A-1, Server-A-2, ..., Server-A-150 at 150 different positions on the ring.Consistent Hashing is used extensively in industry: