Consistent Hashing

Adding more servers (horizontal scaling) is the key to managing load. Distributing data access across a set of servers allows for high throughput.

In cloud systems it's a given that a server will go down and another will come up to take its place. Consistent hashing is a technique that minimizes the thundering herd of cache misses when the number of servers changes.

If you have a fixed set of servers n and use a naive modulo hash to route requests, about (n-1) / n keys get remapped on a topology change — getting worse as cluster size grows. Modulo hashing has no memory of where a key was. Every topology change is a complete re-derivation from scratch, so the old and new assignments are essentially independent. Two independent uniform distributions over different-sized sets overlap very little.

Consistent hashing puts each server on a ring. Each incoming request is hashed to a position and the closest clockwise server handles the request. This reduces the number of keys that need to be redistributed to 1 / n. It trades a big coordinated miss storm for a small, localized one where only the keys of the preceding neighbor's range are affected. The load is further balanced by adding virtual nodes to the ring. Consistent hashing doesn't solve the problem of hot keys, though. That could be handled by request coalescing, more v-nodes for hot keys or local application caches.

References

Thundering Herd Problem

This article is part of the system design series where I am summarizing chapters from The System Design Interview: Volume 1 / Volume 2 amongst other related content