System Design - Memcached and Common Caching Issues


#system design  #memcached  #caching 


Memcached Basics

Memcached is primarily used as a performance-enhancing complement to the database layer: it reduces disk I/O by caching query results and saves CPU by caching expensive-to-compute values. The memcached API is simple to use, and the distribution of data across nodes is handled entirely on the client side. Data in memcached is stored completely in memory, which gives low-latency access. When memory is full, memcached evicts the least recently used (LRU) entries to make room for new data.
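To make the eviction policy concrete, here is a minimal Python sketch of an LRU cache. This is only an illustration of the policy, not memcached's actual implementation (memcached manages memory in slabs and tracks LRU per slab class):

```python
from collections import OrderedDict

class LRUCache:
    """Toy in-memory cache that evicts the least recently used
    entry when full, mimicking memcached's eviction policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used

cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touch "a", so "b" is now the oldest entry
cache.set("c", 3)  # over capacity: "b" is evicted
```

After this sequence, `"b"` is gone while `"a"` and `"c"` remain, because the `get("a")` access refreshed its recency.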

Memcached clients hold a list of memcached server node addresses (IP address and port) and use a consistent hashing algorithm to determine which node caches a given key. Consistent hashing keeps most key-to-node assignments stable when the node list changes, but problems can still arise when nodes are added or fail, because individual clients may temporarily hold stale or inconsistent key-to-node mappings.


Fig 1: Consistent Hashing
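The client-side placement described above can be sketched as a simple hash ring. This is an illustrative implementation (the class name, MD5 choice, and virtual-node count are assumptions for the sketch), not the exact algorithm any particular memcached client library ships:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent hash ring. Each node is placed at several
    points (virtual nodes) on a ring of hash values; a key is served
    by the first node clockwise from the key's hash."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"])
node = ring.get_node("user:42")  # always the same node for this key
```

The payoff of this scheme is that adding or removing a node remaps only roughly 1/N of the keys, instead of reshuffling almost everything as naive `hash(key) % N` placement would.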

Memcached follows the "cache-aside" pattern: reads are served from the cache when possible, while writes go to the database. On a cache miss, the application fetches the data from the database and stores the result in a memcached node. To modify data, the application must update the database and then update or delete the corresponding entries in memcached. Where mildly stale data is acceptable, applications may rely on expiration times instead of explicit invalidation.
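The read and write paths of cache-aside can be sketched as follows. The dict-backed `cache` and `db` here are hypothetical stand-ins; a real deployment would use a memcached client (e.g. pymemcache, which exposes similar get/set/delete calls) and an actual database:

```python
import time

cache = {}                        # stand-in for memcached: key -> (value, expires_at)
db = {"user:1": {"name": "Ada"}}  # stand-in for the database (source of truth)

TTL = 60  # seconds; an expiration time for when mild staleness is acceptable

def read(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                          # cache hit
    value = db.get(key)                          # cache miss: go to the database
    if value is not None:
        cache[key] = (value, time.time() + TTL)  # populate the cache
    return value

def write(key, value):
    db[key] = value          # update the source of truth first
    cache.pop(key, None)     # then invalidate the cached copy
```

Deleting (rather than updating) the cached entry on write is the simpler and safer choice: the next read repopulates the cache from the database, avoiding a race where a stale value is written back over a newer one.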

Here is the high-level architecture of an application that uses memcached as the caching tier.


Fig 2: High Level Architecture

Common Issues with Caching

Memcached is not a clustered solution: its nodes are independent and unaware of one another's presence or state. This lack of coordination between nodes makes memcached susceptible to common caching issues:

In short, memcached appears to lack the following components:

----- END -----