Caching is a key technique for making a service scalable. It provides the following benefits:
- decreased network costs
- improved responsiveness
- increased performance
- availability of content during network interruption
Before we implement the caching layer, we need to consider the following questions:
- What to cache?
- How to cache?
- When to cache?
The first question is what to put in the cache. A good rule of thumb is that caching is less useful for dynamic data. If the data changes constantly, the effort of saving it to the cache in the hope that the same data will be requested later is wasted. Caching is most useful for static data that is requested frequently.
The second question is how to cache data. There are different ways to set up a cache. We could have a private cache inside the application, or a shared cache that behaves like a stand-alone service. Fundamentally, a cache is a collection of copied data, so we need to handle inconsistency between the copies and the original. There are different strategies for reading and writing data while making sure the data in the cache does not become too stale.
The third question is when to cache. We could add data to the cache when it is requested for the first time. This is called lazy caching. Alternatively, we could load some of the data during application startup. This process is called cache warming.
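The difference between lazy caching and cache warming can be sketched in a few lines. This is a minimal illustration, not a production design: `SimpleCache`, `fake_db_lookup`, and the `loader_calls` counter are hypothetical names introduced here, and the "database" is just a function.

```python
from typing import Callable, Dict, Iterable


class SimpleCache:
    """In-memory cache in front of a slower loader (a stand-in for a database)."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader
        self._store: Dict[str, str] = {}
        self.loader_calls = 0  # counts trips to the backing store

    def _load(self, key: str) -> str:
        self.loader_calls += 1
        return self._loader(key)

    def get(self, key: str) -> str:
        # Lazy caching: the entry is populated only on the first request.
        if key not in self._store:
            self._store[key] = self._load(key)
        return self._store[key]

    def warm(self, keys: Iterable[str]) -> None:
        # Cache warming: preload selected keys, e.g. during application startup.
        for key in keys:
            self._store[key] = self._load(key)


def fake_db_lookup(key: str) -> str:
    # Hypothetical backing store used only for this example.
    return f"value-for-{key}"


lazy = SimpleCache(fake_db_lookup)
lazy.get("a")  # miss: loads from the backing store
lazy.get("a")  # hit: served from the cache

warmed = SimpleCache(fake_db_lookup)
warmed.warm(["a", "b"])  # both loads happen up front
warmed.get("a")          # hit: no extra load
```

With lazy caching the first request for each key pays the load cost. With warming that cost moves to startup, at the price of loading keys that may never be requested.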
In the following sections, we will provide a brief description of different caching strategies.
- Oracle - Coherence Developer's Guide on Caching
- Microsoft - Best Practice - Caching
- Caching Strategies and How to Choose the Right One
Different Types of Caching
There are two main types of caches, categorized by the physical location of the cache.
A private cache sits inside the application. It is close to the application code, so latency is low. It can also act as a buffer when there are network issues, because a private cache works without a network connection.
The structure of a private cache is shown in the figure below. We make a few observations:
- Because the caches are private, each application instance has its own copy. Different application instances are very likely to hold different content in their caches, so data inconsistency issues may arise.
- The application may get data from the cache directly, but ultimately the application or the cache needs to get the data from the database. With a private cache, each application instance needs to maintain its own database connection. This approach may not scale because there is a limit on how many connections a database can hold at a given moment.
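The inconsistency described in the first observation can be reproduced with two instances that each hold a private dictionary. This is only a sketch under simplifying assumptions: `AppInstance` and the module-level `database` dictionary are hypothetical stand-ins, and there is no expiry or invalidation at all.

```python
from typing import Dict, Optional

# Hypothetical shared source of truth, standing in for a real database.
database: Dict[str, str] = {"price": "10"}


class AppInstance:
    """One application instance with its own private, in-process cache."""

    def __init__(self) -> None:
        self._cache: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        if key not in self._cache:
            value = database.get(key)
            if value is None:
                return None
            self._cache[key] = value
        return self._cache[key]

    def write(self, key: str, value: str) -> None:
        database[key] = value
        self._cache[key] = value  # only this instance's cache is updated


instance_a = AppInstance()
instance_b = AppInstance()

instance_b.read("price")         # B caches "10"
instance_a.write("price", "12")  # A updates the database and its own cache

stale = instance_b.read("price")  # B still serves its old private copy
fresh = instance_a.read("price")  # A sees the new value
```

After the write, `instance_b` keeps returning the old value until its entry is evicted or invalidated, which is exactly the inconsistency a private cache has to manage.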
A shared cache often works as a separate service. This approach is more scalable, and the database hides behind the cache. Because the cache is a service, it is easy to scale horizontally by adding more servers. Another benefit of shared caching is that all applications have the same view of the cached content. These benefits come at a cost: fetching data from a shared cache requires a service call, so shared caching has higher latency than private caching.
There are two main strategies for organizing reads and writes around a cache:
- In-line cache (or read-through/write-through cache)
- Cache aside
The structure is shown in the figure below:
In-line cache and cache-aside are two different ways to organize the read and write workflow. The workflow is a logical concept, and the physical location of the cache does not matter. That is why there is no blue server box in the figure above.
One obvious drawback of the cache-aside approach is that the application still has a direct connection to the database. As mentioned earlier, this may cause scaling difficulties. The benefit of cache-aside is that the system is resilient to cache failure, and the cache can hold a data model that is different from the one in the database.
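A rough sketch of the cache-aside read path shows both properties: the application talks to the database directly, and it keeps working when the cache is down. Everything here is an assumption made for illustration: `FlakyCache`, `CacheUnavailable`, `get_user`, and the dictionary used as a database are hypothetical and do not correspond to any specific cache client.

```python
from typing import Dict, Optional


class CacheUnavailable(Exception):
    """Raised by the hypothetical cache client when the cache is down."""


class FlakyCache:
    """Stand-in for a cache client; set `available = False` to simulate an outage."""

    def __init__(self) -> None:
        self.available = True
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise CacheUnavailable()
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.available:
            raise CacheUnavailable()
        self._store[key] = value


# Hypothetical database contents for the example.
database: Dict[str, str] = {"user:1": "Ada"}
cache = FlakyCache()


def get_user(key: str) -> Optional[str]:
    # 1. The application checks the cache itself.
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except CacheUnavailable:
        pass  # cache is down: fall through to the database

    # 2. On a miss (or an outage) the application queries the database directly.
    value = database.get(key)

    # 3. The application, not the cache, writes the result back.
    if value is not None:
        try:
            cache.set(key, value)
        except CacheUnavailable:
            pass  # best effort: a failed cache write does not fail the read
    return value


first = get_user("user:1")   # miss, loaded from the database
second = get_user("user:1")  # hit
cache.available = False
during_outage = get_user("user:1")  # still answered, straight from the database
```

Because the application owns step 3, it is also free to store a different shape of data in the cache than what the database returns.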
We will describe two read strategies in this section. The first strategy is called read-through. When the application requests data that is not in the cache, the caching service queries the database, adds the returned value to the cache, and then returns the data to the application. The second strategy is called refresh-ahead. Strictly speaking, this is more about how we refresh the cache. The idea is that if an item in the cache is requested frequently, we should proactively refresh it before it expires. In this way, we avoid a database query on the request path when the item is requested after it would otherwise have expired.
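Both read strategies can be combined in one small class. The sketch below makes several assumptions that are not part of the text above: entries expire after a fixed `ttl`, a refresh is triggered once an entry is older than `ttl * refresh_factor`, and the refresh runs inline for simplicity, whereas a real cache would typically do it in the background. The class name, parameters, and the injected `clock` are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class ReadThroughCache:
    """Read-through cache with a simple refresh-ahead rule."""

    def __init__(
        self,
        loader: Callable[[str], str],
        ttl: float,
        refresh_factor: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl
        self._refresh_after = ttl * refresh_factor
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # key -> (value, loaded_at)

    def _reload(self, key: str) -> str:
        value = self._loader(key)
        self._store[key] = (value, self._clock())
        return value

    def get(self, key: str) -> str:
        entry = self._store.get(key)
        if entry is None:
            return self._reload(key)  # read-through: the cache loads on a miss
        value, loaded_at = entry
        age = self._clock() - loaded_at
        if age >= self._ttl:
            return self._reload(key)  # expired entries are treated as misses
        if age >= self._refresh_after:
            # Refresh-ahead: the entry is still valid but getting old, so reload it.
            # Done inline here; assume a background task in a real system.
            self._reload(key)
        return value  # the caller gets the copy that was cached at call time


# Usage with a fake clock so the timeline is deterministic.
now = [0.0]
loads: List[str] = []


def load_from_db(key: str) -> str:
    loads.append(key)
    return f"{key}@{len(loads)}"


cache = ReadThroughCache(load_from_db, ttl=10.0, clock=lambda: now[0])
v1 = cache.get("k")  # t=0: miss, first load
now[0] = 9.0
v2 = cache.get("k")  # t=9: valid but old, triggers a refresh
now[0] = 12.0
v3 = cache.get("k")  # t=12: hit on the refreshed entry, no load on this request
```

Without the refresh at t=9, the entry loaded at t=0 would have expired at t=10 and the request at t=12 would have waited on the database.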
We will describe two write strategies in this section. In a write-through cache, when the cache receives a write request from the application, it first writes the data to the database and replies to the application only after the database confirms that the item was written successfully. In contrast, a write-behind cache replies to the write request immediately and schedules the write to the database for a later time. The write-behind cache has higher throughput and lower latency, but there is a risk of data loss because the write to the persistent database is delayed.
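The two write paths differ only in when the database write happens, which a short sketch makes visible. The classes below are hypothetical illustrations: the database is a plain dictionary, and the write-behind queue is flushed by an explicit `flush()` call, where a real system would use a timer or background worker.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class WriteThroughCache:
    """put() returns only after the database has been updated."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._db = database
        self._store: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._db[key] = value     # persist first
        self._store[key] = value  # then update the cached copy


class WriteBehindCache:
    """put() returns immediately; database writes are queued for later."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._db = database
        self._store: Dict[str, str] = {}
        self._pending: Deque[Tuple[str, str]] = deque()

    def put(self, key: str, value: str) -> None:
        self._store[key] = value
        self._pending.append((key, value))  # deferred database write

    def flush(self) -> int:
        # Apply queued writes in order; returns how many were written.
        flushed = 0
        while self._pending:
            key, value = self._pending.popleft()
            self._db[key] = value
            flushed += 1
        return flushed


db_a: Dict[str, str] = {}
write_through = WriteThroughCache(db_a)
write_through.put("x", "1")  # db_a already contains the value here

db_b: Dict[str, str] = {}
write_behind = WriteBehindCache(db_b)
write_behind.put("x", "1")
in_db_before_flush = "x" in db_b  # not yet persisted
flushed_count = write_behind.flush()
```

If the process holding the write-behind cache crashes before `flush()` runs, the queued writes are lost, which is the data-loss risk mentioned above.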