Implementing cache invalidation is wrong

18 Apr 2017 - by 'Maurits van der Schee'

There, I've said it! Again! It is my firm belief that it is. Instead of arguing why this is true, I will try to refute the argument I hear most often from people who argue otherwise. In this document we are talking about a primary (data) store and a cache. It helps to think of the cache as, for instance, a Redis or Memcached instance used by a web server, and of the primary data store as a relational database server.

The argument goes like this: "By removing or updating the affected cache records on every update to the primary store, we can ensure that the data in the cache is always correct."

This is only true if you can detect the failure of a cache update and you are willing to fail the transaction that writes to the primary store whenever the cache update or removal fails. If your cache update may fail silently, then you cannot ensure that the cache is correct. If your cache update is not part of the transaction of the primary store, then even a non-silent failure will break the consistency of the cache. Typically the query cache of a database and the memory mapper of a disk driver are implemented as transactional caches and are thus reliable.
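To make the condition concrete, here is a minimal sketch of a write that is rolled back when the cache removal fails. It assumes SQLite as the primary store; `DictCache`, `CacheUnavailable`, `update_name`, the `users` table and the `user:<id>` key format are hypothetical names made up for illustration, not part of any real cache client. It only illustrates the failure handling and does not deal with concurrent readers.

```python
import sqlite3


class CacheUnavailable(Exception):
    """Raised by the hypothetical cache when a delete cannot be confirmed."""


class DictCache:
    """In-memory stand-in for a remote cache such as Redis or Memcached."""

    def __init__(self):
        self.data = {}
        self.available = True  # flip to False to simulate an outage

    def delete(self, key):
        if not self.available:
            raise CacheUnavailable(key)
        self.data.pop(key, None)


def update_name(db, cache, user_id, name):
    """Write to the primary store; commit only if the cache delete succeeded."""
    try:
        db.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
        cache.delete(f"user:{user_id}")  # a failure here must abort the write
        db.commit()
    except Exception:
        db.rollback()  # the primary store keeps its old value
        raise
```

The price is visible in the `except` branch: a cache outage now makes writes to the primary store fail, which is exactly what most application-level caches are not willing to do.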

A transactional cache must be tightly bound to the implementation of the primary store in order to be efficient. Your database, for instance, already has a local transactional memory cache in the form of a query cache. If you want to trade write performance for read performance, you can distribute this cache using a transactional replication method. The bottom line is that if you are building a cache system for your database, you are most likely re-implementing an existing and proven system. So either you are doing something that has already been done, or your implementation is not transactional and thus wrong or unreliable.

I feel that once you accept that the cache is not reliable, it is better to embrace this inconsistency and talk about a data contract or promise. Not all your data fits in your cache. Your cache must be faster and smaller than your primary data store. If it were big enough, you would make it your primary store. If it isn't faster, then you should not use it. Therefore we cannot rely on the data being available in the cache. There are two methods to remove data from a cache: eviction and expiration. When the cache is full, "eviction" needs to happen: the least important data is removed to make room for more important data. Often a "least recently used" (LRU) algorithm is used.
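LRU eviction can be sketched in a few lines. This is an illustration only, assuming a single-process cache built on Python's `collections.OrderedDict`; the `LRUCache` class and its `capacity` parameter are hypothetical names, and real caches such as Redis or Memcached use their own eviction implementations.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key):
        if key not in self.items:
            return None  # miss: the caller falls back to the primary store
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now more recently used than "b"
cache.put("c", 3)  # the cache is full, so "b" is evicted
```

After the last call a lookup of "b" is a miss, which is harmless: the data is simply fetched from the primary store again.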

The other method to remove data from a cache is expiration. Expiration assigns a time-to-live (TTL) to each cache record, and when the time-to-live is reached the data is removed. Removal of data is not a problem, as it causes the data to be "refreshed". When data is requested from a cache we talk about a "hit" or a "miss". On a "miss" the "fresh" data is retrieved from the primary store and also stored in the cache to enable "hits" on subsequent requests. On a "hit" the data from the cache is served without checking its "freshness" against the primary store. But we know that it is never older than the TTL, and that is a reliable contract.
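The hit, miss and expiration behaviour described above can be sketched as a small read-through cache. This is a sketch under assumptions, not a production design: `TTLCache`, its `load` callback (standing in for a query against the primary store) and the injectable `clock` are hypothetical names, and a plain dictionary plays the role of the primary store.

```python
import time


class TTLCache:
    """Read-through cache whose records expire after ttl seconds."""

    def __init__(self, load, ttl, clock=time.monotonic):
        self.load = load    # function that reads from the primary store
        self.ttl = ttl      # time-to-live in seconds
        self.clock = clock  # injectable so the example can control time
        self.records = {}   # key -> (expires_at, value)

    def get(self, key):
        now = self.clock()
        record = self.records.get(key)
        if record is not None and now < record[0]:
            return record[1]    # hit: served without asking the primary store
        value = self.load(key)  # miss or expired: fetch fresh data
        self.records[key] = (now + self.ttl, value)
        return value


store = {"greeting": "hello"}  # stand-in for the primary store
now = [0.0]                    # fake clock, in seconds
cache = TTLCache(load=store.__getitem__, ttl=60, clock=lambda: now[0])

first = cache.get("greeting")  # miss: loaded from the store
store["greeting"] = "goodbye"  # the store changes; the cache is not told
stale = cache.get("greeting")  # hit: stale, but not older than 60 seconds
now[0] = 61.0                  # the TTL has passed
fresh = cache.get("greeting")  # expired: reloaded from the store
```

Note that nothing in this sketch reacts to the write on the store. The only promise made is the TTL, and that promise holds even when the cache and the store never talk to each other about updates.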

My advice: keep it simple, don't implement invalidation!