Introduction
Writing Policies
Write-through Cache
Write on both storages (cache and DB), which can happen concurrently or one after the other.
Increases the write latency but ensures strong consistency between the DB and the cache.
Write Back Cache
Write first to the cache and asynchronously write to the DB.
Low latency, but inconsistent data when reading from db.
Write-Around Cache
Write only to the DB.
The cache is updated after the next cache miss.
It is not favorable for reading recently updated data.
Eviction Policies
Least recently used (LRU)
Most recently used (MRU)
Least frequently used (LFU)
Most frequently used (MFU)
Data Temperature
Hot: highly accessed data
Warm: This is less frequently accessed data
Cold: Rarely accessed data
The idea is to evict the cold data.
Cache invalidation
Invalid/stale data must be marked for deletion
TTL: Time to Live
Add static metadata to each cache entry. And upon expiry, delete those.
Active Expiration: Actively checks the TTL of cache entries through a daemon process
Passive expiration: checks the TTL of entry at the time of access.
Storage Mechanism
Which Cache Server
Use a Hash function. Consistent hashing is preferred for distributed systems to handle scaling or crashes.
Locate Cache
Use a hashing function to locate a cache entry to read/write inside a cache server.
Use Bloom filters to check if the cache doesn't exist.
Gives Probabilistic output: Probably Yes | Firm No.
If yes, use other mechanisms to check if it exists.
Use Doubly linked list to store the data
Adding/removing data is a constant time operation
Evict from tail
Relocate an entry to head (Order based on access - LRU)
Sharding in cache clusters
Sharding involves splitting up cache data among multiple cache servers.
Dedicated cache servers
Flexibility for scale and infra needs.
Allows for cache as a service.
Co-locate cache
Generally used for reduction in CAPEX (Capital Expenditures) and OPEX (Operational Expenditures)
Good for read-only. Issues with updates. If one server updates some data, it will update its cache, but what about others?
Cache Clients
Does the hashing and has information about all the nodes and connects with them.
Last updated
Was this helpful?