Caching
Cache-aside vs write-through vs write-back, eviction policies (LRU/LFU/FIFO/TTL), where caches live, and Redis vs Memcached.
A cache is a fast copy of slow data, and every caching decision is really two decisions: how writes flow (the strategy) and what gets thrown out when it’s full (the eviction policy). Get those right and you’ve absorbed most of your read load; get them wrong and you serve stale data, lose writes, or fall over the moment a hot key expires.
Write strategies
| Strategy | Use when | Avoid when | Gotcha |
|---|---|---|---|
| Cache-aside (lazy) | read-heavy workloads; most common default | writes must be instantly reflected in cache | first read after a write is a miss; risk of stale entries until TTL |
| Write-through | you need the cache always consistent with the DB | write-heavy paths where the extra write hurts latency | slower writes; caches data that may never be read |
| Write-back (write-behind) | very write-heavy, can tolerate some loss for speed | data you cannot afford to lose | a crash before flush loses the un-written data |
The strategies sit on a clear spectrum. Cache-aside makes the application responsible for the cache — it’s the pragmatic default because it only caches what’s actually read and degrades gracefully (if the cache dies, you just hit the DB). Write-through trades write latency for freshness — the cache is never stale because every write hits both stores, but you pay a double write and cache data that may never be read. Write-back is the fastest writer (acknowledge from the cache, flush to the DB asynchronously) and the most dangerous — a crash between acknowledge and flush loses committed-looking data, so it’s only for data you can afford to lose.
Eviction policies
When the cache fills, something must go. The policy decides what:
- LRU (least-recently-used) evicts the entry untouched for the longest — great when recent access predicts future access, which is true often enough that LRU is the common default.
- LFU (least-frequently-used) evicts the entry with the fewest hits — better when popularity is stable (a few items are perennially hot regardless of recency).
- FIFO evicts the oldest-inserted regardless of use — simple, but it’ll throw out a hot item just because it’s been around a while.
TTL is orthogonal to all three: it expires an entry by age no matter how full the cache is. You almost always use both — a TTL to bound staleness, plus an eviction policy to bound size.
def get_user(user_id):
key = f"user:{user_id}"
cached = redis.get(key)
if cached is not None: # HIT
return json.loads(cached)
user = db.query("SELECT * FROM users WHERE id = %s", user_id) # MISS → source of truth
redis.set(key, json.dumps(user), ex=300) # populate with a 300s TTL
return user
# On update, invalidate so the next read re-populates fresh:
def update_user(user_id, data):
db.update("users", user_id, data)
redis.delete(f"user:{user_id}") # delete, don't update — avoids a stale write raceThis is the canonical cache-aside pattern. The non-obvious line is the delete on write, not an update. If two writers both tried to set the new value, their writes could interleave so the cache ends up holding the older one (a stale-write race). Deleting sidesteps the race entirely: whoever reads next takes a single clean miss and repopulates from the source of truth. A miss is cheap; a wrong cached value can persist for the whole TTL.
The stampede, and where caches live
01 Learning objectives
0 / 4 done02 Curated reading
03 Knowledge check
- 01medium
Read-heavy workload, occasional staleness is OK, cache should self-heal on a miss. Which strategy?
- 02medium
Which eviction policy removes the entry unused for the longest time?
- 03hard
Write-back (write-behind) caching's main risk is:
04 Interview questions
browse all ↗What gets asked on this topic — tap a card for how to approach it, the follow-ups, and the trap. Company tags are best-effort & sourced.
-
Compare cache-aside, write-through, and write-back. When do you use each?
Cache-aside (lazy loading): app checks cache; on a miss it reads the DB, populates the cache, and returns. Most common for read-heavy workloads; only requested data is cached, but the first hit is a miss and stale data is possible without invalidation.
Write-through: writes go to cache and DB together, so the cache is always consistent — at the cost of higher write latency and caching data that may never be read. Write-back (write-behind): write to cache immediately and flush to the DB asynchronously — fast writes and great for write-heavy bursts, but a cache crash before flush loses data. Pick cache-aside by default; write-through when reads must never be stale; write-back when write latency dominates and you can tolerate the durability risk.
Follow-ups they push on- Which strategy risks data loss and why?
- How do you keep cache-aside from serving stale data?
Red flag Using write-back for data you can't afford to lose. An async-flush cache that dies before flushing loses the unwritten writes.
source: AWS Builders' Library — Caching challenges and strategies ↗ -
LRU vs LFU vs FIFO eviction, plus TTL — how do you choose?
When a cache is full it evicts by policy. LRU drops the least-recently-used entry — the default; great when recent access predicts future access (temporal locality). LFU drops the least-frequently-used — better when some items are persistently hot regardless of recency, but it can keep stale 'once-popular' items. FIFO evicts the oldest inserted regardless of use — simple but ignores access patterns.
TTL is orthogonal: it bounds staleness by expiring entries after a time, independent of capacity pressure. Typical setup: LRU for capacity eviction plus a TTL for freshness.
Follow-ups they push on- When does LFU beat LRU?
- How does TTL interact with an eviction policy?
Red flag Treating TTL as an eviction policy. TTL bounds staleness over time; LRU/LFU/FIFO decide what to drop under memory pressure — they solve different problems.
source: GeeksforGeeks — Cache eviction policies ↗ -
Implement an LRU cache with O(1) get and put. What data structures do you use?
Combine a hash map (key → node) with a doubly linked list ordered by recency. The map gives O(1) lookup; the linked list gives O(1) move-to-front and O(1) eviction at the tail.
On
get: look up the node in the map, unlink it, move it to the head (most recent), return its value. Onput: if present, update and move to head; if new, insert at head and add to the map; if over capacity, remove the tail node and delete its key from the map. Both operations are O(1) because every step is a constant-time pointer/map update.Map (key -> node) + DLL: head=newest ... tail=evictFollow-ups they push on- Why a doubly (not singly) linked list?
- How would you make it thread-safe?
- In an interview, can you use a language built-in like LinkedHashMap?
Red flag Using an array or scanning the list to find the LRU item — that's O(n). The hash map + DLL pairing is what keeps both operations O(1).
source: LeetCode — LRU Cache (146) ↗ -
What is a cache stampede (thundering herd) and how do you prevent it?
A cache stampede happens when a hot key expires and many concurrent requests all miss simultaneously, then all hit the database at once to recompute the same value — a spike that can overload the backend.
Mitigations: request coalescing / locking so only one request recomputes while others wait for or briefly serve the old value; early/probabilistic expiration so one request refreshes the key slightly before it expires; stale-while-revalidate, serving the old value while refreshing in the background; and jittering TTLs so many keys don't expire at the same instant.
Follow-ups they push on- How does a per-key lock prevent the dogpile?
- What is probabilistic early expiration?
Red flag Giving many hot keys the same fixed TTL, so they expire together and trigger a synchronized backend stampede. Add jitter and coalesce recomputation.
source: AWS Builders' Library — Caching challenges and strategies ↗ -
Why is cache invalidation hard, and what are the failure modes (stale reads)?
There are really two hard problems: deciding *when* a cached value is no longer valid, and making the cache and source of truth agree across concurrent updates. With cache-aside, a classic race: reader gets a miss and starts loading the old value; a writer updates the DB and deletes the cache key; the reader then writes its stale value back — now the cache is wrong indefinitely.
Approaches: delete (don't update) the key on write so the next read repopulates fresh; use short TTLs to bound staleness; version keys; or for tighter consistency use write-through. There's no free lunch — you trade consistency, latency, and complexity.
Follow-ups they push on- Why delete the key on write instead of updating it?
- How does a short TTL bound the damage?
Red flag Updating the cache in place on writes (instead of deleting) and ignoring the read-load/write-delete interleaving — you cache a stale value that never self-heals.
source: AWS Builders' Library — Caching challenges and strategies ↗ -
Redis vs Memcached — when would you pick each?
Memcached is a simple, multithreaded, in-memory key→blob cache — extremely fast and easy to scale for pure caching of opaque values. Redis is a richer in-memory data store: it has data structures (lists, sets, sorted sets, hashes, streams), optional persistence, replication, pub/sub, Lua scripting, and clustering.
Pick Memcached when you just need a fast, large, simple cache and want multithreaded throughput per node. Pick Redis when you need those data structures, durability, atomic operations, pub/sub, rate-limiter counters, leaderboards, or built-in replication/clustering — which is most modern use cases.
Follow-ups they push on- When does Memcached's multithreading actually win?
- What Redis features make it more than a cache?
Red flag Saying 'Redis is just a faster Memcached'. The real difference is Redis's data structures, persistence, and clustering, not raw speed.
source: AWS — Redis vs Memcached ↗ -
Where can caches live across a request's path, and what does each layer cache?
Caching exists at many layers, each closer to the user is cheaper: browser cache (per-user, static assets via Cache-Control/ETag); CDN / edge (shared, geo-distributed static and cacheable responses); application / in-process (a local in-memory map — fastest but per-instance and not shared); distributed cache (Redis/Memcached — shared across app servers); and the database query/buffer cache.
The instinct: cache as close to the user as the data's freshness allows. Each layer trades reach (shared vs per-instance) against latency and invalidation difficulty.
Follow-ups they push on- Trade-off of in-process vs distributed cache?
- Why is CDN caching great for static but tricky for personalized content?
Red flag Caching personalized/private data in a shared CDN or proxy layer, leaking one user's data to another. Mark it private/no-store.
source: system-design-primer — Caching ↗ -
What are cache penetration and cache avalanche, and how do they differ from a stampede?
Cache penetration: requests for keys that don't exist anywhere always miss the cache and hit the DB — common in scraping/attacks. Fix by caching the negative result (a short-TTL 'null' marker) or screening with a Bloom filter of valid keys.
Cache avalanche: a large set of keys expire at once (or the cache itself goes down), so traffic floods the DB en masse. Fix by jittering TTLs, layering caches, and adding rate limiting / circuit breakers in front of the DB. Versus a stampede, which is many concurrent requests for *one* expiring hot key — avalanche is many keys at once, penetration is keys that never exist.
Follow-ups they push on- How does a Bloom filter stop penetration cheaply?
- Why does jittering TTLs help avalanche?
Red flag Not caching negative lookups, so a flood of requests for nonexistent keys bypasses the cache entirely and hammers the database.
source: GeeksforGeeks — Cache penetration, avalanche, stampede ↗ -
On a write, should you update the cache in place or delete (invalidate) the key? Why is delete usually safer?
Prefer delete (invalidate) over update-in-place. Updating the cache directly on write opens a race: two concurrent writers can set the cache in the opposite order from how they hit the DB, leaving the cache holding the older value permanently. Deleting the key sidesteps that — the next read just repopulates from the source of truth.
Delete is also cheaper (you don't recompute a value that may never be read) and avoids caching an intermediate state. The cost is one guaranteed cache miss after each write. For very hot keys you can refresh asynchronously, but the default rule is invalidate, don't update.
What a strong answer coversUpdate-in-place risks concurrent writers leaving a stale value forever.
Delete forces the next read to repopulate from the source of truth.
Delete avoids recomputing values that may never be read.
Cost: one cache miss after each write.
Quick self-checkOn a database write, the more robust cache strategy is usually to:
-
Risky — concurrent writers can reorder and leave a stale value permanently.
-
Correct — avoids the reorder race and stale-value pinning.
-
Wrong — that prolongs staleness, not fixes it.
-
Wrong — relying solely on TTL serves stale data until expiry.
Follow-ups they push on- Walk through the concurrent-writer race that update-in-place causes.
- When might you refresh the cache asynchronously instead of deleting?
Red flag Writing the new value straight into the cache on every update. Concurrent writes can reorder and pin a stale value; deleting the key is the robust default.
source: AWS Builders' Library — Caching challenges and strategies ↗ -
What makes a good cache key, and why is cache hit ratio the metric that matters most?
A good cache key is deterministic (same logical request → same key), specific enough to avoid collisions (include the parameters that change the result — user, locale, version), and normalized (sort query params, lowercase where appropriate) so equivalent requests share a key. Over-specific keys (including irrelevant params like a request id) fragment the cache and tank the hit rate; under-specific keys serve the wrong data.
Hit ratio (hits / total lookups) is the headline metric because a cache only pays off when most reads avoid the backend. A low hit ratio means you're spending memory and adding a layer for little benefit — investigate whether keys are too granular, TTLs too short, or the working set exceeds cache capacity.
What a strong answer coversKeys must be deterministic, normalized, and include exactly the result-affecting params.
Over-specific keys fragment the cache; under-specific keys serve wrong data.
Hit ratio = hits / lookups — the core measure of cache value.
Low hit ratio → keys too granular, TTL too short, or working set > capacity.
Follow-ups they push on- How can including a request id or timestamp in the key destroy the hit ratio?
- What does a sudden hit-ratio drop usually indicate?
Red flag Baking a unique/volatile value (request id, current timestamp) into the cache key, so every request misses — you've added overhead with a near-zero hit ratio.
source: AWS — Caching best practices ↗ -
Why does Redis need a persistence and eviction policy, and what's the difference between RDB and AOF?
Redis holds data in memory, so two policies matter. Eviction (
maxmemory-policy) decides what happens when memory fills —noeviction(reject writes),allkeys-lru,volatile-ttl, etc. Pick LRU/LFU variants when using Redis as a cache;noevictionwhen it's a primary store you can't silently drop from.Persistence decides what survives a restart. RDB takes periodic point-in-time snapshots — compact, fast to load, but you lose writes since the last snapshot. AOF (append-only file) logs every write operation — far better durability (down to per-write fsync) at the cost of larger files and slower restart. Many run both: AOF for durability, RDB for fast restores. Treating Redis purely as a cache means you may not need persistence at all.
What a strong answer coversEviction policy governs behavior at
maxmemory; choose LRU/LFU for cache use.RDB = periodic snapshots: compact and fast to load, but loses recent writes.
AOF = append-only write log: stronger durability, bigger/slower.
Often run both; a pure cache may skip persistence entirely.
Follow-ups they push on- When would you choose noeviction over allkeys-lru?
- What's the durability/performance tradeoff of AOF fsync-everysec vs always?
Red flag Running Redis as a primary datastore with `noeviction` unset and no persistence, then losing data on a restart or silently dropping writes at maxmemory.
source: Redis — Persistence ↗ -
Debugging: after deploying a new code version, users report seeing old data that won't refresh. Where would you look in the caching layers?
Stale data after a deploy almost always means a cache somewhere is serving the old version. Walk the layers from the client inward: the browser cache (
Cache-Control/Expireson the asset — a missing hash/versioned filename means the browser reuses the old bundle), the CDN/edge (needs a purge/invalidation for changed assets), the application/distributed cache (Redis/Memcached entries not invalidated by the deploy), and finally the DB query cache.Debug tooling: inspect response headers for
Age,X-Cache: HIT, andCache-Control; force-reload to bypass the browser; check whether the CDN was purged. The durable fix for static assets is cache-busting — content-hashed filenames so a new version is a new URL and old caches simply don't apply.What a strong answer coversCheck layers client→server: browser → CDN/edge → app/distributed cache → DB.
Inspect
Age,X-Cache, andCache-Controlto locate the serving cache.Static assets need content-hashed filenames (cache-busting) per deploy.
A CDN may need an explicit purge/invalidation after deploy.
Quick self-checkUsers keep getting the old JS bundle after a deploy. The most reliable fix is to:
-
Wrong — unscalable and doesn't fix CDN/proxy caches.
-
Correct — old caches simply don't match the new URL.
-
Partial at best — still serves stale until expiry and wastes revalidation.
-
Throws away all caching benefits; unnecessary with cache-busting.
Follow-ups they push on- How does a content hash in the filename make cache invalidation automatic?
- What does the Age header tell you about which cache served the response?
Red flag Serving versioned JS/CSS under a stable filename with a long max-age, so browsers and CDNs keep the old bundle after deploy. Hash the filename so a new build is a new URL.
source: MDN — HTTP caching (cache busting) ↗ -
When is adding a cache the WRONG move? Name cases where caching hurts more than it helps.
Caching is not free — it adds a consistency problem and an extra failure mode. It's the wrong move when the data changes more often than it's read (you invalidate constantly, getting a near-zero hit ratio while paying the cost), when staleness is unacceptable (account balances, inventory at checkout, anything where a wrong value causes real harm), when the working set is far larger than memory so you thrash with evictions, or when the backend is already fast enough that the cache only adds complexity and a coherence bug surface.
The instinct: reach for caching when reads dominate writes and a little staleness is tolerable; otherwise the extra layer buys complexity, not speed.
What a strong answer coversWrite-heavy / frequently-changing data → constant invalidation, low hit ratio.
Strict-correctness data (balances, inventory) → staleness causes real harm.
Working set >> cache memory → thrashing evictions, little benefit.
Already-fast backend → cache adds complexity and a new failure/coherence surface.
Quick self-checkFor which workload is adding a cache LEAST likely to help?
-
Wrong — this is the ideal case for caching.
-
Correct — constant invalidation yields a near-zero hit ratio and added cost.
-
Wrong — caching a hot read key is highly effective.
-
Wrong — caching memoizes that cost; a clear win.
Follow-ups they push on- Why does a write-heavy workload defeat most caching strategies?
- How do you decide the read:write ratio threshold where caching pays off?
Red flag Adding a cache reflexively 'for performance' on write-heavy or correctness-critical data. You inherit invalidation bugs and a new failure mode for little or negative gain.
source: AWS Builders' Library — Caching challenges and strategies ↗ -
What's the difference between read-through and cache-aside caching? Who is responsible for the database read in each?
Both are lazy-loading read strategies, but they differ in *who* loads on a miss. In cache-aside (lazy loading) the application owns the logic: it checks the cache, and on a miss it reads the DB and writes the result back to the cache itself. The cache is just a dumb store; the app is the orchestrator.
In read-through, the cache sits inline and loads from the DB on a miss transparently — the application only ever talks to the cache, and a provider/loader function populates it. Read-through centralizes the load logic (less duplicated code, consistent behavior) but needs cache support for it; cache-aside is more flexible and the most common pattern. Both still need a write strategy and TTLs to manage staleness.
What a strong answer coversCache-aside: the application reads the DB on a miss and populates the cache.
Read-through: the cache loads from the DB on a miss transparently.
Read-through centralizes load logic; cache-aside is more flexible/common.
Both are lazy (load on miss) and still need write/TTL strategies.
Quick self-checkIn cache-aside, who reads the database on a cache miss?
-
Wrong — that describes read-through, not cache-aside.
-
Correct — the app orchestrates the miss in cache-aside.
-
Wrong — databases don't push into caches in this pattern.
-
Wrong — a miss triggers a load, just by the application here.
Follow-ups they push on- Why is cache-aside more common despite read-through's cleaner app code?
- How does write-through pair with read-through?
Red flag Conflating the two and assuming the cache auto-loads in cache-aside. In cache-aside the application must explicitly read the DB and repopulate on every miss.
source: AWS — Database caching strategies (lazy loading vs read-through) ↗