2.8 ★ core [J][A] 14 interview Q's

Caching

Cache-aside vs write-through vs write-back, eviction policies (LRU/LFU/FIFO/TTL), where caches live, and Redis vs Memcached.

A cache is a fast copy of slow data, and every caching decision is really two decisions: how writes flow (the strategy) and what gets thrown out when it’s full (the eviction policy). Get those right and you’ve absorbed most of your read load; get them wrong and you serve stale data, lose writes, or fall over the moment a hot key expires.

Key vocabulary

Cache hit / miss: A hit serves data straight from the cache; a miss falls through to the source of truth (DB) and usually populates the cache. Hit ratio is the headline metric.
Cache-aside vs write-through: Cache-aside: the app loads the cache lazily on a miss. Write-through: every write goes to cache and DB together. Aside optimizes reads; write-through keeps the cache fresh at write time.
Eviction vs expiry: Eviction (LRU/LFU/FIFO) removes entries because the cache is full. Expiry (TTL) removes them because they're too old. They're orthogonal — you usually use both.
Stale read: Serving a cached value the DB has since changed. The fundamental cache tradeoff: speed bought with a window of possible staleness.
Thundering herd / stampede: When a hot key expires, every concurrent request misses simultaneously and stampedes the database with identical recomputations — a self-inflicted load spike.

Write strategies

FIG 1 · cache-aside vs write-through Cache-aside loads lazily on a miss and the app owns both stores; write-through writes cache and DB together on every write, so the cache is never stale.

Strategy	Use when	Avoid when	Gotcha
Cache-aside (lazy)	read-heavy workloads; most common default	writes must be instantly reflected in cache	first read after a write is a miss; risk of stale entries until TTL
Write-through	you need the cache always consistent with the DB	write-heavy paths where the extra write hurts latency	slower writes; caches data that may never be read
Write-back (write-behind)	very write-heavy, can tolerate some loss for speed	data you cannot afford to lose	a crash before flush loses the un-written data

Cache-aside optimizes reads; write-through optimizes freshness; write-back optimizes write speed.

The strategies sit on a clear spectrum. Cache-aside makes the application responsible for the cache — it’s the pragmatic default because it only caches what’s actually read and degrades gracefully (if the cache dies, you just hit the DB). Write-through trades write latency for freshness — the cache is never stale because every write hits both stores, but you pay a double write and cache data that may never be read. Write-back is the fastest writer (acknowledge from the cache, flush to the DB asynchronously) and the most dangerous — a crash between acknowledge and flush loses committed-looking data, so it’s only for data you can afford to lose.

Eviction policies

When the cache fills, something must go. The policy decides what:

LRU (least-recently-used) evicts the entry untouched for the longest — great when recent access predicts future access, which is true often enough that LRU is the common default.
LFU (least-frequently-used) evicts the entry with the fewest hits — better when popularity is stable (a few items are perennially hot regardless of recency).
FIFO evicts the oldest-inserted regardless of use — simple, but it’ll throw out a hot item just because it’s been around a while.

TTL is orthogonal to all three: it expires an entry by age no matter how full the cache is. You almost always use both — a TTL to bound staleness, plus an eviction policy to bound size.

Cache-aside read with TTL, and safe invalidation

def get_user(user_id):
    key = f"user:{user_id}"
    cached = redis.get(key)
    if cached is not None:                 # HIT
        return json.loads(cached)

    user = db.query("SELECT * FROM users WHERE id = %s", user_id)  # MISS → source of truth
    redis.set(key, json.dumps(user), ex=300)  # populate with a 300s TTL
    return user

# On update, invalidate so the next read re-populates fresh:
def update_user(user_id, data):
    db.update("users", user_id, data)
    redis.delete(f"user:{user_id}")        # delete, don't update — avoids a stale write race

This is the canonical cache-aside pattern. The non-obvious line is the delete on write, not an update. If two writers both tried to set the new value, their writes could interleave so the cache ends up holding the older one (a stale-write race). Deleting sidesteps the race entirely: whoever reads next takes a single clean miss and repopulates from the source of truth. A miss is cheap; a wrong cached value can persist for the whole TTL.

The stampede, and where caches live

From the trenches AWS

Amazon’s Builders’ Library is blunt about a truth that surprises engineers: a cache is a tradeoff, not free performance, and adding one introduces a new mode of failure. Their core warning is about modal behavior — a system tuned to run with a warm cache can be quietly under-provisioned for the uncached path, so the moment the cache empties (a deploy, a node loss, a region failover, a mass TTL expiry) the full unbuffered load lands on the database at once and takes it down. The cache that was protecting you becomes the reason you fall over when it blinks.

The remedies map directly to this lesson: add jitter so TTLs don’t expire in lockstep, use request coalescing / single-flight so a cache miss on a hot key doesn’t fan out into a stampede, and — critically — load-test the cold-cache path, not just the warm one, so you know the system survives without the cache. The deeper lesson is to treat the cache as a performance optimization layered on a system that must still function (if slowly) when the cache is gone.

read the writeup ↗ aws.amazon.com

Key points

Two decisions: write strategy (cache-aside / write-through / write-back) and eviction policy (LRU / LFU / FIFO / TTL).
Cache-aside is the read-heavy default (app-managed, degrades gracefully); write-through keeps it fresh at write-latency cost; write-back is fastest but can lose data.
LRU is the usual eviction default; TTL expiry is orthogonal and used alongside it.
On write, prefer invalidate (delete) over update to dodge stale-write races — a miss is cheaper than a wrong value.
Caches stack browser → CDN → in-process → distributed → DB; closer to the user is faster but harder to invalidate.
Watch for stampedes on hot-key expiry; mitigate with single-flight locks, TTL jitter, or stale-while-revalidate — and load-test the cold-cache path.
Redis = rich structures + persistence + clustering; Memcached = simple multithreaded blob cache.

01 Learning objectives

0 / 4 done

02 Curated reading

Amazon Builders' Library — Caching challenges and strategies
essential eng blog 18m — Production caching tradeoffs from AWS.

03 Knowledge check

knowledge check3 questions · pass ≥ 70%

01medium
Read-heavy workload, occasional staleness is OK, cache should self-heal on a miss. Which strategy?
02medium
Which eviction policy removes the entry unused for the longest time?
03hard
Write-back (write-behind) caching's main risk is:

04 Interview questions

browse all ↗

What gets asked on this topic — tap a card for how to approach it, the follow-ups, and the trap. Company tags are best-effort & sourced.

Commonly asked mid concept very common Compare cache-aside, write-through, and write-back. When do you use each?
Cache-aside (lazy loading): app checks cache; on a miss it reads the DB, populates the cache, and returns. Most common for read-heavy workloads; only requested data is cached, but the first hit is a miss and stale data is possible without invalidation.
Write-through: writes go to cache and DB together, so the cache is always consistent — at the cost of higher write latency and caching data that may never be read. Write-back (write-behind): write to cache immediately and flush to the DB asynchronously — fast writes and great for write-heavy bursts, but a cache crash before flush loses data. Pick cache-aside by default; write-through when reads must never be stale; write-back when write latency dominates and you can tolerate the durability risk.
Follow-ups they push on
- Which strategy risks data loss and why?
- How do you keep cache-aside from serving stale data?
Red flag Using write-back for data you can't afford to lose. An async-flush cache that dies before flushing loses the unwritten writes.
source: AWS Builders' Library — Caching challenges and strategies ↗
Commonly asked mid concept common LRU vs LFU vs FIFO eviction, plus TTL — how do you choose?
When a cache is full it evicts by policy. LRU drops the least-recently-used entry — the default; great when recent access predicts future access (temporal locality). LFU drops the least-frequently-used — better when some items are persistently hot regardless of recency, but it can keep stale 'once-popular' items. FIFO evicts the oldest inserted regardless of use — simple but ignores access patterns.
TTL is orthogonal: it bounds staleness by expiring entries after a time, independent of capacity pressure. Typical setup: LRU for capacity eviction plus a TTL for freshness.
Follow-ups they push on
- When does LFU beat LRU?
- How does TTL interact with an eviction policy?
Red flag Treating TTL as an eviction policy. TTL bounds staleness over time; LRU/LFU/FIFO decide what to drop under memory pressure — they solve different problems.
source: GeeksforGeeks — Cache eviction policies ↗
Commonly asked mid coding very common Implement an LRU cache with O(1) get and put. What data structures do you use?
Combine a hash map (key → node) with a doubly linked list ordered by recency. The map gives O(1) lookup; the linked list gives O(1) move-to-front and O(1) eviction at the tail.
On get: look up the node in the map, unlink it, move it to the head (most recent), return its value. On put: if present, update and move to head; if new, insert at head and add to the map; if over capacity, remove the tail node and delete its key from the map. Both operations are O(1) because every step is a constant-time pointer/map update.
Map (key -> node) + DLL: head=newest ... tail=evict
Follow-ups they push on
- Why a doubly (not singly) linked list?
- How would you make it thread-safe?
- In an interview, can you use a language built-in like LinkedHashMap?
Red flag Using an array or scanning the list to find the LRU item — that's O(n). The hash map + DLL pairing is what keeps both operations O(1).
source: LeetCode — LRU Cache (146) ↗
Commonly asked senior design common What is a cache stampede (thundering herd) and how do you prevent it?
A cache stampede happens when a hot key expires and many concurrent requests all miss simultaneously, then all hit the database at once to recompute the same value — a spike that can overload the backend.
Mitigations: request coalescing / locking so only one request recomputes while others wait for or briefly serve the old value; early/probabilistic expiration so one request refreshes the key slightly before it expires; stale-while-revalidate, serving the old value while refreshing in the background; and jittering TTLs so many keys don't expire at the same instant.
Follow-ups they push on
- How does a per-key lock prevent the dogpile?
- What is probabilistic early expiration?
Red flag Giving many hot keys the same fixed TTL, so they expire together and trigger a synchronized backend stampede. Add jitter and coalesce recomputation.
source: AWS Builders' Library — Caching challenges and strategies ↗
Commonly asked senior trick common Why is cache invalidation hard, and what are the failure modes (stale reads)?
There are really two hard problems: deciding *when* a cached value is no longer valid, and making the cache and source of truth agree across concurrent updates. With cache-aside, a classic race: reader gets a miss and starts loading the old value; a writer updates the DB and deletes the cache key; the reader then writes its stale value back — now the cache is wrong indefinitely.
Approaches: delete (don't update) the key on write so the next read repopulates fresh; use short TTLs to bound staleness; version keys; or for tighter consistency use write-through. There's no free lunch — you trade consistency, latency, and complexity.
Follow-ups they push on
- Why delete the key on write instead of updating it?
- How does a short TTL bound the damage?
Red flag Updating the cache in place on writes (instead of deleting) and ignoring the read-load/write-delete interleaving — you cache a stale value that never self-heals.
source: AWS Builders' Library — Caching challenges and strategies ↗
Commonly asked mid concept common Redis vs Memcached — when would you pick each?
Memcached is a simple, multithreaded, in-memory key→blob cache — extremely fast and easy to scale for pure caching of opaque values. Redis is a richer in-memory data store: it has data structures (lists, sets, sorted sets, hashes, streams), optional persistence, replication, pub/sub, Lua scripting, and clustering.
Pick Memcached when you just need a fast, large, simple cache and want multithreaded throughput per node. Pick Redis when you need those data structures, durability, atomic operations, pub/sub, rate-limiter counters, leaderboards, or built-in replication/clustering — which is most modern use cases.
Follow-ups they push on
- When does Memcached's multithreading actually win?
- What Redis features make it more than a cache?
Red flag Saying 'Redis is just a faster Memcached'. The real difference is Redis's data structures, persistence, and clustering, not raw speed.
source: AWS — Redis vs Memcached ↗
Commonly asked junior concept common Where can caches live across a request's path, and what does each layer cache?
Caching exists at many layers, each closer to the user is cheaper: browser cache (per-user, static assets via Cache-Control/ETag); CDN / edge (shared, geo-distributed static and cacheable responses); application / in-process (a local in-memory map — fastest but per-instance and not shared); distributed cache (Redis/Memcached — shared across app servers); and the database query/buffer cache.
The instinct: cache as close to the user as the data's freshness allows. Each layer trades reach (shared vs per-instance) against latency and invalidation difficulty.
Follow-ups they push on
- Trade-off of in-process vs distributed cache?
- Why is CDN caching great for static but tricky for personalized content?
Red flag Caching personalized/private data in a shared CDN or proxy layer, leaking one user's data to another. Mark it private/no-store.
source: system-design-primer — Caching ↗
Commonly asked senior concept occasional What are cache penetration and cache avalanche, and how do they differ from a stampede?
Cache penetration: requests for keys that don't exist anywhere always miss the cache and hit the DB — common in scraping/attacks. Fix by caching the negative result (a short-TTL 'null' marker) or screening with a Bloom filter of valid keys.
Cache avalanche: a large set of keys expire at once (or the cache itself goes down), so traffic floods the DB en masse. Fix by jittering TTLs, layering caches, and adding rate limiting / circuit breakers in front of the DB. Versus a stampede, which is many concurrent requests for *one* expiring hot key — avalanche is many keys at once, penetration is keys that never exist.
Follow-ups they push on
- How does a Bloom filter stop penetration cheaply?
- Why does jittering TTLs help avalanche?
Red flag Not caching negative lookups, so a flood of requests for nonexistent keys bypasses the cache entirely and hammers the database.
source: GeeksforGeeks — Cache penetration, avalanche, stampede ↗
★ must-know Commonly asked senior concept common On a write, should you update the cache in place or delete (invalidate) the key? Why is delete usually safer?
Prefer delete (invalidate) over update-in-place. Updating the cache directly on write opens a race: two concurrent writers can set the cache in the opposite order from how they hit the DB, leaving the cache holding the older value permanently. Deleting the key sidesteps that — the next read just repopulates from the source of truth.
Delete is also cheaper (you don't recompute a value that may never be read) and avoids caching an intermediate state. The cost is one guaranteed cache miss after each write. For very hot keys you can refresh asynchronously, but the default rule is invalidate, don't update.
What a strong answer covers
- Update-in-place risks concurrent writers leaving a stale value forever.
- Delete forces the next read to repopulate from the source of truth.
- Delete avoids recomputing values that may never be read.
- Cost: one cache miss after each write.
Quick self-check
On a database write, the more robust cache strategy is usually to:
Follow-ups they push on
- Walk through the concurrent-writer race that update-in-place causes.
- When might you refresh the cache asynchronously instead of deleting?
Red flag Writing the new value straight into the cache on every update. Concurrent writes can reorder and pin a stale value; deleting the key is the robust default.
source: AWS Builders' Library — Caching challenges and strategies ↗
Commonly asked mid concept common What makes a good cache key, and why is cache hit ratio the metric that matters most?
A good cache key is deterministic (same logical request → same key), specific enough to avoid collisions (include the parameters that change the result — user, locale, version), and normalized (sort query params, lowercase where appropriate) so equivalent requests share a key. Over-specific keys (including irrelevant params like a request id) fragment the cache and tank the hit rate; under-specific keys serve the wrong data.
Hit ratio (hits / total lookups) is the headline metric because a cache only pays off when most reads avoid the backend. A low hit ratio means you're spending memory and adding a layer for little benefit — investigate whether keys are too granular, TTLs too short, or the working set exceeds cache capacity.
What a strong answer covers
- Keys must be deterministic, normalized, and include exactly the result-affecting params.
- Over-specific keys fragment the cache; under-specific keys serve wrong data.
- Hit ratio = hits / lookups — the core measure of cache value.
- Low hit ratio → keys too granular, TTL too short, or working set > capacity.
Follow-ups they push on
- How can including a request id or timestamp in the key destroy the hit ratio?
- What does a sudden hit-ratio drop usually indicate?
Red flag Baking a unique/volatile value (request id, current timestamp) into the cache key, so every request misses — you've added overhead with a near-zero hit ratio.
source: AWS — Caching best practices ↗
Commonly asked senior concept occasional Why does Redis need a persistence and eviction policy, and what's the difference between RDB and AOF?
Redis holds data in memory, so two policies matter. Eviction (maxmemory-policy) decides what happens when memory fills — noeviction (reject writes), allkeys-lru, volatile-ttl, etc. Pick LRU/LFU variants when using Redis as a cache; noeviction when it's a primary store you can't silently drop from.
Persistence decides what survives a restart. RDB takes periodic point-in-time snapshots — compact, fast to load, but you lose writes since the last snapshot. AOF (append-only file) logs every write operation — far better durability (down to per-write fsync) at the cost of larger files and slower restart. Many run both: AOF for durability, RDB for fast restores. Treating Redis purely as a cache means you may not need persistence at all.
What a strong answer covers
- Eviction policy governs behavior at maxmemory; choose LRU/LFU for cache use.
- RDB = periodic snapshots: compact and fast to load, but loses recent writes.
- AOF = append-only write log: stronger durability, bigger/slower.
- Often run both; a pure cache may skip persistence entirely.
Follow-ups they push on
- When would you choose noeviction over allkeys-lru?
- What's the durability/performance tradeoff of AOF fsync-everysec vs always?
Red flag Running Redis as a primary datastore with `noeviction` unset and no persistence, then losing data on a restart or silently dropping writes at maxmemory.
source: Redis — Persistence ↗
Commonly asked mid debug occasional Debugging: after deploying a new code version, users report seeing old data that won't refresh. Where would you look in the caching layers?
Stale data after a deploy almost always means a cache somewhere is serving the old version. Walk the layers from the client inward: the browser cache (Cache-Control/Expires on the asset — a missing hash/versioned filename means the browser reuses the old bundle), the CDN/edge (needs a purge/invalidation for changed assets), the application/distributed cache (Redis/Memcached entries not invalidated by the deploy), and finally the DB query cache.
Debug tooling: inspect response headers for Age, X-Cache: HIT, and Cache-Control; force-reload to bypass the browser; check whether the CDN was purged. The durable fix for static assets is cache-busting — content-hashed filenames so a new version is a new URL and old caches simply don't apply.
What a strong answer covers
- Check layers client→server: browser → CDN/edge → app/distributed cache → DB.
- Inspect Age, X-Cache, and Cache-Control to locate the serving cache.
- Static assets need content-hashed filenames (cache-busting) per deploy.
- A CDN may need an explicit purge/invalidation after deploy.
Quick self-check
Users keep getting the old JS bundle after a deploy. The most reliable fix is to:
Follow-ups they push on
- How does a content hash in the filename make cache invalidation automatic?
- What does the Age header tell you about which cache served the response?
Red flag Serving versioned JS/CSS under a stable filename with a long max-age, so browsers and CDNs keep the old bundle after deploy. Hash the filename so a new build is a new URL.
source: MDN — HTTP caching (cache busting) ↗
Commonly asked senior trick occasional When is adding a cache the WRONG move? Name cases where caching hurts more than it helps.
Caching is not free — it adds a consistency problem and an extra failure mode. It's the wrong move when the data changes more often than it's read (you invalidate constantly, getting a near-zero hit ratio while paying the cost), when staleness is unacceptable (account balances, inventory at checkout, anything where a wrong value causes real harm), when the working set is far larger than memory so you thrash with evictions, or when the backend is already fast enough that the cache only adds complexity and a coherence bug surface.
The instinct: reach for caching when reads dominate writes and a little staleness is tolerable; otherwise the extra layer buys complexity, not speed.
What a strong answer covers
- Write-heavy / frequently-changing data → constant invalidation, low hit ratio.
- Strict-correctness data (balances, inventory) → staleness causes real harm.
- Working set >> cache memory → thrashing evictions, little benefit.
- Already-fast backend → cache adds complexity and a new failure/coherence surface.
Quick self-check
For which workload is adding a cache LEAST likely to help?
Follow-ups they push on
- Why does a write-heavy workload defeat most caching strategies?
- How do you decide the read:write ratio threshold where caching pays off?
Red flag Adding a cache reflexively 'for performance' on write-heavy or correctness-critical data. You inherit invalidation bugs and a new failure mode for little or negative gain.
source: AWS Builders' Library — Caching challenges and strategies ↗
Commonly asked mid concept occasional What's the difference between read-through and cache-aside caching? Who is responsible for the database read in each?
Both are lazy-loading read strategies, but they differ in *who* loads on a miss. In cache-aside (lazy loading) the application owns the logic: it checks the cache, and on a miss it reads the DB and writes the result back to the cache itself. The cache is just a dumb store; the app is the orchestrator.
In read-through, the cache sits inline and loads from the DB on a miss transparently — the application only ever talks to the cache, and a provider/loader function populates it. Read-through centralizes the load logic (less duplicated code, consistent behavior) but needs cache support for it; cache-aside is more flexible and the most common pattern. Both still need a write strategy and TTLs to manage staleness.
What a strong answer covers
- Cache-aside: the application reads the DB on a miss and populates the cache.
- Read-through: the cache loads from the DB on a miss transparently.
- Read-through centralizes load logic; cache-aside is more flexible/common.
- Both are lazy (load on miss) and still need write/TTL strategies.
Quick self-check
In cache-aside, who reads the database on a cache miss?
Follow-ups they push on
- Why is cache-aside more common despite read-through's cleaner app code?
- How does write-through pair with read-through?
Red flag Conflating the two and assuming the cache auto-loads in cache-aside. In cache-aside the application must explicitly read the DB and repopulate on every miss.
source: AWS — Database caching strategies (lazy loading vs read-through) ↗