2.5 [J][A][I] 13 interview Q's

Concurrency & parallelism

Concurrency vs parallelism, threads vs processes vs async I/O, and the hazards (races, deadlocks) plus locks/mutexes vs semaphores.

Concurrency and parallelism sound interchangeable and aren’t. One is about structure (dealing with many things at once); the other is about execution (doing many things at the same instant). The hazards — races and deadlocks — and the tools to prevent them follow directly from that distinction, and from one question: is your bottleneck the CPU or the wait?

Key vocabulary

Concurrency vs parallelism: Concurrency = interleaving tasks so progress overlaps (one cook switching between dishes). Parallelism = tasks running truly simultaneously on multiple cores (two cooks). You can have concurrency on a single core.
Thread vs process: Threads share the same memory within a process (cheap, but they can corrupt shared state). Processes have isolated memory (safe, but communicate via IPC and cost more to spawn).
Async I/O: Non-blocking I/O: fire the request, register a callback, and let the thread do other work until the result is ready. A single thread can juggle thousands of in-flight I/O operations.
Race condition: A bug where the result depends on the unlucky timing of two threads touching shared state — classically the gap between reading a value and writing it back.
Mutex vs semaphore: A mutex grants exclusive access to one holder (binary lock). A semaphore is a counter permitting up to N concurrent holders (e.g. a pool of 5 connections).

The three models

The first decision is what’s slowing you down. A CPU-bound workload (hashing, image resizing, number-crunching) spends its time computing, so it wants parallelism — real cores doing real work simultaneously. An I/O-bound workload (database calls, HTTP requests, file reads) spends its time waiting on something else, so it wants concurrency — often async I/O, where one thread fires thousands of requests and reacts as each completes, never blocking on any single wait.

The execution units sit on a spectrum of sharing vs safety:

Threads share the process’s memory. Coordinating them is cheap (just touch a shared variable), but that same shared memory is exactly what lets them corrupt each other’s state — the source of race conditions.
Processes have isolated memory. Nothing one process does can scribble on another’s data (safe), but they’re heavier to spawn and must communicate over IPC (pipes, sockets, shared-memory segments).
Async I/O sidesteps the question: a single thread handles thousands of concurrent waits because none of them block. This is the Node/Python-asyncio model — superb for I/O, useless for CPU work (a heavy computation blocks the one loop and stalls everything; see the event loop in 4.1).

The hazards and the tools

A race condition is when the result depends on the unlucky interleaving of two threads touching shared state. A deadlock is when threads wait on each other’s locks forever. Locks/mutexes and semaphores are the primitives that prevent the first — and, if used carelessly, cause the second. That double edge is the whole story: the same tool that closes the race can hang the program.

Tool	Use when	Avoid when	Gotcha
Mutex / lock	exactly one thread may touch a resource at a time	the section is read-only or contention is nil	lock ordering bugs cause deadlock; keep critical sections tiny
Semaphore (counting)	cap concurrency at N (e.g. 5 DB connections)	you only need mutual exclusion (use a mutex)	forgetting to release leaks permits until exhaustion
Async I/O	many concurrent I/O-bound ops on few threads	CPU-bound work — it'll block the loop	a blocking call in an async path stalls everything
Worker pool / processes	CPU-bound parallel work across cores	lightweight I/O — overhead isn't worth it	shared-memory threads can race; isolated processes need IPC

Match the tool to whether the bottleneck is the CPU or the wait.

A race condition, and the lock that fixes it

Two threads run  balance = balance + 100  on a shared balance of 0:

  T1: read balance (0)
  T2: read balance (0)        ← both read before either writes
  T1: write 0 + 100 = 100
  T2: write 0 + 100 = 100     ← T1's update is lost; result is 100, not 200

FIX — make read-modify-write atomic with a mutex:
  lock(m)
    balance = balance + 100
  unlock(m)
  // now T2 waits for the lock; final balance is 200.

The bug lives in the gap between read and write — a window where another thread can sneak in with a stale value. The += 100 looks atomic in source code but compiles to read, add, store; the mutex collapses those three steps into one indivisible operation, so no other thread can observe the in-between state. (Hardware atomics like compare-and-swap solve the same problem lock-free for simple cases, but the mutex generalizes to any critical section.)

Deadlock: the four conditions and the cheap fix

FIG 1 · the deadlock cycle Thread A holds X and wants Y; thread B holds Y and wants X. The wait-for graph has a cycle, so neither can proceed — they hang forever.

Deadlock needs all four of Coffman’s conditions at once: mutual exclusion (locks are exclusive), hold-and-wait (you hold one lock while requesting another), no preemption (a lock can’t be forcibly taken), and a circular wait (a cycle in the wait-for graph, as above). Break any single one and deadlock is impossible.

01 Learning objectives

0 / 3 done

02 Knowledge check

knowledge check2 questions · pass ≥ 70%

01medium
Concurrency vs parallelism:
02hard
A mutex differs from a counting semaphore in that:

03 Interview questions

browse all ↗

What gets asked on this topic — tap a card for how to approach it, the follow-ups, and the trap. Company tags are best-effort & sourced.

Commonly asked junior concept very common Concurrency vs parallelism — what's the difference?
Concurrency is about *dealing with* many tasks by interleaving them — making progress on several by switching between them, even on a single core. Parallelism is *doing* many tasks at the same instant on multiple cores.
Rob Pike's line: concurrency is about structure, parallelism is about execution. A single-threaded async server is concurrent but not parallel; a CPU-bound job split across 8 cores is parallel. You can have concurrency without parallelism and vice versa.
Follow-ups they push on
- Can you have parallelism without concurrency?
- Where does Node's event loop sit on this axis?
Red flag Using the terms interchangeably. Interleaving on one core is concurrency, not parallelism.
source: GeeksforGeeks — Concurrency vs parallelism ↗
Commonly asked mid concept common Threads vs processes — what's shared, what's isolated, and when do you pick each?
A process has its own isolated memory space; threads within a process share the same heap/address space. Threads are cheaper to create and communicate through shared memory; processes are heavier but isolated — a crash or memory corruption in one process can't directly corrupt another.
Pick threads for fine-grained shared-memory work where communication cost matters; pick processes for isolation and fault containment (and, in languages with a GIL like CPython, to get true CPU parallelism). The tradeoff is shared-memory speed vs. isolation and safety.
Follow-ups they push on
- How do processes communicate without shared memory? (IPC, pipes, sockets)
- Why does the GIL push CPython to multiprocessing for CPU-bound work?
Red flag Assuming threads always parallelize CPU work — a global interpreter lock (CPython) serializes bytecode, so threads help I/O but not CPU-bound loops.
source: GeeksforGeeks — Difference between process and thread ↗
Commonly asked mid debug very common What is a race condition? Show a classic example and how to fix it.
A race condition is when the result depends on the unpredictable timing/interleaving of concurrent operations on shared state. Classic case: two threads run balance = balance + 100. That's read-modify-write: both read the same old value, both add 100, both write back — one update is lost.
Fix by making the critical section atomic: guard it with a mutex/lock, use an atomic increment, or use a compare-and-swap. The general principle is to serialize access to shared mutable state so only one thread is in the critical section at a time.
Follow-ups they push on
- Why isn't `x++` atomic?
- What's a check-then-act race (e.g. 'if not exists, create')?
Red flag Assuming a single statement like `count++` is atomic — it compiles to load/add/store, which can interleave.
source: GeeksforGeeks — Race condition ↗
Commonly asked mid concept common Mutex vs semaphore — define each and when you'd use which.
A mutex provides mutual exclusion: one holder at a time, and ownership matters — the thread that locks it should unlock it. Use it to protect a single shared resource / critical section.
A semaphore is a counter that permits up to N concurrent holders (acquire decrements, release increments, block at zero). A binary semaphore (N=1) resembles a lock but has no ownership and is often used for signaling between threads. Use a counting semaphore to cap concurrency — e.g. limit to 10 simultaneous DB connections.
Follow-ups they push on
- What does 'ownership' give a mutex that a semaphore lacks?
- How would you bound a connection pool with a semaphore?
Red flag Treating a binary semaphore as a drop-in mutex. Without ownership, any thread can release it, which permits subtle bugs a mutex prevents.
source: GeeksforGeeks — Mutex vs semaphore ↗
Commonly asked senior concept common What is a deadlock, what four conditions cause it, and how do you prevent it?
A deadlock is when threads wait forever on each other's locks. It needs all four Coffman conditions simultaneously: mutual exclusion, hold-and-wait, no preemption, and circular wait.
Break any one to prevent it. The most practical: impose a global lock-ordering so all threads acquire locks in the same order (kills circular wait); or acquire all locks at once (kills hold-and-wait); or use lock timeouts / tryLock and back off. Example deadlock: thread A holds lock 1 and wants lock 2 while thread B holds lock 2 and wants lock 1.
Follow-ups they push on
- Which Coffman condition is easiest to remove in practice? (circular wait via ordering)
- Deadlock vs livelock vs starvation?
Red flag Adding 'just one more lock' to fix a race and creating a deadlock instead. Inconsistent lock acquisition order is the usual culprit.
source: GeeksforGeeks — Deadlock and conditions ↗
Commonly asked mid concept common Why does async I/O let a single thread handle thousands of connections?
Most server work is I/O-bound — waiting on the network, disk, or a database. With blocking I/O each connection ties up a thread that just sits idle during the wait, so 10k connections need ~10k threads (expensive memory + context switching).
Non-blocking async I/O flips this: the thread issues the I/O and immediately moves on; the OS notifies it (epoll/kqueue) when data is ready, and a callback/continuation resumes. One thread multiplexes thousands of in-flight waits because no thread blocks on the wait. The catch: a CPU-bound task blocks the loop and starves everyone, so async shines for I/O-bound, not CPU-bound, work.
Follow-ups they push on
- When does async hurt? (CPU-bound work blocking the event loop)
- How is this different from a thread-per-request server?
Red flag Believing async is faster for everything. It wins on I/O concurrency; a heavy CPU computation still blocks the single event-loop thread.
source: MDN — Asynchronous JavaScript / event loop ↗
Commonly asked senior concept occasional What is optimistic vs pessimistic locking, and when do you pick each?
Pessimistic locking assumes conflicts are likely, so it locks the row/resource up front (SELECT ... FOR UPDATE) and others wait. Safe but reduces concurrency and risks deadlocks.
Optimistic locking assumes conflicts are rare: read freely, and at write time check a version number (or timestamp/ETag) — if it changed, someone else won, so reject and retry. Great for low-contention, read-heavy workloads; wasteful retries under high contention. Pick pessimistic for hot, highly-contended rows; optimistic for mostly-independent updates.
Follow-ups they push on
- How does a version column implement optimistic locking?
- How does this map to HTTP's If-Match / 412?
Red flag Using optimistic locking on a hotly-contended counter — you'll thrash on retries. High contention favors pessimistic locking.
source: Martin Fowler — Optimistic Offline Lock ↗
★ must-know Commonly asked senior concept common What is a deadlock vs a livelock vs starvation? Distinguish all three.
Deadlock: threads are blocked forever, each waiting on a resource another holds — nobody moves (e.g. the AB/BA lock-ordering cycle). Livelock: threads aren't blocked and keep *changing state* in response to each other, but make no progress — like two people stepping aside in the same direction repeatedly in a hallway. Starvation: a thread *can* run but is perpetually denied the resource because others keep winning it (e.g. a low-priority thread under a greedy scheduler).
Key distinction: deadlock = stuck and idle; livelock = busy but unproductive; starvation = some progress overall, but one thread is unfairly shut out.
What a strong answer covers
- Deadlock: mutual blocking, zero activity, circular wait.
- Livelock: active state changes but no forward progress.
- Starvation: a thread is runnable but perpetually denied the resource.
- Fairness/aging fixes starvation; lock ordering fixes deadlock.
Quick self-check
Two threads each detect a conflict, both back off and immediately retry in lockstep, repeating forever without blocking. This is:
Follow-ups they push on
- How can a naive retry-on-conflict loop cause livelock?
- How does priority aging address starvation?
Red flag Calling any 'no progress' situation a deadlock. Livelock threads are actively running, and starvation still has overall progress — different causes, different fixes.
source: GeeksforGeeks — Deadlock, Starvation, and Livelock ↗
Commonly asked senior concept occasional Why are atomic operations and compare-and-swap (CAS) faster than locks for simple shared counters?
A mutex involves OS-level machinery: contended threads may block and be parked/woken by the scheduler, which costs context switches. CAS is a single hardware instruction — 'if memory still holds the value I read, swap in the new value, else fail' — so a lock-free counter just loops read → compute → CAS, retry on failure entirely in user space with no blocking.
Under low-to-moderate contention this is much cheaper. The tradeoff: CAS works for small single-word updates; complex multi-variable invariants still need locks, and very high contention makes CAS retry loops spin wastefully. This is the basis of lock-free/atomic data structures.
What a strong answer covers
- Locks can block threads → context-switch and scheduler overhead.
- CAS is one atomic CPU instruction; lock-free loops stay in user space.
- Great for single-word updates (counters, flags); not for multi-variable invariants.
- Under heavy contention, CAS retry loops can spin and waste CPU.
Follow-ups they push on
- What is the ABA problem in CAS-based algorithms?
- When does a spinning CAS loop perform worse than a mutex?
Red flag Assuming lock-free is always faster. CAS shines for tiny updates under modest contention; complex invariants and extreme contention can favor locks.
source: GeeksforGeeks — Compare and Swap (CAS) ↗
Commonly asked mid debug common Debugging: a Node.js endpoint that does heavy synchronous JSON crypto makes ALL other requests slow. Why, and how do you fix it?
Node runs your JavaScript on a single event-loop thread. A heavy *synchronous* CPU task (a big loop, sync crypto, JSON over megabytes) doesn't yield, so it blocks the event loop — every other pending request, timer, and callback stalls until it finishes. Async I/O isn't the issue; CPU-bound sync work is.
Fixes: move the CPU work off the loop — use a worker thread (or worker_threads/a child process), the async variant of the crypto API, or offload to a separate service/queue. The rule: never run long synchronous CPU work on the event loop thread.
What a strong answer covers
- Node's JS executes on one event-loop thread; sync CPU work blocks everything.
- Async I/O is fine — the culprit is synchronous CPU-bound code.
- Fix: worker threads / child process / async crypto / offload to a service.
- Symptom: latency spikes across unrelated endpoints during the heavy call.
Quick self-check
A synchronous CPU-heavy handler slows all Node.js requests. The correct fix is to:
Follow-ups they push on
- Why doesn't adding more async/await help a CPU-bound loop?
- When would you reach for a separate service vs a worker thread?
Red flag Trying to fix it by sprinkling `async/await`. Awaiting doesn't yield during a synchronous CPU loop — you must move the computation off the event-loop thread.
source: Node.js — Don't block the event loop ↗
Commonly asked senior concept occasional What is thread starvation in a connection/thread pool, and how does it cause a 'deadlock' without any locks?
Pool starvation: every thread (or DB connection) in a bounded pool is busy waiting on a resource that can only be supplied by *another* task that's now stuck in the pool's queue with no thread to run it. No mutex is involved, yet the system wedges — a 'pool-induced deadlock'.
Classic case: a request handler holds a pooled thread and synchronously calls back into the same service/pool, which is exhausted; the inner call waits for a thread that will only free up when the outer call returns. Fixes: never block a pooled thread waiting on the same pool, size pools to account for nested calls, separate pools for distinct workloads (bulkheading), and add timeouts so waiters fail fast instead of hanging forever.
What a strong answer covers
- All pool threads/connections busy → queued work can't get a worker.
- A task blocking on work that needs the same exhausted pool wedges the system.
- It looks like a deadlock but has no locks — it's resource exhaustion.
- Fixes: bulkhead separate pools, avoid nested same-pool blocking, add timeouts.
Follow-ups they push on
- How does the bulkhead pattern prevent one workload from starving others?
- Why do checkout/borrow timeouts help even if they don't fix the root cause?
Red flag Blocking a pooled worker on a call that itself needs a worker from the same exhausted pool. Use separate pools and timeouts, and avoid nested same-pool blocking.
source: Microsoft — Bulkhead pattern ↗
Commonly asked senior trick occasional Why is double-checked locking for lazy singleton initialization subtly broken without proper memory visibility?
Double-checked locking checks instance == null, locks only if null, checks again inside the lock, then constructs. The subtle bug is memory visibility / instruction reordering: object construction isn't atomic — the reference can become visible to other threads *before* the constructor's writes are flushed, so a second thread sees a non-null but partially-initialized object.
The fix depends on the memory model: in Java, mark the field volatile (which since JMM 5 establishes the needed happens-before ordering); other languages need their equivalent memory barrier / acquire-release semantics. The deeper lesson: correctness under concurrency needs the language's memory model guarantees, not just mutual exclusion.
What a strong answer covers
- The flaw is reordering/visibility, not the locking logic itself.
- A thread can publish the reference before the constructor's writes are visible.
- Fix in Java: volatile field (post-Java-5 memory model).
- Lesson: concurrency correctness requires memory-model guarantees, not just locks.
Quick self-check
What makes naive double-checked locking unsafe?
Follow-ups they push on
- What does `volatile` guarantee that a plain field doesn't?
- Why is a static holder/initialization-on-demand idiom often cleaner than DCL?
Red flag Believing the second null-check alone makes DCL safe. Without volatile/memory barriers, reordering can expose a half-constructed instance.
source: Wikipedia — Double-checked locking ↗
Commonly asked mid concept common Thread-per-request vs event-loop (reactive) servers — what's the tradeoff at high concurrency?
Thread-per-request (classic Java/Tomcat, Apache prefork) assigns each connection a thread. The model is simple — blocking code reads top-to-bottom — but each thread costs ~1MB+ of stack and context-switch overhead, so tens of thousands of concurrent connections exhaust memory and the scheduler (the C10k problem).
Event-loop / reactive servers (Node, Netty, nginx) handle many connections on a few threads via non-blocking I/O and callbacks, scaling to huge connection counts with low memory. The cost is programming complexity (callbacks/async) and the danger that any blocking call freezes the loop. Threads suit moderate concurrency with blocking dependencies; event loops suit massive I/O-bound concurrency.
What a strong answer covers
- Thread-per-request: simple blocking code, but per-thread memory + context switches cap concurrency.
- Event loop: few threads, non-blocking I/O, scales to huge connection counts.
- This is the classic C10k scaling story.
- Event loops demand non-blocking code; one blocking call stalls everyone.
Follow-ups they push on
- What is the C10k problem?
- How do virtual/green threads (e.g. Java loom, goroutines) blur this divide?
Red flag Assuming 'more threads = more scale'. Past a point, thread memory and context-switching dominate; that's exactly what event-loop models were built to avoid.
source: GeeksforGeeks — Thread per request vs event-driven model ↗