CI/CD

Pipeline stages (build → test → deploy) and writing a basic GitHub Actions workflow.

CI/CD automates the path from a pushed commit to running software: CI (continuous integration) builds and tests every change so problems surface in minutes, and CD (continuous delivery/deployment) takes a green build and ships it. The mental model is a pipeline of stages — build → test → deploy — where a failure at any stage stops the line before bad code reaches users. The payoff is fast feedback: a broken build is caught minutes after the commit, while the author still has the change in their head, instead of days later in a manual QA pass.

Key vocabulary

CI vs CD: Continuous Integration = merge + build + test every change automatically. Continuous Delivery = every green build is deployable (a human clicks release). Continuous Deployment = every green build auto-ships to production with no manual gate.
Pipeline / workflow: The ordered set of stages triggered by an event (push, PR, tag). In GitHub Actions a workflow contains jobs; each job contains steps; jobs run in parallel by default unless one needs another.
Runner: The machine that executes a job — a fresh, ephemeral VM/container (e.g. ubuntu-latest) for cloud-hosted runners, or your own self-hosted box. Each job gets a clean environment, so you can't rely on state from a previous job.
Artifact: A build output (compiled binary, Docker image, test report) produced in one stage and passed to a later one — what the build stage hands the deploy stage so you deploy exactly what you tested.
Fail fast: Ordering checks so the cheapest, most-likely-to-fail ones run first — lint and unit tests before slow e2e and deploy — so a broken change is rejected in seconds, not after twenty minutes of pipeline.
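
As a minimal sketch of how an artifact crosses the job boundary, the workflow below has one job upload a build output and a second job download it onto its own fresh runner. The `npm run build` script, the `dist/` output directory, the artifact name `web-dist`, and the `ls` stand-in for a deploy command are assumptions for illustration.

```yaml
name: Build and hand off

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build              # assumed to write its output to dist/
      - uses: actions/upload-artifact@v4
        with:
          name: web-dist                # hypothetical artifact name
          path: dist/

  deploy:
    needs: build                        # runs on a new runner; dist/ does not exist here yet
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: web-dist
          path: dist/
      - run: ls -R dist/                # stand-in for a real deploy command
```

The deploy job never rebuilds; it only receives what the build job produced.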

The stages: build → test → deploy

Each stage gates the next. The ordering matters because each stage is cheaper and faster than the one after it, so a broken change should be rejected at the earliest, cheapest stage that can catch it (fail fast).

FIG 1 · The pipeline. A push triggers the pipeline; each stage must go green to unlock the next, and only a fully-tested artifact reaches deploy. A red stage stops the line.

Stage	What runs	Where it runs	Fails the build when
Build	Compile, install deps, lint, build the image/artifact	On a runner, on every push/PR	Code doesn't compile, lint errors, broken deps
Test	Unit → integration → e2e (cheap-to-expensive)	Runner, often parallel matrix across versions	Any test fails or coverage drops below the gate
Deploy	Push the tested artifact to an environment	Runner with deploy creds → staging/prod	Smoke checks fail post-deploy (→ auto-rollback)

Cheap, fast checks first; the slow, costly deploy only runs on a fully-green build.

A healthy pipeline deploys the same artifact through environments — build once in the build stage, then promote that identical image through staging to production. Rebuilding per-environment risks shipping something subtly different from what your tests validated: a dependency that resolved to a newer patch version, a different build timestamp, a config baked at the wrong moment. “Test what you ship, ship what you tested” is only true if it’s literally the same bytes.
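
One way to promote without rebuilding is to add a new tag to the image that already exists in the registry. The sketch below assumes an image was pushed earlier as `ghcr.io/<owner>/<repo>:<commit sha>`, that the repository name is lowercase, and that a tag named `staging` marks what staging should run; the manual trigger and the tag name are illustrative choices, not a fixed convention.

```yaml
name: Promote to staging

on:
  workflow_dispatch:                    # manual trigger, chosen for the sketch

jobs:
  promote:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    env:
      IMAGE: ghcr.io/${{ github.repository }}
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      # Points a new tag at the existing manifest; nothing is rebuilt.
      - run: docker buildx imagetools create -t "$IMAGE:staging" "$IMAGE:${{ github.sha }}"
```

Because only the tag changes, the digest running in staging is the one the tests ran against.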

A GitHub Actions workflow

A workflow is a YAML file in .github/workflows/. You declare what triggers it (on:), then one or more jobs, each a sequence of steps. Here is a complete CI workflow that lints, tests, and (only on the main branch) builds and pushes a Docker image.

.github/workflows/ci.yml

name: CI

on:
  push:
    branches: [main]
  pull_request:           # also run on every PR

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node: [20, 22, 24]   # test across Node versions in parallel
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
          cache: npm
      - run: npm ci
      - run: npm run lint
      - run: npm test -- --coverage

  build-image:
    needs: test                          # only after every test job is green
    if: github.ref == 'refs/heads/main'  # and only on main
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}

The test job runs on every push and PR, fanning out across three Node versions via the matrix. The build-image job declares needs: test, so it waits for all three matrix legs to pass, and the if: guard limits it to the main branch — PRs get tested but never publish an image. Note ${{ ... }} is GitHub Actions’ expression syntax (contexts, secrets, matrix values), and secrets.GITHUB_TOKEN is an auto-provisioned credential scoped by the permissions: block.

The pwn request: when CI runs untrusted code with secrets

The most damaging CI vulnerability class isn’t a flaky test — it’s a misconfigured trigger that hands repository secrets to attacker-controlled code. The classic case: a workflow uses pull_request_target (which runs with the base repo’s secrets) but then checks out and runs code from the fork’s PR head. A malicious contributor opens a PR, the workflow dutifully executes their npm scripts or build steps with secrets.* in the environment, and those secrets get exfiltrated. The same shape bites self-hosted runners on public repos: an untrusted PR can run arbitrary commands on a machine inside your network. The hardening is the least-privilege mindset applied to CI: never run untrusted PR code with privileged triggers, scope permissions: to the minimum each job needs, prefer the short-lived auto-provisioned GITHUB_TOKEN over long-lived personal tokens, and pin third-party actions to a full commit SHA so a compromised action tag can’t silently swap in malicious code. Your pipeline executes code on every push — treat it as a production attack surface, because it is one.
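
A hedged sketch of the safer shape for testing contributions from forks: use the plain `pull_request` trigger, which gives fork PRs a read-only token and no repository secrets, and keep the token permissions minimal. The job contents are illustrative; in a hardened setup the `@v4` reference would be replaced by the full commit SHA of a release you have reviewed.

```yaml
name: PR checks

on:
  pull_request:                 # not pull_request_target: fork PRs get no repository secrets

permissions:
  contents: read                # the token can read the repo and nothing else

jobs:
  test:
    runs-on: ubuntu-latest      # GitHub-hosted; avoid self-hosted runners for untrusted PRs
    steps:
      # Pin to a full commit SHA in production so a moved tag cannot swap the code.
      - uses: actions/checkout@v4
        with:
          persist-credentials: false   # do not leave the token in the local git config
      - run: npm ci
      - run: npm test
```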

Key points

CI = build + test every change automatically; CD = ship green builds (delivery = manual release gate, deployment = fully automatic).
Pipeline stages build → test → deploy gate one another — fail fast on the cheapest checks; promote the same artifact through environments rather than rebuilding.
In GitHub Actions: a workflow has jobs (parallel by default; chain with needs) made of steps (run: commands or uses: actions). Triggers are declared under on:.
Each job runs on a fresh, ephemeral runner — pass build outputs forward with artifacts or a registry; nothing persists between jobs.
Keep secrets in the secrets store (${{ secrets.* }}) and scope tokens least-privilege; never commit credentials to the workflow YAML, and never run untrusted PR code with privileged triggers.

02 Curated reading

GitHub Actions — Documentation
optional 20m — Triggers, jobs, steps for a first workflow.

03 Interview questions

Commonly asked · mid · concept · very common
What is the difference between continuous integration, continuous delivery, and continuous deployment?
Continuous integration (CI): developers merge to a shared branch frequently, and every push automatically builds and runs the test suite, so integration problems surface in minutes, not at a big-bang merge.
Continuous delivery (CD): every change that passes CI is automatically built into a deployable, release-ready artifact and pushed through environments up to a staging gate — but the final push to production is a manual button.
Continuous deployment: the same pipeline, with the manual gate removed — every change that passes all automated checks goes straight to production, no human in the loop. The distinction people get wrong is delivery (human approves the prod release) vs deployment (fully automated to prod).
Follow-ups they push on
- Where exactly is the manual gate in delivery vs deployment?
- What must be true about your test suite to safely do continuous deployment?
Red flag Using 'continuous delivery' and 'continuous deployment' interchangeably — the difference is whether a human approves the production release.
source: GitHub docs — About continuous integration ↗
Commonly asked · mid · coding · very common
Write a basic GitHub Actions workflow that runs tests on every pull request. Explain the trigger, jobs, and steps.
A workflow is YAML in .github/workflows/. The top-level on sets the trigger, jobs are units that run on a runner, and each job has steps.
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
on: pull_request triggers on every PR to main; the single test job runs on a fresh Ubuntu runner; steps check out the code, set up Node, install deps deterministically with npm ci, and run the suite. Jobs run in parallel by default; needs: makes one wait on another.
Follow-ups they push on
- How do you make a deploy job run only after the test job passes?
- Why `npm ci` instead of `npm install` in CI?
Red flag Forgetting `actions/checkout` (the runner starts empty, so the build has no source), or using `npm install` instead of `npm ci`: `npm install` may rewrite the lockfile rather than install exactly what it pins, so builds become non-reproducible.
source: GitHub docs — Writing workflows / quickstart ↗
Commonly asked · mid · concept · common
Explain the typical stages of a CI/CD pipeline: build, test, deploy. What runs where?
Build: compile/transpile, install dependencies, and produce a versioned, immutable artifact (a binary, a bundle, or — most commonly — a container image) pushed to a registry. The key principle is build once and promote that same artifact through every environment.
Test: run fast unit tests first (fail early), then integration tests, then optionally end-to-end tests, plus quality and security scans (lint, SAST, dependency/vulnerability scan). Order from cheapest/fastest to slowest so the pipeline fails fast.
Deploy: ship the already-built artifact to staging, run smoke tests, then promote to production with a rollout strategy (rolling/blue-green/canary) and health checks that can trigger automatic rollback. Building a fresh artifact per environment is the anti-pattern — you would no longer be testing what you ship.
Follow-ups they push on
- Why build the artifact once and promote it rather than rebuilding per environment?
- Why run unit tests before integration and e2e tests?
Red flag Rebuilding the artifact separately for staging and production — you then deploy something you never actually tested, defeating the point of the pipeline.
source: GitHub docs — About continuous deployment ↗
Commonly asked · senior · concept · common
How do you handle secrets (API keys, deploy credentials) in a CI/CD pipeline?
Never hardcode secrets in source, the workflow file, or build logs. Inject them at runtime from a secret store: GitHub Actions encrypted secrets / environments, or an external manager like HashiCorp Vault, AWS Secrets Manager, or a cloud key vault. The CI system makes them available as masked env vars so they do not print in logs.
Stronger still: prefer short-lived, scoped credentials over long-lived static keys — for cloud deploys, use OIDC so the workflow exchanges its identity token for temporary cloud credentials, eliminating stored long-lived keys entirely. Scope secrets to the environment that needs them and gate production secrets behind required reviewers. And remember a secret echoed into a log or committed to git is compromised forever — rotate it.
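
A minimal sketch of the OIDC exchange, assuming AWS as the cloud and an IAM role that already trusts this repository's identity token. The role ARN, the region, and the environment name are hypothetical placeholders.

```yaml
name: Deploy with OIDC

on:
  push:
    branches: [main]

permissions:
  id-token: write               # lets the job request an OIDC identity token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production     # assumed environment name
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy-role   # hypothetical role
          aws-region: us-east-1                                                # assumed region
      - run: aws sts get-caller-identity   # shows the temporary identity the job received
```

No long-lived cloud key is stored anywhere in this workflow; the credentials expire on their own shortly after the job ends.
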
Follow-ups they push on
- Why is OIDC-based short-lived credential exchange better than a stored static cloud key?
- What do you do the moment a secret leaks into a build log?
Red flag Putting credentials in the repo or in plain workflow env, or echoing a secret in a debug step — once it lands in git history or a log it must be treated as permanently compromised and rotated.
source: GitHub docs — Using secrets in GitHub Actions ↗
Commonly asked · senior · concept · common
Compare blue-green and canary deployment strategies. When would you choose each?
Blue-green runs two full environments: blue (current) serves all traffic while green (new) is deployed and verified, then you flip traffic to green at once. Rollback is instant — flip back to blue. Cost: you run double the infrastructure during the cutover, and a bad release hits 100% of users the moment you switch.
Canary releases the new version to a small slice of traffic (say 5%), watches error rates and latency, then gradually ramps to 100%. It limits blast radius and catches problems with real traffic before everyone is exposed, but it is more complex (traffic splitting, automated metric analysis) and the rollout is slower.
Pick blue-green when you want a clean, instant, all-or-nothing switch and can afford duplicate capacity; pick canary when blast-radius control matters and you have the observability to judge a partial rollout.
Follow-ups they push on
- What does each strategy give you for rollback?
- What observability do you need to run a canary safely?
Red flag Calling a deployment a 'canary' when there is no automated metric analysis gating the ramp — without watching error/latency on the small slice, you have just slowed down a full rollout, not limited blast radius.
source: AWS — Blue/Green vs Canary deployment strategies ↗
Commonly asked · senior · debug · common
Your CI build passes locally but fails intermittently in the pipeline. How do you approach a flaky build?
Flakiness almost always comes from hidden non-determinism. Hunt the usual sources: tests that depend on execution order or shared mutable state; reliance on real time/timezone, random seeds, or wall-clock sleeps instead of waiting on a condition; tests hitting real networks/external services; and concurrency races. The 'works locally' clue points at environment differences — different dependency versions, missing lockfile pinning, or fewer CPUs on the runner exposing a race.
Approach: make it reproducible (run the suite repeatedly, randomize order, run in a clean container matching CI), then isolate the offending test and fix the root cause. Pin dependencies with a lockfile and npm ci, mock external calls, and replace sleeps with explicit waits. Blanket auto-retry hides flakes and erodes trust in the suite — fix, do not paper over.
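
The "run the suite repeatedly" step can be sketched as a manually triggered workflow that loops the tests on a CI runner and stops at the first failure. The count of 20 and the Node version are arbitrary choices for illustration.

```yaml
name: Flake hunt

on:
  workflow_dispatch:            # run by hand while investigating

jobs:
  repeat-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Run the suite 20 times, stop at the first failure
        run: |
          for i in $(seq 1 20); do
            echo "attempt $i"
            npm test || exit 1
          done
```
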
Follow-ups they push on
- Why does 'passes locally' point you toward environment/ordering differences?
- Why is blindly retrying failed tests a bad long-term fix?
Red flag Slapping an automatic retry on the whole suite so red turns green — the underlying race or shared-state bug stays, and the team stops trusting CI failures.
source: GitHub docs — Continuous integration concepts ↗
Commonly asked · senior · concept · common
What is a deployment gate / required approval, and where do manual gates belong in a pipeline?
A gate is a condition that must pass before a stage proceeds — automated (tests green, security scan clean, smoke checks pass) or manual (a required human approval). In GitHub Actions you implement this with environments that have required reviewers and optionally a wait timer or branch restrictions; a job targeting that environment pauses until approved.
Where gates belong: automated quality gates everywhere (fail fast on tests/lint/scans), and a manual approval only at the boundary you actually want a human to own — typically the promotion to production. That manual prod gate is exactly the line between continuous *delivery* (human approves prod) and continuous *deployment* (no gate). You also gate to protect the production *secrets/credentials*, which are scoped to that environment and unlocked only after approval.
The senior framing: minimize manual gates (they create bottlenecks and false confidence) and lean on strong automated checks; reserve human approval for genuinely high-risk, irreversible promotions.
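
As a sketch of the manual gate, the deploy job below targets an environment named `production`. The required reviewers themselves are configured in the repository's environment settings, not in the YAML; the `./scripts/deploy.sh` script and the `DEPLOY_TOKEN` secret are hypothetical.

```yaml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  deploy-prod:
    needs: test                   # automated gate: tests must be green
    runs-on: ubuntu-latest
    environment: production       # manual gate: waits for the environment's protection rules
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh  # hypothetical deploy script
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}   # hypothetical environment-scoped secret
```
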
What a strong answer covers
- A gate blocks a stage until a condition passes — automated (tests/scans) or manual (approval).
- GitHub Actions: environments with required reviewers / wait timer pause a job until approved.
- Put automated gates everywhere (fail fast); reserve manual approval for the prod promotion.
- That manual prod gate is the line between continuous delivery and continuous deployment.
- Environment gates also protect prod secrets, unlocked only after the gate passes.
Follow-ups they push on
- How does a required-reviewer environment gate relate to delivery vs deployment?
- Why can too many manual gates be worse than fewer, stronger automated ones?
- How does gating an environment also protect production credentials?
Red flag Gating every stage with manual approvals 'to be safe' — it creates bottlenecks and rubber-stamp approvals; strong automated gates plus a single human gate at prod promotion is the better pattern.
source: GitHub docs — Using environments for deployment ↗
Commonly asked · mid · concept · common
Why and how do you cache dependencies in CI? What's the difference between caching and an artifact?
CI runners start clean every run, so without caching you re-download every dependency on each build, which is slow and wasteful. A dependency cache restores directories such as node_modules or ~/.npm, keyed on a hash of the lockfile (package-lock.json): a cache *hit* restores them in seconds; a cache *miss* (lockfile changed) reinstalls and saves a fresh cache. In GitHub Actions the setup-* actions can do this with one cache: line, or you can use actions/cache directly.
The distinction interviewers want: a cache is a build-time optimization — it is keyed, can be evicted, and you must never *depend* on it existing (a miss must still produce a correct build). An artifact is an *output* you deliberately persist — the built binary/image/test report you pass between jobs or download later. Cache = speed, may vanish; artifact = a result you must keep.
Key the cache carefully: too broad and you serve stale deps; too narrow and you never hit it. Hashing the lockfile is the sweet spot.
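
A minimal sketch of the explicit form with `actions/cache`, keyed on the lockfile hash. The `npm-` key prefix is an arbitrary naming choice; caching `~/.npm` (npm's download cache) rather than `node_modules` is one common option, since `npm ci` recreates `node_modules` anyway.

```yaml
name: CI with cache

on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # Exact hit only while the lockfile is unchanged.
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          # Fallback: start from the most recent older cache for this OS.
          restore-keys: |
            npm-${{ runner.os }}-
      - run: npm ci               # still correct on a miss, just slower
      - run: npm test
```
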
What a strong answer covers
- Runners are ephemeral; caching avoids re-downloading deps every run.
- Key the cache on a lockfile hash — hit restores fast, miss rebuilds and re-saves.
- Cache = build-time speedup, evictable, must never be *required* for correctness.
- Artifact = a deliberate output you persist (binary/image/report) and pass between jobs.
- Bad cache keys cause stale dependencies (too broad) or constant misses (too narrow).
Quick self-check
What is the right cache key for a Node project's `node_modules` cache?
Follow-ups they push on
- Why must your build still succeed on a cache miss?
- What goes wrong if your cache key is the branch name instead of the lockfile hash?
- When would you use an artifact instead of a cache?
Red flag Treating a cache like an artifact and depending on it being present, or keying it too loosely so a stale `node_modules` is restored after the lockfile changed — leading to 'works in CI but with old deps' bugs.
source: GitHub docs — Caching dependencies to speed up workflows ↗
Commonly asked · mid · concept · occasional
How do you run the same CI job across multiple language versions or OSes efficiently?
Use a build matrix. Instead of copy-pasting a near-identical job per Node version or OS, you declare a matrix and CI fans out one job per combination automatically, running them in parallel. In GitHub Actions:
strategy:
  matrix:
    node: [18, 20, 22]
    os: [ubuntu-latest, windows-latest]
That single job definition expands to 6 parallel jobs (3 versions × 2 OSes), each on its own runner. You can include/exclude specific combinations and set fail-fast (cancel the rest on first failure) on or off depending on whether you want full results.
The value is coverage without duplication: test the support matrix you promise users, catch a version-specific break early, and keep the workflow DRY. The tradeoff is runner minutes — a wide matrix multiplies cost, so test the combinations that matter, not every permutation.
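
A sketch of the same matrix with one combination excluded and `fail-fast` turned off, so every remaining leg reports its own result. Dropping Node 18 on Windows is an arbitrary example of an unsupported pairing.

```yaml
name: Matrix CI

on:
  pull_request:

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false            # let every leg finish even if one fails
      matrix:
        node: [18, 20, 22]
        os: [ubuntu-latest, windows-latest]
        exclude:
          - node: 18              # hypothetical unsupported combination
            os: windows-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci
      - run: npm test
```

With the exclusion, the 3 × 2 cross-product of 6 jobs shrinks to 5.
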
What a strong answer covers
- A matrix fans one job definition out into one parallel job per combination.
- matrix: { node: [...], os: [...] } expands to the cross-product, each on its own runner.
- include/exclude tune specific combos; fail-fast controls cancel-on-first-failure.
- Gives coverage of your support matrix without duplicating job YAML.
- Cost grows with the cross-product — test combinations that matter, not every permutation.
Follow-ups they push on
- What does `fail-fast: false` change about a matrix run?
- How would you exclude one specific version/OS combination?
- What's the cost tradeoff of a very wide matrix?
Red flag Duplicating an entire job per version/OS instead of using a matrix — it's verbose, drifts out of sync, and you forget to update one copy; the matrix keeps all combinations defined in one place.
source: GitHub docs — Running variations of jobs in a workflow (matrix) ↗
Commonly asked · senior · concept · occasional
Why is a fast CI feedback loop so important, and how do you keep a pipeline fast as it grows?
The whole point of CI is fast feedback on whether a change is safe. A pipeline that takes 40 minutes breaks the developer's flow — they context-switch, stack up un-merged PRs, and start ignoring or working around the signal. Speed is what keeps CI trustworthy and keeps people integrating frequently.
Keep it fast as it grows: parallelize (split the test suite across runners / use a matrix), fail fast by ordering cheap checks first (lint and unit tests before slow e2e), cache dependencies and build outputs, and only run what changed for large monorepos (path filters / affected-project detection). Build the artifact once and promote it rather than rebuilding per stage.
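
The "only run what changed" idea can be sketched with path filters on the trigger. The `services/api/` directory and the workflow file name `api-ci.yml` are hypothetical monorepo paths.

```yaml
name: API CI

on:
  pull_request:
    paths:
      - "services/api/**"                 # hypothetical project directory
      - ".github/workflows/api-ci.yml"    # rerun when this workflow itself changes

jobs:
  test-api:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/api   # run steps run inside the project directory
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
```

A PR that touches only other directories never starts this workflow, so it costs no runner minutes.
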
The senior framing: treat pipeline duration as a product metric you budget and watch — when a stage gets slow, profile it like you would slow code. A flaky or slow pipeline is a tax on every single merge.
What a strong answer covers
- CI exists for fast feedback; a slow pipeline breaks flow and erodes trust in the signal.
- Parallelize test suites and use matrices to spread work across runners.
- Fail fast: cheap checks (lint, unit) before slow ones (integration, e2e).
- Cache deps/build outputs and only run what changed in big monorepos.
- Treat pipeline duration as a tracked metric — profile a slow stage like slow code.
Follow-ups they push on
- Why does ordering fast tests before slow ones matter even at the same total cost?
- How does 'only test what changed' work in a monorepo?
- What's the cost of letting a pipeline creep to 40 minutes?
Red flag Letting pipeline time creep unbounded — once feedback takes tens of minutes, developers batch changes and stop trusting CI, which defeats the purpose of continuous integration entirely.
source: GitHub docs — About continuous integration ↗
Commonly asked · senior · debug · occasional
A deploy to production succeeds but the app is broken; rolling back code didn't fix it. How do you reason about the failure and prevent it?
First separate the layers: a 'green' deploy only means the *pipeline* succeeded, not that the app *works*. If rolling back the code didn't fix it, the breakage is almost certainly not in the code artifact — look at the things that aren't versioned with the image: a database migration that already ran (and is irreversible), a changed config/feature flag, a new infra/secret value, or a dependency/external service.
The migration case is the classic trap: code rolls back instantly, but a schema change (dropped column, altered type) does not, so old code now hits an incompatible schema. The discipline is backward-compatible, expand-then-contract migrations — deploy schema changes that both old and new code can run against, ship code, then remove the old shape in a later release — so rolling back code is always safe.
Prevention: add post-deploy smoke tests/health checks that gate the rollout (so a broken deploy auto-rolls-back before users see it), decouple migrations from code deploys, use feature flags to separate 'deployed' from 'released', and ensure rollbacks are actually tested, not assumed.
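
A hedged sketch of a smoke check that gates the deploy, with a rollback step that runs only when an earlier step failed. The `deploy.sh` and `rollback.sh` scripts and the health URL are hypothetical; note that this only rolls back the code artifact, not a migration or config change.

```yaml
name: Deploy with smoke check

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: ./scripts/deploy.sh "${{ github.sha }}"    # hypothetical deploy script
      - name: Smoke check
        # --fail turns an HTTP error status into a non-zero exit code.
        run: curl --fail --silent --show-error --retry 5 --retry-delay 5 https://example.com/healthz
      - name: Roll back
        if: failure()                                   # runs only if a previous step failed
        run: ./scripts/rollback.sh                      # hypothetical rollback script
```
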
What a strong answer covers
- A green pipeline ≠ a working app — 'success' is about the deploy, not behavior.
- If code rollback didn't help, the cause is unversioned state: migrations, config, flags, secrets, deps.
- Irreversible DB migrations are the classic trap — code reverts, schema doesn't.
- Fix with expand-then-contract backward-compatible migrations so rollback is always safe.
- Prevent with post-deploy smoke tests that gate/auto-rollback, plus feature flags to separate deploy from release.
Follow-ups they push on
- Why doesn't rolling back code fix a forward database migration?
- What does an expand-then-contract migration look like in practice?
- How do feature flags let you separate 'deployed' from 'released'?
Red flag Assuming a code rollback always restores a known-good state — irreversible schema migrations and out-of-band config changes aren't part of the artifact, so the rollback leaves old code running against a changed world.
source: GitHub docs — About continuous deployment ↗
Commonly asked · senior · concept · occasional
Why is trunk-based development paired with feature flags so common in CI/CD, and what problem does it solve over long-lived branches?
Long-lived feature branches drift away from main for days or weeks, so when they finally merge you get merge hell — big, painful, conflict-ridden integrations exactly when you can least afford surprises. That defeats the 'continuous' in continuous integration, whose whole premise is integrating *frequently* so problems surface in small, cheap increments.
Trunk-based development has everyone commit small changes to main (or very short-lived branches merged within a day), keeping the branch always releasable. The obvious tension: how do you merge unfinished work without shipping it? Feature flags — you merge the code behind an off-by-default flag, so it's integrated and tested continuously but invisible to users until you flip it on. This also decouples deploy from release: deploying code and exposing a feature become separate decisions, enabling canary/gradual rollouts and instant kill-switches.
Senior framing: small frequent merges + flags keep integration cheap and continuous and make release a runtime toggle rather than a deployment event — at the cost of flag hygiene (you must clean up stale flags).
What a strong answer covers
- Long-lived branches drift from main → painful big-bang merges that defeat continuous integration.
- Trunk-based: small frequent commits to main, kept always releasable.
- Feature flags let you merge unfinished work off-by-default — integrated and tested, not yet exposed.
- Flags decouple deploy from release: shipping code and turning a feature on are separate decisions.
- Enables canary/gradual rollout + instant kill-switch; cost is flag hygiene (remove stale flags).
Follow-ups they push on
- How do feature flags let you merge incomplete work to main safely?
- What does 'decoupling deploy from release' buy you operationally?
- What's the maintenance cost of feature flags over time?
Red flag Sitting on a long-lived branch 'until the feature is done' — it diverges from main and turns into a high-risk merge; the CI premise is to integrate small changes continuously, using flags to hide the unfinished parts.
source: GitHub docs — About continuous integration ↗