> cs·fundamentals
interview 0% 28m read
3.7 [J][A] 11 interview Q's

Stored routines, views & triggers

The database objects beyond tables: views and materialized views, stored procedures/functions, and triggers — plus UPSERT, full-text search, and JSON/JSONB. Logic and shapes that live in the database itself.

Tables hold your data, but a database can hold logic and derived shapes too — saved queries (views), reusable code (routines), and code that fires on writes (triggers) — plus a few high-leverage features (UPSERT, full-text search, JSON) that save you from reaching for another tool. Knowing these by name means you recognize them in a legacy schema and reach for them when they’re the right call.

Views and materialized views

A view is a query with a name. It stores no data; every time you select from it, the database runs the underlying query against the live tables. It’s the cleanest way to encapsulate a gnarly join or restrict which columns a role can see.

A view that hides a join
-- define once
CREATE VIEW customer_order_totals AS
SELECT c.id, c.name, COUNT(o.id) AS orders, COALESCE(SUM(o.total), 0) AS spent
FROM        customers c
LEFT JOIN   orders    o ON o.customer_id = c.id
GROUP BY    c.id, c.name;

-- query it like any table
SELECT name, spent FROM customer_order_totals WHERE spent > 1000 ORDER BY spent DESC;

The view doesn’t store anything — that second query re-runs the join every time. That’s perfect when the data must be live, but expensive if the aggregation is heavy and queried constantly. That’s where a materialized view earns its keep.

A view recomputes from base tables on every read; a materialized view stores a cached result refreshed on demand.VIEW — virtualSELECT … (saved query)recomputed every readcustomersordersalways fresh · cost paid each queryMATERIALIZED — storedcached result on diskonly on REFRESHcustomersordersfast read · data is a snapshot (can be stale)
FIG 1 · view vs materialized view A view is a window onto live tables (always fresh, recomputed each query). A materialized view is a stored snapshot (fast to read, stale until you REFRESH).

A materialized view stores the result physically:

CREATE MATERIALIZED VIEW daily_revenue AS
SELECT date_trunc('day', created_at) AS day, SUM(total) AS revenue
FROM orders GROUP BY 1;

REFRESH MATERIALIZED VIEW daily_revenue;   -- recompute when you want fresh numbers

Reads are now as fast as a table scan, but the numbers are only as current as your last REFRESH (often run on a schedule). The trade is the same one as caching: read speed for staleness.

Stored procedures and functions

A stored procedure or user-defined function is logic that lives and runs inside the database. A function returns a value and can be called from a query; a procedure runs a block of statements.

-- PostgreSQL (PL/pgSQL) — a function used inside a query
CREATE FUNCTION order_total(oid INT) RETURNS NUMERIC AS $$
  SELECT COALESCE(SUM(quantity * unit_price), 0)
  FROM order_items WHERE order_id = oid;
$$ LANGUAGE sql;

SELECT id, order_total(id) FROM orders;
In the database (routine)In the application
Round-tripsone call does the workmay need several queries
Reuse / enforcementevery client gets the same logiceach app must reimplement it
Visibilityhidden from app code — easy to forgetlives with the rest of your code
Testing / version controlawkward (lives in the DB)normal unit tests, normal diffs
Routines trade visibility and testability for fewer round-trips and central enforcement. Modern apps lean app-side; routines shine for data-heavy logic and legacy systems.

Triggers

A trigger runs a function automatically when a write happens. The classic, safe use is an audit log — record every change without the app having to remember to.

An audit trigger (conceptual)
-- when a product's price changes, log the old and new value automatically
CREATE TRIGGER log_price_change
AFTER UPDATE OF price ON products
FOR EACH ROW
WHEN (OLD.price IS DISTINCT FROM NEW.price)
EXECUTE FUNCTION write_price_audit();   -- inserts into price_audit(old, new, at)

Now any UPDATE to products.price — from the app, a migration, or a DBA at the console — is logged. No app code can forget to do it, because the database does it.

UPSERT — insert or update, atomically

A constant need: “insert this row, or if it already exists, update it.” Done naively with a SELECT then an INSERT/UPDATE, two concurrent requests race and one fails or duplicates. UPSERT does it in one atomic statement.

-- add 10 to stock, or create the row at 10 if the product isn't tracked yet
INSERT INTO inventory (product_id, qty) VALUES (3, 10)
ON CONFLICT (product_id) DO UPDATE
  SET qty = inventory.qty + EXCLUDED.qty;   -- EXCLUDED = the row you tried to insert

ON CONFLICT (product_id) names the unique constraint that decides “already exists.” The SQL-standard spelling is MERGE; Postgres and SQLite use ON CONFLICT. Either way it replaces the read-modify-write race from chapter 3.5 with a single safe statement.

Reaching for WHERE description LIKE '%wireless%' feels natural — and it’s a trap. A leading wildcard ('%...') can’t use a B-tree index (chapter 3.4’s leftmost-prefix rule), so it scans every row. For real text search, use the engine’s full-text index:

-- PostgreSQL: an indexed full-text match, with stemming ("mice" matches "mouse")
SELECT name FROM products
WHERE to_tsvector('english', description) @@ to_tsquery('wireless & mouse');

tsvector normalizes the text (lowercasing, stemming, dropping stop-words) and a GIN index makes the match fast. MySQL’s equivalent is a FULLTEXT index + MATCH … AGAINST. For anything beyond simple matching (ranking, typo tolerance, facets) a dedicated search engine (Elasticsearch, Postgres + pg_trgm) is the next step — but built-in FTS handles a surprising amount.

JSON / JSONB — relational and document in one

Modern SQL engines store and query JSON natively, so “I need MongoDB for this flexible field” is often wrong — a JSONB column does it without leaving your relational database.

-- events(id, data JSONB)
SELECT data->>'user' AS user        -- ->> extracts a field as text
FROM   events
WHERE  data @> '{"type": "click"}'  -- @> = "contains this JSON" (GIN-indexable)
  AND  (data->>'value')::int > 10;

JSONB is stored in a parsed binary form and can be indexed, so containment queries are fast. The guidance: keep the structured, queried, related data in real columns (they get constraints, types, and clean joins) and use JSONB for the genuinely variable or sparse bag of attributes — not as an excuse to avoid schema design.

01 Learning objectives

0 / 7 done

02 Curated reading

03 Knowledge check

knowledge check6 questions · pass ≥ 70%
  1. 01easy

    UPSERT (INSERT … ON CONFLICT DO UPDATE) lets you…

  2. 02easy

    A trigger runs automatically on INSERT/UPDATE/DELETE, even if the app code doesn't call it.

  3. 03medium

    A view vs a materialized view:

  4. 04medium

    Why is WHERE description LIKE '%mouse%' slow on a big table?

  5. 05medium

    SQLite supports stored procedures.

  6. 06medium

    A JSONB column is useful for…

04 Interview questions

browse all ↗

What gets asked on this topic — tap a card for how to approach it, the follow-ups, and the trap. Company tags are best-effort & sourced.

  • Commonly asked mid concept common What is the difference between a view and a materialized view, and when would you use each?

    A regular view is just a saved query — it stores no data. Every read re-runs the underlying SELECT against the live tables, so results are always current but you pay the full query cost on each access.

    A materialized view stores the computed result on disk, so reads are cheap — but the data is a snapshot that goes stale until you REFRESH MATERIALIZED VIEW.

    Use a plain view to centralize/simplify a query, present a stable interface, or restrict columns for security. Use a materialized view when the query is expensive and slightly-stale results are acceptable: dashboards, reporting rollups, precomputed aggregates.

    Red flag Believing a regular view caches its results (it doesn't — it re-executes every time), or treating a materialized view as always up to date.

    source: PostgreSQL — Materialized Views ↗
  • Commonly asked mid trick occasional Can you INSERT/UPDATE/DELETE through a view?

    Sometimes. A simple view — one base table, no aggregation, DISTINCT, GROUP BY, window functions, or set operations — is automatically updatable: writes pass straight through to the base table. Complex views (joins, aggregates) are not directly writable; you make them writable with an INSTEAD OF trigger that translates the change to the right base tables.

    Add WITH CHECK OPTION so an INSERT/UPDATE can't create a row that would fall outside the view's WHERE and silently disappear.

    Red flag Assuming any view is updatable, then being surprised when a write to a join/aggregate view errors out.

    source: PostgreSQL — CREATE VIEW (updatable views) ↗
  • Commonly asked mid concept common What are triggers good for, and why are they dangerous in production?

    A trigger runs a function automatically on INSERT/UPDATE/DELETE (BEFORE, AFTER, or INSTEAD OF). Legitimate uses: writing audit/history rows, enforcing invariants the schema can't express, maintaining a derived/denormalized column, or keeping a summary table in sync.

    The danger is that triggers are invisible side effects. They fire on every row change, hide business logic away from the application, add latency to every write, can cascade or recurse, and quietly make bulk operations slow. They're powerful but easy to abuse.

    Red flag Burying critical business logic in triggers so behavior becomes 'spooky action at a distance', or ignoring their per-row cost on large bulk writes.

    source: PostgreSQL — CREATE TRIGGER ↗
  • Commonly asked senior concept occasional When should business logic live in stored procedures/functions versus the application?

    Pushing logic into the database keeps it close to the data: fewer round trips, atomic multi-statement work, reuse across apps and languages, and often faster set-based processing.

    The costs: logic is now split across two codebases, it's harder to version/test/debug, you take on DB-vendor lock-in, and it burns scarce DB CPU that's hard to scale horizontally.

    The modern default is to keep business logic in the application and reserve DB routines for data-intensive, set-based, or integrity-critical work where the round-trip or consistency win is real.

    Red flag Either extreme: cramming all business logic into the DB (unmaintainable, unscalable), or chatty app code looping row-by-row over work that should be one set-based statement.

    source: PostgreSQL — CREATE PROCEDURE ↗
  • Commonly asked senior coding occasional How do you refresh a materialized view without blocking reads?

    A plain REFRESH MATERIALIZED VIEW mv takes an exclusive lock and blocks every reader until it finishes. Use REFRESH MATERIALIZED VIEW CONCURRENTLY mv instead: it rebuilds without blocking SELECTs. The trade-offs are that it requires a UNIQUE index on the view (so it can diff rows) and it runs slower.

    Schedule refreshes off-peak (cron / pg_cron) or kick them off right after the upstream load completes. If you need near-real-time freshness, full refresh is the wrong tool — maintain a trigger-updated summary table instead.

    Red flag Running a plain (non-concurrent) refresh on a hot view during business hours and locking out every reader.

    source: PostgreSQL — REFRESH MATERIALIZED VIEW ↗
  • Commonly asked mid debug occasional A table's writes are mysteriously slow and some rows change on their own — how do you debug it?

    Symptoms like 'an UPDATE touched rows I never wrote', unexplained slow writes, or stack depth limit exceeded almost always trace back to a trigger.

    Steps: list the triggers on the table (\d table in psql, or information_schema.triggers), read the trigger function, note BEFORE vs AFTER and which events fire it, and look for a trigger that writes back to the same table (recursion) or a per-row trigger running during a bulk operation. Add RAISE NOTICE to trace, and temporarily ALTER TABLE ... DISABLE TRIGGER to isolate the culprit.

    Red flag Debugging the application for hours when an AFTER trigger is the real cause — or disabling a trigger in production to test and forgetting to re-enable it.

    source: PostgreSQL — Overview of Trigger Behavior ↗
  • Commonly asked mid concept occasional What's the difference between a stored function and a stored procedure in PostgreSQL?

    A function returns a value (scalar, row, or set) and is meant to be *called inside* a SQL statement — SELECT my_fn(x). Because it runs *within* the calling query's transaction, it cannot issue COMMIT/ROLLBACK.

    A procedure (added in Postgres 11) is invoked with CALL my_proc(...), may return nothing, and crucially can manage transactions — it can COMMIT/ROLLBACK mid-body, which is what makes procedures right for batch jobs that process and commit in chunks. So: need a value inside a query -> function; need explicit transaction control for multi-step/batch work -> procedure.

    What a strong answer covers
    • Function: returns a value, called inside SQL (SELECT f(...)), no transaction control.

    • Procedure: called with CALL, can COMMIT/ROLLBACK in its body.

    • Procedures (PG 11+) suit batch jobs that commit in chunks; functions suit computed values.

    • A function runs inside the caller's transaction; it can't open/close one.

    Quick self-check

    You need a routine that processes a million rows in batches, committing every 10,000. In Postgres you should write a…

    Red flag Trying to COMMIT inside a function (errors), or assuming 'function' and 'procedure' are just two names for the same thing.

    source: PostgreSQL docs — CREATE PROCEDURE (transaction control) ↗
  • Commonly asked mid concept occasional When do you reach for a BEFORE, AFTER, or INSTEAD OF trigger?

    BEFORE fires before the row change and can modify or veto it — use it to validate, normalize/derive a column (set updated_at, lowercase an email), or RETURN NULL to skip the operation. The row isn't written yet, so you can't see its final generated id.

    AFTER fires once the change is committed to the row — use it for side effects that depend on the final state: writing an audit/history row, enqueuing a notification, maintaining a summary table. It can see the new id.

    INSTEAD OF applies only to views: it replaces the (impossible) direct write with custom logic, which is how you make a complex/multi-table view updatable.

    What a strong answer covers
    • BEFORE: validate / mutate / cancel the row before it's written (can RETURN NULL to skip).

    • AFTER: react to the committed change — audit logs, notifications, summary maintenance.

    • INSTEAD OF: only on views; substitutes custom DML to make a non-updatable view writable.

    • BEFORE can't see auto-generated values (id/serial); AFTER can.

    Quick self-check

    You want to reject or normalize a value before it's stored. Which trigger timing fits?

    Red flag Using an AFTER trigger to try to alter the row (too late) or a BEFORE trigger to read the generated primary key (not assigned yet).

    source: PostgreSQL docs — Overview of Trigger Behavior (BEFORE/AFTER/INSTEAD OF) ↗
  • ★ must-know Commonly asked mid concept common What is SQL injection, and how do stored procedures and parameterized queries relate to preventing it?

    SQL injection happens when user input is concatenated into a query string, so input like ' OR 1=1 -- becomes executable SQL. The real defense is parameterized queries / prepared statements: the SQL text and the data travel separately, so input is always treated as a value, never as code.

    Stored procedures help *only if* they use parameters internally — a procedure that builds and EXECUTEs a string from its arguments (dynamic SQL) is just as injectable. So 'use stored procedures' is not itself the fix; 'never interpolate untrusted input into SQL' is. ORMs parameterize by default, which is a big part of why they're safer out of the box.

    What a strong answer covers
    • Injection = untrusted input concatenated into SQL text and executed as code.

    • Fix = parameterized queries / prepared statements: code and data sent separately.

    • Stored procedures are safe only when parameterized; dynamic SQL inside them is still vulnerable.

    • ORMs parameterize by default; the danger returns the moment you build raw SQL by string concat.

    Quick self-check

    Which most reliably prevents SQL injection?

    Red flag Believing 'we use stored procedures, so we're safe from injection' — a procedure that concatenates input into dynamic SQL is exactly as vulnerable as inline string-building.

    source: OWASP — SQL Injection Prevention Cheat Sheet ↗
  • Commonly asked senior concept occasional When is a materialized view the wrong tool, and what would you use instead?

    A materialized view recomputes its *entire* result on REFRESH — there's no built-in incremental update in core Postgres. So it's the wrong tool when you need near-real-time freshness or the base data is huge and changes constantly: each full refresh is expensive and the data is stale between refreshes.

    Better alternatives by need: for freshness, maintain a summary/rollup table updated incrementally by triggers or in the write path (comment_count); for ad-hoc speed without staleness, just add the right indexes to the plain view's query; for genuinely incremental materialization, reach for an external tool or an extension (e.g. continuous aggregates in TimescaleDB). Materialized views fit *expensive, periodically-refreshed reporting* — dashboards that tolerate minutes/hours of lag.

    What a strong answer covers
    • Core Postgres materialized views refresh in full — no incremental maintenance.

    • Wrong for near-real-time needs or huge, constantly-changing base data.

    • Freshness alternative: an incrementally-maintained summary table (triggers / write-path updates).

    • Speed-without-staleness alternative: index the plain view's underlying query.

    • Right fit: expensive, periodically-refreshed reporting that tolerates lag.

    Red flag Using a materialized view for data that must be fresh, then refreshing it constantly and paying a full recompute each time instead of maintaining an incremental summary table.

    source: PostgreSQL docs — Materialized Views (refresh is full recompute) ↗
  • Commonly asked senior debug occasional How do you prevent a row-level trigger from recursively firing on its own writes?

    If an AFTER UPDATE trigger on a table issues another UPDATE on the same table, that write fires the trigger again — risking infinite recursion and a stack depth limit exceeded error.

    Guards: (1) make the trigger's write a no-op when nothing changed — in a BEFORE trigger, IF NEW IS DISTINCT FROM OLD THEN … ELSE RETURN NULL stops the cascade once values stabilize; (2) only re-write when a condition flips, so the second pass changes nothing and the chain ends; (3) use pg_trigger_depth() to act only at depth 1; (4) restructure so the trigger updates a *different* table. The cleanest fix is usually a BEFORE trigger that mutates NEW in place (no second UPDATE needed at all) rather than issuing a recursive write.

    What a strong answer covers
    • A trigger that writes back to its own table re-fires itself -> potential infinite recursion.

    • Symptom: stack depth limit exceeded.

    • Guard with a 'did anything actually change?' check (NEW IS DISTINCT FROM OLD).

    • Or gate on pg_trigger_depth(), or update a different table.

    • Best: a BEFORE trigger that edits NEW directly — no recursive UPDATE at all.

    Red flag Writing an AFTER trigger that UPDATEs the same table unconditionally, causing it to re-fire forever and hit the stack-depth limit.

    source: PostgreSQL docs — Trigger Procedures / recursion behavior ↗