Section E · Execution · Read morning-of

Day-Of Tactics

Read the morning of each round.

The night before

  • Reread 00-START-HERE, 11-sql-problems, and 15-interview-questions highlights.
  • Sleep. Caffeine recovers from one thing; sleep does not.
  • Pick a quiet room with a good camera, good light, good audio. Test it.
  • Pull up the JD in a tab; have it visible behind your camera (printed sticky if you need it).
  • Have water. Have a notepad and pen. Phone face-down, off the desk.
  • If it's a shared-screen SQL round: open your editor of choice (or whatever they tell you), warm up the keyboard with one window function.

30 minutes before

  • Re-read your one-paragraph self-summary out loud.
  • Solve one SQL window-function problem on paper. Just to warm up the muscles.
  • Stand up, walk for 10 minutes. Don't keep cramming.
  • Three minutes before, sit, breathe slow for 60 seconds. Smile — it changes your voice.

SQL live coding — the framework that wins rounds

Apply this every SQL coding round:

  1. Restate the schema. "So we have a users table at one row per user, and an orders table at one row per order — is that right?"
  2. Clarify grain & constraints. "What's the output grain? One row per user per month?" "Are timestamps UTC?" "Can a user be deleted?" "Are there duplicates?"
  3. Walk through a tiny example. "For user A with orders on Jan 1 and Jan 15, the answer would be 2 — right?"
  4. State the approach. "I'll filter active users, aggregate orders by month, then join."
  5. Identify the pattern. "This is a latest-row-per-group — ROW_NUMBER pattern." "This is gaps-and-islands." The pattern name is half the answer.
  6. Code in CTEs. One concept per CTE. Name them well: active_users, monthly_orders. Resist the urge to write one giant nested query.
  7. Talk while you type. Silence reads as stuck. Narration reads as competent.
  8. Test against the example. Walk your code line by line against the input.
  9. Mention edge cases. NULLs, ties, time-zone confusion, empty input, single-row groups. Even if you don't fix all of them, calling them out signals seniority.
  10. Discuss performance. "At scale, I'd add a partition_date filter here. And I'd worry about the grain after this join — let me verify."
The single line that wins rooms

"What's the grain of this CTE?" Say it out loud at every major step. It's the senior data engineer's tic and interviewers love hearing it.

SQL gotchas to flag unprompted

Senior signal: you mention these before the interviewer asks. Pick the ones relevant to the problem:

  • NULL handling"I should COALESCE this — otherwise the NOT IN silently drops rows."
  • Join fanout"If I join here without aggregating first, the grain explodes. Let me aggregate in a CTE."
  • Time zones"Are these timestamps UTC? Because DATE_TRUNC on a TZ-aware timestamp behaves differently from a naive one."
  • Ties in window functions"I'll add a secondary order-by column so ties are deterministic."
  • Window frame defaults"I'll specify the ROWS BETWEEN explicitly because the default RANGE frame can surprise."
  • Division by zero"I'll wrap the denominator in NULLIF in case it's zero."
  • SELECT *"In production I'd pull only the columns I need."

Design / system rounds — the framework

For any "design a data system for X" question:

  1. Clarify the goal. What business questions does this answer? What's the latency requirement? Who consumes it?
  2. Identify the entities — these become dims.
  3. Identify the events — these become facts. State the grain of each out loud.
  4. Sketch the data flow. Source → load → transform → serve. Mention the tools you'd use at each layer.
  5. Pick the materialization strategy. Batch frequency. Incremental vs full rebuild. Aggregation layers.
  6. Mention testing & quality upfront, not as an afterthought. "I'd add unique/not_null tests on this dim, plus a singular test for the business rule."
  7. Mention monitoring. Source freshness, model run-time, anomaly detection.
  8. Talk about evolution. "As volume scales, the daily aggregation table will get slow — I'd move to incremental, partitioned by date_key."
  9. Discuss tradeoffs explicitly. "I chose batch over streaming because the dashboards consuming this don't need sub-minute freshness. If they did, I'd reach for [stream tool]."

When you don't know something

Never bluff

Data engineers detect bluffing as fast as AI engineers do — the field is tight and the failure modes of fake experience are predictable.

Patterns that work:

"I haven't used Iceberg in production. My closest reference point is Delta Lake, where I know the operational story — ACID semantics over object storage, time travel via a metadata layer. I'd expect Iceberg's shape to be similar; the gotcha I'd want to verify is how partition evolution works in practice. Want me to reason about it or shift to a related topic?"

"I haven't worked with that specific BI tool. From what I've read, it's [your sentence]. The thing that matters most for the data team — semantic layer, embedded analytics, query-fingerprinting — is [your understanding]. How does the team use it day to day?"

The reframe: "I don't know""Here's the boundary of what I know and how I'd extend it."

When they push back on your answer

They might be testing whether you'll fold. Or they might be right. Hold your view if you have a reason; update if you don't.

"That's fair pushback. Where I'd hold ground is X — because [reason — usually a grain or correctness concern]. Where I think you've got the better of it is Y — I hadn't weighed [factor]."

That sentence shows judgment and humility. Both win points.

Specific traps to watch for

1"What's the simplest way to dedup this table?"

Don't reflexively say SELECT DISTINCT. That's the band-aid answer.

"DISTINCT removes exact duplicates, but if the duplicates differ in any column, it won't help. The right question is: what's the candidate key — what should be unique? Once we know that, ROW_NUMBER over the key with a tiebreaker is the canonical pattern. Then upstream, I'd add a uniqueness test on the key so this doesn't recur."

2"This query is slow — can you speed it up?"

Don't immediately add indexes / restructure CTEs. Ask what the query plan says.

"Before I refactor, can you share the query plan? I want to see where the row blow-up is. Common causes — no partition filter, an unexpected fanout, data skew — each has a different fix. If it's scanning the whole table I'd add a date filter; if it's a fanout I'd aggregate before joining; if it's spilling I'd reduce columns or push down filters. Without the plan I'm guessing."

3"Should this be streaming?"

Default to no, unless there's a clear latency requirement.

"Streaming costs in complexity — exactly-once is hard, debugging is harder, observability is harder. So I ask: what's the actual latency requirement? If consumers are fine with hourly batch, hourly batch wins. If they need sub-minute, streaming becomes worth the cost. Many 'real-time' requirements turn into 'this dashboard refreshes every 5 minutes' once you ask, and 5-minute micro-batch is plenty."

4"Tell me how you'd improve our [thing]"

You don't know their thing yet. Don't fake.

"Without knowing the current architecture I'd be guessing. Can I ask a few clarifying questions and then take a stab? [Listen, then sketch]. What I'd want to understand first: what's the failure mode you're trying to fix? Speed? Trust? Cost? The improvement strategy is different for each."

5"Walk me through a project end-to-end"

They want depth. Pick one project — whatever you can speak to honestly and in detail. Use STAR structure and don't sprint:

  • Situation: 30 sec
  • Task: 30 sec
  • Action: 90 sec — the meat, with specific decisions and tradeoffs
  • Result: 30 sec — what shipped and what you learned

Total ~3 minutes. They'll dig in with follow-ups.

Questions to ask them (have these ready)

Have 5-6 ready, ask the 2-3 best based on what came up:

  1. "What's the state of the warehouse today — clean and modeled, or wedge of legacy SQL?"
  2. "How does the team handle data quality alerts when they fire? Is there an on-call rotation?"
  3. "How are metric definitions managed — semantic layer, dbt models, convention?"
  4. "What's the one model in production right now that everyone's afraid to touch, and why?"
  5. "What's the team's biggest analytics question that's still hard to answer with the current setup?"
  6. "How does the data team collaborate with engineering on schema changes? Contracts or convention?"
  7. "What does the on-call story look like for the warehouse — runaway queries, cost spikes, late jobs?"
  8. "What's a recent project the data team got wrong, and what changed because of it?" (culture signal)

Don't ask about salary or remote in technical rounds. Save for recruiter / hiring manager.

Your closing answer

When asked "anything else you want to add?" — have one more substantive thing:

Honest, specific, useful

"One thing I want to flag: I know parts of the JD describe more years of [specific stack — Iceberg, streaming, whatever your gap is] than I have. What I bring instead is strong SQL fundamentals, dbt fluency, modeling discipline, and the instinct to ask 'what's the grain' before writing anything. If the team's biggest needs are 'someone who can ship analytics models well and own quality,' I'd argue I'm closer than my CV suggests. If you specifically need deep [gap area] from day one, I'd be honest that I'd be ramping. I'd want to know which axis matters more for this specific seat."

This acknowledges the gap, reframes around your strength, asks a smart calibration question.

State management — the meta-skill

  • Between rounds: stand up, water, breathe, reset face. Do not replay the round.
  • After all rounds: write down 5 questions they asked and 3 things you said well. Input for the next stage.
  • Don't try to remember every question — pick highlights.

If something goes badly

It will. One SQL query will compile wrong. One blank moment.

  • Name it briefly: "Let me come back to that — I want to give you a better answer than I started." Then pivot.
  • Don't dwell. The interviewer's memory is dominated by recency. Strong answer in question 9 erases mediocre answer in question 4.
  • Don't apologize twice. Once is composure; twice is anxiety leaking.

After the interview

  • Send a short thank-you email within 24 hours. Three sentences. Reference one specific thing they said. Don't restate your case.
  • Note things you'd answer differently. Adjust this folder.
  • If you advance: schedule the next round, drill again, iterate.

The one mantra

"Grain first. Tests always. Talk through tradeoffs."

Read it ten times before you start the call. Then go.