Data Analytics Engineering
A study guide for Senior Data Analytics Engineer / Data Engineer interviews: SQL, dbt, data modeling, warehouses, and orchestration. Tailored to modern AI / GPU / inference platform companies.
If you only have an hour
Open 00-START-HERE first. Then work through 15-interview-questions with drill mode on. Reread 16-day-of on the morning of the interview.
Drill: 30 Questions
Show/hide answers · progress saved
Drill: SQL Problems
Window functions · gaps and islands · dedup
SQL Deep Dive
CTEs · window functions · query plans
Section A · Orient
00
Start Here
Master index, study schedule, key reframe. Read first.
01
The Role Decoded
What "Senior Data Analytics Engineer" actually means, especially at AI infra companies. The analyst / analytics engineer / data engineer triangle.
02
Positioning From Scratch
Mindset — how to interview honestly when your domain stack doesn't perfectly match theirs.
Section B · Core Technical
03
The Modern Data Stack
Warehouse + dbt + orchestrator + BI — how the pieces fit, and what each layer is good at.
04
SQL Deep Dive
CTEs, window functions, query optimization, execution plans, anti-patterns. The single most-tested skill in this loop.
05
dbt Deep Dive
Models, sources, tests, snapshots, macros, packages, exposures. The defining tool of analytics engineering.
06
Data Modeling
Kimball star schema, slowly-changing dimensions (SCD types 1/2/3), one-big-table, normalization tradeoffs.
07
Data Pipelines
ETL vs ELT, batch vs streaming, idempotency, backfills, late-arriving data, schema evolution.
08
Data Quality & Testing
dbt tests, Great Expectations, freshness checks, anomaly detection, the contract pattern.
09
Warehouses & Lakehouses
Snowflake, BigQuery, Redshift, Databricks. Iceberg / Delta / Hudi. What each is good at; how they bill you.
Section C · Coding
10
SQL Pattern Menu
Pattern recognition: gaps and islands, dedup, running totals, sessionization, latest-per-group. The 12 patterns that cover roughly 80% of SQL interview questions.
11
SQL Problems Worked Out
11 SQL problems, each worked through with multiple approaches. Drill mode hides solutions; a progress tracker records what you have completed.
12
Python for Data
pandas, polars, common transformations, when to leave SQL and reach for Python.
Section D · Production / Cloud
13
Orchestration
Airflow, Dagster, Prefect — DAGs, dependencies, retries, idempotency, what to choose when.
14
Data Observability
Lineage, freshness, volume, schema, distribution. Tools: Monte Carlo, Elementary, OpenLineage, DataHub.
Section E · Domain & Execution
17
AI Compute Platform Data
What data looks like at an AI / GPU / inference platform: GPU telemetry, inference logs, billing, multi-tenant analytics, unit economics.
15
Practice Interview Questions
~30 Q&A across 8 sections. Drill mode hides answers by default. Per-question practice tracker.
16
Day-Of Tactics
Structural moves, traps to watch, recovery patterns, questions to ask them, closing statement.
Study Paths
If you have 7+ days
- Day 1: 01, 02 (orient) → 03 (modern data stack)
- Day 2: 04 (SQL deep dive) + 10 (SQL patterns)
- Day 3: 11 (SQL problems on a timer)
- Day 4: 05 (dbt) + 06 (modeling)
- Day 5: 07 (pipelines) + 08 (quality) + 09 (warehouses)
- Day 6: 12 (Python) + 13 (orchestration) + 14 (observability) + 17 (AI compute domain)
- Day 7: Drill 15. Read 16. Sleep.
If you have 2-3 days
01, 02, 04, 05, 06, 11 (drill SQL problems), 15 (drill Q&A), 16. Skim the rest.
If you have < 24 hours
01, 02, 11 (drill the SQL problems, which are the most likely to come up), 15 (drill all questions), 16. For 04, 05, and 06, skim the headings only.