Start Here
Interview prep for senior/staff Data Scientist loops where the work is shipping production models end-to-end — calibrated against two specific roles, one in fraud/identity and one in multimodal sensor AI.
The two roles, in plain English
This guide is calibrated against two open Data Scientist roles that live on the same spectrum but at different points:
- SentiLink — Staff Data Scientist, Full Stack ($220k–$260k + equity, US remote, 6+ yrs with PhD or 8+ yrs with Masters). Identity-verification & fraud platform — real-time APIs verifying hundreds of millions of identities. Build fraud models end-to-end: data acquisition → featurization → labeling → training → experimentation → production → monitoring. Python 3, PostgreSQL, AWS (EC2, S3, RDS, Redshift). The JD explicitly says "unusual insights drive our competitive advantage rather than optimization of new machine learning methodologies" — meaning: deep domain knowledge and inventive features beat sklearn novelty.
- Archetype AI — Data Scientist (San Mateo, on-site, 4+ yrs). Multimodal sensor-fusion platform — "Newton," a real-time multimodal LLM for physical-world AI. The role sits in GTM / Solutions Engineering: review customer assets (video, sensor streams), prepare time-series datasets, design prompts and n-shot examples, configure lens parameters, support Solutions Architects through customer evaluations. Python, Jupyter, signal processing, Encord-style labeling.
The SentiLink role is "staff IC who owns a model domain end-to-end." The Archetype role is "mid-level applied scientist who turns customer data into a working POC." Different seniorities, both full-stack applied: both expect you to go from raw data to a shipped artifact without a separate ML engineer to hand off to.
Neither role rewards ML novelty for its own sake. SentiLink wants someone whose "deep domain understanding drives development"; Archetype wants someone who runs "iterative preprocessing cycles with evaluation and refinement". Both want someone who can be dropped on a messy data problem and produce a defensible, testable artifact without waiting for direction.
What the rounds typically test
Loops for full-stack / applied DS roles usually include:
- ML fundamentals — supervised models for tabular data, evaluation metrics (especially under class imbalance), calibration, model selection. Expect "given these tradeoffs, which model?" framing, not whiteboarding gradient descent.
- Feature engineering deep-dive — almost guaranteed at SentiLink. Often a take-home or live session: "here's a fraud-ish dataset, what features do you build and why?" Domain-flavored at every step.
- Production / code quality — write production-ready Python with tests. SentiLink explicitly says "production code that can be relied on for real-time decision making." Archetype implies it via "executable iterative preprocessing cycles."
- Applied design — "Customer brings X video stream / fraud signal — walk us through how you'd build a model / prompt / pipeline." End-to-end ownership signal.
- Coding — Python data manipulation, light algorithms, occasionally a SQL screen.
- Behavioral — "tell me about a project you owned end-to-end," "describe a time domain knowledge changed your modeling approach."
This guide covers all six.
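The class-imbalance point in the ML-fundamentals bullet is worth internalizing before any drill. A minimal sketch with hypothetical counts (1,000 transactions, 10 fraudulent) showing why accuracy is the wrong headline metric for fraud:

```python
# Sketch: why accuracy misleads under class imbalance (hypothetical counts).
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

# "Always legit" classifier: catches zero fraud, still 99% accurate.
p, r, a = precision_recall_accuracy(tp=0, fp=0, fn=10, tn=990)
print(f"do-nothing model: precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")

# A model catching 8/10 frauds at the cost of 20 false positives scores
# lower on accuracy -- but it's the one you'd actually ship.
p, r, a = precision_recall_accuracy(tp=8, fp=20, fn=2, tn=970)
print(f"useful model:     precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

In the interview, lead with precision/recall (or PR-AUC) and tie the threshold to the customer's false-positive vs false-negative costs.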
The folder, in reading order
The file numbering follows reading order. Five sections:
Section A — Orient (read first)
| File | Why |
|---|---|
| 01-the-roles | Decode what each role actually involves — SentiLink and Archetype side by side |
| 02-positioning-from-scratch | Mindset before content — leveraging adjacent experience without overclaiming |
Section B — Core DS concepts (the technical core)
| File | Why |
|---|---|
| 03-ml-fundamentals | Supervised models for tabular data; staff-level model selection and evaluation |
| 04-feature-engineering | "Inventive feature engineering" demystified — entity features, leakage, drift, domain insight |
| 05-fraud-and-imbalanced-data | Class imbalance, cost-sensitive learning, threshold tuning, delayed labels. SentiLink-flavored. |
| 06-prompt-engineering-applied | N-shot, prompt templates, lens configuration, prompt eval. Archetype-flavored. |
| 07-time-series-and-signals | Signal processing, time-series features, sensor data prep, video alignment |
| 08-data-pipelines-applied | Data acquisition, labeling workflows, iterative preprocessing, feature pipelines |
| 09-production-ml | Writing production code, tests, real-time inference, monitoring, retraining cadence |
Section C — Coding (DSA)
| File | Why |
|---|---|
| 10-coding-fundamentals | Python patterns for full-stack DS — testability, vectorization, generators |
| 11-coding-problems | Drillable Python problems with feature-engineering and evaluation flavor |
Section D — Production / Cloud
| File | Why |
|---|---|
| 12-aws-data-stack | EC2, S3, RDS, Redshift, PostgreSQL — the SentiLink stack and how to discuss it |
| 13-mlops-applied | CI/CD for models, shadow deploys, canaries, drift monitoring, retraining |
Section E — Reference & Execution
| File | Why |
|---|---|
| 14-domain-context | Fraud & identity (SentiLink) and multimodal sensor AI (Archetype) vocabulary |
| 15-interview-questions | ~30 Q&A drill set with hide-show answers |
| 16-day-of | Structural moves, recovery patterns, closing statement. Reread morning of. |
Study schedule
If you have 7+ days
- Day 1: 01, 02 (orient) → 03 (ML fundamentals)
- Day 2: 04 (features), 05 (fraud/imbalanced — for SentiLink)
- Day 3: 06 (prompts — for Archetype), 07 (time-series & signals)
- Day 4: 08 (pipelines), 09 (production ML)
- Day 5: 10, 11 (coding drills on a timer)
- Day 6: 12, 13 (AWS, MLOps), 14 (domain context)
- Day 7: Drill 15. Reread 16. Sleep.
If you have 2–3 days
01, 02, 03 (ML), 04 (features), role-relevant 05 or 06, 09 (production ML), 11 (drill 4–5 problems), 15 (drill), 16. Skim everything else.
If you have < 24 hours
01, 02, 04 (features — central to both roles), 05 or 06 (whichever applies), 09 (production ML), 15 (drill all questions), 16. Skim 03, 07, 13 headings only.
The single most important reframe
Neither of these companies is hiring a researcher chasing SOTA. They're hiring someone who can extract competitive advantage from data and domain understanding and ship it as production code. SentiLink puts it plainly: "unusual insights drive our competitive advantage rather than optimization of new machine learning methodologies."
What this means tactically:
- When you get a modeling prompt, your first move is domain questions, not algorithm questions. "What does fraud actually look like in this data? Who labels it? How fresh are the labels? What's the cost of a false positive vs false negative for the customer?" That's how you signal you'll thrive here.
- When you describe past work, lead with features and data decisions, not models. "I noticed that X co-occurred with Y in 90% of fraud cases — built a feature for that — moved AUC from 0.82 to 0.88." That's the SentiLink/Archetype version of impact.
- When you discuss production, treat it as table stakes. Tests, monitoring, drift detection, retraining cadence — these are part of the deliverable, not the SRE's problem.
- When you get a vague open-ended prompt ("here's a sensor stream, customer wants to detect X"), don't reach for fancy models. Reach for iteration loops — what's the smallest preprocessing + model + eval you can ship today, and how would you improve it tomorrow?
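What "smallest preprocessing + model + eval you can ship today" looks like in practice, as a sketch: a threshold baseline on a hypothetical sensor stream with the eval metric built in from iteration one. The window size, the 2-sigma rule, and the labeled indices are all illustrative, not from either JD.

```python
# Smallest shippable loop: trailing-window anomaly flags + one recall metric.
from statistics import mean, stdev

def detect_events(stream, window=5, k=2.0):
    """Flag points more than k sigma above a trailing-window mean."""
    flags = []
    for i, x in enumerate(stream):
        past = stream[max(0, i - window):i]
        if len(past) >= 2 and x > mean(past) + k * stdev(past):
            flags.append(i)
    return flags

def recall_at(flags, truth):
    """Fraction of true events caught -- the one metric for iteration one."""
    return len(set(flags) & set(truth)) / len(truth) if truth else 1.0

stream = [1.0, 1.1, 0.9, 1.0, 1.1, 5.0, 1.0, 0.9, 6.2, 1.1]
truth = [5, 8]  # indices a human labeled as real events
flags = detect_events(stream)
print(f"flagged={flags} recall={recall_at(flags, truth):.2f}")
```

That's the whole first iteration: a baseline, a label set, a number to beat. Tomorrow's improvement (learned model, better features) now has something to be measured against.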
What winning looks like
You don't need to be the deepest researcher in the loop. You need to be the candidate who:
- Picks the right model for the job in under 30 seconds, and can articulate why (calibration vs ranking, latency budget, label availability, drift profile).
- Builds features that encode domain insight — and can talk about how they discovered the insight, not just the feature.
- Handles class imbalance / delayed labels / drifting distributions without panicking, because they've thought about these failure modes before.
- Writes clean Python with tests, types, and small functions — like an engineer would, not like a notebook would.
- Reasons about production from day one — latency, monitoring, retraining triggers, on-call.
- Asks the right domain questions early, and uses the answers to shape the modeling approach.
If you can do those six things on demand, you're in.
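The calibration-vs-ranking distinction in the first bullet deserves a concrete picture. A toy sketch (hypothetical scores): any monotone transform of a model's outputs preserves ranking, and therefore AUC, while destroying calibration — so which property matters depends on whether downstream consumers threshold the score or consume it as a probability.

```python
# Sketch: ranking survives a monotone transform; calibration does not.
def rank_order(scores):
    """Indices sorted by score -- the only thing a ranking metric sees."""
    return sorted(range(len(scores)), key=lambda i: scores[i])

probs = [0.1, 0.4, 0.8, 0.95]        # hypothetical calibrated outputs
squashed = [p ** 3 for p in probs]   # monotone transform: order intact

print(rank_order(probs) == rank_order(squashed))  # True: identical AUC
print(squashed)  # but 0.95 -> ~0.86: no longer a trustworthy probability
```

If the fraud score feeds a dollar-weighted expected-loss calculation, calibration is the requirement; if it only feeds a top-k review queue, ranking is.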