Coding Fundamentals
Arrays, Hash Tables, Strings, Sorting, DFS — the patterns and decision rules. See 11-coding-problems for the worked solutions to specific problems.
Use Python for AI-engineer interviews unless they specify otherwise. It's expected for AI/ML roles, has the cleanest data-structure syntax, and lets you focus on logic instead of language ceremony.
How to approach a coding problem in an interview
A consistent framework matters more than language fluency. For every problem:
- Restate the problem. "Just to make sure I understand — you want me to..."
- Ask about constraints. Input size? Memory limits? Are inputs sorted? Can there be duplicates? Empty input?
- Walk through an example. Pick a small one. Trace it on paper / shared doc.
- Talk through the brute-force first. Even if you'll discard it. Shows you can think; gives you a fallback.
- Identify the pattern. Hash map? Two pointers? Sliding window? DFS? (See pattern menu below.)
- State complexity. "Brute force is O(n²). I think we can do O(n) with a hash map."
- Code it. Talk while you type. Don't go silent.
- Test with examples. Walk your code against the example. Find the bug before they do.
- Discuss tradeoffs. Time, space, edge cases, what you'd add for production.
That sequence applied consistently looks senior. Skipping straight to coding looks junior even when the code is right.
Arrays
The everyday data structure. Python list. Properties:
- Random access: O(1) by index.
- Append: amortized O(1).
- Insert/delete in middle: O(n) — full shift.
- Search: O(n) unsorted, O(log n) sorted (binary search).
- In-place vs copy: "in place" = don't allocate a new array.
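The O(log n) sorted search above is binary search. A minimal sketch (function name mine; the standard library's `bisect` module covers the common cases):

```python
def binary_search(nums, target):
    """Return the index of target in sorted nums, or -1 if absent."""
    lo, hi = 0, len(nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            lo = mid + 1        # target is in the right half
        else:
            hi = mid - 1        # target is in the left half
    return -1
```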
Two pointers (sorted array)
Use when you have a sorted array and need to find pairs/triplets meeting a property.
```python
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left < right:
        s = nums[left] + nums[right]
        if s == target:
            return [left, right]
        if s < target:
            left += 1
        else:
            right -= 1
    return []
```
Time O(n), space O(1). The "if smaller, move left up; if bigger, move right down" insight only works because the array is sorted.
Sliding window
Use when looking for a contiguous subarray/substring with some property.
```python
def longest_substring_at_most_k_distinct(s, k):
    counts = {}
    left = 0
    best = 0
    for right in range(len(s)):
        counts[s[right]] = counts.get(s[right], 0) + 1
        while len(counts) > k:
            counts[s[left]] -= 1
            if counts[s[left]] == 0:
                del counts[s[left]]
            left += 1
        best = max(best, right - left + 1)
    return best
```
Time O(n), space O(k). The window grows by moving right, shrinks by moving left.
Prefix sum
Use when asked about subarray sums, range queries.
```python
def prefix_sums(nums):
    prefix = [0]
    for n in nums:
        prefix.append(prefix[-1] + n)
    return prefix

# Sum of nums[i..j] (inclusive) = prefix[j+1] - prefix[i]
```
Time O(n) preprocess, O(1) per query. Combined with a hash map, this is the foundation for Subarray Sum Equals K.
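A sketch of that combination (function name mine; the full worked solution is in 11-coding-problems):

```python
def subarray_sum_equals_k(nums, k):
    """Count subarrays summing to k: running prefix sum + hash map of seen prefixes."""
    seen = {0: 1}       # prefix sum -> how many times it has occurred so far
    prefix = 0
    count = 0
    for n in nums:
        prefix += n
        # every earlier prefix equal to (prefix - k) closes a subarray summing to k here
        count += seen.get(prefix - k, 0)
        seen[prefix] = seen.get(prefix, 0) + 1
    return count
```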
Kadane's max subarray
```python
def max_subarray(nums):
    best = current = nums[0]
    for n in nums[1:]:
        current = max(n, current + n)
        best = max(best, current)
    return best
```
Time O(n), space O(1).
Hash Tables (dicts and sets)
The most useful data structure in interviews. Python dict and set.
- Insert / lookup / delete: average O(1), worst O(n) (collisions; rarely matters).
- Use a `dict` for key→value mapping.
- Use a `set` for "have I seen this?".
- `collections.Counter`: dict subclass for counting; `collections.defaultdict(int/list/set)` for grouping.
Signals that hash tables solve it
- "Find pairs" → look up complement.
- "Find duplicates" → set membership.
- "Group by" → dict from group → list.
- "Frequency" → Counter.
- "Cache results" → memoize.
Two Sum — the canonical hash-map problem
```python
def two_sum(nums, target):
    seen = {}  # value → index
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []
```
Time O(n), space O(n). This pattern — "look for the complement in a hash map as you iterate" — appears in dozens of variations.
Counter pattern
```python
from collections import Counter

counts = Counter("anagram")  # Counter({'a': 3, 'n': 1, 'g': 1, 'r': 1, 'm': 1})
counts1 == counts2           # True iff same multiset of chars (anagram check)
```
defaultdict pattern (grouping)
```python
from collections import defaultdict

groups = defaultdict(list)
for word in words:
    key = ''.join(sorted(word))
    groups[key].append(word)
```
Group anagrams in one pass. Default value of empty list lets you .append without checking existence.
String Manipulation
In Python, strings are immutable sequences. Common moves:
- Indexing & slicing: `s[0]`, `s[-1]`, `s[i:j]`. Slicing is O(j-i).
- Concatenation: `''.join(parts)` is O(total length); `s += t` in a loop is O(n²) — never do this.
- Splitting: `s.split()`, `s.split(',')`.
- Char ↔ ASCII: `ord('a')` → 97; `chr(97)` → 'a'.
- Lower/upper: `s.lower()`, `s.upper()`.
Anagram / palindrome family
- Valid anagram → compare Counters (or sorted strings).
- Palindrome check → two pointers from both ends.
- Longest palindrome substring → expand-around-center, O(n²).
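Expand-around-center, sketched (function name mine; each index seeds both an odd-length and an even-length center, 2n-1 centers total):

```python
def longest_palindrome_substring(s):
    """O(n^2) time, O(1) extra space: grow a palindrome outward from each center."""
    best = ""
    for center in range(len(s)):
        # (center, center) is an odd-length seed; (center, center+1) an even-length one
        for left, right in ((center, center), (center, center + 1)):
            while left >= 0 and right < len(s) and s[left] == s[right]:
                left -= 1
                right += 1
            # the loop overshoots by one on each side, so the palindrome is s[left+1:right]
            if right - left - 1 > len(best):
                best = s[left + 1:right]
    return best
```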
Common string patterns
- Frequency map for anagrams, character counts.
- Sliding window for substring problems with constraints.
- Two pointers for palindromes, reversal in place.
- Vertical scan for prefix problems (Longest Common Prefix).
- Stack for matching brackets, undo operations.
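The bracket-matching stack, sketched (function name mine):

```python
def valid_brackets(s):
    """Push opening brackets; on a closer, the top of the stack must be its partner."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover opens mean unmatched brackets
```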
Sorting
You almost never implement a sort in an interview — you call sorted() or list.sort(). But you must know:
- Time: Python uses Timsort, O(n log n) average and worst, O(n) on already-sorted.
- Space: O(n) extra for `sorted`; O(1) extra-ish for in-place `list.sort()`.
- Stable: yes, Timsort is stable.
- Custom keys: `sorted(items, key=lambda x: (x.priority, x.created_at))`. Tuple keys give you multi-level sort.
When you'd choose a different sort
| Need | Approach |
|---|---|
| Top K (don't need full sort) | Heap, O(n log k) |
| Counting small-range integers | Counting sort, O(n + k) |
| Almost-sorted data | Insertion sort, near O(n) on nearly sorted input — Timsort exploits this |
| Streaming / external | External merge sort |
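The Top-K row in practice, a sketch with a size-k min-heap (function name mine; `heapq.nlargest(k, nums)` does the same in one call):

```python
import heapq

def top_k(nums, k):
    """Keep a min-heap of size k; O(n log k) instead of O(n log n) for a full sort."""
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)  # evict the smallest, so the k largest survive
    return sorted(heap, reverse=True)
```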
Sort-then-X is often the unlock
- Merge intervals: sort by start, then sweep.
- Find duplicates: sort, then look for adjacent duplicates.
- Group anagrams: sort each word's chars, group by key.
If you're stuck, ask: "would sorting first make this easier?" — surprisingly often, yes.
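Merge intervals as a sketch of sort-then-sweep (function name mine; assumes intervals are `[start, end]` lists):

```python
def merge_intervals(intervals):
    """Sort by start, then sweep, folding each overlap into the last kept interval."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)  # overlaps: extend in place
        else:
            merged.append([start, end])
    return merged
```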
Implement merge sort (if asked)
```python
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
```
Time O(n log n), space O(n). Stable. Comfortable to derive on the fly.
Quicksort is also worth knowing — average O(n log n), worst O(n²) on already-sorted unless randomized; in-place; not stable.
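A quicksort sketch, using extra lists for readability rather than in-place partitioning (names mine; this version trades the O(1) space of in-place Lomuto/Hoare partitioning for clarity):

```python
import random

def quicksort(a):
    """Randomized quicksort: average O(n log n); random pivot avoids the sorted-input worst case."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```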
Depth-First Search (DFS)
DFS is for trees and graphs. The shape:
```python
def dfs(node, visited):
    if node is None or node in visited:
        return
    visited.add(node)
    process(node)
    for neighbor in neighbors(node):
        dfs(neighbor, visited)
```
Recursive vs iterative
Recursive is shorter; iterative (with explicit stack) avoids Python's recursion limit and gives you control.
```python
def dfs_iter(start):
    stack = [start]
    visited = set()
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        process(node)
        for neighbor in neighbors(node):
            if neighbor not in visited:
                stack.append(neighbor)
```
Trees: pre-order / in-order / post-order
```python
def preorder(node):   # root, left, right
    if not node: return
    visit(node)
    preorder(node.left)
    preorder(node.right)

def inorder(node):    # left, root, right — yields sorted order in a BST
    if not node: return
    inorder(node.left)
    visit(node)
    inorder(node.right)

def postorder(node):  # left, right, root — useful for deletion, height
    if not node: return
    postorder(node.left)
    postorder(node.right)
    visit(node)
```
Common DFS problems
- Number of Islands (count connected components in grid)
- Path Sum (does any root→leaf path equal target?)
- Max Depth of Tree
- Validate BST
- Course Schedule (cycle detection — DFS with three colors)
- All Permutations / Combinations (backtracking is DFS over a state tree)
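The three-color cycle detection mentioned for Course Schedule, sketched on an adjacency-list dict (names mine):

```python
WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / fully explored

def has_cycle(graph):
    """Directed-graph cycle detection: reaching a GRAY node means a back edge."""
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return True  # back edge into the current path: cycle
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)
```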
Backtracking — the DFS variant
```python
def permutations(nums):
    result = []
    def backtrack(path, remaining):
        if not remaining:
            result.append(path[:])
            return
        for i, n in enumerate(remaining):
            path.append(n)
            backtrack(path, remaining[:i] + remaining[i+1:])
            path.pop()  # undo
    backtrack([], nums)
    return result
```
The "do, recurse, undo" shape is the heart of backtracking.
When DFS, when BFS?
- DFS: any-path / all-paths / cycle detection / topological sort / tree traversal.
- BFS: shortest path in unweighted graph / level-order traversal / minimum-step problems.
If they say "shortest" or "fewest steps," default to BFS. Otherwise DFS is usually simpler.
Breadth-First Search (BFS) — the companion
```python
from collections import deque

def bfs(start):
    queue = deque([start])
    visited = {start}
    while queue:
        node = queue.popleft()
        process(node)
        for neighbor in neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
```
Use `deque` for O(1) `popleft`. `list.pop(0)` is O(n) — a common bug that silently makes BFS quadratic.
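A typical minimum-steps BFS on a grid, sketched (names and the grid encoding — 0 open, 1 wall — are my assumptions):

```python
from collections import deque

def shortest_path_steps(grid, start, goal):
    """Fewest 4-directional steps from start to goal; BFS explores level by level."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps  # first time we reach goal is the shortest, by BFS invariant
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return -1  # unreachable
```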
Big-O cheat sheet
What every senior candidate states unprompted:
| Operation | List | Dict | Set | Heap |
|---|---|---|---|---|
| Lookup by index/key | O(1) | O(1) | O(1) (membership) | n/a |
| Insert | O(1) append, O(n) middle | O(1) | O(1) | O(log n) |
| Delete | O(n) | O(1) | O(1) | O(log n) |
| Min/max | O(n) | O(n) | O(n) | O(1) peek, O(log n) pop |
| Search | O(n) | O(1) | O(1) | O(n) |
Time complexity classes (sense of speed):
- O(1): constant — instant.
- O(log n): logarithmic — binary search.
- O(n): linear — single pass.
- O(n log n): comparison sort, divide-and-conquer.
- O(n²): nested loops, brute force pairs.
- O(2ⁿ): exponential, brute-force subsets.
- O(n!): factorial, permutations.
Interviewers expect you to state complexity at the end of every solution.
Common Python tools to have at your fingertips
```python
# Iteration
enumerate(items)   # (index, item) pairs
zip(a, b)          # parallel iteration
reversed(items)    # reverse iterator

# Collections
from collections import Counter, defaultdict, deque, OrderedDict

# Heap (priority queue)
import heapq
heapq.heappush(heap, item)
smallest = heapq.heappop(heap)
heapq.nsmallest(k, items)
heapq.nlargest(k, items)

# Sorting
sorted(items, key=lambda x: x.priority, reverse=True)
items.sort(key=...)  # in-place

# String
s.startswith(prefix)
s.find(needle)       # -1 if not found
'-'.join(parts)
s.split(',')

# Math
abs(x), min(a, b), max(a, b), sum(items)
import math; math.inf, -math.inf

# Set operations
a | b  # union
a & b  # intersection
a - b  # difference

# functools.cache for memoization
from functools import cache

@cache
def fib(n):
    if n < 2: return n
    return fib(n-1) + fib(n-2)
```
A clean coding-interview answer template
What to say while you code:
"Restating: I want to [X]. Inputs are [Y]. Constraints are [Z]. Brute force would be [O(?)] — [describe]. I think we can do better with [pattern]. Walking through example [E]: at step 1 I'd [A]; at step 2 I'd [B]; result is [R]. Coding it up now... [code while talking]. Time complexity is [T] because [why]; space is [S] because [why]. Edge cases I want to verify: empty input, single element, duplicates, all-same. Want me to walk through one of those?"
Practiced once, that template gets you through 80% of coding interviews looking composed.
What to drill
The three commonly named — Longest Common Prefix, Valid Anagram, Subarray Sum Equals K — are walked through in 11-coding-problems. Beyond those, drill these LeetCode classics by topic:
- Array: Two Sum, Best Time to Buy/Sell Stock, Maximum Subarray
- Hash table: Group Anagrams, Top K Frequent Elements, Contains Duplicate
- String: Valid Palindrome, Longest Substring Without Repeating Characters, Reverse Words in a String
- Sorting: Merge Intervals, Sort Colors, Kth Largest Element
- DFS: Number of Islands, Path Sum, Validate BST, Clone Graph
Two to three problems per topic, fully solved out loud, is enough.