Coding Fundamentals
Arrays, Hash Tables, Strings, Sorting, DFS — the patterns and decision rules. See 11-coding-problems for the worked solutions to specific problems.
Use Python for AI-engineer interviews unless they specify otherwise. It's expected for AI/ML roles, has the cleanest data-structure syntax, and lets you focus on logic instead of language ceremony.
How to approach a coding problem in an interview
A consistent framework matters more than language fluency. For every problem:
- Restate the problem. "Just to make sure I understand — you want me to..."
- Ask about constraints. Input size? Memory limits? Are inputs sorted? Can there be duplicates? Empty input?
- Walk through an example. Pick a small one. Trace it on paper / shared doc.
- Talk through the brute-force first. Even if you'll discard it. Shows you can think; gives you a fallback.
- Identify the pattern. Hash map? Two pointers? Sliding window? DFS? (See pattern menu below.)
- State complexity. "Brute force is O(n²). I think we can do O(n) with a hash map."
- Code it. Talk while you type. Don't go silent.
- Test with examples. Walk your code against the example. Find the bug before they do.
- Discuss tradeoffs. Time, space, edge cases, what you'd add for production.
That sequence applied consistently looks senior. Skipping straight to coding looks junior even when the code is right.
Arrays
The everyday data structure. Python list. Properties:
- Random access: O(1) by index.
- Append: amortized O(1).
- Insert/delete in middle: O(n) — full shift.
- Search: O(n) unsorted, O(log n) sorted (binary search).
- In-place vs copy: "in place" = don't allocate a new array.
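The O(log n) sorted search above is binary search. A minimal sketch (function name mine; the standard library's `bisect` module covers the common cases):

```python
def binary_search(nums, target):
    """Return the index of target in sorted nums, or -1 if absent."""
    lo, hi = 0, len(nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            lo = mid + 1        # target is in the right half
        else:
            hi = mid - 1        # target is in the left half
    return -1
```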
Two pointers (sorted array)
Use when you have a sorted array and need to find pairs/triplets meeting a property.
```python
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left < right:
        s = nums[left] + nums[right]
        if s == target:
            return [left, right]
        if s < target:
            left += 1
        else:
            right -= 1
    return []
```
Time O(n), space O(1). The "if smaller, move left up; if bigger, move right down" insight only works because the array is sorted.
Sliding window
Use when looking for a contiguous subarray/substring with some property.
```python
def longest_substring_at_most_k_distinct(s, k):
    counts = {}
    left = 0
    best = 0
    for right in range(len(s)):
        counts[s[right]] = counts.get(s[right], 0) + 1
        while len(counts) > k:
            counts[s[left]] -= 1
            if counts[s[left]] == 0:
                del counts[s[left]]
            left += 1
        best = max(best, right - left + 1)
    return best
```
Time O(n), space O(k). The window grows by moving right, shrinks by moving left.
Prefix sum
Use when asked about subarray sums, range queries.
```python
def prefix_sums(nums):
    prefix = [0]
    for n in nums:
        prefix.append(prefix[-1] + n)
    return prefix

# Sum of nums[i..j] (inclusive) = prefix[j+1] - prefix[i]
```
Time O(n) preprocess, O(1) per query. Combined with a hash map, this is the foundation for Subarray Sum Equals K.
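A sketch of that combination (function name mine; the full worked solution is in 11-coding-problems):

```python
def subarray_sum_equals_k(nums, k):
    """Count subarrays summing to k: running prefix sum + hash map of seen prefixes."""
    seen = {0: 1}       # prefix sum -> how many times it has occurred so far
    prefix = 0
    count = 0
    for n in nums:
        prefix += n
        # every earlier prefix equal to (prefix - k) closes a subarray summing to k here
        count += seen.get(prefix - k, 0)
        seen[prefix] = seen.get(prefix, 0) + 1
    return count
```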
Kadane's max subarray
```python
def max_subarray(nums):
    best = current = nums[0]
    for n in nums[1:]:
        current = max(n, current + n)
        best = max(best, current)
    return best
```
Time O(n), space O(1).
Hash Tables (dicts and sets)
The most useful data structure in interviews. Python dict and set.
- Insert / lookup / delete: average O(1), worst O(n) (collisions; rarely matters).
- Use a `dict` for key→value mapping.
- Use a `set` for "have I seen this?".
- `collections.Counter`: dict subclass for counting; `collections.defaultdict(int/list/set)` for grouping.
Signals that hash tables solve it
- "Find pairs" → look up complement.
- "Find duplicates" → set membership.
- "Group by" → dict from group → list.
- "Frequency" → Counter.
- "Cache results" → memoize.
Two Sum — the canonical hash-map problem
```python
def two_sum(nums, target):
    seen = {}  # value → index
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []
```
Time O(n), space O(n). This pattern — "look for the complement in a hash map as you iterate" — appears in dozens of variations.
Counter pattern
```python
from collections import Counter

counts = Counter("anagram")  # Counter({'a': 3, 'n': 1, 'g': 1, 'r': 1, 'm': 1})
counts1 == counts2           # True iff same multiset of chars (anagram check)
```
defaultdict pattern (grouping)
```python
from collections import defaultdict

groups = defaultdict(list)
for word in words:
    key = ''.join(sorted(word))
    groups[key].append(word)
```
Group anagrams in one pass. Default value of empty list lets you .append without checking existence.
String Manipulation
In Python, strings are immutable sequences. Common moves:
- Indexing & slicing: `s[0]`, `s[-1]`, `s[i:j]`. Slicing is O(j-i).
- Concatenation: `''.join(parts)` is O(total length); `s += t` in a loop is O(n²) — never do this.
- Splitting: `s.split()`, `s.split(',')`.
- Char ↔ ASCII: `ord('a')` → 97; `chr(97)` → 'a'.
- Lower/upper: `s.lower()`, `s.upper()`.
Anagram / palindrome family
- Valid anagram → compare Counters (or sorted strings).
- Palindrome check → two pointers from both ends.
- Longest palindrome substring → expand-around-center, O(n²).
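Expand-around-center, sketched (function name mine; each index seeds both an odd-length and an even-length center, 2n-1 centers total):

```python
def longest_palindrome_substring(s):
    """O(n^2) time, O(1) extra space: grow a palindrome outward from each center."""
    best = ""
    for center in range(len(s)):
        # (center, center) is an odd-length seed; (center, center+1) an even-length one
        for left, right in ((center, center), (center, center + 1)):
            while left >= 0 and right < len(s) and s[left] == s[right]:
                left -= 1
                right += 1
            # the loop overshoots by one on each side, so the palindrome is s[left+1:right]
            if right - left - 1 > len(best):
                best = s[left + 1:right]
    return best
```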
Common string patterns
- Frequency map for anagrams, character counts.
- Sliding window for substring problems with constraints.
- Two pointers for palindromes, reversal in place.
- Vertical scan for prefix problems (Longest Common Prefix).
- Stack for matching brackets, undo operations.
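The bracket-matching stack, sketched (function name mine):

```python
def valid_brackets(s):
    """Push opening brackets; on a closer, the top of the stack must be its partner."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover opens mean unmatched brackets
```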
Sorting
You almost never implement a sort in an interview — you call sorted() or list.sort(). But you must know:
- Time: Python uses Timsort, O(n log n) average and worst, O(n) on already-sorted.
- Space: O(n) extra for `sorted`; O(1) extra-ish for in-place `list.sort()`.
- Stable: yes, Timsort is stable.
- Custom keys: `sorted(items, key=lambda x: (x.priority, x.created_at))`. Tuple keys give you multi-level sort.
When you'd choose a different sort
| Need | Approach |
|---|---|
| Top K (don't need full sort) | Heap, O(n log k) |
| Counting small-range integers | Counting sort, O(n + k) |
| Almost-sorted data | Insertion sort, near O(n) on nearly sorted input — Timsort exploits this |
| Streaming / external | External merge sort |
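The Top-K row in practice, a sketch with a size-k min-heap (function name mine; `heapq.nlargest(k, nums)` does the same in one call):

```python
import heapq

def top_k(nums, k):
    """Keep a min-heap of size k; O(n log k) instead of O(n log n) for a full sort."""
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)  # evict the smallest, so the k largest survive
    return sorted(heap, reverse=True)
```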
Sort-then-X is often the unlock
- Merge intervals: sort by start, then sweep.
- Find duplicates: sort, then look for adjacent duplicates.
- Group anagrams: sort each word's chars, group by key.
If you're stuck, ask: "would sorting first make this easier?" — surprisingly often, yes.
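Merge intervals as a sketch of sort-then-sweep (function name mine; assumes intervals are `[start, end]` lists):

```python
def merge_intervals(intervals):
    """Sort by start, then sweep, folding each overlap into the last kept interval."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)  # overlaps: extend in place
        else:
            merged.append([start, end])
    return merged
```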
Implement merge sort (if asked)
```python
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
```
Time O(n log n), space O(n). Stable. Comfortable to derive on the fly.
Quicksort is also worth knowing — average O(n log n), worst O(n²) on already-sorted unless randomized; in-place; not stable.
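A quicksort sketch, using extra lists for readability rather than in-place partitioning (names mine; this version trades the O(1) space of in-place Lomuto/Hoare partitioning for clarity):

```python
import random

def quicksort(a):
    """Randomized quicksort: average O(n log n); random pivot avoids the sorted-input worst case."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```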
Depth-First Search (DFS)
DFS is for trees and graphs. The shape:
```python
def dfs(node, visited):
    if node is None or node in visited:
        return
    visited.add(node)
    process(node)
    for neighbor in neighbors(node):
        dfs(neighbor, visited)
```
Recursive vs iterative
Recursive is shorter; iterative (with explicit stack) avoids Python's recursion limit and gives you control.
```python
def dfs_iter(start):
    stack = [start]
    visited = set()
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        process(node)
        for neighbor in neighbors(node):
            if neighbor not in visited:
                stack.append(neighbor)
```
Trees: pre-order / in-order / post-order
```python
def preorder(node):   # root, left, right
    if not node: return
    visit(node)
    preorder(node.left)
    preorder(node.right)

def inorder(node):    # left, root, right — yields sorted order in a BST
    if not node: return
    inorder(node.left)
    visit(node)
    inorder(node.right)

def postorder(node):  # left, right, root — useful for deletion, height
    if not node: return
    postorder(node.left)
    postorder(node.right)
    visit(node)
```
Common DFS problems
- Number of Islands (count connected components in grid)
- Path Sum (does any root→leaf path equal target?)
- Max Depth of Tree
- Validate BST
- Course Schedule (cycle detection — DFS with three colors)
- All Permutations / Combinations (backtracking is DFS over a state tree)
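The three-color cycle detection mentioned for Course Schedule, sketched on an adjacency-list dict (names mine):

```python
WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / fully explored

def has_cycle(graph):
    """Directed-graph cycle detection: reaching a GRAY node means a back edge."""
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return True  # back edge into the current path: cycle
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)
```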
Backtracking — the DFS variant
```python
def permutations(nums):
    result = []
    def backtrack(path, remaining):
        if not remaining:
            result.append(path[:])
            return
        for i, n in enumerate(remaining):
            path.append(n)
            backtrack(path, remaining[:i] + remaining[i+1:])
            path.pop()  # undo
    backtrack([], nums)
    return result
```
The "do, recurse, undo" shape is the heart of backtracking.
When DFS, when BFS?
- DFS: any-path / all-paths / cycle detection / topological sort / tree traversal.
- BFS: shortest path in unweighted graph / level-order traversal / minimum-step problems.
If they say "shortest" or "fewest steps," default to BFS. Otherwise DFS is usually simpler.
Breadth-First Search (BFS) — the companion
```python
from collections import deque

def bfs(start):
    queue = deque([start])
    visited = {start}
    while queue:
        node = queue.popleft()
        process(node)
        for neighbor in neighbors(node):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
```
Use `deque` for O(1) `popleft`. `list.pop(0)` is O(n) — a common bug that silently makes BFS quadratic.
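A typical minimum-steps BFS on a grid, sketched (names and the grid encoding — 0 open, 1 wall — are my assumptions):

```python
from collections import deque

def shortest_path_steps(grid, start, goal):
    """Fewest 4-directional steps from start to goal; BFS explores level by level."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps  # first time we reach goal is the shortest, by BFS invariant
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return -1  # unreachable
```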
Big-O cheat sheet
What every senior candidate states unprompted:
| Operation | List | Dict | Set | Heap |
|---|---|---|---|---|
| Lookup by index/key | O(1) | O(1) | O(1) (membership) | n/a |
| Insert | O(1) append, O(n) middle | O(1) | O(1) | O(log n) |
| Delete | O(n) | O(1) | O(1) | O(log n) |
| Min/max | O(n) | O(n) | O(n) | O(1) peek, O(log n) pop |
| Search | O(n) | O(1) | O(1) | O(n) |
Time complexity classes (sense of speed):
- O(1): constant — instant.
- O(log n): logarithmic — binary search.
- O(n): linear — single pass.
- O(n log n): comparison sort, divide-and-conquer.
- O(n²): nested loops, brute force pairs.
- O(2ⁿ): exponential, brute-force subsets.
- O(n!): factorial, permutations.
Interviewers expect you to state complexity at the end of every solution.
Common Python tools to have at your fingertips
```python
# Iteration
enumerate(items)   # (index, item) pairs
zip(a, b)          # parallel iteration
reversed(items)    # reverse iterator

# Collections
from collections import Counter, defaultdict, deque, OrderedDict

# Heap (priority queue)
import heapq
heapq.heappush(heap, item)
smallest = heapq.heappop(heap)
heapq.nsmallest(k, items)
heapq.nlargest(k, items)

# Sorting
sorted(items, key=lambda x: x.priority, reverse=True)
items.sort(key=...)  # in-place

# String
s.startswith(prefix)
s.find(needle)       # -1 if not found
'-'.join(parts)
s.split(',')

# Math
abs(x), min(a, b), max(a, b), sum(items)
import math; math.inf, -math.inf

# Set operations
a | b  # union
a & b  # intersection
a - b  # difference

# functools.cache for memoization
from functools import cache

@cache
def fib(n):
    if n < 2: return n
    return fib(n-1) + fib(n-2)
```
A clean coding-interview answer template
What to say while you code:
"Restating: I want to [X]. Inputs are [Y]. Constraints are [Z]. Brute force would be [O(?)] — [describe]. I think we can do better with [pattern]. Walking through example [E]: at step 1 I'd [A]; at step 2 I'd [B]; result is [R]. Coding it up now... [code while talking]. Time complexity is [T] because [why]; space is [S] because [why]. Edge cases I want to verify: empty input, single element, duplicates, all-same. Want me to walk through one of those?"
Practiced once, that template gets you through 80% of coding interviews looking composed.
What to drill
The three commonly named — Longest Common Prefix, Valid Anagram, Subarray Sum Equals K — are walked through in 11-coding-problems. Beyond those, drill these LeetCode classics by topic:
- Array: Two Sum, Best Time to Buy/Sell Stock, Maximum Subarray
- Hash table: Group Anagrams, Top K Frequent Elements, Contains Duplicate
- String: Valid Palindrome, Longest Substring Without Repeating Characters, Reverse Words in a String
- Sorting: Merge Intervals, Sort Colors, Kth Largest Element
- DFS: Number of Islands, Path Sum, Validate BST, Clone Graph
Two to three problems per topic, fully solved out loud, is enough.