The first pattern to reach for, and the one most over-applied. The trick a hash table buys is trading space for the ability to ask membership / index questions in O(1).
When it's the right tool
- "Have I seen this before?" → set lookup.
- "Where did I see this?" → map from value → index/object.
- "Pair / triple summing to a target" → store complements as you scan.
- "Group by some key" → bucket map.
- "Is this a permutation / anagram?" → counter equality.
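A minimal sketch of two of these moves — bucket-map grouping and counter-equality anagram checks. The function names (`group_anagrams`, `is_anagram`) are illustrative, not from the original:

```python
from collections import Counter, defaultdict

def group_anagrams(words):
    # Bucket map: key = canonical form (sorted letters),
    # value = every word that shares it.
    buckets = defaultdict(list)
    for w in words:
        buckets["".join(sorted(w))].append(w)
    return list(buckets.values())

def is_anagram(a, b):
    # "Is this a permutation / anagram?" via counter equality.
    return Counter(a) == Counter(b)
```

The sorted-letters key is one of several workable canonical forms; a character-count tuple works too and avoids the O(k log k) sort per word.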
Mental moves
- Single pass with a map of complements beats two-pointer on unsorted input. The classic two-sum: store `target - x` keyed to index.
- Sequence detection: instead of sorting, put everything in a set; for each `x`, only start counting if `x - 1` is not in the set. That's O(n) instead of O(n log n) for "longest consecutive sequence" type problems.
- Counter equality (`Counter(a) == Counter(b)`) is the simplest anagram check. For sliding-window anagrams, maintain a running counter and a `seen` count of matched characters instead of recomputing from scratch at each position.
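The first two moves above, sketched in Python (function names are my own):

```python
def two_sum(nums, target):
    # Map each seen value to its index; one pass, O(n).
    seen = {}  # value -> index
    for i, x in enumerate(nums):
        if target - x in seen:
            return seen[target - x], i
        seen[x] = i
    return None

def longest_consecutive(nums):
    # Put everything in a set, then only start counting a run at its
    # left edge, i.e. when x - 1 is absent. Every element is part of
    # at most one inner-loop walk, so the whole thing is O(n).
    s = set(nums)
    best = 0
    for x in s:
        if x - 1 not in s:
            run = 1
            while x + run in s:
                run += 1
            best = max(best, run)
    return best
```

Note the left-edge check is what keeps `longest_consecutive` linear: without it, the inner `while` would re-walk the same run from every element in it.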
What the hash table costs you
- Worst-case O(n) lookups on adversarial keys. Trust your stdlib's hash randomization, or pre-hash untrusted external input before using it as a key.
- Memory pressure: every entry carries 50–100 bytes of overhead in Python. A `set` of 10M ints is ~700MB.
- Iteration order: dicts are insertion-ordered in modern Python (3.7+), but sets are not; don't lean on either semantically.
- No range queries. If you need "all keys between X and Y," you wanted a B-tree, not a hash.
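To make the last point concrete, here is what a sorted structure buys that a hash set can't: a range query in O(log n + k). A sorted list with `bisect` stands in for a B-tree in this sketch (`keys_in_range` is an illustrative name; `keys` is assumed to be kept sorted):

```python
import bisect

def keys_in_range(keys, lo, hi):
    # keys must be sorted. Two binary searches bracket the range,
    # then the slice returns the k keys inside it.
    left = bisect.bisect_left(keys, lo)
    right = bisect.bisect_right(keys, hi)
    return keys[left:right]
```

A hash set would have to scan all n keys to answer the same question.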
The smell
If your solution has three nested loops over the same array, the answer is almost always "build a map first." Conversely, if you've built a map of size n and only query it once, you didn't need it.
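A small instance of that smell and its fix, using a problem I'm supplying for illustration (count index pairs i < j with equal values): the naive version compares every pair in O(n²); counting occurrences first makes it one pass.

```python
from collections import Counter

def count_equal_pairs(nums):
    # Build the map first: c occurrences of a value form
    # c * (c - 1) // 2 equal pairs. O(n) instead of O(n^2).
    counts = Counter(nums)
    return sum(c * (c - 1) // 2 for c in counts.values())
```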