The first pattern to reach for, and the one most over-applied. A hash table trades space for the ability to answer membership and index questions in expected O(1).

When it’s the right tool

  • “Have I seen this before?” → set lookup.
  • “Where did I see this?” → map from value → index/object.
  • “Pair / triple summing to a target” → store complements as you scan.
  • “Group by some key” → bucket map (sketched right after this list).
  • “Is this a permutation / anagram?” → counter equality.
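
A minimal sketch of the bucket-map pattern, grouping words into anagram classes by a canonical key (group_anagrams is an illustrative name):

    from collections import defaultdict

    def group_anagrams(words):
        # Bucket by a canonical key: the sorted character tuple.
        # All anagrams share the same key, so they land in one bucket.
        buckets = defaultdict(list)
        for w in words:
            buckets[tuple(sorted(w))].append(w)
        return list(buckets.values())

    # ["eat", "tea", "tan", "ate", "nat"]
    # -> [["eat", "tea", "ate"], ["tan", "nat"]]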

Mental moves

  • A single pass with a map of complements beats two pointers on unsorted input (no sort needed). The classic two-sum: for each x, check whether target - x is already stored; if not, store x → index (first sketch below).
  • Sequence detection: instead of sorting, put everything in a set; for each x, only start counting if x-1 ∉ set. That’s O(n) instead of O(n log n) for “longest consecutive sequence” type problems (second sketch below).
  • Counter equality (Counter(a) == Counter(b)) is the simplest anagram check. For sliding-window anagrams, maintain one window counter plus a running count of matched characters instead of recomputing from scratch each step (third sketch below).
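
First sketch: the one-pass complement map for two-sum (two_sum is an illustrative name):

    def two_sum(nums, target):
        # Return (i, j), i < j, with nums[i] + nums[j] == target, else None.
        # Before storing nums[j], check whether its complement was seen.
        seen = {}  # value -> earliest index
        for j, x in enumerate(nums):
            if target - x in seen:
                return seen[target - x], j
            seen.setdefault(x, j)
        return None

    # two_sum([3, 1, 4, 2], 6) -> (2, 3)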
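Second sketch: longest consecutive sequence without sorting. Each run is walked exactly once, starting from its lowest element, so the whole pass is O(n):

    def longest_consecutive(nums):
        s = set(nums)
        best = 0
        for x in s:
            if x - 1 not in s:          # x starts a run
                end = x
                while end + 1 in s:     # walk the run once
                    end += 1
                best = max(best, end - x + 1)
        return best

    # longest_consecutive([100, 4, 200, 1, 3, 2]) == 4   # the run 1,2,3,4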
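Third sketch: sliding-window anagram search. The window counter and the matched-character count make each slide O(1) instead of a full counter comparison (find_anagrams is an illustrative name):

    from collections import Counter

    def find_anagrams(s, p):
        # Start indices in s where an anagram of p begins.
        n, m = len(s), len(p)
        if m > n:
            return []
        need = Counter(p)
        window = Counter(s[:m])
        # matched = number of characters whose window count hits the target
        matched = sum(1 for c in need if window[c] == need[c])
        out = [0] if matched == len(need) else []
        for i in range(m, n):
            # one char enters the window, one leaves
            for c, delta in ((s[i], 1), (s[i - m], -1)):
                if c in need and window[c] == need[c]:
                    matched -= 1
                window[c] += delta
                if c in need and window[c] == need[c]:
                    matched += 1
            if matched == len(need):
                out.append(i - m + 1)
        return out

    # find_anagrams("cbaebabacd", "abc") -> [0, 6]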

What the hash table costs you

  • Worst-case O(n) per operation on adversarial keys. Trust your stdlib’s hash randomization (CPython salts string hashes per process), or pre-hash untrusted input yourself.
  • Memory pressure: every entry has 50–100 bytes of overhead in Python. A set of 10M ints is ~700MB.
  • Iteration order: dicts are insertion-ordered in modern Python (3.7+); sets are not. Don’t lean on either semantically.
  • No range queries. If you need “all keys between X and Y,” you wanted a B-tree or a sorted structure, not a hash (sketch below).
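
For occasional range scans over a mostly static key set, two binary searches on a sorted list stand in for a B-tree (a sketch; keys_between is an illustrative name):

    import bisect

    def keys_between(sorted_keys, lo, hi):
        # All keys k with lo <= k <= hi: O(log n + k) via two
        # binary searches, instead of scanning the whole table.
        i = bisect.bisect_left(sorted_keys, lo)
        j = bisect.bisect_right(sorted_keys, hi)
        return sorted_keys[i:j]

    # keys_between([1, 3, 5, 8, 13], 3, 9) -> [3, 5, 8]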

The smell

If your solution has three nested loops over the same array, the answer is almost always “build a map first.” Conversely, if you’ve built a map of size n and only query it once, you didn’t need it.