Understanding Performance of C++ from the Implementation Perspective (2) : STL Containers

One thing that attracts me to C++ is the fine-grained control it gives you over a program. I am therefore writing a series of blog posts based on my understanding of how to write highly efficient C++ code. The material comes from various sources, including CppCon talks, StackOverflow, and the C++ standard, and I have benchmarked some of the claims myself. I hope it will be helpful to you.

Rule of Thumb

  1. If unordered exists in an STL container name, it is almost certainly implemented via a hash table.
  2. When in doubt, use vector.

Associative Containers: An Overview

STL provides two families of associative containers:

Container       Implementation   Ordered?   Average Lookup
set             Red-black tree   Yes        O(log n)
map             Red-black tree   Yes        O(log n)
unordered_set   Hash table       No         O(1)
unordered_map   Hash table       No         O(1)

The core trade-off is order vs. speed. Tree-based containers maintain sorted order and guarantee O(log n) worst-case; hash-based containers give O(1) average but have no ordering and worst-case O(n) on hash collisions.


unordered_set and unordered_map

Both are implemented as a hash table. In principle this could be open addressing or separate chaining, but the standard's bucket interface and its pointer- and reference-stability guarantees effectively mandate separate chaining, and all major implementations (libstdc++, libc++, MSVC) use separate chaining with a bucket array.

Complexity:

Operation   Average   Worst case
Insert      O(1)      O(n)
Lookup      O(1)      O(n)
Delete      O(1)      O(n)

The worst case occurs when all keys hash to the same bucket (degenerate collision). In practice this is rare with a good hash function and reasonable load factor.

When to use:

  • You need fast existence checks or key-value lookup and don’t care about order.
  • Keys are primitive types or types for which a hash is readily available.

Defining a Custom Hash

For primitive types (int, size_t, etc.) the standard library provides hash specializations automatically. For custom types such as pair<int,int> or a struct, you must supply your own.

A simple and effective polynomial hash for 2D or 3D indices:

struct PairHash {
    size_t operator()(const pair<int,int>& p) const {
        size_t h = 17;
        h = h * 53 + hash<int>{}(p.first);
        h = h * 53 + hash<int>{}(p.second);
        return h;
    }
};

unordered_set<pair<int,int>, PairHash> visited;
unordered_map<pair<int,int>, int, PairHash> dist;

For a 3D key, extend the pattern:

struct TripleHash {
    size_t operator()(const tuple<int,int,int>& t) const {
        size_t h = 17;
        h = h * 53 + hash<int>{}(get<0>(t));
        h = h * 53 + hash<int>{}(get<1>(t));
        h = h * 53 + hash<int>{}(get<2>(t));
        return h;
    }
};

The constant 53 (a prime) spreads bits well and reduces collision clustering.1

An Important Caveat: Rehashing

When the load factor (number of elements / number of buckets) exceeds the maximum load factor (1.0 by default, per the standard), the hash table rehashes: it allocates a larger bucket array and reinserts every element. This is O(n) and can be surprising if it happens inside a hot loop. Rehashing also invalidates all iterators (though not pointers or references to elements).

You can pre-allocate to avoid rehashing:

unordered_map<int, int> freq;
freq.max_load_factor(0.25); // keep the table sparse for faster lookup
freq.reserve(1024);         // pre-allocate buckets for 1024 elements;
                            // set max_load_factor first, since reserve
                            // sizes the bucket array based on it

set and map

Both are implemented as a red-black tree — a self-balancing binary search tree that guarantees O(log n) for all operations in the worst case, and maintains elements in sorted order.

Complexity:

Operation            Worst case
Insert               O(log n)
Lookup               O(log n)
Delete               O(log n)
Min/Max              O(1) (begin()/rbegin() are constant time)
In-order traversal   O(n)

When to use:

  • You need elements in sorted order (e.g., iterating from smallest to largest).
  • You need range queries: lower_bound, upper_bound.
  • You need stable worst-case guarantees (no hash collision spikes).

Range Queries

The tree structure enables efficient range operations not possible with hash containers:

map<int, string> m;
// ... populate ...

// All entries with keys in [lo, hi]
auto it = m.lower_bound(lo);
while (it != m.end() && it->first <= hi) {
    cout << it->first << " -> " << it->second << "\n";
    ++it;
}

Performance Pitfall: Cache Misses

Red-black trees store nodes on the heap with pointer chasing. For large sets, this leads to many cache misses compared to a contiguous structure like vector or a hash table with a flat bucket array. For small n (< a few hundred elements), a sorted vector with binary search often outperforms set due to cache locality.

// For small, mostly-read sets, this can be faster:
vector<int> v = {3, 1, 4, 1, 5, 9};
sort(v.begin(), v.end());
v.erase(unique(v.begin(), v.end()), v.end());
bool found = binary_search(v.begin(), v.end(), 4); // O(log n), cache-friendly

vector: The Default Choice

vector is a contiguous dynamic array. It is almost always the right choice unless you have a specific reason to use another container, because:

  • Cache-friendly: elements are laid out sequentially in memory.
  • Random access in O(1): direct indexing.
  • Append in amortized O(1): push_back grows capacity geometrically on reallocation (the growth factor is implementation-defined: 2x in libstdc++, 1.5x in MSVC).

Complexity:

Operation                 Complexity
Random access             O(1)
push_back                 O(1) amortized
Insert/delete at middle   O(n)
Search (unsorted)         O(n)
Search (sorted, binary)   O(log n)

Avoiding Reallocations

Like unordered_map, vector resizes when it runs out of capacity. Pre-allocate when you know the approximate size:

vector<int> v;
v.reserve(10000); // allocate once, avoid repeated reallocations

size() vs capacity(): size is the number of elements currently stored; capacity is how much memory is allocated. After reserve(n), size is unchanged but capacity >= n.


Summary: Which Container to Pick?

Need key-value pairs?
  ├─ Yes → Need sorted keys or range queries?
  │           ├─ Yes → map
  │           └─ No  → unordered_map (faster)
  └─ No  → Need uniqueness only?
              ├─ Yes → Need sorted order?
              │           ├─ Yes → set
              │           └─ No  → unordered_set (faster)
              └─ No  → vector (default)

When performance matters, always measure. The theoretical complexity advantage of O(1) hash lookup over O(log n) tree lookup may be dwarfed by cache effects, rehashing, or a bad hash function in practice.


  1. See this StackOverflow discussion for more on polynomial hashing for multi-dimensional indices. ↩︎