<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>STL | Ziji's Homepage</title><link>https://zijishi.xyz/tag/stl/</link><atom:link href="https://zijishi.xyz/tag/stl/index.xml" rel="self" type="application/rss+xml"/><description>STL</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><copyright>Ziji Shi © 2025</copyright><lastBuildDate>Wed, 19 Feb 2025 23:20:07 +0800</lastBuildDate><image><url>https://zijishi.xyz/media/icon_hu_926934747de47144.png</url><title>STL</title><link>https://zijishi.xyz/tag/stl/</link></image><item><title>Understanding Performance of C++ from the Implementation Perspective (2) : STL Containers</title><link>https://zijishi.xyz/post/cpp/an-incomplete-intro-to-stl-from-implementation-perspective/</link><pubDate>Wed, 19 Feb 2025 23:20:07 +0800</pubDate><guid>https://zijishi.xyz/post/cpp/an-incomplete-intro-to-stl-from-implementation-perspective/</guid><description>&lt;p&gt;One thing about C++ that attracts me is the flexibility to control the program. Therefore, I am writing a series of blogs based on my understanding of highly efficient C++ code. They come from various sources, including CppCon videos, StackOverflow, and the C++ standard. I have also benchmarked some of the claims. I hope this will be helpful to you.&lt;/p&gt;
&lt;h2 id="rule-of-thumb"&gt;Rule of Thumb&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;If &lt;code&gt;unordered&lt;/code&gt; exists in an STL container name, it is almost certainly implemented via a hash table.&lt;/li&gt;
&lt;li&gt;When in doubt, use &lt;code&gt;vector&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id="associative-containers-an-overview"&gt;Associative Containers: An Overview&lt;/h2&gt;
&lt;p&gt;STL provides two families of associative containers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Container&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Ordered?&lt;/th&gt;
&lt;th&gt;Average Lookup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;set&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Red-black tree&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;map&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Red-black tree&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;unordered_set&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hash table&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;unordered_map&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hash table&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The core trade-off is &lt;strong&gt;order vs. speed&lt;/strong&gt;. Tree-based containers maintain sorted order and guarantee O(log n) worst-case; hash-based containers give O(1) average but have no ordering and worst-case O(n) on hash collisions.&lt;/p&gt;
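&lt;p&gt;To see the ordering difference concretely: iterating a &lt;code&gt;map&lt;/code&gt; always yields keys in ascending order, while an &lt;code&gt;unordered_map&lt;/code&gt; yields them in an unspecified, hash-dependent order. A minimal sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;map&amp;gt;
#include &amp;lt;unordered_map&amp;gt;
using namespace std;

int main() {
    map&amp;lt;int, const char*&amp;gt; ordered = {{3, &amp;quot;c&amp;quot;}, {1, &amp;quot;a&amp;quot;}, {2, &amp;quot;b&amp;quot;}};
    unordered_map&amp;lt;int, const char*&amp;gt; hashed(ordered.begin(), ordered.end());

    for (auto&amp;amp; [k, v] : ordered) cout &amp;lt;&amp;lt; k &amp;lt;&amp;lt; ' ';  // always: 1 2 3
    cout &amp;lt;&amp;lt; '\n';
    for (auto&amp;amp; [k, v] : hashed) cout &amp;lt;&amp;lt; k &amp;lt;&amp;lt; ' ';   // implementation-defined order
    cout &amp;lt;&amp;lt; '\n';
}
&lt;/code&gt;&lt;/pre&gt;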
&lt;hr&gt;
&lt;h2 id="unordered_set-and-unordered_map"&gt;&lt;code&gt;unordered_set&lt;/code&gt; and &lt;code&gt;unordered_map&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Both are implemented as a &lt;strong&gt;hash table&lt;/strong&gt;. The standard does not prescribe the layout, but its requirements (a bucket interface, and references to elements that survive rehashing) effectively rule out open addressing, so all major implementations (libstdc++, libc++, MSVC) use separate chaining over a bucket array.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complexity:&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Average&lt;/th&gt;
&lt;th&gt;Worst case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Insert&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lookup&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The worst case occurs when all keys hash to the same bucket (degenerate collision). In practice this is rare with a good hash function and reasonable load factor.&lt;/p&gt;
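&lt;p&gt;To make the degenerate case concrete, here is a deliberately terrible hash (a hypothetical &lt;code&gt;BadHash&lt;/code&gt;, for illustration only) that sends every key to the same bucket; the container stays correct, but every operation degrades to a linear scan of one long chain:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;unordered_set&amp;gt;
using namespace std;

// Pathological hash: every key lands in bucket 0, so each chain
// grows to length n and lookups become O(n) scans.
struct BadHash {
    size_t operator()(int) const { return 0; }
};

unordered_set&amp;lt;int, BadHash&amp;gt; s;  // correct, but pathologically slow
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A good hash function and a bounded load factor are precisely what keep the average case at O(1).&lt;/p&gt;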
&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need fast existence checks or key-value lookup and don&amp;rsquo;t care about order.&lt;/li&gt;
&lt;li&gt;Keys are primitive types or types for which a hash is readily available.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="defining-a-custom-hash"&gt;Defining a Custom Hash&lt;/h3&gt;
&lt;p&gt;For primitive types (&lt;code&gt;int&lt;/code&gt;, &lt;code&gt;size_t&lt;/code&gt;, etc.) the standard library provides hash specializations automatically. For custom types such as &lt;code&gt;pair&amp;lt;int,int&amp;gt;&lt;/code&gt; or a struct, you must supply your own.&lt;/p&gt;
&lt;p&gt;A simple and effective polynomial hash for 2D or 3D indices:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;functional&amp;gt;     // std::hash
#include &amp;lt;unordered_map&amp;gt;
#include &amp;lt;unordered_set&amp;gt;
#include &amp;lt;utility&amp;gt;        // std::pair
using namespace std;

struct PairHash {
    size_t operator()(const pair&amp;lt;int,int&amp;gt;&amp;amp; p) const {
        size_t h = 17;                        // arbitrary non-zero seed
        h = h * 53 + hash&amp;lt;int&amp;gt;{}(p.first);    // fold in each field
        h = h * 53 + hash&amp;lt;int&amp;gt;{}(p.second);
        return h;
    }
};

unordered_set&amp;lt;pair&amp;lt;int,int&amp;gt;, PairHash&amp;gt; visited;
unordered_map&amp;lt;pair&amp;lt;int,int&amp;gt;, int, PairHash&amp;gt; dist;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For a 3D key, extend the pattern:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;tuple&amp;gt;

struct TripleHash {
    size_t operator()(const tuple&amp;lt;int,int,int&amp;gt;&amp;amp; t) const {
        size_t h = 17;
        h = h * 53 + hash&amp;lt;int&amp;gt;{}(get&amp;lt;0&amp;gt;(t));
        h = h * 53 + hash&amp;lt;int&amp;gt;{}(get&amp;lt;1&amp;gt;(t));
        h = h * 53 + hash&amp;lt;int&amp;gt;{}(get&amp;lt;2&amp;gt;(t));
        return h;
    }
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The constant 53 (a prime) spreads bits well and reduces collision clustering.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;h3 id="an-important-caveat-rehashing"&gt;An Important Caveat: Rehashing&lt;/h3&gt;
&lt;p&gt;When the load factor (number of elements / number of buckets) exceeds a threshold (&lt;code&gt;max_load_factor&lt;/code&gt;, 1.0 by default), the hash table &lt;strong&gt;rehashes&lt;/strong&gt;: it allocates a larger bucket array and reinserts every element. This is O(n), invalidates all iterators (though not pointers or references to elements), and can be surprising if it happens inside a hot loop.&lt;/p&gt;
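&lt;p&gt;One quick way to observe rehashing (a sketch; the exact bucket counts are implementation-defined) is to watch &lt;code&gt;bucket_count()&lt;/code&gt; jump as elements are inserted:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;unordered_map&amp;gt;
using namespace std;

int main() {
    unordered_map&amp;lt;int, int&amp;gt; m;
    size_t buckets = m.bucket_count();
    for (int i = 0; i &amp;lt; 1000; ++i) {
        m[i] = i;
        if (m.bucket_count() != buckets) {  // a rehash just happened
            buckets = m.bucket_count();
            cout &amp;lt;&amp;lt; &amp;quot;rehash at size &amp;quot; &amp;lt;&amp;lt; m.size()
                 &amp;lt;&amp;lt; &amp;quot;, now &amp;quot; &amp;lt;&amp;lt; buckets &amp;lt;&amp;lt; &amp;quot; buckets\n&amp;quot;;
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;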
&lt;p&gt;You can pre-allocate to avoid rehashing:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;unordered_map&amp;lt;int, int&amp;gt; freq;
freq.max_load_factor(0.25); // keep the table sparse for faster lookup
freq.reserve(1024);         // room for 1024 elements without rehashing
// Note: set max_load_factor first; reserve sizes the bucket array
// using the max_load_factor in effect when it is called.
&lt;/code&gt;&lt;/pre&gt;
&lt;hr&gt;
&lt;h2 id="set-and-map"&gt;&lt;code&gt;set&lt;/code&gt; and &lt;code&gt;map&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Both are implemented as a &lt;strong&gt;red-black tree&lt;/strong&gt; in every major standard library — a self-balancing binary search tree that guarantees O(log n) for insert, lookup, and delete in the worst case, and keeps elements in sorted order.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complexity:&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Worst case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Insert&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lookup&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Min/Max&lt;/td&gt;
&lt;td&gt;O(1) (&lt;code&gt;begin()&lt;/code&gt;/&lt;code&gt;rbegin()&lt;/code&gt; are constant time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-order traversal&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need elements in sorted order (e.g., iterating from smallest to largest).&lt;/li&gt;
&lt;li&gt;You need range queries: &lt;code&gt;lower_bound&lt;/code&gt;, &lt;code&gt;upper_bound&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;You need stable worst-case guarantees (no hash collision spikes).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="range-queries"&gt;Range Queries&lt;/h3&gt;
&lt;p&gt;The tree structure enables efficient range operations not possible with hash containers:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;map&amp;lt;int, string&amp;gt; m;
// ... populate ...
// Visit all entries with keys in [lo, hi]
auto it = m.lower_bound(lo);                 // first key &amp;gt;= lo
while (it != m.end() &amp;amp;&amp;amp; it-&amp;gt;first &amp;lt;= hi) {
    cout &amp;lt;&amp;lt; it-&amp;gt;first &amp;lt;&amp;lt; &amp;quot; -&amp;gt; &amp;quot; &amp;lt;&amp;lt; it-&amp;gt;second &amp;lt;&amp;lt; &amp;quot;\n&amp;quot;;
    ++it;
}
&lt;/code&gt;&lt;/pre&gt;
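&lt;p&gt;The same machinery answers successor and predecessor queries, e.g. finding the closest element to a value, which hash containers cannot do at all. A small sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;iterator&amp;gt;  // std::prev
#include &amp;lt;set&amp;gt;
using namespace std;

int main() {
    set&amp;lt;int&amp;gt; s = {10, 20, 30};
    auto it = s.lower_bound(25);                     // smallest element &amp;gt;= 25
    if (it != s.end()) cout &amp;lt;&amp;lt; *it &amp;lt;&amp;lt; '\n';         // prints 30
    if (it != s.begin()) cout &amp;lt;&amp;lt; *prev(it) &amp;lt;&amp;lt; '\n'; // largest &amp;lt; 25: prints 20
}
&lt;/code&gt;&lt;/pre&gt;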
&lt;h3 id="performance-pitfall-cache-misses"&gt;Performance Pitfall: Cache Misses&lt;/h3&gt;
&lt;p&gt;Red-black trees allocate each node separately on the heap, so traversal means pointer chasing. For large sets this causes many &lt;strong&gt;cache misses&lt;/strong&gt; compared to a contiguous structure like &lt;code&gt;vector&lt;/code&gt; or a hash table with a flat bucket array. For small n (up to a few hundred elements), a sorted &lt;code&gt;vector&lt;/code&gt; with binary search often outperforms &lt;code&gt;set&lt;/code&gt; due to cache locality.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;algorithm&amp;gt;   // sort, unique, binary_search
#include &amp;lt;vector&amp;gt;

// For small, mostly-read sets, this can be faster than std::set:
vector&amp;lt;int&amp;gt; v = {3, 1, 4, 1, 5, 9};
sort(v.begin(), v.end());
v.erase(unique(v.begin(), v.end()), v.end());      // drop duplicates
bool found = binary_search(v.begin(), v.end(), 4); // O(log n), cache-friendly
&lt;/code&gt;&lt;/pre&gt;
&lt;hr&gt;
&lt;h2 id="vector-the-default-choice"&gt;&lt;code&gt;vector&lt;/code&gt;: The Default Choice&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;vector&lt;/code&gt; is a contiguous dynamic array. It is almost always the right choice unless you have a specific reason to use another container, because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Cache-friendly&lt;/strong&gt;: elements are laid out sequentially in memory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Random access in O(1)&lt;/strong&gt;: direct indexing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Append in amortized O(1)&lt;/strong&gt;: &lt;code&gt;push_back&lt;/code&gt; grows capacity geometrically on reallocation (2× in libstdc++, 1.5× in MSVC), so the amortized cost per append is constant.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Complexity:&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Random access&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;push_back&lt;/td&gt;
&lt;td&gt;O(1) amortized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Insert/delete at middle&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search (unsorted)&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search (sorted, binary)&lt;/td&gt;
&lt;td&gt;O(log n)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="avoiding-reallocations"&gt;Avoiding Reallocations&lt;/h3&gt;
&lt;p&gt;Like &lt;code&gt;unordered_map&lt;/code&gt;, &lt;code&gt;vector&lt;/code&gt; resizes when it runs out of capacity. Pre-allocate when you know the approximate size:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;vector&amp;lt;int&amp;gt; v;
v.reserve(10000); // allocate once, avoid repeated reallocations
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;size()&lt;/code&gt; vs &lt;code&gt;capacity()&lt;/code&gt;: &lt;code&gt;size&lt;/code&gt; is the number of elements currently stored; &lt;code&gt;capacity&lt;/code&gt; is how much memory is allocated. After &lt;code&gt;reserve(n)&lt;/code&gt;, &lt;code&gt;size&lt;/code&gt; is unchanged but &lt;code&gt;capacity &amp;gt;= n&lt;/code&gt;.&lt;/p&gt;
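&lt;p&gt;This can be checked directly; only the inequality on capacity is guaranteed, since growth beyond the requested amount is implementation-defined:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;cassert&amp;gt;
#include &amp;lt;vector&amp;gt;
using namespace std;

int main() {
    vector&amp;lt;int&amp;gt; v = {1, 2, 3};
    v.reserve(100);
    assert(v.size() == 3);        // reserve never changes size
    assert(v.capacity() &amp;gt;= 100);  // at least what was requested
    v.push_back(4);               // no reallocation: capacity suffices
    assert(v.size() == 4);
}
&lt;/code&gt;&lt;/pre&gt;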
&lt;hr&gt;
&lt;h2 id="summary-which-container-to-pick"&gt;Summary: Which Container to Pick?&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;Need key-value pairs?
├─ Yes → Need sorted keys or range queries?
│        ├─ Yes → map
│        └─ No  → unordered_map (faster)
└─ No  → Need uniqueness only?
         ├─ Yes → Need sorted order?
         │        ├─ Yes → set
         │        └─ No  → unordered_set (faster)
         └─ No  → vector (default)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When performance matters, always &lt;strong&gt;measure&lt;/strong&gt;. The theoretical complexity advantage of O(1) hash lookup over O(log n) tree lookup may be dwarfed by cache effects, rehashing, or a bad hash function in practice.&lt;/p&gt;
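&lt;p&gt;As a starting point for such measurements, here is a minimal timing harness using &lt;code&gt;&amp;lt;chrono&amp;gt;&lt;/code&gt; (a sketch only; results vary with machine, compiler, and optimization flags):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-cpp"&gt;#include &amp;lt;chrono&amp;gt;
#include &amp;lt;iostream&amp;gt;
#include &amp;lt;set&amp;gt;
#include &amp;lt;unordered_set&amp;gt;
using namespace std;
using namespace std::chrono;

template &amp;lt;class Container&amp;gt;
long long time_lookups(const Container&amp;amp; c, int n) {
    auto t0 = steady_clock::now();
    long long hits = 0;
    for (int i = 0; i &amp;lt; n; ++i) hits += c.count(i);  // n successful lookups
    auto t1 = steady_clock::now();
    cout &amp;lt;&amp;lt; duration_cast&amp;lt;microseconds&amp;gt;(t1 - t0).count() &amp;lt;&amp;lt; &amp;quot; us\n&amp;quot;;
    return hits;
}

int main() {
    const int N = 100000;
    set&amp;lt;int&amp;gt; tree;
    unordered_set&amp;lt;int&amp;gt; flat;
    for (int i = 0; i &amp;lt; N; ++i) { tree.insert(i); flat.insert(i); }
    time_lookups(tree, N);  // O(log n) per lookup, pointer chasing
    time_lookups(flat, N);  // O(1) average, bucket array
}
&lt;/code&gt;&lt;/pre&gt;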
&lt;div class="footnotes" role="doc-endnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;See &lt;a href="https://stackoverflow.com/questions/2634690/good-hash-function-for-a-2d-index/2634715#2634715" target="_blank" rel="noopener"&gt;this StackOverflow discussion&lt;/a&gt; for more on polynomial hashing for multi-dimensional indices.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink"&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</description></item></channel></rss>