Deep dive into how Set, Dict, OrderedDict, HeapQ, List, and Deque are implemented internally in CPython
OrderedDict is a dict subclass that has methods specialized for rearranging dictionary order. It maintains insertion order while providing O(1) operations.
popitem(last=True) - Returns and removes a (key, value) pair in LIFO order if last is true or FIFO order if false.
move_to_end(key, last=True) - Moves an existing key to either end of an ordered dictionary. Useful in LRU Cache implementations.
from collections import OrderedDict

d = OrderedDict.fromkeys('abcde')
d.move_to_end('b')
''.join(d) # 'acdeb'
d.move_to_end('b', last=False)
''.join(d) # 'bacde'
A hash table (normal dict) + a doubly linked list
This design gives O(1) hash lookups from the dict plus O(1) reordering via the linked list:
typedef struct {
PyDictObject dict; // underlying dict
_ODictNode *od_first; // head of linked list
_ODictNode *od_last; // tail of linked list
Py_ssize_t od_state; // mutation counter
} PyODictObject;
It literally embeds a PyDictObject, and order is maintained outside the hash table.
typedef struct _odictnode {
PyObject *key;
Py_hash_t hash;
struct _odictnode *next;
struct _odictnode *prev;
} _ODictNode;
Critical Detail: Each key in the dict has exactly one node in this doubly linked list.
Unlike a naive design, OrderedDict does not store nodes inside dict values.
Instead, a side mapping from each key to its list node is kept, so conceptually:
dict[key] → _ODictNode*
This avoids extra hash lookups.
od[key] = value
Insertion computes hash(key), stores the value in the underlying dict, allocates an _ODictNode, appends it to the tail of the linked list, and records the key → node mapping. Important: order is updated only on first insertion, not on overwrite.
del od[key]
Deletion removes the key from the dict and unlinks its node from the list. All steps are O(1).
Iteration does not touch the hash table order at all.
od_first → node → node → node → od_last
This is why popitem(last=True) and popitem(last=False) are both O(1) — the secret sauce behind LRU caches.
od.move_to_end(key, last=True)
Internally, the node is simply unlinked and relinked at the chosen end of the list. No rehashing. No copying.
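As an illustration (not CPython source), a minimal LRU cache can be built on exactly these O(1) primitives; the class name and capacity here are hypothetical:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch built on OrderedDict's O(1) reordering."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict least recently used

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' is now most recent
cache.put('c', 3)    # evicts 'b'
print(list(cache._data))  # ['a', 'c']
```

Every operation is a single hash lookup plus a constant number of pointer updates.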
| Operation | Complexity | Explanation |
|---|---|---|
| popitem(last=True) | O(1) | Head/tail removal |
| move_to_end(key, last=True) | O(1) | Unlink + relink node |
| od[key] = value | O(1) avg amortized | Hash lookup + append node if new |
| od[key] | O(1) avg | Dict lookup |
| del od[key] | O(1) avg amortized | Dict delete + unlink node |
The heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Min-heaps are binary trees for which every parent node has a value less than or equal to any of its children (heap invariant).
It provides heapify(), heappush(), and heappop() for efficient insertion and removal.

| Function | Complexity | Why |
|---|---|---|
| heappush(heap, x) | O(log n) | Sift-up |
| heappop(heap) | O(log n) | Sift-down |
| heapreplace(heap, x) | O(log n) | Pop + push combined |
| heappushpop(heap, x) | O(log n) (best-case O(1)) | Conditional sift |
| heapify(x) | O(n) | Floyd's algorithm |
Core design choice: A heap is just a Python list.
There is no node class and no pointers — the heap is the list itself:

heap = []

The heap invariant (min-heap):
heap[i] <= heap[2*i + 1]
heap[i] <= heap[2*i + 2]
Python design constraints: function calls and comparisons are relatively expensive, so heapq is optimized to minimize comparisons and element writes per operation.
import heapq
li = [25, 20, 15, 30, 40]
# Convert the list into a heap
heapq.heapify(li)
print("Heap queue:", li)
Everything in heapq reduces to two internal helpers:
_siftdown → used when inserting
Moves an element up the tree.
_siftup → used when removing
Moves an element down the tree.
Called after inserting at the end.
Simplified logic (from heapq.py):
def _siftdown(heap, startpos, pos):
newitem = heap[pos]
while pos > startpos:
parentpos = (pos - 1) >> 1
parent = heap[parentpos]
if newitem < parent:
heap[pos] = parent
pos = parentpos
else:
break
heap[pos] = newitem
Note the (pos - 1) >> 1 instead of (pos - 1) // 2 — a cheap bit-shift for the parent index. O(log n) comparisons, minimal writes.
Used after popping the root.
def _siftup(heap, pos):
endpos = len(heap)
startpos = pos
newitem = heap[pos]
childpos = 2*pos + 1
while childpos < endpos:
rightpos = childpos + 1
if rightpos < endpos and heap[rightpos] < heap[childpos]:
childpos = rightpos
heap[pos] = heap[childpos]
pos = childpos
childpos = 2*pos + 1
heap[pos] = newitem
_siftdown(heap, startpos, pos)
This is a classic CPython micro-optimization:
_siftup pushes the "hole" down to a leaf, then _siftdown fixes ordering on the way up.

def heappush(heap, item):
heap.append(item)
_siftdown(heap, 0, len(heap) - 1)
That's literally it.
def heappop(heap):
lastelt = heap.pop()
if heap:
returnitem = heap[0]
heap[0] = lastelt
_siftup(heap, 0)
return returnitem
return lastelt
Important detail: the last element is popped first, so the list shrinks before the root is overwritten — no hole is ever left in the array.
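Because heappop always returns the smallest element, popping repeatedly yields sorted order — a classic heapsort sketch (the function name is mine, not heapq's):

```python
import heapq

def heapsort(iterable):
    """Heapify once (O(n)), then pop n times (O(n log n) total)."""
    heap = list(iterable)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([25, 20, 15, 30, 40]))  # [15, 20, 25, 30, 40]
```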
def heapify(x):
for i in reversed(range(len(x)//2)):
_siftup(x, i)
Explanation: leaves are already valid heaps, so only the internal nodes (the first len(x)//2 positions) are sifted, bottom-up. Summing the sift costs over all levels gives O(n) total — Floyd's algorithm.
heapq implements only a min-heap. To simulate a max-heap, use:
heapq.heappush(heap, -x)
That is, store and pop elements by their negative values.
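The negation trick in full (values here are arbitrary examples):

```python
import heapq

nums = [5, 1, 9, 3]
max_heap = []
for x in nums:
    heapq.heappush(max_heap, -x)    # store negated values

largest = -heapq.heappop(max_heap)  # negate again on the way out
print(largest)  # 9
```

The smallest negated value corresponds to the largest original value, so pops come out in descending order.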
A Set in Python is used to store a collection of items with the following properties: elements are unique, unordered, and must be hashable.
| Operation | Complexity (avg) | Worst | Notes |
|---|---|---|---|
| x in s | O(1) | O(n) | Hash lookup |
| s.add(x) | O(1) | O(n) | May resize |
| s.remove(x) | O(1) | O(n) | Raises KeyError |
| s.discard(x) | O(1) | O(n) | No error |
| len(s) | O(1) | — | Stored count |
| s.clear() | O(n) | — | Frees table |
| x not in s | O(1) | O(n) | Negated lookup |
| Operation | Reason |
|---|---|
| Indexing (s[0]) | No order |
| Slicing | No order |
| Stable iteration | Hash-table scan |
| Sorting in-place | Not meaningful |
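The semantics above in action — remove raises on a missing element, discard does not:

```python
s = {1, 2, 3}
s.add(2)           # no-op: 2 is already present
s.discard(99)      # missing element: silently ignored

try:
    s.remove(99)   # missing element: raises
except KeyError:
    print("remove raised KeyError")

print(3 in s, len(s))  # True 3
```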
A Python set is a hash table, not a tree, not a bitset.
It is very similar to dict, but it stores only keys — there is no values array — and it caches its own hash in the frozenset case.
typedef struct {
PyObject_HEAD
Py_ssize_t fill; // active + dummy entries
Py_ssize_t used; // active entries only
Py_ssize_t mask; // table size - 1
setentry *table; // hash table
Py_hash_t hash; // for frozenset only
Py_ssize_t finger; // iteration state
} PySetObject;
| Field | Meaning |
|---|---|
| used | Number of live elements |
| fill | Live + deleted ("dummy") |
| mask | table_size - 1 (power of two) |
| table | Pointer to entry array |
| finger | Iterator progress |
| hash | Cached hash (frozenset only) |
typedef struct {
PyObject *key;
Py_hash_t hash;
} setentry;
Special values:
- key == NULL → empty slot
- key == dummy → deleted slot (dummy is a global singleton marker)

Index calculation:
i = hash & mask;
Table size is always a power of two, so modulo is a cheap bitmask.
Python sets use open addressing with perturbation probing (not chaining, not linear probing).
Probe sequence:
i = (i * 5 + 1 + perturb) & mask;
perturb >>= 5;
Why this design?
This same scheme is used by dict.
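The probe sequence can be simulated in pure Python — a sketch of the scheme above, not the C code; the hash value and table size are arbitrary:

```python
def probe_sequence(h, mask, n=6):
    """Yield the first n slot indices visited for hash h (a sketch)."""
    perturb = h
    i = h & mask
    for _ in range(n):
        yield i
        # mix the remaining high bits of the hash into the next index
        i = (i * 5 + 1 + perturb) & mask
        perturb >>= 5

print(list(probe_sequence(0x12345678, mask=7)))
```

Once perturb reaches zero, the recurrence i = (i*5 + 1) & mask alone still visits every slot of a power-of-two table, so the probe never gets stuck.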
Lookup computes hash(x), then indexes with hash & mask and probes until it finds the key or an empty slot.

| Case | Cost |
|---|---|
| Average | O(1) |
| Worst (pathological hashes) | O(n) |
Python randomizes hashes to prevent DoS attacks.
Insertion stores the (key, hash) pair in a slot and updates used and fill. Resize when:
fill * 3 >= mask * 2 (~66%)
This keeps probe lengths short.
Deletion does not clear the slot.
Instead:
entry->key = dummy;
Why? Clearing the slot would break the probe chain for any key placed beyond it after a collision. Too many dummies → triggers a resize.
When resizing, a new, larger table is allocated and every live entry is re-inserted; dummy slots are dropped.
Cost: O(n), but amortized O(1) per operation.
Iteration is a linear scan over the table:
for i in range(table_size):
if entry[i].key is real:
yield entry[i].key
Mutating the set during iteration raises RuntimeError, and a full pass costs O(len(set)) (proportional to the table size).

Python's dict (dictionary) is one of the most fundamental and heavily optimized data structures in the language. It implements an associative array that maps hashable keys to arbitrary values.
A dict is essentially a set with values attached:
Both use the same underlying hash table implementation with perturbation probing.
Modern Python dicts (3.6+) use a compact dict design:
typedef struct {
PyObject_HEAD
Py_ssize_t ma_used; // number of items
PyDictKeysObject *ma_keys; // keys table
PyObject **ma_values; // values array
} PyDictObject;
Before Python 3.6, dicts stored entries in one sparse array. The new design uses two arrays: a sparse index table and a dense array of entries kept in insertion order.

Benefits: lower memory use, better cache locality during iteration, and insertion-order preservation (a language guarantee since 3.7).
| Operation | Average | Worst | Notes |
|---|---|---|---|
| d[key] | O(1) | O(n) | Hash lookup |
| d[key] = value | O(1) | O(n) | May resize |
| del d[key] | O(1) | O(n) | Marks entry as deleted |
| key in d | O(1) | O(n) | Hash lookup |
| d.get(key, default) | O(1) | O(n) | Safe lookup |
| Iteration | O(n) | O(n) | Linear scan of entries |
collections.defaultdict is a subclass of dict that provides a default value for missing keys:
from collections import defaultdict
# Instead of KeyError, returns default value
d = defaultdict(int)
d['missing'] # Returns 0
# Common use case: grouping
groups = defaultdict(list)
for item in items:
groups[item.category].append(item)
Implementation: defaultdict simply overrides __missing__() to call the default_factory function when a key is not found.
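The same hook is available to any dict subclass; a minimal sketch (the class name is made up) of how the mechanism works:

```python
class CountingDict(dict):
    """dict subclass illustrating the __missing__ hook defaultdict relies on."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent
        value = 0
        self[key] = value
        return value

counts = CountingDict()
for word in ["a", "b", "a"]:
    counts[word] += 1   # first access per word triggers __missing__

print(counts)  # {'a': 2, 'b': 1}
```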
Python dicts provide dynamic views:
d = {'a': 1, 'b': 2}
keys = d.keys() # dict_keys view
values = d.values() # dict_values view
items = d.items() # dict_items view
These views are dynamic - they reflect changes to the dictionary in real-time and don't create a copy.
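A quick demonstration of that dynamic behavior:

```python
d = {'a': 1, 'b': 2}
keys = d.keys()          # a live view, not a snapshot
print(list(keys))        # ['a', 'b']

d['c'] = 3               # mutate the dict after creating the view
print(list(keys))        # ['a', 'b', 'c'] — the view reflects the change

# Key views also behave like sets
print(sorted(d.keys() & {'a', 'c', 'z'}))  # ['a', 'c']
```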
Python's list is a dynamic array that can grow or shrink as needed. It's one of the most commonly used data structures in Python.
A list is a dynamic array with over-allocation for efficiency.
typedef struct {
PyObject_VAR_HEAD
PyObject **ob_item; // pointer to array of object pointers
Py_ssize_t allocated; // how many slots allocated
} PyListObject;
Key points:
- ob_size: current number of elements (from PyObject_VAR_HEAD)
- allocated: total capacity (always ≥ ob_size)
- ob_item: array of pointers to Python objects

When a list needs to grow, it doesn't just add one slot. Instead, it over-allocates:
Growth pattern:
new_allocated = (size >> 3) + (size < 9 ? 3 : 6) + size
This means roughly 12.5% extra capacity is added on top of the requested size, plus a small constant.
Why over-allocate? Makes append operations amortized O(1) instead of O(n).
| Operation | Complexity | Notes |
|---|---|---|
| list[i] | O(1) | Direct array access |
| list.append(x) | O(1) amortized | May trigger reallocation |
| list.insert(i, x) | O(n) | Shifts all elements after i |
| list.pop() | O(1) | Remove from end |
| list.pop(i) | O(n) | Shifts all elements after i |
| list.remove(x) | O(n) | Search + shift |
| x in list | O(n) | Linear search |
| list.sort() | O(n log n) | Timsort algorithm |
| list.reverse() | O(n) | In-place reversal |
| Slicing list[a:b] | O(b-a) | Creates new list |
Lists store pointers, not values directly:
# Each list element is a pointer (8 bytes on 64-bit)
lst = [1, 2, 3] # Stores 3 pointers to int objects
# The actual integers are separate Python objects
# List overhead: ~56 bytes + 8 bytes per element
Example of list growth:

import sys

lst = []
for i in range(10):
    lst.append(i)
    print(f"size={len(lst)}, bytes={sys.getsizeof(lst)}")

The allocated-slot sequence (documented in a comment in listobject.c) grows in jumps: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...; sys.getsizeof shows the corresponding jumps in bytes.
This means appending 10 items triggers only 3 reallocations (at sizes 1, 5, and 9) instead of 10.
collections.deque (double-ended queue) is optimized for fast appends and pops from both ends. Unlike lists, deques use a different internal structure.
A deque is implemented as a doubly-linked list of blocks (arrays).
Structure:
typedef struct {
PyObject_VAR_HEAD
block *leftblock; // leftmost block
block *rightblock; // rightmost block
Py_ssize_t leftindex; // index in leftblock
Py_ssize_t rightindex; // index in rightblock
size_t state; // modification counter
Py_ssize_t maxlen; // max size (or -1)
} dequeobject;
Each block:
#define BLOCKLEN 64 // elements per block
typedef struct block {
struct block *leftlink;
struct block *rightlink;
PyObject *data[BLOCKLEN];
} block;
Hybrid approach combining benefits of linked lists (O(1) operations at both ends, no reallocation) and arrays (cache locality, low per-element overhead).
Instead of one node per element, deque uses blocks of 64 elements, reducing memory overhead and improving cache performance.
| Operation | Complexity | Notes |
|---|---|---|
| deque.append(x) | O(1) | Add to right end |
| deque.appendleft(x) | O(1) | Add to left end |
| deque.pop() | O(1) | Remove from right end |
| deque.popleft() | O(1) | Remove from left end |
| deque[i] | O(n) | Must traverse blocks |
| deque.rotate(k) | O(k) | Rotate elements |
| x in deque | O(n) | Linear search |
| len(deque) | O(1) | Stored count |
| Operation | List | Deque |
|---|---|---|
| Append right | O(1) amortized | O(1) |
| Append left | O(n) | O(1) |
| Pop right | O(1) | O(1) |
| Pop left | O(n) | O(1) |
| Index access | O(1) | O(n) |
| Memory | Contiguous | Blocks |
Use deque when you need fast appends and pops at both ends — queues, BFS frontiers, sliding windows.
Use list when you need random access by index, slicing, or sorting, and you mostly operate on the right end.
Deques can have a maximum length:
from collections import deque
# Bounded deque - automatically drops from opposite end
dq = deque(maxlen=3)
dq.extend([1, 2, 3]) # deque([1, 2, 3])
dq.append(4) # deque([2, 3, 4]) - 1 was dropped
This is perfect for implementing sliding windows, recent-history buffers, and moving averages.
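For example, a sliding-window moving average falls out of maxlen almost for free (the function name and inputs are illustrative):

```python
from collections import deque

def moving_average(values, window):
    """Yield the running mean over the last `window` values."""
    buf = deque(maxlen=window)   # old values fall off the left automatically
    for v in values:
        buf.append(v)
        yield sum(buf) / len(buf)

print(list(moving_average([1, 2, 3, 4, 5], window=3)))
# [1.0, 1.5, 2.0, 3.0, 4.0]
```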
A hash table implements a mapping: key → value
With the goal of:
It achieves this by turning a key into an array index using a hash function.
A hash function maps a key to an integer: h = hash(key)
Good hash functions aim for uniform distribution across the table, speed, and determinism within a single process.
Hash tables store data in an array of size N. To map a hash to an index:
index = hash % N
In real implementations (including Python):
index = hash & (N - 1)
This only works when N is a power of two (8, 16, 32, 64, 128...). Bitwise AND is much faster than modulo.
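The equivalence is easy to verify (the hash values below are arbitrary, including a negative one — Python's & handles it via two's-complement semantics):

```python
N = 64          # table size, a power of two
mask = N - 1    # 0b111111

for h in (17, 103, -9, 2**40 + 5):
    # identical results whenever N is a power of two
    assert (h % N) == (h & mask)

print("hash & mask matches hash % N for power-of-two N")
```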
Each bucket holds a linked list: index 3 → (k1,v1) → (k2,v2) → (k3,v3)
Simple, but pointer-heavy and cache-unfriendly. Used in many textbooks, not used in Python.
All entries live inside the array itself. If a slot is taken, you probe for another slot.
Perturbation Probing (Python's choice):
i = hash & mask
perturb = hash
while slot not usable:
i = (i * 5 + 1 + perturb) & mask
perturb >>= 5
Why this is brilliant: every bit of the hash eventually influences the probe sequence, so clustering patterns that defeat plain linear probing get broken up.
You cannot just clear a slot. Why?
A collides → placed later
B collides → placed even later
If A is deleted and slot cleared,
lookup(B) will stop early → BUG
Solution: DUMMY markers
Instead of clearing: slot = DUMMY
This preserves probe chains. Too many DUMMY slots triggers a resize.
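A toy open-addressing set with DUMMY tombstones makes the bug concrete; this is a sketch only — it uses linear probing for brevity, where CPython perturbs:

```python
DUMMY = object()   # tombstone marker, analogous to CPython's dummy singleton

class ToySet:
    """Toy open-addressing set with tombstones (linear probing for brevity)."""

    def __init__(self, size=8):
        self.table = [None] * size   # size must be a power of two
        self.mask = size - 1

    def _probe(self, key):
        i = hash(key) & self.mask
        while True:
            yield i
            i = (i + 1) & self.mask  # linear probing; CPython perturbs instead

    def add(self, key):
        free = None
        for i in self._probe(key):
            slot = self.table[i]
            if slot is None:
                # reuse an earlier tombstone if we passed one
                self.table[free if free is not None else i] = key
                return
            if slot is DUMMY:
                if free is None:
                    free = i
            elif slot == key:
                return               # already present

    def discard(self, key):
        for i in self._probe(key):
            slot = self.table[i]
            if slot is None:
                return               # hit an empty slot: key not present
            if slot is not DUMMY and slot == key:
                self.table[i] = DUMMY  # tombstone keeps probe chains intact
                return

    def __contains__(self, key):
        for i in self._probe(key):
            slot = self.table[i]
            if slot is None:
                return False         # a *cleared* slot would stop the scan here
            if slot is not DUMMY and slot == key:
                return True          # DUMMY lets the scan continue past deletions

s = ToySet()
s.add(1)
s.add(9)        # hash(9) & 7 == 1, so 9 collides with 1 and lands at slot 2
s.discard(1)    # slot 1 becomes DUMMY, not None
print(9 in s)   # True: the tombstone preserved 9's probe chain
```

Had discard set the slot back to None, the lookup for 9 would stop at slot 1 and wrongly report the key as absent.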
Load factor = used_slots / table_size
Python keeps it below ~66%
When exceeded, the table grows (to the next suitable power of two) and all live entries are re-inserted.
Cost: O(n), but amortized O(1) per operation.
| Structure | Access | Cache | Allocations |
|---|---|---|---|
| Hash table | O(1) | Excellent | Minimal |
| Tree | O(log n) | Poor | Many |
| Linked list | O(n) | Poor | Many |
Hash tables align perfectly with modern CPUs.
Python randomizes hash seeds at startup. This prevents attackers from crafting keys that all collide, which stopped real-world DoS attacks.
In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.
Most operations on a binary search tree (BST) take time directly proportional to the height of the tree, so it is desirable to keep the height small.
A binary tree with height h can contain at most:
2^0 + 2^1 + ... + 2^h = 2^(h+1) - 1 nodes
This implies that for any tree with n nodes and height h:
n ≤ 2^(h+1) - 1
Which means:
h ≥ ⌈log₂(n+1) - 1⌉ ≥ ⌊log₂(n)⌋
In other words, the minimum height of a binary tree with n nodes is log₂(n), rounded down: ⌊log₂(n)⌋
However, the simplest algorithms for BST item insertion may yield a tree with height n in rather common situations. For example, when items are inserted in sorted key order, the tree degenerates into a linked list with n nodes.
The difference in performance may be enormous:
For n = 1,000,000:
Minimum height: ⌊log₂(1,000,000)⌋ = 19
Worst-case height: 1,000,000
That's a roughly 50,000x difference!
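The figures can be checked directly — a quick sanity calculation, not part of any tree implementation:

```python
import math

n = 1_000_000
min_height = math.floor(math.log2(n))  # height of a balanced tree
worst_height = n                       # degenerate tree from sorted insertions

print(min_height)                      # 19
print(worst_height // min_height)      # 52631, i.e. roughly 50,000x
```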
Self-balancing binary trees solve this problem by performing transformations (such as tree rotations) at key insertion times, to keep the height proportional to log₂(n). Although a certain overhead is involved, it is not bigger than the always necessary lookup cost.
| Language | Data Structure | Implementation |
|---|---|---|
| C++ STL | set, map | Red-Black Tree |
| Java | TreeSet, TreeMap | Red-Black Tree |
| Python | No native support | Use bisect module or PyPI: rbtree, pyavl |
Self-balancing BSTs can be used in a natural way to construct and maintain ordered lists and ordered maps — sorted associative containers with O(log n) operations.
An AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and right subtrees for any node cannot be more than one.
Balance Factor: For each node:
balance_factor = height(left_subtree) - height(right_subtree)
For an AVL tree: -1 ≤ balance_factor ≤ +1
The values above nodes represent the difference between left and right subtree heights:
Example AVL Tree:
12 (+1)
/ \
8 (+1) 18 (+1)
/ \ /
5 (+1) 11(0) 17(0)
/
4(0)
Each node shows its balance factor in parentheses.
Insertion follows the same basic rules as in a Binary Search Tree (BST), followed by rebalancing rotations on the way back up to the root.
Two basic operations can rebalance a BST without violating the BST property:
Right Rotation:
y x
/ \ / \
x T3 → T1 y
/ \ / \
T1 T2 T2 T3
Left Rotation:
x y
/ \ / \
T1 y → x T3
/ \ / \
T2 T3 T1 T2
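The two rotations are pure pointer surgery; a sketch using a minimal made-up Node class:

```python
class Node:
    """Minimal BST node used only to illustrate the two rotations."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(y):
    """y(x(T1, T2), T3) → x(T1, y(T2, T3)); only T2 changes parent."""
    x = y.left
    y.left, x.right = x.right, y   # RHS is evaluated before assignment
    return x                       # x is the new subtree root

def rotate_left(x):
    """Mirror image: x(T1, y(T2, T3)) → y(x(T1, T2), T3)."""
    y = x.right
    x.right, y.left = y.left, x
    return y

root = Node('y', left=Node('x', Node('T1'), Node('T2')), right=Node('T3'))
root = rotate_right(root)
print(root.key, root.left.key, root.right.key)  # x T1 y
```

Each rotation touches a constant number of pointers, which is why rotations are O(1).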
1. Left-Left Case: Node is inserted in left subtree of left child
Solution: Single right rotation at the unbalanced node
z y
/ \ / \
y T4 → x z
/ \ / \
x T3 T3 T4
/ \
T1 T2
2. Left-Right Case: Node is inserted in right subtree of left child
Solution: Left rotation on left child, then right rotation on root
z z x
/ \ / \ / \
y T4 → x T4 → y z
/ \ / \ / \ \
T1 x y T3 T1 T2 T3 T4
/ \ / \
T2 T3 T1 T2
3. Right-Right Case: Node is inserted in right subtree of right child
Solution: Single left rotation at the unbalanced node
z y
/ \ / \
T1 y → z x
/ \ / \ \
T2 x T1 T2 T3 T4
/ \
T3 T4
4. Right-Left Case: Node is inserted in left subtree of right child
Solution: Right rotation on right child, then left rotation on root
z z x
/ \ / \ / \
T1 y → T1 x → z y
/ \ / \ / \ \
x T4 T2 y T1 T2 T3 T4
/ \ / \
T2 T3 T3 T4
Follow these steps after each insertion: update the height of each ancestor, compute its balance factor (left_height - right_height), and if the factor leaves [-1, +1], apply the matching rotation case above.

| Operation | Complexity | Explanation |
|---|---|---|
| Search | O(log n) | Tree height is O(log n) |
| Insert | O(log n) | Search + constant rotation operations |
| Delete | O(log n) | Search + constant rotation operations |
| Rotation | O(1) | Only pointer changes |
To ensure the tree remains AVL after deletion, rebalance every ancestor on the way back up from the removed node.
Important difference from insertion: an insertion needs at most one (possibly double) rotation, but a deletion may trigger rotations at up to O(log n) ancestors.
Auxiliary space: O(log n) for the recursion call stack (since insertion/deletion are implemented recursively).
Python's standard library does not support self-balancing BSTs natively because hash-based dict and set cover most mapping and membership needs. Instead, use the bisect module for sorted lists, or PyPI packages.

When to use PyPI BST implementations (rbtree, pyavl): when you need ordered iteration, range queries, or nearest-key lookups together with O(log n) insertion.
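In practice, the bisect module keeps a plain list sorted with O(log n) search (though each insertion is O(n) because elements must shift); the keys below are arbitrary:

```python
import bisect

sorted_keys = []
for k in [30, 10, 20, 40]:
    bisect.insort(sorted_keys, k)      # O(log n) search + O(n) shift

print(sorted_keys)                     # [10, 20, 30, 40]

# O(log n) range query: keys in [15, 35)
lo = bisect.bisect_left(sorted_keys, 15)
hi = bisect.bisect_left(sorted_keys, 35)
print(sorted_keys[lo:hi])              # [20, 30]
```

For read-heavy workloads this often beats a tree in practice thanks to the contiguous array's cache behavior.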
Key Takeaways:
| Data Structure | Best For | Worst Operation | Memory |
|---|---|---|---|
| List | Random access, iteration | Insert/delete in middle: O(n) | Contiguous, over-allocated |
| Deque | Queue, stack operations | Random access: O(n) | Blocks of 64 elements |
| Dict | Key-value mapping | Worst-case lookup: O(n) | Sparse indices + dense entries |
| Set | Membership testing, uniqueness | Worst-case lookup: O(n) | Hash table, no values |
| OrderedDict | Ordered key-value, LRU cache | Extra memory for linked list | Dict + doubly linked list |
| HeapQ | Priority queue, top-k problems | Not for random access | Just a list |
Understanding these internals helps you write more efficient Python code and make better decisions about which data structure to use for your specific use case.