← back to databases

Indexing

Wikipedia · wpDatabase index · CC BY-SA 4.0

An index is an auxiliary data structure that speeds up lookups at the cost of extra storage and slower writes. B+ trees are the default: they keep keys sorted, support range queries, and guarantee O(log n) lookup. Hash indexes give O(1) point lookups but cannot do ranges.

10 | 20 3 | 7 15 | 25 1,2,3 5,7,8 10,12 15,18 20,25 B+ tree: sorted keys, linked leaves for range scans.

B-trees and B+ trees

A B-tree of order m allows each node up to m children and m-1 keys. All leaves are at the same depth, giving O(log n) lookup. A B+ tree stores all data in the leaves and links them in a sorted chain, making range scans efficient. Most relational databases use B+ trees for their primary indexes.

Scheme

Hash indexes

A hash index hashes the key to a bucket number. Lookup is O(1) on average. The tradeoff: hash indexes cannot answer range queries (WHERE x BETWEEN 10 AND 20) because hashing destroys order. Use them only for equality lookups.

Scheme
Neighbors

Cross-references

Foundations (Wikipedia)