An index is an auxiliary data structure that speeds up lookups at the cost of extra storage and slower writes. B+ trees are the default: they keep keys sorted, support range queries, and guarantee O(log n) lookup. Hash indexes give O(1) point lookups but cannot do ranges.
B-trees and B+ trees
A B-tree of order m allows each node up to m children and m-1 keys. All leaves are at the same depth, giving O(log n) lookup. A B+ tree stores all data in the leaves and links them in a sorted chain, making range scans efficient. Most relational databases use B+ trees for their primary indexes.
Scheme
; B-tree lookup simulation.; A node is (keys children) where children is one longer than keys.; Leaf nodes have empty children.
(define tree
(list (list 1020) ; root keys
(list ; root children
(list (list 37) (list)) ; left child (leaf)
(list (list 1215) (list)) ; middle child (leaf)
(list (list 2530) (list))))) ; right child (leaf)
(define (btree-search node target)
(let ((keys (car node))
(children (cadr node)))
(if (null? children)
; Leaf: scan keys
(if (member target keys) #t#f)
; Internal: find correct child
(let loop ((ks keys) (cs children))
(cond
((null? ks) (btree-search (car cs) target))
((< target (car ks)) (btree-search (car cs) target))
((= target (car ks)) #t)
(else (loop (cdr ks) (cdr cs))))))))
(display "Search 7: ") (display (btree-search tree 7)) (newline)
(display "Search 10: ") (display (btree-search tree 10)) (newline)
(display "Search 11: ") (display (btree-search tree 11)) (newline)
(display "Search 25: ") (display (btree-search tree 25))
Hash indexes
A hash index hashes the key to a bucket number. Lookup is O(1) on average. The tradeoff: hash indexes cannot answer range queries (WHERE x BETWEEN 10 AND 20) because hashing destroys order. Use them only for equality lookups.
Scheme
; Hash index: O(1) lookup, no range queries.
(define num-buckets 4)
(define (hash-key k)
(modulo k num-buckets))
; Build the index: list of buckets, each bucket is a list of (key . value) pairs
(define (make-index entries)
(let ((buckets (make-vector num-buckets (list))))
(for-each
(lambda (entry)
(let ((b (hash-key (car entry))))
(vector-set! buckets b (cons entry (vector-ref buckets b)))))
entries)
buckets))
(define idx (make-index (list (cons 1"Alice") (cons 5"Bob") (cons 9"Carol") (cons 2"Dave"))))
; Lookup
(define (hash-lookup index key)
(let ((bucket (vector-ref index (hash-key key))))
(let loop ((pairs bucket))
(cond
((null? pairs) #f)
((= (caar pairs) key) (cdar pairs))
(else (loop (cdr pairs)))))))
(display "Lookup 5: ") (display (hash-lookup idx 5)) (newline)
(display "Lookup 9: ") (display (hash-lookup idx 9)) (newline)
(display "Lookup 7: ") (display (hash-lookup idx 7))
Python
# Demonstrating index usage with SQLiteimport sqlite3
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
# Insert 1000 rowsfor i inrange(1000):
c.execute("INSERT INTO items VALUES (?,?,?)", (i, "item_" + str(i), i * 1.5))
# Without index on price
c.execute("EXPLAIN QUERY PLAN SELECT * FROM items WHERE price > 500")
print("Without index:", c.fetchone())
# Create index
c.execute("CREATE INDEX idx_price ON items(price)")
# With index on price
c.execute("EXPLAIN QUERY PLAN SELECT * FROM items WHERE price > 500")
print("With index: ", c.fetchone())
Neighbors
Cross-references
⚙ Algorithms Ch.5 — trees: B-trees are balanced search trees optimized for disk