← back to Operating Systems

File Systems

Wikipedia · wpFile system · CC BY-SA 4.0

A file system organizes persistent data on disk into files and directories. The key abstraction: a file is a named sequence of bytes. The file system maps names to blocks on disk, tracks free space, and (in modern systems) journals writes for crash recovery.

Files and directories

A file has data (the bytes) and metadata (name, size, permissions, timestamps). A directory is a file that maps names to file references. The directory tree gives every file a unique path.

Inodes

In Unix file systems, an inode stores all metadata about a file except its name. The directory maps names to inode numbers. An inode contains pointers to the data blocks on disk. For large files, indirect pointers add extra levels of indirection.

Inode 42 size: 13KB perms: rw-r--r-- links: 1 direct[0] direct[1] direct[2] indirect Block 8 Block 14 Block 23 Indirect Blk 31 Blk 45 31 45
Scheme

Allocation strategies

Contiguous: file occupies consecutive blocks. Fast reads, but causes fragmentation. Linked: each block has a pointer to the next. No fragmentation, but random access is slow. Indexed: an index block holds all pointers. Best of both, at the cost of index overhead. Unix inodes use indexed allocation with direct and indirect pointers.

Free space management

The file system tracks which blocks are free using a bitmap (one bit per block) or a free list (linked list of free blocks). Bitmaps are compact and support fast allocation of contiguous blocks.

Journaling

A journal (or log) records intended changes before applying them. If the system crashes mid-write, recovery replays or rolls back the journal. This turns a potentially corrupted file system into one that recovers in seconds instead of hours of fsck.

Neighbors
  • 🔢 Discrete Math Ch.4 — directed graphs: file system directories are DAGs (hard links can create cycles, hence the restriction)
  • 💾 Databases Ch.6 — WAL and crash recovery: journaling filesystems use the same write-ahead logging technique as databases
  • ⚙ Algorithms Ch.4 — B-trees: most filesystem metadata structures (ext4, NTFS, HFS+) are B-tree variants

Foundations (Wikipedia)