System Administration Guide
Chapter 2, Administering filesystems

How UNIX systems maintain files and filesystems

A file's data is not necessarily stored in a single location on the hard disk; it is often scattered across the disk. This happens because the operating system manages disk space in units of data (blocks) rather than in whole files. For example, when you create a file, it might be stored on one part of the disk. If you edit that file and delete a few sentences here and there, the file uses less disk space than it did before, which leaves a series of gaps in the area where the file was stored. Because disk space is a precious commodity, the system allocates those small amounts of free space to other files.

Each filesystem contains special structures that allow the operating system to access and maintain the files and data stored on the filesystem:

Data blocks
A ``block'' is a 1024-byte unit of data stored on the disk. (DTFS filesystems use variable block sizes to maximize use of space.) A data block can contain either directory entries or file data. A directory entry consists of an inode number, a filename, and a version number for undelete(C) (file versioning). 

Inodes
An ``inode'' (information node) contains all the information about a file except its name and its data: the file's size, file type, permissions, owner, and the number of directory entries linked to the file. The inode also records the locations of all the data blocks that make up the file, so the operating system can collect them when needed. The filename is not stored in the inode; directories contain the actual filenames. DTFS filesystems differ in two ways. First, each DTFS inode also contains the inode number of its parent directory and the file's name. Second, DTFS inodes are not statically allocated when the filesystem is created, as they are with other filesystem types; the number of free inodes in a DTFS filesystem varies with the amount of free space available.
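The split between an inode and the directory entries that name it can be observed from user space. The following Python sketch is an illustration only and is not part of the original guide. It assumes a POSIX-style filesystem that supports hard links, and the helper name `describe_links` is hypothetical. It creates a file, adds a second directory entry for it, and reads the inode metadata through `os.stat`.

```python
import os
import stat
import tempfile


def describe_links():
    """Create a file with two names and report what its inode records.

    Illustrative helper (hypothetical name). Assumes a POSIX-style
    filesystem that supports hard links.
    """
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "report.txt")
        second = os.path.join(tmp, "alias.txt")

        with open(first, "wb") as f:
            f.write(b"hello\n")

        # A hard link is a second directory entry for the same inode.
        os.link(first, second)

        a = os.stat(first)
        b = os.stat(second)

        return {
            "same_inode": a.st_ino == b.st_ino,    # both names, one inode
            "link_count": a.st_nlink,              # directory entries linked
            "size_bytes": a.st_size,
            "is_regular_file": stat.S_ISREG(a.st_mode),
            "permissions": stat.filemode(a.st_mode),
            "owner_uid": a.st_uid,
        }


if __name__ == "__main__":
    for key, value in describe_links().items():
        print(f"{key}: {value}")
```

Both names resolve to the same inode number and the link count is 2, which reflects the point above: the names live in the directory, and everything else about the file lives in the inode.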

Superblock
One special data block, the ``superblock'', contains overall information about the filesystem, just as the inode contains information about a specific file. The superblock contains the information necessary to mount a filesystem and access its data, including the size of the filesystem, the number of free inodes, and information about free space available. When the filesystem is mounted, the system reads information from the disk version of the superblock into memory.
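Filesystem-wide counters of the kind the superblock tracks (total size, free space, free inodes) can be queried without reading the disk structure directly. The following Python sketch is an illustration only; it assumes a Unix-like system where `os.statvfs` is available, and the helper name `filesystem_summary` is hypothetical. The values come from the kernel's view of the mounted filesystem, and how they relate to the on-disk superblock depends on the filesystem type.

```python
import os


def filesystem_summary(path="/"):
    """Return filesystem-wide counters for the filesystem holding `path`.

    Illustrative helper (hypothetical name). os.statvfs is available on
    Unix-like systems only.
    """
    vfs = os.statvfs(path)
    return {
        # f_blocks and f_bfree are counted in units of f_frsize bytes.
        "fragment_size_bytes": vfs.f_frsize,
        "total_bytes": vfs.f_blocks * vfs.f_frsize,
        "free_bytes": vfs.f_bfree * vfs.f_frsize,
        "total_inodes": vfs.f_files,
        "free_inodes": vfs.f_ffree,
    }


if __name__ == "__main__":
    for key, value in filesystem_summary("/").items():
        print(f"{key}: {value}")
```

Some filesystem types allocate inodes dynamically and may report inode counts that change over time or are zero.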

Buffers
To minimize seeking data on the hard disk, recently used data blocks are held in a cache of special memory structures called ``buffers''. Because reading from memory is much faster than reading from disk, buffers make the operating system more efficient. Depending on the filesystem type and the setting of kernel parameters, the buffer cache is ``flushed'' (written to the disk) at set intervals.

Several configurable filesystem mechanisms affect how transactions are managed and committed. Some trade performance against data integrity; others trade performance against system recovery time.

Intent logging: When this feature is enabled, filesystem transactions are recorded in a log and then committed to disk. This increases system recovery speed with a very small performance penalty.

Checkpointing: When enabled, each filesystem is marked clean at regular intervals. If a filesystem is clean when the system halts, it does not need to be checked by fsck. Checkpointing works in tandem with intent logging and, like it, incurs a small performance penalty.
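The idea behind intent logging (record what is about to change, then make the change, so that recovery only has to replay unfinished work) can be sketched at application level. The following Python toy is an assumption-laden illustration, not the kernel's implementation: the class `TinyIntentLog`, its file names, and its JSON record format are all hypothetical, and a real filesystem logs transactions inside the kernel.

```python
import json
import os
import tempfile


class TinyIntentLog:
    """Toy key-value store that logs each update before applying it.

    Hypothetical class for illustration only.
    """

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "intent.log")
        self.data_path = os.path.join(directory, "data.json")

    def _append(self, record):
        # Append one log record and force it to disk before continuing.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def _load(self):
        if not os.path.exists(self.data_path):
            return {}
        with open(self.data_path) as f:
            return json.load(f)

    def _store(self, data):
        # Write a temporary file, then rename it over the data file.
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)

    def log_intent(self, txn_id, key, value):
        self._append({"op": "intent", "id": txn_id, "key": key, "value": value})

    def commit(self, txn_id, key, value):
        data = self._load()
        data[key] = value
        self._store(data)
        self._append({"op": "done", "id": txn_id})

    def update(self, txn_id, key, value):
        self.log_intent(txn_id, key, value)
        self.commit(txn_id, key, value)

    def recover(self):
        """Replay logged intents that were never marked done."""
        if not os.path.exists(self.log_path):
            return []
        pending = {}
        with open(self.log_path) as log:
            for line in log:
                record = json.loads(line)
                if record["op"] == "intent":
                    pending[record["id"]] = record
                else:
                    pending.pop(record["id"], None)
        for record in pending.values():
            self.commit(record["id"], record["key"], record["value"])
        return sorted(pending)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = TinyIntentLog(d)
        store.update(1, "a", "x")
        store.log_intent(2, "b", "y")   # simulate a crash before commit
        print(TinyIntentLog(d).recover())
```

Recovery reads only the log rather than scanning all of the data, which is the reason logging shortens recovery time. The extra log write on every update is the source of the small performance penalty.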

Sync-on-close (DTFS): When enabled, file data is immediately written to disk when a file is closed, imitating DOS behavior. This feature significantly degrades system performance.
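An application can approximate sync-on-close for a single file, without enabling it for the whole filesystem, by forcing its data to disk before closing. The following Python sketch is an illustration only; the helper name `write_and_sync` is hypothetical, and whether `os.fsync` guarantees that the data has reached stable storage depends on the operating system and the hardware.

```python
import os
import tempfile


def write_and_sync(path, data):
    """Write `data` (bytes) to `path` and flush it to disk before closing.

    Illustrative helper (hypothetical name): an application-level
    approximation of sync-on-close for one file.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to write its cached copy out
    return os.path.getsize(path)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "note.txt")
    print(write_and_sync(target, b"saved\n"), "bytes written and synced")
```

Syncing only the files that need it limits the cost to those writes, whereas the filesystem-wide setting applies to every file that is closed.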

See also: