OpenBSD File System

Here I explain the data structures and the code flow of the file system subsystem of OpenBSD. I will start by describing the data structures and then follow the flow of the read() system call.

Data Structures

Following the read path

When an application calls read, the kernel takes control through the sys_read function. Once here, the first step is to find the file object belonging to the file descriptor passed by the application. Then, it populates an I/O vector describing the caller's buffer and calls dofilereadv.
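To make these two steps concrete, here is a minimal user-space sketch of a descriptor lookup followed by filling a one-entry I/O vector. It is not the kernel code: every toy_ name, the table size, and the struct layouts are assumptions made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NFILES 4                 /* hypothetical table size */

struct toy_file { int f_type; };     /* stand-in for the kernel's file object */
struct toy_iovec { void *iov_base; size_t iov_len; };
struct toy_fdtable { struct toy_file *ofiles[TOY_NFILES]; };

/* Map a descriptor to its file object; NULL models a bad descriptor. */
static struct toy_file *
toy_fd_lookup(struct toy_fdtable *fdp, int fd)
{
	if (fd < 0 || fd >= TOY_NFILES)
		return NULL;
	return fdp->ofiles[fd];
}

/* First half of the read path: resolve fd, then describe the user buffer. */
static int
toy_read_setup(struct toy_fdtable *fdp, int fd, void *buf, size_t nbyte,
    struct toy_iovec *iov)
{
	assert(fdp != NULL && iov != NULL);
	if (toy_fd_lookup(fdp, fd) == NULL)
		return EBADF;
	iov->iov_base = buf;             /* one-entry I/O vector */
	iov->iov_len = nbyte;
	return 0;                        /* the kernel would now call dofilereadv */
}

/* Small driver: returns 1 when both the good and the bad descriptor behave. */
int
toy_read_setup_demo(void)
{
	struct toy_file f = { 1 };
	struct toy_fdtable fdt = { { NULL, NULL, NULL, &f } };
	struct toy_iovec iov = { NULL, 0 };
	char buf[64];
	int ok, bad;

	ok = toy_read_setup(&fdt, 3, buf, sizeof(buf), &iov) == 0 &&
	    iov.iov_base == (void *)buf && iov.iov_len == sizeof(buf);
	bad = toy_read_setup(&fdt, 7, buf, sizeof(buf), &iov) == EBADF;
	return ok && bad;
}
```

The point of the vector is that everything below this layer sees a generic description of where the data should go, not the original system call arguments.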

dofilereadv checks for some errors and finally calls the read operation of the file structure. Note that this read operation can belong to any type of stream (such as a socket). Since we are interested in the file system, we will follow the vnode functions.
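The "read operation of the file structure" is a function pointer, so the same call reaches different code depending on what the descriptor refers to. The sketch below models that dispatch in user space. The fo_ types and both read routines are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fo_file;

/* Per-type operations table; only the read slot is modeled here. */
struct fo_ops {
	size_t (*fo_read)(struct fo_file *, char *, size_t);
};

struct fo_file {
	const struct fo_ops *f_ops;   /* chosen when the file was opened */
	const char *f_data;           /* hypothetical backing bytes */
};

/* Stand-in for the vnode read routine: copy from the backing bytes. */
static size_t
fo_vnode_read(struct fo_file *fp, char *dst, size_t len)
{
	size_t avail = strlen(fp->f_data);
	size_t n = len < avail ? len : avail;

	memcpy(dst, fp->f_data, n);
	return n;
}

/* Stand-in for the socket read routine: nothing is queued in this toy. */
static size_t
fo_socket_read(struct fo_file *fp, char *dst, size_t len)
{
	(void)fp; (void)dst; (void)len;
	return 0;
}

static const struct fo_ops fo_vnops = { fo_vnode_read };
static const struct fo_ops fo_socketops = { fo_socket_read };

/* The caller never inspects the file type; it only follows the pointer. */
size_t
fo_dispatch_read(struct fo_file *fp, char *dst, size_t len)
{
	assert(fp != NULL && fp->f_ops != NULL);
	return fp->f_ops->fo_read(fp, dst, len);
}

/* Driver: bytes read from a vnode-backed file plus a socket-backed one. */
size_t
fo_dispatch_demo(void)
{
	struct fo_file vf = { &fo_vnops, "hello" };
	struct fo_file sf = { &fo_socketops, "" };
	char buf[16];

	return fo_dispatch_read(&vf, buf, sizeof(buf)) +
	    fo_dispatch_read(&sf, buf, sizeof(buf));
}
```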

The read interface for files backed by vnodes is vn_read. This function handles some error checking and locking. Then, it calls VOP_READ, which takes care of reading from a vnode. Note that vn_read is just the file-level interface to the vnode.

VOP_READ populates the required arguments to call the actual file system read function. This function will be different depending on what type of file system holds the file. We will continue assuming an FFS file system and its ffs_read function.
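A rough way to picture this step: the generic wrapper packs its parameters into a single argument structure and calls whatever read routine the file system installed in the vnode. The following is a user-space sketch under that assumption. The vx_ names, the fields, and the toy read routine are all invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vx_vnode;

/* Arguments bundled into one struct before crossing into the file system. */
struct vx_read_args {
	struct vx_vnode *a_vp;
	size_t a_resid;               /* bytes still wanted (stands in for the uio) */
	int a_ioflag;
};

struct vx_vops {
	int (*vop_read)(struct vx_read_args *);
};

struct vx_vnode {
	const struct vx_vops *v_op;   /* set by the owning file system */
	int v_reads;                  /* hypothetical counter for the demo */
};

/* Toy file system read routine: pretends to satisfy the whole request. */
static int
vx_ffs_read(struct vx_read_args *ap)
{
	ap->a_vp->v_reads++;
	ap->a_resid = 0;
	return 0;
}

static const struct vx_vops vx_ffs_vops = { vx_ffs_read };

/* Generic wrapper: pack the arguments, then call through the vnode's table. */
int
vx_vop_read(struct vx_vnode *vp, size_t resid, int ioflag)
{
	struct vx_read_args a;

	assert(vp != NULL && vp->v_op != NULL);
	if (vp->v_op->vop_read == NULL)
		return EOPNOTSUPP;        /* this file system has no read routine */
	a.a_vp = vp;
	a.a_resid = resid;
	a.a_ioflag = ioflag;
	return vp->v_op->vop_read(&a);
}

/* Driver: one read through the wrapper; returns how often the fs was entered. */
int
vx_demo(void)
{
	struct vx_vnode vn = { &vx_ffs_vops, 0 };
	int error = vx_vop_read(&vn, 512, 0);

	return error == 0 ? vn.v_reads : -1;
}
```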

I did not read ffs_read in much detail, since this function is file system dependent and my intention is to show the normal read path rather than how FFS works. What is important is that once the function resolves where to find the requested data, it calls bread. An important detail is that the size passed to bread is the file system block size.
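Because bread works in whole file system blocks, a byte offset has to be split into a block number and an offset inside that block. The arithmetic can be sketched as follows. The helper names are mine, and the 16384-byte block size used in the usage note is only an assumed value.

```c
#include <assert.h>
#include <stdint.h>

/* Logical block number that contains byte offset off. */
uint64_t
blk_lbn(uint64_t off, uint32_t bsize)
{
	assert(bsize != 0);
	return off / bsize;
}

/* Position of byte offset off inside its block. */
uint32_t
blk_off(uint64_t off, uint32_t bsize)
{
	assert(bsize != 0);
	return (uint32_t)(off % bsize);
}

/* Bytes that can be copied from this block: limited by the end of the
 * block and by how much the caller still wants (resid). */
uint64_t
blk_xfer(uint64_t off, uint32_t bsize, uint64_t resid)
{
	uint64_t room = bsize - blk_off(off, bsize);

	return resid < room ? resid : room;
}
```

With a 16384-byte block, offset 20000 falls in block 1 at position 3616, so at most 12768 bytes can come from that block before the next one has to be requested.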

bread is one of the few interfaces to request blocks from a vnode. It does not do much; in fact, it only calls bio_doread and waits until the I/O is complete.

bio_doread seems to be the function that connects vnodes with the underlying layers. It calls getblk to get the buf associated with the pair (vnode, lbn), where lbn is the logical block number within the file. If the buffer does not yet hold valid data, it issues an I/O to fill it using VOP_STRATEGY.
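The effect of this check is that a block already sitting in a buffer costs no I/O. Here is a small user-space model of that behavior. The br_ names, the b_done flag, the fixed array standing in for getblk, and the fake disk contents are all assumptions. The I/O is synchronous here, so the waiting that bread does is not modeled.

```c
#include <assert.h>
#include <string.h>

#define BR_NBLK  4       /* hypothetical: blocks in the toy file */
#define BR_BSIZE 8       /* hypothetical: toy block size in bytes */

struct br_buf {
	int b_done;                 /* nonzero once b_data holds valid contents */
	char b_data[BR_BSIZE];
};

static struct br_buf br_cache[BR_NBLK];   /* stand-in for getblk's lookup */
static int br_io_count;                   /* how many "disk" reads happened */

/* Stand-in for the strategy routine: fill the buffer from a fake disk. */
static void
br_strategy(struct br_buf *bp, int lbn)
{
	memset(bp->b_data, 'A' + lbn, BR_BSIZE);
	bp->b_done = 1;
	br_io_count++;
}

/* Get the buffer for lbn and start I/O only if it has no valid data. */
static struct br_buf *
br_doread(int lbn)
{
	struct br_buf *bp;

	assert(lbn >= 0 && lbn < BR_NBLK);
	bp = &br_cache[lbn];
	if (!bp->b_done)
		br_strategy(bp, lbn);
	return bp;
}

/* Toy bread: the I/O above already completed, so there is nothing to wait for. */
struct br_buf *
br_bread(int lbn)
{
	return br_doread(lbn);
}

/* Driver: read the same block twice and report the number of I/Os issued. */
int
br_demo(void)
{
	br_bread(2);
	br_bread(2);
	return br_io_count;
}
```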

getblk looks for the buffer in the buffer cache. The buffer cache is implemented as a tree in the vnode, indexed by the block number within the file. If the buffer is found, it is marked as a hit and returned. Otherwise, buf_get is called to get a fresh buffer. If the buffer is found but is busy doing I/O, getblk sleeps and tries again when awakened.
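The lookup can be pictured as a search in a per-vnode tree keyed by block number, with an insert on a miss. The sketch below uses a plain unbalanced binary search tree for brevity, which is my simplification and not a claim about the kernel's tree type. The gb_ names are hypothetical, and the busy-buffer sleep and retry is left out.

```c
#include <assert.h>
#include <stdlib.h>

struct gb_buf {
	long b_lblkno;                 /* key: block number within the file */
	struct gb_buf *left, *right;
};

struct gb_vnode {
	struct gb_buf *v_bufs;         /* root of this vnode's buffer tree */
	int hits, misses;
};

/* Return the buffer for lbn, creating and inserting one on a miss. */
struct gb_buf *
gb_getblk(struct gb_vnode *vp, long lbn)
{
	struct gb_buf **pp;

	assert(vp != NULL);
	pp = &vp->v_bufs;
	while (*pp != NULL) {
		if (lbn == (*pp)->b_lblkno) {
			vp->hits++;
			return *pp;
		}
		pp = lbn < (*pp)->b_lblkno ? &(*pp)->left : &(*pp)->right;
	}
	*pp = calloc(1, sizeof(**pp));   /* stands in for buf_get */
	if (*pp == NULL)
		return NULL;
	(*pp)->b_lblkno = lbn;
	vp->misses++;
	return *pp;
}

/* Release every buffer in the tree. */
static void
gb_free(struct gb_buf *bp)
{
	if (bp == NULL)
		return;
	gb_free(bp->left);
	gb_free(bp->right);
	free(bp);
}

/* Driver: blocks 5, 2, 5 should give two misses and one hit. */
int
gb_demo(void)
{
	struct gb_vnode vn = { NULL, 0, 0 };
	struct gb_buf *a = gb_getblk(&vn, 5);
	struct gb_buf *b = gb_getblk(&vn, 2);
	struct gb_buf *c = gb_getblk(&vn, 5);
	int result = (a != NULL && b != NULL && a == c) ?
	    vn.hits * 10 + vn.misses : -1;

	gb_free(vn.v_bufs);
	return result;
}
```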

buf_get is in charge of returning a fresh buffer. As a side task, it controls the size of the buffer cache. Moreover, if the number of clean buffers falls below a threshold, buf_get calls a flusher to increase this number.