Here I explain the data structures and the code flow of the file system
subsystem of OpenBSD. I will start by describing the data structures and
then move on to the flow of the read() system call.
process:
This structure describes a process in the system. This is in contrast to
a thread of execution, which is represented by proc. The process
structure represents a group of threads all belonging to a single
process.

proc:
This represents a thread of execution in the system. It has all the
context required to execute, including the address space and the set of
registers.

filedesc:
The table of files opened by a process. There is one instance of this
per process structure, shared by all of its threads. This structure has
a pointer to a file array that holds one file structure per open file.
The index in this array corresponds to the file descriptor given to the
user.

file:
This structure keeps the information for a single open file. It has an
offset within the file and an object pointer to the underlying data
structure, which could be a vnode, a socket, a kqueue, among others.
Moreover, it has a set of operations that changes depending on the type
of file. In the case of a vnode, for example, the read operation points
to vn_read.

vnode:
This structure links the abstraction of a file to the structure of some
particular FS. It has a set of operations that point to functions
implemented by some particular FS like ext2, UFS, or FFS. It also has a
private object pointer that holds the private data of the underlying
file system, like the inode in the case of FFS.

The read path

When an application calls read(), the kernel takes control through the
sys_read function. Once here, the first step is to find the file object
belonging to the file descriptor passed by the application. Then, it
populates an I/O vector and calls dofilereadv.
dofilereadv takes care of checking for some errors and finally calls the
read operation of the file structure. Note that this read operation
could belong to any type of stream (like sockets). Since we are focused
on the file system, we will follow the vnode functions.
The read interface for files backed by vnodes is vn_read. This function
takes care of some error checking and locking. Then, it calls VOP_READ,
which takes care of reading from a vnode. Note that vn_read is just an
interface for a file.
VOP_READ populates the required arguments to call the actual file
system read function. This function differs depending on which type of
file system holds the file. We will continue assuming an FFS file
system and its ffs_read function.
I did not read ffs_read in much detail, since this function is file
system dependent and my intention is to show the normal read path rather
than how FFS works. What is important is that once the function resolves
where to find the requested data, it calls bread. An important detail is
that the size passed to bread is the file system block size.
bread is one of the few interfaces for requesting blocks from a vnode.
It does not do much; in fact, it only calls bio_doread and waits until
the I/O is complete.
bio_doread seems to be the function that connects vnodes with the
underlying layers. It calls getblk to get the buf associated with the
(vnode, lbn) pair. If the buffer is empty, it issues an I/O to fill it
using VOP_STRATEGY.
getblk looks for the buffer in the buffer cache. The buffer cache is
implemented as a tree in the vnode, indexed by the block number within
the file. If the buffer is found, it is counted as a hit and returned.
Otherwise, buf_get is called to get a fresh buffer. If the buffer is
found but is busy doing I/O, getblk sleeps and tries again when
awakened.
buf_get is in charge of returning a fresh buffer. As a side task, it
controls the size of the buffer cache. Moreover, if the number of clean
buffers falls below a threshold, buf_get calls a flusher to increase
this number.