Files
The file system consists of Regular files, Directories, Character special files and Block special files. Regular files are files that the user sees, generally ASCII or binary. A file will execute only if it has the correct format. A simple binary file from early UNIX has for example five sections: header, text, data, relocation bits and the symbol table. The header consists of a magic number that marks the file as executable or not. The text and data is the program itself which are loaded into memory when the program is executed and relocated by the relocation bits. The symbol table is for debugging. There are other kinds of binary files as well. Every operating system must recognize its own executable file.
Access
Early operating systems could only access a file in a sequential manner, meaning they were forced to read the whole file and could not skip parts or read out of order.
Meta data
Files have additional information about the file itself called the file's attributes or metadata. This information could be when the file was created, last modified and many more properties.
Operations
There are a number of operations an operating system allow for storage and retrieval. The following operations are the most common.
Operation | Description |
---|---|
Create | Creation of the file without no content but some of the attributes are set. |
Delete | When a file is not needed anymore we could free up its disk space by deleting it. |
Open | Loads the file's attributes and addresses for faster access later on into memory. This must be called before using the file. |
Close | When we no longer need to access the file we should free up its space in memory. |
Read | Fetch the data more a file. |
Write | Write data to the file. |
Append | Append data to a file. This is a special case of write. |
Seek | Repositions the file access pointer to a specific place in the file to allow reads from that position. |
Rename | Change the name of the file. |
Set attributes | Set attributes that are settable. |
Get attributes | List attributes. |
Directories
Directories are used to organize files in multiple layers. Similarly to files, directories also have a number of operations allowed for storage and retrieval.
Operation | Description |
---|---|
Create | Create the directory. Dot and DotDot directories are automatically created. |
Delete | Delete the directory. Only empty directories can be deleted if not recursively forced. |
Opendir | Lists all files and subdirectories in the directory. This puts the information in memory. |
Closedir | When we no longer need to access the directory we should free up its space in memory. |
Readdir | Returns the next entry in the directory. |
Rename | Change the name of the directory. |
Link | Linking allows a file to appear in more than one directory. |
Unlink | Removes a directory entry. This is what UNIX uses to delete files. The file is removed if it is only present in one directory, else the link is removed. |
Implementation
File systems exists on the disk. Disks could be divided into several different partitions with independent file systems. We could for example have Windows on one partition with its own file system and Linux on another partition with another file system one the same computer. Sector 0 on the disk is called the MBR , which is used to boot the computer. This sector also contains information about all other partitions. Every partition starts with a boot block even if they are not bootable, because they might in the future. So the first thing the MBR program does is to execute the boot block once it has located the active partition. One of the partitions is marked as active.
Files
Contiguous allocation
One of the simplest schemes is contiguous allocation. A file that is 50KB with 1KB disk blocks would allocate 50 consecutive blocks. This technique is very simple to implement because we only need to keep track of two things for each file, that is the disk address and the number of disk blocks. The performance is also very good because we only one seek is needed. The one drawback is that the disk over time becomes fragmented, holes in the disk address space. We could keep track of the holes to reuse the space but then we need to know the final size of the file to choose a hole that fits. So if the is small at the beginning and we place it at a hole that perfectly fits what do we do if the size of this file increases?
Linked-list allocation
Another approach is to keep every block of a file as a linked list. This way the first block would point to the next, but does not need to be right after the first block. This would allow us to use every disk block. No space is lost due to fragmentation. Sequential reading would be fast, but random access slower. To get to block we would have to read blocks. To avoid this we could keep each pointer the disk blocks in a table in memory. However this also comes with the disadvantage of requiring very much memory for big tables since the whole table needs to be in memory at all time. The table is called FAT .
I-nodes
Yet another approach is to associate each file with a data structure called i-node (index-node). With the i-node we can locate every disk block of the file. The i-node lists the attributes and disk addresses of the file. In contrast with linked list, we only need to keep the i-node in memory when accessing the file. I-nodes are also much more memory efficient than linked list. With files occupying bytes we need to reserve bytes in total.