Difference between revisions of "UNIX file system"
Revision as of 21:01, 24 May 2022
The UNIX file system was one of the first file systems to internally separate directories (the catalogues of files) from the meta-data about a file (such as the information about where the data of the file was stored).
It did this through the mechanism of the inode, a separate structure in which most of the information about a file was kept. Directories were implemented as an abstraction on top of that layer, using 'file' objects provided by the inode layer; they held only mappings from file names (visible to users) to inode numbers.
The inodes themselves were held in a separate area of the disk, the ilist, outside the area used to hold the blocks of the files themselves. A special block, the root block (also called the super-block), initially held in block 0 and later in block 1 of the disk partition, recorded which blocks in the partition held the inodes, which blocks were available to hold data, and other general information about the file system on that partition.
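The name-to-inode mapping can be illustrated with a short sketch. It assumes a fixed 16-byte directory entry (a 2-byte little-endian inode number followed by a 14-byte, NUL-padded name), which matches the classic Research UNIX layout as far as I know but is not stated above; the convention that inode number 0 marks an unused slot, and the helper names `pack_entry` and `parse_directory`, are likewise assumptions made for illustration.

```python
import struct

# Assumed entry layout: 16-bit inode number, then a 14-byte NUL-padded name.
ENTRY = struct.Struct("<H14s")


def pack_entry(inum, name):
    """Build one directory entry (hypothetical helper for this sketch)."""
    return ENTRY.pack(inum, name.encode("ascii"))


def parse_directory(data):
    """Return {file name: inode number} for the raw contents of a directory."""
    entries = {}
    usable = len(data) - len(data) % ENTRY.size  # ignore a trailing partial entry
    for off in range(0, usable, ENTRY.size):
        inum, raw = ENTRY.unpack_from(data, off)
        if inum == 0:  # assumed convention: 0 marks an unused slot
            continue
        entries[raw.rstrip(b"\0").decode("ascii")] = inum
    return entries


raw_dir = (pack_entry(1, ".") + pack_entry(1, "..")
           + pack_entry(0, "gone") + pack_entry(42, "passwd"))
print(parse_directory(raw_dir))
```

Note that the directory holds nothing about the file except its name and inode number; size, ownership and block locations all have to be fetched from the inode.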
Details
Starting with UNIX Fourth Edition, information about which blocks were 'free', and available to be allocated to files, was kept in a 'free list', the head of which was kept in the root block. (Prior to V4, this information was kept in a bit array.) When a partition was initialized as a file system, the free list was constructed to hold all the blocks available for files. Similarly, starting in V4, the root block also held a cache of free inodes; when this was exhausted, a sweep of the inode table on the disk refilled it. (Hopefully!)
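A toy model may make the two mechanisms concrete. It is a simplification under stated assumptions, not the historical on-disk format: the free list is modelled as a simple chain in which each free block stores the number of the next one, block number 0 terminates the chain, and the class name `ToyFileSystem`, the `cache_size` parameter and all method names are hypothetical.

```python
class ToyFileSystem:
    """In-memory sketch of a free-block list and a free-inode cache."""

    def __init__(self, first_data_block, nblocks, ninodes, cache_size=4):
        # first_data_block must be > 0, since 0 is used as the end-of-list mark.
        self.next_free = {}   # models the link stored inside each free block
        self.free_head = 0    # head of the free list, kept in the "root block"
        # Initialising the file system: put every data block on the free list.
        for b in range(nblocks - 1, first_data_block - 1, -1):
            self.free_block(b)
        self.inode_in_use = [False] * (ninodes + 1)  # index 0 is unused
        self.cache_size = cache_size
        self.inode_cache = []  # small cache of free inode numbers

    def free_block(self, b):
        self.next_free[b] = self.free_head
        self.free_head = b

    def alloc_block(self):
        b = self.free_head
        if b == 0:
            raise OSError("no free blocks")
        self.free_head = self.next_free.pop(b)
        return b

    def alloc_inode(self):
        if not self.inode_cache:
            # Cache exhausted: sweep the inode table to refill it.
            for i in range(1, len(self.inode_in_use)):
                if not self.inode_in_use[i]:
                    self.inode_cache.append(i)
                    if len(self.inode_cache) == self.cache_size:
                        break
        if not self.inode_cache:
            raise OSError("no free inodes")  # the sweep found nothing
        i = self.inode_cache.pop()
        self.inode_in_use[i] = True
        return i

    def free_inode(self, i):
        self.inode_in_use[i] = False
        if len(self.inode_cache) < self.cache_size:
            self.inode_cache.append(i)


fs = ToyFileSystem(first_data_block=10, nblocks=14, ninodes=6, cache_size=2)
print(fs.alloc_block(), fs.alloc_inode())
```

The point of the cache is that most inode allocations touch only the root block; the comparatively expensive sweep of the inode table happens only when the cache runs dry.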
For the sake of efficiency (there are many more small files than large ones), the location of the first few blocks of a file was kept in the inode itself. If a file needed to grow past that, indirect blocks were used; these were blocks (stored in the area of the disk used for file data) which held an array of the block numbers of the actual blocks of the file; the block numbers of the indirect blocks were held in the inode. For even larger files, the last indirect block listed in the inode did not contain the block numbers of data blocks, but rather the block numbers of 'double-indirect blocks', which then held the block numbers of 'ordinary' indirect blocks.
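The lookup path for a given logical block can be sketched as follows. The constants are deliberately tiny, hypothetical values chosen to keep the arithmetic visible; they are not the sizes used by any particular UNIX version, and the function name `locate` is an assumption of this sketch.

```python
N_DIRECT = 4   # hypothetical: block numbers held directly in the inode
N_SINGLE = 2   # hypothetical: inode slots that point to indirect blocks
PTRS = 8       # hypothetical: block numbers that fit in one indirect block


def locate(n):
    """Describe how logical block n of a file is found from its inode."""
    if n < 0:
        raise ValueError("negative block number")
    if n < N_DIRECT:
        return ("direct", n)                    # slot n in the inode
    n -= N_DIRECT
    if n < N_SINGLE * PTRS:
        return ("indirect", n // PTRS, n % PTRS)  # (indirect block, entry)
    n -= N_SINGLE * PTRS
    if n < PTRS * PTRS:
        # (entry in the double-indirect block, entry in the indirect block)
        return ("double", n // PTRS, n % PTRS)
    raise ValueError("file too large for this layout")


for block in (0, 4, 19, 20, 83):
    print(block, locate(block))
```

With these toy values the largest file is 4 + 2×8 + 8×8 = 84 blocks; small files never need more than the inode itself, which is the efficiency argument made above.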
The file system allowed 'holey' files; in the tables of block numbers (either in the inode, or in both types of indirect blocks), a block number of 0 indicated that block had never been written to, and thus never allocated. The UNIX file I/O system allowed users to write data wherever they wanted within a file, leaving gaps if they so desired. Such 'missing' blocks contained all zeros when read.
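A minimal sketch of this behaviour, assuming a 512-byte block and a flat table of block numbers (no indirect blocks); the class `SparseFile` and its toy allocator are hypothetical and exist only to show that a table entry of 0 reads back as zeros without consuming a disk block.

```python
BLOCK_SIZE = 512  # assumed block size for this sketch


class SparseFile:
    def __init__(self):
        self.table = []      # logical block -> disk block number; 0 = hole
        self.disk = {}       # disk block number -> contents
        self.next_block = 1  # toy allocator; block number 0 is never handed out
        self.size = 0

    def write(self, offset, data):
        for i, byte in enumerate(data):  # byte at a time: slow but simple
            lbn, within = divmod(offset + i, BLOCK_SIZE)
            while len(self.table) <= lbn:
                self.table.append(0)     # skipped-over blocks stay unallocated
            if self.table[lbn] == 0:
                self.table[lbn] = self.next_block
                self.disk[self.next_block] = bytearray(BLOCK_SIZE)
                self.next_block += 1
            self.disk[self.table[lbn]][within] = byte
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        out = bytearray()
        for pos in range(offset, min(offset + length, self.size)):
            lbn, within = divmod(pos, BLOCK_SIZE)
            blk = self.table[lbn]
            out.append(self.disk[blk][within] if blk else 0)  # holes read as 0
        return bytes(out)


f = SparseFile()
f.write(0, b"head")
f.write(5 * BLOCK_SIZE, b"tail")
print(f.table, len(f.disk), f.size)
```

Here the file is six blocks long but occupies only two disk blocks; the four entries in between remain 0.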
Inodes also held information such as the length of the file, whether the file was actually a directory (treated specially by the system, in that only the system could write to it, to prevent users from damaging the file system), the owner of the file, the file's protection, and last modification and access times, etc. After V4, the inode indicated whether it was a 'special file', for a device, rather than an ordinary file; before V4, inodes 1-40 were reserved for special files.
Finally, the inode held a count of the number of hard links to the file; a given file could appear in more than one directory, and the count was needed to garbage collect the file's blocks when the last directory entry for it was deleted. All directory entries for a file were equal; none had special status.
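The role of the link count can be shown with a small model. All names here (`Inode`, `Volume`, `link`, `unlink`) are hypothetical, directories are plain dictionaries, and the sketch ignores complications such as files that are still open when their last name is removed.

```python
class Inode:
    def __init__(self, blocks):
        self.nlink = 0              # number of directory entries naming this inode
        self.blocks = list(blocks)  # data blocks owned by the file


class Volume:
    def __init__(self):
        self.inodes = {}        # inode number -> Inode
        self.free_blocks = []   # blocks returned by deleted files

    def create(self, inum, blocks):
        self.inodes[inum] = Inode(blocks)

    def link(self, directory, name, inum):
        directory[name] = inum
        self.inodes[inum].nlink += 1

    def unlink(self, directory, name):
        inum = directory.pop(name)
        inode = self.inodes[inum]
        inode.nlink -= 1
        if inode.nlink == 0:
            # Last name gone: reclaim the file's blocks and its inode.
            self.free_blocks.extend(inode.blocks)
            del self.inodes[inum]


vol = Volume()
dir_a, dir_b = {}, {}
vol.create(7, blocks=[100, 101])
vol.link(dir_a, "report", 7)
vol.link(dir_b, "copy", 7)     # second name for the same file
vol.unlink(dir_a, "report")    # file survives: one link remains
vol.unlink(dir_b, "copy")      # last link removed: blocks are freed
print(vol.free_blocks)
```

Neither name is the 'original': removing them in the opposite order gives the same result, which is what the equality of directory entries means in practice.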
Evolution
The UNIX file system, as described above, lasted quite a long time; minor detail changes were made over time (e.g. in Unix V7, block numbers were increased in size from 16 bits to 32), but in general the above description of the details applied to most versions of UNIX before BSD 4.1b.
As UNIX became used for larger and larger machines, two aspects of the file system became problematic: performance, and robustness.
The former was basically caused by the tendency of the UNIX file system to scatter the blocks of a file (especially large data files, where performance was more of an issue) across the entire disk, once the file system had been in use for a while. The latter was caused by the fact that there was only a single copy of much critical information, e.g. the root block.
A major modification to the UNIX file system, the BSD Fast File System, was therefore made. It kept the basic concept of the two-level system, with inodes and directories, but made extensive detail changes (e.g. having multiple copies of the root block, scattered across the disk) to address the performance and robustness issues.

