Difference between revisions of "UNIX file system"

From Computer History Wiki

Revision as of 22:02, 7 August 2017

The UNIX file system was one of the first file systems to internally separate directories (the catalogues of files) from the meta-data about a file (such as the information about where the data of the file was stored).

It did this through the mechanism of the inode, a separate structure in which most of the information about a file was kept. Directories were implemented as an abstraction on top of that layer (using 'file' objects provided by the inode layer), and held only mappings from file names (visible to users) to inode numbers.
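The two layers can be illustrated with a minimal sketch. The names here (Inode, inode_table, root_directory, lookup) and the sample values are hypothetical, chosen only for illustration; they are not taken from any UNIX source.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    # Metadata lives in the inode, not in the directory entry.
    is_directory: bool = False
    length: int = 0
    block_numbers: list = field(default_factory=list)


# Hypothetical inode table, indexed by inode number.
inode_table = {
    1: Inode(is_directory=True),
    2: Inode(length=1200, block_numbers=[40, 41, 42]),
}

# A directory holds only file name -> inode number mappings.
root_directory = {"notes.txt": 2}


def lookup(directory, name):
    """Resolve a file name to (inode number, inode)."""
    inode_number = directory[name]
    return inode_number, inode_table[inode_number]
```

Everything about the file other than its name is reached through the inode number, which is why the directory entry itself can stay so small.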

The inodes themselves were held in a separate area of the disk, the ilist, outside the area used to hold the blocks of the files themselves. A special block, the root block (initially, block 0 on the disk partition), held information about which blocks in the partition held the inodes, which blocks were available to hold data, and other general information about the file system on that partition.

Details

Information about which blocks were 'free', and available to be allocated to files, was kept in a 'free list', the head of which was kept in the root block. (When a partition was initialized as a file system, the free list was constructed to hold all the blocks available for files.) The root block also held a cache of free inodes; when this was exhausted, a sweep of the inode table on the disk refilled it. (Hopefully!)
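A rough sketch of this bookkeeping follows. It assumes the free list is a simple chain in which each free block records the number of the next one, and that block number 0 ends the chain; the real on-disk layout may well have differed. The class and method names, and the cache size, are hypothetical.

```python
class RootBlock:
    def __init__(self, data_blocks, inode_count, cache_size=4):
        # On initialization, thread every data block onto the free list.
        # next_free[b] is the block after b on the list; 0 ends the list.
        self.next_free = {}
        self.free_head = 0
        for block in reversed(list(data_blocks)):
            self.next_free[block] = self.free_head
            self.free_head = block
        # in_use[i] becomes True once inode i has been allocated.
        self.in_use = {i: False for i in range(1, inode_count + 1)}
        self.cache_size = cache_size
        self.free_inode_cache = []

    def alloc_block(self):
        if self.free_head == 0:
            raise OSError("no free blocks")
        block = self.free_head
        self.free_head = self.next_free.pop(block)
        return block

    def free_block(self, block):
        # A freed block becomes the new head of the list.
        self.next_free[block] = self.free_head
        self.free_head = block

    def alloc_inode(self):
        if not self.free_inode_cache:
            # Cache exhausted: sweep the inode table to refill it.
            for number, used in self.in_use.items():
                if not used:
                    self.free_inode_cache.append(number)
                    if len(self.free_inode_cache) == self.cache_size:
                        break
        if not self.free_inode_cache:
            raise OSError("no free inodes")
        number = self.free_inode_cache.pop()
        self.in_use[number] = True
        return number
```

The sweep only runs when the cache is empty, so most inode allocations touch nothing but the root block.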

For the sake of efficiency (there are many more small files than large ones), the location of the first few blocks of a file was kept in the inode itself. If a file needed to grow past that, indirect blocks were used; these were blocks (stored in the area of the disk used for file data) which held an array of the block numbers of the actual blocks of the file; the block numbers of the indirect blocks were held in the inode. For even larger files, the last indirect block listed in the inode did not contain the block numbers of data blocks, but rather the block numbers of 'double-indirect blocks', which then held the block numbers of 'ordinary' indirect blocks.
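The mapping from a position in the file to a disk block can be sketched as below. This is a simplified illustration, not the historical layout: it assumes one ordinary indirect block and one double-indirect block per inode, and the sizes DIRECT and PER_BLOCK are made-up small values so the example data stays readable.

```python
DIRECT = 4      # assumed number of direct slots in the inode
PER_BLOCK = 4   # assumed number of block numbers per indirect block


def resolve(inode, disk, n):
    """Return the disk block number holding logical block n of a file."""
    if n < DIRECT:
        return inode["direct"][n]
    n -= DIRECT
    if n < PER_BLOCK:
        # One level of indirection: the indirect block lists data blocks.
        return disk[inode["indirect"]][n]
    n -= PER_BLOCK
    if n < PER_BLOCK * PER_BLOCK:
        # Two levels: the double-indirect block lists ordinary indirect blocks.
        indirect = disk[inode["double_indirect"]][n // PER_BLOCK]
        return disk[indirect][n % PER_BLOCK]
    raise ValueError("logical block beyond maximum file size")


# Hypothetical example data: disk maps a block number to its contents.
disk = {
    50: [20, 21, 22, 23],   # ordinary indirect block
    60: [51, 52, 0, 0],     # double-indirect block
    51: [30, 31, 32, 33],
    52: [34, 35, 36, 37],
}
inode = {"direct": [10, 11, 12, 13], "indirect": 50, "double_indirect": 60}
```

Small files are resolved from the inode alone; only larger files pay for the extra one or two block reads.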

The file system allowed 'holey' files; in the tables of block numbers (either in the inode, or in both types of indirect blocks), a block number of 0 indicated that block had never been written to, and thus never allocated. The UNIX file I/O system allowed users to write data wherever they wanted within a file, leaving gaps if they so desired. Such 'missing' blocks contained all zeros when read.
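The treatment of holes can be sketched as follows, using a flat table of block numbers for simplicity (no indirect blocks). BLOCK_SIZE, the function names, and the free_blocks list are assumptions made for the illustration.

```python
BLOCK_SIZE = 512  # assumed block size in bytes


def read_block(block_table, disk, n):
    """Read logical block n; unallocated blocks read back as zeros."""
    block_number = block_table[n] if n < len(block_table) else 0
    if block_number == 0:
        return bytes(BLOCK_SIZE)
    return disk[block_number]


def write_block(block_table, disk, free_blocks, n, data):
    """Write logical block n, allocating a disk block only on first write."""
    while len(block_table) <= n:
        block_table.append(0)  # skipped-over blocks stay unallocated
    if block_table[n] == 0:
        block_table[n] = free_blocks.pop()
    disk[block_table[n]] = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]
```

Writing only logical block 3 of an empty file leaves entries 0 to 2 at zero, so the file consumes a single data block while still reading back as four blocks long.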

Inodes also held information such as the length of the file, whether the file was actually a directory (treated specially by the system, in that only the system could write to it, to prevent users from damaging the file system), the owner of the file, the file's protection, and the last modification and access times.

Finally, the inode held a count of the number of hard links to the file; a given file could appear in more than one directory, and the count was needed to garbage collect the file's blocks when the last directory entry for it was deleted. All directory entries for a file were equal; none had special status.
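The role of the link count can be illustrated with a small sketch. Directories are modelled as plain dictionaries from names to inode numbers, and the FileSystem class with its methods is hypothetical.

```python
class FileSystem:
    def __init__(self):
        self.link_count = {}   # inode number -> number of directory entries
        self.blocks = {}       # inode number -> data block numbers
        self.free_blocks = []

    def create(self, directory, name, inode_number, blocks):
        self.blocks[inode_number] = list(blocks)
        self.link_count[inode_number] = 0
        self.link(directory, name, inode_number)

    def link(self, directory, name, inode_number):
        # Every entry is an equal reference; none is the "original".
        directory[name] = inode_number
        self.link_count[inode_number] += 1

    def unlink(self, directory, name):
        inode_number = directory.pop(name)
        self.link_count[inode_number] -= 1
        if self.link_count[inode_number] == 0:
            # Last entry gone: reclaim the file's blocks.
            self.free_blocks.extend(self.blocks.pop(inode_number))
            del self.link_count[inode_number]
```

Removing the entry under which the file was first created has no special effect; the blocks are reclaimed only when the count reaches zero, whichever entry happens to be last.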

Evolution

The UNIX file system, as described above, lasted quite a long time; minor detail changes were made over time (e.g. in Unix V7, block numbers were increased in size from 16 bits to 32), but in general the above description of the details applied to most versions of UNIX before BSD 4.1b.

As UNIX became used for larger and larger machines, two aspects became problematic: performance, and robustness. The former was basically caused by the tendency of the UNIX file system to scatter the blocks of a file (especially large data files, where performance was more of an issue) across the entire disk, once the file system had been in use for a while. The latter was caused by the fact that there was only a single copy of much critical information, e.g. the root block.

A major modification to the UNIX file system was therefore undertaken: the BSD Fast File System. It kept the basic concept of the two-level system, with inodes and directories, but made extensive changes of detail (e.g. keeping multiple copies of the root block, scattered across the disk) to address the performance and robustness issues.