UNIX* System V and 4.1C BSD
This is a multipart paper, posted to Usenet, comparing 4.1C BSD and UNIX System V:
Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!decvax!harpo!seismo!hao!cires!nbires!ut-ngp!ut-sally!jbc
From: j...@ut-sally.UUCP
Newsgroups: net.sources
Subject: Compare.A
Message-ID: <63@ut-sally.UUCP>
Date: Sat, 6-Aug-83 22:02:53 EDT
Article-I.D.: ut-sally.63
Posted: Sat Aug 6 22:02:53 1983
Date-Received: Sun, 7-Aug-83 17:49:00 EDT
UNIX* System V and 4.1C BSD

John Chambers
Office of Academic Computing & Biostatistics
University of Texas Medical Branch, Galveston

John Quarterman
Computation Center
University of Texas at Austin

{ihnp4,decvax!{eagle,allegra}}!ut-ngp!{jbc,jsq}
{jbc,jsq}@{ut-sally.UUCP,{utexas-11,utexas-780}.ARPA}

Presented at the July 1983 USENIX Conference in Toronto.

© Copyright 1983 by the Regents of the University of Texas.

ABSTRACT

This paper compares System V[1] (the UNIX system which Western Electric is currently licensing) and 4.1C BSD[2] (the final precursor to 4.2BSD, the research UNIX system developed for DARPA[3] by the University of California at Berkeley), based on experience with both systems on a DEC VAX-11/780[4]. The comparison covers several areas and includes comments organized by manual section on numerous specific features (languages, shells, text editing and formatting, devices, etc.), plus more general and detailed discussions of such topics as: installation and configuration; sources and documentation; groups and identifiers; file systems; interprocess communications; networks; performance (including some tentative benchmarks); and vendor support. Common features are mostly left to the manuals, in order to better concentrate on differences. This is meant to be a qualitative comparison, intended to serve only as a guide for further study.

__________
* UNIX is a Trademark of Bell Telephone Laboratories, Inc.
1. See Appendix A for the official names of Bell UNIX Systems.
2. See Appendix A for details about Berkeley Software Distributions (BSD).
3. Defense Advanced Research Projects Agency (DARPA), formerly ARPA.
4. VAX, PDP, UNIBUS, MASSBUS, and SBI are Trademarks of Digital Equipment Corporation (DEC).

CONTENTS

1. Introduction
   1.1 Intent
   1.2 Format of the Paper
   1.3 Disclaimers and Acknowledgments
2. Manual Sections
   2.1 Commands
      2.1.1 User convenience
      2.1.2 Programming support environments
      2.1.3 Shells
      2.1.4 Formatting and typesetting
      2.1.5 Graphics
      2.1.6 Ingres
      2.1.7 Text editors
      2.1.8 Electronic mail
      2.1.9 Printing
   2.2 System Calls
      2.2.1 Vfork and fork
      2.2.2 Reboot
      2.2.3 Setpgrp
      2.2.4 Group system calls
      2.2.5 Ioctls
      2.2.6 Open and fcntl
      2.2.7 4.1C BSD file system calls
      2.2.8 Timing
      2.2.9 IPC
   2.3 Libraries and Subroutines
      2.3.1 Common Object File Format routines
      2.3.2 Utmp routines
      2.3.3 F77 library
      2.3.4 Knuth algorithms
      2.3.5 Software signals and matherr
      2.3.6 Stdio buffering
      2.3.7 Printf
      2.3.8 String routines
      2.3.9 Network library
   2.4 Devices
      2.4.1 Tty
      2.4.2 DH-11
      2.4.3 KMC-11B
      2.4.4 VPM
      2.4.5 Synchronous terminal
      2.4.6 BLIT
      2.4.7 Ptys
      2.4.8 Generalized disk driver
      2.4.9 Generalized tape driver
   2.5 File Formats
      2.5.1 A.out
      2.5.2 Ar
      2.5.3 Fs
      2.5.4 Termcap and descendants
   2.6 Games
      2.6.1 System V games
      2.6.2 4.1C BSD ASCII graphics games
      2.6.3 PDP-11 compatibility
   2.7 Miscellany
      2.7.1 File system hierarchy
   2.8 Maintenance
      2.8.1 Init, getty, and login
      2.8.2 Shutdown, halt, and reboot
      2.8.3 Backups
      2.8.4 Fsck, fsdb, etc.
      2.8.5 Monitoring and debugging
      2.8.6 Accounting
3. Installation and Configuration
   3.1 Installation
   3.2 Configuration
   3.3 Transition
4. Sources and Documentation
   4.1 Make
   4.2 SCCS
   4.3 Sources
   4.4 Documentation
5. Groups and Identifiers
   5.1 Groups
   5.2 Identifiers
6. File Systems
   6.1 System V
      6.1.1 New file system block size
      6.1.2 Faster access
   6.2 4.1C BSD
      6.2.1 Reimplementation for efficiency
      6.2.2 Other modifications
      6.2.3 Extended (network) file system
7. Interprocess Communications (IPC)
   7.1 System V
   7.2 4.1C BSD
8. Networks
   8.1 System V
      8.1.1 X.25
      8.1.2 PCL network
      8.1.3 NSC network
      8.1.4 RJE to IBM
   8.2 4.1C BSD
      8.2.1 General networking framework
      8.2.2 Variety of hardware and protocols supported
      8.2.3 Internet (TCP/IP)
      8.2.4 Berkeley protocols
   8.3 UUCP
   8.4 USENET
9. Performance
   9.1 Some Qualitative Remarks
      9.1.1 Paging vs. swapping
      9.1.2 Terminal I/O
   9.2 Tentative Benchmarks
      9.2.1 Load simulation
      9.2.2 File system throughput
10. Vendor Support
   10.1 Western Electric
   10.2 U.C. Berkeley
   10.3 DEC
   10.4 Third Parties
      10.4.1 OEMs
      10.4.2 Emulations
      10.4.3 Consultants
      10.4.4 Authors
11. Conclusion
   11.1 Selection Criteria
   11.2 Combinations
   11.3 Future Directions
      11.3.1 UNIX standards committee
      11.3.2 Berkeley features and Bell
      11.3.3 Bell licensing and Berkeley
Appendix A: Terminology
Appendix B: Load Simulation Job

1. Introduction

1.1 Intent

This paper describes certain differences between System V and 4.1C BSD, leaving details of common functions to the manuals. This is a qualitative comparison, intended to serve only as a guide for further study.
While performance is not a major theme of this paper, some tentative benchmarks are included to indicate the relative performance of the two systems. These benchmarks should not be considered conclusive, since 4.1C is not 4.2 and since we have not had sufficient production experience with System V.

This paper supersedes a previous paper, ``UNIX System III and 4.1BSD, A Practical Comparison'', by the same authors. In some cases, features are noted herein as having been introduced in System V or 4.1C BSD when they were actually introduced in System III or 4.1BSD. This usually occurs when comparisons are being made with V7/32V and is done simply to decrease the verbiage.

1.2 Format of the Paper

The first section following the Introduction contains subsections corresponding to sections of the UNIX Programmer's Manual* in order to provide a framework for comparison of detailed features of the operating systems. There follow several sections on subjects which are wider than a single manual entry or which we consider important. Finally, there is a summary section which includes some comments on recent cooperation among UNIX system developers.

__________
* The 4.1C BSD title; see section on Documentation.

1.3 Disclaimers and Acknowledgments

The authors of this paper are in no way affiliated with the University of California, Bell Laboratories, or Western Electric, and are solely responsible for the opinions presented herein.

4.1C BSD is not a regular Berkeley Software Distribution and inquiries should not be sent to Berkeley concerning it. Facilities in 4.1C may be represented differently in 4.2. In cases in which we know what the differences will be, we have noted them, but we do not claim to have caught every case. When Berkeley is ready to distribute 4.2BSD, they will announce it.

We would like to acknowledge Dr. Michael Molloy of the Computer Science Department, University of Texas at Austin, for the use of the departmental VAX-11/780 and for his assistance, as well as Bill Lee of the U.T. Austin Computation Center for his continued moral and material support.

We would also like to thank the following for reviewing the paper: Sam Leffler of the University of California at Berkeley, Nina McCloskey of AT&T Technology and Licensing Division, Armando Stettner of DEC, Dan Franklin of Bolt, Beranek, and Newman (BBN), and Doug Gwyn of the U.S. Army Ballistic Research Laboratory (BRL).

2. Manual Sections

The subsections of this section generally follow the order of the UNIX Programmer's Manual.

2.1 Commands

The general utility commands supplied with the two systems exhibit relatively minor differences, mainly in terms of the options available. A few commands are included in each distribution which do not occur in the other; many of these are of questionable usefulness anyway and the reader is referred to the manuals for further details. Certain larger packages, however, such as language support facilities, are rather different and are discussed in the following sections.

2.1.1 User convenience

Several utilities are considered important for the convenience of the frequent user.

Berkeley UNIX provides the page and more file perusal commands, used to examine a file a screenful at a time. No equivalent is available in Bell UNIX.

The Berkeley ls command understands proper multicolumn formatting of a directory listing (when stdout is a tty). Under System V, ls generates a listing with one entry per line; a multicolumn listing must be obtained by piping the output into the paste command, e.g.

     ls | paste - - - - -

The Berkeley w program may be used to monitor user activity; the System V equivalent uses a command file, /etc/whodo, to generate similar information. However, it is rather inconvenient to have to specify the absolute pathname and few users actually have /etc as part of their default path. (We note, of course, that the superuser's PATH environment variable does include /etc, perhaps to suggest that only system administrators and the like should be interested in such information.)

2.1.2 Programming support environments

Several changes have been made to the C programming support environment (Software Generation System in WECo parlance) in System V. Most of the #include files have been rearranged and expanded, and it is advisable to recompile all C programs. Pcc, the portable C compiler, includes reasonable enumerations, changes to structure and union handling (nonunique structure member names), correct handling of the void data type, and several bug fixes. The cc command itself has added the W flag to allow options to be explicitly specified for a particular compilation subpass. Certain bugs which are known to remain are documented in the System Release Description.

Two new tools are included: cxref, which generates cross-reference listings and obsoletes both cref and xref, and cflow, which builds a graph of external references occurring in a collection of assorted source files (C, assembler, etc.).

The System V f77 programming support environment also includes two new tools: asa interprets the standard ASA carriage control characters, and fsplit may be used to split FORTRAN sources (f77, efl, ratfor) on a procedure-per-file basis. In addition, the load-time library has been greatly extended and enhanced.

The libraries for both C and f77 are available in profiled versions, which must be loaded explicitly in place of the default, non-profiled ones. These profiled libraries allow program execution profiling at the library function level rather than the user program function level. Further, the symbolic debugger sdb is very much improved and may be used easily with either C or f77 programs.
The as assembler and ld linker have been modified to utilize the new Common Object File Format, which is discussed below. Note that any change to a source file for a program thus necessitates recompilation of all sources before the objects may be relinked using ld, since the old and new object formats are radically different.

The C compiler in 4.1C (pcc) is very similar to the one in System III, including void, union, enum, and structure elements named per structure, some of which were added after 32V. Berkeley added very long identifiers in 4.1BSD, while System III and System V retained the old 7/8 character identifiers. The as assembler, the ld linker, and associated libraries are similar to the ones in 32V, although in 4.1 ld was reworked to be four to five times faster and this improvement is preserved in 4.1C and 4.2. The dbx symbolic debugger is new.

4.1C BSD has some bug fixes and other improvements to f77 (an overlaid version of this compiler is available for 2.8BSD). 4.2 has an extensively reworked version of f77 and its associated libraries: early versions of this new FORTRAN package were apparently the source for the new System V FORTRAN facilities.

Both systems support Ratfor and the Extended FORTRAN Language (EFL), but 4.1C additionally provides the struct utility, used to convert FORTRAN sources into reasonably clean Ratfor.

System V has bs, essentially derived from BASIC. There is no equivalent in 4.1C BSD; however, the University of British Columbia BASIC system is compatible with 4BSD. Similarly, System V includes the classic sno SNOBOL system, while 4.1C includes PASCAL, FRANZ LISP, APL, and fp. APL is a user contributed software package from Purdue. Fp (Functional Programming language compiler/interpreter) implements the applicative language proposed by John Backus in his Turing award lecture. 4.2 may include Icon as user contributed software.

There is a COBOL compiler commercially available for 4BSD, and possibly for System V.

2.1.3 Shells

System V supports the Bourne shell (sh), with few noticeable changes from V7. 4.1C BSD has much the same Bourne shell plus the Cshell (csh), often a new user's first command language. The Cshell has most of the capabilities of the Bourne shell (though the syntax is different), plus the history, alias and directory stack features. History and alias allow editing and replaying of saved commands. Such features are the main reason many users prefer the Cshell (although some cite its extensive C-like control structures as another reason).

The 4.1C Cshell also has a set of job control features (requiring the Berkeley `new tty' terminal driver) which allow the user to suspend and resume subprocesses.

The 4.1C resource limitation facilities are normally accessed via the csh limit command. The only close equivalent in System V sh is the ulimit command, used to control the size of the file a child process may write.

2.1.4 Formatting and typesetting

4.1C offers the -me macro package, while System V has the -mm package, somewhat augmented from PWB. The -ms macros have been removed from System V but are still found in 4.1C. In 4.2, they have been extended to provide support for tables of contents and the like. System V includes additional macro support for generating slides and viewgraphs.

An improved interface to the Versatec is provided in System V, along with new ioctl calls for state control. The vcat filter for troff which was documented but absent in System III seems to have disappeared entirely in System V. Both systems have Versatec drivers expecting a single interrupt address, whereas the Versatec itself has two configured into the hardware. 4.1C at least has comments in the code to tell you this (and #ifdefs to deal with it). The 4.1 Versatec user programs expected a unit wide enough to handle four pages abreast; this problem has been fixed in 4.2 (but not 4.1C) by extensions to the printer spooling facility.
The Berkeley font library seems rather more extensive than that provided with System V. These fonts are used by the Versatec filters to simulate the mounted fonts of a C/A/T phototypesetter, the standard destination device for non-device independent troff.

The best version of troff comes with neither of these systems. This is the Typesetter Independent Troff (TITroff) package (commonly known as DITroff, for Device Independent Troff). It is available separately from Western Electric and includes useful graphics packages (pic and ideal) which can be used to augment the basic typesetting facilities. In 4.2, the printer spooling facilities have hooks for TITroff so that the package can be used immediately when obtained (though TITroff itself is still distributed separately by Western Electric). See below under Printing.

The Writer's Workbench facilities style, diction and explain, which analyze surface characteristics and readability of written text, are supplied with 4.1C. This is apparently a Bell Research Group package and is available separately from Western Electric. Style ignores macros from -ms, -me, -mm, and -ma, although the manual page only mentions -ms and -mm. 4.2 also includes an improved refer and bib.

2.1.5 Graphics

4.1C has rather rudimentary graphics capability. In contrast, System V has the PWB graphics package, including ged, a graphical editor, and numerous data generation, transformation, and display commands. This graphics capability has been used extensively in conjunction with the accounting packages.

2.1.6 Ingres

The relational database system Ingres is part of 4.1C, and a commercial version of Ingres is available for 4BSD. We do not know if it will work under System V.

2.1.7 Text editors

Both systems have the traditional UNIX editor, ed, and System V has adopted the Berkeley vi (screen) and ex (line) editors, which are also of course in 4.1C.

System V documents a new screen editor named se but it was not included on the distribution. Apparently it does not utilize the terminal independence capability provided by termcap but, rather, uses its own terminal description file, /usr/lib/se.term (also not on the distribution).

Recent versions of the Rand Editor e and UNIX Emacs can presumably be made to run correctly on System V, although this was not our experience. Though the distributed versions of the two commonly available versions of Emacs have problems running on 4.1C, since they still attempt to use the obsolete MPX IPC facility, at least one (Gosling's) has already been adapted at Berkeley to use the superior Berkeley IPC mechanism. (No problems were noted running them under 4.1.)

2.1.8 Electronic mail

System V has a rudimentary mail system, not much altered from V7 or System III. 4.1C has a more elaborate one, with most of the commonly useful mail functions. 4.1C actually provides two mail delivery routes, one unprotected and the other encrypted. There is a new mail delivery program called sendmail (a descendant of the delivermail of 4.1) which provides a central mail handling system capable of dealing with multiple networks and addressing formats.

The Rand mh system can be used as an alternative front end to the Berkeley mail system and will be provided with 4.2 as user contributed software. Some people use Emacs for this purpose.

We understand that the MMDF mail system from the University of Maryland can be used with either the Bell or the Berkeley version of the UNIX system but we have no direct experience with it.

2.1.9 Printing

The lpr command and lpd daemon have been modified in 4.1C to use the file /etc/printcap (similar to /etc/termcap) to define the characteristics of the various printers attached to a system. Printers may be added or deleted without changing the programs and output filter programs are supported on a per-device basis.
It is possible to treat a printer on another machine as if it were local (from the user's viewpoint) and have lpd ship files across a network to it. The Berkeley IPC mechanism is used for queueing requests, editing the queue, monitoring queue activity, etc.

In 4.2, lpr, etc., provide support for various raster devices (such as Varian or Versatec), laser printers (such as Imagen), and numerous ordinary printers. Specifying a new type of device in /etc/printcap is relatively easy. The user specifies a printer either as a command line option to lpr or in the PRINTER environment variable.

The System V lpr is considered obsolete and has been replaced by a spooling system similar in flavor to that provided with 4.1C but without the extensive network support. The LP-11 is still considered the canonical printing device, although a particular destination may be specified by the LPDEST environment variable.

MDQS (Multiple Device Queueing System) is available from BRL and provides support for queueing output to a variety of different devices.

2.2 System Calls

The user interface to most of the system calls is the same, i.e., the interface routines in the C library have the same calling sequence, but the actual system call numbers differ.

4.1C has introduced a number of new system calls, some intended to eventually replace older ones completely. Many of the older ones are now simulated by interface routines that call the new, extended ones.

2.2.1 Vfork and fork

The fork system call in System V has been changed to require only one pass through the process table per invocation. A resulting improvement in performance is claimed; however, we did not attempt to measure this.

4BSD includes the vfork version of the fork system call, which allows creation of a new process without the need for copying the entire address space of the parent. This makes sense in any environment where processes get very large, as in the paging environment provided by 4.1C (see comments below), but the implementation also imposes certain restrictions which can mislead the unwary. Performance statistics relating to the use of vfork are widely available and are outside the domain of this presentation. 2.9BSD has vfork for the PDP-11. 4.3BSD will eliminate the need for vfork by a reimplementation of fork.

2.2.2 Reboot

4.1C has the reboot system call, which is quite convenient for persons engaged in system development work. (See below on the reboot command.) System V documents a reboot system call for the WECo 3B20S but nothing seems to be available for DEC machines.

2.2.3 Setpgrp

4.1C has elaborated the setpgrp system call to be more compatible with the job control functions of the Cshell.

2.2.4 Group system calls

4.1C has a new method of dealing with the concept of groups and group ids (see the major section below on Groups).

2.2.5 Ioctls

The ioctl system call is essentially identical in the two systems. The interesting differences are in the terminal driver ioctls. Both drivers utilize the ``line discipline'' notion, allowing dynamic choice among several protocols by the user process.

Berkeley offers several new features in 4.1C BSD over the V7 terminal driver. Some of these are accessed as a new line discipline (the ``new tty'' discipline), while a few others are implemented as additional ioctl calls. There is a line discipline in 4.1C for an RS232 interface to a Hitachi tablet (this is undocumented). All of these are useful features, but the tty ioctls have become somewhat baroque.

The System V terminal driver is radically different from the V7 one. Many functions which always should have been orthogonal now are. As one example, the conversion of carriage return to new line on input and of new line to carriage return and line feed on output are now separately controllable functions. Of course, this driver is hopelessly incompatible with any previous one (except System III) and with the Berkeley one. Additionally, there is peripheral processor support for this line discipline in the KMC-11B (see below).

System V also provides support for a virtual terminal protocol, allowing drivers for selected terminals to be compiled directly into the kernel. The terminal type may be manipulated by two related ioctls, LDSETT and LDGETT; a type specifier may then be passed to, say, getty (see below). Unfortunately, this feature is not well-documented and it is particularly advisable to study the terminal driver code and the file /usr/include/termio.h.

2.2.6 Open and fcntl

The open system call in System V presents essentially the same interface as in System III but now claims substantially improved performance due to the use of a hashed inode table.

The dup2 function of V7 and 4BSD has been replaced and elaborated upon in System V by the fcntl system call. 4.1C preserves the 32V FIOEXCL ioctl call to give control over the inheritance of file descriptors across an exec; this is provided by fcntl in Systems III and V. In conjunction with an additional argument (mode) to the open system call, fcntl permits access to the O_NDELAY (non-blocking I/O) capability. (The System III O_NDELAY bug appears to be fixed in System V.)

4.1C uses an ioctl to set up non-blocking I/O but also has various open modes in addition to the old read and write modes, plus the optional third argument for some of them. Non-blocking opens, for instance, are supported.

4.2 has adopted exactly the same open and fcntl interfaces as System V, even going so far as to duplicate the names of the mode bits. A different include file is used for open, however.

2.2.7 4.1C BSD file system calls

4.1C has a number of new system calls affecting file I/O, in addition to the modifications to the open call noted above. There are now system calls for mkdir, rmdir, and rename. Equivalents of old calls that apply to file descriptors instead of file names have been added: fchown and fchmod. Symbolic links require some specific calls: lstat, symlink, readlink. File truncation is supported by truncate and ftruncate, and file locking by flock. Scatter/gather I/O is supported by readv and writev.

The notion of ``file descriptor'' has been generalized to include various other kinds of descriptors, such as socket descriptors for use with IPC endpoints. Many of the system calls, e.g. close, that apply to file descriptors also have meaning with other types of descriptors, and there are several new system calls to deal with descriptors, such as getdtablesize. The most generally useful of the new descriptor system calls is select, which is used to do synchronous multiplexing of operations by determining (among other things) whether it is possible to read or write data on any of a set of descriptors.

See also the major sections below on File Systems and IPC.

2.2.8 Timing

In 4.1C, all times are returned in a machine independent format, viz., seconds and microseconds. There is also improved timezone flexibility. (Systems in Australia and Europe should no longer experience difficulties with timezones.)

4.1C uses a simulated 100 Hertz line clock to report times more accurately than before. The new system call to replace ftime and time is called gettimeofday. Profiling using prof is also affected.

The getitimer and setitimer system calls allow the use of three interval timers, one for real time, one for virtual time (i.e., the time the process is actually running), and one for user and system virtual time. The latter allows interpreters to be profiled, that is, keeping track of when the interpreted program, rather than the interpreter, is running. These timers had a bug in 4.1C, but work properly in 4.2.

2.2.9 IPC

The old MPX and FIFO IPC mechanisms have been largely superseded by the new mechanisms discussed below in the section on IPC.
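
To make the select call described under 4.1C BSD file system calls (2.2.7) more concrete, here is a minimal sketch in C. It is written against the present-day POSIX form of the interface (the <sys/select.h> header and the FD_* macros), which may differ in detail from what 4.1C shipped, and the helper name wait_readable is our own invention for illustration. The timeout is given as seconds and microseconds, the same representation section 2.2.8 describes for times.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of either system): wait until the
 * descriptor fd can be read without blocking, or until the timeout
 * (seconds plus microseconds) expires.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, long sec, long usec)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    FD_ZERO(&readfds);          /* start with an empty descriptor set */
    FD_SET(fd, &readfds);       /* ask about this one descriptor      */

    tv.tv_sec = sec;
    tv.tv_usec = usec;

    /* First argument is one more than the highest descriptor tested. */
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

A caller might test one end of a pipe with wait_readable(fd, 0, 10000) before issuing a read. Since the descriptor notion is generalized, the same call would presumably apply to a socket descriptor as well.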

2.3 Libraries and Subroutines

There are many changes in this section.

2.3.1 Common Object File Format routines

System V adds a number of routines to provide specialized open, close, seek, and read operations for files written in the Common Object File Format (see below).

2.3.2 Utmp routines

In accordance with substantial changes to the format of the utmp structure (see below) in System V, a collection of routines similar to those provided for manipulating the password file has been added to deal with the /etc/utmp and /etc/wtmp files.

2.3.3 F77 library

See above under Programming support environments for f77.

2.3.4 Knuth algorithms

In addition to the binary search (bsearch) and linear search (lsearch) algorithms available in System III, System V provides routines for searching hash tables (hsearch) and binary trees (tsearch). A related UNIX-specific utility, ftw, provides the ability to recursively descend a directory tree, applying a user-supplied function at each node. (In other words, it is the subroutine equivalent of the find command.)

2.3.5 Software signals and matherr

System V preserves the ssignal (and the associated gsignal) facility from System III, allowing the user to raise and dispose of software signals. A related topic is the inclusion in System V of matherr, an error-handler invoked by functions in the math library. The user may supply his own version of matherr to control the disposition of such errors.

4.1C preserves the software signals added in 4.1 to support the job control features of the Cshell; these are related to the ``new tty'' line discipline. (4.2 provides a new signal interface.)

2.3.6 Stdio buffering

4.1C buffers output even if the output file is a terminal, but flushes all terminal or pipe output when the process attempts to read from a terminal or a pipe. 4.1C adapts its buffer size according to the block size of the file system containing the relevant files, to produce a marked speed improvement.
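
The idea of matching the stdio buffer to the file system block size can be sketched in user code as follows. This is only an illustration under stated assumptions: it uses the st_blksize field of struct stat and the setvbuf routine as found on present-day systems, it is not the 4.1C library's internal implementation, and the helper name open_block_buffered is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict compilers */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: open path for reading and request a fully
 * buffered stream whose buffer size equals the preferred I/O block
 * size reported for the file. Returns NULL if the open fails.
 */
FILE *open_block_buffered(const char *path)
{
    struct stat st;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream. */
    if (fstat(fileno(fp), &st) == 0 && st.st_blksize > 0) {
        /* NULL buffer: let stdio allocate st_blksize bytes itself. */
        setvbuf(fp, NULL, _IOFBF, (size_t)st.st_blksize);
    }
    return fp;
}
```

If the fstat call fails, the stream is simply left with the library's default buffering.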
The System V stdio buffers have been increased and the string-oriented output functions have been changed to provide pseudo-line buffered output to terminals if buffering has not been specified explicitly. As in 4.1C, output is flushed on terminal reads. Both systems keep stderr unbuffered. (The calling program may, of course, determine the buffering via setbuf.) In 4.2, perror does a single write (using writev to gather the arguments) to alleviate the problem of single-character network transfers, but stderr is still unbuffered. 2.3.7 Printf System V has dropped the System III (undocumented) vprintf and vfprintf, retaining the V7 (undocumented) doprnt routine, as is still used in 4.1C. The old (undocumented) %r format has apparently not been restored, however. In a similar vein, certain Berkeley programs assume that sprintf returns the address of the buffer, an undocumented feature that has been changed and properly documented in System V (the number of characters written is returned). System V printf follows the System III standard, which abolished the old capital letter formats ("%X", "%F") for long variables in favor of the prepended-`l' ("%lx", "%lf") format so that capital letters can be used in the hexadecimal and floating point formats to mean capital letters in the output stream. 4.1C has basically the V7 printf and scanf. The printf and scanf formats are still not fully compatible in either system. 2.3.8 String routines As in System III, System V changes the V7 (and 4BSD) index and rindex functions to strchr and strrchr, respectively, as well as adding a few additional string routines reminiscent of certain SNOBOL pattern primitives. System V also provides new routines to perform basic memory-to-memory operations (copy, compare, etc.) based on byte count rather than a terminating null character. 4.1C provides the same functions via the bcopy, bcmp, and bzero routines, which are now in the C library as well as the kernel.
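The difference between operations driven by a byte count and those driven by a terminating null can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's ctypes module (memmove and memset standing in for counted copy and clear routines), and the function names counted_copy, terminated_copy, and zero_fill are hypothetical, not names from either system.

```python
import ctypes


def counted_copy(data: bytes) -> bytes:
    """Copy exactly len(data) bytes, embedded NULs included."""
    src = ctypes.create_string_buffer(data)       # len(data) + 1 bytes
    dst = ctypes.create_string_buffer(len(data))  # zero-filled destination
    ctypes.memmove(dst, src, len(data))           # byte count decides the length
    return dst.raw


def terminated_copy(data: bytes) -> bytes:
    """Read the same memory as a C string: stop at the first NUL."""
    src = ctypes.create_string_buffer(data)
    return src.value


def zero_fill(n: int) -> bytes:
    """Clear n bytes of a buffer that initially holds 0xff bytes."""
    buf = ctypes.create_string_buffer(b"\xff" * n)
    ctypes.memset(buf, 0, n)
    return buf.raw[:n]
```

With the input b"ab\0cd", the counted copy preserves all five bytes while the terminated view sees only b"ab", which is why binary data calls for the counted routines.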
2.3.9 Network library The 4.1C C library contains a collection of routines used for translating network-related names and numbers, such as gethostbyname, which takes the name of a host and returns its address. There are also a few routines for manipulating the byte order of network addresses, such as htonl, which converts a network host address from host to network byte order, and some routines brought up from the kernel that are used for manipulating byte arrays, such as bzero, which clears a byte array. 2.4 Devices Details of device drivers are beyond the scope of this paper. We only mention a few corresponding to the most important devices. 2.4.1 Tty See above under ioctl for a discussion of the terminal driver changes. 2.4.2 DH-11 System V provides no support for the DH-11 terminal controller. Although DEC no longer supports this device, many installations still own either DEC DHs or emulations from other vendors. Also, DEC now supports the Emulex DH-11 emulation (CS21/H). The replacement is a combination of DZ-11s controlled by KMC-11Bs. The System III dh driver is probably portable to System V, but of course you must acquire a System III distribution. 2.4.3 KMC-11B System V no longer supports the KMC-11A microprocessor. The KMC-11B may be used in conjunction with DZ-11s for offloading terminal I/O processing; it now performs batched character transfers, an improvement over the character-at-a-time behavior (a bug) exhibited by System III. The KMC-11B, as well as the KMS11 (KMC11 plus DMS11-DA), is also used as the ``Programmable Communications Device'' (PCD) on which link-level protocols are implemented under VPM. 2.4.4 VPM The Virtual Protocol Machine (VPM) is a package which supports a high-level definition language for level 2 protocols to be handled by an interpreter running in a PCD. In this manner, IBM RJE, a synchronous pseudo-terminal interface, and several network protocols, including X.25 but not TCP/IP, may be supported.
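Returning briefly to the byte-order routines of 2.3.9, a small sketch may make their effect concrete. It assumes Python's socket module, whose htonl and ntohl wrap the C routines of the same names; the helper wire_bytes is a hypothetical name introduced here for illustration.

```python
import socket
import struct


def wire_bytes(value: int) -> bytes:
    """Memory image of a 32-bit value after conversion with htonl.

    Packing the converted value in native order ("=I") shows the bytes
    as they would be laid out in a buffer handed to the network.
    """
    return struct.pack("=I", socket.htonl(value))


def round_trip(value: int) -> int:
    """Host to network order and back again; should be the identity."""
    return socket.ntohl(socket.htonl(value))
```

On any host, wire_bytes(0x0A000001) yields the bytes 10, 0, 0, 1 in that order, the same layout that inet_aton produces for the address 10.0.0.1.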
2.4.5 Synchronous terminal System V documents support for a synchronous terminal interface utilizing the Virtual Protocol Machine, but it was not included in the distribution. 2.4.6 BLIT At the time of this writing, the driver for the Teletype 5620 bit-mapped terminal was available only as a System V-compatible binary object. 2.4.7 Ptys 4.1C has a pseudo-terminal driver to support network connections. This is actually a driver for two devices, a slave (/dev/ttyp?) and a master (/dev/ptyp?) end, where the slave end looks exactly like an ordinary terminal and the master end (used by network daemons such as rlogind and telnetd) has a few extra ioctls to aid in simulating a terminal. This device is quite important in the common situation in which there are few directly-connected terminals and most users log in over a local network. Pseudo-terminals are also important for such programs as Emacs and script. 2.4.8 Generalized disk driver System V provides a generalized driver gd for several moving-head disks (RM05, RM80, RP04/5/6/7). The System V driver may be derived from the Berkeley hp driver, which supports all MASSBUS drives. - 16 - The drive type is determined in both systems by examining the device type register and then using different parameter tables per drive. 2.4.9 Generalized tape driver A generalized tape driver gt similar to the generalized disk driver is provided by System V. This driver offers a general interface to the TE16 and TU78 style tape drives. The Berkeley ht drivers support all MASSBUS tape drives except the TU78, which is supported by the mt driver. Again, device information is determined by examining the device hardware type register. 2.5 File Formats A few file formats are worth mention. In particular, System V has reorganized several standard file formats, with important consequences. 
2.5.1 A.out The details of the binary object file format for commands are sufficiently different between the two systems that it is not possible to run an object file from one system on the other. The Common Object File Format (COFF) has been adopted in System V as the standard output format for programs such as as and ld in an effort to provide uniformity across several processors and compatibility with certain other operating systems. The traditional UNIX a.out header is included as a part of the COFF header information. Other COFF information includes such things as the architecture of the host on which the file was created, line numbers if a symbolic debugging option was in effect at compile time, and so forth. The convert utility may be used to convert pre-System V objects to COFF. 2.5.2 Ar The System V format for file archives has changed somewhat from previous releases of UNIX. In particular, an archive now includes a symbol directory created from the symbol tables of all archive members which are in COFF to allow ld to perform random access on the archive. In addition, numeric data in headers (archive, symbol directory, member) is stored as 4-byte quantities and should be portably accessed using the sputl and sgetl library calls from libld.a. - 17 - The portable ar command creates archive headers specific to the host. Arcv is still available to convert old-style archives from PDP-11 to VAX-11/780 format. However, convert should be used to convert pre-System V archives to the newer format. 4.1C has the ranlib program for inserting an index at the beginning of an archive, so that the archive can be randomly accessed. The Berkeley ar (introduced in 4.1BSD) produces ASCII output, making the archives portable without the need of any special libraries. 2.5.3 Fs We consider this topic sufficiently important to have a major section to itself later in the paper. 
2.5.4 Termcap and descendants 4.1C includes the termcap facility, which maps common terminal control functions to the specific escape sequences for a particular terminal, and the curses library of cursor motion optimization functions. These are used by a number of programs, including the vi editor, to achieve terminal independence. System V has adopted termcap but not curses. Termcap has spawned various lookalikes in 4.1C. Printcap is used by lpr to determine the characteristics of printers (see above under Printing). Disktab is used by newfs to determine how to configure a new format file system (see below under File Systems). Remote is used by tip (the successor to cu) to determine the characteristics of a remote system (see below under UUCP). 2.6 Games Both systems provide a variety of games, ranging from the ever-popular hunt-the-wumpus to chess and automated Dungeons and Dragons. 2.6.1 System V games On System III, most of the games were shell scripts which echoed the message: this game does not work on the VAX This deplorable situation has been largely corrected in System V. 2.6.2 4.1C BSD ASCII graphics games 4.1C has numerous games which use termcap and curses to produce ASCII graphics on various terminals. Examples are rogue (a role-playing game in the manner of Dungeons and Dragons), worms, rain, - 18 - canfield, and mille. System V has the snake game, but for some reason has removed the termcap support, making it work on only a few terminal types. 2.6.3 PDP-11 compatibility 4.1C provides a package which allows the use of the PDP-11 compatibility feature of the VAX processor. This package was supposedly included originally to support the PDP-11 binary of the game dungeon (zork). The fact that it is still included under games seems fitting. 
2.7 Miscellany 2.7.1 File system hierarchy The most notable addition here is /usr/ucb, which 4BSD uses to contain objects for commands developed at Berkeley (though not all such commands), and /usr/local, used to contain commands and libraries (in /usr/local/lib) that exist at Berkeley but are not distributed (making this a convenient place for commands developed at other sites). See below under Sources and Documentation for comments on the source trees. 2.8 Maintenance 4BSD has a reputation in some quarters as being unsuitable for production use because it is a research system. This reputation is undeserved, as its maintenance facilities are more highly developed than those of System V. 2.8.1 Init, getty, and login Init and getty are not substantially different in 4.1C from 4.1. Login has been modified to handle rlogin-style remote logins without passwords on machines under the same administration. There are also the modifications from 4.1 related to shutdown (disallowing logins before a shutdown) and security (prohibiting login by superuser on certain terminals, such as dialups). On the other hand, the finite state machine approach used for System III init has been greatly elaborated in System V. Init is driven from the file /etc/inittab, as in System III. This file is used to specify the identity, behavior, and arbitrary id of the processes to be associated with each state init can occupy. A typical entry might specify that in a state commonly corresponding to multi-user - 19 - mode, a getty should be respawned on each terminal line when the death of a previous getty is detected. Single-user mode is a distinguished state, with the option of having a virtual system console connected to any terminal. State change instructions are issued to the ancestral init via the telinit command. The System V getty is likewise driven from a file, /etc/gettydefs. 
This file includes, for each speed, initial and final flags used for setting the mode of a terminal line, the login herald, and the next speed to try. In actuality, the speed is only a label for which getty searches, so that it is possible to make terminal-specific entries which include control sequences in the login message, etc. System V login has mainly been changed to deal with the new utmp structure. In addition, environment variables may be set in the login response. In passing, we note that the old UNIX standby, who, has been turned into a general utility for summarizing /etc/utmp and /etc/wtmp. To this end, it now has no fewer than ten different options. 2.8.2 Shutdown, halt, and reboot 4.1C has the convenient commands shutdown (bring the system down politely, informing the users), halt (stop the system immediately), and reboot (shutdown and bring up a new system). When coming up, 4.1C automatically performs fsck on all the file systems (running one fsck subprocess per disk arm, in parallel, for speed) and brings the system up in multi- user mode. To bring 4.1C up from a dead start, it is only necessary to turn the power switch on. (To get into single user mode, one types ^C or uses another available method.) The normal method for bringing down System V is to run the shell procedure shutdown. Other facilities exist for terminating running processes, including killall, invoked by shutdown, and fuser, which selectively identifies and kills processes which are using specified files. A reboot command is documented for the WECo 3B20S release of System V but none seems to be available for the VAX. System V has nothing equivalent to the 4.1C BSD halt command. 2.8.3 Backups 4.1C uses dump for file system backups, in the V7 manner. The user interface to restor has been modified, however, to resemble that of tar, making it much easier to use, as it is now possible to restor by file (or even directory) name, rather than by inode number. 
The 4.1C dump also allows backups over networks. It runs at tape speed, and is fast enough (especially with a 6250 bpi tape drive) that disk-to-disk backups seem superfluous. Several backup paths are available under System V. The volcopy utility from System III may be used to copy a complete file system. The new finc offers a fast incremental backup of those files meeting certain selection criteria (last access, modification, etc.). Frec may be used to recover files by inumber from a volcopy or finc backup. Finally, ff, a fast version of find, may be used in combination with cpio. Dump and restor are not present in System V. 2.8.4 Fsck, fsdb, etc. A slightly improved version of fsdb, the interactive file system debugger under System III, is offered in System V. Fsck, the file system checker, has been augmented by dfsck, invoked by checkall, which allows simultaneous file system checks on two different drives. Note that dfsck relies on the system being configured to use System V's multiple physical I/O buffer facility. Also, the use of dcopy to reorganize the file system for faster access (see below under File Systems) will contribute to faster checking. 4.1 added the -p option to fsck, which applies default rules to preen file systems (usually on reboot), and incidentally allows concurrent checking of file systems on different disk arms to speed rebooting. This is retained in 4.1C. 4.1C and 4.2, unlike 4.1, but like System V, require a reboot after fsck modifies the root file system. In 4.1C and 4.2, unlike System V, the reboot is handled automatically. 2.8.5 Monitoring and debugging System V provides various facilities inherited from System III for monitoring system activity and dealing with problems. An operating system profiling package is available which uses the pseudo-device /dev/prf to access the operating system and collect performance statistics by monitoring selected text addresses.
Extensive error logging and reporting is performed by a daemon which accesses the /dev/error interface to the system error collection routines. These reports are often valuable in analyzing suspected hardware difficulties. - 21 - The crash program provides a reasonably clean interactive utility for debugging core images of the operating system after a crash. It may also be used to browse through a running system. 4.1C has syslog to collect kernel error messages into /usr/adm/messages. Arrangements have also been made to send many error messages directly to the controlling terminal of the process that caused them. There are provisions for analyzing the state of the paging system after a crash with analyze. There is a paper on debugging the kernel with adb that tells how to use numerous canned shell scripts to examine various tables. Adb itself has the -k option for setting its memory maps appropriately for the kernel. 2.8.6 Accounting System V provides accounting software appropriate for a production system in the form of several tools used to create complex combined reports. The graphics facilities may be used to automatically produce charts showing various system parameters (disk reads and writes per head, number of swaps in and out, kernel buffer statistics, etc.). These have useful impact in justifying your facility to upper-level management. 4.1C has kernel hooks to collect similar accounting information (including paging statistics) but lacks the graphical output facilities. The facilities provided proved quite adequate for the purposes of actual system management in a non-billing environment, however. 3. Installation and Configuration The installation and configuration documents are sufficiently complete that few problems should be encountered when following their instructions. Known problems are noted below. 3.1 Installation Both systems are delivered in the traditional Unix format, viz. 
a set of half inch magnetic tapes containing copies of all the binaries, source code, and documentation, plus accompanying hardcopy documentation (Western Electric sells manuals ready for use, while Berkeley supplies duplication-quality masters). 4.2 will come with a console cassette and floppy, so it will no longer be necessary to hand-code initial bootstraps. - 22 - System V binary licenses are available, and the Berkeley distribution is also available on two RK07 disk packs. For those unfamiliar with VAXen, Installing and Operating 4.1C BSD contains sections on VAX hardware terminology and disk formatting which have no counterparts in Setting Up the UNIX System (the System V installation guide). Both systems provide a disk formatter. The format command provided with System V will format RP06 and RM03/05 disks. The formatter of 4.1C and 4.2 formats almost any non-DEC UNIBUS or MASSBUS drive, and also includes RM03s and RM05s, i.e., any disk with the BSC bit in the header. It cannot handle RP06, RP07, or UDA50s, but the DEC formatter can do those. We had no real problems booting either system. (The System III boot bugs seem to be fixed in System V.) 3.2 Configuration Both systems are relatively easy to configure. System V includes driver support for most devices of interest, including the RM05, RM80, and RP04/5/6/7 disks. 4.1C BSD supports all of the devices just mentioned, plus many others, and also understands the full interconnection architecture of the VAX, so that it is possible to have, say, two RP06's on one MASSBUS and another on a second, and the system may be permitted to decide at configuration time which MASSBUS's the RP06's are on. System V is configured by running the config program against a system description file; entries in the file are checked against a list of supported devices in /etc/master. 
The vcf command may then be used in standalone mode to verify address and interrupt information in the kernel object against the actual hardware present on the system. 4.1C and 4.2 are configured with a config program too, but this one works markedly differently. The sources are arranged in such a way that several different kernels (for the same or different machines) can easily be made from the same sources. Things such as network node names are parameterized at run time so that the same kernel can easily run on several machines with the same CPU and peripherals. If desired, a generic kernel (such as on the distribution tape) can be configured that will find likely devices at - 23 - startup. System V still requires hand-setting of numerous parameters, such as the number of process, file, and inode table slots, while 4BSD (4.1 and later) decides appropriate values for these parameters on the basis of one number: the number of users the machine is to support. (The default rules can be overridden, of course.) 3.3 Transition The transition from 4.1BSD to 4.1C BSD (or 4.2) should not be very troublesome. Though the file system implementation is quite different, the user interface is almost identical, especially since system calls that have been replaced are simulated. The long file names are, of course, not a problem. The new directory format might appear to be, but there should be few programs other than system ones (which are supplied) that read directories directly. Directions for the transition are given in Installing/Operating 4.2bsd under the section ``Upgrading a 4BSD System,'' and A 4.1a User's Guide to 4.1c provides useful user orientation. There are no provisions for upgrading from a system previous to 4BSD, such as 3.0 or 32V, though this could presumably be done with sufficient investment of effort. System V is distributed with a document, Transition Aids, designed to assist the system administrator in changing from System III to System V. 
Especially crucial transition topics include: hardware support changes (esp. lack of DH-11 support); whether to convert to a 1K file system; conversion to the new archive format; and conversion of objects or (preferably) recompilation of sources for user programs to accommodate the new headers and object file format. 4. Sources and Documentation There has been a large amount of reorganization of sources, documentation, and associated support since 32V. 4.1 Make System V includes the extended make, which features many additional default rules to handle common conditions, to the point that many compilations require no makefile. Additions are also present which handle archives and SCCS files (see below) and make use of environment variables and defaults. Most system programs may be rebuilt by using the collection of :mk command files located in the source tree. The 4.1C BSD make seems to be very much in the flavor of V7. Rebuilding the whole source tree is as easy as in System V, however, and is recommended to be done frequently. 4.2 SCCS System V includes the PWB Source Code Control System (SCCS), not available in 4.1C BSD. 4.2 is rumored to include RCS, a public-domain rendering of SCCS. 4.3 Sources System V preserves the changes to the names of source directories and files which System III introduced (the kernel ``sys'' subdirectory becomes ``os'', and ``dev'' becomes ``io''). However, since there is an appropriate makefile (or :mk command file) for almost everything it is possible to go to the appropriate parent directory for a software package and let make do the work. The 4.1C sources, both user and kernel, have been radically reorganized in order to simplify recompilation of the entire system and to promote portability. There is generally a source directory subtree corresponding to each directory containing objects, e.g., /usr/src/usr.bin for /usr/bin, making sources easy to locate.
Good use has been made of symbolic links, in order to avoid duplication of sources, and to allow keeping certain pieces (such as the kernel sources) on whatever file system is appropriate, e.g., /usr/include/sys is a symbolic link to /sys/h, and /sys is itself a symbolic link to /usr/sys. The kernel sources have all the VAX-specific code separated out into different directories and files from the portable code. The user sources have also been similarly organized for portability. The C library, in particular, has been redone. One would expect 4.1C to be as portable to another 32-bit machine as 32V or System V. There is a rather widespread problem in Berkeley code consisting of the use of the type int when long, or even - 25 - off_t, or especially time_t is meant. This works fine, as long as you never try to run such code on a machine where int is smaller than 32 bits. (This problem is not evident in the kernel, but rather, in application programs.) This problem is perhaps less prevalent in 4.1C than in 4.1. Fairness requires mentioning that there are also numerous places in the documentation where it is asserted that int is 32 bits, on the grounds that machines with smaller word sizes are not sufficient for many of the functions 4BSD supports. 4.4 Documentation Berkeley provides the traditional Unix Programmer's Manual, volumes I, IIA, and IIB, plus an additional volume of papers written at Berkeley and related directly to the Berkeley parts of the system. The documents come as duplication quality masters of 8-1/2 by 11 inch pages suitable for ordinary three-ring looseleaf binders. The first volume has of course been updated and is also kept on-line for easy access. System V has largely reorganized the system documentation. Volume one has been divided into a User's Manual, an Administrator's Manual, and, peripherally, the Operating System Error Message Manual. 
Most of the classic UNIX papers which appear in volume two in the Berkeley distribution have been pieced together to form such things as a Document Processing Guide and a Programming Guide. All in all, there are twelve documents furnished with the purchase of a System V license; extra copies are for sale by Western Electric Software Sales and Marketing. It is disappointing to note that not all of the documentation is provided on the distribution tape, a feature considered critical by some (e.g. the sight- impaired). 5. Groups and Identifiers 4.1C changes the implementation of groups and related identifiers sufficiently to motivate this section. 5.1 Groups System V uses the old V7/32V group scheme: a user may have access to a login group (specified in /etc/passwd) and also to several other groups (as permitted by /etc/group), but may be in only one group at a time. - 26 - In 4.1C, the same files in the same formats determine what groups a user is allowed in, but the user is immediately put in all of them at login: there is no newgrp command. The groups command lists the groups you are in. The maximum number of simultaneous groups is a system compile-time parameter, and the default is eight. The setgroups system call can be used (by superuser) to set the groups for a process. In both systems, each file has a single group associated with it to determine group read, write, execute, and setgid permissions. System V creates a new file in the effective group of the process, whereas 4.1C creates the file in the group of its parent directory. Both systems have chgrp (both command and system call) to change the group of an existing file. 4.1C allows the user to change the group of a file he owns to be any of the groups to which he belongs. System V allows the user to change the user and group id of any file he owns, thus giving the file away. 4BSD does not, apparently because of the existence of disk space quotas, which System V lacks. 
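One way to see which of the two file-creation rules described above a given system follows is to create a file and compare its group with the group of its parent directory and with the effective group of the process. The sketch below assumes Python's os module; the function names and the labels it returns are hypothetical. When the directory's group happens to equal the effective group, the two rules cannot be told apart.

```python
import os
import tempfile


def current_groups():
    """All groups the process is in at once, plus its effective group."""
    return sorted(set(os.getgroups()) | {os.getegid()})


def new_file_group_rule(directory):
    """Create a scratch file in directory and classify the group it received."""
    path = os.path.join(directory, "probe")
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL, 0o600)
    try:
        file_gid = os.fstat(fd).st_gid
    finally:
        os.close(fd)
        os.unlink(path)
    matches_dir = file_gid == os.stat(directory).st_gid
    matches_egid = file_gid == os.getegid()
    if matches_dir and matches_egid:
        return "indistinguishable"
    if matches_dir:
        return "parent directory"   # the rule attributed to 4.1C above
    if matches_egid:
        return "effective group"    # the rule attributed to System V above
    return "other"
```

A typical use is new_file_group_rule applied to a directory whose group differs from the caller's effective group, since only then is the answer informative.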
5.2 Identifiers Berkeley has extended the setuid and setgid system calls in 4.1C to allow setting the effective id to the value of the real id, as well as the reverse. This is very useful for things like network server daemons, which may now switch permissions between superuser privileges and those of an ordinary user, and back, in a single process. This (along with the socket IPC and non-blocking I/O) allows many daemons to be implemented as one process where formerly two were required. Group ids and process ids are 32 bit integer quantities in 4.1C. The high order 16 bits of the process id are not yet used, but probably will be with the development of distributed applications. 6. File Systems Both systems have file systems different from their predecessors and each other. Though the comments in this section may make the differences seem extreme, the user interface is not much changed in either case from 32V, and we have had no trouble transferring files between the two systems with either tar or cpio (though cpio had to be ported to 4.1C first, of course). - 27 - 6.1 System V 6.1.1 New file system block size System V has introduced a revised file system which allows a choice of a 512 or 1K byte block. The information concerning the type of a file system is recorded in its superblock, so it is possible to have both kinds of file system on the same system. Robustness is enhanced by carefully controlling the order in which inode and directory information is written out in order to prevent serious file system inconsistencies in the event of a crash. 6.1.2 Faster access Other enhancements claimed to improve efficiency include multiple (3-7) physical I/O buffers (upon which dfsck, a multi-drive version of fsck, depends); a larger number of system buffers (up to 400); free list management of the file table; and hashing of the in-core inode table. 
A utility, dcopy, is provided to allow reordering of a file system to optimize access time by compressing directory ``holes'' and spacing file blocks at the disk rotational gap. Its frequent use is recommended. 6.2 4.1C BSD 6.2.1 Reimplementation for efficiency 4.1C has a file system that uses a block size and a fragment size that are settable per file system. The basic block size (usually 4096 or 8192 bytes) is the largest block size used in a file, and all blocks but the last are this size. The last one may be any multiple of the fragment size (usually 512 or 1024, and no more than a factor of eight less than the basic block size). Inodes are divided among several cylinder groups on a file system, and blocks in a file are usually localized in a single cylinder group. In-core inode copies are hashed. The standard I/O library has been modified to use the block size returned by the modified stat call to determine the size of its transfers. Various changes were made for robustness, as well, beyond those found in 4.1. For example, static information from the superblock (such as the block and fragment sizes) is duplicated in each cylinder group. Measurements made at Berkeley indicate the new file system is up to a factor of 16 faster than the old (4.1) one under ideal conditions, and a factor of 10 is not unusual in - 28 - actual use. 4.1C keeps defaults for the various parameters needed by the new file system in /etc/disktab (a termcap-like file), where newfs (a frontend to mkfs) uses them in constructing a file system, storing them in the super block. Various other programs, such as bad144, which handles bad sector marking, also use /etc/disktab. This file is a kludge used because the information is not yet kept on the disk and accessible by an ioctl. 
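The block and fragment rule of 6.2.1 can be worked through numerically. The sketch below is a simplification under stated assumptions: it models only the rule as given in the text (full blocks followed by a tail rounded up to whole fragments, with at most eight fragments per block) and ignores indirect blocks and every other allocation detail. The function names are hypothetical; preferred_io_size merely reports the st_blksize field that a modern stat call returns.

```python
import os


def allocation(file_size, block_size=4096, frag_size=1024):
    """Return (full blocks, tail fragments, bytes allocated) for a file."""
    if block_size % frag_size != 0 or block_size // frag_size > 8:
        raise ValueError("fragment must divide the block, ratio at most 8")
    full_blocks, tail = divmod(file_size, block_size)
    tail_frags = -(-tail // frag_size)   # ceiling division
    allocated = full_blocks * block_size + tail_frags * frag_size
    return full_blocks, tail_frags, allocated


def preferred_io_size(path):
    """Block size reported by stat, which an I/O library can use for transfers."""
    return os.stat(path).st_blksize
```

For a 10000-byte file with 4096-byte blocks and 1024-byte fragments, this gives two full blocks plus two fragments, or 10240 bytes allocated, where rounding up to whole blocks would have taken 12288.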
6.2.2 Other modifications In addition, 4.1C has very long file names (compile time parameter of 255 characters) analogous to the long C identifiers, a reworked directory implementation, symbolic links, and mkdir, rmdir, and rename as system calls. The use of file names that are actually 255 characters long is not, of course, recommended. The idea is to set the limit high enough that ordinary use will never hit it. A simulation library for the new directory format has been distributed several times over USENET; it is a good idea to use it even if conversion to 4.2 is never planned, since it solves several old Unix directory access problems (e.g. ensuring null termination of file names extracted from a directory). A symbolic link is simply a file containing a pathname, which is interpreted by the kernel after the pathname of the link itself. Thus cross-device links and links to directories are possible. The motivation for moving mkdir, rmdir, and rename into the kernel was to make them extensible in a network environment. In the case of rename, robustness during system crashes was also a factor. 6.2.3 Extended (network) file system Neither 4.1C nor 4.2 has the extended file system that makes it possible to mount, on one machine, a file system existing on a disk connected to another machine, with file transfers then proceeding over the network connecting the two machines. There are several implementations of such a facility but none will appear in 4.2. 7. Interprocess Communications (IPC) This is one of the areas where the systems diverge the most. 7.1 System V System V provides several somewhat different paths for achieving interprocess communication, mostly developed for real time support. The fifo, or named pipe, has been retained from System III, allowing a process to open a pipe by name rather than needing a parent to set up appropriate file descriptors.
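Opening a pipe by name, as the fifo permits, can be sketched in a few lines. This is an illustration only, assuming Python's os.mkfifo wrapper; the function name fifo_round_trip and the file name demo.fifo are hypothetical. Both ends are opened in one process here for brevity, whereas in practice the reader and writer would be unrelated processes that agree only on the path.

```python
import os
import tempfile


def fifo_round_trip(message: bytes) -> bytes:
    """Create a named pipe, write a short message to it, and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.fifo")
        os.mkfifo(path, 0o600)
        # A non-blocking open lets the reader proceed before a writer exists.
        reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            writer = os.open(path, os.O_WRONLY)  # succeeds: a reader is present
            try:
                os.write(writer, message)
            finally:
                os.close(writer)
            return os.read(reader, len(message))
        finally:
            os.close(reader)
```

No common ancestor is needed to set up the descriptors; the path name alone connects the two ends.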
The message queue operations associate a unique identifier with a system message queue and data structure that includes information about the last processes to send and receive messages, the times at which these events occurred, etc.

The semaphore operations associate a unique identifier with a set of semaphores and a data structure that includes time and pid of last operation, number of processes suspended while waiting for a particular change in the semaphore's value, etc.

The shared memory operations associate a unique identifier with a shared memory segment (which may be attached to the data segment of a process) and a data structure containing the size of the segment, time and pid, etc.

As an adjunct to the above, process segment locking (text, data, or both) via the plock system call is also provided.

The number of message queues, size of each queue, number of semaphores, number of shared memory segments, etc. are all parameters which are determined by the system administrator at system configuration time.

7.2 4.1C BSD

4.1C has dropped the V7 multiplexed files (MPXs) that were retained in 4.1 in favor of a new interprocess communication facility. This new socket IPC integrates the pipes, file and device I/O, and network I/O into one interface, which allows blocking or non-blocking I/O, multiplexing several I/O streams in one process by use of non-blocking I/O and the select system call, and scatter/gather I/O. The socket IPC solves most of the traditional Unix IPC problems, and is more general than the various mechanisms which have preceded it, such as pipes, MPXs, Rand ports, BBN await/capac, etc.

The mmap shared memory facility described in the 4.2BSD System Manual is not supported in 4.1C, and will not be in 4.2. It will, however, appear in 4.3BSD, along with the revised fork system call that makes vfork obsolete by only copying pages when they are modified. Various other memory management-related changes will also come with 4.3.

8. Networks

With the increased use of networks of small workstations and larger file or compute servers, this subject is gaining importance.

8.1 System V

While it is said that System VI will incorporate the Berkeley network code, most network support in System V is implemented using KMC-11Bs.

8.1.1 X.25

System V documents the use of its VPM facility to support X.25 in a KMC-11B peripheral processor, and the same technique can be used for other networks. However, the X.25 support package was not included on our distribution tape, and the documentation leads one to believe this was intentional. Rumor has it that there is a current project to implement X.25 under the 4.2 network framework.

8.1.2 PCL network

System V provides a driver for the PCL-11B network bus, used to interconnect multiple CPUs for fast parallel communications. A local network of UNIX machines is made practical by the inclusion of the net command, which allows commands to be executed on a remote system. It is very reminiscent of berknet. 4.2 has a PCL driver.

8.1.3 NSC network

System V documents an interface specification for the NSC A-410 processor and its associated software, used to access an NSC local net (Hyperchannel). Neither a driver nor applications software was provided with the distribution, however. 4.1C and 4.2 have an NSC driver.

8.1.4 RJE to IBM

System V implements software which communicates with IBM JES by emulating a 360 remote workstation. It relies on a VPM script running in a PCD, say, the KMC-11B. Facilities are provided for queueing jobs, monitoring the status of the RJE, and notifying users of the arrival of output.

8.2 4.1C BSD

Networking is one of the strongest points of 4.1C.

8.2.1 General networking framework

The network mechanisms were designed with the intention of supporting a variety of network protocols and hardware. The socket IPC provides an interface common to both networks (the internet domain in particular) and internal Unix facilities (the Unix domain).
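
As a minimal sketch of the Unix domain side of the socket interface, the C fragment below creates a connected pair of stream sockets, writes on one end, and uses select to wait until the other end is readable before reading it. It is written against the present-day descendants of these calls (fd_set and the FD_ macros), which is an assumption on our part and not necessarily the form shipped with 4.1C. The function name unix_domain_roundtrip is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: pass `msg` across a Unix domain stream
 * socket pair, using select to wait for it.  Returns the number of
 * bytes received, or -1 on error or timeout.
 */
int unix_domain_roundtrip(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    fd_set readable;
    struct timeval tv;
    ssize_t n = -1;

    assert(outlen > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], msg, strlen(msg)) >= 0) {
        FD_ZERO(&readable);
        FD_SET(sv[1], &readable);
        tv.tv_sec = 1;                  /* give up after one second */
        tv.tv_usec = 0;

        /* select reports which of the watched descriptors can be read */
        if (select(sv[1] + 1, &readable, NULL, NULL, &tv) == 1 &&
            FD_ISSET(sv[1], &readable))
            n = read(sv[1], out, outlen - 1);
    }
    if (n >= 0)
        out[n] = '\0';

    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

Watching several descriptors is a matter of adding each to the set before the call, which is how one process multiplexes several I/O streams.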
The internal networking mechanisms support easy implementation of further protocols or interface drivers, and are clearly documented.

8.2.2 Variety of hardware and protocols supported

Hardware currently supported includes several kinds of ethernet* interfaces (3COM, Interlan, Xerox 3Mb experimental), several ARPANET IMP interfaces (ACC LH/DH, DEC IMP11-A, SRI), a ring network interface (Proteon 10Mb), and various others, such as DMC-11, NSC Hyperchannel, and Ungermann-Bass with DR-11/W. 4.2 (but not 4.1C) has a PCL driver.

ISO/OSI** Network, Transport, and lower layer protocols supported include 3Mb and 10Mb ethernet, Proteon proNET 10Mb ring, and the DoD internet family (TCP/IP and relatives).

__________
* Ethernet is a trademark of Xerox Corporation.
** International Standards Organization Open Systems Interconnection: a meta-protocol designed to promote compatibility among networks.

8.2.3 Internet (TCP/IP)

Both TCP and UDP are available for use with IP either on the local network or over the internet. ICMP is supported, and there are some gateway facilities.

The socket IPC, together with a network library, provides many of the functions of the Session layer. The socket type SOCK_STREAM, which provides a reliable, ordered, byte stream, is currently supported by TCP, while SOCK_DGRAM, providing datagram service, is supported by UDP. There is no internet protocol to support SOCK_SEQPACKET, for sequenced packets. SOCK_RAW allows direct access to, for instance, the IMP interface, for debugging and development of new protocols. Only the superuser is permitted to use this socket type.

At the Applications layer, the Internet protocols Telnet, FTP, SMTP, and TFTP are supported.

8.2.4 Berkeley protocols

The berknet facilities of 4.1 are officially removed from 4.1C and 4.2.

There are also various new protocols developed at Berkeley, including remote login among machines under the same administration without passwords (rlogin/rlogind).
Remote shells and remote procedure calls (courierd) are supported, as are file copy (rcp/rshd), and status protocols such as the one that supports rwho and ruptime (rwhod). This latter takes advantage of the broadcast packet facility of Ethernets and rings to exchange status information about who is on what system and what systems are up on the local network. (The idea is easily extensible to networks without broadcast packets.) Some of these protocols use UDP and some use TCP.

These protocols make use of several machines over a local network or networks quite convenient.

8.3 UUCP

Both systems support UUCP, though the details diverge: System V allows uucp copy addresses to specify paths across multiple systems (as for mail), while 4.1C still permits copies only between adjacent systems. Naturally, all systems in a multisystem path must be running uucp with forwarding to properly effect forwarding.

4.2 has all well-known uucp bugs fixed. It also supports more than half a dozen auto-dialers. The spooling directories have been made sane.

The cu command is retained in System V, but dropped by 4.1C in favor of tip. Tip gets parameters for a remote system from /etc/remote (yet another termcap-like file) so it is possible to just type

    tip system-name

and be connected to the remote system (whose name, system-name, is in /etc/remote), without having to know the phone number or devices involved. If the cu interface is desired, linking tip to cu is all that is required to get it. Various cu-like commands are supported directly by tip.

System V includes a library routine, dial, used to establish a dialout connection. This routine is used by cu, but, curiously, uucico still relies on the same old conn.c code.

8.4 USENET

Neither system includes USENET news program sources. 4.2 will provide both USENET programs (readnews, postnews, et al.) and notesfiles, as user contributed software.
In any case, the best version of news is clearly Bnews 2.10, which has been distributed over USENET. One method to get it might be to set up a UUCP connection to a neighboring USENET site and copy it over.

The USENET and UUCP networks have become widespread enough that connection to them is certainly beneficial.

9. Performance

This is a sticky issue which we will not treat in detail, as this is not a performance evaluation presentation. We will give a few tentative benchmarks, and mention two qualitative performance areas.

9.1 Some Qualitative Remarks

Two important areas where performance varies widely due to system configuration and usage are paging vs. swapping and terminal I/O.

9.1.1 Paging vs. swapping

System V, like System III, 32V, V7, and all the PDP-11 Unixes, swaps, while 4.1C, like 4.1, pages.

With enough memory on a VAX-11/780, it is difficult to tell the difference for a load of small processes, because System V just doesn't swap. If it is desirable to run huge graphics processes or many Emacs editors or the like, the telling point is not so much the performance as the virtual address space provided by the 4.1C paging system. Such things as LISP require the large address space paging provides, and Ingres is much more usable with it, since it can run as one process instead of half a dozen.

We certainly do not intend to indicate, however, that we think paging and swapping produce equivalent performance. There are many technical papers on comparative performance that indicate paging gives much better performance; it is merely that our (admittedly idiosyncratic) experience was that under a light load it is hard to tell the difference without measuring it.

9.1.2 Terminal I/O

Using DH11 terminal controllers, 4.1C provides reasonable terminal I/O performance. Berkeley has modified the DZ11 driver sufficiently that even these (basically interrupt per character) devices are usable.
It should be remembered that DEC does not provide DH11 controllers for VAXen. This affects DEC maintenance, though similar hardware is available from other vendors.

If you need numerous terminals running at 9600 baud or higher, the System V combination of DZ11s and KMC11 terminal controllers seems preferable. The other side of this coin is that little choice has been left for System V users, since DH-11 driver support is not included in the distribution and since DZs alone are unlikely to yield acceptable response.

9.2 Tentative Benchmarks

These measurements were taken on a VAX-11/780 with six megabytes of memory and a single RP07 disk. The disk was partitioned into three sections which had similar sizes under the two operating systems. The various kernel parameters were chosen by configuring 4.1C for 32 users, and, for System V, by picking the largest parameter values suggested in the documentation. The resulting numbers of buffers, inode and file structures, etc., were similar. Memory was interleaved. No particular care was taken beyond these steps to tune either system to its maximum performance.

The numbers given here should not be taken too literally, but only as indicative.

9.2.1 Load simulation

This was done using a program that forks ptys instances of itself, and then each pty* repeats a job forever, or rather, until the run is over, as decided by the original parent program. The job is a brief shell script that uses commands common to both systems, as found on each system. Sources to compile with cc were taken from System V to avoid the long filename problem. They were carefully picked to avoid any System V peculiarities, such as getopt. Files to use with nroff were also taken from System V, for no particular reason. The output was redirected to a file to avoid terminal I/O considerations.

Repeated runs were taken until the job throughput stopped decreasing due to file system degradation.
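
The general shape of such a load driver can be sketched in C as follows. This is not the program used for the measurements; it is a simplified stand-in in which each worker watches the clock itself, where the original left the decision to the parent. The function name run_load, its parameters, and the use of a pipe to report job counts are our own assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical load driver: fork `workers` children, each of which
 * repeats the shell command `job` until `seconds` have elapsed.
 * Returns the total number of jobs completed, or -1 on error.
 */
long run_load(int workers, int seconds, const char *job)
{
    int report[2];
    long total = 0, count;
    int i;

    assert(workers > 0 && seconds > 0);
    if (pipe(report) < 0)
        return -1;

    for (i = 0; i < workers; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* carry on with those started */
        if (pid == 0) {
            long done = 0;
            time_t stop = time(NULL) + seconds;

            close(report[0]);
            do {                        /* every worker runs at least once */
                if (system(job) == 0)
                    done++;
            } while (time(NULL) < stop);

            /* report this worker's job count to the parent */
            if (write(report[1], &done, sizeof done) != sizeof done)
                _exit(1);
            _exit(0);
        }
    }

    close(report[1]);
    while (read(report[0], &count, sizeof count) == sizeof count)
        total += count;
    close(report[0]);

    while (wait(NULL) > 0)
        ;                               /* reap all workers */
    return total;
}
```

Dividing the returned total by the length of the run gives total throughput, and dividing that by the number of workers gives the per-worker rate.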
The following figures were obtained:

    ptys                 1     2     4     8    16

    System V
      jobs/hour/pty     46    25    13   6.4   3.2
      jobs/hour         46    50    51    51    51

    4.1C BSD
      jobs/hour/pty     60    32    16     8     4
      jobs/hour         60    64    64    64    64

The total number of jobs per hour increased slightly from one to two ptys, and then remained constant, as all CPU cycles were absorbed. The number of jobs per pty is, of course, just the total divided by the number of ptys.

We interpret these results to mean that 4.1C is noticeably faster than System V. We do not state the obvious figure of 25%, because the results could easily be varied by, for instance, increasing the amount of file I/O a job uses (to take advantage of the faster 4.1C file system), or by using larger processes (to force System V to swap, which it never did with the above job). Tuning either kernel could, of course, vary the results either way. Definitive benchmarks will have to await the release of 4.2BSD.

__________
* No pty devices were used: this term is used only for convenience.

9.2.2 File system throughput

Our experience has been that the 4.1C filesystem is markedly faster than the System V one. However, the actual figures vary so much according to the size of the files used, the transfer block size, the filesystem block size, whether memory is interleaved or not, etc. (though under all conditions we have tried 4.1C is faster than System V), that it would take some months and another paper the size of this one to deal with the problem. Rather than present partial and possibly misleading figures, we have decided to not present any.

10. Vendor Support

The amount and variety of support for UNIX has increased dramatically over the last few years.

10.1 Western Electric

Western Electric supports System V on VAXes and the larger PDP-11s, providing software assistance and user training. (User training is now available, though software assistance has apparently not yet been fully implemented.)

10.2 U.C. Berkeley

The University of California at Berkeley cannot, because of the nature of the institution and of the funding used to support the development of Berkeley UNIX, provide commercial support. They do, however, accept bug reports, which may affect future versions. See below on DEC.

10.3 DEC

Digital Equipment Corporation has announced they will support UNIX in the manner they have supported VMS as a VAX operating system (and they will also support it on PDP-11s (V7M)). This is apparently basically Berkeley UNIX, i.e., 4BSD (and 2BSD).

10.4 Third Parties

The number of organizations dealing with UNIX these days is quite large.

10.4.1 OEMs

Many companies bringing out new Motorola 68000-based systems recently have chosen System III as the base for their operating system, with the apparent intention of moving to System V. To some extent, this will no doubt lock them into System V, and persons wanting to buy something close to a small turnkey system will probably wind up with essentially Bell UNIX. Other manufacturers with microprocessors likely targeted for System V ports are Intel, National, and Zilog.

There are several ports of 4.1 to the 68000, and at least one of 4.2. There are also at least two ports of 4.1 to the National Semiconductor 16032.

Several of the vendors offering System III based 68000 systems claim to support ``Berkeley enhancements,'' the interpretation of which varies between vendors, but usually seems to include vi, ex, termcap, and curses, and sometimes more.

10.4.2 Emulations

Several emulations of UNIX are available from third parties, either software vendors or universities. Typically these are designed to provide a UNIX environment on top of another operating system, generally VMS. The quality of emulation varies from implementation to implementation, as does the concept of what ``UNIX'' should look like.

On a slightly different note, a package will be available from BRL in the very near future which emulates System V on top of 4.2BSD.
10.4.3 Consultants

There is a new class of companies that produce neither hardware nor software but instead provide assistance in obtaining and supporting both. These mostly try to cater to the markets for both systems.

There is a large amount of free software available for 4.1 (and thus 4.1C) that was written principally at academic institutions. Much of it is portable to System V, though something like Interlisp that requires a huge address space is not, and there are problems with many things like Emacs because of the use of long identifiers.

Most commercial vendors attempt to produce and sell software packages to run on either variety of UNIX. Bell is among these vendors, with the TITroff package, the S statistical package, etc.

Many of the commercial vendors using System III (System V) have produced graphical, menu-driven interfaces for the naive user, so that it is never necessary to deal directly with any UNIX shell. These mostly require bit-map terminals, varieties of which are also available from other vendors.

The famous Bell Blit bitmap terminal is available from Teletype (model 5620). Unfortunately, as noted previously, the Unix software is available only as a System V binary.

10.4.4 Authors

A number of books designed to assist the new UNIX user have recently appeared. Most of these either attempt to steer a neutral course by describing what is essentially V7, making them less useful in either a 4.2 or System V context, or they closely follow System III (V) in hopes of describing what will come to be a ``standard''. The 4.1C (4.2) user is left with the traditional task of reading the manuals.

11. Conclusion

A brief summary may be useful.

11.1 Selection Criteria

One may choose either Berkeley or Bell Unix on the basis of a particular needed function, such as network support, because of performance in one area or another, because of the support of a particular vendor, or for some other reason.
We have touched on all these areas above, we hope in sufficient detail to indicate the capabilities of the two systems, so that areas for further investigation will be clear.

11.2 Combinations

For companies with the resources, the best solution is probably to run either 4.1C BSD or System V and port the desired facilities of the other. This is the traditional route. An alternative is the aforementioned package from BRL or something similar.

Even companies with no desire to merge the two systems would be well-advised to get some sort of expert support (whether in-house or not), as neither Bell nor Berkeley can be counted on to offer the really broad support traditionally supplied by hardware vendors for their operating systems. This situation may change in the case of System V as more sites begin running the system and demanding the support which has been promised, but at the moment only time will tell. The same applies to DEC's support of 4BSD.

11.3 Future Directions

A few recent developments may indicate a trend away from continued fragmentation of the UNIX community, and especially from the divergence of the systems offered by Berkeley and Bell.

11.3.1 UNIX standards committee

The /usr/group UNIX standards committee appears to be making progress in standardizing at least the most basic facilities of the operating system, and has representatives from most segments of the community.

11.3.2 Berkeley features and Bell

The inclusion of vi, ex, and termcap in System V, as well as the adoption of a 1Kbyte block file system, shows that Bell is aware of the work Berkeley has been doing for years in researching new directions. Perhaps System VI will go further and adopt, for instance, csh, and paging.

11.3.3 Bell licensing and Berkeley

Unfortunately, until recently it has not been possible for Berkeley to include software from Bell licenses later than 32V, because the price would have been prohibitive for many of the Berkeley licensees.
Though the recent reform of Western Electric's licensing scheme apparently came too late to affect 4.2BSD, perhaps we will see Berkeley adopt some latter-day Bell developments.

Appendix A: Terminology

The official names of the various versions of the Unix System developed by Bell Laboratories and previously or currently available from Western Electric are:

  o UNIX Time-Sharing System, Sixth Edition (V6);

  o UNIX Programmer's Work Bench (PWB), V6 plus SCCS, etc.;

  o UNIX Time-Sharing System, Seventh Edition (V7), the PDP-11 version of the first portable UNIX system;

  o UNIX/32V Time-Sharing System Version 1.0 (32V), like V7, but for the VAX;

  o UNIX System III (System III), combining PWB, V7, and 32V;

  o UNIX System V (System V), now being licensed.

There have been numerous Berkeley Software Distributions of the various Berkeley versions of the Unix System.

  o 2BSD is used herein as a generic term for the PDP-11 distributions.

  o 2.8BSD is the latest PDP-11 distribution in general use.

  o 2.81BSD was an intermediate system that was never officially distributed, but is in use at several ARPANET sites with a port of the 4.1A network software incorporated into it.

  o 2.9BSD is the distribution just now being licensed, and is said to make a PDP-11 look like a VAX 4BSD system.

  o 3.0BSD was the first paging system for the VAX, derived from 32V.

  o 4.0BSD was the second Berkeley VAX distribution.

  o 4BSD is used herein as a generic term for any Berkeley VAX distribution from 4.0BSD on.

  o 4.1BSD is the VAX distribution in most common use, and contains numerous improvements over 4.0BSD.

  o 4.1A BSD, 4.1B BSD, 4.1C BSD were versions intermediate between 4.1 and 4.2. None of them were available outside of Berkeley except for beta test, and none of them can be ordered from Berkeley.

  o 4.2BSD will presumably be licensed soon.

Appendix B: Load Simulation Job

This is the contents of the shell file that was used in the load simulation:

    mkdir $1; cd $1
    cc -o simple -p ../simple.c
    simple
    nroff -man ../prof.1
    prof simple
    tar -cvf /dev/null ../simple.c simple mon.out
    rm simple mon.out
    nroff -man ../termio.7
    cc -o cmp ../cmp.c
    cd ..
    rm -rf $1