C Changes
From Computer History Wiki
This is the contents of the 'C Changes' help file, which is believed to refer to Typesetter C.
Type:
        >help c_old
to find out about older changes to C

****************************************************

Changes to C preprocessor:

1. Defines may be continued from line to line with backslash-newline.

2. You can undefine things with
        # undef xxx

3. You can undefine standard things from the command line with
        cc -Uxxx
   (like cc -Dxxx, but reversed).

4. The <> brackets on include now have a distinct meaning again from ""; they do not search the local directory, but do search -I and standard places.

5. There is a new preprocessor command
        # if expression
   where expression should be an integer expression that can be evaluated at compile time. Defined things may be used. Strings that are not integers are either errors, or if they are undefined names are taken as zero. Strings defined as themselves are 1 (i.e. # if unix works). Strings defined as anything else are expanded and retried. There are no assignments, floating point, pointers, etc., but the full range of integer operators work.

6. There is another new command
        # else

*************************************

Much new stuff in C.

(1) The notion of a `union' type is introduced. The type is specified by the construction
        union { ... }
    or
        union name { ... }
    which is isomorphic to a structure declaration. The declaration
        union { int i; double d; } u;
    makes u a cell which may hold either an integer i or a double d. To name the double, use `u.d'; to name the integer, use `u.i'. It is undefined to store into u.d and then to access u.i, and vice versa. The actual amount of storage allocated is the maximum required by the members. This facility is intended to solve some problems encountered in dealing with data structures such as trees, where a node may contain pointers either to other nodes or to leaves. You will not go far wrong if you think of a union declaration as a structure declaration in which all the fields have offset 0 from the start of the structure.
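The tree case mentioned above can be sketched as follows. This is only an illustration in present-day C syntax, not the dialect the help file describes; the names `cell`, `node`, `node_kind`, `sum_leaves` and `demo_sum` are hypothetical. A tag field records which union member was stored last, so the code never reads a member other than the one written.

```c
#include <stddef.h>

/* The union from the text: one cell that holds either an int or a double. */
union cell {
    int i;
    double d;
};

/* Hypothetical tag recording which member of the union is live. */
enum node_kind { LEAF, INTERIOR };

/* Hypothetical tree node: either a leaf value or pointers to other nodes. */
struct node {
    enum node_kind kind;
    union {
        int leaf_value;            /* valid when kind == LEAF */
        struct node *child[2];     /* valid when kind == INTERIOR */
    } u;
};

/* Sum all leaf values, reading only the member that was last stored. */
int sum_leaves(const struct node *n)
{
    if (n == NULL)
        return 0;
    if (n->kind == LEAF)
        return n->u.leaf_value;
    return sum_leaves(n->u.child[0]) + sum_leaves(n->u.child[1]);
}

/* Build the small tree ((1 2) 3) and return the sum of its leaves. */
int demo_sum(void)
{
    struct node a = { LEAF, { 1 } };   /* initializes u.leaf_value */
    struct node b = { LEAF, { 2 } };
    struct node c = { LEAF, { 3 } };
    struct node ab, root;

    ab.kind = INTERIOR;
    ab.u.child[0] = &a;
    ab.u.child[1] = &b;

    root.kind = INTERIOR;
    root.u.child[0] = &ab;
    root.u.child[1] = &c;

    return sum_leaves(&root);
}
```

Calling `demo_sum()` walks the tree and adds the three leaves. The "offset 0" rule of thumb can be checked with `offsetof(union cell, i)` and `offsetof(union cell, d)`, both of which are zero.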
(2) Generalized conversion operators have been introduced. The syntax is
        (type)expression
    where `type' is the name of a type to which the expression is converted. The semantics of the conversion are the same as if the expression were assigned to a variable of the named type. The type is specified by giving a type keyword (like `int' etc.) possibly followed by a declarator (in the sense of the C manual) except that the name is left out. In other words, the whole thing looks like a declaration of a single variable with the variable deleted. For example:
        (int)3.14159      convert to integer
        (int *) p         convert to integer pointer
        (int ())          convert to function (meaningless, but that's how it would be said)
        (int (*)())p      convert to pointer to function
    The (type) construction behaves just like an ordinary unary operator in precedence and is likewise right-associative.

(3) The (type) construction can also be used in the form
        sizeof(type)
    which is a (compile-time) number equal to the number of bytes of storage occupied by an object of the named type.

(4) It is now possible, and in fact encouraged, to write initializations using an '=', thus:
        int x = 10;
        double y[] = {1,2,3};
    There is no change in meaning from what was previously available.

(5) The assignment-type operators may now be written with the binary suboperator first and the '=' second:
        a += b;
        a <<= b;
    Doing things this way removes such lexical unpleasantness as the expression 'a=-b' which is really ambiguous. It would be nice to remove the older forms altogether, but this won't happen instantly.

(6) The treatment of the keyword `extern' at the top level is more in line with GCOS and IBM compilers; the declaration
        extern int x;
    does not reserve storage, and in fact no reference to x is generated unless it is used. (It will not be loaded from a library just because of such a declaration.)

(7) This one may affect a number of programs; we'll see how it goes. Using an undefined name in an initializer gets a warning.
It is a rather suspicious sort of thing to do anyway (the compiler is forced to impute a type to the name without sufficient information) and it led to a bug in initialization inside functions not fixable without adding a wart on the wart.

(8) The following bugs in C have been fixed.

    In the declaration
        f() register x; {...}
    no diagnostic was produced, except an internal error when x was later used.

    There were two separate optimizer (c2) bugs which caused c2 to loop (on rather strange input).

    Most but not all #define names the same in their first eight characters but different later were taken to be different. To avoid false confidence, only the first eight are now significant, uniformly.

    A function name with a subscript was not diagnosed.

    Register declarations in inner blocks were sometimes unjustly treated as plain auto declarations.

    Arrays declared with typedef sometimes had the internal notion of the array size wrong.

    In certain initializations, a missing ".even" caused an assembler "odd address" diagnostic.

    Certain differences of rather strange pointer expressions got an internal diagnostic. (Reported in Minisystems Newsletter.)

    unsigned =>> ... compiled wrong code.

    long = !long compiled wrong code.

    Use of ?: in initializers led to an unjustified syntax complaint.

*************************************

1. The <> brackets in "include" statements are no longer needed. Any include file is looked for first relative to the directory where the source file is; then relative to any directories named in -I arguments, in order; and then in /usr/include. In a few days we will start giving warning messages for
        # include <>

2. You can now use -Dxx=yy on the command line to define 'xx' with a value.

*************************************

Several significant, but, it is hoped, upward compatible changes in C have been installed.

1. Type `unsigned'. A new fundamental data type, with keyword `unsigned', is available.
It may be used alone:
        unsigned u;
or as an adjective with `int'
        unsigned int u;
with the same meaning. There are not yet (or possibly ever) unsigned longs or chars. The meaning of an unsigned variable is that of an integer modulo 2^n, where n is 16 on the PDP-11. All operators whose operands are unsigned produce results consistent with this interpretation except division and remainder where the divisor is larger than 32767; then the result is incorrect. The dividend in an unsigned division may however have any value (i.e. up to 65535) with correct results. Right shifts of unsigned quantities are guaranteed to be logical shifts.

When an ordinary integer and an unsigned integer are combined then the ordinary integer is mapped into an integer mod 2^16 and the result is unsigned. Thus, for example `u = -1' results in assigning 65535 to u. This is mathematically reasonable, and also happens to involve no run-time overhead.

When an unsigned integer is assigned to a plain integer, an (undiagnosed) overflow occurs when the unsigned integer exceeds 2^15-1.

It is intended that unsigned integers be used in contexts where previously character pointers were used (artificially and nonportably) to represent unsigned integers.

2. Block structure. A sequence of declarations may now appear at the beginning of any compound statement in {}. The variables declared thereby are local to the compound statement. Any declarations of the same name existing before the block was entered are pushed down for the duration of the block. Just as in functions, as before, auto variables disappear and lose their values when the block is left; static variables retain their values. Also according to the same rules as for the declarations previously allowed at the start of functions, if no storage class is mentioned in a declaration the default is automatic.

Implementation of inner-block declarations is such that there is no run-time cost associated with using them.

3.
Initialization. Declarations, whether external, at the head of functions, or in inner blocks, may have initializations whose syntax is the same as previous external declarations with initializations. The only restrictions are that automatic structures and arrays may not be initialized (they can't be assigned either); nor, for the moment at least, may external variables when declared inside a function.

The declarations and initializations should be thought of as occurring in lexical order so that forward references in initializations are unlikely to work. E.g.,
        {
                int a a;
                int b c;
                int c 5;
                ...
        }
Here a is initialized by itself (and its value is thus undefined); b is initialized with the old value of c (which is either undefined or any c declared in an outer block).
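The three changes in this last group (unsigned arithmetic, inner-block declarations, and initializers taking effect in lexical order) can be sketched as below. This is an illustration in present-day C, so initializers use the later '=' form rather than the `int c 5;' form shown above. All function names are hypothetical, and `uint16_t` is assumed here as a stand-in for the 16-bit PDP-11 `unsigned'. The sketch avoids the self-initialization case (`int a a;'), since its value is undefined.

```c
#include <stdint.h>

/* uint16_t stands in for the 16-bit PDP-11 `unsigned' (an assumption). */
uint16_t minus_one_as_unsigned(void)
{
    uint16_t u = (uint16_t)-1;   /* -1 mapped into the integers mod 2^16 */
    return u;
}

/* Right shifts of unsigned quantities are logical: zeros enter from the left. */
uint16_t top_bit(uint16_t u)
{
    return (uint16_t)(u >> 15);
}

/* An inner declaration pushes down the outer one only while the block lasts. */
int shadow_demo(void)
{
    int x = 1;
    int seen_inside;
    {
        int x = 2;               /* hides the outer x */
        seen_inside = x;
    }
    return seen_inside * 10 + x; /* the outer x is visible again here */
}

/* Initializers are applied in lexical order. */
int order_demo(void)
{
    int c = 7;                   /* the "c declared in an outer block" */
    int result;
    {
        int b = c;               /* inner c not declared yet: this is the outer c */
        int c = 5;               /* from here on, hides the outer c */
        result = b * 10 + c;
    }
    return result;
}

/* A static declared in an inner block retains its value between calls. */
int counter_demo(void)
{
    {
        static int calls = 0;
        calls = calls + 1;
        return calls;
    }
}
```

With these definitions, `minus_one_as_unsigned()` yields 65535, `shadow_demo()` yields 21, and `order_demo()` yields 75, which corresponds to the "old value of c" behaviour described for b in the example above.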