PDP-11 C stack operation
PDP-11 C stack operation is explained in detail by an old help file, 'C Calling' ('The Internal Workings of PDP-11 C Programs'), which I wrote in about 1978; it is reproduced in full below.
To give a tiny bit of background, C on the PDP-11 makes heavy use of the stack; on subroutine calls, arguments are passed on the stack (as part of the usage of the stack for the call stack), and automatic data is also kept on the stack (all addressed using the frame pointer, R5).
To give an example, this test routine:
    /* Show C stack usage. */
    foo(a, b)
    int a, b;
    {
        int x, y;

        x = 0;
        bar(1, a);
        return(y);
    }
produces this output:
    .globl  _foo
    .text
    _foo:
    ~a=4
    ~b=6
    ~x=177770
    ~y=177766
            jsr     r5,csv
            sub     $4,sp
            clr     -10(r5)
            mov     4(r5),(sp)
            mov     $1,-(sp)
            jsr     pc,*$_bar
            tst     (sp)+
            mov     -12(r5),r0
    L1:     jmp     cret
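The symbol lines in the listing (~a=4 and so on) are octal byte offsets from the frame pointer. As a rough sketch of where they come from, the C fragment below recomputes them. It assumes only int-sized arguments and automatics and no register variables; the function names are made up for illustration and are not part of the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Byte offset from FP of the n-th int argument (n = 0 is the first).
 * 0(FP) holds the old FP and 2(FP) the return PC, so arguments start at 4. */
int arg_offset(int n)
{
    return 4 + 2 * n;
}

/* Byte offset from FP of the n-th int automatic (n = 0 is the first declared).
 * csv saves R4, R3 and R2 at -2, -4 and -6, so the first automatic is at -8.
 * Assumption: int-sized automatics only, allocated in declaration order. */
int auto_offset(int n)
{
    return -8 - 2 * n;
}

/* The listing shows offsets as 16-bit two's-complement words in octal. */
unsigned as_word(int offset)
{
    assert(offset >= -32768 && offset <= 32767);
    return (uint16_t)offset;
}

/* Print the four symbol lines for foo(a, b) with automatics x and y. */
void print_foo_symbols(void)
{
    printf("~a=%o\n", as_word(arg_offset(0)));
    printf("~b=%o\n", as_word(arg_offset(1)));
    printf("~x=%o\n", as_word(auto_offset(0)));
    printf("~y=%o\n", as_word(auto_offset(1)));
}
```

Calling print_foo_symbols() should reproduce the four ~ lines of the listing. The operands -10(r5) and -12(r5) in the instructions are the same two automatic offsets, written as signed octal.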
The only things a PDP-11 C subroutine needs in its environment are i) a stack; ii) the arguments, and return point, on the top of the stack. Two special elements of run-time support, csv and cret, set up and tear down the stack frame (on entry and exit, respectively); csv will set up the frame pointer (the old contents of which are saved via the "jsr r5" which starts csv), making no assumptions about the old contents of R5.
C routines respect all registers except R0 and R1 (which are also used to hold return values; R1 only when a long is returned), and expect the same of routines they call.
Note that the top location on the stack is a scratch word (set up by csv). To call another subroutine, the arguments are pushed, the routine is called (which pushes the return PC), and on return (which pops the return PC), the arguments are discarded by the caller.
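To make the sequence above concrete, here is a small C model of the stack while foo runs. It is only a sketch: the struct machine, the helper names, and the stand-in for bar (which just records its arguments) are assumptions for illustration, and the csv and cret functions follow the prose description rather than the actual run-time source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_WORDS = 64, STACK_BASE = 2 * MEM_WORDS };

struct machine {
    uint16_t mem[MEM_WORDS]; /* stack area, indexed by byte address / 2 */
    uint16_t r[5];           /* R0..R4 */
    uint16_t fp, sp;         /* R5 and R6, as byte addresses */
    uint16_t bar_args[2];    /* what the modeled bar() saw */
};

static uint16_t *word(struct machine *m, int addr)
{
    assert(addr >= 0 && addr < STACK_BASE && addr % 2 == 0);
    return &m->mem[addr / 2];
}

static void push(struct machine *m, uint16_t v) { m->sp -= 2; *word(m, m->sp) = v; }
static uint16_t pop(struct machine *m) { uint16_t v = *word(m, m->sp); m->sp += 2; return v; }

/* Frame setup: save old FP, point FP at it, save R4..R2, leave a scratch word. */
static void csv(struct machine *m)
{
    push(m, m->fp);                 /* done by the "jsr r5,csv" itself */
    m->fp = m->sp;
    push(m, m->r[4]);
    push(m, m->r[3]);
    push(m, m->r[2]);
    push(m, 0);                     /* scratch word at the top of the stack */
}

/* Frame teardown: restore R4..R2, SP and FP, then pop the return PC. */
static uint16_t cret(struct machine *m)
{
    m->r[4] = *word(m, m->fp - 2);
    m->r[3] = *word(m, m->fp - 4);
    m->r[2] = *word(m, m->fp - 6);
    m->sp = m->fp;
    m->fp = pop(m);
    return pop(m);                  /* rts pc */
}

/* Stand-in for bar(): it only records its two arguments. */
static void bar(struct machine *m)
{
    push(m, 0);                             /* return PC pushed by jsr pc */
    m->bar_args[0] = *word(m, m->sp + 2);
    m->bar_args[1] = *word(m, m->sp + 4);
    (void)pop(m);                           /* rts pc */
}

/* The body of foo, one statement per instruction of the listing. */
static uint16_t foo(struct machine *m)
{
    csv(m);                                 /* jsr r5,csv */
    m->sp -= 4;                             /* sub $4,sp */
    *word(m, m->fp - 8) = 0;                /* clr -10(r5): x = 0 */
    *word(m, m->sp) = *word(m, m->fp + 4);  /* mov 4(r5),(sp): a into scratch */
    push(m, 1);                             /* mov $1,-(sp) */
    bar(m);                                 /* jsr pc,*$_bar */
    (void)pop(m);                           /* tst (sp)+: discard one argument */
    m->r[0] = *word(m, m->fp - 10);         /* mov -12(r5),r0: return y */
    return cret(m);                         /* jmp cret */
}

/* Caller side, modeled on the same pattern foo uses to call bar. */
uint16_t call_foo(struct machine *m, uint16_t a, uint16_t b, uint16_t ret_pc)
{
    *word(m, m->sp) = b;            /* last argument goes in the scratch word */
    push(m, a);
    push(m, ret_pc);                /* jsr pc,_foo */
    uint16_t pc = foo(m);
    (void)pop(m);                   /* caller discards the pushed argument */
    return pc;
}

void machine_init(struct machine *m)
{
    memset(m, 0, sizeof *m);
    m->sp = STACK_BASE - 2;         /* one scratch word already on the stack */
}
```

After machine_init, a call such as call_foo(&m, 7, 9, 01234) should leave SP, FP and R2 through R4 exactly as they were before the call. Note how the last argument is stored into the scratch word rather than pushed, which is why only one word is discarded after each two-argument call.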
Details
THE INTERNAL WORKINGS OF PDP-11 C PROGRAMS

Noel Chiappa - MIT/LCS/CSR

This is a description of the internal workings of any given compiled C program output by the UNIX C compiler.

C is a stack frame language, using R5 as the stack frame pointer. For simplicity, R5 will hereafter be called the FP (frame pointer). Note that arguments are generally passed on the stack and answers returned in the registers. Recall also that C global names generally start with an "_" tacked on in front of the declared name. Generally only routines and EXTERNALS (both implicit and declared) are given the honour of global names.

On entry (in the UNIX environment - for a discussion of stand alone C, see the end), the SP points to the lowest location on a stack that looks as follows:

    Address                 Word
    <Or>
    Address                 Hibyte              Lobyte

    177776                  0                   ARGV[ARGC-1][n]
                            .
                            .
    &ARGV[ARGC-1][0]-1      ARGV[ARGC-1][0]     0
                            .
                            ARGV[ARGC-2][n]     ARGV[ARGC-2][n-1]
                            .
                            .
    &ARGV[0][0]             ARGV[0][1]          ARGV[0][0]
    &&ARGV[ARGC]            177777
    &&ARGV[ARGC-1]          &ARGV[ARGC-1][0]
                            .
                            .
    &&ARGV[0]               &ARGV[0][0]
    SP-->   0[SP]           ARGC

On entry, a routine called CRT0 is executed. It comes in several flavors, depending on the surrounding environment:

    CRT0    Ordinary vanilla
    FCRT0   For programs with the floating point hardware simulator
    MCRT0   If the PROFIL option is in use.

The basic effect of CRT0 is to set the SP one word lower, move ARGC into that, put &&ARGV[0][0] into the location above that, and leave the SP pointing to the bottom of the stack. It then does a JSR PC, _MAIN. The basic effect is to leave the bottom of the stack looking like this:

                            .
                            .
    &&ARGV[0]               &ARGV[0][0]
            2(SP)           &&ARGV[0]
    SP-->   0(SP)           ARGC

This concludes the special handling. MAIN acts just like all other C routines, so the following discussion applies to it too.

C routines expect their arguments on the stack and return values in the low register(s). (Now you know why you can only return one value!)
All arguments are passed by value, so in general you only pass simple variables, with no arrays or structures or suchlike. They are in reverse order, with the first arguments at the top of the stack, and the last lowermost. (For longs, reals and doubles, the format is standard PDP-11 format; the highest order word is in the highest number word/register.) The topmost position (@SP) is of course the return PC. Routines do not remove their arguments from the stack.

The first move of all (compiled) C routines is to do a JSR FP, CSV. This is a general routine that does stack frame set up and saves the old register set. It first sets FP to the current SP. (Remember that the JSR will have saved the old FP on the stack.) It then pushes registers 2 through 4. (Remember about being only able to use 3 REGISTER variables?) It then does (of all idiotic things) a JSR PC, @R0. (R0 is where it saved the return point which had been held in FP. It's idiotic because they throw away the PC that the JSR stores, so a JMP @R0 would have done just as well. I suspect that they use a JSR for the side effect, possibly having to do with a C protocol about the top of the stack being a scratch location. Oh well.) At this point the stack looks like:

                            .
                            .
            <4+2N>(FP)      ArgN
                            .
                            .
            4(FP)           Arg0
            2(FP)           Old PC (From calling routine)
    FP-->   0(FP)           Old FP
            .               Old R4
            .               Old R3
            .               Old R2
    SP-->   0(SP)           Old PC (From CSV - unused)

Note that this top word is unused - any C routine that uses the stack will write over it.

The next thing done is to subtract an appropriate amount from SP to allocate space for automatic storage. (Static storage will be discussed in a moment.) All references to arguments are thus positive relative to the FP, and references to auto storage are negative relative to the FP. Temporaries are on top of that, but are generally accessed via the SP.

Static data comes in two flavors - global and local. Global can be initialized, and if initialized lives in what is called the DATA segment.
Local static cannot and lives in the BSS segment. (BSS stands for Block Started by Symbol; it was originally IBM 7094 terminology for "a block of reserved storage". It is where uninitializable static lives, as opposed to initialized, which is in the DATA segment.) Uninitialized global static also lives in the BSS segment.

In programs that only use I space, the order is TEXT, DATA and BSS, with TEXT starting at 0, and the DATA and BSS segments contiguous after it. (Note that in shared pure files DATA will start on a 4K boundary.) In programs that use both I and D spaces, DATA will also start at 0.

The rest of the internal workings of any given compiled C routine should be obvious to anyone with sufficient PDP-11 Assembly Language experience[1][2]. Generous use of the compiler -S option for a while will soon make it possible for you to start grubbing directly via ADB. [3] is also highly recommended to all who want to know how this garbage comes to be.

At the end of each routine, the routine stores its return value (if any) in the appropriate register(s) and does a JMP CRET. The companion routine to CSV, CRET does the inverse of the former. CRET is a cleanup routine that goes through and restores registers 2 through 4, restores SP (it is set to the current FP, which is, as you will remember, the old top of stack), restores the FP from the stack, and does an RTS PC, thereby popping the old PC and leaving the stack as it was at the time of the call.

If EXIT or _EXIT is called explicitly, they simply put their argument in R0 (for use by the EXIT call - note that if not explicitly specified this may well be garbage) and do an EXIT call. The difference is that EXIT makes a call to _CLEANUP before dying. As appropriate, it stores the old FP and gets a new one from the value of SP just before the JSR. Failing that, when MAIN exits, CRT0 calls _EXIT, with the returned value as an argument.
The reason that CRT (C Run Time support) is what starts up is that the UNIX C compiler automatically links in a CRT file of some sort unless specifically told not to via the -c option. The stack pointer and arguments will have been set up by the UNIX system during the EXEC system call. If you want to use C in a stand alone program, you will have to provide your own substitute for the initial startup, and you may want to provide your own version of such things as CSV, etc.

[1] Digital Equipment Corporation, "PDP-11 Processor Handbook," D.E.C.
[2] Ritchie, D.M., "The UNIX Assembler," Bell Labs Memo, available as part of the UNIX documentation.
[3] Ritchie, D.M., "A Tour Through the UNIX C Compiler," Bell Labs Memo, available online in UNIX.
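The CRT0 startup step described in the help file above can also be modeled in a few lines of C. This is a hedged sketch, not the real crt0: the struct image, exec_layout and word_at names are invented for illustration, the argument strings themselves are left out, and the string addresses are whatever the caller supplies.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 32 };

struct image {
    uint16_t mem[STACK_WORDS]; /* top of memory, indexed by byte address / 2 */
    uint16_t sp;               /* byte address */
};

uint16_t *word_at(struct image *im, int addr)
{
    assert(addr >= 0 && addr < 2 * STACK_WORDS && addr % 2 == 0);
    return &im->mem[addr / 2];
}

/* Lay out what EXEC leaves behind: ARGC at the SP, then ARGC string
 * pointers, then a 177777 word. str_addrs are hypothetical addresses. */
void exec_layout(struct image *im, uint16_t argc, const uint16_t *str_addrs)
{
    int addr = 2 * STACK_WORDS - 2;
    *word_at(im, addr) = 0177777;            /* word after the last pointer */
    for (int i = (int)argc - 1; i >= 0; i--) {
        addr -= 2;
        *word_at(im, addr) = str_addrs[i];   /* &ARGV[i][0] */
    }
    addr -= 2;
    *word_at(im, addr) = argc;
    im->sp = (uint16_t)addr;
}

/* CRT0 as described: move the SP one word lower, copy ARGC there, and put
 * the address of the pointer array in the word above it. */
void crt0(struct image *im)
{
    uint16_t argc = *word_at(im, im->sp);
    uint16_t argv = (uint16_t)(im->sp + 2);  /* address of the ARGV[0] slot */
    im->sp -= 2;
    *word_at(im, im->sp) = argc;             /* 0(SP) = ARGC */
    *word_at(im, im->sp + 2) = argv;         /* 2(SP) = address of pointers */
}
```

After crt0 runs, the two words at the SP are exactly the (argc, argv) pair that MAIN finds at 4(FP) and 6(FP) once the JSR PC, _MAIN and CSV have pushed the return PC and old FP.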
External links
- csv.s - original V6 csv and cret source code
- csv.s - long-return-safe source
- crt0.s