UNIX a.out file
Revision as of 18:56, 6 April 2022
UNIX a.out files hold binaries of various sorts: both relocatable binaries and executable binaries (ready to be loaded into main memory and run).
They generally hold, in order:
- a header giving overall information about the contents
- object code
- initialized data
- relocation information
- a symbol table
The relocation information is used when linking relocatable binary files into an executable binary file; the last two items (the relocation information and the symbol table) may be absent from finalized executable binary files.
PDP-11 a.out format
The format of the a.out header on a PDP-11 (always 8 words long) is as follows, with offsets given in bytes, in octal:
Offset | Contents |
---|---|
0 | A magic number (below) |
2 | Program text size |
4 | Initialized data size |
6 | Uninitialized (BSS) data size |
010 | Symbol table size |
012 | Entry location |
014 | Unused |
016 | Flag indicating relocation information has been suppressed |
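As an illustration of the table above, the header can be decoded as eight little-endian 16-bit words (the PDP-11 stores the low byte of a word first). This is only a minimal sketch: the names `AoutHeader` and `parse_header` and the sample values are invented for the example and do not come from any UNIX source.

```python
import struct
from typing import NamedTuple

HEADER_FORMAT = "<8H"  # eight little-endian 16-bit words (PDP-11 byte order)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 16 bytes


class AoutHeader(NamedTuple):
    """Field names are illustrative; they follow the table's row order."""
    magic: int
    text_size: int
    data_size: int
    bss_size: int
    symtab_size: int
    entry: int
    unused: int
    reloc_suppressed: int


def parse_header(raw: bytes) -> AoutHeader:
    """Decode the first 16 bytes of a PDP-11 a.out file."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 16 bytes for the header")
    return AoutHeader(*struct.unpack_from(HEADER_FORMAT, raw))


# Hypothetical header contents, for illustration only.
sample = struct.pack(HEADER_FORMAT, 0o407, 0o1000, 0o200, 0o100, 0o60, 0, 0, 1)
header = parse_header(sample)
print(oct(header.magic), header.text_size, header.symtab_size)
```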
Magic number values are: 0407 (text is not write-protected and not shared), 0410 (text is write-protected, and one copy in main memory is shared by all processes executing that file), or 0411 (as for 0410, but instruction and data space are separate, with both beginning at 0; i.e. 'split I&D'). The origin of the '0407' will be obvious to anyone familiar with PDP-11 object code: in the early days of UNIX, an executable binary file was loaded into memory without stripping off the header, and started at location 0; 0407 is a BR (branch) instruction which skips over the 8-word header to the first location after it.
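For readers less familiar with PDP-11 object code, the arithmetic behind 0407 can be worked through as follows. The sketch assumes the standard PDP-11 BR encoding (high byte 001, low byte a signed offset counted in words from the already-advanced PC); the function name `branch_target` is made up for the example.

```python
def branch_target(instruction: int, address: int) -> int:
    """Return the destination of a PDP-11 BR instruction located at `address`."""
    if instruction & 0o177400 != 0o000400:
        raise ValueError("not a BR instruction")
    offset = instruction & 0o377          # low byte: signed word offset
    if offset & 0o200:                    # sign-extend the 8-bit offset
        offset -= 0o400
    # The PC has already advanced past the instruction (address + 2)
    # when the offset, counted in words, is applied.
    return (address + 2 + 2 * offset) & 0o177777


target = branch_target(0o407, 0)
print(oct(target))  # 0o20: byte 16, the first location after the 8-word header
```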
The 'entry location' was always 0 on the PDP-11.
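The section order listed at the top of this article, combined with the header sizes, suggests how the file offsets of each part could be computed. The following is a rough sketch under several assumptions that the text does not state explicitly: that the header sizes are byte counts, that the sections are contiguous with no padding, and that the relocation area (when present) is exactly as large as text plus initialized data. The function name `section_offsets` is hypothetical.

```python
HEADER_SIZE = 16  # 8 words of 2 bytes each


def section_offsets(text_size, data_size, symtab_size, reloc_suppressed):
    """Compute byte offsets of each section, in the order the article lists them."""
    text_off = HEADER_SIZE
    data_off = text_off + text_size
    reloc_off = data_off + data_size
    # One relocation word per word of text or initialized data,
    # unless the header flag says relocation has been suppressed.
    reloc_size = 0 if reloc_suppressed else text_size + data_size
    symtab_off = reloc_off + reloc_size
    return {
        "text": text_off,
        "data": data_off,
        "relocation": reloc_off,
        "symbols": symtab_off,
        "end": symtab_off + symtab_size,
    }


# Hypothetical sizes, for illustration only.
layout = section_offsets(512, 128, 48, reloc_suppressed=False)
print(layout)
```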
The symbol table consists of an array of 6-word entries. The first four words contain the symbol name in ASCII, padded at the end with zero (NUL) bytes. The next word is a field indicating the type of symbol (below). The final word is the value (possibly not final in relocatable binary files). Symbol types (in octal) are:
Value | Symbol type |
---|---|
00 | undefined |
01 | absolute |
02 | text |
03 | data |
04 | BSS |
24 | register assignment |
37 | file name (produced by the linker, 'ld') |
40 | undefined external |
41 | absolute external |
42 | text external |
43 | data external |
44 | BSS external |
If the symbol's type is undefined external, and the value field is non-zero, the symbol names a common region, of a size indicated by the value.
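The entry layout and type codes above can be sketched as a small decoder. This is an illustrative sketch only: it assumes little-endian words, and the names `parse_symbols` and `SYMBOL_TYPES`, along with the sample symbols, are invented for the example.

```python
import struct

SYMBOL_FORMAT = "<8sHH"  # 8 name bytes, then the type word, then the value word
SYMBOL_SIZE = struct.calcsize(SYMBOL_FORMAT)  # 12 bytes, i.e. 6 words

SYMBOL_TYPES = {
    0o00: "undefined", 0o01: "absolute", 0o02: "text", 0o03: "data",
    0o04: "BSS", 0o24: "register assignment", 0o37: "file name",
    0o40: "undefined external", 0o41: "absolute external",
    0o42: "text external", 0o43: "data external", 0o44: "BSS external",
}


def parse_symbols(table: bytes):
    """Return (name, type name, value, is_common) for each complete entry."""
    symbols = []
    for offset in range(0, len(table) - SYMBOL_SIZE + 1, SYMBOL_SIZE):
        raw_name, sym_type, value = struct.unpack_from(SYMBOL_FORMAT, table, offset)
        name = raw_name.rstrip(b"\0").decode("ascii", errors="replace")
        kind = SYMBOL_TYPES.get(sym_type, "unknown")
        # An undefined external with a non-zero value names a common region.
        is_common = sym_type == 0o40 and value != 0
        symbols.append((name, kind, value, is_common))
    return symbols


# Two hypothetical entries, for illustration only.
table = (struct.pack(SYMBOL_FORMAT, b"_main", 0o42, 0o20)
         + struct.pack(SYMBOL_FORMAT, b"_buf", 0o40, 0o100))
symbols = parse_symbols(table)
print(symbols)
```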
If a word in the text or data section involves a reference to an undefined external symbol (as indicated by the relocation bits for that word - below), the value of the word as stored in the file is an offset from the associated external symbol. When the linker binds the file into its output and the symbol becomes defined, the eventual value of the symbol is added into the word.
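The fix-up described here amounts to a single addition. A minimal sketch, assuming the sum wraps around at 16 bits as PDP-11 word arithmetic does; the function name and the sample numbers are hypothetical.

```python
def resolve_external_word(stored_word: int, symbol_value: int) -> int:
    """Add the symbol's final value to the offset stored in the file."""
    # Assumption: results are truncated to a 16-bit PDP-11 word.
    return (stored_word + symbol_value) & 0o177777


# A word stored as "symbol + 4", with the symbol finally placed at 02000.
patched = resolve_external_word(0o4, 0o2000)
print(oct(patched))  # 0o2004
```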
If relocation information is present, there is one word per word of text or initialized data. Bits 3-1 of a relocation word indicate the entity referred to by the word associated with the relocation word:
Value | Reference type |
---|---|
00 | the reference is absolute |
02 | the reference is to the text segment |
04 | the reference is to initialized data |
06 | the reference is to BSS |
10 | the reference is to an undefined external symbol |
For references to undefined external symbols, the remaining bits of the relocation word (15-4) contain the index of the symbol's entry in the symbol table (numbered sequentially from 0).
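The bit layout of a relocation word can be illustrated with a short decoder. It is a sketch that covers only the fields described above (bits 3-1 and bits 15-4); the names `decode_relocation_word` and `REFERENCE_TYPES` are invented for the example.

```python
REFERENCE_TYPES = {
    0o00: "absolute",
    0o02: "text",
    0o04: "initialized data",
    0o06: "BSS",
    0o10: "undefined external",
}


def decode_relocation_word(word: int):
    """Return (reference type, symbol index or None) for one relocation word."""
    ref_bits = word & 0o16                    # keep bits 3-1 only
    ref_type = REFERENCE_TYPES.get(ref_bits, "unknown")
    # Bits 15-4 hold a symbol table index only for undefined externals.
    symbol_index = (word >> 4) & 0o7777 if ref_bits == 0o10 else None
    return ref_type, symbol_index


# Hypothetical relocation words, for illustration only.
print(decode_relocation_word(0o02))             # ('text', None)
print(decode_relocation_word((3 << 4) | 0o10))  # ('undefined external', 3)
```

The index returned for an undefined external selects a 6-word entry in the symbol table, so entry 3 would start 36 bytes into the table.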