UNIX a.out file

From Computer History Wiki

Latest revision as of 19:00, 6 April 2022

UNIX a.out files hold binary files of two sorts: relocatable binary files, and executable binary files (ready to be loaded into main memory and run).

They generally hold, in order:

  • a header giving overall information about the contents
  • object code
  • initialized data
  • relocation information
  • a symbol table

The relocation information is used during linking of relocatable binary files into an executable binary file; the last two may not exist in finalized executable binary files.

PDP-11 a.out format

The a.out header on a PDP-11 is always 8 words (16 bytes) long. Its format, with byte offsets given in octal, is:

Offset  Contents
0       A magic number (below)
2       Program text size
4       Initialized data size
6       Uninitialized (BSS) data size
010     Symbol table size
012     Entry location
014     Unused
016     Flag indicating relocation information has been suppressed
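As a minimal sketch of reading this header, the Python below unpacks the eight words and derives where each part of the file would start. The names `AoutHeader`, `parse_header` and `section_offsets` are illustrative, not taken from any UNIX source. It assumes that words are stored little-endian (as on the PDP-11), that the sizes are byte counts, and that the parts follow one another with no padding.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 16  # 8 words of 2 bytes each


class AoutHeader(NamedTuple):
    magic: int
    text_size: int
    data_size: int
    bss_size: int
    symtab_size: int
    entry: int
    unused: int
    reloc_suppressed: int


def parse_header(raw: bytes) -> AoutHeader:
    """Unpack the first 16 bytes as eight little-endian 16-bit words."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 16 bytes for an a.out header")
    return AoutHeader(*struct.unpack("<8H", raw[:HEADER_SIZE]))


def section_offsets(h: AoutHeader) -> dict:
    """File offsets of each part, assuming they are stored back to back."""
    text = HEADER_SIZE
    data = text + h.text_size
    reloc = data + h.data_size
    # Assumption: one relocation word per word of text and initialized data,
    # so the relocation area is as large as text + data unless suppressed.
    reloc_size = 0 if h.reloc_suppressed else h.text_size + h.data_size
    return {"text": text, "data": data, "reloc": reloc, "symtab": reloc + reloc_size}


# A made-up header: 0407 magic, 64 bytes of text, 16 bytes of data.
sample = struct.pack("<8H", 0o407, 0o100, 0o20, 0o4, 0o30, 0, 0, 0)
header = parse_header(sample)
offsets = section_offsets(header)
```

With the made-up header above, the text would start at byte 16 and the data at byte 80; the symbol table offset depends on whether the relocation flag is set.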

Magic number values are: 0407 (text is not write-protected and not shared), 0410 (text is write-protected, and one copy in main memory will be shared by all processes executing that file), or 0411 (as for 0410, but instruction and data space are separate, with both beginning at 0; i.e. 'split I&D'). The origin of the '0407' will be obvious to anyone familiar with PDP-11 object code; in the early days of UNIX, an executable binary file was loaded into memory without stripping off the header, and started at location '0'; the '0407' is a BR instruction which skips over the header to the first location after it.
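The claim about 0407 can be checked by decoding it as a PDP-11 BR instruction. The sketch below is illustrative only (`decode_br` is a hypothetical helper); it relies on the standard BR encoding, in which the high byte is 001 and the low byte is a signed word offset relative to the address of the following instruction.

```python
def decode_br(word: int, address: int = 0) -> int:
    """Return the branch target of a PDP-11 BR instruction at `address`."""
    # BR is encoded as 0004xx (octal): high byte 001, low byte the offset.
    if word & 0o177400 != 0o000400:
        raise ValueError("not a BR instruction")
    offset = word & 0o377
    if offset & 0o200:  # sign-extend the 8-bit offset
        offset -= 0o400
    # The offset counts words from the updated PC (address + 2).
    return (address + 2 + 2 * offset) & 0o177777


# 0407 placed at location 0 branches to byte 020 (octal), which is the
# first location after the 8-word header.
target = decode_br(0o407, address=0)
```

Here `target` comes out as 16 (020 octal), matching the header length.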

The 'entry location' was always 0 on the PDP-11.

The symbol table consists of an array of 6-word entries. The first four words contain the symbol in ASCII, padded on the end with '0' bytes. The next word is a field indicating the type of symbol (below). The final word is the value (possibly not final in relocatable binary files). Symbol types are:

Value (octal)  Symbol type
00             undefined
01             absolute
02             text
03             data
04             BSS
24             register assignment
37             file name (produced by the linker, 'ld')
40             undefined external
41             absolute external
42             text external
43             data external
44             BSS external

If the symbol's type is undefined external and the value field is non-zero, the symbol names a common region whose size is given by the value.
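The entry layout and type codes above can be sketched in a few lines of Python. `parse_symbol` and `SYMBOL_TYPES` are hypothetical names for illustration, and the code assumes little-endian words, as on the PDP-11.

```python
import struct

SYMBOL_SIZE = 12  # 6 words: 4 for the name, 1 for the type, 1 for the value

# Type codes from the table above (octal).
SYMBOL_TYPES = {
    0o00: "undefined", 0o01: "absolute", 0o02: "text", 0o03: "data",
    0o04: "BSS", 0o24: "register assignment", 0o37: "file name",
    0o40: "undefined external", 0o41: "absolute external",
    0o42: "text external", 0o43: "data external", 0o44: "BSS external",
}


def parse_symbol(entry: bytes) -> dict:
    """Decode one 12-byte symbol table entry."""
    if len(entry) != SYMBOL_SIZE:
        raise ValueError("a symbol entry is exactly 12 bytes")
    raw_name, sym_type, value = struct.unpack("<8sHH", entry)
    return {
        "name": raw_name.rstrip(b"\0").decode("ascii"),
        "type": SYMBOL_TYPES.get(sym_type, "unknown"),
        "value": value,
        # Undefined external with a non-zero value names a common region.
        "is_common": sym_type == 0o40 and value != 0,
    }


# Two made-up entries: a text external, and a 512-byte common region.
main_sym = parse_symbol(b"_main".ljust(8, b"\0") + struct.pack("<HH", 0o42, 0o100))
buf_sym = parse_symbol(b"_buf".ljust(8, b"\0") + struct.pack("<HH", 0o40, 512))
```

Iterating over a whole symbol table would then amount to slicing it into consecutive 12-byte chunks and calling `parse_symbol` on each.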

If a word in the text or data section involves a reference to an undefined external symbol (as indicated by the relocation bits for that word, described below), the value of the word as stored in the file is an offset from the associated external symbol. When the file is processed by the linker, the eventual value of the symbol is added into the word.

If relocation information is present, there is one word per word of text or initialized data. Bits 3-1 of a relocation word indicate the entity referred to by the word associated with the relocation word:

Value (octal)  Reference type
00             the reference is absolute
02             the reference is to the text segment
04             the reference is to initialized data
06             the reference is to BSS
10             the reference is to an undefined external symbol

For references to undefined external symbols, the remaining bits of the relocation word (15-4) contain the index of the symbol's entry in the symbol table (numbered sequentially from 0).
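A relocation word can be picked apart with a mask and a shift. The sketch below is illustrative only: `decode_reloc` and `resolve_external` are hypothetical names, bit 0 is not described above and is ignored, and the 16-bit wrap-around when adding the symbol value is an assumption based on the PDP-11 word size.

```python
# Reference types from the table above, keyed by bits 3-1 left in place.
REFERENCE_TYPES = {
    0o00: "absolute",
    0o02: "text",
    0o04: "data",
    0o06: "BSS",
    0o10: "undefined external",
}


def decode_reloc(word: int):
    """Return (reference type, symbol index or None) for a relocation word."""
    kind = word & 0o16  # keep bits 3-1; bit 0 is ignored in this sketch
    # Bits 15-4 hold a symbol table index only for external references.
    index = (word >> 4) & 0o7777 if kind == 0o10 else None
    return REFERENCE_TYPES.get(kind, "unknown"), index


def resolve_external(stored_word: int, symbol_value: int) -> int:
    """Add the symbol's final value to the offset stored in the file."""
    return (stored_word + symbol_value) & 0o177777  # assumed 16-bit wrap


# A made-up relocation word referring to symbol table entry 5.
ref_type, sym_index = decode_reloc((5 << 4) | 0o10)
# The stored word held an offset of 4 from a symbol finally placed at 01000.
patched = resolve_external(4, 0o1000)
```

In this example `sym_index` is 5, and the patched word is the symbol's value plus the stored offset.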

External links