ARPANET Host-to-Host Protocol

From Computer History Wiki

Latest revision as of 17:14, 19 January 2025

The ARPANET Host-to-Host Protocol (often abbreviated as AHHP) was the core of ARPANET's original protocol suite (NCP). It provided uni-directional reliable byte streams, called 'connections', which applications used to talk to each other (usually as a pair of connections, one in each direction).

Technically, the reliable delivery underlying those streams was provided by the ARPANET Host-to-IMP Protocol (HIP); AHHP built on it and added further semantics, such as opening and closing connections. HIP transferred 'messages' (ARPANET jargon for packets) between the local host and a distant host. The top layer consisted of two semi-separate protocols: the AHHP and the Initial Connection Protocol (ICP). (The latter made use of the former.)

At the time, machines with differing word sizes were common, and the ARPANET was prepared to carry messages of varying word size. Two communicating machines therefore had to agree on a 'connection byte size', and every message sent over a connection had to contain an integral number of bytes of that size.

A connection was identified by a 32-bit 'socket' number at each end, along with the addresses of the two hosts. The polarity of the uni-directional connection was indicated by the low-order bit of the socket number (0 = receive; 1 = send).
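The polarity rule can be illustrated with a minimal sketch. The function name `socket_polarity` is hypothetical, chosen only for this example; the rule itself (low-order bit, 0 = receive, 1 = send) is the one stated above.

```python
def socket_polarity(socket: int) -> str:
    """Return 'send' or 'receive' based on the low-order bit of a socket number."""
    if not 0 <= socket < 2**32:
        raise ValueError("socket numbers are 32 bits wide")
    # Low-order bit: 0 = receive, 1 = send.
    return "send" if socket & 1 else "receive"
```

Under this rule an odd socket number can only be the sending end of a connection, and an even one only the receiving end.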

However, socket numbers did not appear in the messages of a connection; instead, the messages belonging to a connection were identified by the link (provided by HIP). (The link number appeared in every message, as links were used to carry all the messages.) One of AHHP and ICP's functions was to allow the two sides to manage which links an application instance was using. When a connection was set up, between one host/socket and another, it used a particular link, specified by the receiver, and no other connection could use that link until the connection was closed.

Links were like virtual circuits in their properties, in that messages sent on one were received reliably (although there was an error message to the host from its IMP when that didn't happen), and in order, at the other end of the link; but they had no open/close - a host just started using a link. (The ICP, a higher-level protocol, was performed to open a connection.)

One link, 0, was special: it was the 'control link'. All AHHP 'control messages', including those involved in opening and closing a connection, were sent over the control link; only data messages belonging to an open connection went over the connection's link. The control link always used a connection byte size of 8 bits. While data flowed in only a single direction on a connection, the AHHP control messages associated with a connection flowed in both directions.
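The way incoming messages were sorted by link number, rather than by socket, can be sketched roughly as follows. The class `LinkTable` and its methods are hypothetical names for illustration, and the sketch assumes one such table per remote host; it is not taken from any actual NCP implementation.

```python
CONTROL_LINK = 0  # link 0 carried all AHHP control messages


class LinkTable:
    """Maps link numbers to open connections for one remote host (illustrative)."""

    def __init__(self):
        self._by_link = {}

    def open(self, link, connection):
        # A link could carry only one connection until that connection closed.
        if link == CONTROL_LINK:
            raise ValueError("link 0 is reserved for control messages")
        if link in self._by_link:
            raise ValueError(f"link {link} is already in use")
        self._by_link[link] = connection

    def close(self, link):
        self._by_link.pop(link, None)

    def route(self, link):
        """Return 'control', the connection using this link, or None."""
        if link == CONTROL_LINK:
            return "control"
        return self._by_link.get(link)
```

A message arriving on a link with no entry in the table corresponds to the 'non-existent link' situation that the NXR and NXS opcodes report.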

Technical details

All NCP messages, including control messages, included a 40-bit header, right after the HIP header:

{|
! Field !! Length (bits) !! Function
|-
| M1 || 8 || Must be zero
|-
| S || 8 || Connection byte size
|-
| C || 16 || Byte count
|-
| M2 || 8 || Must be zero
|}

(The total length of 40 bits, rather than 32, was chosen because, combined with the 32-bit length of the initial generation of HIP header, that produced a length of 72 bits; that worked well with machines that handled 8-bit units of data, or had word lengths of 18 or 36 bits; 72 was the least common multiple of these lengths.) As mentioned, for control messages, S had to be 8.
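As a rough illustration, the 40-bit header and the 72-bit arithmetic can be checked in a few lines. The function names are hypothetical, and packing the fields most-significant-octet first is an assumption of this sketch rather than something stated above.

```python
import struct
from functools import reduce
from math import gcd

HIP_HEADER_BITS = 32   # initial-generation HIP header, per the text
AHHP_HEADER_BITS = 40  # M1 (8) + S (8) + C (16) + M2 (8)


def pack_header(byte_size, byte_count):
    """Pack M1=0, S, C, M2=0 into 5 octets (big-endian order is assumed)."""
    return struct.pack(">BBHB", 0, byte_size, byte_count, 0)


def unpack_header(raw):
    """Return (S, C) from the first 5 octets, checking the must-be-zero fields."""
    m1, size, count, m2 = struct.unpack(">BBHB", raw[:5])
    if m1 != 0 or m2 != 0:
        raise ValueError("M1 and M2 must be zero")
    return size, count


def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values)


# A control message always has S = 8; here it carries 3 bytes.
header = pack_header(8, 3)
combined_bits = HIP_HEADER_BITS + AHHP_HEADER_BITS
```

Here `combined_bits` equals `lcm_all([8, 18, 36])`, which is the alignment property the paragraph above describes.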

All AHHP control messages started with another 8-bit field, the 'opcode', which identified the function being performed. The opcode was generally followed by a set number of fixed-width fields; the details varied from opcode to opcode. The list of defined opcodes was:

{|
! Number !! Label !! Name !! Function
|-
| 0 || NOP || No operation ||
|-
| 1 || RTS || Receiver to sender || Request connection
|-
| 2 || STR || Sender to receiver || Establish the connection
|-
| 3 || CLS || Close || Terminate the connection
|-
| 4 || ALL || Allocate || Increase the sending host's space counters
|-
| 5 || GVB || Give back || Request the sending host to return some of its space allocations
|-
| 6 || RET || Return || From the sending host in response to a GVB
|-
| 7 || INR || Interrupt by receiver ||
|-
| 8 || INS || Interrupt by sender ||
|-
| 9 || ECO || Echo request ||
|-
| 10 || ERP || Echo reply ||
|-
| 11 || ERR || Error detected ||
|-
| 12 || RST || Reset ||
|-
| 13 || RRP || Reset reply ||
|-
| 14% || RAR || Reset Allocation by Receiver || Indicates the RAS has been done
|-
| 15% || RAS || Reset Allocation by Sender || Tells the receiver to clear allocations
|-
| 16% || RAP || Reset Allocation Please || Sent to sender to ask it to send an RAS
|-
| 17% || NXR || Non-existent Receive link || Tells sender the link it used is non-existent (e.g., data message)
|-
| 18% || NXS || Non-existent Send link || Tells receiver the link it used is non-existent (e.g., ALL)
|}

Beyond the 14 opcodes defined in the official AHHP specification document (NIC #8246), RFC 636 added five more, marked with a '%' above. Examining the source for ITS, WAITS, and TENEX shows that they knew about these; ELF and Unix did not.
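The opcode list can be written down as an enumeration, as in the sketch below. The names `Opcode`, `RFC636_EXTENSIONS` and `describe` are hypothetical; the numeric values and labels are those given in the table.

```python
from enum import IntEnum


class Opcode(IntEnum):
    NOP = 0
    RTS = 1
    STR = 2
    CLS = 3
    ALL = 4
    GVB = 5
    RET = 6
    INR = 7
    INS = 8
    ECO = 9
    ERP = 10
    ERR = 11
    RST = 12
    RRP = 13
    # The five opcodes marked '%' (added by RFC 636):
    RAR = 14
    RAS = 15
    RAP = 16
    NXR = 17
    NXS = 18


RFC636_EXTENSIONS = frozenset(op for op in Opcode if op >= Opcode.RAR)


def describe(first_octet):
    """Name the opcode in the first octet of a control message body."""
    try:
        op = Opcode(first_octet)
    except ValueError:
        return f"unknown opcode {first_octet}"
    suffix = " (RFC 636 extension)" if op in RFC636_EXTENSIONS else ""
    return op.name + suffix
```

An implementation that predates RFC 636, like the ELF and Unix ones mentioned above, would presumably treat values 14 through 18 the way this sketch treats any other unknown value.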

The STR and RTS control messages (called 'request for connection', with the confusing acronym 'RFC') were used to establish a connection; it was suggested that they be queued for later processing if there was no immediate consumer for them. The ALL, GVB and RET control commands were used for flow control on the connection. The meaning of the 'interrupt' was not defined by the AHHP; it was merely made available to users for such use as they saw fit.
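The interplay of ALL, GVB and RET can be sketched from the sending host's point of view. This is only an illustration: the class and method names are hypothetical, and two details are assumptions not stated in this article, namely that space is tracked as a pair of counters (messages and bits) and that a GVB expresses the amount to return in 128ths of the current allocation.

```python
class SenderAllocation:
    """Sender-side space counters for one connection (illustrative sketch)."""

    def __init__(self):
        self.messages = 0  # assumed: count of messages the sender may still send
        self.bits = 0      # assumed: count of data bits the sender may still send

    def on_all(self, messages, bits):
        # ALL from the receiver increases the sender's space counters.
        self.messages += messages
        self.bits += bits

    def can_send(self, bits):
        return self.messages >= 1 and self.bits >= bits

    def on_send(self, bits):
        if not self.can_send(bits):
            raise RuntimeError("insufficient allocation")
        self.messages -= 1
        self.bits -= bits

    def on_gvb(self, msg_128ths, bit_128ths):
        # GVB asks for some allocation back; the result is what a RET would carry.
        returned_messages = self.messages * msg_128ths // 128
        returned_bits = self.bits * bit_128ths // 128
        self.messages -= returned_messages
        self.bits -= returned_bits
        return returned_messages, returned_bits
```

In this model the sender may transmit only while both counters allow it, so the receiver controls the data rate by how much space it grants and reclaims.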

Initial Connection Protocol

The ICP made use of the AHHP in the process of setting up the connection(s) used by most applications. Most applications had a 'well-known socket' (specified as part of that application's protocol specification), with which a distant host wishing to use that application began by making contact. That initial contact was made from an effectively random socket, 'U', at the client. (It was often not entirely random, but allocated according to rules local to the client host; those rules were not known to the server host, and were irrelevant to it.)

The function of the ICP was to pass back to the client the socket at the server ('S') to be used to communicate with the new instance of the application. To do this, a connection was temporarily opened from the application's well-known socket at the server back to the socket 'U' of the client attempting to connect. That connection was used solely to send 'S' to the client, after which it was closed.

The exact sequence used is as follows ('W' is the well-known socket of that application):

{|
! Sent from client !! Sent from server
|-
| RTS receive socket U, send socket W, link1 ||
|-
| || STR send socket W, receive socket U, byte size 32
|-
| ALL link1 ||
|-
| || Data on link1: one 32-bit socket number S
|-
| CLS W, U || CLS U, W
|}

After this exchange, connections were opened between the client and server, using sockets 'U' and 'S' (or sometimes minor variants derived from them algorithmically, with the algorithm also given in the application's protocol specification). The entity communicating through the well-known socket, W, was generally a different piece of software from the application itself, often one whose only task was to perform the ICP and then hand off the necessary data (U and S) to an instance of the application. That module is sometimes called the 'logger' in early NCP documentation.
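The ICP exchange can be written out as an ordered list of messages, as in the sketch below. The function name, the tuple layout and the example socket and link values are all hypothetical; the order of the steps and the arguments shown follow the table above, the ALL message's allocation amounts are omitted because the table omits them, and sending S most-significant-octet first is an assumption.

```python
def icp_exchange(U, W, S, link1):
    """Return the ICP messages in order as (sender, kind, fields) tuples."""
    return [
        ("client", "RTS", {"receive_socket": U, "send_socket": W, "link": link1}),
        ("server", "STR", {"send_socket": W, "receive_socket": U, "byte_size": 32}),
        ("client", "ALL", {"link": link1}),
        # One 32-bit socket number S, sent as data on link1 (byte order assumed).
        ("server", "DATA", {"link": link1, "payload": S.to_bytes(4, "big")}),
        # CLS arguments are copied in the order the table gives them.
        ("client", "CLS", {"sockets": (W, U)}),
        ("server", "CLS", {"sockets": (U, W)}),
    ]


# Hypothetical values: W is odd (a send socket), U is even (a receive socket).
steps = icp_exchange(U=0x200, W=1, S=0x1234, link1=45)
```

Only the data message travels on `link1`; the RTS, STR, ALL and CLS steps are control messages and so would go over link 0.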

Further reading

  • Alex McKenzie; Jon Postel "Host-to-Host Protocol for the ARPANET", October 1977, NIC #8246, Network Information Center; reproduced in RFC 6529
    • Host-to-Host Protocol for the ARPANET (archived). Note: this is not a scan of the original document (see the scan of the 'ARPANET Protocol Handbook' for that); the format of the ALL message in the 'Control Command Summary' is erroneous in this reproduction.
  • Jon Postel, "Official Initial Connection Protocol", June 1971, NIC #7101, UCLA-NMC (this does not seem to be online, but an early version, which is almost identical to the final version, can be found here)

External links