Repairing un-documented MOS memory boards

From Computer History Wiki
Revision as of 17:33, 26 August 2024 by Jnc (talk | contribs) (Grammar tweaks to clarify)

It is usually possible, even without schematics, to repair MOS memory boards where the board is basically working, but just has some failing bits.

The first step is to see whether the failures are static (i.e. always at the same address). If not, the issue may be the power supply, or circuitry on the board which creates needed voltages; this page does not cover those cases. Static errors, however, probably indicate a failing or failed chip, a common failure mode of old MOS memory chips.

The next step, when working on such a board without a circuit diagram, is to create a table which maps each memory chip to the bit it holds.

First, one has to understand the high-level layout of the memory: examine the memory chips to see if they are YYKx1 - i.e. one bit wide. (The techniques below will work on multi-bit chips, using obvious changes, but since single-bit chips are the most common on older memory cards, the writeup here focuses on them.) Knowing the width of the bus, and the number of chips, will usually give the number of banks of memory (needed for the table). For example, on a UNIBUS or QBUS memory card, the bus is 16 bits wide, so if there are 32 1-bit-wide chips in the array, there will probably be 2 banks, each 16 bits wide.
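The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are made up for this example, and it assumes every chip in the array has the same width. If the chip count does not divide evenly, the board may carry extra chips (for parity, say) and the simple rule does not apply.

```python
def bank_count(chip_count, bus_width_bits, chip_width_bits=1):
    """Return the probable number of banks on a memory board.

    Hypothetical helper: assumes all chips in the array are the same width.
    """
    chips_per_bank, remainder = divmod(bus_width_bits, chip_width_bits)
    if remainder:
        raise ValueError("bus width is not a multiple of the chip width")
    banks, leftover = divmod(chip_count, chips_per_bank)
    if leftover:
        raise ValueError("chip count is not a whole number of banks")
    return banks


# The example from the text: 32 one-bit-wide chips on a 16-bit bus.
print(bank_count(32, 16))  # 2
```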

From there, there are two basic techniques to populate the table: pulling chips, and using a program (each is described below).

The choice of which one to use will depend on personal preference, along with factors such as whether the chips are socketed (common on MOS memory boards, to simplify repairs).

Pulling chips

The general approach is to pull a chip, and then store data in memory, and read it back, to try and work out which bit is stored in that chip. Having done that, then repeat with other chips, to try and work out which bits are stored in which chips. (Unless the designers were doing something very strange, each chip will hold the same bit in all the words in that bank.)

Usually a missing chip results in bits stored in that chip reading as '0', but it's possible they will read back as '1'. (The MSV11-J QBUS memory operates in the latter way, for instance.) To test for the first possibility, start by finding a location in each bank that can be written to all 0's and all 1's (read back after writing, to verify).

Then pull a chip, write all 1's to that word in each bank, and read it back. If one bank now has a 0 bit, congratulations: that i) verifies that missing chips read as '0', ii) indicates which bank that chip is part of, and iii) shows, from the position of the 0 bit, which bit that chip holds - fill in that entry in the chip<->bit chart.

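If not, try writing 0's to the words in each bank, and check for a '1' bit: if one appears, then i) missing chips read as '1', and ii) and iii) follow as above. If neither this nor the above happens, there is some other problem, which will have to be tracked down before continuing.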
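The decision procedure in the last two paragraphs can be written down as a small Python function, which may help when recording results by hand. It is a sketch only: the function name and the dictionary layout (bank number mapped to the word read back) are assumptions made for this example, and the words themselves still have to be read off the machine.

```python
def classify_pulled_chip(ones_readback, zeros_readback, width=16):
    """Work out (bank, bit, reads_as) for a pulled chip, or None if unclear.

    ones_readback:  {bank: word read back after writing all 1's}
    zeros_readback: {bank: word read back after writing all 0's}
    """
    all_ones = (1 << width) - 1
    # First test: after writing all 1's, look for a bit that came back 0.
    hits = [(bank, all_ones & ~word, 0)
            for bank, word in ones_readback.items()
            if all_ones & ~word]
    # Second test, only if the first found nothing: after writing all 0's,
    # look for a bit that came back 1.
    if not hits:
        hits = [(bank, word & all_ones, 1)
                for bank, word in zeros_readback.items()
                if word & all_ones]
    if len(hits) != 1:
        return None          # nothing found, or more than one bank affected
    bank, mask, reads_as = hits[0]
    if mask & (mask - 1):
        return None          # more than one bit affected in that bank
    return bank, mask.bit_length() - 1, reads_as


# Bank 1 dropped bit 8 after all 1's were written: missing chips read as 0.
print(classify_pulled_chip({0: 0o177777, 1: 0o177377}, {0: 0, 1: 0}))
```

A result of None corresponds to the "some other problem" case: either no bit changed, or more than one did.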

Otherwise, try pulling another chip, work out which bit that one is, and add it to the chart. Repeat for all the memory chips - although if you're lucky, after a couple you might find a pattern, and be able to predict which chips hold which bits. (But not always; see e.g. the Q-RAM 11 and NS23M for boards in which the assignment of bits to chips is fairly random.) If there does seem to be a pattern, do a few spot tests of the predictions, to make sure the hypothesized pattern is correct.

Using a program

The previous technique is viable if the memory chips are in sockets, which makes the chip removal (if needed) less painful; if the chips are soldered in, that technique is not really feasible.

For such boards, storing a word with a single '1' bit can be used to the same effect. To start, the data sheet for the memory chips used will indicate which pin is the 'data in' pin. Connect the test device (either a logic analyzer, or oscilloscope, can be used) to that pin.

Write a very short program which loops, storing a word in memory. Start with a word containing only a single '1' bit, and try a destination location in each bank in turn, looking for a '1' being written to the chip under the probe. (You can trigger off the 'write' input to the chip; note, though, that some boards, e.g. the MSV11-Q, send a 'write' to all the chips, and select the one to actually use by means of the RAS or CAS signal.) Seeing a '1' indicates that i) your probe is on the chip which corresponds to that bit, and ii) the chip is in the bank currently under test.

If not, use a word with a different bit set, and repeat. (Changing the word contents, while remaining on the same chip, will almost certainly be easier than keeping the data constant, and trying different chips.) By 'floating' the 1 bit along the word, it should be possible to work out which bit is stored in that chip.
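The bookkeeping for 'floating' the 1 bit can be modelled in Python, as below. This is not the program run on the machine under test (that would be a few instructions in its own machine language); it only generates the sequence of test words, in octal, and shows the search order. The `saw_one` callable is a hypothetical stand-in for the observation made on the logic analyzer or oscilloscope.

```python
def walking_ones(width=16):
    """Yield (bit number, word) for each word with a single '1' bit."""
    for bit in range(width):
        yield bit, 1 << bit


def identify_chip(saw_one, banks, width=16):
    """Return (bank, bit) for the chip under the probe, or None.

    saw_one(bank, word) must report whether a '1' was seen on the probed
    chip's 'data in' pin while `word` was being stored into `bank`.
    """
    for bank in banks:
        for bit, word in walking_ones(width):
            if saw_one(bank, word):
                return bank, bit
    return None


# The words to store, as they would be entered on a 16-bit machine.
for bit, word in walking_ones():
    print(f"bit {bit:2d}: {word:06o}")

# Simulated probe on a chip which holds bit 5 of bank 1.
simulated = lambda bank, word: bank == 1 and bool(word & (1 << 5))
print(identify_chip(simulated, banks=[0, 1]))  # (1, 5)
```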

Once that chip has been identified, move on to the next chip. Again, after a few chips have been done, it may be possible to see a pattern. (This technique was used to produce the bit charts for the MSV11-J and MSV11-Q.)

Repair

With the completed chart in hand, given a failing word (address and bad data), it is possible to work out which chip is at fault, and it can be replaced.
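As a sketch of that lookup in Python: XOR the expected and actual data to find the failing bit(s), then look each one up in the chart. The chart contents and chip designations here ("U1" to "U32") are invented purely for illustration, and the bank has to be worked out from the failing address by whatever rule the particular board uses, which this example does not attempt.

```python
def failing_chips(chart, bank, expected, actual):
    """Return the chips holding the bits which differ in a failing word.

    chart maps (bank, bit) to a chip designation.
    """
    diff = expected ^ actual            # 1 bits mark the bad positions
    return [chart[(bank, bit)]
            for bit in range(diff.bit_length())
            if (diff >> bit) & 1]


# Hypothetical chart for a board with 2 banks of 16 one-bit-wide chips.
chart = {(bank, bit): f"U{bank * 16 + bit + 1}"
         for bank in range(2) for bit in range(16)}

# Wrote 177777 to a word in bank 0, read back 177767: bit 3 is bad.
print(failing_chips(chart, 0, 0o177777, 0o177767))  # ['U4']
```

If more than one chip is returned for a single word, it may be worth re-checking for a fault common to the whole bank before replacing chips.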