What happens to instructions given to the Little man in the LMC that begin with 4?

Question

This might be a really strange question, but I have been doing some work on the little man computer and it mildly annoys me that not only is there no operation code number 4, but there is absolutely no information on the internet as to why.

The opcodes go 0-9 but skip 4. Are there never any three-digit codes that start with 4? What happens if there is one?

Is there anyone out there that would be able to help answer this question? I just find it so strange. Thank you!

Maybe because the Chinese word for death sounds similar to the word for `4`. Or maybe not. — Kayaman, Mar 03 '17 at 13:10

trincot · Answer 1 · 2020-10-29T10:44:01.623

The Little Man Computer (LMC) was initially not presented as a complete specification. It is more a model, a paradigm. There are several things left undefined, like for instance what should happen when the unused opcode 4 is encountered. The aim of LMC was to introduce students to the concepts of machine code and instruction sets and demonstrate that the power of a computer does not come from complexity. The aim was not so much to explain all the details of what happens with badly designed code.

What should happen is not defined. Concrete implementations make a choice: the instruction either causes abnormal program termination (as stated here), is executed as a no-operation instruction, or (very unlikely) does something else entirely. The main message is that programs should not rely on any particular implementation choice, and should never run into such an opcode. If you really want to know what would happen, look in the documentation that comes with the specific implementation (emulator) you are using.
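To make the two common choices concrete, here is a minimal sketch of how an emulator's decode step might treat an undefined opcode. It is not taken from any real emulator: the names `decode`, `UndefinedOpcode`, and the `on_undefined` switch are hypothetical, and it simplifies by treating every 0xx instruction as HLT and every 9xx instruction as I/O.

```python
class UndefinedOpcode(Exception):
    """Raised when an instruction has no defined meaning (hypothetical name)."""


# Mnemonics of the popular instruction set, keyed by the first digit.
# Simplification: any 0xx is treated as HLT and any 9xx as I/O (901 INP, 902 OUT).
# Opcode 4 is deliberately absent.
MNEMONICS = {0: "HLT", 1: "ADD", 2: "SUB", 3: "STA", 5: "LDA",
             6: "BRA", 7: "BRZ", 8: "BRP", 9: "I/O"}


def decode(instruction, on_undefined="abort"):
    """Split a three-digit instruction into (mnemonic, address).

    on_undefined is an assumed policy switch:
    "abort" raises UndefinedOpcode, "nop" treats the instruction as a no-op.
    """
    opcode, address = divmod(instruction, 100)
    if opcode in MNEMONICS:
        return MNEMONICS[opcode], address
    if on_undefined == "nop":
        return "NOP", address
    raise UndefinedOpcode(f"undefined instruction {instruction:03d}")
```

With this sketch, `decode(318)` gives `("STA", 18)`, `decode(400, on_undefined="nop")` gives `("NOP", 0)`, and `decode(400)` raises `UndefinedOpcode`. Which of these a given emulator does is exactly the implementation choice described above.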

There is no particular reason why 4 is unused. The initial LMC (in 1965) had a slightly different set, where it seems that opcode 4 was used. The more popular set was introduced later, and is also presented in "The Architecture of Computer Hardware and System Software" (Irv Englander). Several other implementations of LMC describe extensions (like here) where opcode 4 gets a use.

One (accidental?) benefit

There is one benefit I have found with the undefined 4 opcode, although I consider this benefit unintended, and it is only applicable when an LMC implementation aborts (with an error message) when it bumps into a 4 opcode:

When a program needs to manage an array, it will have to use self-modifying code in order to achieve indirect addressing. Such programs may not have code to detect that the array is overflowing the number of available mailboxes, and in that case the invalid 4-opcode "feature" will greatly help debug what went wrong.

Take for instance the program below, which reads a variable number of inputs: the first input indicates how many more inputs follow. It stores these inputs as an array. I don't include the processing of this array, which could be anything, for instance sorting:

#input: 90 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 
         INP
         BRZ zero ; nothing to do
         STA size ; for later reference
loop     SUB one
         BRP continue
         HLT ; placeholder for some processing
continue STA counter
         INP
dynamic  STA array
         LDA dynamic
         ADD one
         STA dynamic
         LDA counter
         BRA loop
zero     DAT 0
one      DAT 1
size     DAT
counter  DAT
array    DAT 

<script src="https://cdn.jsdelivr.net/gh/trincot/lmc@v0.77/lmc.js"></script>

Notice what happens when you run this snippet with the 91 inputs that are provided by default. Focus in particular on the line labelled "dynamic". This is the line that is subject to self-modification: it is a STA instruction responsible for storing the last input at the next slot in the growing array. Its opcode is 3xx. But when this xx becomes too large, the instruction moves from 399 to 400, and suddenly it has become an invalid instruction. This is actually good, because now the program will halt immediately.

Had the 4xx opcode been valid, program execution would have continued, but it would certainly not have done what was expected of it, and it would have been harder to find out why.
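The arithmetic behind that overflow can be checked with a small sketch. It assumes the program above assembles from mailbox 0 with one instruction per line, which puts `array` at mailbox 18 (found by counting lines, so treat it as an assumption); the function name `stores_before_overflow` is made up for illustration.

```python
ARRAY_START = 18  # assumed mailbox of "array" when the program starts at mailbox 0


def stores_before_overflow(array_start=ARRAY_START):
    """Count how many values the self-modifying STA can store before it
    stops being a 3xx instruction, and return that count together with
    the instruction it has turned into."""
    instruction = 300 + array_start      # "dynamic STA array"
    stores = 0
    while instruction // 100 == 3:       # still a valid STA
        stores += 1                      # the store into the current slot
        instruction += 1                 # LDA dynamic / ADD one / STA dynamic
    return stores, instruction


print(stores_before_overflow())  # (82, 400)
```

Under that assumption, 82 values fit (mailboxes 18 through 99). The next time the "dynamic" line is reached, it holds 400, which is where an emulator that aborts on opcode 4 will stop.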

score 0 · Answer 2 · answered Mar 03 '17 at 14:18

https://web.archive.org/web/20131211112403/http://www.acs.ilstu.edu/faculty/javila/lmc/

This lists a totally different instruction set (e.g. input/output is 500/600).

Mark Jeronimus

This lists 4 for this particular emulator. But many other emulators don't use 4 at all. – sleepy_gamer Mar 03 '17 at 16:20
