5

I am looking to create a minimal, computationally universal subset of alphanumeric x86 opcodes. Eventually I want the subset to contain as few instructions as possible, and if there are multiple minimal subsets I want to know that as well. The subset should be able to simulate any program that could be written with the entire set of alphanumeric instructions. Instructions should only cover the instructions that correspond to the characters "A-Z", "a-z", and "0-9".

So far I think that a push, pop, inc, dec, cmp, and je would suffice, but I'm sure there is a smaller set. How could I go about proving that a set I generate is able to simulate any program using all of the alphanumeric instructions? How could I prove that such a set is minimal? Does anyone know if such an instruction subset exists?

cytinus
  • 5,467
  • 8
  • 36
  • 47
  • 3
    You can surely drop either `inc` or `dec` from the list, you don't have to have both. :) – Alexey Frunze May 23 '12 at 10:41
  • 2
    Can't `inc` and `dec` be replaced by one negative-number-accepting `add`? – Nyerguds Jul 04 '12 at 17:26
  • 1
    Like Alexey said, only one of `inc` or `dec` is necessary, because eventually overflow will occur. – cytinus Jul 05 '12 at 00:03
  • It depends how inefficient you want to allow; e.g. [Why is mov turing complete?](https://stackoverflow.com/q/61048788). If you want to actually emulate other x86 instructions so you can work with similar values in registers (instead of having to break things down to maybe single bits), you'll need more than just mov. At least one boolean (maybe BMI1 `andn`). You could emulate add with a loop that generates and propagates carry until you've handled all the carries. – Peter Cordes Feb 09 '21 at 11:26
  • IDK what you mean with the A-Za-z0-9 stuff. That character set covers every x86 instruction, including stuff like AVX [`vextractf128`](https://www.felixcloutier.com/x86/vextractf128:vextractf32x4:vextractf64x2:vextractf32x8:vextractf64x4), and [`xchg`](https://www.felixcloutier.com/x86/xchg) which has an implicit `lock` prefix making it an atomic RMW. (Unlike `cmpxchg8b` and other instructions that accept `lock`.) And [`vfmadd132ps`](https://www.felixcloutier.com/x86/vfmadd132ps:vfmadd213ps:vfmadd231ps) FP FMA. Did you meant you're disallowing space in the mnemonic, i.e. no prefixes? – Peter Cordes Feb 09 '21 at 11:29

2 Answers2

1

I’m not sure I get your question, especially the part about “alphanumeric”, but I’d like to point out that it is well known that both mov and xor are Turing complete.

kirelagin
  • 13,248
  • 2
  • 42
  • 57
0

It's just ONE instruction! here the proof

http://en.wikipedia.org/wiki/One_instruction_set_computer

Why? Just because "instruction" is a machine-dependent concept. You cannot define a small set of instructions just because there are no such universal/absolute/atomic instructions: it all depends on the underlying hardware: In fact the "real" turing machine is a methematical concept (a set of rules) not a physical machine

Gianluca Ghettini
  • 11,129
  • 19
  • 93
  • 159
  • This has nothing to do with my question, which is very specific. – cytinus Nov 08 '12 at 18:17
  • but you can take the instructions there and split into x86 instructions. For example SBNZ can be achieved with `sub` and `jne`, `SUBNEG` can be emulated with `sub` and `js`... – phuclv Apr 22 '15 at 04:37