
I've been thinking about how machine code is specific to an architecture, and how JavaScript works in (nearly) every browser. I've been working on a JavaScript-based project that has to do some serious calculations, and it takes a full minute to finish them. It makes me long for the speed of C. But the whole reason the project is in JavaScript is simplicity and portability.

Which gave me the idea: what if there were a language similar to JavaScript that was just as portable and ran as an executable file on every architecture? Most people would point to Java, but I'm thinking of something with less overhead, handled by the OS. Not byte code, but native machine code.

I did a bit of researching and thinking, and came up against the apparent impossibility of the task: how would you make an executable file as small as a normal C application compiled for a specific architecture, that works on every architecture, with the same speed as if it were natively compiled in C for that architecture?

Which brings me to my next idea. Native machine code is specific to an architecture; each architecture has certain special features, sometimes handles the same task differently, and has its own optimizations. What if there were a Universal Machine Code? When the OS loads the instructions into RAM, it automagically converts them to suit the architecture. Or perhaps (crazier idea) the CPU itself could accept the universal machine code and automatically adapt it into its native machine code.

The Universal Machine Code specification would have to be generic enough to cover the normal Machine Code functions.

Of course, if the Universal Machine Code did work, people would probably want a universal executable format that's handled by all OSes, so that the executable need not change across OSes. Which leads to frameworks that would need to be made universal across machines. And more nitty-gritty still: OS-specific features, and input and output, which goes beyond what I know.

Universal Machine Code compiled executable:

Pros:

  • Nearly the same size as a natively compiled executable
  • Nearly, if not exactly, the performance of a natively compiled executable

Cons:

  • Slightly (hopefully negligibly) slower to load, as the universal machine code is converted to native machine code when the executable is loaded into RAM

Is it feasible?

Edit:

I have used Java; I made a game in it. It's not as universal as I'd like, nor as friendly. It is its own programming language, maintained by Oracle: proprietary and a bit too massive, and it requires installation on some machines.

And to be more specific, I'm not talking about a new programming language. I'm talking about a new machine code language that holds enough extra information that, when executed, a very thin process translates it to the architecture's machine code. That way C compilers could just compile their executables into the universal machine code, and the executables could run everywhere.

user2030360
  • "not byte code but native machine code" - but you've already recognised that different machines have *different* machine code. So, at the least, you're talking about some level of translation occurring. – Damien_The_Unbeliever Jan 31 '13 at 20:19
  • And how do you see "Slightly(hopefully non-existantly) slower in loading as it converts the universal machine code to native machine code when loading the executable in ram" as, qualitatively, different to "load (Java?) byte code, compile to machine code" – Damien_The_Unbeliever Jan 31 '13 at 20:21
  • C compilers do this already, correct? There are compilers for each respective OS, correct? – King Friday Jan 31 '13 at 20:23
  • This is what the Java Virtual Machine was intended to be. It didn't live up to the ideals. – Barmar Jan 31 '13 at 20:24
  • Oh, this is the question that keeps on giving (me more issues to raise). Different OSes provide different facilities to their user-mode programs. What may be "built-in" to one OS may be something only available in a library on another. How do you expect to deal with that? – Damien_The_Unbeliever Jan 31 '13 at 20:28
  • The first step is the thin translation layer. On program execution, the program would be put into RAM as native machine code, translated from universal. One translation, all it takes. That first step would allow compilers to be made that support the universal machine code. Then comes the second step: without the worry of supporting architectures, programs written on an Intel Mac could theoretically work on a PowerPC Mac and vice versa. First benefit. Then people would look into making them cross-platform, and that would require another specification: a new executable format. – user2030360 Jan 31 '13 at 21:52
  • Have you tried [LLVM](http://llvm.org/)? – nrz Jan 31 '13 at 22:06

3 Answers

6

There's already something like that; it's called p-code:

http://en.wikipedia.org/wiki/P-code_machine

These days, VMs basically act in that role.

Keith Nicholas
  • Not quite what I was asking; I'm thinking even lower. It would require no interpreter, but rather translate the universal machine code to native machine code as it loads the instructions into RAM on application execution. The idea is to have it running at the same speed as natively compiled machine code that doesn't need translating. – user2030360 Jan 31 '13 at 21:42
  • You can do that with p-code too, and that's pretty much what VMs do; however, VMs often allow capabilities where they can't do straight conversion to machine code right away (things like reflection, dynamic code). – Keith Nicholas Jan 31 '13 at 21:49
  • Virtual machines seem a bit extreme. I'm thinking of a very thin layer to translate from one machine code type to another. Also, from my understanding, virtual machines are much slower at running the program. – user2030360 Jan 31 '13 at 22:01
  • most VMs use JIT, which can give native speed performance. see http://stackoverflow.com/questions/145110/c-performance-vs-java-c – Keith Nicholas Jan 31 '13 at 22:05
  • That's useful knowledge. Now if only someone would make a VM for a lower-level language - like assembly - that could achieve the same performance as native code. That could pretty much solve all compatibility problems, if compilers for C/C++/Haskell/etc. could compile their programs to that assembly VM. Runs everywhere, native speed, tiny file size. – user2030360 Jan 31 '13 at 22:11
  • the problem is, you can't get the very best optimization for the CPU platform that way. A C compiler will make CPU-based decisions on how to best use the resources available; if it's abstracted away from that, it can't do so anymore. CPUs are complicated beasts these days: optimization is not just about the machine code, but about taking pipelining and caching into account. – Keith Nicholas Jan 31 '13 at 22:18
  • not to mention issues of CPU platforms that have conflicting ideas on things, like endianness – Keith Nicholas Jan 31 '13 at 22:20
4

Lambda calculus and/or Turing machines. Everything else is just syntactic sugar.

High Performance Mark
3

You won't get chip vendors to agree on a universal machine code. And one vendor cannot take over the world with a single new machine code, otherwise they already would have (ARM almost has, but in the end it won't).

With today's patent litigation you can't even make a VLIW processor that mimics various instruction sets in a thin software layer, and you couldn't take over the world with that anyway. Can you say Transmeta?

So you are stuck with an interpreter... in software. Which takes you back to the "already tried that" of Java, the p-code of decades past (Pascal), whatever Python is doing, etc. Even LLVM's bytecode, for that matter.

Your requirement of "same speed on every machine" won't happen.

You are already using the most portable language there is in JavaScript, and it is as fast as it is. C is the universal language you are after: it is fairly fast and runs everywhere (on more platforms than any other language). The problem is the operating systems, and even more so the user interfaces.

old_timer
  • Agreed that it's implausible vendors would agree on a specification, but couldn't there be a way of having a form of universal machine code (software-translated on load) be as fast as natively compiled? Suppose there were a C compiler that output native machine code and an identical C compiler that output universal machine code; on application execution, the instructions would be loaded into RAM and (the universal machine code) converted to the native architecture. At runtime, wouldn't both have the same performance? – user2030360 Jan 31 '13 at 21:40
  • dynamic binary translation has been around for a while and can work; it won't be as fast as natively compiled, but it's faster than interpreting and faster than simulating another instruction set. I remember a demo of x86 Windows programs running on an Alpha Windows machine, translating a little each run until eventually the whole thing was translated - at least that is how they marketed it. – old_timer Feb 01 '13 at 03:42
  • essentially what you are asking for is what LLVM does: just-in-time compiling. The bytecode is generic, but source-to-bytecode can/does vary based on the ultimate target. You could probably pick a generic enough target to have good-enough, portable bytecode that works pretty well everywhere. – old_timer Feb 01 '13 at 03:44
  • remember that the machine code is only part of the problem; the operating system calls and the peripherals also play a role. You would have to make a generic system layer with a HAL or a VM to go with the generic instruction set (here again, see Java - been there, done that, it didn't take over). – old_timer Feb 01 '13 at 03:47
  • That sounds pretty cool. I can only hope things will change to universal when quantum computing hits. – user2030360 Feb 01 '13 at 04:31