How is JavaScript code transformed into Machine Code? Or why is it not?

Question

I'm trying to understand how a piece of JavaScript code is executed. So far I have most of the picture worked out, but there are a few gaps I'd like to fill.

I know that a computer's CPU only understands 0's and 1's. So eventually, any code we write (in a high-level language) gets transformed into 0's and 1's and is then executed by the CPU. In the case of JavaScript, the main character that makes this journey possible is the JavaScript engine. So that's what I looked into, and I picked Chrome's V8 to help me picture the whole thing.

The JavaScript engine first parses the code and generates an AST (Abstract Syntax Tree). The AST is then transformed into bytecode by Ignition, which is V8's bytecode generator and also its bytecode interpreter. Next comes the step where the code is actually executed, and here's where I have trouble understanding what's going on. I've found out that Ignition itself "executes the bytecodes", while an optimizing compiler, TurboFan in this case, speeds up execution by compiling frequently repeated code and handing back optimized machine code.

I thought that executing the bytecodes means converting them to machine code which the CPU will then run, but that's not the case. Since TurboFan is only an optimizing compiler, I wondered what it is that converts the bytecodes to machine code. I then found that V8 doesn't compile all functions to machine code, only those that run hot enough for optimized compilation to (likely) be worth the time investment.

So, what does it mean for bytecode to be "executed"? The CPU doesn't understand bytecode and the bytecode is not transformed into machine code either. Can anyone explain in simple terms what's going on?

"*Any code we write [… gets transformed …] and is then executed by the CPU*" - no. Code is executed by an interpreter for the respective language. A CPU can interpret machine code, but there are many other ways to interpret code, usually by simulating the execution in a more or less abstract fashion. — Bergi, Aug 28 '22 at 23:23
So the idea that I should take is that a CPU does only understand machine code, but not all the code that we write has to be run by the CPU so it doesn't necessarily need to be converted into machine code? — luckyy13, Aug 28 '22 at 23:49
Ah ok. So things aren't as straightforward as some people say. Thank you. — luckyy13, Aug 28 '22 at 23:54
@bergi so does the JavaScript interpreter receive the bytecode and call compiled C libraries in its engine? How can code that is not run on the CPU give instructions to the underlying hardware? Thanks — Kevin Greetham, Feb 02 '23 at 23:09
@KevinGreetham A JavaScript interpreter receives JavaScript, not bytecode. And the JavaScript does not give instructions to the hardware - it gives instructions for JavaScript values, which might be part of the interpreter environment. The interpreter engine may decide to call compiled c libraries to give instructions to external hardware when a certain function exposed to the javascript is evaluated. But the CPU first and foremost runs the machine code of the interpreter itself. — Bergi, Feb 03 '23 at 00:08
@bergi thanks. It’s confusing because everything online seems to state that the JavaScript interpreter turns the source code into machine code like it’s magic and then passes it to the CPU to come up with this magic functionality, without the JavaScript engine having to do anything else. There is never a mention that the JavaScript interpreter reads the source code and performs operations on it, which could consist of invoking compiled internal functions originally written in C that make system calls. Consequently I just always seem to be battling between the is this simple 1/2 — Kevin Greetham, Feb 03 '23 at 03:29
@Bergi Is this a simple process that people seem to think it is or is it far more complex. So therefore I often change my mind about how the whole thing works and barely know if I’m ever right. I mean did what I just say even make sense!! My god I wish I was content with just writing JavaScript/python and just accepting it works but I just can’t — Kevin Greetham, Feb 03 '23 at 03:32
@KevinGreetham I've never met anyone who thinks it's simple, a modern JS engine with an optimising compiler is indeed a most complex beast that is hard to summarise in the 500 chars of a comment (and actively being worked on, so ever-changing). JS code is definitely parsed and interpreted expression by expression, only those bits that run often enough to be optimised are compiled to machine code. But for IO, even the optimised code needs to call into the compiled library functions made available to it, and it's far from "*[not] having [the JS engine] do anything else*". — Bergi, Feb 03 '23 at 09:56
@bergi oh I have but it’s mainly through ignorance. It’s always described as “ the Js engine looks at your code and creates machine code out of it.” Many people have wrongly described it to me that the interpreter “ looks at your code, parses it and creates an AST and certain characters on the AST correspond to different binary instructions”. Simple explanations us novices see online often think this is how it works. It’s only when you go that extra step that’ll you see things like how an interpreter uses compiled c functions which make system calls/ learn about device drivers and how C 1/2 — Kevin Greetham, Feb 03 '23 at 14:26
2/2 has access to these routines through an API and this is part of the journey to “produce that machine code”. I’ve worked with 10yr+ developers who literally think the JS code that is written is just magically interpreted to the exact instructions that the CPU needs, and when I mention compiled C functions I hear “C and JS are totally different”. To which I reply “well, the lines are blurred because C made the JS engine as part of the browser's implementation and this comes with built-in compiled C functions”. Only to get told 2/3 — Kevin Greetham, Feb 03 '23 at 14:31
@bergi “ only to get told “ your thinking too much into it.” People seem to want to accept this as magic. I wish I could be one of them! — Kevin Greetham, Feb 03 '23 at 14:33
@KevinGreetham "*It’s always described as “ the Js engine looks at your code and creates machine code out of it.”*" - I've never seen that (at least in this simplification level). Sure, follow jmrk's answers and read https://v8.dev/blog to understand in which cases, when, why, and how machine code is generated, but the primary (and really only) purpose of a JS engines is to execute/interpret JS code, totally ignoring *how* it does that. — Bergi, Feb 03 '23 at 19:25
@KevinGreetham But I would not say "*the lines [between C and JS] are blurred*" - they *are* totally different languages. But nonetheless, since both are Turing-complete, you can implement a JS interpreter in C, and a C interpreter in JS. — Bergi, Feb 03 '23 at 19:27
@bergi do most JavaScript interpreters access ISA documentation to provide the equivalent machine code like maybe A c compiler would or do they call pre compiled engine internals and libraries? I can’t seem to find the answer anywhere !? — Kevin Greetham, Feb 07 '23 at 22:09
@KevinGreetham Both, and more. Have a skim over https://wingolog.org/archives/2013/04/18/inside-full-codegen-v8s-baseline-compiler https://v8.dev/docs/turbofan https://stackoverflow.com/q/277423/1048572 https://v8.dev/docs/wasm-compilation-pipeline It's ever-changing! — Bergi, Feb 07 '23 at 23:12
@bergi one last question, I promise. If a function is stored in memory in a compiled language, does the CPU fetch it from memory and execute it once the function gets called, with the internals of these functions consisting of micro-operations to be performed? But in JavaScript our functions that are stored in memory don't consist of these operations, just encodings that provide a way to store them in bits. So they need the JavaScript interpreter to execute what is effectively still the source code. So how can we create an environment in which these are run by the interpreter, which I assume 1/2 — Kevin Greetham, Feb 07 '23 at 23:43
@bergi is just a component that exists as part of the browser's implementation. I saw your comment in Aug '22 where you said you need to "simulate the execution in an abstract way". Is that part of the interpreter's functionality as a whole, that they have achieved this environment to execute code by simulating a CPU? If so, how is that possible? Thanks — Kevin Greetham, Feb 07 '23 at 23:45

score 3 · Answer 1 · answered Aug 28 '22 at 22:07

An interpreter is a program that executes another program; that doesn't require translating that other program to machine code first.
The bytecode interpreter in V8 consists of machine instructions itself, so the CPU executes the interpreter.

To illustrate, imagine we wanted to implement our own programming language. To keep it simple, suppose this language's purpose was to execute arithmetic instructions written in plain English; and we're writing an interpreter for it in JavaScript.
A valid program in our language would be "three plus two". A first version of an interpreter for it might be something like:

function interpret(program) {
  let instructions = program.split(" ");
  let current = 0;
  function LiteralValue(inst) {
    switch (inst) {
      case "one": return 1;
      case "two": return 2;
      case "three": return 3;
      // TODO: add other numbers
    }
  }
  for (let i = 0; i < instructions.length; i++) {
    switch (instructions[i]) {
      case "one":
      case "two":
      case "three":
      // TODO: add other numbers
        current = LiteralValue(instructions[i]);
        break;
      
      case "plus":
        current = current + LiteralValue(instructions[i+1]);
        i++;  // We've just consumed the next instruction.
        break;

      // TODO: add support for "minus" etc.
    }
  }
  return current;
}

This isn't a very good interpreter, but it demonstrates the principle of executing a program by interpreting it: an interpreter "looks at the program", sees what the program wants to get done, and does that. It doesn't convert the program to machine code first; it sees "plus" and executes +.

One could say that "plus" is one of our "bytecodes" (so dead simple that it's actually just the same as the keyword), and the snippet current = current + ... is its "bytecode handler".
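
To make the "bytecode" and "bytecode handler" idea a bit more concrete, here is a minimal sketch that splits the toy interpreter into two steps: translating the English program into a flat array of numeric instructions, and a dispatch loop that runs one handler per instruction. The opcode names (OP_LOAD, OP_ADD), the helpers toBytecode and run, and the encoding are all made up for illustration; this is not V8's bytecode format, only the same general shape.

```javascript
// Hypothetical opcodes for the toy language (illustrative, not V8's).
const OP_LOAD = 0; // operand: a number; replaces the accumulator
const OP_ADD = 1;  // operand: a number; adds it to the accumulator

const WORDS = new Map([["one", 1], ["two", 2], ["three", 3]]);

// Step 1: translate the English program into a flat array of
// [opcode, operand, opcode, operand, ...]. Still not machine code.
function toBytecode(program) {
  const words = program.split(" ");
  const bytecode = [];
  for (let i = 0; i < words.length; i++) {
    if (words[i] === "plus") {
      i++; // the word after "plus" is its operand
      if (!WORDS.has(words[i])) throw new Error("expected a number after plus");
      bytecode.push(OP_ADD, WORDS.get(words[i]));
    } else if (WORDS.has(words[i])) {
      bytecode.push(OP_LOAD, WORDS.get(words[i]));
    } else {
      throw new Error("unknown word: " + words[i]);
    }
  }
  return bytecode;
}

// Step 2: one handler per opcode, indexed by the opcode's number.
const handlers = [
  (acc, operand) => operand,       // handler for OP_LOAD
  (acc, operand) => acc + operand, // handler for OP_ADD
];

// The interpreter loop: fetch an opcode, look up its handler, run it.
function run(bytecode) {
  let acc = 0;
  for (let pc = 0; pc < bytecode.length; pc += 2) {
    acc = handlers[bytecode[pc]](acc, bytecode[pc + 1]);
  }
  return acc;
}

console.log(toBytecode("three plus two")); // [ 0, 3, 1, 2 ]
console.log(run(toBytecode("three plus two"))); // 5
```

The array [0, 3, 1, 2] is never handed to the CPU as instructions. The CPU only ever runs the code of run and the handlers, which read the array as data.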

Since we used JavaScript for this example, which itself is executed by an interpreter (at least before optimization kicks in), we even get three levels of stacking here: "three plus two" is a program in our custom language that's executed by another program (the function interpret(...)) that's executed by another program (the JS engine in your browser) that's finally executed by the CPU.
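
The "before optimization kicks in" part can be imitated in the toy setting too. The following is only a rough analogy, under loud assumptions: it counts calls, and once a program has run HOT_THRESHOLD times (an arbitrary number chosen here) it translates the program once into the language of the layer below, which for our toy is JavaScript source rather than machine code. The names makeRunner, compile, and HOT_THRESHOLD are hypothetical, and a real engine's heuristics involve far more than a call counter.

```javascript
const HOT_THRESHOLD = 3; // arbitrary cutoff, chosen for this sketch only

const VALUES = new Map([["one", 1], ["two", 2], ["three", 3]]);

// Slow tier: re-reads the words of the program on every single run.
function interpret(program) {
  const words = program.split(" ");
  let current = 0;
  for (let i = 0; i < words.length; i++) {
    if (words[i] === "plus") {
      current += VALUES.get(words[++i]);
    } else {
      current = VALUES.get(words[i]);
    }
  }
  return current;
}

// Fast tier: translate the program ONCE into code for the layer below.
// For the toy language that layer is JavaScript, so we emit JS source;
// a real engine would emit machine code at this point instead.
function compile(program) {
  let expr = "";
  for (const word of program.split(" ")) {
    expr += word === "plus" ? " + " : String(VALUES.get(word));
  }
  return new Function("return " + expr + ";");
}

// Wraps a program so that it starts out interpreted and switches to the
// compiled version only after it has proven to be "hot".
function makeRunner(program) {
  let calls = 0;
  let compiled = null;
  return {
    run() {
      if (compiled !== null) return compiled();
      calls++;
      if (calls >= HOT_THRESHOLD) compiled = compile(program);
      return interpret(program);
    },
    isCompiled: () => compiled !== null,
  };
}

const runner = makeRunner("three plus two");
for (let i = 0; i < 5; i++) {
  console.log(runner.run(), runner.isCompiled());
}
```

A program that runs only once or twice never pays the cost of compile at all, which is the trade-off described in the question: compilation is only worth the time for code that runs often enough.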

answered Aug 28 '22 at 22:07

jmrk

"*finally executed by the CPU*" - you've forgotten a few levels of virtualisation by the OS/hypervisor :-) – Bergi Aug 28 '22 at 23:25
I kind of get the idea. But "It doesn't convert the program to machine code **first**" would mean that initially, the interpreter only converts the code to instructions and later those instructions are passed on as machine code to the CPU ? Or that the CPU can understand and run those instructions ? – luckyy13 Aug 29 '22 at 00:14
@luckyy13 as you can see in the example, nothing gets converted to machine code ever ("first" meaning "before executing" -- of course it doesn't convert to machine code *after* executing either). The CPU executes the JS interpreter, the JS interpreter executes the program, the program in this case is an interpreter itself that executes our `"three plus two"` program. It's a layering concept. Bytecode is neither translated to machine code nor understood/executed by the CPU. – jmrk Aug 29 '22 at 09:55
@Bergi: I wanted to keep it simple. Yes, you can add another layer of virtualization by running the whole stack in a VM, in which case the virtual guest CPU would in fact be an interpreter (at least conceptually...) running on the host CPU. (The OS doesn't generally virtualize machine instructions, aside from corner cases like Rosetta, if you want to count that as part of the OS. You could talk about microcode in CISC CPUs forming a VM of sorts, but I'd be surprised if that reduced confusion for a novice.) – jmrk Aug 29 '22 at 10:01
@jmrk I thought the JavaScript engine Although labelled an engine is just a component of the browser which will include a set of functions. Part of this set are functions which work together to interpret the JavaScript code. If that’s the case they are stored on the ram and when invoked are indeed executed on the cpu like any invoked function on the ram does? Is this incorrect ? – Kevin Greetham Feb 07 '23 at 22:54
@KevinGreetham it's not clear to me what you're asking. Being an engine and being a component of the browser isn't a contradiction. And yes, when you start a program, it's loaded into memory and executed on the CPU, that applies to JS engines just like to any other program. – jmrk Feb 08 '23 at 09:44
Thanks for reply. I’m not saying it’s a contradiction. You said the cpu doesn’t execute JavaScript code but the engine does. But the engine is a host of functions on the ram right rather than a standalone piece of software. I thought the cpu executes functions that are invoked by fetching them from the Ram, but in this case the function that holds an encoded version of the source code in the function rather then machine code the can operate on the cpu. So how then does the interpreter ( a host of functions) execute the code straight from the ram without the cpu executing it ? – Kevin Greetham Feb 08 '23 at 17:34
@KevinGreetham: the CPU executes the interpreter, the interpreter executes the JavaScript. See my answer for an example how that works one level higher, where the JS is itself another interpreter for yet another language, which can only be executed by the interpreter written for it, because no other software or hardware component would know what to do with it. – jmrk Feb 09 '23 at 15:50
