5

I know both clang (using --target=wasm32) and emscripten can compile C code into WebAssembly, but how are they different?

It looks like they both use LLVM as a backend. Actually, I do not even quite understand the relation between LLVM and clang...

I have been reading about WebAssembly for a while, but I lack a low-level understanding of it. Thank you so much for your time!!

yeehaw

1 Answer

12

clang is a compiler built on LLVM technology, so you often hear clang and LLVM used interchangeably. clang is just one component of the LLVM project.

emscripten is a compiler that uses clang for most of the heavy lifting of actually compiling to WebAssembly, but it also adds a lot of features and functionality on top of that, mostly related to seamless integration with JavaScript and the web, and emulation of POSIX and other standards.

emscripten runs clang internally with --target=wasm32-unknown-emscripten which has some very minor differences to the regular --target=wasm32.

If you run emscripten with -v it will print the full clang command line it uses under the hood.
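To make the difference concrete, here is a minimal sketch of one C file built both ways. The file name `add.c`, the function `add`, and the exact linker and export flags are illustrative assumptions rather than anything from the answer above; they depend on your toolchain version (the plain clang route also assumes `wasm-ld` is installed), so check them against your local docs.

```c
/* add.c: a tiny translation unit with no libc dependency (hypothetical example). */

/*
 * Possible builds (flags are assumptions; verify against your toolchain):
 *
 *   Plain clang: produces a freestanding add.wasm and no JavaScript glue.
 *     clang --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--export=add -o add.wasm add.c
 *
 *   Emscripten: produces add.wasm plus an add.js loader that sets up the
 *   runtime and instantiates the module for you.
 *     emcc add.c -o add.js -sEXPORTED_FUNCTIONS=_add
 *
 *   Adding -v to the emcc command should show the underlying clang invocation.
 */
int add(int a, int b) {
    return a + b;
}
```

With the plain clang build you have to load `add.wasm` and supply any imports yourself; with emscripten the generated JavaScript does that work.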

sbc100
  • Thank you so much! If I understand you correctly, Emscripten can generate glue code so that we can work with WASM easily. May I keep asking: in the glue code, we have JS APIs such as WebAssembly.instantiate to compile wasm to machine code while the JS glue code is being interpreted. But some other runtimes, such as wasmtime and Lucet, can run standalone WASM. Is there still glue code in those runtimes, just hidden from us behind a simple CLI? If yes, what language (C/Rust, I guess) is the glue code in wasmtime and Lucet written in? – yeehaw Nov 05 '20 at 18:07
  • Runtimes like wasmtime and Lucet don't have glue code per se; they just expose the platform capabilities of WASI and maybe a few other APIs. They are far more limited today in what you can do, e.g. there is no way you could take something like a video game and run it on wasmtime, whereas emscripten is designed with that in mind. – sbc100 Nov 06 '20 at 19:05
  • The system call APIs in wasmtime are implemented in Rust. – sbc100 Nov 09 '20 at 03:40
  • Thank you!! I am now very confused. It looks like wasm is generated from LLVM IR, and using Emscripten or clang we can compile that IR into wasm. But what do we use to generate machine code from wasm? I also heard that LLVM is used for generating machine code from WASM (for example, does the JS API WebAssembly.instantiate() use clang for compiling wasm into machine code?). There is another thing called Cranelift; do you know what it is used for? Generating machine code from wasm, or generating wasm from a high-level language? Sorry for the mess, maybe I should open a new question. – yeehaw Nov 09 '20 at 21:44
  • The WebAssembly runtime is responsible for compiling the WebAssembly bytecode to machine code (or interpreting it). On the web this is done by V8 or SpiderMonkey or whatever other engine is in your browser. Off the web this is done by runtimes such as wasmtime (which uses Cranelift) or wasmer. – sbc100 Nov 10 '20 at 23:04
  • Thank you! May I recheck if my understanding is correct: (1) clang is the front end that compiles code into LLVM IR, while the LLVM backend is responsible for generating machine code from IR, but now we often use clang to refer to the whole compiler (both front and back end). (2) Going from wasm to machine code can be done via V8 in a browser environment, and you mentioned that wasmtime uses Cranelift to compile wasm into machine code, but I wonder if this is the default option in wasmtime? Can compiling wasm to machine code be done with LLVM? If yes, does wasmtime or any other wasm runtime support it? – yeehaw Nov 17 '20 at 18:24
  • Yes, your understanding sounds correct. Regarding (2), there are any number of ways to go from wasm to machine code. The most commonly used ones today are the JIT engines embedded in V8 and other JavaScript engines. There is also a tool for converting wasm to machine code using LLVM (called wavm). There are also a lot of interpreters out there too. – sbc100 Nov 18 '20 at 19:06
  • There is another thing in emscripten called binaryen that I wonder if you have heard of. It looks like a compiler that compiles C code into wasm. If so, does emscripten provide two choices, clang and binaryen, that both aim to generate wasm? – yeehaw Nov 20 '20 at 03:23
  • No, binaryen is used to optimize the WebAssembly. It can't compile C or C++. It acts as a post-link tool to shrink the output of LLVM. – sbc100 Nov 21 '20 at 04:28
  • I have asked too much, sorry (and this question may be too big to ask): why do you think that, when several WebAssembly engines such as V8 (liftoff/turbofan) and LLVM already existed, people still wanted to develop Cranelift? Does Cranelift work better than V8 and LLVM in going from wasm to machine code? – yeehaw Nov 23 '20 at 05:35
  • I imagine there are many answers to that question. Maybe ask on the Cranelift repository, or read some of the Cranelift docs. Two reasons I can think of: (1) Cranelift is written in Rust, which gives certain assurances regarding VM bugs; (2) not all users want JS integration, and V8 is historically a JS engine and potentially has baggage associated with that. – sbc100 Nov 24 '20 at 17:39
  • Sir, could you have a look at this question: I don't quite understand why wasm is secure. Thanks a lot! https://stackoverflow.com/questions/64763007/why-is-webassembly-safe-and-what-is-linear-memory-model – yeehaw Dec 11 '20 at 22:46