I have an application that requires pluggable modules to implement arbitrary functions from one array of bytes to another. Some of these functions could be computationally intensive. One example might involve summing many large numbers - each of which, in JavaScript, would require a BigInt... I can imagine an implementation in assembly language making extensive use of ADC (add with carry).
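For concreteness, this is the kind of carry emulation I have in mind when I say ADC: a 128-bit add built from two i64 limbs, where the carry bit is recovered with an unsigned comparison because WASM exposes no carry flag. (The function and parameter names are purely illustrative, and the two-value return relies on the multi-value feature from WASM 2.0.)

```wat
(module
  ;; 128-bit add: (a_hi:a_lo) + (b_hi:b_lo) -> (lo, hi)
  (func $add128 (param $a_lo i64) (param $a_hi i64)
                (param $b_lo i64) (param $b_hi i64)
                (result i64 i64)
    (local $lo i64)
    (local $carry i64)
    ;; low limb: wrapping add
    (local.set $lo (i64.add (local.get $a_lo) (local.get $b_lo)))
    ;; no carry flag, so recover it: carry = (lo < a_lo), unsigned
    (local.set $carry
      (i64.extend_i32_u (i64.lt_u (local.get $lo) (local.get $a_lo))))
    ;; high limb: add both operands plus the recovered carry
    (local.get $lo)
    (i64.add
      (i64.add (local.get $a_hi) (local.get $b_hi))
      (local.get $carry))))
```

Every 64-bit limb therefore costs an extra compare and extend on top of the add - exactly the overhead a native ADC would avoid.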
I found this post and comments from 2017 - much of which seems to echo what I'm thinking today. I am not aware of any progress towards supporting carry/overflow flags in WASM (or WASM2). Please correct me if I've overlooked something.
I'm aware that WASM is usually generated by compiling higher-level languages... like Rust or C... but I envision hand-coding WebAssembly Text (WAT) to implement my plugins - so I'm not especially interested in the optimisations that LLVM etc. perform when compiling to WASM.

I would like to discover a model for the computational cost of executing each WASM instruction... in order to guide the design of maximally efficient WASM modules - each of which will implement a relatively simple, but perhaps computationally expensive, algorithm. I'm aware that there are complications arising from the distinctions between interpretation, JIT and AOT compilation - and I know that different target hardware will have different characteristics in each scenario. Despite this, I feel it would be extremely useful to have at least an estimate of the relative execution cost of different WebAssembly code fragments. For example, what are the likely relative costs of i32.add and v128.add; is it cheaper to multiply by 2 or to shift left by 1... etc.
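To make that comparison concrete, these are the sorts of equivalent fragments I would like cost estimates for (the function names are purely illustrative):

```wat
(module
  ;; multiply by 2
  (func $double_mul (param $x i32) (result i32)
    (i32.mul (local.get $x) (i32.const 2)))
  ;; shift left by 1
  (func $double_shl (param $x i32) (result i32)
    (i32.shl (local.get $x) (i32.const 1))))
```

I would guess a JIT strength-reduces the multiply to a shift anyway, but that is the kind of thing I would like to be able to confirm rather than assume.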
Is solid information currently available? Are there efforts to implement benchmarks with the aim of providing helpful execution cost estimates for a range of target hardware? Are there any tools that can produce execution cost estimates when given a WASM code fragment?