I'm trying to understand the bytecode emitted by solc. My source program is the following; the assignments dummy = 0x1234567890 and z = 0xdeadbeef are to make it easier to recognize the start of the function in the generated bytecode.

pragma solidity ^0.8.4;
contract C {
  function f(uint x, uint y) public pure {
    uint dummy = 0x1234567890;
    uint z = 0xdeadbeef;
    if (x != 0) {
      z = x * y;
    }
  }
}

This is the bytecode I get using solc --bin --asm --opcodes --optimize tmp.sol -o .:

0000:    60  PUSH1 0x80
0002:    60  PUSH1 0x40
0004:    52  MSTORE
...
0056:    56  JUMP
0057:    5b  JUMPDEST
0058:    60  PUSH1 0x3e
005a:    56  JUMP
005b:    5b  JUMPDEST
005c:    00  STOP
005d:    5b  JUMPDEST             //  <-- start of function f() (?)
005e:    64  PUSH5 0x1234567890
0064:    63  PUSH4 0xdeadbeef
0069:    83  DUP4
006a:    15  ISZERO               // x != 0 check?
006b:    60  PUSH1 0x59
006d:    57  JUMPI        // <<< Q1: no instruction at 0x59???
006e:    60  PUSH1 0x56
0070:    83  DUP4
...
00c3:    5b  JUMPDEST
00c4:    50  POP
00c5:    02  MUL
00c6:    90  SWAP1
00c7:    56  JUMP
00c8:    fe  INVALID     // <<< Q2: why an invalid opcode?
00c9:    a2  LOG2
00ca:    64  PUSH5 0x6970667358
00d0:    22  INVALID     // <<< Q2: why an invalid opcode?
00d1:    12  SLT
00d2:    20  KECCAK256

My questions:

  1. The JUMPI instruction at offset 0x6d seems to have 0x59 as its target, but there is no instruction at offset 0x59. Are offsets not counted starting at 0? Or is the problem that I'm looking at creation bytecode rather than the deployed bytecode?
  2. The generated code contains bytes that don't decode to valid opcodes (INVALID). These show up in the output of solc --opcodes as well so it's not a disassembler error. What is the purpose of these bytes?
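To make the offset question concrete, here is a minimal sketch of how a linear sweep assigns offsets, assuming counting starts at 0 and that each PUSHn opcode is followed by n immediate bytes that are data rather than instructions. The function name `instruction_offsets` and the `base` parameter are made up for illustration; the byte strings are copied from the listing above.

```python
# Illustrative linear-sweep walk over EVM bytecode (not a full disassembler).
PUSH1, PUSH32 = 0x60, 0x7F

def instruction_offsets(code: bytes, base: int = 0):
    """Yield (offset, opcode, immediate) per instruction; `base` is a hypothetical start offset."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSHn (0x60..0x7f) carries n immediate bytes; every other opcode carries none.
        n = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        yield base + pc, op, code[pc + 1 : pc + 1 + n]
        pc += 1 + n

# First three instructions of the listing: PUSH1 0x80, PUSH1 0x40, MSTORE.
header = bytes.fromhex("6080604052")
header_listing = [(off, op, imm.hex()) for off, op, imm in instruction_offsets(header)]

# Bytes shown at 0x57..0x5c in the listing: JUMPDEST, PUSH1 0x3e, JUMP, JUMPDEST, STOP.
fragment = bytes.fromhex("5b603e565b00")
fragment_starts = [off for off, _, _ in instruction_offsets(fragment, base=0x57)]

print(header_listing)
print([hex(off) for off in fragment_starts])
```

In this numbering, 0x59 is the immediate byte of the PUSH1 at 0x58 rather than the start of an instruction. That would be consistent with the jump target being counted from a different starting point than the listing uses, but the sketch alone does not establish that.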

Thanks.

debray
    The last bytes of the EVM bytecode are metadata: https://docs.soliditylang.org/en/v0.8.14/metadata.html#encoding-of-the-metadata-hash-in-the-bytecode. The metadata can contain arbitrary bytes, which are misinterpreted if you decode them as opcodes. That is why you see some INVALID codes and random noise there. – jubnzv May 24 '22 at 18:00
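    As a rough illustration of the comment above, the following sketch separates the trailing metadata from the code. It assumes the layout described on the linked documentation page: the bytecode ends with a CBOR-encoded map, followed by a two-byte big-endian length of that map. The helper name `split_metadata` is hypothetical, and the sample bytes are constructed by hand (the 32 zero bytes are a placeholder, not a real hash).

```python
def split_metadata(code: bytes):
    """Return (code_part, cbor_metadata), assuming a 2-byte big-endian length trailer."""
    if len(code) < 2:
        return code, b""
    n = int.from_bytes(code[-2:], "big")
    if n + 2 > len(code):
        # The trailer does not describe a plausible metadata section.
        return code, b""
    return code[: -(n + 2)], code[-(n + 2) : -2]

# Hand-built sample: a few code bytes, an INVALID (0xfe) byte, then metadata.
cbor = (
    bytes.fromhex("a264697066735822")      # map(2), text "ipfs", byte string of length 34
    + bytes.fromhex("1220") + bytes(32)    # multihash prefix + placeholder digest
    + bytes.fromhex("64736f6c6343000804")  # text "solc", 3-byte compiler version
)
sample = bytes.fromhex("6080604052fe") + cbor + len(cbor).to_bytes(2, "big")

code_part, meta = split_metadata(sample)
print(code_part.hex())
print(meta[:8].hex())
```

    The first metadata bytes in the sample (a2 64 69 70 66 73 58 22 12 20) match the sequence that the disassembler in the question rendered as LOG2, PUSH5 0x6970667358, INVALID, SLT, KECCAK256. Running a disassembler only over `code_part` avoids decoding that section as instructions.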

0 Answers