I'm trying to understand the following inline assembly code, taken from https://elixir.bootlin.com/linux/v3.16.82/source/arch/x86/include/asm/checksum_32.h at line 114.

How does it work?

static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
                                        unsigned short len,
                                        unsigned short proto,
                                        __wsum sum)
{
    asm("addl %1, %0    ;\n"
        "adcl %2, %0    ;\n"
        "adcl %3, %0    ;\n"
        "adcl $0, %0    ;\n"
        : "=r" (sum)
        : "g" (daddr), "g"(saddr),
          "g" ((len + proto) << 8), "0" (sum));
    return sum;
}
Peter Cordes
HSan
  • What **exactly** is not clear in this function? `adcl` instruction? Parameters specification? – Tsyvarev Mar 16 '21 at 09:19
  • Interesting; are these functions actually hot? Like the x86-64 version of `ip_fast_csum` (https://elixir.bootlin.com/linux/v5.12-rc3/source/arch/x86/include/asm/checksum_64.h#L46) in current Linux 5.12-rc3 could be using qword `adcq` to do 8 bytes at a time, with only a bit more shift/add reduction at the end, especially if we can rule out a length of 4, or handle it outside the inline asm statement. (e.g. BMI2 rorx $32 / adcl would be great). But if these functions aren't normally hot, maybe not worth optimizing. – Peter Cordes Mar 16 '21 at 09:45
  • I'm also surprised that even older kernels like 3.16 have a `dec`/`jnz` loop around one adcl; that caused a partial-flag stall on Intel CPUs before Sandybridge. ([Problems with ADC/SBB and INC/DEC in tight loops on some CPUs](https://stackoverflow.com/q/32084204)). It's a good choice these days, so not something we should change now. – Peter Cordes Mar 16 '21 at 09:47
  • (For the record, I don't see any perf problem with this function, especially if the inputs are all naturally in separate registers / memory locations anyway, not two qword regs. But you linked the 32-bit version which couldn't take advantage anyway.) – Peter Cordes Mar 16 '21 at 09:51
  • Maybe the C version of the function in [lib/checksum.c](https://elixir.bootlin.com/linux/v3.16.82/source/lib/checksum.c) will help you understand what it is doing. – Ian Abbott Mar 16 '21 at 11:49
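
To make the comments above concrete, here is a rough C model of what the four instructions appear to compute. It is a sketch, not kernel code: the names `add_with_carry`, `csum_model_stepwise` and `csum_model_wide` are made up for illustration, and plain `uint32_t`/`uint16_t` stand in for the kernel's `__wsum` and `__be32` types. In the asm, `"=r" (sum)` makes `sum` output operand `%0` in a register, `"0" (sum)` puts the incoming `sum` in that same register, and the `"g"` inputs (`%1` to `%3`) may be a register, memory, or an immediate. `addl` adds `daddr` and sets the carry flag on overflow; each `adcl` adds its operand plus that carry; the final `adcl $0` folds the last carry back in (end-around carry, as in a ones' complement sum).

```c
#include <stdint.h>

/* Add b plus an incoming carry to a and report the carry out,
 * which is what x86 adcl does with the carry flag. */
static uint32_t add_with_carry(uint32_t a, uint32_t b, unsigned *carry)
{
    uint64_t wide = (uint64_t)a + b + *carry;
    *carry = (unsigned)(wide >> 32);
    return (uint32_t)wide;
}

/* Hypothetical model: one call per instruction of the asm statement. */
static uint32_t csum_model_stepwise(uint32_t saddr, uint32_t daddr,
                                    uint16_t len, uint16_t proto,
                                    uint32_t sum)
{
    unsigned carry = 0;  /* addl ignores any incoming carry */
    uint32_t lenproto = ((uint32_t)len + proto) << 8;

    sum = add_with_carry(sum, daddr, &carry);     /* addl %1, %0 */
    sum = add_with_carry(sum, saddr, &carry);     /* adcl %2, %0 */
    sum = add_with_carry(sum, lenproto, &carry);  /* adcl %3, %0 */
    sum = add_with_carry(sum, 0, &carry);         /* adcl $0, %0 */
    return sum;
}

/* Hypothetical model in the style of the generic C version:
 * accumulate in 64 bits, then fold the high half into the low half. */
static uint32_t csum_model_wide(uint32_t saddr, uint32_t daddr,
                                uint16_t len, uint16_t proto,
                                uint32_t sum)
{
    uint64_t s = sum;

    s += saddr;
    s += daddr;
    s += ((uint32_t)len + proto) << 8;
    s = (s & 0xffffffffu) + (s >> 32);  /* fold carries back in */
    s = (s & 0xffffffffu) + (s >> 32);  /* fold a possible second carry */
    return (uint32_t)s;
}
```

For example, with `saddr = 0xC0A80001`, `daddr = 0xC0A80002`, `len = 20`, `proto = 6` and `sum = 0`, both models give `0x81501A04`: the second addition overflows 32 bits, and the carry is added back in rather than dropped. As the name says, the result is not yet folded to 16 bits; that happens in a separate step.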
