2

Compiling 7-zip source code with MinGW GCC 4.8.2 on Windows64 I found a crash.

Here is a code snippet of 7-zip 9.20, Sha1.cpp:

const unsigned kBlockSize = 64;

...

void CContext::UpdateRar(Byte *data, size_t size, bool rar350Mode) {
   ...
   for (int i = 0; i < kBlockSizeInWords; i++) {
       UInt32 d = _buffer[i];
       data[i * 4 + 0 - kBlockSize] = (Byte)(d);      // <<= DISASSEMBLED
       data[i * 4 + 1 - kBlockSize] = (Byte)(d >>  8);
       data[i * 4 + 2 - kBlockSize] = (Byte)(d >> 16);
       data[i * 4 + 3 - kBlockSize] = (Byte)(d >> 24);
   }
}

Compiling with MinGW GCC 4.8.2 (on Windows) gets:

0x0000000045b65f75 <+149>:   mov    eax,0xffffffc0
0x0000000045b65f7a <+154>:   nop    WORD PTR [rax+rax*1+0x0]
0x0000000045b65f80 <+160>:   mov    r10d,DWORD PTR [r11+0x24]
0x0000000045b65f84 <+164>:   mov    edx,eax
0x0000000045b65f86 <+166>:   add    r11,0x4
0x0000000045b65f8a <+170>:   mov    ecx,r10d
0x0000000045b65f8d <+173>:   mov    BYTE PTR [rbx+rdx*1],r10b

In the last line rbx points to the data and 64 should be subtracted from this address. This is done by adding -64. In the first line of the assembler snippet 0xffffffc0 (-64 (Int32)) saved into eax and then moved to edx. But in the last line

0x0000000045b65f8d <+173>:   mov    BYTE PTR [rbx+rdx*1],r10b

the rdx register instead of the edx register used. At this point the top part of the rdx is 0, so +64 would work just fine. But adding -64 requires the top part of the rdx register to hold 0xFFFFFFFF. Adding &data+0xffffffc0 in 64bit mode produces an invalid address and crashes.

The problem is fixed by changing the assignment (adding (int) cast):

       data[i * 4 + 0 - (int)kBlockSize] = (Byte)(d);
       data[i * 4 + 1 - (int)kBlockSize] = (Byte)(d >>  8);
       data[i * 4 + 2 - (int)kBlockSize] = (Byte)(d >> 16);
       data[i * 4 + 3 - (int)kBlockSize] = (Byte)(d >> 24);

My questions are:

  • Is it a bug or feature?
  • Are there some GCC flags to prevent this behavior?
  • How can I make sure, that entire 7-zip binary of mine is free of such errors?

UPDATE: It turn out, that this behavior doesn't depend on optimization flags and is only observed on Windows

Boris Brodski
  • 8,425
  • 4
  • 40
  • 55
  • 1
    What is the type of `kBlockSize`? – Chad Aug 24 '15 at 21:09
  • @Chad: It's `unsigned int` (top of the first code snippet). – Blastfurnace Aug 24 '15 at 21:20
  • 3
    The code is mixing signed and unsigned. The result will then be unsigned, and cannot be -64. [What happens when I mix signed and unsigned types](http://stackoverflow.com/questions/25609091/what-happens-when-i-mix-signed-and-unsigned-types) – Bo Persson Aug 24 '15 at 21:20
  • "How can I make sure, that entire 7-zip binary of mine is free of such errors?" - there's no simple answer to that but you could start by enabling all warnings. Unfortunately in my experience most open source projects emit screeds of warnings even on the default compiler settings (let alone on high warning level), so you probably have some work ahead of you to fix them all up! – M.M Aug 24 '15 at 21:48
  • 1
    This example is actually pretty good ammunition for those who say that `unsigned` is evil and `size_t` should have been signed. Although we could also say that if you intentionally write code that uses negative array indexing you are asking for trouble and had better be very careful. A more robust fix might be to switch to using pointer notation , `*(data + i * 4 - kBlockSize) = (Byte)d;` (what was the `+ 0` doing anyway???) – M.M Aug 24 '15 at 21:51
  • So you all consider this behavior of gcc as a feature? Surprising for me, because I see no interpretation of this `signed int`/`unsigned int` conflict, that results in such a weird address. @MattMcNabb I added more lines of code, so this should answer the question about `+0` :) – Boris Brodski Aug 24 '15 at 22:59
  • @BorisBrodski It's nothing to do with compiler features, it is how the C++ standard defines the language. `0 - 1u` gives `4294967295u` , not `-1`. This code has always offset the pointer by 4 billion places, even in 32bit mode. – M.M Aug 24 '15 at 23:09
  • @MattMcNabb my problem understanding this is, that `unsigned int` on 64bit platforms 8 byte long. It should be `0xffffffffffffffc0` instead of `0xffffffc0`. And then, adding `0xffffffffffffffc0` is equivalent to adding `-64`. – Boris Brodski Aug 25 '15 at 07:38
  • 1
    @BorisBrodski some 64 bit platforms have `unsigned int` as 32bit. This is called I32LP64. All 64-bit Windows compilers do this (AFAIK). – M.M Aug 25 '15 at 08:04
  • @MattMcNabb Now I understand! Thank you. Post your comment as an answer and I will accept it :) – Boris Brodski Aug 25 '15 at 08:23
  • @BorisBrodski Ross Ridge's answer says much the same thing, you can accept that – M.M Aug 25 '15 at 08:48

2 Answers2

3

It's standard required behaviour when int and unsigned int are 32 bits wide. The expression i * 4 + 0 - kBlockSize is evaluated as (i * 4 + 0) - kBlockSize. The type on the left side of the minus operator is int while on the right side it's unsigned int. These need to converted to a common type and standard C says that type is unsigned int. So during the first iteration of the loop this becomes (unsigned) 0 - 64U which is 0xffffffc0. If pointers were 32 bits then data[0xffffffc0] would "wrap around" and be the same as data[-64], but they're 64 bits so you get a crash instead.

Ross Ridge
  • 38,414
  • 7
  • 81
  • 112
  • In other words, the code used to work by sheer luck. – David Schwartz Aug 24 '15 at 23:22
  • 1
    To be clear it is still undefined behaviour in 32-bit mode, but it just happens that the code generated by the compiler for `data[0xffffffc0]` was the same (or has the same effect) as it would have generated for `data[-64]` – M.M Aug 24 '15 at 23:37
  • I agree, that the common type is `unsigned int`, but `(unsigned int)-64` on 64bit architecture is `0xffffffffffffffc0` ! And this will work. Also if this assembly isn't buggy, the top part of `rdx` should be set to `0`. But it seems missing. – Boris Brodski Aug 25 '15 at 07:31
  • @BorisBrodski I've rolled back your edit as it's misleading. The problem occurs on any target where `int` is 32 bits, which is pretty much every target these days, not just I32LP64 systems. ILP64 targets, with a 64-bit int are rare, and is not what you're using. When `int` is 32 bits then `(unsigned int)-64 == 0xffffffc0`. – Ross Ridge Aug 25 '15 at 15:10
  • @RossRidge Ok. After reading more about the topic I agree with you on this. Also your last edit makes it pretty clear. Thank you! – Boris Brodski Aug 25 '15 at 15:16
  • 1
    @BorisBrodski Yah, sorry, I thought I had put something in my original revision saying `int` needing to be 32 bits. – Ross Ridge Aug 25 '15 at 15:20
2
   data[i * 4 + 0 - kBlockSize] = (Byte)(d);      // <<= DISASSEMBLED

This is wrong. When i is zero, the subtraction tries to produce a negative, unsigned number. There is no such thing.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278